Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stablise Python Builds. #4558

Merged
merged 3 commits into from
Jul 6, 2021
Merged

Stablise Python Builds. #4558

merged 3 commits into from
Jul 6, 2021

Conversation

davinchia
Copy link
Contributor

@davinchia davinchia commented Jul 6, 2021

What

Attempt to stablise the python test/publish builds. This is a subset of the changes from #4539 that have been pulled out to keep that PR clean.

A week or so ago we started running into the following errors while trying to test or publish python connectors:

* Where:
Build file '/home/runner/work/airbyte/airbyte/airbyte-integrations/connectors/source-zendesk-talk/build.gradle' line: 2
* What went wrong:
An exception occurred applying plugin request [id: 'airbyte-python']
> Failed to apply plugin 'airbyte-python'.
   > A problem occurred configuring project ':airbyte-integrations:connectors:source-zoom-singer'.
      > Could not open cp_proj generic class cache for build file '/home/runner/work/airbyte/airbyte/airbyte-integrations/connectors/source-zoom-singer/build.gradle' (/home/runner/.gradle/caches/6.7.1/scripts/4ffd4lk4g399iqzx3xoc30xgr).
         > java.lang.StackOverflowError (no error message)

This error is happening when Gradle first configures itself, as it is reading all the various build files and trying to build an internal dependency graph. No real building has happened yet.

At that time, our python builds were running on the ubuntu-latest github runners. Switching to other types of ubuntu runners did not fix the problem.

The best clue I could find was this stackoverflow. The most promising answer here is to increase the heap, which doesn't seem realistic since the Github runners are memory constrained. I was also unable to reproduce this locally or on any ubuntu ec2 machines I spun up.

Since I know gradle configures itself fine on our ec2 runners, I moved the build to run on an ec2 instance instead of the github instances. This solved our Gradle configure issue but presents another problem with the install of ciso8601 (see this), which is a package some of the singer sources use to parse date time. This is because ciso8601 doesn't provide wheels and thus requires very specific C libs to be installed on the instance. The github runners have an extensive list of pre-installed software that fulfills these requirements that our minimal ec2 AMIs do not.

Things I tried and failed:

  • Installing the required binaries on top of our minimal AMI.
  • Installing 'clang' instead of 'gcc', since I noticed the Github instances bundle 'clang'.
  • Trying to generate an identical AMI to the Github instances - sadly they use a custom DSL on top of packer. The current build scripts using Azure and aren't compatible with AWS.

At every point in this process, I validated my hypothesis on a parallel ec2 Ubuntu instance (funny enough, this always worked) built on top of our initial minimal AMI. So I inadvertently created an AMI that contains all the build dependencies Airbyte requires. This is the final AMI we use here. I think this is an okay medium-term solution. The main risk I see from using our own AMI is 1) updates 2) making sure the AMI image can always be built. This ticket contains all this context + is a follow up to address this when we have more time.

This exercise has made me realise we have 3 distinct build runtimes:

  1. Java code - natively handled by Gradle so 'just works'.
  2. Python CDK connectors - all dependencies are explicitly stated so easy to manage.
  3. Python Singer connectors - dependencies are pulled in during compile time so tough to manage.

I've ran a test from each connector 'type' to prove this set up works. Note: the shopify test is failing from actual failures and not compile/build failures.

Added follows up in #4559 .

Pre-merge Checklist

Expand the checklist which is relevant for this PR.

Connector checklist

  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • Secrets are annotated with airbyte_secret in the connector's spec
  • Credentials added to Github CI if needed and not already present. instructions for injecting secrets into CI.
  • Unit & integration tests added as appropriate (and are passing)
    • Community members: please provide proof of this succeeding locally e.g: screenshot or copy-paste acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • /test connector=connectors/<name> command as documented here is passing.
    • Community members can skip this, Airbyters will run this for you.
  • Code reviews completed
  • Documentation updated
    • README.md
    • docs/SUMMARY.md if it's a new connector
    • Created or updated reference docs in docs/integrations/<source or destination>/<name>.
    • Changelog in the appropriate page in docs/integrations/.... See changelog example
    • docs/integrations/README.md contains a reference to the new connector
    • Build status added to build page
  • Build is successful
  • Connector version bumped like described here
  • New Connector version released on Dockerhub by running the /publish command described here
  • No major blockers
  • PR merged into master branch
  • Follow up tickets have been created
  • Associated tickets have been closed & stakeholders notified

Connector Generator checklist

  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed.

@davinchia
Copy link
Contributor Author

davinchia commented Jul 6, 2021

/test connector=connectors/source-intercom-singer

🕑 connectors/source-intercom-singer https://github.com/airbytehq/airbyte/actions/runs/1003130737
✅ connectors/source-intercom-singer https://github.com/airbytehq/airbyte/actions/runs/1003130737

@davinchia
Copy link
Contributor Author

davinchia commented Jul 6, 2021

/test connector=connectors/destination-postgres

🕑 connectors/destination-postgres https://github.com/airbytehq/airbyte/actions/runs/1003158445
✅ connectors/destination-postgres https://github.com/airbytehq/airbyte/actions/runs/1003158445

@davinchia
Copy link
Contributor Author

davinchia commented Jul 6, 2021

/test connector=connectors/source-shopify

🕑 connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/1003161705
❌ connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/1003161705

@davinchia davinchia marked this pull request as ready for review July 6, 2021 04:42
echo "Building $path"
./gradlew --no-daemon -no-build-cache "$(_to_gradle_path "$path" clean)"
./gradlew --no-daemon -no-build-cache "$(_to_gradle_path "$path" build)"
./gradlew --no-daemon "$(_to_gradle_path "$path" clean)"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't affect anything, since the connectors test/publish builds aren't using any caching and are always started on a new instance. I feel that putting this here is more confusing than not so I'm removing it.

@davinchia davinchia merged commit c83b134 into master Jul 6, 2021
@davinchia davinchia deleted the davinchia/fix-python-build branch July 6, 2021 05:53
@davinchia davinchia mentioned this pull request Jul 6, 2021
26 tasks
@jrhizor
Copy link
Contributor

jrhizor commented Jul 6, 2021

Thanks @davinchia

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants