Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

airbyte-ci: disable gradle step caching on source-postgres #31332

Conversation

alafanechere
Copy link
Contributor

@alafanechere alafanechere commented Oct 12, 2023

What

We want to disable the dagger/gradle caching on source-postgres gradle steps in order to collect unbiased CI performance metrics for source-postgres.
This change will be reverted once the project is over.

How

  • Burst dagger layers in GradleTask with a CACHEBUSTER env var. The layers are bursted on each new pipeline, but not on each gradle task execution within the pipeline
  • Create pipeline specific gradle-cache volume name to guarantee that the gradle remote cache volume (powered by dagger) is never used for source-postgres.

@alafanechere alafanechere requested a review from a team as a code owner October 12, 2023 09:09
@vercel
Copy link

vercel bot commented Oct 12, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Oct 12, 2023 2:23pm

Copy link
Contributor Author

Current dependencies on/for this PR:

This comment was auto-generated by Graphite.

@github-actions
Copy link
Contributor

Coverage report for source-postgres

There is no coverage information present for the Files changed

Total Project Coverage 71.72% 🍏

@airbyte-oss-build-runner
Copy link
Collaborator

source-postgres test report (commit 5ade428ff0) - ✅

⏲️ Total pipeline duration: 29mn27s

Step Result
Build connector tar
Build source-postgres docker image for platform(s) linux/x86_64
Java Connector Unit Tests
Java Connector Integration Tests
Acceptance tests
Validate metadata for source-postgres
Connector version semver check
QA checks

🔗 View the logs here

☁️ View runs for commit in Dagger Cloud

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-postgres test

@alafanechere alafanechere force-pushed the augustin/10-12-airbyte-ci_disable_gradle_step_caching_on_source-postgres branch from 5ade428 to 50e8ef9 Compare October 12, 2023 11:21
Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer a smaller and less risky change, which is going to make it easer to undo after Q4a.

return self.context.dagger_client.cache_volume("gradle-cache")
# TODO: remove this once we finish the project to boost source-postgres CI performance.
# We should use a static gradle-cache volume name.
cache_volume_name = hacks.get_gradle_cache_volume_name(self.context, self.logger)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that these volumes outlive the pipeline execution (which I have doubts about) isn't this dangerous? Won't this leave a bunch of garbage lying around in the dagger engine?

Irrespective of whether I'm right or not about cache volume persistence, it's simpler to reimplement this PR by modifying _get_gradle_command instead:

    def _get_gradle_command(self, task: str) -> List[str]:
        return sh_dash_c(
            [
                # The gradle command is chained in between a couple of rsyncs which load from- and store to the cache volume.
                # TODO: remove test after Q4a2023 (github issue #123456)
                f"(test '{self.context.connector.technical_name}' != 'source-postgres' && rsync -a --stats /root/gradle-cache/ /root/.gradle || true)",
                f"./gradlew {' '.join(self.DEFAULT_GRADLE_TASK_OPTIONS)} {task}",
                f"(test '{self.context.connector.technical_name}' != 'source-postgres' && rsync -a --stats /root/.gradle/ /root/gradle-cache || true)",
            ]
        )

I can't foresee us wanting to disable caching for anything other than source-postgres.

Copy link
Contributor Author

@alafanechere alafanechere Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we want to avoid to get reproducible metrics is the persistent (remote) caching which could potentially make other runs feed the cache and bias the results on master.
If we use a completely different cache volume while keeping rsync we get unbiased performance with intra-pipeline gradle caching which is working correctly today.

The volumes will outlive pipeline execution, but the dagger-engine on our infra is run on a K8S that is not supposed to idle forever. In other word we have disk renewal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you're right. OK!

@@ -395,6 +395,7 @@ This command runs the Python tests for a airbyte-ci poetry package.
## Changelog
| Version | PR | Description |
| ------- | ---------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
| 1.7.1 | [#TBD](https://github.com/airbytehq/airbyte/pull/TBD) | Disable Gradle step caching on source-postgres. |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/TBD/31332/

@alafanechere alafanechere merged commit fd57881 into master Oct 12, 2023
20 checks passed
@alafanechere alafanechere deleted the augustin/10-12-airbyte-ci_disable_gradle_step_caching_on_source-postgres branch October 12, 2023 16:29
ariesgun pushed a commit to ariesgun/airbyte that referenced this pull request Oct 20, 2023
ariesgun pushed a commit to ariesgun/airbyte that referenced this pull request Oct 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants