-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
connectors-ci: get modified files from PR API #26889
connectors-ci: get modified files from PR API #26889
Conversation
762b28a
to
04b17f5
Compare
This reverts commit 1d6f5bd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pros - faster
Cons - our pipelines are even more resilient on Github... but they already are, so 🤷
@@ -24,6 +24,7 @@ | |||
|
|||
if TYPE_CHECKING: | |||
from ci_connector_ops.pipelines.contexts import ConnectorContext | |||
from github import PullRequest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting some curl/API work in this PR. I guess this just handles it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lib is pretty good at wrapping the GitHub API for typed interaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is dope, I didn't know that the pull request context has a list of changed files. Learn something new every day :D
git_branch: str, git_revision: str, diffed_branch: str, is_local: bool, ci_context: CIContext, pull_request: PullRequest | ||
) -> List[str]: | ||
"""Get the list of modified files in the current git branch. | ||
If the current branch is master, it will return the list of modified files in the current commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Maybe clarify that this "current commit" is the squash commit of the branch whose merge kicked off the run
The reason I mention this is that if we decided to switch from squash commits to a full history (e.g. rebase instead of merge), I'm not sure if this "last commit" logic would work anymore. Its an edge case not worth writing more code for, but a reminder of the context that allows us to do it this way may be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I improved the docstring, thank you for the suggestion.
True, but we can easily fallback on the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love some more speed!
What
In our connectors CI pipeline we originally get the list of modified files with
git diff
operation.These operations are quite expensive, especially in the CI because we run the
git diff
operation in a dagger container that is checkouting airbyte's repo.In a pull request context the GitHub API provides the list of modified files in the pull request.
We can use this list instead of running
git diff
to speed up pipelines running on PRs.The current pull request is an attempt at doing so.
I also did an additional operation:
We previously alway ran two different git diff commands: one to get the list of modified files in branch, the other one to get the list of modified files in the last commit.
I slightly changed this logic: we get the list of modified files in the last commits only when we're on master. When we're on a branch we git diff the full branch.
Before on source-pokeapi: total run duration 3min
After change on source-pokeapi: total run duration 2mn
How
get_modified_files
User impact
Faster test run on PR: expect 1 minute speed gain.