Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up dealing with non-reachable git repos #5658

Conversation

deivid-rodriguez
Copy link
Contributor

@deivid-rodriguez deivid-rodriguez commented Sep 8, 2022

While working on a spec failure, I observed the failing spec was also soooooo slow:

Top 1 slowest examples (48.37 seconds, 100.0% of total time):
  Dependabot::EndToEndJob bundler git dependencies updates dependencies correctly
    48.37 seconds ./spec/dependabot/integration_spec.rb:296

Further investigation revealed that this spec was trying to do the following up to 76 times, on a non-existing github repository:

  • Try fetch https://github.com/dependabot/dummy-git-dependency.git/info/refs?service=git-upload-pack.
  • Since the previous fetch failed, try run git ls-remote https://github.com/dependabot/dummy-git-dependency.git locally.

This is done every time the update checker needs to fetch remote git tags, and if this fails, the checker considers that there are no remote tags, but will keep on retrying the same thing and failing once and again.

This commit changes the commit fetcher to memoize the tags the first time they are fetched. With this change, the spec is now much faster:

Top 1 slowest examples (4.26 seconds, 100.0% of total time):
  Dependabot::EndToEndJob bundler git dependencies updates dependencies correctly
    4.26 seconds ./spec/dependabot/integration_spec.rb:296

I expect this to also improve things in production, of course.

@deivid-rodriguez deivid-rodriguez requested a review from a team as a code owner September 8, 2022 12:45
@deivid-rodriguez deivid-rodriguez force-pushed the deivid-rodriguez/improve-performance-when-dealing-with-non-reachable-git-repos branch from 54a1e4a to 1b68436 Compare September 8, 2022 12:46
Copy link
Member

@landongrindheim landongrindheim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes a lot of sense! I'm assuming we're okay with not picking up the (likely) small number of cases where a tag is found on retry.

@deivid-rodriguez
Copy link
Contributor Author

My understanding is that GitDependenciesNotReachable would be a non-retryable error, but we could make the logic more fine grained to make sure: let tag fetcher raise the original exception if its a RepoNotFound error, and memoize only in that case.

However, I'm not sure what the result will be if the update checker gets inconsistent results about the available remote tags 😬.

While working on a spec failure, I observed the failing spec was also
soooooo slow:

```
Top 1 slowest examples (48.37 seconds, 100.0% of total time):
  Dependabot::EndToEndJob bundler git dependencies updates dependencies correctly
    48.37 seconds ./spec/dependabot/integration_spec.rb:296
```

Further investigation revealed that this spec was trying to do the
following up to 76 times, on a non-existing github repository:

* Try fetch `https://github.com/dependabot/dummy-git-dependency.git/info/refs?service=git-upload-pack`.
* Since the previous fetch failed, try run `git ls-remote https://github.com/dependabot/dummy-git-dependency.git` locally.

This is done every time the update checker needs to fetch remote git
tags, and if this fails, the checker considers that there are no remote
tags, but we'll keep on retrying the same thing and failing once and
again.

This commit changes the commit fetcher to memoize the tags the first
time they are fetched. With this change, the spec is now much faster:

```
Top 1 slowest examples (4.26 seconds, 100.0% of total time):
  Dependabot::EndToEndJob bundler git dependencies updates dependencies correctly
    4.26 seconds ./spec/dependabot/integration_spec.rb:296
```

I expect this to also improve things in production, of course.
@deivid-rodriguez deivid-rodriguez force-pushed the deivid-rodriguez/improve-performance-when-dealing-with-non-reachable-git-repos branch from 1b68436 to 4257c57 Compare September 8, 2022 17:40
@deivid-rodriguez deivid-rodriguez merged commit 39f1703 into main Sep 8, 2022
@deivid-rodriguez deivid-rodriguez deleted the deivid-rodriguez/improve-performance-when-dealing-with-non-reachable-git-repos branch September 8, 2022 17:55
@pavera pavera mentioned this pull request Oct 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants