
Retries and performance improvements #9

Merged
merged 6 commits into from
Aug 8, 2022


briantist
Owner

Closes #7
Closes #8

Related:

We now attempt to read the collection_info property before trying to read the manifest out of the tarball via Artifactory (#5), and we fall back to the tarball read only if the property doesn't exist. The fallback is just a holdover in case any collections in Artifactory were uploaded with the first version of galactory (which probably only affects me).

The fallback will be removed in the next version, so find any such collections now. If they're upstreams-made-local, you can probably just delete them and let them get repopulated; if they're local, republish them.

Reading the property first avoids a potentially expensive connection to read the tarball.
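A minimal sketch of the property-first lookup with the tarball fallback. This is not galactory's actual code: the function name, the `read_manifest_from_tarball` callable, and the assumption that the property holds a JSON string (possibly wrapped in a list, since Artifactory properties can be multi-valued) are all illustrative.

```python
import json
from typing import Any, Callable, Mapping


def load_collection_info(
    properties: Mapping[str, Any],
    read_manifest_from_tarball: Callable[[], Mapping[str, Any]],
) -> Any:
    """Prefer the collection_info property; fall back to the tarball manifest.

    `properties` stands in for the already-fetched Artifactory properties.
    `read_manifest_from_tarball` is a hypothetical callable that performs the
    expensive read and returns the parsed manifest.
    """
    raw = properties.get("collection_info")
    if raw is not None:
        # Assumption: the value is a JSON string, possibly wrapped in a list.
        if isinstance(raw, (list, tuple)):
            raw = raw[0]
        return json.loads(raw)

    # Property missing (e.g. an artifact uploaded by an older version):
    # only now pay for the tarball read.
    return read_manifest_from_tarball()["collection_info"]
```

Passing the fallback as a callable means the expensive read only happens when the property is actually missing.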

But I went even further on performance. As I've used this, it's gotten slower and slower as the number of collections has increased. I realized the reason is that when iterating, even when we know the specific collection and version we want, we're essentially just browsing the list, and for each collection we ask for a stat and its properties, which are two separate HTTP requests, before we can eliminate it as a candidate. This was really slowing things down.

So I've added a fast detection mode, on by default (and not user-configurable at this time), that uses the file's naming convention to work out enough about the collection to skip it quickly and prevent further requests.

For example, briantist-whatever-0.1.0.tar.gz can be split into namespace: briantist, collection: whatever, and version: 0.1.0. If that's enough info to rule out a match, we can skip before making any additional requests.
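The filename split described above could look roughly like this. It is a sketch, not the code from this PR: `parse_collection_filename` and `can_skip` are hypothetical names, and it assumes namespaces and collection names never contain a hyphen, so splitting on the first two hyphens leaves the whole version (including any pre-release suffix) intact.

```python
from typing import NamedTuple, Optional


class ParsedName(NamedTuple):
    namespace: str
    collection: str
    version: str


def parse_collection_filename(filename: str) -> Optional[ParsedName]:
    """Split 'namespace-collection-version.tar.gz' into its three parts.

    Returns None when the name does not follow the convention, so the caller
    can fall through to the slower checks instead of skipping wrongly.
    """
    suffix = ".tar.gz"
    if not filename.endswith(suffix):
        return None
    stem = filename[: -len(suffix)]
    # Split at most twice so a version like 1.0.0-beta.1 stays in one piece.
    parts = stem.split("-", 2)
    if len(parts) != 3 or not all(parts):
        return None
    return ParsedName(*parts)


def can_skip(
    filename: str,
    namespace: Optional[str] = None,
    collection: Optional[str] = None,
    version: Optional[str] = None,
) -> bool:
    """True if the filename alone proves this artifact is not the one wanted."""
    parsed = parse_collection_filename(filename)
    if parsed is None:
        return False  # can't tell from the name; don't skip
    wanted = (namespace, collection, version)
    return any(w is not None and w != got for w, got in zip(wanted, parsed))
```

An unparseable name is deliberately treated as "can't skip", so the fast path can only eliminate candidates, never accept or wrongly reject them.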

If the collection isn't eliminated that way, we proceed with the rest of the screening as before, except that I've now split the stat and properties requests: we do the stat first, then the conditional that might skip, then ask for properties, then the conditional that might skip on those. So that's a small improvement in addition to the above.
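The ordering of checks can be sketched as a generator that only issues each request once the cheaper checks have passed. Everything here is illustrative: `screen`, the three callables, and the specific skip conditions (`is_dir`, a missing collection_info property) are assumptions standing in for whatever the real screening does.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple


def screen(
    names: Iterable[str],
    name_matches: Callable[[str], bool],
    fetch_stat: Callable[[str], Dict],
    fetch_properties: Callable[[str], Dict],
) -> Iterator[Tuple[str, Dict, Dict]]:
    """Yield candidates, making each HTTP request as late as possible."""
    for name in names:
        # 1. Filename check: no requests at all.
        if not name_matches(name):
            continue

        # 2. Stat request, then the conditional that depends only on stat.
        stat = fetch_stat(name)
        if stat.get("is_dir"):  # assumed skip condition
            continue

        # 3. Properties request, then the conditional that depends on them.
        props = fetch_properties(name)
        if "collection_info" not in props:  # assumed skip condition
            continue

        yield name, stat, props
```

With this ordering, an artifact rejected by name costs zero requests, and one rejected on its stat costs one request instead of two.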

In testing, this is a LOT faster, and I think it will also help alleviate some of the connection errors I was seeing that led to #7 and #8.

@briantist briantist added the enhancement New feature or request label Aug 8, 2022
@briantist briantist self-assigned this Aug 8, 2022
@briantist briantist merged commit db35a31 into main Aug 8, 2022
@briantist briantist deleted the retries branch August 8, 2022 23:48
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Add retries to upstream requests Add retries to Artifactory calls
1 participant