Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added caching mechanism for Galaxy API requests #71904

Merged
merged 8 commits into from Nov 9, 2020

Conversation

jborean93
Copy link
Contributor

@jborean93 jborean93 commented Sep 24, 2020

SUMMARY

Try to cache the results when calling the Galaxy API to speed up the install command. A lot of the time in this step is spent querying the available versions, as well as the metadata for each collection. By caching this information we should dramatically decrease the time it takes to install a collection, especially as time goes on and more versions are added.

For example here is the time difference for just community.general (will be a popular collection)

# devel
$ time ansible-galaxy collection install community.general --force-with-deps
Starting galaxy collection install process
Process install dependency map
Starting collection install process
Installing 'community.general:1.1.0' to '/home/jborean/.ansible/collections/ansible_collections/community/general'
Downloading https://galaxy.ansible.com/download/community-general-1.1.0.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178230ikw_3f2n/tmp296nllvf
community.general (1.1.0) was installed successfully
Installing 'google.cloud:1.0.0' to '/home/jborean/.ansible/collections/ansible_collections/google/cloud'
Downloading https://galaxy.ansible.com/download/google-cloud-1.0.0.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178230ikw_3f2n/tmp296nllvf
google.cloud (1.0.0) was installed successfully
Installing 'ansible.posix:1.1.1' to '/home/jborean/.ansible/collections/ansible_collections/ansible/posix'
Downloading https://galaxy.ansible.com/download/ansible-posix-1.1.1.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178230ikw_3f2n/tmp296nllvf
ansible.posix (1.1.1) was installed successfully
Installing 'ansible.netcommon:1.2.1' to '/home/jborean/.ansible/collections/ansible_collections/ansible/netcommon'
Downloading https://galaxy.ansible.com/download/ansible-netcommon-1.2.1.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178230ikw_3f2n/tmp296nllvf
ansible.netcommon (1.2.1) was installed successfully
Installing 'community.kubernetes:1.0.0' to '/home/jborean/.ansible/collections/ansible_collections/community/kubernetes'
Downloading https://galaxy.ansible.com/download/community-kubernetes-1.0.0.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178230ikw_3f2n/tmp296nllvf
community.kubernetes (1.0.0) was installed successfully

real    0m56.223s
user    0m9.620s
sys     0m0.441s

# this branch first run
$ time ansible-galaxy collection install community.general --force-with-deps
Starting galaxy collection install process
Process install dependency map
Starting collection install process
Installing 'community.general:1.1.0' to '/home/jborean/.ansible/collections/ansible_collections/community/general'
Downloading https://galaxy.ansible.com/download/community-general-1.1.0.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178569i9r2y68o/tmp2um7ipg0
community.general (1.1.0) was installed successfully
Installing 'google.cloud:1.0.0' to '/home/jborean/.ansible/collections/ansible_collections/google/cloud'
Downloading https://galaxy.ansible.com/download/google-cloud-1.0.0.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178569i9r2y68o/tmp2um7ipg0
google.cloud (1.0.0) was installed successfully
Installing 'ansible.posix:1.1.1' to '/home/jborean/.ansible/collections/ansible_collections/ansible/posix'
Downloading https://galaxy.ansible.com/download/ansible-posix-1.1.1.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178569i9r2y68o/tmp2um7ipg0
ansible.posix (1.1.1) was installed successfully
Installing 'ansible.netcommon:1.2.1' to '/home/jborean/.ansible/collections/ansible_collections/ansible/netcommon'
Downloading https://galaxy.ansible.com/download/ansible-netcommon-1.2.1.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178569i9r2y68o/tmp2um7ipg0
ansible.netcommon (1.2.1) was installed successfully
Installing 'community.kubernetes:1.0.0' to '/home/jborean/.ansible/collections/ansible_collections/community/kubernetes'
Downloading https://galaxy.ansible.com/download/community-kubernetes-1.0.0.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178569i9r2y68o/tmp2um7ipg0
community.kubernetes (1.0.0) was installed successfully

real    0m58.259s
user    0m9.558s
sys     0m0.347s

# this branch 2nd run
$ time ansible-galaxy collection install community.general --force-with-deps
Starting galaxy collection install process
Process install dependency map
Starting collection install process
Installing 'community.general:1.1.0' to '/home/jborean/.ansible/collections/ansible_collections/community/general'
Downloading https://galaxy.ansible.com/download/community-general-1.1.0.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178855x84i97ly/tmpcr6_7xuo
community.general (1.1.0) was installed successfully
Installing 'google.cloud:1.0.0' to '/home/jborean/.ansible/collections/ansible_collections/google/cloud'
Downloading https://galaxy.ansible.com/download/google-cloud-1.0.0.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178855x84i97ly/tmpcr6_7xuo
google.cloud (1.0.0) was installed successfully
Installing 'ansible.posix:1.1.1' to '/home/jborean/.ansible/collections/ansible_collections/ansible/posix'
Downloading https://galaxy.ansible.com/download/ansible-posix-1.1.1.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178855x84i97ly/tmpcr6_7xuo
ansible.posix (1.1.1) was installed successfully
Installing 'ansible.netcommon:1.2.1' to '/home/jborean/.ansible/collections/ansible_collections/ansible/netcommon'
Downloading https://galaxy.ansible.com/download/ansible-netcommon-1.2.1.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178855x84i97ly/tmpcr6_7xuo
ansible.netcommon (1.2.1) was installed successfully
Installing 'community.kubernetes:1.0.0' to '/home/jborean/.ansible/collections/ansible_collections/community/kubernetes'
Downloading https://galaxy.ansible.com/download/community-kubernetes-1.0.0.tar.gz to /home/jborean/.ansible/tmp/ansible-local-178855x84i97ly/tmpcr6_7xuo
community.kubernetes (1.0.0) was installed successfully

real    0m36.794s
user    0m9.300s
sys     0m0.412s

We could even speed this up further by caching the actual collection artifacts but I'm unsure whether we actually want that behaviour or not.

ISSUE TYPE
  • Feature Pull Request
COMPONENT NAME

ansible-galaxy

@ansibot ansibot added affects_2.11 core_review In order to be merged, this PR must follow the core review workflow. feature This issue/PR relates to a feature request. needs_triage Needs a first human triage before being processed. support:core This issue/PR relates to code supported by the Ansible Engineering Team. labels Sep 24, 2020
@ansibot

This comment has been minimized.

@ansibot ansibot added needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. and removed core_review In order to be merged, this PR must follow the core review workflow. labels Sep 24, 2020
@samdoran samdoran removed the needs_triage Needs a first human triage before being processed. label Sep 24, 2020
@ansibot ansibot added core_review In order to be merged, this PR must follow the core review workflow. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. support:community This issue/PR relates to code supported by the Ansible community. and removed needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. core_review In order to be merged, this PR must follow the core review workflow. labels Sep 25, 2020
@jborean93 jborean93 marked this pull request as ready for review September 26, 2020 03:57
@jborean93
Copy link
Contributor Author

I've had another look at the latest_version/highest_version mechanism and it looks like Galaxy and AH do sort this value so it's not just the last one uploaded. This means the version check to see whether we should invalid the cache won't work if an older semver was published. I still think it's worth caching the data and just rely on the 24 hour fallback or a manual removal of the cache for this scenario. It shouldn't be too common for an older version to be published after a newer one.

Copy link
Contributor

@s-hertel s-hertel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I'd prefer not to cache artifacts, or at least not by default.

lib/ansible/cli/galaxy.py Outdated Show resolved Hide resolved
lib/ansible/galaxy/api.py Outdated Show resolved Hide resolved
lib/ansible/galaxy/api.py Outdated Show resolved Hide resolved
@ansibot ansibot added needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. core_review In order to be merged, this PR must follow the core review workflow. and removed core_review In order to be merged, this PR must follow the core review workflow. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. labels Oct 1, 2020
@ansibot ansibot added the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label Oct 14, 2020
Copy link
Contributor

@s-hertel s-hertel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@ansibot ansibot removed the stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. label Oct 29, 2020
@ansibot ansibot added the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. label Oct 29, 2020
@samdoran samdoran added the ci_verified Changes made in this PR are causing tests to fail. label Oct 31, 2020
@jborean93
Copy link
Contributor Author

After investigating the test failures there are a few problems that need to be solved before this can be merged in:

  • Pulp has a bug in the updated_at field where the value incorrectly represents the date from the first published collection
    • This causes the test failure
    • Automation Hub is currently using a version of Pulp that is affected by this
    • Pulp 0.5.0 should have fixed this problem but it isn't included in any galaxy_ng releases yet
    • This needs to be present in galaxy_ng, Automation Hub, and the pulp-oci-images container before we can use this field
  • Even with the update from above the updated_at field only represents the update date of the highest collection version

The first could be bypassed by using the latest/highest_version field as that works as expected but that has the same issue as the 2nd point. Publishing a collection at an older version will not be picked up resulting in the cli not knowing about it until the TTL (24 hours) is up. I want to wait an see if both of these issues will be fixed anytime soon in all iterations of the API before trying to merge something in.

@chouseknecht
Copy link
Contributor

@jborean93

pulp_ansible 0.5.0 is included in the 4.2.0rc2 release of galaxy_ng. It can be seen referenced in setup.py.

Should land on c.rh.c by the end of the week, or end of next week at the latest.

@jborean93
Copy link
Contributor Author

Thanks for the info, I’ll look at getting the pulp oci contain images updated (if they haven’t been already) so it uses the rc2 release and then test again in CI.

@jborean93
Copy link
Contributor Author

Has been updated pulp/pulp-oci-images#34, will need to update ansible-test to pull in the new image once it's been built.

@jborean93
Copy link
Contributor Author

Looks like Pulp 0.5.0 has fixed the updated_at issue. #72469 updates the image used in CI which will get these tests working again.

lib/ansible/galaxy/api.py Show resolved Hide resolved
lib/ansible/galaxy/api.py Outdated Show resolved Hide resolved
lib/ansible/galaxy/api.py Show resolved Hide resolved
test/units/galaxy/test_api.py Outdated Show resolved Hide resolved
test/units/galaxy/test_api.py Outdated Show resolved Hide resolved
@ansibot ansibot added core_review In order to be merged, this PR must follow the core review workflow. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. and removed ci_verified Changes made in this PR are causing tests to fail. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. core_review In order to be merged, this PR must follow the core review workflow. labels Nov 4, 2020
@jborean93
Copy link
Contributor Author

The updated_at logic has been changed with pulp/pulp_ansible#416 and backported with pulp/pulp_ansible#420. This makes sure the updated_at field shows the time when the collection has been last modified and not just when a new highest version has been uploaded.

@jborean93
Copy link
Contributor Author

The changes required to make this work in Pulp are now in place and CI has been updated to test this properly. I'm going to merge this in so we can get some more people to try it out and hopefully get some more feedback.

@jborean93 jborean93 merged commit de5858f into ansible:devel Nov 9, 2020
1 check passed
@jborean93 jborean93 deleted the galaxy-caching branch November 9, 2020 20:50
@ansible ansible locked and limited conversation to collaborators Dec 7, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
affects_2.11 core_review In order to be merged, this PR must follow the core review workflow. feature This issue/PR relates to a feature request. support:community This issue/PR relates to code supported by the Ansible community. support:core This issue/PR relates to code supported by the Ansible Engineering Team.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants