Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Github graphql api to avoid rate limiting #104

Merged
merged 7 commits into from
Feb 8, 2021

Conversation

cognifloyd
Copy link
Member

This adjusts the index regeneration so that it retrieves the list of all
pack versions (repo tags) in as few API calls as possible.

At 1 call per pack per pack deployment, multiple pack deployments can
easily hit the rate limit. Using graphql reduces the number of calls
that count toward the rate limit (today) to 2 per pack deployment.
Tags are apparently a very inexpensive API request, so github is only
counting 1 call per page. More complex queries can count as more than
one call per page, so this is very fortunate for our use case.

Graphql limits each page to 100 results, so with about 160 packs, that
is only 2 pages. So, we won't hit a 3rd page until we get over 200
packs.

This adjusts the index regeneration so that it retrieves the list of all
pack versions (repo tags) in as few API calls as possible.

At 1 call per pack per pack deployment, multiple pack deployments can
easily hit the rate limit. Using graphql reduces the number of calls
that count toward the rate limit (today) to 2 per pack deployment.
Tags are apparently a very inexpensive API request, so github is only
counting 1 call per page. More complex queries can count as more than
one call per page, so this is very fortunate for our use case.

Graphql limits each page to 100 results, so with about 160 packs, that
is only 2 pages. So, we won't hit a 3rd page until we get over 200
packs.
@cognifloyd
Copy link
Member Author

I've tested my changes manually. Both get_available_versions and get_available_versions_for_pack work.
The only way to test this E2E is to merge it and then deploy a pack update that triggers index regeneration.

@cognifloyd
Copy link
Member Author

I've tested the gh install block and modified it so it works.
I haven't tested E2E yet, as my tests are failing due to CircleCI issues.

@cognifloyd cognifloyd force-pushed the use-graphql-for-index branch 2 times, most recently from 4f9766f to 9808cb2 Compare February 7, 2021 03:24
Also fix git clone "fatal: Too many arguments." error
.circle/deployment Outdated Show resolved Hide resolved
Using the dependencies job requires updating .circleci/config.yml in all
packs, so we continue installing it in the dependencies script until it
has been deployed everywhere and is thus safe to remove.
Copy link
Contributor

@blag blag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small change. Otherwise looks good.

.circle/index.py Outdated Show resolved Hide resolved
@cognifloyd cognifloyd force-pushed the use-graphql-for-index branch 2 times, most recently from 1f1acf0 to d36f7a3 Compare February 8, 2021 20:34
Co-authored-by: blag <blag@users.noreply.github.com>
@blag blag merged commit 41391a7 into StackStorm-Exchange:master Feb 8, 2021
@arm4b
Copy link
Member

arm4b commented Feb 9, 2021

The new CI error messages we're getting for Exchange Packs after this PR:

+ /home/circleci/virtualenv/bin/python /home/circleci/ci/utils/pack_content.py --input . --output /home/circleci/index/v1/packs/jmx
++ git -C /home/circleci/index status -s
+ [[ -n  M v1/packs/jmx/pack.yaml ]]
+ [[ 0 == \1 ]]
+ /home/circleci/virtualenv/bin/python /home/circleci/ci/.circle/index.py --glob '~/index/v1/packs/*/pack.yaml' --output '~/index/v1/'
gh: Bad credentials (HTTP 401)
Traceback (most recent call last):
  File "/home/circleci/ci/.circle/index.py", line 217, in <module>
    get_available_versions()
  File "/home/circleci/ci/.circle/index.py", line 166, in get_available_versions
    repos = page["data"]["repositoryOwner"]["repositories"]["nodes"]
KeyError: 'data'

@blag Is that because of expired creds?

@cognifloyd I guess we need a bit better error handling there.

@blag
Copy link
Contributor

blag commented Feb 9, 2021

Yes it is. And yes we do.

@blag
Copy link
Contributor

blag commented Feb 11, 2021

Captured in #109.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants