Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to prevent 403 when accessing the GitHub API #168

Merged
merged 6 commits into from
Jan 4, 2023
Merged

Try to prevent 403 when accessing the GitHub API #168

merged 6 commits into from
Jan 4, 2023

Conversation

paulo-ferraz-oliveira
Copy link
Collaborator

@paulo-ferraz-oliveira paulo-ferraz-oliveira commented Dec 20, 2022

Closes #167.

Edit: I wanted to open this on my fork, but got duped by GitHub once more and opened it here. It's a draft, for the time being.

  • Make sure we're not doing JSON.parse if the response is not JSON (or otherwise catch that exception to make it more usable for consumers)

@paulo-ferraz-oliveira paulo-ferraz-oliveira changed the title Repeat request on error (while logging it) Repeat HTTP external requests on error Dec 20, 2022
@paulo-ferraz-oliveira paulo-ferraz-oliveira marked this pull request as ready for review December 20, 2022 22:18
@paulo-ferraz-oliveira
Copy link
Collaborator Author

Moving to "Ready to review", to open it to reviewers, but don't merge just yet. I wanna make sure that running CI repeatedly does not surface any issues. (and I'm still missing a self-review)

@kivra-fabber
Copy link

Side note from one user perspective: with this patch applied, we have not seen any setup-beam 403 issues in our CI piplines, of which we have run many dozen (some using the action in multiple jobs) since applying. I believe the GITHUB_TOKEN usage is the thing that did the trick, so the retries are some kind of extra fallback.

If you want to get some testing done on a branch that only makes changes to use the GITHUB_TOKEN I'd be happy to get that tested. Perhaps it can even use the GitHub context to avoid having to thread the secret through the environment an extra time:
https://docs.github.com/en/actions/learn-github-actions/contexts#github-context

@paulo-ferraz-oliveira
Copy link
Collaborator Author

@kivra-fabber, thanks for your feedback. It wouldn't hurt to merge, I guess, and expect further feedback from more consumers. We can even remove the retry bits.

@kivra-fabber
Copy link

kivra-fabber commented Jan 2, 2023

If not for any other reasons I think it would be a simpler fix (just using the GITHUB_TOKEN) to discuss in this PR and merge, and then if we see later that it was not enough we can look at a solution for retrying. If you read it from the GitHub context (${{ github.token }}), the fix is even simpler I think.

I think that separate discussion in the original issue, about not even using the API if the user says the version is "locked" (I think?) also is a good thing™️. But can also be a separate fix with its own discussion.

I won't interfere with how you would like to approach this, but I will happily help test and review the PR if it just uses the token. Testing the retries is a bit harder, and like starbelly I think any retry if added should use the timing information in the response headers.

@starbelly
Copy link
Member

@kivra-fabber, thanks for your feedback. It wouldn't hurt to merge, I guess, and expect further feedback from more consumers. We can even remove the retry bits.

💯 agree that we should just do the github token part, and then work on retry on a separate PR.

@paulo-ferraz-oliveira paulo-ferraz-oliveira changed the title Repeat HTTP external requests on error Prevent 403 when accessing the GitHub API Jan 3, 2023
@paulo-ferraz-oliveira
Copy link
Collaborator Author

@kivra-fabber, I've updated this to get rid of all non-env.-changed -related code. If you want, give it a go.

@kivra-fabber
Copy link

@paulo-ferraz-oliveira nice! I'll get it bumped soon in the repo we had issues in. Given that the first iteration of your fix remediated our problems I'm not going to be as fast on it this time, but I expect we can be testing it this week.

@paulo-ferraz-oliveira
Copy link
Collaborator Author

@lpil, also tagging you here to see if you wanna test this on top of Gleam. 😁

Copy link
Member

@starbelly starbelly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -423,7 +423,10 @@ async function get(url0, pageIdxs) {
.get(
url,
{
headers: { 'user-agent': 'setup-beam' },
headers: {
'user-agent': 'setup-beam',
Copy link
Member

@starbelly starbelly Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we a need get_json function I suppose without the header (seemingly) for the content type a 401 will occur.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon some investigation it's not the headers, but it consistently fails for windows, all the other envs are fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange it worked at all. I skipped specifying the GITHUB_TOKEN env var in my gleam fork. So, this will have to be optional. That is, we can only conditionally add the header otherwise we'll definitely break what's out there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we a need get_json function I suppose without the header (seemingly) for the content type a 401 will occur.

I implemented jsonParse which should output the non-JSON, the exception, and error out.

This seems to affect Windows builds, so might not be the best solution,
but we're trying to isolate the issue to move forward
We still throw the error, but in a more controlled manner
Copy link
Member

@starbelly starbelly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @paulo-ferraz-oliveira !!!

@paulo-ferraz-oliveira paulo-ferraz-oliveira changed the title Prevent 403 when accessing the GitHub API Try to prevent 403 when accessing the GitHub API Jan 4, 2023
@paulo-ferraz-oliveira paulo-ferraz-oliveira merged commit 37ac1c5 into erlef:main Jan 4, 2023
@paulo-ferraz-oliveira paulo-ferraz-oliveira deleted the fix/too-many-requests-to-github branch January 4, 2023 07:28
@kivra-fabber
Copy link

Oh nice, released already 👍

I think the optional bit inside the JS can be avoided by updating https://github.com/erlef/setup-beam/blob/main/action.yml to pass ${{ github.token }} into the env of the runs script (but I haven't really authored a published GitHub action like this so not sure). As far as I can tell the token is always available there.

@starbelly
Copy link
Member

I think the optional bit inside the JS can be avoided by updating https://github.com/erlef/setup-beam/blob/main/action.yml to pass ${{ github.token }} into the env of the runs script (but I haven't really authored a published GitHub action like this so not sure). As far as I can tell the token is always available there.

I'll double check that, if we can avoid people having to modify their workflow, all the better.

@paulo-ferraz-oliveira
Copy link
Collaborator Author

if we can avoid people having to modify their workflow, all the better

I was unaware the token would sometimes not be present, but maybe I overlooked that since I added it to the tests... 😞

@starbelly
Copy link
Member

@kivra-fabber #172 Thanks for an awesome suggestion 👍

@pguyot pguyot mentioned this pull request May 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Frequent failures when HTTP-calling GitHub
3 participants