GitHub badges showing "Unable to select next Github token from pool" #9978

Closed
jcs090218 opened this issue Feb 18, 2024 · 16 comments
Labels
operations Hosting, monitoring, and reliability for the production badge servers

Comments

@jcs090218

Are you experiencing an issue with...

shields.io

🐞 Description

Similar to #8907.

🔗 Link to the badge

My repo https://github.com/emacs-eask/cli is getting the error.

💡 Possible Solution

No response

@jcs090218 jcs090218 added the question Support questions, usage questions, unconfirmed bugs, discussions, ideas label Feb 18, 2024
@WSTxda

WSTxda commented Feb 18, 2024

Same issue for me.

Repo: https://github.com/WSTxda/MicroG-RE

But it happens randomly.

@jcs090218
Author

Interesting... the repository https://github.com/emacs-eask/cli appears to no longer have the issue, yet now my other repository https://github.com/jcs090218/JCSUnity (which was previously functional) is encountering the same error. It appears to occur randomly. 🤔

@m4heshd

m4heshd commented Feb 19, 2024

This seems to be an issue related to GitHub's image caching system ("camo").

If you check https://github.com/m4heshd/ufc-ripper , the first badge is broken but if you visit the actual URL pointing to the image, it works.
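
For anyone who wants to check this outside the browser, here's a minimal sketch (TypeScript on Node 18+, using the built-in fetch) that requests a shields.io badge directly and looks for the error text in the returned SVG. The badge path is only an example, and the camo-proxied URL isn't reconstructed here because camo addresses images by a per-image hash on camo.githubusercontent.com.

```ts
// Minimal check of a shields.io badge, bypassing GitHub's camo cache.
// The badge path below is only an example, not necessarily the one in the README.
const badgeUrl = "https://img.shields.io/github/v/release/m4heshd/ufc-ripper";

async function checkBadge(url: string): Promise<void> {
  const res = await fetch(url);
  const svg = await res.text();

  console.log(`status: ${res.status}`);
  console.log(`cache-control: ${res.headers.get("cache-control")}`);
  // The error message is rendered into the badge SVG itself, so check the body text.
  console.log(`error badge: ${svg.includes("token from pool")}`);
}

checkBadge(badgeUrl).catch(console.error);
```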

@WSTxda

WSTxda commented Feb 19, 2024

> This seems to be an issue related to GitHub's image caching system ("camo").
>
> If you check https://github.com/m4heshd/ufc-ripper , the first badge is broken but if you visit the actual URL pointing to the image, it works.

(Screenshot: Screenshot_2024-02-18-22-08-56-525_com.android.chrome.jpg)

Actually this doesn't work in the direct link either

@m4heshd

m4heshd commented Feb 19, 2024

> Actually this doesn't work in the direct link either

Works for me. 🤔


Tried both with and without the browser cache.

@WSTxda

WSTxda commented Feb 19, 2024

(Screenshot attached.)

Same on desktop.

I think it's just random.

@m4heshd

m4heshd commented Feb 19, 2024

@WSTxda Hmm.. it might be request-based. Can you try with your browser's guest profile?

@jcs090218
Author

It's very random. I can see some of them work and some of them don't... 😕

@007revad

007revad commented Feb 19, 2024

I just refreshed the readme page for one of my repos; for a second it correctly showed the latest release version, but then it changed back to "Unable to select next GitHub token from pool".

I just tried a different browser and it correctly shows the latest release version.

I also tried an incognito Chrome window and it correctly shows the latest release version.

@pwbriggs

pwbriggs commented Feb 19, 2024

I'm seeing this error intermittently, too. Could the token pool be exhausted? (@espadrine)

Whether or not that's the cause of these errors, if you haven't already given shields.io a token, you can do that here: https://img.shields.io/github-auth. (Not actually the issue; see below.)

@chris48s
Member

Not exactly sure what happened here, but it looks like 3 of the servers in the cluster got themselves into a state where they hit a token that had maxed its rate limit and weren't able to recover and pull another token from the pool.

This is why you're seeing it intermittently: depending on which server the load balancer sent your request to, you may or may not have seen the issue.

I've just restarted all of the VMs. This should clear up as everything drops out of cache.

We haven't actually run out of tokens. We have many more than we need. Thanks.
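
To illustrate the failure mode described above, here is a simplified sketch (TypeScript; not the actual Shields implementation, and the names are made up): each server caches per-token rate-limit state learned from the GitHub API's x-ratelimit-* response headers, and if every token in a server's pool looks exhausted and that cached state never gets corrected, selection keeps failing until the process restarts and the cache is dropped.

```ts
// Simplified sketch of a rate-limited token pool -- not the actual Shields code.
interface PooledToken {
  token: string;
  remaining: number; // last known x-ratelimit-remaining
  resetAt: number;   // last known x-ratelimit-reset (epoch seconds)
}

class TokenPool {
  constructor(private tokens: PooledToken[]) {}

  // Pick any token believed to have quota left, or whose window has reset.
  next(): PooledToken {
    const now = Date.now() / 1000;
    const usable = this.tokens.find(t => t.remaining > 0 || t.resetAt <= now);
    if (!usable) {
      // If the cached remaining/reset values stop being refreshed (for example
      // because responses no longer carry x-ratelimit-* headers), the pool can
      // get stuck here even though the tokens still have quota upstream.
      throw new Error("Unable to select next GitHub token from pool");
    }
    return usable;
  }

  // Refresh the cached state from a response's rate-limit headers.
  record(token: PooledToken, headers: Headers): void {
    const remaining = headers.get("x-ratelimit-remaining");
    const reset = headers.get("x-ratelimit-reset");
    if (remaining !== null) token.remaining = Number(remaining);
    if (reset !== null) token.resetAt = Number(reset);
  }
}
```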

@chris48s
Member

OK. This problem is back. Restarting cleared it for about an hour, but now it looks like most of the VMs in the cluster have hit the same state again. Not sure what is going on, but I'm aware of it.

@gecoombs gecoombs mentioned this issue Feb 19, 2024
@chris48s
Member

OK. We're still seeing this issue intermittently.
It first started about 8-9pm UK time yesterday. Up to that point, everything was fine.
We do a lot of traffic on the GH API, but there hasn't been a big increase and the number of tokens we have in the pool should easily cover our usage many times over.
There haven't been any recent changes to the code for managing the token pool.
I'm tempted to assume there has been a change upstream to the GitHub API. One thing we've seen in the past is API responses intermittently missing x-ratelimit-* headers. However, I've tried pulling down some tokens from the pool and running thousands of requests against my local copy to see if I can get a local repro: nothing :(
For the moment, I am stumped.
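
For anyone curious what that kind of local repro attempt could look like, here is a rough sketch (TypeScript on Node 18+; GITHUB_TOKEN is a placeholder you supply, and the endpoint is an arbitrary example) that loops over authenticated API calls and flags any response that comes back without the x-ratelimit-* headers:

```ts
// Rough repro sketch: hit the GitHub API repeatedly and flag any response
// that is missing the x-ratelimit-* headers. GITHUB_TOKEN is a placeholder.
const token = process.env.GITHUB_TOKEN;

async function probe(iterations: number): Promise<void> {
  for (let i = 0; i < iterations; i++) {
    const res = await fetch("https://api.github.com/repos/badges/shields", {
      headers: { Authorization: `token ${token}` },
    });
    const remaining = res.headers.get("x-ratelimit-remaining");
    const reset = res.headers.get("x-ratelimit-reset");
    if (remaining === null || reset === null) {
      console.log(`iteration ${i}: missing rate-limit headers (status ${res.status})`);
    }
  }
}

probe(1000).catch(console.error);
```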

@chris48s chris48s added operations Hosting, monitoring, and reliability for the production badge servers and removed question Support questions, usage questions, unconfirmed bugs, discussions, ideas labels Feb 19, 2024
@imfx77

imfx77 commented Feb 19, 2024

It's weird: on the same repo, some of the badges work and some don't.
I guess it depends on which server the request goes to, because when I force-refresh the repo page, the broken badges change randomly.


@chris48s
Member

chris48s commented Feb 20, 2024

This seems to have magically fixed itself. I didn't change anything, but about 9 hours ago the "Token pool is exhausted" errors just... stopped. The last one happened at Feb 19, 2024 10:04:02 PM UTC.

Given this both happened and fixed itself with no intervention from us, I assume this was some kind of incident on the GitHub API side, but I'm still none the wiser about what happened or how we could be more resilient to it. I never managed to reproduce it.

I'm going to leave this issue open for the moment and continue to monitor it.

@chris48s
Member

This hasn't recurred, so I'm going to close it.
