x/build: gitmirror or maintnerd consumed all our Gerrit quota #23853

Closed
bradfitz opened this issue Feb 15, 2018 · 8 comments

3 participants
@bradfitz
Member

commented Feb 15, 2018

Maintner is blocked:

2018/02/15 16:25:01 gerrit go.googlesource.com/sublime-config: sync: git fetch origin: exit status 128, fatal: remote error: Daily ls-remote rate limit exceeded for IP 35.188.125.9
2018/02/15 16:25:01 IS TEMP ERROR? *errors.errorString git fetch origin: exit status 128, fatal: remote error: Daily ls-remote rate limit exceeded for IP 35.188.125.9
2018/02/15 16:25:01 Temporary error from gerrit go.googlesource.com/sublime-config: git fetch origin: exit status 128, fatal: remote error: Daily ls-remote rate limit exceeded for IP 35.188.125.9
2018/02/15 16:25:02 gerrit go.googlesource.com/term: sync: git fetch origin: exit status 128, fatal: remote error: Daily ls-remote rate limit exceeded for IP 35.188.125.9
2018/02/15 16:25:02 IS TEMP ERROR? *errors.errorString git fetch origin: exit status 128, fatal: remote error: Daily ls-remote rate limit exceeded for IP 35.188.125.9
2018/02/15 16:25:02 Temporary error from gerrit go.googlesource.com/term: git fetch origin: exit status 128, fatal: remote error: Daily ls-remote rate limit exceeded for IP 35.188.125.9
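The log above shows the fetch failing with a quota message that the sync loop then reports as a temporary error. As a rough sketch only, here is one way a caller could recognize this particular failure and lengthen its wait between attempts. The function names, the substring match, and the backoff values are all hypothetical and are not taken from maintner or gitmirror.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// isRateLimitError reports whether git's stderr output looks like the
// quota failure quoted in the log above. Matching on message text is an
// assumption made for illustration, not how maintner classifies errors.
func isRateLimitError(stderr string) bool {
	return strings.Contains(stderr, "rate limit exceeded")
}

// nextBackoff doubles the previous delay and caps it at limit.
// The one-minute starting delay is a hypothetical policy.
func nextBackoff(prev, limit time.Duration) time.Duration {
	if prev <= 0 {
		return time.Minute
	}
	if next := prev * 2; next < limit {
		return next
	}
	return limit
}

func main() {
	msg := "fatal: remote error: Daily ls-remote rate limit exceeded for IP 35.188.125.9"
	delay := time.Duration(0)
	for attempt := 0; attempt < 3; attempt++ {
		if isRateLimitError(msg) {
			delay = nextBackoff(delay, time.Hour)
			fmt.Println("rate limited; would wait", delay)
		}
	}
}
```

Backing off does not restore a daily quota, but it avoids spending further requests against a limit that is already exhausted.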

So now all bots are down.

GerritBot consumed all of our Gerrit quota.

@andybons, please stop GerritBot and/or increase our Gerrit quota.

@bradfitz bradfitz added the NeedsFix label Feb 15, 2018

@bradfitz bradfitz added this to the Soon milestone Feb 15, 2018

@gopherbot gopherbot added the Builders label Feb 15, 2018

@andybons andybons changed the title from "x/build: GerritBot consumed all our Gerrit quota and starved maintner" to "x/build: gitmirror or maintnerd consumed all our Gerrit quota" Feb 15, 2018

@andybons

Member

commented Feb 15, 2018

Our ls-remote quota has been reset. After investigating with @bradfitz, the requests were coming from either maintnerd or gitmirror, not GerritBot.

@bradfitz

Member Author

commented Feb 15, 2018

Sorry, I by default blame whatever changed last. I'm still debugging.

@bradfitz

Member Author

commented Feb 15, 2018

According to the logs in the Cloud Logging console (which captures all the GKE pod output), the errors started at "2018-02-15T13:27:16.527663505Z", or 5:27am Pacific: "2018-02-15 05:27:14.000 PST".

I'll look at logs just before that to see if it's obvious what went crazy.

@andybons

Member

commented Feb 15, 2018

Based on the error (which reports the IP address), we’re not authenticating our git requests somewhere. If we authenticate them, we’ll get better error logging and a much higher quota.

@golang golang deleted a comment from timendez Feb 15, 2018

@golang golang deleted a comment from andybons Feb 15, 2018

@golang golang deleted a comment from timendez Feb 15, 2018

@bradfitz

Member Author

commented Feb 15, 2018

@andybons, will do. We historically never did, because we were bound to 1 IP address and got high quota for that IP. Now that we bounce around k8s nodes, I'll make them authenticate.

@gopherbot


commented Feb 16, 2018

Change https://golang.org/cl/94836 mentions this issue: internal/gitauth: new package to write out git cookies file

@bradfitz

Member Author

commented Feb 16, 2018

From another bug:

We recently upgraded our Kubernetes cluster and our 10 random container jobs re-laid themselves out onto our 4 physical nodes (i.e. 4 egress IP addresses).

We got lucky before and our 3 gerrit-hitting jobs were using different nodes (different IPs).

But after this latest upgrade, all 3 are on the same node, so we burn through that IP's quota by the end of the day.

That's the current theory.
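To put rough numbers on that theory, the sketch below enumerates every way 3 jobs can land on 4 nodes and counts the layouts where all three share a node versus the layouts where each has its own. It assumes each job is placed uniformly and independently, which the real Kubernetes scheduler does not promise, and it ignores the other jobs in the cluster; `countPlacements` is a hypothetical helper written for this illustration.

```go
package main

import "fmt"

// countPlacements enumerates every assignment of jobs to nodes and counts
// how many assignments put all jobs on one node (allSame) and how many
// give every job a node of its own (allDistinct).
func countPlacements(jobs, nodes int) (total, allSame, allDistinct int) {
	assign := make([]int, jobs)
	var walk func(i int)
	walk = func(i int) {
		if i == jobs {
			total++
			seen := make(map[int]bool)
			for _, n := range assign {
				seen[n] = true
			}
			if len(seen) == 1 {
				allSame++
			}
			if len(seen) == jobs {
				allDistinct++
			}
			return
		}
		for n := 0; n < nodes; n++ {
			assign[i] = n
			walk(i + 1)
		}
	}
	walk(0)
	return
}

func main() {
	total, same, distinct := countPlacements(3, 4)
	fmt.Printf("placements=%d allSame=%d allDistinct=%d\n", total, same, distinct)
	// prints "placements=64 allSame=4 allDistinct=24"
}
```

Under those assumptions, the three Gerrit-hitting jobs end up on separate IPs in 24 of 64 layouts (37.5%) and all on one IP in 4 of 64 (6.25%), so relying on per-IP quota depends on where the scheduler happens to put them.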

I'm adding auth now.
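The change referenced in this thread adds a package that writes out a git cookies file, but its code is not shown here. What follows is only a sketch of the general technique: write one cookie entry in the Netscape cookie-file format for the Gerrit host, then point git at the file through its `http.cookiefile` setting. The function names, the cookie name `o`, the far-future expiry, and the placeholder credential are assumptions made for illustration, not the contents of internal/gitauth.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// cookieLine formats one Netscape-style cookie entry: domain, subdomain
// flag, path, secure flag, expiry, cookie name, and value. The name "o"
// and the expiry value are assumptions for this sketch.
func cookieLine(host, credential string) string {
	fields := []string{host, "FALSE", "/", "TRUE", "2147483647", "o", credential}
	return strings.Join(fields, "\t") + "\n"
}

// writeCookieFile writes the entry to dir/.gitcookies with owner-only
// permissions and returns the file's path.
func writeCookieFile(dir, host, credential string) (string, error) {
	path := filepath.Join(dir, ".gitcookies")
	if err := os.WriteFile(path, []byte(cookieLine(host, credential)), 0600); err != nil {
		return "", err
	}
	return path, nil
}

func main() {
	// A temporary directory keeps the sketch from touching a real home directory.
	dir, err := os.MkdirTemp("", "gitcookies-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path, err := writeCookieFile(dir, "go.googlesource.com", "PLACEHOLDER-CREDENTIAL")
	if err != nil {
		panic(err)
	}
	// git would then be told to use the file with:
	//   git config --global http.cookiefile <path>
	fmt.Println("wrote", path)
}
```

With a cookie file in place, requests are attributed to an account rather than to whichever egress IP the pod happens to use, which is the property the thread is after.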

gopherbot pushed a commit to golang/build that referenced this issue Feb 16, 2018

internal/gitauth: new package to write out git cookies file
Then use it from gitmirror and maintnerd.

Updates golang/go#23853

Change-Id: I8112f004638667894676c04fa218a7ced10422ac
Reviewed-on: https://go-review.googlesource.com/94836
Reviewed-by: Andrew Bonventre <andybons@golang.org>
@bradfitz

Member Author

commented Apr 6, 2018

This happened.

@bradfitz bradfitz closed this Apr 6, 2018

@bradfitz bradfitz added the Soon label May 17, 2018

@golang golang locked and limited conversation to collaborators May 17, 2019
