x/website: elevated rate of 502s #30619
I am seeing a relatively high frequency of 502s on the golang.org website:
It seems to be happening on previously deployed versions of golang.org, not just the current one.
The last error that https://tip.golang.org/_tipstatus ran into is also 502-related, but it occurred when cloning from go.googlesource.com/go:
I can reliably get frequent 502s from golang.org and its previously deployed versions right now, but not other sites like tour.golang.org, so it seems contained to the main website from what I can tell so far.
I remember last time something like this happened it was because the website was misconfigured regarding its use of index, and ended up pegging the CPU at 100%, etc. That's just for reference; I don't know what's happening now, so this needs more investigation.
Edit: Another recent possibly related issue on my mind is from CL 141718.
From the GCP console, I see there's now about one 5xx code out of every 50 responses (so 2%). The elevated 5xx rates began at roughly 9:06 PM eastern and have stayed consistently there.
Edit: I can also see that CPU usage, memory usage, traffic, etc., are all normal. Only the 99th percentile latency has gone up, at the same time as the 502 rate went up.
I've restarted the instance and it seems better now; I can't reproduce the 502s anymore.
It's still not as smooth as pre-9:06 pm; some occasional requests to
Edit: As of 12:36 am eastern, the rate of 502s and latency have returned to their nominal pre-9:06 pm levels and have stayed there since.
We've done more investigation here, and it turned out the root cause was external to the golang.org server. It was a temporary issue affecting another networking component that has since been resolved.
I've watched the golang.org server, and other than the elevated 502 rate during the affected period on Tuesday night (9:06 pm to 12:36 am), the issue has not re-occurred:
Closing, since there's nothing more to do here. Huge thanks to @broady for helping investigate and uncovering the source of the problem!