Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build: GitHub rate limit exceeded for various tools #31919

Closed
andybons opened this issue May 8, 2019 · 11 comments
Closed

x/build: GitHub rate limit exceeded for various tools #31919

andybons opened this issue May 8, 2019 · 11 comments
Assignees
Milestone

Comments

@andybons
Copy link
Member

@andybons andybons commented May 8, 2019

gopherbot can't make any calls to GitHub and gerritbot logs are showing that our quota is getting drained very quickly.

@bradfitz @dmitshur

@andybons andybons added this to the Unreleased milestone May 8, 2019
@andybons
Copy link
Member Author

@andybons andybons commented May 8, 2019

I've disabled gopherbot, gitmirror, and gerritbot for now. There is a repeated error from gopherbot I'm looking into.

@dmitshur
Copy link
Member

@dmitshur dmitshur commented May 8, 2019

repeated error from gopherbot

Issue #28320 might be related.

@gopherbot gopherbot added the Builders label May 8, 2019
@gopherbot
Copy link

@gopherbot gopherbot commented May 8, 2019

Change https://golang.org/cl/176037 mentions this issue: internal/gophers: move Filippo's Gerrit email back to being first

@andybons
Copy link
Member Author

@andybons andybons commented May 8, 2019

maintnerd looks to be the reason why our quota keeps getting exhausted, as it got stuck in a loop. We're out of quota again so it's another hour or so at least before that and any other dependent services are available.

@dmitshur
Copy link
Member

@dmitshur dmitshur commented May 8, 2019

We've identified this is connected to the incorrect timing information recorded on some GitHub comments posted a few hours ago. GitHub has said the issue has been fixed, and they're in process of correcting inaccurate timestamps:

https://twitter.com/githubstatus/status/1126171459255185414

The current inaccurate timestamps are causing maintnerd to take incorrect actions and use up its rate limit quota. After another hour passes (unless GitHub corrects the incorrect timestamps even sooner), those comments will no longer be in the future and maintnerd should be able to run okay. So we're currently waiting a bit more.

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented May 8, 2019

@dmitshur, did you figure out what the problem in maintnerd was?

I saw this log line a bunch:

func (p *githubRepoPoller) syncComments(ctx context.Context) error {
        for {
                nums := p.issueNumbersWithStaleCommentSync()
                if len(nums) == 0 {
                        return nil
                }
                remain := len(nums)
                for _, num := range nums {
                        p.logf("comment sync: %d issues remaining; syncing issue %v", remain, num)

But I never saw the log from the code that immediately follows:

                        if err := p.syncCommentsOnIssue(ctx, num); err != nil {
                                p.logf("comment sync on issue %d: %v", num, err)
                                return err
                        }
                        remain--
                }
        }
}

So issueNumbersWithStaleCommentSync kept returning the same number, but then it ran without returning an error, over and over. It is just:

        for n, gi := range p.gr.issues {
                if !gi.commentsSynced() {
                        issueNums = append(issueNums, n)
                }
        }

And commentsSynced is just:

func (gi *GitHubIssue) commentsSynced() bool {
        if gi.NotExist {
                // Issue doesn't exist, so can't sync its non-issues, 
                // so consider it done.
                return true
        }
        return gi.commentsSyncedAsOf.After(gi.Updated)
}

That commentsSyncedAsOf field of GithubIssue is:

        commentsSyncedAsOf time.Time                   // as of server's Date header

That's written to from:

func (c *Corpus) processGithubIssueMutation(m *maintpb.GithubIssueMutation) {
....
        if m.CommentStatus != nil && m.CommentStatus.ServerDate != nil {
                if serverDate, err := ptypes.Timestamp(m.CommentStatus.ServerDate); err == nil {
                        gi.commentsSyncedAsOf = serverDate.UTC()
                }

But why didn't make progress before in ...

func (p *githubRepoPoller) syncCommentsOnIssue(ctx context.Context, issueNum int32) error {

Because of this GitHubIssue field?

    commentsUpdatedTil time.Time       // max comment modtime seen

And syncCommentsOnIssue does:

        since := issue.commentsUpdatedTil
        p.c.mu.RUnlock()

        owner, repo := p.gr.id.Owner, p.gr.id.Repo
        morePages := true // at least try the first. might be empty.                                                                                                                                                    
        for morePages {
                ics, res, err := p.githubDirect.Issues.ListComments(ctx, owner, repo, int(issueNum), &github.IssueListCommentsOptions{
                        Since:       since,
                        Direction:   "asc",
                        Sort:        "updated",
                        ListOptions: github.ListOptions{PerPage: 100},
                })

.... but, uh, where is commentsUpdatedTil ever updated?

bradfitz@go:~/src/golang.org/x/build$ git grep commentsUpdatedTil
maintner/github.go:     commentsUpdatedTil time.Time                   // max comment modtime seen
maintner/github.go:     if gi.commentsUpdatedTil.After(ret) {
maintner/github.go:             ret = gi.commentsUpdatedTil
maintner/github.go:     since := issue.commentsUpdatedTil

Never?

Um.

@dmitshur
Copy link
Member

@dmitshur dmitshur commented May 8, 2019

I misread one of the lines as gi.commentsUpdatedTil = ret, so I thought it was possible for it to get updated. You're right, it always has the zero value and never gets set to anything else. That was the case since the CL that added it (CL 38137). This issue must've been elsewhere.

gopherbot pushed a commit to golang/build that referenced this issue May 8, 2019
mergeIDs makes the assumption that the first email seen for a
gopher is the one used for Gerrit. In the last update to gophers.go,
the first email for Filippo was changed to be invalid, causing
errors when automatic assignment was attempted in gopherbot.

Update golang/go#31919

Change-Id: Ief9115d1aed8ab998e92cfed792444d1a6463df8
Reviewed-on: https://go-review.googlesource.com/c/build/+/176037
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@dmitshur
Copy link
Member

@dmitshur dmitshur commented May 8, 2019

We're in the process of turning on maintnerd and other related services now.

@dmitshur dmitshur removed the Soon label May 8, 2019
@dmitshur
Copy link
Member

@dmitshur dmitshur commented May 8, 2019

We've recovered all services from this outage by now.

@gopherbot
Copy link

@gopherbot gopherbot commented Sep 12, 2019

Change https://golang.org/cl/195062 mentions this issue: internal/gophers: restore valid Gerrit emails (again)

gopherbot pushed a commit to golang/build that referenced this issue Sep 18, 2019
This is a cherry-pick of CL 135456 that restores Gerrit emails
for some people that were incorrectly changed in CL 165639, with
manual no-op addPerson line merges to address code review comments.

The cherry-pick applied very cleanly with just two minor merge
conflicts: one due to a Filippo's email already being fixed in
CL 176037, and another due to a close new entry.

Add tests to catch this from happening again, until the source
of the problem is resolved in issue golang/go#34259.

Updates golang/go#34259
Updates golang/go#28320
Updates golang/go#31919
Updates golang/go#27517
Updates golang/go#27631

Change-Id: Ia03a2b94403334d3f571ac5623e12d3bfd6f1e4f
Reviewed-on: https://go-review.googlesource.com/c/build/+/195062
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
codebien added a commit to codebien/build that referenced this issue Nov 13, 2019
This is a cherry-pick of CL 135456 that restores Gerrit emails
for some people that were incorrectly changed in CL 165639, with
manual no-op addPerson line merges to address code review comments.

The cherry-pick applied very cleanly with just two minor merge
conflicts: one due to a Filippo's email already being fixed in
CL 176037, and another due to a close new entry.

Add tests to catch this from happening again, until the source
of the problem is resolved in issue golang/go#34259.

Updates golang/go#34259
Updates golang/go#28320
Updates golang/go#31919
Updates golang/go#27517
Updates golang/go#27631

Change-Id: Ia03a2b94403334d3f571ac5623e12d3bfd6f1e4f
Reviewed-on: https://go-review.googlesource.com/c/build/+/195062
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@dmitshur
Copy link
Member

@dmitshur dmitshur commented Feb 27, 2020

This specific outage has been resolved, and this issue isn't actionable, so I'll close it. We can open new issues if there is specific work that needs to be done.

@dmitshur dmitshur closed this Feb 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.