Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build: GitHub rate limit exceeded for various tools #31919

Open
andybons opened this issue May 8, 2019 · 9 comments

Comments

Projects
None yet
4 participants
@andybons
Copy link
Member

commented May 8, 2019

gopherbot can't make any calls to GitHub and gerritbot logs are showing that our quota is getting drained very quickly.

@bradfitz @dmitshur

@andybons andybons added this to the Unreleased milestone May 8, 2019

@andybons

This comment has been minimized.

Copy link
Member Author

commented May 8, 2019

I've disabled gopherbot, gitmirror, and gerritbot for now. There is a repeated error from gopherbot I'm looking into.

@dmitshur

This comment has been minimized.

Copy link
Member

commented May 8, 2019

repeated error from gopherbot

Issue #28320 might be related.

@gopherbot gopherbot added the Builders label May 8, 2019

@gopherbot

This comment has been minimized.

Copy link

commented May 8, 2019

Change https://golang.org/cl/176037 mentions this issue: internal/gophers: move Filippo's Gerrit email back to being first

@andybons

This comment has been minimized.

Copy link
Member Author

commented May 8, 2019

maintnerd looks to be the reason why our quota keeps getting exhausted, as it got stuck in a loop. We're out of quota again so it's another hour or so at least before that and any other dependent services are available.

@dmitshur

This comment has been minimized.

Copy link
Member

commented May 8, 2019

We've identified this is connected to the incorrect timing information recorded on some GitHub comments posted a few hours ago. GitHub has said the issue has been fixed, and they're in process of correcting inaccurate timestamps:

https://twitter.com/githubstatus/status/1126171459255185414

The current inaccurate timestamps are causing maintnerd to take incorrect actions and use up its rate limit quota. After another hour passes (unless GitHub corrects the incorrect timestamps even sooner), those comments will no longer be in the future and maintnerd should be able to run okay. So we're currently waiting a bit more.

@bradfitz

This comment has been minimized.

Copy link
Member

commented May 8, 2019

@dmitshur, did you figure out what the problem in maintnerd was?

I saw this log line a bunch:

func (p *githubRepoPoller) syncComments(ctx context.Context) error {
        for {
                nums := p.issueNumbersWithStaleCommentSync()
                if len(nums) == 0 {
                        return nil
                }
                remain := len(nums)
                for _, num := range nums {
                        p.logf("comment sync: %d issues remaining; syncing issue %v", remain, num)

But I never saw the log from the code that immediately follows:

                        if err := p.syncCommentsOnIssue(ctx, num); err != nil {
                                p.logf("comment sync on issue %d: %v", num, err)
                                return err
                        }
                        remain--
                }
        }
}

So issueNumbersWithStaleCommentSync kept returning the same number, but then it ran without returning an error, over and over. It is just:

        for n, gi := range p.gr.issues {
                if !gi.commentsSynced() {
                        issueNums = append(issueNums, n)
                }
        }

And commentsSynced is just:

func (gi *GitHubIssue) commentsSynced() bool {
        if gi.NotExist {
                // Issue doesn't exist, so can't sync its non-issues, 
                // so consider it done.
                return true
        }
        return gi.commentsSyncedAsOf.After(gi.Updated)
}

That commentsSyncedAsOf field of GithubIssue is:

        commentsSyncedAsOf time.Time                   // as of server's Date header

That's written to from:

func (c *Corpus) processGithubIssueMutation(m *maintpb.GithubIssueMutation) {
....
        if m.CommentStatus != nil && m.CommentStatus.ServerDate != nil {
                if serverDate, err := ptypes.Timestamp(m.CommentStatus.ServerDate); err == nil {
                        gi.commentsSyncedAsOf = serverDate.UTC()
                }

But why didn't make progress before in ...

func (p *githubRepoPoller) syncCommentsOnIssue(ctx context.Context, issueNum int32) error {

Because of this GitHubIssue field?

    commentsUpdatedTil time.Time       // max comment modtime seen

And syncCommentsOnIssue does:

        since := issue.commentsUpdatedTil
        p.c.mu.RUnlock()

        owner, repo := p.gr.id.Owner, p.gr.id.Repo
        morePages := true // at least try the first. might be empty.                                                                                                                                                    
        for morePages {
                ics, res, err := p.githubDirect.Issues.ListComments(ctx, owner, repo, int(issueNum), &github.IssueListCommentsOptions{
                        Since:       since,
                        Direction:   "asc",
                        Sort:        "updated",
                        ListOptions: github.ListOptions{PerPage: 100},
                })

.... but, uh, where is commentsUpdatedTil ever updated?

bradfitz@go:~/src/golang.org/x/build$ git grep commentsUpdatedTil
maintner/github.go:     commentsUpdatedTil time.Time                   // max comment modtime seen
maintner/github.go:     if gi.commentsUpdatedTil.After(ret) {
maintner/github.go:             ret = gi.commentsUpdatedTil
maintner/github.go:     since := issue.commentsUpdatedTil

Never?

Um.

@dmitshur

This comment has been minimized.

Copy link
Member

commented May 8, 2019

I misread one of the lines as gi.commentsUpdatedTil = ret, so I thought it was possible for it to get updated. You're right, it always has the zero value and never gets set to anything else. That was the case since the CL that added it (CL 38137). This issue must've been elsewhere.

gopherbot pushed a commit to golang/build that referenced this issue May 8, 2019

internal/gophers: move Filippo's Gerrit email back to being first
mergeIDs makes the assumption that the first email seen for a
gopher is the one used for Gerrit. In the last update to gophers.go,
the first email for Filippo was changed to be invalid, causing
errors when automatic assignment was attempted in gopherbot.

Update golang/go#31919

Change-Id: Ief9115d1aed8ab998e92cfed792444d1a6463df8
Reviewed-on: https://go-review.googlesource.com/c/build/+/176037
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@dmitshur

This comment has been minimized.

Copy link
Member

commented May 8, 2019

We're in the process of turning on maintnerd and other related services now.

@dmitshur dmitshur removed the Soon label May 8, 2019

@dmitshur

This comment has been minimized.

Copy link
Member

commented May 8, 2019

We've recovered all services from this outage by now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.