Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build/cmd/coordinator: use maintner to start trybots, not polling gerrit API #20806

Open
bradfitz opened this issue Jun 27, 2017 · 4 comments

Comments

@bradfitz
Copy link
Member

commented Jun 27, 2017

Currently Trybots start by a goroutine loop in the coordinator (findTryWorkLoop) running a Gerrit query every 60 seconds:

func findTryWorkLoop() {
        if errTryDeps != nil {
                return
        }       
        ticker := time.NewTicker(60 * time.Second)
        for {
                if err := findTryWork(); err != nil {
                        log.Printf("failed to find trybot work: %v", err)
                }
                <-ticker.C
        }
}

func findTryWork() error {
        query := "label:Run-TryBot=1 label:TryBot-Result=0 status:open"
...
        cis, err := gerritClient.QueryChanges(context.Background(), query, gerrit.QueryChangesOpt{
                Fields: []string{"CURRENT_REVISION", "CURRENT_COMMIT", "CURRENT_FILES"},
        })
...

This sucks for two reasons:

  1. It runs every 60 seconds, wasting on average 30 seconds and at worst 60 seconds of our overall 5 minute goal for Trybots.
  2. If a new CL is uploaded in the meantime, Gerrit removes +1 votes and the CL is treated as if it's no longer desired to be a trybot job. (see confusion in @shurcooL's https://golang.org/cl/46715).

We should fix both at the same time by watching maintner instead and using the live Gerrit data and knowing the history of the vote.

/cc @andybons @shurcooL

@gopherbot gopherbot added this to the Unreleased milestone Jun 27, 2017

@gopherbot gopherbot added the Builders label Jun 27, 2017

@dmitshur

This comment has been minimized.

Copy link
Member

commented Jun 27, 2017

It runs every 60 minutes

60 seconds, yes?

@bradfitz

This comment has been minimized.

Copy link
Member Author

commented Jun 27, 2017

Yes, 60 seconds

@SCKelemen

This comment was marked as off-topic.

Copy link
Contributor

commented Jul 7, 2017

@bradfitz

This comment has been minimized.

Copy link
Member Author

commented Apr 2, 2019

This is mostly done, but it turns out Gerrit doesn't have enough information in its NoteDb git structure for us to do our own vote counting reliably.

So the implementation of the MaintnerService.GoFindTryWork ends up using a mix of maintner & Gerrit API calls to get this data.

If we wanted to hard-code the Run-TryBot vote policy in maintner we could get rid of the API call altogether but I had got sidetracked trying to make label counting work in the generic case. The problem is that Gerrit has this option:

https://gerrit-review.googlesource.com/Documentation/config-labels.html#label_copyAllScoresOnTrivialRebase

label.Label-Name.copyAllScoresOnTrivialRebase

If true, all scores for the label are copied forward when a new patch set is uploaded that is a trivial rebase. A new patch set is considered as trivial rebase if the commit message is the same as in the previous patch set and if it has the same code delta as the previous patch set. This is the case if the change was rebased onto a different parent, or if the parent did not change at all.

But it doesn't record in NoteDb whether something's trivial and computing that by hand is not trivial. So when we're walking the NoteDb structure and counting votes, we don't know whether to reset something unless we have the same "trivial rebase" policy as Gerrit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.