Try run details are not being added to a bug #201

tomrittervg · 2021-09-30T16:03:58Z

Observed in https://bugzilla.mozilla.org/show_bug.cgi?id=1733013

The problem sequence is:

Library has new release A.
We notice it, process it, file a bug, send in a try run
We cron again, and maybe we need to retrigger jobs so we do
Library has new release B
We cron again, and see the new library release.
We do not find an existing job for this new release.
We process the new job

At this point whether or not we process the new release we will not make any follow up comment on the old release's bug. If we do process the new release we will close the old bug. If we don't, we leave it up but don't comment.

If we have never summarized the try run results, we should do so and add them to the bug - whether or not the bug is open or it is the newest release. Before summarizing them, we should retrigger them if needed. (Not retriggering them may make some sense but it's easier to not complicated the control flow and I think retriggering them also makes some sense.)

tomrittervg · 2021-09-30T16:27:02Z

The whole clean_up_old_job stuff is already messy and this has the distinct possibility of making it messier. So we need to plan this patch out a bit.

We've got a few aspects of Updatebot this interacts with:

The clean_up_old_job logic inside of _process_new_job
The notion that we can only have one Updatebot job 'running' (i.e. not in the DONE state at any one time)
The notion of STATUS and OUTCOMES
Especially the special status of RELINQUISHED

Here are some assertions we've made that we're going to start violating. We'll need to change our code to accept and deal with these new violations:

Multiple Updatebot jobs may be 'running' (i.e. Not Done.)
- However, only one of them (call if N) should be considered 'tip'. When we create a new job (call it M), we should close N's bug in favor of M's.
A job that has been relinquished may still need to be processed.
- Relinquished can (and should) be thought of as "Do not modify this bug's Open/Close state". Not "Do nothing with this job".
A job can have the outcome of ABORTED
- We will no longer abort jobs.

tomrittervg · 2022-06-27T20:31:22Z

There seems to be three aspects to this work.

Change what we do w.r.t. prior jobs when we see a new revision to process.
Change how we find existing jobs that we need to take action on.
Decide what things we will do with an existing job depending on its state.

(2) seems easy. For each library that has Updatebot tasks, we can just ask the database for all the existing jobs for that library, and iterate through them, taking actions. (What actions to be decided by (3).) We should do this before we look for any new revisions, that way we don't have to exempt a job we just created.

(3) To start with, let's outline the actions we could take for a job:

we could close its bugzilla bug and mark it a duplicate of a newer bug.
we can the firefox versions affected by it and mark that on the bugzilla bug
if we're in a two-platform configuration, we can trigger the second set of platforms to run
we can retrigger failed tests
we can summarize the try run and post it as a comment
we can flag a patch for review
we can flag a bug for needinfo
we can assign a bug
we can abandon a phabricator revision
we can update its state in the database
we can do nothing to it

And let's define how we know a job's state:

the bug could be open or closed.
there could be a new library revision that has superseeded it (but we won't know that unless we check for a new revision before processing existing jobs...)
we could have marked its state in the database in some way

The problem we want to avoid is what Relinquished was designed to solve: we absolutely don't want Updatebot and Humans getting into a resolution-flipping loop where Updatebot closes something, a human re-opens it, and Updatebot sees it's open and tries to close it. I do not think we can reliably use Bugzilla status to determine this, so we need to have some indicator in the Updatebot database. For the sake of continuity, I'm going to refer to these jobs as Relinquished jobs for now...

The rules I am seeing emerge from this are:

a. If we have a new revision, and the prior job is not Relinquished and the job's bug is Open, we should abandon the patch, close the bug, supersede it by a new bug, and mark it as Relinquished in the database. (An abandoned patch can be commandeered and reclaimed, although this will require a second human to signoff on the patch.)
b. If a job's bug is closed, the only two things we should do are to summarize any existing try runs if we haven't already and abandon the revision (if the revision is still open). We should not trigger any new taskcluster jobs, update affected versions, mark it as a dupe, or place any flags. Summarizing the try runs ideally includes adding text saying that we would have done something had it not been closed (like retrigger failures.)
c. If a job's bug is open, we could mark FF versions as affected, trigger new taskcluster jobs, summarize the try run and post a comment, place flags (review, needinfo), and assign it.

Relinquished status only affects whether or not we will do the two-step process of closing a bug and abandoning a revision. It will not affect any other aspect of behavior. A job that has been relinquished can still do all the actions indicated in (b) or (c).

tomrittervg · 2022-07-14T19:08:07Z

I got most of this written, but then hit an unfortunate situation.

Job A runs, sends it to try.
A new job, Job B, runs before Job A has try results.
Job B supersedes Job A, closes its bug, abandons the phab revision, updates Job A's outcome to aborted and relinquishes Job A.
Job A now has a status of 'Pending' but an outcome of 'Aborted'.

We can't change its status - we need it to process the try run results later.

I think the solution to this is to remove the 'Aborted' outcome. It seems like it would be helpful to know that a job was actively superseded by another bug, but in practice I don't think we care about that outcome at all.

tomrittervg added the Medium label Sep 30, 2021

tomrittervg added this to the v1 milestone Sep 30, 2021

tomrittervg added Complicated++ and removed Medium labels Dec 20, 2021

tomrittervg removed this from the v1 milestone Dec 20, 2021

tomrittervg mentioned this issue Mar 23, 2022

Why didn't updatebot produce a summary for 1758482 #231

Closed

tomrittervg mentioned this issue Jun 27, 2022

Do not resolve fixed bugs as duplicates #252

Closed

tomrittervg mentioned this issue Jun 29, 2022

The reviewer is not set in phab #254

Closed

tomrittervg mentioned this issue Jul 20, 2022

20220 07 13 status refactor #262

Merged

tomrittervg closed this as completed Aug 3, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Try run details are not being added to a bug #201

Try run details are not being added to a bug #201

tomrittervg commented Sep 30, 2021

tomrittervg commented Sep 30, 2021

tomrittervg commented Jun 27, 2022

tomrittervg commented Jul 14, 2022

Try run details are not being added to a bug #201

Try run details are not being added to a bug #201

Comments

tomrittervg commented Sep 30, 2021

tomrittervg commented Sep 30, 2021

tomrittervg commented Jun 27, 2022

tomrittervg commented Jul 14, 2022