x/build/maintner: reports inconsistent world state (e.g., issue state vs issue events) during short windows of time #28226

Open
dmitshur opened this Issue Oct 16, 2018 · 5 comments

dmitshur commented Oct 16, 2018

Problem

A program that fetches a maintner corpus and uses its data to make decisions may make a mistake, because the world view is inconsistent during short windows of time. Even though the windows are short, any daemon that loops, updating the corpus and making decisions immediately after each update, is guaranteed to hit one eventually.

The most visible high-level example of this is #21312.

Cause

This happens because there are effectively two GitHub data sources that are not synchronized:

  1. changes to GitHub state (e.g., issue N now has labels X, Y, Z)
  2. GitHub-generated events (e.g., issue N has had an "unlabeled" event)

To give a concrete example of an inconsistent state that maintner can report, consider when an issue has just been unlabeled. The first mutation received and processed by a corpus.Update call will be that the issue no longer has that label.

The mutation reporting that there has been an unlabeled event on the same issue may come in a few seconds later. Until it does, it will appear that the issue does not have said label and that it has never been unlabeled (e.g., !gi.HasLabel("Documentation") && !gi.HasEvent("unlabeled") will be true). That does not match reality (if one considers reality to be one where the unlabeled event and its effect happen simultaneously).
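To make the window concrete, here is a minimal simulation of the two mutations arriving in order. The `issue` type and the `inconsistent` helper are hypothetical stand-ins written for this sketch; they only mimic the HasLabel and HasEvent checks quoted above and are not maintner's actual types.

```go
package main

import "fmt"

// issue is a hypothetical, simplified stand-in for a maintner issue.
// It is not the real maintner type.
type issue struct {
	labels map[string]bool
	events []string // event types, in the order they were received
}

func (i *issue) HasLabel(name string) bool { return i.labels[name] }

func (i *issue) HasEvent(eventType string) bool {
	for _, e := range i.events {
		if e == eventType {
			return true
		}
	}
	return false
}

// inconsistent reports whether the issue lacks the label while no
// "unlabeled" event has been recorded for it.
func inconsistent(i *issue, label string) bool {
	return !i.HasLabel(label) && !i.HasEvent("unlabeled")
}

func main() {
	gi := &issue{
		labels: map[string]bool{"Documentation": true},
		events: []string{"labeled"},
	}
	fmt.Println(inconsistent(gi, "Documentation")) // false: label still present

	// t0: the state mutation arrives first and removes the label.
	delete(gi.labels, "Documentation")
	fmt.Println(inconsistent(gi, "Documentation")) // true: inside the window

	// t1: the event mutation arrives a few seconds later.
	gi.events = append(gi.events, "unlabeled")
	fmt.Println(inconsistent(gi, "Documentation")) // false: consistent again
}
```

Any decision taken between t0 and t1 sees the middle result.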

Details

These are two distinct mutations received and processed by the corpus.Update method:

received mutation at time t0:
github_issue: <
  owner: "golang"
  repo: "go"
  number: 28103
  updated: <
    seconds: 1539629204
  >
  remove_label: 223401461
>

... (short window during which the issue doesn't have a label,
     but the accompanying "unlabeled" event hasn't been received yet;
     aka an inconsistent world state)

received mutation at time t1:
github_issue: <
  owner: "golang"
  repo: "go"
  number: 28103
  event: <
    id: 1904921842
    event_type: "unlabeled"
    actor_id: 1924134
    created: <
      seconds: 1539629204
    >
    label: <
      name: "Builders"
    >
  >
  event: <
    id: 1904921913
    event_type: "labeled"
    actor_id: 8566911
    created: <
      seconds: 1539629206
    >
    label: <
      name: "Builders"
    >
  >
  event_status: <
    server_date: <
      seconds: 1539629209
    >
  >
>

There is more relevant information in #21312 (comment).

/cc @bradfitz

gopherbot commented Oct 16, 2018

Change https://golang.org/cl/142362 mentions this issue: cmd/gopherbot: reduce gardening reaction time

@dmitshur dmitshur changed the title x/build/maintner: reports inconsistent world state during short windows of time x/build/maintner: reports inconsistent world state (e.g., issue state vs issue events) during short windows of time Oct 20, 2018

orthros commented Oct 29, 2018

When working on other issues, I saw that GitHub introduced a "unified" timeline for events on an issue, the Timeline API. I understand that it is still in beta (since 2016) and that adopting it would be a major change, but it might help fix this issue by providing a single source of truth on a GitHubIssue.

dmitshur commented Oct 29, 2018

@orthros Thanks for pointing that out. The Timeline API can indeed be helpful for eliminating races between issue comments, events, and PR reviews (for #21086).

Something to be mindful of is that it may not, on its own, be enough to solve the most important race: between the issue state (whether it's open or closed, which labels it has applied) and events. That is, unless we use the events to deduce the state, rather than querying state separately. (But that can be done independently of using the Timeline API.)
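As a rough illustration of deducing state from events, the sketch below replays "labeled" and "unlabeled" events in chronological order to compute the current label set. The `event` type and `labelsFromEvents` function are hypothetical names for this example, not maintner API, and the approach assumes the event list is complete.

```go
package main

import (
	"fmt"
	"sort"
)

// event is a hypothetical, minimal view of a GitHub issue event.
type event struct {
	Type    string // "labeled" or "unlabeled"; other types are ignored
	Label   string
	Created int64 // Unix seconds
}

// labelsFromEvents replays events in chronological order and returns the
// resulting label names, sorted alphabetically.
func labelsFromEvents(events []event) []string {
	sorted := append([]event(nil), events...) // copy so the input is not reordered
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Created < sorted[j].Created
	})
	set := make(map[string]bool)
	for _, e := range sorted {
		switch e.Type {
		case "labeled":
			set[e.Label] = true
		case "unlabeled":
			delete(set, e.Label)
		}
	}
	names := make([]string, 0, len(set))
	for n := range set {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	// The two events from the mutation received at time t1.
	events := []event{
		{Type: "unlabeled", Label: "Builders", Created: 1539629204},
		{Type: "labeled", Label: "Builders", Created: 1539629206},
	}
	fmt.Println(labelsFromEvents(events)) // [Builders]
}
```

Because state and history come from the same source here, they cannot disagree with each other, although both can still lag behind GitHub.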

Also, for information, the Timeline API is indeed in preview, and in my experience using it, it had some data gap edge cases where I had to fall back to querying reviews separately (e.g., see here). It may have been resolved by now, but it's worth being aware of. It seems there are 2 Timeline APIs in GitHub API v4 (PullRequestTimelineConnection and PullRequestTimelineItemsConnection, the latter being a part of a preview API), in addition to the Timeline API in GitHub API v3 (https://developer.github.com/v3/issues/timeline/).

andybons commented Dec 22, 2018

It’s not just short windows of time. There are some issues that have events missing within the maintner corpus. This makes it impossible to create an accurate milestone burndown chart where you want to query for the state of an issue at a particular time window. (/cc @griesemer).

A few examples of issues in maintner that have incomplete event lists:

=== Issue events for golang.org/issues/28559
             labeled	milestone:         	label:Testing
             labeled	milestone:         	label:help wanted
             labeled	milestone:         	label:OS-OpenBSD
             labeled	milestone:         	label:Builders
             labeled	milestone:         	label:NeedsInvestigation
          milestoned	milestone:   Go1.12	label:

It does not record the final “closed” event: https://api.github.com/repos/golang/go/issues/28559/events

=== Issue events for golang.org/issues/28306
           mentioned	milestone:         	label:
          subscribed	milestone:         	label:
           mentioned	milestone:         	label:
          subscribed	milestone:         	label:
            assigned	milestone:         	label:
             labeled	milestone:         	label:Documentation
             labeled	milestone:         	label:NeedsInvestigation
          milestoned	milestone:   Go1.12	label:
             renamed	milestone:         	label:

The above event log is missing a few milestone-related events: https://api.github.com/repos/golang/go/issues/28306/events

dmitshur commented Dec 22, 2018

@andybons That sounds like a valid issue that is related, but not the same as this one. I see these two issues:

  1. short windows of time where the world state is incorrect because separate sources of data are not synchronized (the issue described in the original report)
  2. some issue events are permanently missing (the issue you described)

Mind opening a separate issue for it? The reason I suggest that is because I expect the fix for one will not resolve the other, and vice versa. Thanks!
