Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor to improve synchronization and testability #45

Draft
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

bryanburgers
Copy link
Contributor

@bryanburgers bryanburgers commented Jun 7, 2019

A start to the task of refactoring to improve synchronization and add more testing to Homu. More information about the target end goal can be found at the Proof-of-Concept PR #44

  • Test command parsing logic (Test the command parsing logic #32)
  • Test authorization code
  • Pull PullReqState into its own file
  • Use Timeline events to perform the initial synchronization
  • Add tests for timeline event synchronization
  • Track state for multiple tries
  • Migrate webhook handling code to use same API as initial synchronization

@bryanburgers bryanburgers changed the title Refactor synchronization Refactor to improve synchronization and testability Jun 7, 2019
@bryanburgers
Copy link
Contributor Author

Making progress! I now have Homu synchronizing its initial state from GitHub Timeline Events in a pretty accurate way. A few differences yet where I need to track down what the actual expected state should be.

Current Homu:

image

Local Homu with these changes:

image

Anybody can pull this branch and try it locally. Some notes about that:

  • It only does the synchronization. Because it won't be set up for webhooks, it will not keep in sync at this point.
  • You can use any access token from any user. I'd suggest not using an access token for a user that has access to rust-lang/rust, just in case there's a bug somewhere that would affect the real functioning of bors.

Pull code that creates a comment on a pull request out of the
authorization check, and add tests around the authorization checks.

After pulling out the code that creates a comment, we still need to know
what the text of the comment will be, so change from a function that
returns a True on success and a False on failure to a function that
returns a True on success and raises an Exception (with the failure
comment) on failure.
Remove the free function `db_query` in favor of a `LockingDatabase`
class that wraps a database connection.

This is done so that, in order for a class to use the database, it only
needs its `db` instance, instead of needing both a `db` instance and a
`db_query` function.

This allows us to break out classes into separate files.
Extract PullReqState into its own file for code readability.

Because PullReqState shares some constants with main and server, extract
those constants into a `consts.py` file as well.
For some critical comments, Homu adds extra information about its state
in the form of a JSON blob to the comment that isn't visible to the user
but is visible in the source for the comment. For example, Homu may
leave a comment like the following, where the JSON blob is not visible
because of the `<!-- .. -->` markdown/html comments.

    ⌛ Trying commit abcdef with merge 012345...

    <!-- homu: {"type":"TryBuildStarted","head_sha":"abcdef","merge_sha":"012345"} -->

This change parses this extra information out of the comments and makes
it available to the initial synchronization algorithm.
Create the general structure for `process_event` and its testing, and
get a long way toward testing approval comments.
Break up the "status" field into multiple orthogonal state fields:

* build state (whether the primary is running or has succeeded or
  failed)
* try state (whether the most recent try is running or has succeeded or
  failed)
* approval state (whether the pull request is approved)

Previously (and still) these were mostly possible to determine by
looking at `state.get_status()` and `state.try_`, but storing them
separately helps make state changes more explicit.

Also, keep track of the current github synchronization cursor in the
pull request state, so that we can use it later.
Test that issuing a `@bors retry` command moves the state from 'pending'
to '' for pending pull requests. This is frequently used as a way to
yield the current build to a different pull request.
Homu creates a message when a try or build timeout occurs. Handle this
to keep the state properly updated.
@bryanburgers
Copy link
Contributor Author

bryanburgers commented Jul 10, 2019

Status update

I've been running a "follower" against the rust-lang/rust repo using this new sync method for a couple of days now. Every minute or so, it grabs all of the PRs from the API and resyncs them from the previous sync point to current.

https://homu.burgers.io/queue/rust

Takeaways:

  • I'm frequently seeing it stay in sync! Which means that this method is definitely working as expected.
  • This has actually been a very effective way to find edge cases that I wasn't aware of. Whenever the follower doesn't match the current version (https://buildbot2.rust-lang.org/homu/queue/rust), I figure out why and adjust the synchronization code.
    • Unfortunately, that won't work so well when I need to start reacting to changes and issuing comments, at which point I'll need to drop back down to test repos
  • current version often doesn't have ALL the open PRs. It appears that on initial sync, it might only synchronize the newest 100 open pull requests (?)
    • This doesn't seem to matter much in practice. Old PRs end up showing up after they get updated anyway.
  • I often see differences in "Mergeable" and "Assignee" columns. It looks like current version doesn't always stay in sync for those fields
  • Rate limit: GitHub v4 has a 5000 "cost" rate limit. Despite pulling changes every minute, I've only ever used 398 of those (still have 4602 remaining) before the rate limit resets. So this method doesn't appear like it will run into issues running up against the rate limit.

@Mark-Simulacrum
Copy link
Member

I think GitHub returns at most 100 objects from any query, so we're probably not asking for the next page on some query.

@bryanburgers
Copy link
Contributor Author

@Mark-Simulacrum

homu/homu/main.py

Line 1495 in abd0083

for pull in repo.iter_pulls(state='open'):

Right. https://github3py.readthedocs.io/en/stable-0.9/repos.html#github3.repos.repo.Repository.iter_pulls suggests that it returns all available pull requests, but it seems likely we're just getting the first page of results.

But at this point, I don't believe it's worth investigating.

@Mark-Simulacrum
Copy link
Member

Somewhat agreed, though I think we'd want to investigate before going forward -- missing PRs in the queue are annoying because you can't easily tell if we have all of them by comparing the number, you have to check one by one.

@bryanburgers
Copy link
Contributor Author

Status update: I've been incredibly busy the last couple of weeks with other things and haven't been able to get back to working on this.

From what I see, this change is not something I can introduce in pieces, so it will take time to get it to a production state.

I hope to be able to return to this mid-August.

Add more of the state to the Repository class, and make each
PullReqState reference it's Repository and get information from there.
Include a history of all of the tries and all of the builds when parsing
the history of a pull request.
@bryanburgers
Copy link
Contributor Author

Slowly but surely trying to keep working on this.

I now integrated a history of tries and builds, based on the GitHub history and bors' past comments, so we can have something like the following image (but with more UI work).

image

https://homu.burgers.io/results/rust/58281

I'll keep running https://homu.burgers.io/queue/rust while I work on this. From what I can tell checking in every once in a while, it's a pretty accurate representation of the state of the world. And is almost always more accurate with respect to mergeability and assignees.

Unfortunately, it's getting long and is going to be a huge pain to review, and a huge risk to switch over.

Previously, we discretely set `status`, `try`, `build_state`, and
`try_state` on each event. With the addition of the run histories, we
can now glean all of this information from those histories instead of
tracking them independently.

The results appear to all be identical on the current Homu queue except
for one edge case: a pull request was tried, approved, then unapproved.

- Using the previous method, the status would be '', because approval
  changes the status from 'success (try)' to '', and unapproval doesn't
  change the status.

- Using the current method, the status would be 'success (try)', because
  a successful try has occured for the relevant commit hash and the PR
  isn't approved.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants