Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When updating, update only one PR at a time #128

Open
ghost opened this issue Jun 17, 2019 · 5 comments
Open

When updating, update only one PR at a time #128

ghost opened this issue Jun 17, 2019 · 5 comments
Labels
enhancement Feature requests or new functionality

Comments

@ghost
Copy link

ghost commented Jun 17, 2019

In the current implementation, when the target branch is updated (master, for example), all PRs with master as target branch will be updated (implementation here).

This causes CI reruns on all PRs if they're in the PR-for-merging pipeline. A better approach would be to update only the oldest PR. This will create a predictable pipeline of PRs to be merged.

One issue with this approach is what happens if the updating-PR fails during CI tests, but this could be improved by listening to "CI check failed" event and proceeding to the next PR.

I can submit a PR if agreed. This could be configurable also.

@bluekeyes
Copy link
Member

This is an interesting idea and would definitely help improve the predictability of merges and reduce the load on CI systems caused by PR updates. But as you pointed out, I think we'll need to carefully consider how to make sure the "pipeline" is not blocked by a single PR that fails CI tests, is not approved, or otherwise fails to merge.

In particular, Bulldozer does not currently track any internal state between events, so there's no place to keep track of PRs that are part of an update-and-merge pipeline. To keep the application simple, we'd like to avoid adding an external state store at this time. Do you think there's a way to track or compute the necessary information based only on data in GitHub? Tracking state in memory is also not ideal, as we'd like to keep the current ability to run more than one instance of Bulldozer at the same time.

@bluekeyes bluekeyes added the enhancement Feature requests or new functionality label Jun 21, 2019
@spion
Copy link

spion commented Jun 25, 2019

In this mode the bot could work as follows:

On a master update:

  • Find the oldest PR that is ready for merge, unmerged and don't have a failed CI status, try to update it to latest master.

ON a successful build of a ready PR

  • Merge that PR (triggers master update)

On a failed build of a ready PR

  • Trigger "master update" (since this PR was marked as failed, the "loop" will skip it and go to the next one)

@skakim
Copy link

skakim commented Jan 25, 2022

+1 on this (as commented in #259).

On a PR that fails the CI: It should be discarded from the queue, or put back at the end of it.

@merrywhether
Copy link

merrywhether commented Jun 15, 2022

Trying to envision what the MVP of this change would look like, so mental mapping here as I try to grok the repo. This approach avoids adding any state to bulldozer and favors simplicity to increase the odds of being able to land a change, even if slightly imperfect. The future "merge queue" functionality coming to GH would provide a more exhaustive solution to this problem, so this could be the simpler counterpart to that for people who don't want to get into setting up Actions (or for people on GHE who won't see the new queue feature for a while).

Basic change targets:

  1. A new config option under update (maybe strategy?) that allows users to specify their choice. The simplest first two would be oldest and newest (indicating only update oldest eligible PR or newest eligible PR, respectively).
  2. It seems like the ListOpenPullRequestsForRef array in the push handler is the area we would target for this change. Conceptually (there's a few ways to plumb this, so talking high-level), we'd want to break the ensuing for-loop somehow after we detect that we've successfully triggered an update on the appropriate PR.
  3. The three other sites that can trigger updates would need to add update-strategy-is-chosen to their bail-out if-check in addition to update-is-disabled:

Notes:

  • The PR list options include sorting. Using that info would eliminate the need for additional API calls.
  • Could DisableUpdateFeature just be set true when a strategy is defined, overloading the flag internally? (Something likeDisable: opts.Disable || opts.Strategy but valid code). Is it a safe assumption that an update strategy choice means that updates will only run when their target branch emits a pull event?

Rough initial change idea:

  1. Add strategy key to update config
  2. When in strategy mode, fetch PRs in appropriately sorted order in response to pull events
  3. When in strategy mode, break out of update loop after first successful PR update (as signaled by UpdatePullRequest returning true

Addressing the above suggestions:

  • target branch update: addressed directly
  • successful build of ready PR: should already be covered by existing merge functionality
  • failed build of ready PR: unsolved, but this is a little complicated since we don't easily know how an update was triggered without increasing the complexity of the API calls. Like a new PR opened against head of target branch may fail CI, but it shouldn't necessarily trigger trying to update/merge other PRs. Without this, a PR updated via the strategy option would stop the implicit merge "queue", but I'm not sure this should block an MVP of this (which would also give the opportunity for getting a feel for the basic functionality)
  • on PR that fails CI: unsolved, but skipping these forever after isn't necessarily safe either. If a change merged into the target branch broke other PRs, it might be appropriate to merge that fix directly as a standalone PR, meaning that previous failures might now be fixed.

Questions:

  • Does this approach seem minimally viable?
  • If so, do the change targets seem basically correct?

@bluekeyes
Copy link
Member

Thanks for outlining this approach, @merrywhether. I like the idea of a strategy option and the newest and oldest strategies sound straightforward and are something I'm willing to support. Having the default strategy be all (the current behavior) could avoid some complexity about being "in strategy mode" or not.

Regarding the limitations of this behavior, I'll let others chime in if there are concerns. We don't use the update feature much internally at Palantir, so I don't have a good sense of how useful this will be in the minimal version.

One suggestion that might address the limitations is adding newest-mergeable and oldest-mergeable to the strategy set. This would effectively run Bulldozer's existing ShouldMergePR logic (or a subset of it, like the status checks) against each PR in order, updating the first one that passes the mergeability test. This should avoid the whole "queue" getting stuck due to one failed PR but means PRs that fail require manual attention, even if they would have passed with another update.

This increases the number of API calls, but many of them should be shared with the existing update checks (and therefore cached) and overall, I expect the reduction of load from updating fewer PRs to be more significant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Feature requests or new functionality
Projects
None yet
Development

No branches or pull requests

4 participants