Improve pull request review experience by tagging them based on size and contents #5402
|
Thanks for putting this together. There has been some chat about this in dask/community#60 but I believe this never reached a conclusion. For reference, a few people are currently organizing a manual walk-through, see dask/community#188 |
|
I really like the idea of auto-labelling of PRs based on what files have changed. @jsignell would you find something like this helpful in dask/dask as well? I don't like the idea of the stale bot that automatically closes issues/PRs. From the other conversations that have been happening, it seems the consensus is it's good to have a bot to label old issues, so that a human maintainer can give them more attention. Auto-closing is generally a negative experience for users/contributors. (Based on the title I didn't realize this PR introduced a stale bot action until after I made #5405 - sorry about that) |
|
I agree that closing issues is really annoying and counterproductive, but a large portion of the pull requests here are clearly abandoned. We could indeed label them, but is there any value in keeping a year+ old pull request full of conflicts open? We're not deleting them, we're only closing them. I agree we should label old issues for review. Happy to remove the stale PR closing if that's contentious. |
|
I've removed the action that closes stale pull requests, shall we merge the tagging one and see how it goes? We can expand the labels in the future as well. |
|
ping @dask/maintenance |
jrbourbeau
left a comment
Thanks for the PR @orf. While I appreciate the thought you've put into this and am generally in favor of reducing the burden on maintainers, I don't think the small / tiny labels will be that useful for most folks who review distributed PRs relative to the complexity introduced in this workflow.
For the test-only / docs-only labels, I recommend using GitHub's built-in labeling system, similar to what we do in dask/dask.
This PR adds two workflows. The first one tags new pull requests based on their size and the files changed. This action attempts to programmatically segment new pull requests into several overlapping buckets:
You can see it in action here: https://github.com/orf/distributed/pull/7. As the pull request is updated the tags also are updated to reflect the contents.
Why? In any project like this reviewer time is limited. We can use the size of the pull request as a proxy for how long it will take to review - it's not perfect, but I think it's a good heuristic to use. In a similar vein, PRs that only impact tests and docs likely require less discussion or are possibly easier to review - a tiny PR that only touches tests is usually pretty simple to review and merge. With this people can now see at a glance which PRs they might be able to look at in a given time frame.
It's also fairly easy to add new tags based on the files changed, for example we could add dashboard, protocol or diagnostics labels to indicate changes isolated to those modules.
The second action closes stale pull requests. A large percentage of the open pull requests are over a year old, and it's probably a good thing to mark them as stale and close them.
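To make the tagging idea concrete, here is a rough Python sketch of how a pull request could be bucketed by size and by the files it touches. It is not the workflow from this PR: the thresholds, the "medium" and "large" label names, the path patterns, and the function names are all assumptions made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical size thresholds (lines added + deleted); not taken from this PR.
SIZE_BUCKETS = [(10, "tiny"), (100, "small"), (500, "medium")]

# Hypothetical path patterns; a content label applies only when every
# changed file matches at least one pattern for that label.
CONTENT_PATTERNS = {
    "test-only": ["*/tests/*"],
    "docs-only": ["docs/*", "*.rst", "*.md"],
    "dashboard": ["distributed/dashboard/*"],
}


def size_label(lines_changed):
    """Return the first size bucket whose limit covers the change."""
    for limit, label in SIZE_BUCKETS:
        if lines_changed <= limit:
            return label
    return "large"


def content_labels(paths):
    """Return labels whose patterns cover all changed paths."""
    labels = []
    for label, patterns in CONTENT_PATTERNS.items():
        if paths and all(
            any(fnmatchcase(path, pattern) for pattern in patterns)
            for path in paths
        ):
            labels.append(label)
    return labels


def labels_for(paths, lines_changed):
    """Combine the size label with any content labels (buckets may overlap)."""
    return [size_label(lines_changed)] + content_labels(paths)


if __name__ == "__main__":
    print(labels_for(["distributed/tests/test_client.py"], 8))
```

Because the labels are recomputed from the current diff, re-running this on every push would keep them in sync as the pull request changes, which is the behaviour described above.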