PyTorch AutoLabel Bot
Page maintainer(s): @janeyx99 Last updated: 8/30/22
PyTorch maintainers use GitHub labels for several organizational purposes, such as triaging issues to the right module owners, assigning priority to certain tasks, and categorizing pull requests (PRs). Labeling has a lot of potential, so this page is a technical guide to how the autolabel bot works and how you can help improve our labeling.
Our autolabel bot hinges on Probot webhooks. You can read all about Probot in the GitHub docs, but the docs contain more details than you need to know for developing on the autolabel bot and could confuse you. It suffices to know that Probot allows you to plug into GitHub webhooks, where webhooks notify you of GitHub-related events such as "a PR has been pushed!" or "a label has been added to an issue!" or "someone edited a PR title!".
Our autolabel bot merely tells Probot to add labels when certain events have occurred. For example, on the event that an issue title has been modified to include the phrase "DISABLED test...", we tell the bot to add the "skipped" label denoting a skipped test in CI (if you're curious about the disable test infra, see our Continuous Integration wiki).
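As a rough illustration of the rule above, the check can be thought of as a pure function from an issue title to the labels that should be added. This is only a sketch: `labelsForIssueTitle` is a hypothetical helper that does not exist in the bot, and it assumes the title must start with "DISABLED test", which may not match the bot's exact condition.

```typescript
// Hypothetical helper for illustration; not part of autoLabelBot.ts.
// Assumption: a title that starts with "DISABLED test" marks a skipped test.
function labelsForIssueTitle(title: string): string[] {
  const labels: string[] = [];
  if (title.trim().startsWith("DISABLED test")) {
    labels.push("skipped");
  }
  return labels;
}

// Example titles (made up for this sketch).
const skippedLabels = labelsForIssueTitle("DISABLED test_foo (__main__.TestBar)");
const noLabels = labelsForIssueTitle("Fix flaky test_foo");
```

Keeping the decision in a pure function like this makes it easy to test without mocking any GitHub API calls.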
At this point, looking at the code becomes more helpful, so open another tab for the bot code that lives in our open source test-infra repo: https://github.com/pytorch/test-infra/blob/main/torchci/lib/bot/autoLabelBot.ts. Note all the app.on("some.event", async (context) => lines. When "some event" occurs, we get a context, which gives us information about what happened in the form of a payload. To see what payloads look like for different events, please check out Webhook events & Payloads.
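The handler pattern described above can be sketched without installing Probot. In the block below, `FakeApp` and `IssueContext` are hypothetical stand-ins that only mimic the shape of the real objects (an `on` method, and a context carrying a `payload`); the real Probot context has many more fields, and the `issues.edited` event name is used here as an assumed example.

```typescript
// Minimal stand-ins so the sketch runs on its own.
// These types are hypothetical and only mimic the shape described above.
interface IssueContext {
  payload: { action: string; issue: { number: number; title: string } };
}
type Handler = (context: IssueContext) => Promise<void>;

class FakeApp {
  private handlers = new Map<string, Handler[]>();

  // Register a handler, mirroring the app.on("some.event", ...) lines.
  on(event: string, handler: Handler): void {
    const existing = this.handlers.get(event) ?? [];
    this.handlers.set(event, [...existing, handler]);
  }

  // Simulate GitHub delivering a webhook for `event`.
  async receive(event: string, context: IssueContext): Promise<void> {
    for (const handler of this.handlers.get(event) ?? []) {
      await handler(context);
    }
  }
}

const app = new FakeApp();
const seenTitles: string[] = [];

app.on("issues.edited", async (context) => {
  // The payload describes what happened; here we only read the issue title.
  seenTitles.push(context.payload.issue.title);
});

// Simulate an "issue edited" delivery with a made-up payload.
void app.receive("issues.edited", {
  payload: { action: "edited", issue: { number: 1, title: "DISABLED test_foo" } },
});
```

In the real bot, the handler body would go on to decide which labels to add based on the payload rather than just recording the title.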
The primary reason for this wiki is to enable you to help us make our autolabel bot better! One of the most impactful tasks our autolabel bot takes on is categorizing PRs for release notes. As our codebase is large and no one person has enough context or time to know all the right mappings, we encourage you to improve our heuristics and submit changes.
Once you have an improvement in mind, please follow the instructions in our bot README. Add test cases in https://github.com/pytorch/test-infra/blob/main/torchci/test/autoLabelBot.test.ts to verify your change. We use nock along with pretend payloads in https://github.com/pytorch/test-infra/tree/main/torchci/test/fixtures to mock API calls. Nock was confusing when I first started out, and looking at existing test cases was helpful (hint: copy-paste is your friend). To run the test file, enter the following command from the torchci directory:
yarn test test/autoLabelBot.test.ts
When your change is ready, open up a pull request to our test-infra repo and tag @pytorch/pytorch-dev-infra for a review.
- Why categorize? The purpose of categorizing is so that during the release notes process, commits/PRs are routed to the right module owner whose job is to clean up the commit message and include additional information regarding bc-breaking changes or deprecations.
- When and how? Categorizing a PR to the right module should happen before it is landed and can be done by adding a single "release notes: <module name>" label. Labels corresponding to the modules are prefixed with "release notes:".
- Optionally add a single "topic: blah" label to make the module owners' lives easier later.
- For PRs that are not user facing and are not intended to be a part of the release notes, the "topic: not user facing" label should be added. In that case, the "release notes: <module name>" label is not required.
If you are unsure which label to add, you can search existing PRs for examples: https://github.com/pytorch/pytorch/pulls?q=is%3Apr+is%3Aopen+label%3A%22release+notes%3A+nn%22
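The labeling rules in the list above can be summarized as a small check over a PR's labels. This is a sketch under assumptions, not code from the bot: `isCategorized` is a hypothetical name, and treating more than one topic label as invalid is one reading of "optionally add a single" topic label.

```typescript
// Hypothetical checker for the labeling rules listed above; not part of the bot.
const RELEASE_NOTES_PREFIX = "release notes:";
const TOPIC_PREFIX = "topic:";
const NOT_USER_FACING = "topic: not user facing";

function isCategorized(labels: string[]): boolean {
  const releaseNotes = labels.filter((l) => l.startsWith(RELEASE_NOTES_PREFIX));
  const topics = labels.filter((l) => l.startsWith(TOPIC_PREFIX));

  // Assumption: at most one topic label is expected.
  if (topics.length > 1) return false;

  // PRs that are not user facing do not need a release notes label.
  if (topics.includes(NOT_USER_FACING)) return releaseNotes.length <= 1;

  // Otherwise exactly one release notes label is required.
  return releaseNotes.length === 1;
}

// Example label sets (made up for this sketch).
const categorized = isCategorized(["release notes: nn"]);
const internalOnly = isCategorized(["topic: not user facing"]);
const uncategorized = isCategorized([]);
```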
The current heuristics we use are mainly defined in the getReleaseNotesCategoryAndTopic function. You can always help extend our heuristics by making conditions more specific or by extending the bot to categorize more PRs based on existing labels, title patterns, or other features of the pull request.
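To make the idea of a heuristic concrete, here is one possible shape for rules based on existing labels and title patterns. Everything in this block is hypothetical: `guessCategory`, `EXAMPLE_RULES`, and the two rules themselves are made-up examples, and the real getReleaseNotesCategoryAndTopic function may be structured quite differently.

```typescript
// Hypothetical sketch; the real logic lives in getReleaseNotesCategoryAndTopic
// and may look quite different. The rules below are made-up examples.
interface Rule {
  description: string;
  matches: (title: string, labels: string[]) => boolean;
  category: string;
}

const EXAMPLE_RULES: Rule[] = [
  {
    description: "an existing module label hints at the category",
    matches: (_title, labels) => labels.includes("module: nn"),
    category: "release notes: nn",
  },
  {
    description: "a bracketed title prefix hints at the category",
    matches: (title) => /^\[nn\]/i.test(title),
    category: "release notes: nn",
  },
];

// Return the category of the first matching rule, or undefined when none applies.
function guessCategory(title: string, labels: string[]): string | undefined {
  const rule = EXAMPLE_RULES.find((r) => r.matches(title, labels));
  return rule?.category;
}
```

With a table of rules like this, making a condition more specific or adding a new mapping is a small, easily tested change.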
Please file an issue and tag @soulitzer or @janeyx99 if you find that your PRs are being mislabeled.