Update clean_release_notes.py to use per-module highlights #4250

Merged
jerryzh168 merged 2 commits into main from gh/jerryzh168/79/head on Apr 7, 2026

@jerryzh168
Contributor

@jerryzh168 jerryzh168 commented Apr 7, 2026

Stack from ghstack (oldest at bottom):

Switch from topic-based categorization (topic: labels) to module-based
categorization (module: labels) to align with pytorch/pytorch and vllm
release notes format. Module sections are discovered dynamically from
PR labels instead of being hardcoded by topic.

Before: https://gist.github.com/jerryzh168/e0afec28fc8957b471ff844418dcf7e3
After: https://gist.github.com/jerryzh168/37594ee89e393cf5b05f4246a2487f45

Switch from topic-based categorization (topic: labels) to module-based
categorization (module: labels) to align with pytorch/pytorch and vllm
release notes format. Module sections are discovered dynamically from
PR labels instead of being hardcoded by topic.

Before: https://gist.github.com/jerryzh168/e0afec28fc8957b471ff844418dcf7e3
After: https://gist.github.com/jerryzh168/730fea7e57f8c9fe9367e2a395b79708

[ghstack-poisoned]
@jerryzh168 jerryzh168 requested a review from vkuzo as a code owner April 7, 2026 20:52
@pytorch-bot

pytorch-bot Bot commented Apr 7, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/4250

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 8 Pending

As of commit b87cdd9 with merge base 2a8fa55:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 added a commit that referenced this pull request Apr 7, 2026
Switch from topic-based categorization (topic: labels) to module-based
categorization (module: labels) to align with pytorch/pytorch and vllm
release notes format. Module sections are discovered dynamically from
PR labels instead of being hardcoded by topic.

Before: https://gist.github.com/jerryzh168/e0afec28fc8957b471ff844418dcf7e3
After: https://gist.github.com/jerryzh168/730fea7e57f8c9fe9367e2a395b79708

ghstack-source-id: 7371889
Pull Request resolved: #4250
@meta-cla meta-cla Bot added the "CLA Signed" label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Apr 7, 2026
@jerryzh168 jerryzh168 requested a review from andrewor14 April 7, 2026 20:52
@jerryzh168 jerryzh168 added the "module: not user facing" label (use this tag if you don't want this PR to show up in release notes) Apr 7, 2026
@andrewor14
Contributor

@claude review please

Contributor

@andrewor14 andrewor14 left a comment

Thanks!

Comment thread scripts/clean_release_notes.py Outdated
@claude

claude Bot commented Apr 7, 2026

Claude finished @andrewor14's task in 1m 6s


Code Review: Update clean_release_notes.py to use per-module highlights

  • Read the PR diff and changed file
  • Analyze the changes
  • Provide review feedback

Overall this is a clean refactor that simplifies the categorization logic. The move from hardcoded topic-based categories with keyword heuristics to dynamically discovered module-based categories is a good direction. A few observations:

Positive changes

  • Eliminates fragile keyword heuristics: The old get_commit_category had brittle substring matching (e.g. " fix", " add", "nightl") that could easily misclassify PRs. The new approach relies solely on explicit labels, which is more reliable.
  • Dynamic module discovery: build_module_categories discovers sections from actual PR labels rather than hardcoding them, so new modules are automatically supported without code changes.
  • Multi-label support: PRs can now appear under multiple module sections (the old code only used the first label), which is more accurate for cross-cutting changes.
  • Well-structured code: build_module_categories and module_label_to_display are clean, well-documented functions.
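
The dynamic discovery and multi-label behavior noted above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `group_by_module`, the input shape (a list of `(title, labels)` tuples), and the sample titles and labels are all assumptions made for the example.

```python
from collections import OrderedDict

MODULE_PREFIX = "module: "  # label prefix named in the PR description


def group_by_module(prs):
    """Group PR titles under every module label they carry.

    `prs` is a hypothetical list of (title, labels) tuples; the real
    script reads labels from the GitHub API instead.
    """
    sections = OrderedDict()
    for title, labels in prs:
        for label in labels:
            if not label.startswith(MODULE_PREFIX):
                continue  # ignore non-module labels such as "CLA Signed"
            name = label[len(MODULE_PREFIX):]
            # A section is created the first time its label is seen,
            # so no module name has to be hardcoded.
            sections.setdefault(name, []).append(title)
    return sections


# Made-up sample data: the first PR carries two module labels.
sample = [
    ("Add int4 kernel", ["module: quantization", "module: inference"]),
    ("Fix CI flake", ["module: not user facing", "CLA Signed"]),
]
grouped = group_by_module(sample)
```

In this sketch a PR with two module labels is listed in both sections, which is the multi-label behavior described in the third bullet.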

Suggestions

  1. Capitalization of module display names (echoing @andrewor14's nit): module_label_to_display at line 228 just strips the prefix, so "module: quantization" becomes "quantization" (lowercase). Since these become ## quantization section headers, consider capitalizing them for consistency with the special sections ("BC Breaking", "Deprecations"):

    def module_label_to_display(label: str) -> str:
        return label.removeprefix("module: ").title()

    Or .capitalize() if you only want the first letter uppercased.

  2. Per-module highlights in the header: The PR title says "per-module highlights" but I don't see the Highlights section generated per-module — it's still a static template with ### Highlight Feature 1 / ### Highlight Feature 2. If per-module highlights are intended, this might be a future enhancement. Just flagging in case it was meant to be part of this PR.

  3. Potential issue with parse_pr_number regex (pre-existing): At line 143, the regex ".*pytorch/ao/pull/(.*)" uses a greedy (.*) capture group. If a commit title happens to contain a URL-like string, this could capture more than intended. A stricter pattern like r"pytorch/ao/pull/(\d+)" would be safer. This is pre-existing though, not introduced by this PR.

  4. Empty all_pr_numbers edge case (pre-existing): min(all_pr_numbers) at line 153 will raise ValueError if commit_lines is empty. This is unlikely in practice but worth noting.

  5. Rate limiting: Github() without a token (line 154) uses unauthenticated API access, which is limited to 60 requests/hour. This is pre-existing but worth keeping in mind for releases with many PRs.
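
For suggestion 1 above, the practical difference between `.title()` and `.capitalize()` shows up on multi-word labels. The sketch below is only a side-by-side comparison; the `style` parameter is hypothetical and not part of the script, and `"module: mx_formats"` is a made-up label used to show an edge case.

```python
def module_label_to_display(label: str, style: str = "title") -> str:
    """Strip the "module: " prefix and capitalize the remainder.

    The `style` switch is hypothetical and exists only to compare the
    two options side by side. str.removeprefix needs Python 3.9+.
    """
    name = label.removeprefix("module: ")
    return name.title() if style == "title" else name.capitalize()


print(module_label_to_display("module: quantization"))
# -> Quantization
print(module_label_to_display("module: not user facing", "title"))
# -> Not User Facing
print(module_label_to_display("module: not user facing", "capitalize"))
# -> Not user facing
print(module_label_to_display("module: mx_formats", "title"))
# -> Mx_Formats
```

`.title()` matches the existing "BC Breaking" style for multi-word names, but it also uppercases the letter after any non-letter character, so labels with underscores or acronyms may need an explicit display name.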

Verdict

The changes look good. The only actionable nit from this review is #1 (capitalization), which aligns with @andrewor14's existing feedback.


jerryzh168 added a commit that referenced this pull request Apr 7, 2026
Switch from topic-based categorization (topic: labels) to module-based
categorization (module: labels) to align with pytorch/pytorch and vllm
release notes format. Module sections are discovered dynamically from
PR labels instead of being hardcoded by topic.

Before: https://gist.github.com/jerryzh168/e0afec28fc8957b471ff844418dcf7e3
After: https://gist.github.com/jerryzh168/730fea7e57f8c9fe9367e2a395b79708

ghstack-source-id: e31f830
Pull Request resolved: #4250
@jerryzh168
Contributor Author

@claude review again

@claude

claude Bot commented Apr 7, 2026

Claude finished @jerryzh168's task in 54s


Re-review: Update clean_release_notes.py to use per-module highlights

  • Read the current PR diff
  • Read the full file for context
  • Check if previous review feedback was addressed
  • Provide updated review

The updated version addresses several issues from the previous iteration. Here's what I see:

Previous feedback status

  1. Capitalization of module display names: Fixed. The old module_label_to_display returned lowercase names (e.g. "quantization"). The new code uses .title() at line 215 for dynamically discovered modules, and known modules have explicit display names in GITHUB_MODULE_LABEL_TO_CATEGORY (e.g. "Core", "Inference"). @andrewor14's nit is addressed.

  2. parse_pr_number regex: Fixed. Changed from the greedy re.match(".*pytorch/ao/pull/(.*)", ...) to the stricter re.search(r"pytorch/ao/pull/(\d+)", ...) at line 155. Good improvement.

  3. Empty all_pr_numbers edge case: Fixed. Added an early return guard at lines 165-166.

  4. Rate limiting / auth: Fixed. Now reads GITHUB_TOKEN from env at lines 168-169.
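
Items 2 to 4 above can be illustrated with a small standalone sketch. Only the regex `r"pytorch/ao/pull/(\d+)"` and the `GITHUB_TOKEN` variable name come from this thread; the helper `pr_number_range`, its return shape, and the sample inputs are assumptions for illustration and may differ from the merged script.

```python
import os
import re
from typing import List, Optional, Tuple


def parse_pr_number(line: str) -> Optional[int]:
    """Return the PR number from a pytorch/ao pull URL, or None."""
    match = re.search(r"pytorch/ao/pull/(\d+)", line)
    return int(match.group(1)) if match else None


def pr_number_range(commit_lines: List[str]) -> Optional[Tuple[int, int]]:
    """Return (lowest, highest) PR number, or None when nothing parses.

    Hypothetical helper: it stands in for the early-return guard.
    """
    numbers = [n for n in map(parse_pr_number, commit_lines) if n is not None]
    if not numbers:
        return None  # guard: min() of an empty list raises ValueError
    return min(numbers), max(numbers)


# os.environ.get returns None when the variable is unset, in which case
# a client would fall back to unauthenticated (rate-limited) access.
token = os.environ.get("GITHUB_TOKEN")
```

Because `(\d+)` stops at the first non-digit, trailing text after the URL no longer leaks into the captured PR number.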

Current code review

The code looks good overall. One minor concern:

Dynamic insertion ordering logic (lines 218-223): When a new module label is discovered, the code rebuilds the entire OrderedDict to insert before "Not User Facing". This works correctly but has a subtle edge case: if the "Not User Facing" category was already removed or doesn't exist in commits_by_category (which can't happen with the current code since it's always initialized at line 197), the new category would never be inserted into new_order. The current code is safe because "Not User Facing" is always present at this point, but it's worth being aware of if the initialization logic changes in the future.
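
One way to make the ordering logic robust to the edge case described above is to fall back to appending when the sentinel section is missing. The sketch below assumes a helper named `insert_before_sentinel`, which is hypothetical and not the function used in the script.

```python
from collections import OrderedDict

SENTINEL = "Not User Facing"


def insert_before_sentinel(categories, new_name):
    """Return an ordering with `new_name` placed just before SENTINEL.

    Falls back to appending at the end when SENTINEL is absent, so the
    new category is never silently dropped.
    """
    if new_name in categories:
        return categories  # already present, keep the existing order
    new_order = OrderedDict()
    for name, commits in categories.items():
        if name == SENTINEL:
            new_order[new_name] = []
        new_order[name] = commits
    new_order.setdefault(new_name, [])  # no-op when already inserted
    return new_order


with_sentinel = OrderedDict([("BC Breaking", []), ("Not User Facing", [])])
without_sentinel = OrderedDict([("BC Breaking", [])])

print(list(insert_before_sentinel(with_sentinel, "Quantization")))
# -> ['BC Breaking', 'Quantization', 'Not User Facing']
print(list(insert_before_sentinel(without_sentinel, "Quantization")))
# -> ['BC Breaking', 'Quantization']
```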

The format_commit regex improvement at line 237 (r" by @[\w-]+ in (https://\S+)") is more precise than the old greedy " by @.* in (.*)" — good change.
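
The difference between the two patterns is easiest to see on a line with text after the URL. The commit line below is made up for illustration, and how `format_commit` consumes the match is not shown in this thread, so the sketch only compares what each pattern captures.

```python
import re

OLD = re.compile(" by @.* in (.*)")               # old greedy pattern
NEW = re.compile(r" by @[\w-]+ in (https://\S+)")  # new stricter pattern

# Hypothetical commit line with trailing text after the PR URL.
line = (
    "Add int4 kernel by @alice in "
    "https://github.com/pytorch/ao/pull/1234 (cherry-pick)"
)

old_url = OLD.search(line).group(1)
new_url = NEW.search(line).group(1)

print(old_url)  # -> https://github.com/pytorch/ao/pull/1234 (cherry-pick)
print(new_url)  # -> https://github.com/pytorch/ao/pull/1234
```

The stricter pattern stops the capture at the first whitespace, so only the URL itself is extracted.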

Verdict

LGTM. The previous review feedback has been addressed, and the code is in good shape.


@jerryzh168 jerryzh168 changed the base branch from gh/jerryzh168/79/base to main April 7, 2026 22:26
@jerryzh168 jerryzh168 merged commit 1a1debb into main Apr 7, 2026
36 checks passed
brucechanglongxu pushed a commit to brucechanglongxu/ao that referenced this pull request Apr 9, 2026
)

* Update clean_release_notes.py to use per-module highlights

Switch from topic-based categorization (topic: labels) to module-based
categorization (module: labels) to align with pytorch/pytorch and vllm
release notes format. Module sections are discovered dynamically from
PR labels instead of being hardcoded by topic.

Before: https://gist.github.com/jerryzh168/e0afec28fc8957b471ff844418dcf7e3
After: https://gist.github.com/jerryzh168/730fea7e57f8c9fe9367e2a395b79708

[ghstack-poisoned]

* Update on "Update clean_release_notes.py to use per-module highlights"


Switch from topic-based categorization (topic: labels) to module-based
categorization (module: labels) to align with pytorch/pytorch and vllm
release notes format. Module sections are discovered dynamically from
PR labels instead of being hardcoded by topic.

Before: https://gist.github.com/jerryzh168/e0afec28fc8957b471ff844418dcf7e3
After: https://gist.github.com/jerryzh168/37594ee89e393cf5b05f4246a2487f45

[ghstack-poisoned]