contributing: Generate release notes from git log output or GitHub API (#2328)

This adds a new script which generates release notes from git log output or from GitHub API.

Configuration:

* Split into categories or exclude based on commit message or PR title.
* Similar to that of GitHub release notes, but using commit message or PR title prefixes instead of PR labels.
* Additional author files map names and usernames to GitHub usernames to generate GitHub-like output from git log.
* Includes user names from the Subversion to Git+GitHub transition (i.e., roughly 8.0 to 8.2).
* Includes user names for contributors since the 8.0.0 tag (i.e., roughly 8.0 to 8.2).
* Emails are stored without the @ sign, as in the contributors file (unlike the original files in the Subversion-to-Git repo):


    https://trac.osgeo.org/grass/browser/grass-addons/tools/svn2git/svn2git_users.csv
    https://trac.osgeo.org/grass/browser/grass-addons/tools/svn2git/AUTHORS.txt

The additional Git-GitHub file is created from the commits for 8.0.2 release.
 ( have full emails for these files.)
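
The author lookup described above can be sketched roughly as follows. This is a simplified illustration, not the script itself: `github_handle` is a hypothetical helper, it leaves out the Subversion-name lookup, and the mapping is passed in as a plain dictionary rather than read from the CSV files.

```python
def github_handle(author_name, author_email, github_name_by_git_author):
    """Return '@handle' when it can be determined, else the plain git author.

    Hypothetical helper for illustration; the SVN name lookup is omitted.
    """
    if author_email.endswith("users.noreply.github.com"):
        # GitHub noreply addresses carry the handle before the @ sign,
        # optionally prefixed by a numeric ID and a plus sign.
        handle = author_email.split("@")[0]
        if "+" in handle:
            handle = handle.split("+")[1]
        return f"@{handle}"
    # The mapping files store emails with @ replaced by a space.
    git_author = f"{author_name} <{author_email.replace('@', ' ')}>"
    if git_author in github_name_by_git_author:
        return f"@{github_name_by_git_author[git_author]}"
    # Unknown author: fall back to the name and the @-less email.
    return git_author


# One entry taken from git_author_github_name.csv in this commit.
mapping = {"Anna Petrasova <kratochanna gmail.com>": "petrasovaa"}
print(github_handle("Anna Petrasova", "kratochanna@gmail.com", mapping))
```

Unknown authors fall through to the name plus the @-less email, which is also what the script prints when no mapping is found.
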
* Add general raster label

* Ignore version prefix. Add Contributing section to capture improvements for existing or potential contributors. Add i18N as alternative spelling because it was used a lot in the past.
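
A minimal sketch of the prefix-based sorting described in the configuration notes above. The titles and regular expressions here are made up for illustration; the real ones are in `utils/release.yml`. The `categorize` helper is hypothetical and merges the script's exclude and split steps into one function.

```python
import re
from collections import defaultdict

# Hypothetical configuration for illustration only.
CATEGORIES = [
    {"title": "Raster", "regexp": r"(r\.|raster:)"},
    {"title": "Translations", "regexp": r"(i18n|i18N|L10n):"},
]
EXCLUDE = [r"(ci|CI):"]


def categorize(changes, categories, exclude):
    """Drop excluded changes and group the rest by the first matching category."""
    by_category = defaultdict(list)
    for change in changes:
        # re.match anchors at the start, so only prefixes are tested.
        if any(re.match(expression, change) for expression in exclude):
            continue
        for category in categories:
            if re.match(category["regexp"], change):
                by_category[category["title"]].append(change)
                break
        else:
            # No category matched.
            by_category["Other Changes"].append(change)
    return dict(by_category)


changes = [
    "r.example: fix crash",
    "raster: update docs",
    "i18N: update strings",
    "CI: bump action",
    "wxGUI: fix dialog",
]
for title, items in categorize(changes, CATEGORIES, EXCLUDE).items():
    print(title, items)
```

Because the first matching category wins, the order of entries in the configuration matters when patterns overlap.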

Contributing documentation:

* Use the generate release notes script in the release procedure


Error handling:

* Raise specific error if git log returns nothing
* Capture exception to propagate the captured subprocess error output
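
The two error-handling points can be illustrated with a small sketch. It assumes a hypothetical helper, `checked_output`, which is not part of the script. The script handles the two cases in separate places (a `RuntimeError` for empty git log output, and a `sys.exit` message in `main` for failed subprocesses); the sketch folds both into one function for brevity.

```python
import subprocess
import sys


def checked_output(command):
    """Return stdout of a command, failing loudly on errors or empty output.

    Hypothetical helper for illustration only.
    """
    try:
        text = subprocess.run(
            command, capture_output=True, text=True, check=True
        ).stdout
    except subprocess.CalledProcessError as error:
        # With capture_output=True, stderr is stored on the exception,
        # so it has to be passed on explicitly to reach the user.
        raise RuntimeError(
            f"Subprocess '{' '.join(error.cmd)}' failed with: {error.stderr}"
        ) from error
    if not text.strip():
        raise RuntimeError(f"No output retrieved from '{' '.join(command)}'")
    return text


print(checked_output([sys.executable, "-c", "print('ok')"]).strip())
```

Without the explicit handling, a failing command would only report its exit status, and the captured error output would be lost.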



CI:


* Remove the GH API test due to permission issues: `python ./generate_release_notes.py api releasebranch_8_2 8.0.0 8.2.0` fails with "Resource not accessible by integration" (HTTP 403).
* Use current branch, explain parameter choices


* Fetch more commits for testing purposes

* Run the script in additional checks workflow
wenzeslaus committed May 24, 2022
1 parent 233b3ed commit ac9180e
Showing 7 changed files with 624 additions and 10 deletions.
15 changes: 15 additions & 0 deletions .github/workflows/additional_checks.yml
@@ -17,6 +17,8 @@ jobs:
    steps:
      - name: Checkout repository contents
        uses: actions/checkout@v2
        with:
          fetch-depth: 31

      - name: Check for CRLF endings
        uses: erclu/check-crlf@v1
@@ -34,3 +36,16 @@ jobs:
          python -m pip install pytest pytest-depends
          python utils/generate_last_commit_file.py .
          pytest utils/test_generate_last_commit_file.py

      - name: Generate release notes using git log
        run: |
          python -m pip install PyYAML
          cd utils
          # Git works without any special permissions.
          # Using current branch or the branch against the PR is open.
          # Using last tag as start (works for release branches) or last 30 commits
          # (for main and PRs). End is the current (latest) commit.
          python ./generate_release_notes.py log \
              ${{ github.ref_name }} \
              $(git describe --abbrev=0 --tags || git rev-parse HEAD~30) \
              ""
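
The start-reference fallback in the shell command above (last tag, otherwise the commit 30 revisions back) could be expressed in Python roughly as below. Both `run_git` and `pick_start_reference` are hypothetical names that do not appear in the commit, and the runner is injectable only so the logic can be exercised without a repository.

```python
import subprocess


def run_git(arguments):
    """Run git with the given arguments and return its stripped stdout."""
    return subprocess.run(
        ["git", *arguments], capture_output=True, text=True, check=True
    ).stdout.strip()


def pick_start_reference(fallback_depth=30, run=run_git):
    """Return the latest reachable tag, or a commit *fallback_depth* back.

    Hypothetical helper mirroring:
    git describe --abbrev=0 --tags || git rev-parse HEAD~30
    """
    try:
        return run(["describe", "--abbrev=0", "--tags"])
    except subprocess.CalledProcessError:
        # No tag is reachable, e.g., in a shallow clone of main or a PR.
        return run(["rev-parse", f"HEAD~{fallback_depth}"])
```

Resolving `HEAD~30` needs at least 31 commits in the clone, which is presumably why the checkout step sets `fetch-depth: 31`.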
30 changes: 20 additions & 10 deletions doc/howto_release.md
@@ -82,25 +82,35 @@ If RC, then check

### Changelog from GitHub for GH release notes

Using GH API here, see also
- https://cli.github.com/manual/gh_api
- https://docs.github.com/en/rest/reference/repos#generate-release-notes-content-for-a-release
Generate a draft of release notes using a script. The script uses configuration files
which are in the _utils_ directory and the script needs to run there,
so change the current directory:

```bash
gh api repos/OSGeo/grass/releases/generate-notes -f tag_name="8.2.0" -f previous_tag_name=8.2.0 -f target_commitish=releasebranch_8_2 -q .body
cd utils
```

If this fails or is incomplete, also a date may be used (that of the last release):
For major and minor releases, GitHub API gives good results
because it contains contributor handles and can identify new contributors,
so use with the _api_ backend, e.g.:

```bash
# GitHub style (this works best)
git log --pretty=format:"* %s by %an" --after="2022-01-28" | sort
python ./generate_release_notes.py api releasebranch_8_2 8.0.0 $VERSION
```

For micro releases, GitHub API does not give good results because it uses PRs
while the backports are usually direct commits without PRs.
The _git log_ command operates on commits, so use the _log_ backend:

# trac style (no longer really needed)
git log --oneline --after="2022-01-28" | cut -d' ' -f2- | sed 's+^+* +g' | sed 's+(#+https://github.com/OSGeo/grass/pull/+g' | sed 's+)$++g' | sort -u
```bash
python ./generate_release_notes.py log releasebranch_8_2 8.2.0 $VERSION
```
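
The rule of thumb above (the _api_ backend for major and minor releases, the _log_ backend for micro releases) can be written down as a small helper. This is only a sketch: `choose_backend` is a hypothetical function, not part of the script, and it assumes versions of the form `MAJOR.MINOR.MICRO` with an optional suffix such as `RC1`.

```python
import re


def choose_backend(version):
    """Return 'api' for major and minor releases, 'log' for micro releases.

    Hypothetical helper; a release counts as micro when the third
    version number is not zero.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if not match:
        raise ValueError(f"Unexpected version format: {version}")
    micro = int(match.group(3))
    return "api" if micro == 0 else "log"


for version in ("8.2.0", "8.0.2", "8.2.0RC1"):
    print(version, choose_backend(version))
```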

Importantly, these notes need to be manually sorted into the various categories (modules, wxGUI, library, docker, ...).
The script sorts them into categories defined in _utils/release.yml_.
However, these notes need to be manually edited to collapse related items into one.
Additionally, a _Highlights_ section needs to be added with manually identified new
major features for major and minor releases. For all releases, a _Major_ section
may need to be added showing critical fixes or breaking changes if there are any.

### Changelog file for upload

292 changes: 292 additions & 0 deletions utils/generate_release_notes.py
@@ -0,0 +1,292 @@
#!/usr/bin/env python3

"""Generate release notes using git log or GitHub API

Needs PyYAML, Git, and GitHub CLI.
"""

import argparse
import csv
import json
import re
import subprocess
import sys
from collections import defaultdict

import yaml

PRETTY_TEMPLATE = (
    "  - hash: %H%n"
    "    author_name: %aN%n"
    "    author_email: %aE%n"
    "    date: %ad%n"
    "    message: |-%n      %s"
)


def remove_excluded_changes(changes, exclude):
    """Return a list of changes with excluded changes removed"""
    result = []
    for change in changes:
        include = True
        for expression in exclude["regexp"]:
            if re.match(expression, change):
                include = False
                break
        if include:
            result.append(change)
    return result


def round_down_to_five(value):
    """Round down to the nearest multiple of five"""
    base = 5
    return value - (value % base)


def split_to_categories(changes, categories):
    """Return dictionary of changes divided into categories

    *categories* is a list of dictionaries (mappings) with
    keys title and regexp.
    """
    by_category = defaultdict(list)
    for change in changes:
        added = False
        for category in categories:
            if re.match(category["regexp"], change):
                by_category[category["title"]].append(change)
                added = True
                break
        if not added:
            by_category["Other Changes"].append(change)
    return by_category


def print_category(category, changes, file=None):
    """Print changes for one category from dictionary of changes

    If *changes* don't contain a given category, nothing is printed.
    """
    items = changes.get(category, None)
    if not items:
        return
    print(f"### {category}", file=file)
    for item in sorted(items):
        print(f"* {item}", file=file)
    # Pass file here as well so the blank line goes to the same output.
    print("", file=file)


def print_by_category(changes, categories, file=None):
    """Print changes by categories from dictionary of changes"""
    for category in categories:
        print_category(category["title"], changes, file=file)
    print_category("Other Changes", changes, file=file)


def binder_badge(tag):
    """Get mybinder Binder badge from a given tag, hash, or branch"""
    binder_image_url = "https://camo.githubusercontent.com/581c077bdbc6ca6899c86d0acc6145ae85e9d80e6f805a1071793dbe48917982/68747470733a2f2f6d7962696e6465722e6f72672f62616467655f6c6f676f2e737667"  # noqa
    binder_url = f"https://mybinder.org/v2/gh/OSGeo/grass/{tag}?urlpath=lab%2Ftree%2Fdoc%2Fnotebooks%2Fbasic_example.ipynb"  # noqa
    return f"[![Binder]({binder_image_url})]({binder_url})"


def print_notes(
    start_tag, end_tag, changes, categories, before=None, after=None, file=None
):
    """Print notes from given inputs

    *changes* is a list of strings. It will be sorted and ordered by category
    internally by this function.
    """
    num_changes = round_down_to_five(len(changes))
    # All output goes to *file* (standard output when file is None).
    print(
        f"The GRASS GIS {end_tag} release provides more than "
        f"{num_changes} improvements and fixes "
        f"with respect to the release {start_tag}.\n",
        file=file,
    )

    if before:
        print(before, file=file)
    print("## What's Changed", file=file)
    changes_by_category = split_to_categories(changes, categories=categories)
    print_by_category(changes_by_category, categories=categories, file=file)
    if after:
        print(after, file=file)
        print("", file=file)
    print(binder_badge(end_tag), file=file)


def notes_from_gh_api(start_tag, end_tag, branch, categories, exclude):
    """Generate notes from GitHub API"""
    text = subprocess.run(
        [
            "gh",
            "api",
            "repos/OSGeo/grass/releases/generate-notes",
            "-f",
            f"previous_tag_name={start_tag}",
            "-f",
            f"tag_name={end_tag}",
            "-f",
            f"target_commitish={branch}",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    body = json.loads(text)["body"]

    lines = body.splitlines()
    start_whats_changed = lines.index("## What's Changed")
    end_whats_changed = lines.index("", start_whats_changed)
    raw_changes = lines[start_whats_changed + 1 : end_whats_changed]
    changes = []
    for change in raw_changes:
        if change.startswith("* ") or change.startswith("- "):
            changes.append(change[2:])
        else:
            changes.append(change)
    changes = remove_excluded_changes(changes=changes, exclude=exclude)
    print_notes(
        start_tag=start_tag,
        end_tag=end_tag,
        changes=changes,
        before="\n".join(lines[:start_whats_changed]),
        after="\n".join(lines[end_whats_changed + 1 :]),
        categories=categories,
    )


def csv_to_dict(filename, key, value):
    """Read a CSV file as a dictionary"""
    result = {}
    with open(filename, encoding="utf-8", newline="") as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            result[row[key]] = row[value]
    return result


def notes_from_git_log(start_tag, end_tag, categories, exclude):
    """Generate notes from git log"""
    text = subprocess.run(
        ["git", "log", f"{start_tag}..{end_tag}", f"--pretty=format:{PRETTY_TEMPLATE}"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    commits = yaml.safe_load(text)
    if not commits:
        raise RuntimeError("No commits retrieved from git log (try different tags)")

    svn_name_by_git_author = csv_to_dict(
        "svn_name_git_author.csv", key="git_author", value="svn_name"
    )
    github_name_by_svn_name = csv_to_dict(
        "svn_name_github_name.csv", key="svn_name", value="github_name"
    )
    github_name_by_git_author = csv_to_dict(
        "git_author_github_name.csv", key="git_author", value="github_name"
    )

    lines = []
    for commit in commits:
        if commit["author_email"].endswith("users.noreply.github.com"):
            github_name = commit["author_email"].split("@")[0]
            if "+" in github_name:
                github_name = github_name.split("+")[1]
            github_name = f"@{github_name}"
        else:
            # Emails are stored with @ replaced by a space.
            email = commit["author_email"].replace("@", " ")
            git_author = f"{commit['author_name']} <{email}>"
            if (
                git_author not in svn_name_by_git_author
                and git_author in github_name_by_git_author
            ):
                github_name = github_name_by_git_author[git_author]
                github_name = f"@{github_name}"
            else:
                try:
                    svn_name = svn_name_by_git_author[git_author]
                    github_name = github_name_by_svn_name[svn_name]
                    github_name = f"@{github_name}"
                except KeyError:
                    github_name = git_author
        lines.append(f"{commit['message']} by {github_name}")
    lines = remove_excluded_changes(changes=lines, exclude=exclude)
    print_notes(
        start_tag=start_tag,
        end_tag=end_tag,
        changes=lines,
        after=(
            "**Full Changelog**: "
            f"https://github.com/OSGeo/grass/compare/{start_tag}...{end_tag}"
        ),
        categories=categories,
    )


def create_release_notes(args):
    """Create release notes based on parsed command line parameters"""
    end_tag = args.end_tag
    if not end_tag:
        # git log has default, but the others do not.
        end_tag = subprocess.run(
            ["git", "rev-parse", "--verify", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
        ).stdout.strip()

    with open("release.yml", encoding="utf-8") as file:
        config = yaml.safe_load(file.read())["notes"]

    if args.backend == "api":
        notes_from_gh_api(
            start_tag=args.start_tag,
            end_tag=end_tag,
            branch=args.branch,
            categories=config["categories"],
            exclude=config["exclude"],
        )
    else:
        notes_from_git_log(
            start_tag=args.start_tag,
            end_tag=end_tag,
            categories=config["categories"],
            exclude=config["exclude"],
        )


def main():
    """Parse command line arguments and create release notes"""
    parser = argparse.ArgumentParser(
        description="Generate release notes from git log or GitHub API.",
        epilog="Run in utils directory to access the helper files.",
    )
    parser.add_argument(
        "backend", choices=["log", "api"], help="use git log or GitHub API"
    )
    parser.add_argument(
        "branch", help="needed for the GitHub API when tag does not exist"
    )
    parser.add_argument("start_tag", help="old tag to compare against")
    parser.add_argument(
        "end_tag",
        help=(
            "new tag; "
            "if not created yet, "
            "an empty string for git log will use the current revision"
        ),
    )
    args = parser.parse_args()
    try:
        create_release_notes(args)
    except subprocess.CalledProcessError as error:
        sys.exit(f"Subprocess '{' '.join(error.cmd)}' failed with: {error.stderr}")


if __name__ == "__main__":
    main()
15 changes: 15 additions & 0 deletions utils/git_author_github_name.csv
@@ -0,0 +1,15 @@
git_author,github_name
Jürgen Fischer <jef norbit.de>,jef-n
Owen Smith <ocsmit protonmail.com>,ocsmit
Nicklas Larsson <n_larsson yahoo.com>,nilason
nilason <n_larsson yahoo.com>,nilason
Stefan Blumentrath <stefan.blumentrath gmx.de>,ninsbl
Anna Petrasova <kratochanna gmail.com>,petrasovaa
Māris Nartišs <maris.gis gmail.com>,marisn
Ondrej Pesek <pesej.ondrek gmail.com>,pesekon2
Bas Couwenberg <sebastic xs4all.nl>,sebastic
積丹尼 Dan Jacobson <jidanni jidanni.org>,jidanni
Yann Chemin <ychemin gmail.com>,YannChemin
Alberto Paradís Llop <albertoparadisllop gmail.com>,albertoparadisllop
Andrea Giudiceandrea <andreaerdna libero.it>,agiudiceandrea
Maximilian Stahlberg <viech unvanquished.net>,Viech
