Add action to populate the change log from PR titles triggered by @multiqc-bot changelog #2025

Merged
merged 32 commits into from
Sep 27, 2023
Changes from 22 commits
fadd13b
Script to append to a change log
vladsavelyev Sep 4, 2023
035ec52
GitHub workflow that appends to the changelog (#2026)
vladsavelyev Sep 4, 2023
980f9ca
Only commit/push if changed
vladsavelyev Sep 4, 2023
c1ed0ca
Stip module name from module update change log entry
vladsavelyev Sep 4, 2023
34e3a8b
Merge branch 'master' into changelog-ci
vladsavelyev Sep 4, 2023
f3b9b08
Clean up and rename
vladsavelyev Sep 4, 2023
61abcfe
Simplify CI
vladsavelyev Sep 4, 2023
a3fced0
Fix path to changelog.py
vladsavelyev Sep 4, 2023
261feec
Quast: some update (#2028)
vladsavelyev Sep 4, 2023
63b9f1c
Simplify
vladsavelyev Sep 4, 2023
e28b03f
Check if ownder is ewels
vladsavelyev Sep 4, 2023
fa8c623
Branch master
vladsavelyev Sep 4, 2023
93be31d
Use GITHUB_WORKSPACE to get base path
vladsavelyev Sep 4, 2023
5cdcaff
Try git cloning test data for windows
vladsavelyev Sep 5, 2023
142d986
Revert "Try git cloning test data for windows"
vladsavelyev Sep 5, 2023
389d9c1
Push commit into the branch on review approval
vladsavelyev Sep 5, 2023
3fc9878
Skip if changelog was modified
vladsavelyev Sep 5, 2023
342996f
Merge branch 'master' into changelog-ci
vladsavelyev Sep 24, 2023
fd14ed6
Replace changelog line if exists
vladsavelyev Sep 24, 2023
60e41ea
Pass comment. Replace multi-line new module lines
vladsavelyev Sep 24, 2023
90ad445
Fix
vladsavelyev Sep 24, 2023
856ae2e
Fixes
vladsavelyev Sep 24, 2023
12611ca
Fix
vladsavelyev Sep 26, 2023
bf816d1
Fix
vladsavelyev Sep 26, 2023
fd60022
Determine added or changed module from the changelog
vladsavelyev Sep 26, 2023
1cd848a
Docs
vladsavelyev Sep 26, 2023
49fe082
Docs
vladsavelyev Sep 26, 2023
183bd04
Fix
vladsavelyev Sep 26, 2023
2d8c5fe
Fix
vladsavelyev Sep 26, 2023
9bf2cef
Simplify module updates, replace sections in old versions
vladsavelyev Sep 26, 2023
68a3d12
Sort only module updates
vladsavelyev Sep 26, 2023
4a080f7
Update .github/workflows/changelog.py
vladsavelyev Sep 27, 2023
211 changes: 211 additions & 0 deletions .github/workflows/changelog.py
@@ -0,0 +1,211 @@
"""
To be called by a CI action. Assumes that the PR_TITLE, PR_NUMBER and GITHUB_WORKSPACE environment variables are set.

Adds a line into the CHANGELOG.md:
If a PR title starts with "New module: ", adds a line under the "### New modules" section.
If a PR title starts with the name of an existing module, adds a line under "### Module updates".
Everything else will go under "### MultiQC updates" in the changelog, unless "(chore)" or "(docs)" is appended to the title.

Other assumptions:
- CHANGELOG.md has a running section for an ongoing "dev" version (i.e. titled "## MultiQC vX.Ydev").
- Under that section, there are sections "### MultiQC updates", "### New modules" and "### Module updates".
- For module meta info, checks the file multiqc/modules/<module_name>/<module_name>.py.
"""

import os
import re
import sys
from pathlib import Path

REPO_URL = "https://github.com/ewels/MultiQC"

# Assumes the environment is set by the GitHub action.
pr_title = os.environ["PR_TITLE"]
pr_number = os.environ["PR_NUMBER"]
comment = os.environ.get("COMMENT", "")
base_path = Path(os.environ.get("GITHUB_WORKSPACE", ""))

assert pr_title, pr_title
assert pr_number, pr_number

# Trim the PR number added when GitHub squashes commits, e.g. "Module: Updated (#2026)"
pr_title = pr_title.removesuffix(f" (#{pr_number})")

changelog_path = base_path / "CHANGELOG.md"


def find_module_info(module_name):
    """
    Helper function to load module meta info. With current setup, can't really just
    import the module and call `mod.info`, as the module does the heavy work on
    initialization. But that's actually alright: we avoid installing and importing
    MultiQC and the action runs faster.
    """
    module_name = module_name.lower()
    modules_dir = base_path / "multiqc/modules"
    py_path = None
    for dir_name in os.listdir(modules_dir):
        if dir_name.lower() == module_name:
            module_dir = modules_dir / dir_name
            py_path = module_dir / f"{dir_name}.py"
            if not py_path.exists():
                print(f"Folder for {module_name} exists, but doesn't have a {py_path} file", file=sys.stderr)
                sys.exit(1)
            break

    if not py_path:  # Module not found
        return None
    with py_path.open("r") as f:
        contents = f.read()
    if not (m := re.search(r'name="([^"]+)"', contents)):
        return None
    name = m.group(1)
    if not (m := re.search(r'href="([^"]+)"', contents)):
        return None
    url = m.group(1)
    if not (m := re.search(r'info="([^"]+)"', contents)):
        if not (m := re.search(r'info="""([^"]+)"""', contents)):
            return None
    info = m.group(1)
    # Reduce consecutive spaces and newlines.
    info = re.sub(r"\s+", " ", info)
    return {"name": name, "url": url, "info": info}


# Determine the type of the PR: new module, module update, or core update.
mod = None
section = "### MultiQC updates"  # Default section for non-module (core) updates.
if pr_title.lower().startswith("new module: "):
    # PR introduces a new module.
    section = "### New modules"
    module_name = pr_title.split(":")[1].strip()
    mod = find_module_info(module_name)
    if not mod:
        # That should normally never happen because the other CI would fail and block
        # merging of the PR.
        print(
            f"Cannot load a module with name {module_name}",
            file=sys.stderr,
        )
        sys.exit(1)
else:
    # Checking if it's an existing module update, i.e. the title looks like
    # "<module name>: <description>". Titles without a colon are never treated as
    # module updates, and any further colons are kept in the description.
    maybe_mod_name, colon, description = pr_title.partition(":")
    if colon:
        mod = find_module_info(maybe_mod_name.strip())
    if mod is not None:
        section = "### Module updates"
        pr_title = description.strip().capitalize()

# Now that we determined the PR type, preparing the change log entry.
pr_link = f"([#{pr_number}]({REPO_URL}/pull/{pr_number}))"
if comment := comment.removeprefix("@multiqc-bot changelog").strip():
    new_lines = [
        f"- {comment} {pr_link}\n",
    ]
elif section == "### New modules":
    new_lines = [
        f"- [**{mod['name']}**]({mod['url']}) {pr_link}\n",
        f"  - {mod['name']} {mod['info']}\n",
    ]
elif section == "### Module updates":
    assert mod is not None
    new_lines = [
        f"- **{mod['name']}**\n",
        f"  - {pr_title} {pr_link}\n",
    ]
else:
    new_lines = [
        f"- {pr_title} {pr_link}\n",
    ]

# Finally, updating the changelog.
# Read the current changelog lines. We will print them back as is, except for one new
# entry, corresponding to this new PR.
with changelog_path.open("r") as f:
    orig_lines = f.readlines()
updated_lines = []

# Find the next line in the change log that matches the pattern "## MultiQC v.*dev"
# If it doesn't exist, exit with code 1 (let's assume that a new section is added
# manually or by CI when a release is pushed).
# Else, find the next line that matches the `section` variable, and insert a new line
# under it (we also assume that section headers are added already).
inside_version_dev = False
while orig_lines:
    line = orig_lines.pop(0)

    if line.startswith("## "):  # Version header, e.g. "## MultiQC v1.10dev"
        updated_lines.append(line)

        # Parse version from the line ## MultiQC v1.10dev or
        # ## [MultiQC v1.15](https://github.com/ewels/MultiQC/releases/tag/v1.15) ...
        if not (m := re.match(r".*MultiQC (v\d+\.\d+(dev)?).*", line)):
            print(f"Cannot parse version from line {line.strip()}.", file=sys.stderr)
            sys.exit(1)
        version = m.group(1)

        if not inside_version_dev:
            if not version.endswith("dev"):
                print(
                    "Can't find a 'dev' version section in the changelog. Make sure "
                    "it's created, and sections MultiQC updates, New modules and "
                    "Module updates are added under it.",
                    file=sys.stderr,
                )
                sys.exit(1)
            inside_version_dev = True
        else:
            if version.endswith("dev"):
                print(
                    f"Found another 'dev' version section in the changelog, make "
                    f"sure to change it to a 'release' stable version tag. "
                    f"Line: {line.strip()}",
                    file=sys.stderr,
                )
                sys.exit(1)
            # We are past the dev version, so just add back the rest of the lines and break.
            updated_lines.extend(orig_lines)
            break

    elif inside_version_dev and line.lower().startswith(section.lower()):  # Section of interest header
        if new_lines is None:
            print(f"Already added new lines into section {section}, is the section duplicated?", file=sys.stderr)
            sys.exit(1)
        updated_lines.append(line)
        # Collecting lines until the next section.
        section_lines = []
        while True:
            line = orig_lines.pop(0)
            if line.startswith("##"):
                # Found the next section header, so need to put all the lines we collected.
                updated_lines.append("\n")
                updated_lines.extend(section_lines)
                updated_lines.extend(new_lines)
                updated_lines.append("\n")
                print(f"Updated {changelog_path} section '{section}' with lines:\n" + "".join(new_lines))
                new_lines = None
                # Pushing back the next section header line
                orig_lines.insert(0, line)
                break
            elif line.strip():
                # If the line already contains a link to the PR, don't add it again.
                if line.strip().endswith(pr_link):
                    existing = line + "".join(orig_lines[: len(new_lines) - 1])
                    if "".join(new_lines) == existing:
                        print(f"Found existing identical entry for this pull request #{pr_number}:")
                        print(existing)
                        sys.exit(0)
                    else:
                        print(f"Found existing entry for this pull request #{pr_number}. It will be replaced:")
                        print(existing)
                        for _ in range(len(new_lines) - 1):
                            orig_lines.pop(0)
                else:
                    section_lines.append(line)
    else:
        updated_lines.append(line)


# Finally, writing the updated lines back.
with changelog_path.open("w") as f:
    f.writelines(updated_lines)
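
The title-to-section rules described in the module docstring can be sketched as a pure function. This is only an illustration, not code from this PR: `build_entry` and the `known_modules` dict are hypothetical stand-ins (the script itself looks modules up on disk with `find_module_info`), and the `COMMENT` override and title capitalisation are left out for brevity.

```python
REPO_URL = "https://github.com/ewels/MultiQC"


def build_entry(pr_title, pr_number, known_modules):
    """Return (section header, changelog lines) for a PR title.

    `known_modules` is a hypothetical replacement for find_module_info():
    a dict mapping lower-cased module names to {"name", "url", "info"}.
    """
    # Drop the " (#1234)" suffix that GitHub appends when squashing commits.
    pr_title = pr_title.removesuffix(f" (#{pr_number})")
    pr_link = f"([#{pr_number}]({REPO_URL}/pull/{pr_number}))"
    prefix, colon, rest = pr_title.partition(":")
    prefix, rest = prefix.strip(), rest.strip()

    # "New module: <name>" goes under "### New modules".
    if colon and prefix.lower() == "new module":
        mod = known_modules.get(rest.lower())
        if mod is None:
            raise ValueError(f"Cannot load a module with name {rest}")
        return "### New modules", [
            f"- [**{mod['name']}**]({mod['url']}) {pr_link}\n",
            f"  - {mod['name']} {mod['info']}\n",
        ]

    # "<existing module>: <description>" goes under "### Module updates".
    mod = known_modules.get(prefix.lower()) if colon else None
    if mod is not None:
        return "### Module updates", [
            f"- **{mod['name']}**\n",
            f"  - {rest} {pr_link}\n",
        ]

    # Everything else is a core update.
    return "### MultiQC updates", [f"- {pr_title} {pr_link}\n"]


if __name__ == "__main__":
    # Made-up module metadata, for demonstration only.
    modules = {"quast": {"name": "QUAST", "url": "https://example.org/quast", "info": "is an example entry."}}
    section, lines = build_entry("Quast: some update (#2028)", "2028", modules)
    print(section)
    print("".join(lines), end="")
```

Keeping the classification free of file and environment access would make it possible to unit-test the rules without a checkout of the repository.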
61 changes: 61 additions & 0 deletions .github/workflows/changelog.yml
@@ -0,0 +1,61 @@
name: Update CHANGELOG.md
on:
  issue_comment:
    types: [created]

  # For manually triggering this workflow
  workflow_dispatch:
    inputs:
      number:
        description: PR number
        required: true

env:
  PR_TITLE: ${{ github.event.pull_request.title }}
  PR_NUMBER: ${{ github.event.pull_request.number }}
  GH_TOKEN: ${{ github.token }}

jobs:
  update_changelog:
    runs-on: ubuntu-latest
    # Only run if comment is on a PR with the main repo, and if it contains the magic keywords
    if: >
      github.repository_owner == 'ewels' &&
      github.event.issue.pull_request &&
      startsWith(github.event.comment.body, '@multiqc-bot changelog')

    steps:
      - uses: actions/checkout@v3
        with:
          token: ${{ secrets.MQC_BOT_GITHUB_TOKEN }}

      # Action runs on the issue comment, so we don't get the PR by default
      # Use the gh cli to check out the PR
      - name: Checkout Pull Request
        run: gh pr checkout ${{ github.event.issue.number }}
        env:
          GITHUB_TOKEN: ${{ secrets.nf_core_bot_auth_token }}

      - uses: actions/setup-python@v3

      - name: Update CHANGELOG.md from the PR title
        env:
          PR_TITLE: ${{ github.event.pull_request.title }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          COMMENT: ${{ github.event.comment.body }}
        run: python ${GITHUB_WORKSPACE}/.github/workflows/changelog.py

      - name: Check if CHANGELOG.md actually changed
        run: |
          git diff --exit-code ${GITHUB_WORKSPACE}/CHANGELOG.md || echo "changed=YES" >> $GITHUB_ENV
          echo "file changed: ${{ env.changed }}"

      - name: Push changes
        run: |
          git config user.name 'MultiQC Bot'
          git config user.email 'multiqc-bot@seqera.io'
          git config push.default upstream
          git add ${GITHUB_WORKSPACE}/CHANGELOG.md
          git status
          git commit -m "[automated] Update CHANGELOG.md"
          git push
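
Because the workflow runs on `issue_comment`, the webhook payload describes the pull request under `issue` (title, number) rather than under `pull_request`. As a hedged sketch of how a script could pick these values up by itself, the snippet below reads the payload file that GitHub Actions exposes through `GITHUB_EVENT_PATH`. The helper names `pr_info_from_event` and `load_event` are hypothetical and not part of this PR.

```python
import json
import os


def pr_info_from_event(event):
    """Extract (title, number, comment body) from a webhook payload dict."""
    # `pull_request` events carry the PR under "pull_request";
    # `issue_comment` events on a PR carry it under "issue".
    pr = event.get("pull_request") or event.get("issue") or {}
    comment = (event.get("comment") or {}).get("body", "")
    return pr.get("title", ""), str(pr.get("number", "")), comment


def load_event():
    """Load the payload of the event that triggered the current workflow run."""
    with open(os.environ["GITHUB_EVENT_PATH"]) as f:
        return json.load(f)


if __name__ == "__main__":
    # Hand-written payload fragment, for demonstration only.
    sample = {
        "issue": {"title": "Quast: some update", "number": 2028, "pull_request": {}},
        "comment": {"body": "@multiqc-bot changelog"},
    }
    print(pr_info_from_event(sample))
```

Inside a real run, `pr_info_from_event(load_event())` would return the same triple without relying on the `PR_TITLE` and `PR_NUMBER` expressions in the workflow file.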
5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -4,10 +4,14 @@

### MultiQC updates

- New super awesome update ([#2026](https://github.com/ewels/MultiQC/pull/2026))

### New Modules

- [**Bracken**](https://ccb.jhu.edu/software/bracken/)
  - A highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.
- [**BBDuk**](https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbduk-guide/) ([#2026](https://github.com/ewels/MultiQC/pull/2026))
  - BBDuk is a tool performing common data-quality-related trimming, filtering, and masking operations with a kmer based approach

### Module updates

@@ -2119,4 +2123,3 @@ Bugfixes:
- The first public release of MultiQC, after a month of development. Basic
  structure in place and modules for FastQC, FastQ Screen, Cutadapt, Bismark,
  STAR, Bowtie, Subread featureCounts and Picard MarkDuplicates. Approaching
  stability, though still under fairly heavy development.