@zeyi2
Member

@zeyi2 zeyi2 commented Nov 20, 2025

This PR is not ready for review now.

Closes #167098

@zeyi2 zeyi2 marked this pull request as draft November 20, 2025 05:23
@llvmbot
Member

llvmbot commented Nov 20, 2025

@llvm/pr-subscribers-clang-tidy

@llvm/pr-subscribers-github-workflow

Author: mitchell (zeyi2)

Changes

This PR is not ready for review now.

Closes #167098


Full diff: https://github.com/llvm/llvm-project/pull/168827.diff

3 Files Affected:

  • (modified) .github/workflows/containers/github-action-ci-tooling/Dockerfile (+4)
  • (modified) .github/workflows/pr-code-lint.yml (+19-6)
  • (modified) llvm/utils/git/code-lint-helper.py (+180-21)
diff --git a/.github/workflows/containers/github-action-ci-tooling/Dockerfile b/.github/workflows/containers/github-action-ci-tooling/Dockerfile
index b78c99efb9be3..8d02baa05f489 100644
--- a/.github/workflows/containers/github-action-ci-tooling/Dockerfile
+++ b/.github/workflows/containers/github-action-ci-tooling/Dockerfile
@@ -94,6 +94,10 @@ COPY --from=llvm-downloader /llvm-extract/LLVM-${LLVM_VERSION}-Linux-X64/bin/cla
 COPY clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py ${LLVM_SYSROOT}/bin/clang-tidy-diff.py
 
 # Install dependencies for 'pr-code-lint.yml' job
+RUN apt-get update && \
+    DEBIAN_FRONTEND=noninteractive apt-get install -y python3-doc8 && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*
 COPY llvm/utils/git/requirements_linting.txt requirements_linting.txt
 RUN pip install -r requirements_linting.txt --break-system-packages && \
     rm requirements_linting.txt
diff --git a/.github/workflows/pr-code-lint.yml b/.github/workflows/pr-code-lint.yml
index 5444a29c22205..60c1900000e5e 100644
--- a/.github/workflows/pr-code-lint.yml
+++ b/.github/workflows/pr-code-lint.yml
@@ -30,7 +30,7 @@ jobs:
         uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
         with:
           fetch-depth: 2
-      
+
       - name: Get changed files
         id: changed-files
         uses: tj-actions/changed-files@24d32ffd492484c1d75e0c0b894501ddb9d30d62 # v47.0.0
@@ -39,14 +39,14 @@ jobs:
           skip_initial_fetch: true
           base_sha: 'HEAD~1'
           sha: 'HEAD'
-      
+
       - name: Listed files
         env:
           CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }}
         run: |
           echo "Changed files:"
           echo "$CHANGED_FILES"
-      
+
       # TODO: create special mapping for 'codegen' targets, for now build predefined set
       # TODO: add entrypoint in 'compute_projects.py' that only adds a project and its direct dependencies
       - name: Configure and CodeGen
@@ -71,25 +71,38 @@ jobs:
                 -DLLVM_INCLUDE_TESTS=OFF \
                 -DCLANG_INCLUDE_TESTS=OFF \
                 -DCMAKE_BUILD_TYPE=Release
-          
+
           ninja -C build \
                 clang-tablegen-targets \
                 genconfusable               # for "ConfusableIdentifierCheck.h"
 
-      - name: Run code linter
+      - name: Run clang-tidy linter
         env:
           GITHUB_PR_NUMBER: ${{ github.event.pull_request.number }}
           CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }}
         run: |
           echo "[]" > comments &&
           python3 llvm/utils/git/code-lint-helper.py \
+            --linter clang-tidy \
             --token ${{ secrets.GITHUB_TOKEN }} \
             --issue-number $GITHUB_PR_NUMBER \
             --start-rev HEAD~1 \
             --end-rev HEAD \
             --verbose \
             --changed-files "$CHANGED_FILES"
-      
+
+      - name: Run doc8 linter
+        env:
+          GITHUB_PR_NUMBER: ${{ github.event.pull_request.number }}
+        run: |
+          python3 llvm/utils/git/code-lint-helper.py \
+            --linter doc8 \
+            --token ${{ secrets.GITHUB_TOKEN }} \
+            --issue-number $GITHUB_PR_NUMBER \
+            --start-rev HEAD~1 \
+            --end-rev HEAD \
+            --verbose
+
       - name: Upload results
         uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # v5.0.0
         if: always()
diff --git a/llvm/utils/git/code-lint-helper.py b/llvm/utils/git/code-lint-helper.py
index 1232f3ab0d370..fc2068b438209 100755
--- a/llvm/utils/git/code-lint-helper.py
+++ b/llvm/utils/git/code-lint-helper.py
@@ -34,6 +34,8 @@ class LintArgs:
     issue_number: int = 0
     build_path: str = "build"
     clang_tidy_binary: str = "clang-tidy"
+    doc8_binary: str = "doc8"
+    linter: str = None
 
     def __init__(self, args: argparse.Namespace = None) -> None:
         if not args is None:
@@ -46,9 +48,12 @@ def __init__(self, args: argparse.Namespace = None) -> None:
             self.verbose = args.verbose
             self.build_path = args.build_path
             self.clang_tidy_binary = args.clang_tidy_binary
+            self.doc8_binary = args.doc8_binary
+            self.linter = args.linter
 
 
-COMMENT_TAG = "<!--LLVM CODE LINT COMMENT: clang-tidy-->"
+COMMENT_TAG_CLANG_TIDY = "<!--LLVM CODE LINT COMMENT: clang-tidy-->"
+COMMENT_TAG_DOC8 = "<!--LLVM CODE LINT COMMENT: doc8-->"
 
 
 def get_instructions(cpp_files: List[str]) -> str:
@@ -135,13 +140,22 @@ def create_comment_text(warning: str, cpp_files: List[str]) -> str:
 """
 
 
-def find_comment(pr: any) -> any:
+def find_comment(pr: any, args: LintArgs) -> any:
+    comment_tag = get_comment_tag(args.linter)
     for comment in pr.as_issue().get_comments():
-        if COMMENT_TAG in comment.body:
+        if comment_tag in comment.body:
             return comment
     return None
 
 
+def get_comment_tag(linter: str) -> str:
+    if linter == "clang-tidy":
+        return COMMENT_TAG_CLANG_TIDY
+    elif linter == "doc8":
+        return COMMENT_TAG_DOC8
+    raise ValueError(f"Unknown linter: {linter}")
+
+
 def create_comment(
     comment_text: str, args: LintArgs, create_new: bool
 ) -> Optional[dict]:
@@ -150,9 +164,10 @@ def create_comment(
     repo = github.Github(args.token).get_repo(args.repo)
     pr = repo.get_issue(args.issue_number).as_pull_request()
 
-    comment_text = COMMENT_TAG + "\n\n" + comment_text
+    comment_tag = get_comment_tag(args.linter)
+    comment_text = comment_tag + "\n\n" + comment_text
 
-    existing_comment = find_comment(pr)
+    existing_comment = find_comment(pr, args)
 
     comment = None
     if create_new or existing_comment:
@@ -215,7 +230,126 @@ def run_clang_tidy(changed_files: List[str], args: LintArgs) -> Optional[str]:
     return clean_clang_tidy_output(proc.stdout.strip())
 
 
-def run_linter(changed_files: List[str], args: LintArgs) -> tuple[bool, Optional[dict]]:
+
+def clean_doc8_output(output: str) -> Optional[str]:
+    if not output:
+        return None
+
+    lines = output.split("\n")
+    cleaned_lines = []
+    in_summary = False
+
+    for line in lines:
+        if line.startswith("Scanning...") or line.startswith("Validating..."):
+            continue
+        if line.startswith("========"):
+            in_summary = True
+            continue
+        if in_summary:
+            continue
+        if line.strip():
+            cleaned_lines.append(line)
+
+    if cleaned_lines:
+        return "\n".join(cleaned_lines)
+    return None
+
+
+def get_doc8_instructions() -> str:
+    # TODO: use git diff
+    return "doc8 ./clang-tools-extra/docs/clang-tidy/checks/"
+
+
+def create_doc8_comment_text(doc8_output: str) -> str:
+    instructions = get_doc8_instructions()
+    return f"""
+:warning: Documentation linter doc8 found issues in your code. :warning:
+
+<details>
+<summary>
+You can test this locally with the following command:
+</summary>
+
+```bash
+{instructions}
+```
+
+</details>
+
+<details>
+<summary>
+View the output from doc8 here.
+</summary>
+
+```
+{doc8_output}
+```
+
+</details>
+"""
+
+
+def run_doc8(args: LintArgs) -> tuple[int, Optional[str]]:
+    doc8_cmd = [args.doc8_binary, "./clang-tools-extra/docs/clang-tidy/checks/"]
+
+    if args.verbose:
+        print(f"Running doc8: {' '.join(doc8_cmd)}")
+
+    proc = subprocess.run(
+        doc8_cmd,
+        stdout=subprocess.PIPE,
+        stderr=subprocess.PIPE,
+        text=True,
+        check=False,
+    )
+
+    cleaned_output = clean_doc8_output(proc.stdout.strip())
+    if proc.returncode != 0 and cleaned_output is None:
+        # Infrastructure failure
+        return proc.returncode, proc.stderr.strip()
+
+    return proc.returncode, cleaned_output
+
+
+def run_doc8_linter(args: LintArgs) -> tuple[bool, Optional[dict]]:
+    returncode, result = run_doc8(args)
+    should_update_gh = args.token is not None and args.repo is not None
+    comment = None
+
+    if returncode == 0:
+        if should_update_gh:
+            comment_text = (
+                ":white_check_mark: With the latest revision "
+                "this PR passed the documentation linter."
+            )
+            comment = create_comment(comment_text, args, create_new=False)
+        return True, comment
+    else:
+        if should_update_gh:
+            if result:
+                comment_text = create_doc8_comment_text(result)
+                comment = create_comment(comment_text, args, create_new=True)
+            else:
+                comment_text = (
+                    ":warning: The documentation linter failed without printing "
+                    "an output. Check the logs for output. :warning:"
+                )
+                comment = create_comment(comment_text, args, create_new=False)
+        else:
+            if result:
+                print(
+                    "Warning: Documentation linter, doc8 detected "
+                    "some issues with your code..."
+                )
+                print(result)
+            else:
+                print("Warning: Documentation linter, doc8 failed to run.")
+        return False, comment
+
+
+def run_clang_tidy_linter(
+    changed_files: List[str], args: LintArgs
+) -> tuple[bool, Optional[dict]]:
     changed_files = [arg for arg in changed_files if "third-party" not in arg]
 
     cpp_files = filter_changed_files(changed_files)
@@ -255,6 +389,13 @@ def run_linter(changed_files: List[str], args: LintArgs) -> tuple[bool, Optional
 
 if __name__ == "__main__":
     parser = argparse.ArgumentParser()
+    parser.add_argument(
+        "--linter",
+        type=str,
+        choices=["clang-tidy", "doc8"],
+        required=True,
+        help="The linter to run.",
+    )
     parser.add_argument(
         "--token", type=str, required=True, help="GitHub authentication token"
     )
@@ -291,6 +432,12 @@ def run_linter(changed_files: List[str], args: LintArgs) -> tuple[bool, Optional
         default="clang-tidy",
         help="Path to clang-tidy binary",
     )
+    parser.add_argument(
+        "--doc8-binary",
+        type=str,
+        default="doc8",
+        help="Path to doc8 binary",
+    )
     parser.add_argument(
         "--verbose", action="store_true", default=True, help="Verbose output"
     )
@@ -298,32 +445,44 @@ def run_linter(changed_files: List[str], args: LintArgs) -> tuple[bool, Optional
     parsed_args = parser.parse_args()
     args = LintArgs(parsed_args)
 
-    changed_files = []
-    if args.changed_files:
-        changed_files = args.changed_files.split(",")
-
-    if args.verbose:
-        print(f"got changed files: {changed_files}")
-
     if args.verbose:
-        print("running linter clang-tidy")
+        print(f"running linter {args.linter}")
 
-    success, comment = run_linter(changed_files, args)
+    success, comment = False, None
+    if args.linter == "clang-tidy":
+        changed_files = []
+        if args.changed_files:
+            changed_files = args.changed_files.split(",")
+        if args.verbose:
+            print(f"got changed files: {changed_files}")
+        success, comment = run_clang_tidy_linter(changed_files, args)
+    elif args.linter == "doc8":
+        success, comment = run_doc8_linter(args)
 
     if not success:
         if args.verbose:
-            print("linter clang-tidy failed")
+            print(f"linter {args.linter} failed")
 
     # Write comments file if we have a comment
     if comment:
+        import json
         if args.verbose:
-            print(f"linter clang-tidy has comment: {comment}")
+            print(f"linter {args.linter} has comment: {comment}")
 
-        with open("comments", "w") as f:
-            import json
+        existing_comments = []
+        if os.path.exists("comments"):
+            with open("comments", "r") as f:
+                try:
+                    existing_comments = json.load(f)
+                except json.JSONDecodeError:
+                    # File might be empty or invalid, start fresh
+                    pass
 
-            json.dump([comment], f)
+        existing_comments.append(comment)
+
+        with open("comments", "w") as f:
+            json.dump(existing_comments, f)
 
     if not success:
-        print("error: some linters failed: clang-tidy")
+        print(f"error: linter {args.linter} failed")
         sys.exit(1)
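
The last hunk above changes the `comments` file from being overwritten to being appended to, so that the clang-tidy and doc8 steps can each contribute an entry. A minimal standalone sketch of that read-append-write pattern follows. The `append_comment` helper and the caller-supplied path are hypothetical and not part of the PR; the extra `isinstance` guard is also an addition made for illustration.

```python
import json
import os
import tempfile


def append_comment(path: str, comment: dict) -> list:
    """Append one comment to a JSON list stored at `path` (hypothetical helper)."""
    existing = []
    if os.path.exists(path):
        with open(path, "r") as f:
            try:
                existing = json.load(f)
            except json.JSONDecodeError:
                # Empty or invalid file: start from an empty list.
                existing = []
    if not isinstance(existing, list):
        # Assumption: anything other than a JSON list is treated as unusable.
        existing = []
    existing.append(comment)
    with open(path, "w") as f:
        json.dump(existing, f)
    return existing


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "comments")
        append_comment(path, {"body": "clang-tidy result"})
        print(append_comment(path, {"body": "doc8 result"}))
```

Because each linter step runs as a separate process, the file on disk is the only shared state, which is why the second step has to read before it writes.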


# Install dependencies for 'pr-code-lint.yml' job
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y python3-doc8 && \
Contributor

@vbvictor vbvictor Nov 20, 2025

Could you measure how much bigger the code-lint container becomes with this change? If it becomes noticeably bigger, we can probably install doc8 via requirements_linting.txt instead and see how much bigger it gets that way.

Note that, on average, every 100 MB of container size adds about 1 second to each job run because of the longer download, so keeping the image small pays off in the long run.

Member Author

I think doc8 is a pretty light dependency:

Package: python3-doc8
Version: 0.10.1-3
Installed-Size: 95.2 kB
Download-Size: 17.0 kB

The full build process is still running on my computer. So it will take some time before I can give a more accurate analysis. However, given that the only change in the Docker image is the addition of the doc8 package, I don't think it will have a large impact. WDYT?

Contributor

Yes, it's pretty small then, no need to check.
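
As a rough check of the numbers in this thread, the rule of thumb quoted above (about 1 second per 100 MB) can be applied to the reported package size. This is only a back-of-the-envelope sketch: it assumes the cost scales linearly and ignores the size of doc8's own dependencies, which the thread does not list. The `extra_seconds` function is a hypothetical name.

```python
SECONDS_PER_100_MB = 1.0  # rule of thumb quoted in the review comment


def extra_seconds(size_kb: float) -> float:
    """Estimate added job time for `size_kb` kilobytes, assuming linear scaling."""
    size_mb = size_kb / 1000.0
    return size_mb / 100.0 * SECONDS_PER_100_MB


if __name__ == "__main__":
    # Installed-Size reported for python3-doc8 0.10.1-3 in this thread.
    print(f"{extra_seconds(95.2):.6f} s")
```

Under these assumptions the package itself adds well under a millisecond per job, which is consistent with the conclusion reached in the thread.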

@zeyi2 zeyi2 force-pushed the clang-tidy-doc8-ci branch from 260cd88 to 8fc6300 Compare November 20, 2025 05:53
- name: Run code linter
- name: Install linter dependencies
run: pip install doc8 --break-system-packages
Member Author

@zeyi2 zeyi2 Nov 20, 2025

I understand that this is completely wrong. The purpose of this step is to install the doc8 package (it seems that modifying the Dockerfile doesn't work in premerge CI) so I can test whether the modified Python script correctly finds the issues in the repo.

Contributor

> (seems that modifying Dockerfile doesn't work in premerge CI)

Yes, we need to do it in two steps:

First, make a PR with the new Dockerfile and land it on the main branch. The image built from it will become the default at https://github.com/llvm/llvm-project/pkgs/container/ci-ubuntu-24.04-lint.

Then, in this PR, you will be able to use the container built from the Dockerfile with doc8 installed.

Contributor

Another approach would be to build the CI lint container locally and push it to the LLVM GitHub container registry using a personal token.
I did this in #164294.

I already have a setup for this, so I can do it myself and then we can test with the real container.

Contributor

You can do it locally with podman build, podman login <token>, podman push

@github-actions

github-actions bot commented Nov 20, 2025

⚠️ Python code formatter, darker found issues in your code. ⚠️

You can test this locally with the following command:
darker --check --diff -r origin/main...HEAD llvm/utils/git/code-lint-helper.py

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from darker here.
--- code-lint-helper.py	2025-11-21 06:27:57.000000 +0000
+++ code-lint-helper.py	2025-11-21 06:31:41.318496 +0000
@@ -308,11 +308,11 @@
 
 
 def run_doc8_linter(args: LintArgs) -> tuple[bool, Optional[dict]]:
     changed_files = []
     if args.changed_files:
-        changed_files = args.changed_files.split(',')
+        changed_files = args.changed_files.split(",")
     doc_files = filter_doc_files(changed_files)
 
     is_success = True
     result = None
 

@github-actions

github-actions bot commented Nov 20, 2025

✅ With the latest revision this PR passed the documentation linter.

@zeyi2 zeyi2 force-pushed the clang-tidy-doc8-ci branch from 9fe22f1 to 27c30ab Compare November 20, 2025 06:16
@zeyi2
Member Author

zeyi2 commented Nov 20, 2025

doc8 seems to be working right now :)

@github-actions

🐧 Linux x64 Test Results

  • 186412 tests passed
  • 4868 tests skipped

@zeyi2 zeyi2 requested a review from vbvictor November 20, 2025 07:14
Comment on lines +99 to +109
      - name: Run doc8 linter
        env:
          GITHUB_PR_NUMBER: ${{ github.event.pull_request.number }}
        run: |
          python3 llvm/utils/git/code-lint-helper.py \
            --linter doc8 \
            --token ${{ secrets.GITHUB_TOKEN }} \
            --issue-number $GITHUB_PR_NUMBER \
            --start-rev HEAD~1 \
            --end-rev HEAD \
            --verbose
Contributor

We should do it in one step inside code-lint-helper.py. It should be a generic helper that accumulates all linters.
Look at the formatting job, which runs multiple linters under one code-format-helper.py invocation.
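
One way to read this suggestion is a small registry of linters that a single invocation iterates over, collecting each result. The sketch below is hypothetical: the `LinterResult` class, the `run_all_linters` function, and the stub linters are illustrative names, not code from `code-lint-helper.py` or `code-format-helper.py`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class LinterResult:
    name: str
    success: bool
    comment: Optional[dict] = None


def run_all_linters(
    linters: Dict[str, Callable[[List[str]], LinterResult]],
    changed_files: List[str],
) -> List[LinterResult]:
    """Run every registered linter once and collect the results in order."""
    return [run(changed_files) for run in linters.values()]


# Stub linters standing in for the real clang-tidy and doc8 runners.
def fake_clang_tidy(files: List[str]) -> LinterResult:
    return LinterResult("clang-tidy", success=True)


def fake_doc8(files: List[str]) -> LinterResult:
    # Stand-in behavior only: pretend every changed .rst file has a violation.
    has_rst = any(f.endswith(".rst") for f in files)
    comment = {"body": "doc8 found issues"} if has_rst else None
    return LinterResult("doc8", success=not has_rst, comment=comment)


if __name__ == "__main__":
    results = run_all_linters(
        {"clang-tidy": fake_clang_tidy, "doc8": fake_doc8},
        ["a.cpp", "docs/check.rst"],
    )
    failed = [r.name for r in results if not r.success]
    comments = [r.comment for r in results if r.comment]
    print(failed, comments)
```

With this shape, the workflow needs only one step, and the list of comments can be written to the `comments` file once at the end instead of being merged across steps.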

Comment on lines 246 to 255
    for line in lines:
        if line.startswith("Scanning...") or line.startswith("Validating..."):
            continue
        if line.startswith("========"):
            in_summary = True
            continue
        if in_summary:
            continue
        if line.strip():
            cleaned_lines.append(line)
Contributor

Can we use doc8's "-q, --quiet" flag ("only print violations")?

Member Author

> Can we use -q, --quiet only print violations flag of doc8?

Thanks for pointing that out, that was my mistake. I forgot that this flag exists :)
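
With `-q`, the manual filtering of the `Scanning...`/`Validating...` lines and the summary block in `clean_doc8_output` should no longer be needed. A minimal sketch follows, assuming doc8 is on `PATH` and that `--quiet` prints only violations, as the flag description quoted above says. `build_doc8_cmd` and `run_doc8_quiet` are hypothetical names, not functions from the PR.

```python
import subprocess
from typing import List, Optional, Tuple


def build_doc8_cmd(paths: List[str], binary: str = "doc8") -> List[str]:
    """Build a doc8 command line that only prints violations."""
    return [binary, "-q", *paths]


def run_doc8_quiet(
    paths: List[str], binary: str = "doc8"
) -> Tuple[int, Optional[str]]:
    """Run doc8 quietly; return its exit code and any non-empty output."""
    proc = subprocess.run(
        build_doc8_cmd(paths, binary),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        check=False,
    )
    # Fall back to stderr so an infrastructure failure is still reported.
    output = proc.stdout.strip() or proc.stderr.strip()
    return proc.returncode, output or None
```

Keeping the command construction separate from the subprocess call makes it possible to reuse the same command string in the reproduction instructions posted to the PR.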

Comment on lines 262 to 264
def get_doc8_instructions() -> str:
    # TODO: use git diff
    return "doc8 ./clang-tools-extra/docs/clang-tidy/checks/"
Contributor

This should be implemented in the final version. It should be easy because changed_files is already passed as an argument: just check whether any of the files in it starts with "/clang-tools-extra/docs/clang-tidy/checks/".
Look at how clang-tidy handles it: --changed-files "$CHANGED_FILES"
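
A sketch of this suggestion, assuming `changed_files` arrives as the same comma-separated string that clang-tidy receives via `--changed-files`, and that the paths are repository-relative with no leading slash. The name `filter_doc_files` mirrors the one visible in the darker diff earlier in this conversation, but the body here is a guess, not the PR's implementation; restricting to `.rst` files is also an assumption.

```python
from typing import List

# Assumed repository-relative prefix (no leading slash).
DOC_PREFIX = "clang-tools-extra/docs/clang-tidy/checks/"


def filter_doc_files(changed_files: List[str]) -> List[str]:
    """Keep only .rst files under the clang-tidy checks documentation directory."""
    return [
        path
        for path in changed_files
        if path.startswith(DOC_PREFIX) and path.endswith(".rst")
    ]


def get_doc8_instructions(doc_files: List[str]) -> str:
    """Build the local reproduction command for exactly the changed doc files."""
    return "doc8 -q " + " ".join(doc_files)


if __name__ == "__main__":
    changed = (
        "clang-tools-extra/docs/clang-tidy/checks/list.rst,"
        "llvm/utils/git/code-lint-helper.py"
    ).split(",")
    print(get_doc8_instructions(filter_doc_files(changed)))
```

When the filtered list is empty, the doc8 step can be skipped entirely, which avoids linting the whole directory on PRs that do not touch the documentation.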

    parser.add_argument(
        "--linter",
        type=str,
        choices=["clang-tidy", "doc8"],
Contributor

For future enhancements: Flake8 and PyLint are great tools for Python linting. But it'll be necessary to tweak configuration.

Development

Successfully merging this pull request may close these issues.

[clang-tidy][docs] Fix trailing whitespaces and lines longer than 80 characters

4 participants