Skip to content

Conversation

tru
Copy link
Collaborator

@tru tru commented Nov 30, 2023

As part of #73798 there was some discussion about using the format
helper to run from a git-hook. That was not possible for a number of
reasons, but with these changes it can now be installed as a hook and
then run on the local cache in git instead of a diff between revisions.

This also checks for two environment variables DARKER_FORMAT_PATH and
CLANG_FORMAT_PATH where you can specify the path to the program you want
to use.

@llvmbot
Copy link
Member

llvmbot commented Nov 30, 2023

@llvm/pr-subscribers-github-workflow

Author: Tobias Hieta (tru)

Changes

As part of #73798 there was some discussion about using the format
helper to run from a git-hook. That was not possible for a number of
reasons, but with these changes it can now be installed as a hook and
then run on the local cache in git instead of a diff between revisions.

This also checks for two environment variables DARKER_FORMAT_PATH and
CLANG_FORMAT_PATH where you can specify the path to the program you want
to use.


Full diff: https://github.com/llvm/llvm-project/pull/73957.diff

1 Files Affected:

  • (modified) llvm/utils/git/code-format-helper.py (+134-45)
diff --git a/llvm/utils/git/code-format-helper.py b/llvm/utils/git/code-format-helper.py
old mode 100644
new mode 100755
index 8876ba7a14debec..4a2f7c2a53abbda
--- a/llvm/utils/git/code-format-helper.py
+++ b/llvm/utils/git/code-format-helper.py
@@ -14,8 +14,22 @@
 import sys
 from functools import cached_property
 
-import github
-from github import IssueComment, PullRequest
+
+class FormatArgs:
+    start_rev: str = None
+    end_rev: str = None
+    repo: str = None
+    changed_files: list[str] = []
+    token: str = None
+    verbose: bool = True
+
+    def __init__(self, args: argparse.Namespace = None) -> None:
+        if not args is None:
+            self.start_rev = args.start_rev
+            self.end_rev = args.end_rev
+            self.repo = args.repo
+            self.token = args.token
+            self.changed_files = args.changed_files
 
 
 class FormatHelper:
@@ -31,9 +45,10 @@ def comment_tag(self) -> str:
     def instructions(self) -> str:
         raise NotImplementedError()
 
-    def format_run(
-        self, changed_files: list[str], args: argparse.Namespace
-    ) -> str | None:
+    def has_tool(self) -> bool:
+        raise NotImplementedError()
+
+    def format_run(self, changed_files: list[str], args: FormatArgs) -> str | None:
         raise NotImplementedError()
 
     def pr_comment_text_for_diff(self, diff: str) -> str:
@@ -63,17 +78,16 @@ def pr_comment_text_for_diff(self, diff: str) -> str:
 </details>
 """
 
-    def find_comment(
-        self, pr: PullRequest.PullRequest
-    ) -> IssueComment.IssueComment | None:
+    def find_comment(self, pr: any) -> any:
         for comment in pr.as_issue().get_comments():
             if self.comment_tag in comment.body:
                 return comment
         return None
 
-    def update_pr(
-        self, comment_text: str, args: argparse.Namespace, create_new: bool
-    ) -> None:
+    def update_pr(self, comment_text: str, args: FormatArgs, create_new: bool) -> None:
+        import github
+        from github import IssueComment, PullRequest
+
         repo = github.Github(args.token).get_repo(args.repo)
         pr = repo.get_issue(args.issue_number).as_pull_request()
 
@@ -85,17 +99,25 @@ def update_pr(
         elif create_new:
             pr.as_issue().create_comment(comment_text)
 
-    def run(self, changed_files: list[str], args: argparse.Namespace) -> bool:
+    def run(self, changed_files: list[str], args: FormatArgs) -> bool:
         diff = self.format_run(changed_files, args)
+        should_update_gh = args.token is not None and args.repo is not None
+
         if diff is None:
-            comment_text = f"""
-:white_check_mark: With the latest revision this PR passed the {self.friendly_name}.
-"""
-            self.update_pr(comment_text, args, create_new=False)
+            if should_update_gh:
+                comment_text = f"""
+    :white_check_mark: With the latest revision this PR passed the {self.friendly_name}.
+    """
+                self.update_pr(comment_text, args, create_new=False)
             return True
         elif len(diff) > 0:
-            comment_text = self.pr_comment_text_for_diff(diff)
-            self.update_pr(comment_text, args, create_new=True)
+            if should_update_gh:
+                comment_text = self.pr_comment_text_for_diff(diff)
+                self.update_pr(comment_text, args, create_new=True)
+            else:
+                print(
+                    f"Warning: {self.friendly_name}, {self.name} detected some issues with your code formatting..."
+                )
             return False
         else:
             # The formatter failed but didn't output a diff (e.g. some sort of
@@ -128,29 +150,47 @@ def filter_changed_files(self, changed_files: list[str]) -> list[str]:
                 filtered_files.append(path)
         return filtered_files
 
-    def format_run(
-        self, changed_files: list[str], args: argparse.Namespace
-    ) -> str | None:
+    @property
+    def clang_fmt_path(self) -> str:
+        if "CLANG_FORMAT_PATH" in os.environ:
+            return os.environ["CLANG_FORMAT_PATH"]
+        return "git-clang-format"
+
+    def has_tool(self) -> bool:
+        cmd = [self.clang_fmt_path, "-h"]
+        proc = None
+        try:
+            proc = subprocess.run(cmd, capture_output=True)
+        except:
+            return False
+        return proc.returncode == 0
+
+    def format_run(self, changed_files: list[str], args: FormatArgs) -> str | None:
         cpp_files = self.filter_changed_files(changed_files)
         if not cpp_files:
             return None
-        cf_cmd = [
-            "git-clang-format",
-            "--diff",
-            args.start_rev,
-            args.end_rev,
-            "--",
-        ] + cpp_files
-        print(f"Running: {' '.join(cf_cmd)}")
+
+        cf_cmd = [self.clang_fmt_path, "--diff"]
+
+        if args.start_rev and args.end_rev:
+            cf_cmd.append(args.start_rev)
+            cf_cmd.append(args.end_rev)
+
+        cf_cmd.append("--")
+        cf_cmd += cpp_files
+
+        if args.verbose:
+            print(f"Running: {' '.join(cf_cmd)}")
         self.cf_cmd = cf_cmd
         proc = subprocess.run(cf_cmd, capture_output=True)
         sys.stdout.write(proc.stderr.decode("utf-8"))
 
         if proc.returncode != 0:
             # formatting needed, or the command otherwise failed
-            print(f"error: {self.name} exited with code {proc.returncode}")
-            # Print the diff in the log so that it is viewable there
-            print(proc.stdout.decode("utf-8"))
+            if args.verbose:
+                print(f"error: {self.name} exited with code {proc.returncode}")
+                # Print the diff in the log so that it is viewable there
+                print(proc.stdout.decode("utf-8"))
             return proc.stdout.decode("utf-8")
         else:
             sys.stdout.write(proc.stdout.decode("utf-8"))
@@ -174,29 +214,46 @@ def filter_changed_files(self, changed_files: list[str]) -> list[str]:
 
         return filtered_files
 
-    def format_run(
-        self, changed_files: list[str], args: argparse.Namespace
-    ) -> str | None:
+    @property
+    def darker_fmt_path(self) -> str:
+        if "DARKER_FORMAT_PATH" in os.environ:
+            return os.environ["DARKER_FORMAT_PATH"]
+        return "darker"
+
+    def has_tool(self) -> bool:
+        cmd = [self.darker_fmt_path, "--version"]
+        proc = None
+        try:
+            proc = subprocess.run(cmd, capture_output=True)
+        except:
+            return False
+        return proc.returncode == 0
+
+    def format_run(self, changed_files: list[str], args: FormatArgs) -> str | None:
         py_files = self.filter_changed_files(changed_files)
         if not py_files:
             return None
         darker_cmd = [
-            "darker",
+            self.darker_fmt_path,
             "--check",
             "--diff",
-            "-r",
-            f"{args.start_rev}..{args.end_rev}",
-        ] + py_files
-        print(f"Running: {' '.join(darker_cmd)}")
+        ]
+        if args.start_rev and args.end_rev:
+            darker_cmd += ["-r", f"{args.start_rev}..{args.end_rev}"]
+        darker_cmd += py_files
+        if args.verbose:
+            print(f"Running: {' '.join(darker_cmd)}")
         self.darker_cmd = darker_cmd
         proc = subprocess.run(darker_cmd, capture_output=True)
-        sys.stdout.write(proc.stderr.decode("utf-8"))
+        if args.verbose:
+            sys.stdout.write(proc.stderr.decode("utf-8"))
 
         if proc.returncode != 0:
             # formatting needed, or the command otherwise failed
-            print(f"error: {self.name} exited with code {proc.returncode}")
-            # Print the diff in the log so that it is viewable there
-            print(proc.stdout.decode("utf-8"))
+            if args.verbose:
+                print(f"error: {self.name} exited with code {proc.returncode}")
+                # Print the diff in the log so that it is viewable there
+                print(proc.stdout.decode("utf-8"))
             return proc.stdout.decode("utf-8")
         else:
             sys.stdout.write(proc.stdout.decode("utf-8"))
@@ -205,7 +262,39 @@ def format_run(
 
 ALL_FORMATTERS = (DarkerFormatHelper(), ClangFormatHelper())
 
+
+def hook_main():
+    # fill out args
+    args = FormatArgs()
+    args.verbose = False
+
+    # find the changed files
+    cmd = ["git", "diff", "--cached", "--name-only", "--diff-filter=d"]
+    proc = subprocess.run(cmd, capture_output=True)
+    output = proc.stdout.decode("utf-8")
+    for line in output.splitlines():
+        args.changed_files.append(line)
+
+    failed_fmts = []
+    for fmt in ALL_FORMATTERS:
+        if fmt.has_tool():
+            if not fmt.run(args.changed_files, args):
+                failed_fmts.append(fmt.name)
+        else:
+            print(f"Couldn't find {fmt.name}, can't check " + fmt.friendly_name.lower())
+
+    if len(failed_fmts) > 0:
+        sys.exit(1)
+
+    sys.exit(0)
+
+
 if __name__ == "__main__":
+    script_path = os.path.abspath(__file__)
+    if ".git/hooks" in script_path:  # and "GIT_DIR" in os.environ:
+        hook_main()
+        sys.exit(0)
+
     parser = argparse.ArgumentParser()
     parser.add_argument(
         "--token", type=str, required=True, help="GitHub authentiation token"
@@ -232,7 +321,7 @@ def format_run(
         help="Comma separated list of files that has been changed",
     )
 
-    args = parser.parse_args()
+    args = FormatArgs(parser.parse_args())
 
     changed_files = []
     if args.changed_files:

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! I tried it out and have a few comments based on that.

@tru
Copy link
Collaborator Author

tru commented Dec 1, 2023

I will try to address these issues, but I don't think it will happen until next week, just FYI.

@tru
Copy link
Collaborator Author

tru commented Dec 6, 2023

@ldionne any more thoughts on this one?

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@tru
Copy link
Collaborator Author

tru commented Dec 6, 2023

@boomanaiden154 @tstellar do you guys have any additional comments?

@tstellar
Copy link
Collaborator

tstellar commented Dec 6, 2023

This seems like a good idea to me, but I'm not familiar with some of the technical details.

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No major issues on my side, a couple small python nits.

Can't really comment on the full hooks integration as I'm not very familiar with how they work/any nuances/footguns there.

It would be nice to figure out unit/integration testing for this to make changes easier/prevent regressions, but given the interaction with the GH API it would require mocking and some engineering effort that might not be worth it.

@tru tru force-pushed the format_helper_hook branch from a0bb04a to cbee49a Compare December 11, 2023 08:25
@tru
Copy link
Collaborator Author

tru commented Dec 11, 2023

This should be good now - a last look @boomanaiden154 ?

…t hook

As part of llvm#73798 there was some discussion about using the format
helper to run from a git-hook. That was not possible for a number of
reasons, but with these changes it can now be installed as a hook and
then run on the local cache in git instead of a diff between revisions.

This also checks for two environment variables DARKER_FORMAT_PATH and
CLANG_FORMAT_PATH where you can specify the path to the program you want
to use.
@tru tru force-pushed the format_helper_hook branch from cbee49a to e8932fa Compare December 11, 2023 08:27
Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing else from me. Thank you for very much for putting this together and for making it broadly useful across all of LLVM!

@tru tru merged commit bd3e8eb into llvm:main Dec 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants