[lit] Improve test output from lit's internal shell

This patch and D154984 were discussed in <https://discourse.llvm.org/t/rfc-improving-lits-debug-output/72839>. Motivation ---------- D154984 removes the "Script:" section that lit prints along with a test's output, and it makes -v and -a imply -vv. For example, after D154984, the "Script:" section below is never shown, but -v is enough to produce the execution trace following it: ``` Script: -- : 'RUN: at line 1'; echo hello | FileCheck bogus.txt && echo success -- Exit Code: 2 Command Output (stdout): -- $ ":" "RUN: at line 1" $ "echo" "hello" # command output: hello $ "FileCheck" "bogus.txt" # command stderr: Could not open check file 'bogus.txt': No such file or directory error: command failed with exit status: 2 -- ``` In the D154984 review, some reviewers point out that they have been using the "Script:" section for copying and pasting a test's shell commands to a terminal window. The shell commands as printed in the execution trace can be harder to copy and paste for the following reasons: - They drop redirections and break apart RUN lines at `&&`, `|`, etc. - They add `$` at the start of every command, which makes it hard to copy and paste multiple commands in bulk. - Command stdout, stderr, etc. are interleaved with the commands and are not clearly delineated. - They don't always use proper shell quoting. Instead, they blindly enclose all command-line arguments in double quotes. Changes ------- D154984 plus this patch converts the above example into: ``` Exit Code: 2 Command Output (stdout): -- # RUN: at line 1 echo hello | FileCheck bogus-file.txt && echo success # executed command: echo hello # .---command stdout------------ # | hello # `----------------------------- # executed command: FileCheck bogus-file.txt # .---command stderr------------ # | Could not open check file 'bogus-file.txt': No such file or directory # `----------------------------- # error: command failed with exit status: 2 -- ``` Thus, this patch addresses the above issues as follows: - The entire execution trace can be copied and pasted in bulk to a terminal for correct execution of the RUN lines, which are printed intact as they appeared in the original RUN lines except lit substitutions are expanded. Everything else in the execution trace appears in shell comments so it has no effect in a terminal. - Each of the RUN line's commands is repeated (in shell comments) as it executes to show (1) that the command actually executed (e.g., `echo success` above didn't) and (2) what stdout, stderr, non-zero exit status, and output files are associated with the command, if any. Shell quoting in the command is now correct and minimal but is not necessarily the original shell quoting from the RUN line. - The start and end of the contents of stdout, stderr, or an output file is now delineated clearly in the trace. To help produce some of the above output, this patch extends lit's internal shell with a built-in `@echo` command. It's like `echo` except lit suppresses the normal execution trace for `@echo` and just prints its stdout directly. For now, `@echo` isn't documented for use in lit tests. Without this patch, libcxx's custom lit test format tries to parse the stdout from `lit.TestRunner.executeScriptInternal` (which runs lit's internal shell) to extract the stdout and stderr produced by shell commands, and that parse no longer works after the above changes. This patch makes a small adjustment to `lit.TestRunner.executeScriptInternal` so libcxx can just request stdout and stderr without an execution trace. (As a minor drive-by fix that came up in testing: lit's internal `not` command now always produces a numeric exit status and never `True`.) Caveat ------ This patch only makes the above changes for lit's internal shell. In most cases, we do not know how to force external shells (e.g., bash, sh, window's `cmd`) to produce execution traces in the manner we want. To configure a test suite to use lit's internal shell (which is usually better for test portability than external shells anyway), add this to the test suite's `lit.cfg` or other configuration file: ``` config.test_format = lit.formats.ShTest(execute_external=False) ``` Reviewed By: MaskRay, awarzynski Differential Revision: https://reviews.llvm.org/D156954
llvm · Aug 29, 2023 · c981c53 · c981c53
1 parent 09b6e45
commit c981c53
Show file tree

Hide file tree

Showing 33 changed files with 1,015 additions and 812 deletions.
diff --git a/libcxx/utils/libcxx/test/dsl.py b/libcxx/utils/libcxx/test/dsl.py
@@ -180,7 +180,7 @@ def programOutput(config, program, args=None):
                 "Failed to run program, cmd:\n{}\nstderr is:\n{}".format(runcmd, err)
             )
 
-        return libcxx.test.format._parseLitOutput(out)
+        return out
 
 
 @_memoizeExpensiveOperation(

diff --git a/libcxx/utils/libcxx/test/format.py b/libcxx/utils/libcxx/test/format.py
@@ -35,39 +35,6 @@ def _checkBaseSubstitutions(substitutions):
     for s in ["%{cxx}", "%{compile_flags}", "%{link_flags}", "%{flags}", "%{exec}"]:
         assert s in substitutions, "Required substitution {} was not provided".format(s)
 
-def _parseLitOutput(fullOutput):
-    """
-    Parse output of a Lit ShTest to extract the actual output of the contained commands.
-
-    This takes output of the form
-
-        $ ":" "RUN: at line 11"
-        $ "echo" "OUTPUT1"
-        # command output:
-        OUTPUT1
-
-        $ ":" "RUN: at line 12"
-        $ "echo" "OUTPUT2"
-        # command output:
-        OUTPUT2
-
-    and returns a string containing
-
-        OUTPUT1
-        OUTPUT2
-
-    as-if the commands had been run directly. This is a workaround for the fact
-    that Lit doesn't let us execute ShTest and retrieve the raw output without
-    injecting additional Lit output around it.
-    """
-    parsed = ''
-    for output in re.split('[$]\s*":"\s*"RUN: at line \d+"', fullOutput):
-        if output: # skip blank lines
-            commandOutput = re.search("# command output:\n(.+)\n$", output, flags=re.DOTALL)
-            if commandOutput:
-                parsed += commandOutput.group(1)
-    return parsed
-
 def _executeScriptInternal(test, litConfig, commands):
     """
     Returns (stdout, stderr, exitCode, timeoutInfo, parsedCommands)
@@ -79,21 +46,12 @@ def _executeScriptInternal(test, litConfig, commands):
     _, tmpBase = _getTempPaths(test)
     execDir = os.path.dirname(test.getExecPath())
     res = lit.TestRunner.executeScriptInternal(
-        test, litConfig, tmpBase, parsedCommands, execDir
+        test, litConfig, tmpBase, parsedCommands, execDir, debug=False
     )
     if isinstance(res, lit.Test.Result):  # Handle failure to parse the Lit test
         res = ("", res.output, 127, None)
     (out, err, exitCode, timeoutInfo) = res
 
-    # TODO: As a temporary workaround until https://reviews.llvm.org/D81892 lands, manually
-    #       split any stderr output that is included in stdout. It shouldn't be there, but
-    #       the Lit internal shell conflates stderr and stdout.
-    conflatedErrorOutput = re.search("(# command stderr:.+$)", out, flags=re.DOTALL)
-    if conflatedErrorOutput:
-        conflatedErrorOutput = conflatedErrorOutput.group(0)
-        out = out[: -len(conflatedErrorOutput)]
-        err += conflatedErrorOutput
-
     return (out, err, exitCode, timeoutInfo, parsedCommands)
 
 
@@ -400,8 +358,7 @@ def _generateGenTest(self, testSuite, pathInSuite, litConfig, localConfig):
             raise RuntimeError(f"Error while trying to generate gen test\nstdout:\n{out}\n\nstderr:\n{err}")
 
         # Split the generated output into multiple files and generate one test for each file
-        parsed = _parseLitOutput(out)
-        for (subfile, content) in self._splitFile(parsed):
+        for (subfile, content) in self._splitFile(out):
             generatedFile = testSuite.getExecPath(pathInSuite + (subfile, ))
             os.makedirs(os.path.dirname(generatedFile), exist_ok=True)
             with open(generatedFile, 'w') as f:

diff --git a/llvm/utils/lit/lit/TestRunner.py b/llvm/utils/lit/lit/TestRunner.py
@@ -8,6 +8,7 @@
 import stat
 import pathlib
 import platform
+import shlex
 import shutil
 import tempfile
 import threading
@@ -57,6 +58,14 @@ def __init__(self, command, message):
 kPdbgRegex = "%dbg\\(([^)'\"]*)\\)(.*)"
 
 
+def buildPdbgCommand(msg, cmd):
+    res = f"%dbg({msg}) {cmd}"
+    assert re.match(
+        kPdbgRegex, res
+    ), f"kPdbgRegex expected to match actual %dbg usage: {res}"
+    return res
+
+
 class ShellEnvironment(object):
 
     """Mutable shell environment containing things like CWD and env vars.
@@ -340,12 +349,12 @@ def executeBuiltinExport(cmd, shenv):
 
 
 def executeBuiltinEcho(cmd, shenv):
-    """Interpret a redirected echo command"""
+    """Interpret a redirected echo or @echo command"""
     opened_files = []
     stdin, stdout, stderr = processRedirects(cmd, subprocess.PIPE, shenv, opened_files)
     if stdin != subprocess.PIPE or stderr != subprocess.PIPE:
         raise InternalShellError(
-            cmd, "stdin and stderr redirects not supported for echo"
+            cmd, f"stdin and stderr redirects not supported for {cmd.args[0]}"
         )
 
     # Some tests have un-redirected echo commands to help debug test failures.
@@ -692,6 +701,7 @@ def _executeShCmd(cmd, shenv, results, timeoutHelper):
         "cd": executeBuiltinCd,
         "export": executeBuiltinExport,
         "echo": executeBuiltinEcho,
+        "@echo": executeBuiltinEcho,
         "mkdir": executeBuiltinMkdir,
         "popd": executeBuiltinPopd,
         "pushd": executeBuiltinPushd,
@@ -919,7 +929,7 @@ def _executeShCmd(cmd, shenv, results, timeoutHelper):
         if res == -signal.SIGINT:
             raise KeyboardInterrupt
         if proc_not_counts[i] % 2:
-            res = not res
+            res = 1 if res == 0 else 0
         elif proc_not_counts[i] > 1:
             res = 1 if res != 0 else 0
 
@@ -982,19 +992,58 @@ def _executeShCmd(cmd, shenv, results, timeoutHelper):
     return exitCode
 
 
-def executeScriptInternal(test, litConfig, tmpBase, commands, cwd):
+def formatOutput(title, data, limit=None):
+    if not data.strip():
+        return ""
+    if not limit is None and len(data) > limit:
+        data = data[:limit] + "\n...\n"
+        msg = "data was truncated"
+    else:
+        msg = ""
+    ndashes = 30
+    out =  f"# .---{title}{'-' * (ndashes - 4 - len(title))}\n"
+    out += f"# | " + "\n# | ".join(data.splitlines()) + "\n"
+    out += f"# `---{msg}{'-' * (ndashes - 4 - len(msg))}\n"
+    return out
+
+# Normally returns out, err, exitCode, timeoutInfo.
+#
+# If debug is True (the normal lit behavior), err is empty, and out contains an
+# execution trace, including stdout and stderr shown per command executed.
+#
+# If debug is False (set by some custom lit test formats that call this
+# function), out contains only stdout from the script, err contains only stderr
+# from the script, and there is no execution trace.
+def executeScriptInternal(test, litConfig, tmpBase, commands, cwd,
+                          debug=True):
     cmds = []
     for i, ln in enumerate(commands):
+        # Within lit, we try to always add '%dbg(...)' to command lines in order
+        # to maximize debuggability.  However, custom lit test formats might not
+        # always add it, so add a generic debug message in that case.
         match = re.match(kPdbgRegex, ln)
         if match:
+            dbg = match.group(1)
             command = match.group(2)
-            ln = commands[i] = match.expand(": '\\1'; \\2" if command else ": '\\1'")
+        else:
+            dbg = "command line"
+            command = ln
+        if debug:
+            ln = f"@echo '# {dbg}' "
+            if command:
+                ln += f"&& @echo {shlex.quote(command.lstrip())} && {command}"
+            else:
+                ln += "has no command after substitutions"
+        else:
+            ln = command
         try:
             cmds.append(
                 ShUtil.ShParser(ln, litConfig.isWindows, test.config.pipefail).parse()
             )
         except:
-            return lit.Test.Result(Test.FAIL, "shell parser error on: %r" % ln)
+            return lit.Test.Result(
+                Test.FAIL, f"shell parser error on {dbg}: {command.lstrip()}\n"
+            )
 
     cmd = cmds[0]
     for c in cmds[1:]:
@@ -1014,8 +1063,42 @@ def executeScriptInternal(test, litConfig, tmpBase, commands, cwd):
 
     out = err = ""
     for i, result in enumerate(results):
-        # Write the command line run.
-        out += "$ %s\n" % (" ".join('"%s"' % s for s in result.command.args),)
+        if not debug:
+            out += result.stdout
+            err += result.stderr
+            continue
+
+        # The purpose of an "@echo" command is merely to add a debugging message
+        # directly to lit's output.  It is used internally by lit's internal
+        # shell and is not currently documented for use in lit tests.  However,
+        # if someone misuses it (e.g., both "echo" and "@echo" complain about
+        # stdin redirection), produce the normal execution trace to facilitate
+        # debugging.
+        if (
+            result.command.args[0] == "@echo"
+            and result.exitCode == 0
+            and not result.stderr
+            and not result.outputFiles
+            and not result.timeoutReached
+        ):
+            out += result.stdout
+            continue
+
+        # Write the command line that was run.  Properly quote it.  Leading
+        # "!" commands should not be quoted as that would indicate they are not
+        # the builtins.
+        out += "# executed command: "
+        nLeadingBangs = next(
+            (i for i, cmd in enumerate(result.command.args) if cmd != "!"),
+            len(result.command.args),
+        )
+        out += "! " * nLeadingBangs
+        out += " ".join(
+            shlex.quote(str(s))
+            for i, s in enumerate(result.command.args)
+            if i >= nLeadingBangs
+        )
+        out += "\n"
 
         # If nothing interesting happened, move on.
         if (
@@ -1030,22 +1113,16 @@ def executeScriptInternal(test, litConfig, tmpBase, commands, cwd):
 
         # Add the command output, if redirected.
         for (name, path, data) in result.outputFiles:
-            if data.strip():
-                out += "# redirected output from %r:\n" % (name,)
-                data = to_string(data.decode("utf-8", errors="replace"))
-                if len(data) > 1024:
-                    out += data[:1024] + "\n...\n"
-                    out += "note: data was truncated\n"
-                else:
-                    out += data
-                out += "\n"
-
+            data = to_string(data.decode("utf-8", errors="replace"))
+            out += formatOutput(
+                f"redirected output from '{name}'", data, limit=1024
+            )
         if result.stdout.strip():
-            out += "# command output:\n%s\n" % (result.stdout,)
+            out += formatOutput("command stdout", result.stdout)
         if result.stderr.strip():
-            out += "# command stderr:\n%s\n" % (result.stderr,)
+            out += formatOutput("command stderr", result.stderr)
         if not result.stdout.strip() and not result.stderr.strip():
-            out += "note: command had no output on stdout or stderr\n"
+            out += "# note: command had no output on stdout or stderr\n"
 
         # Show the error conditions:
         if result.exitCode != 0:
@@ -1055,9 +1132,9 @@ def executeScriptInternal(test, litConfig, tmpBase, commands, cwd):
                 codeStr = hex(int(result.exitCode & 0xFFFFFFFF)).rstrip("L")
             else:
                 codeStr = str(result.exitCode)
-            out += "error: command failed with exit status: %s\n" % (codeStr,)
+            out += "# error: command failed with exit status: %s\n" % (codeStr,)
         if litConfig.maxIndividualTestTime > 0 and result.timeoutReached:
-            out += "error: command reached timeout: %s\n" % (
+            out += "# error: command reached timeout: %s\n" % (
                 str(result.timeoutReached),
             )
 
@@ -1833,13 +1910,7 @@ def _handleCommand(cls, line_number, line, output, keyword):
         if not output or not output[-1].add_continuation(line_number, keyword, line):
             if output is None:
                 output = []
-            pdbg = "%dbg({keyword} at line {line_number})".format(
-                keyword=keyword, line_number=line_number
-            )
-            assert re.match(
-                kPdbgRegex + "$", pdbg
-            ), "kPdbgRegex expected to match actual %dbg usage"
-            line = "{pdbg} {real_command}".format(pdbg=pdbg, real_command=line)
+            line = buildPdbgCommand(f"{keyword} at line {line_number}", line)
             output.append(CommandDirective(line_number, line_number, keyword, line))
         return output
 
@@ -2104,6 +2175,8 @@ def executeShTest(
         return lit.Test.Result(Test.UNSUPPORTED, "Test is unsupported")
 
     script = list(preamble_commands)
+    script = [buildPdbgCommand(f"preamble command line", ln) for ln in script]
+
     parsed = parseIntegratedTestScript(test, require_script=not script)
     if isinstance(parsed, lit.Test.Result):
         return parsed

diff --git a/llvm/utils/lit/tests/Inputs/lit-opts/lit.cfg b/llvm/utils/lit/tests/Inputs/lit-opts/lit.cfg
@@ -5,4 +5,4 @@ config.suffixes = [".txt"]
 config.test_format = lit.formats.ShTest()
 config.test_source_root = None
 config.test_exec_root = None
-config.substitutions.append(("%var", lit_config.params.get("var", "")))
+config.substitutions.append(("%var", lit_config.params.get("var", "default")))
diff --git a/llvm/utils/lit/tests/Inputs/shtest-define/examples/param-subst.txt b/llvm/utils/lit/tests/Inputs/shtest-define/examples/param-subst.txt
@@ -14,21 +14,21 @@
 ; REDEFINE: %{cflags} = -triple x86_64-apple-darwin10.6.0 -fopenmp-simd
 ; REDEFINE: %{fcflags} = -check-prefix=SIMD
 ; RUN: %{check}
-; CHECK: %clang_cc1 -verify -fopenmp -fopenmp-version=51 -triple x86_64-apple-darwin10.6.0 -fopenmp-simd -emit-llvm -o - {{.*}} | FileCheck -check-prefix=SIMD {{.*}}
+; CHECK:# | %clang_cc1 -verify -fopenmp -fopenmp-version=51 -triple x86_64-apple-darwin10.6.0 -fopenmp-simd -emit-llvm -o - {{.*}} | FileCheck -check-prefix=SIMD {{.*}}
 
 ; REDEFINE: %{cflags} = -triple x86_64-unknown-linux-gnu -fopenmp-simd
 ; REDEFINE: %{fcflags} = -check-prefix=SIMD
 ; RUN: %{check}
-; CHECK: %clang_cc1 -verify -fopenmp -fopenmp-version=51 -triple x86_64-unknown-linux-gnu -fopenmp-simd -emit-llvm -o - {{.*}} | FileCheck -check-prefix=SIMD {{.*}}
+; CHECK:# | %clang_cc1 -verify -fopenmp -fopenmp-version=51 -triple x86_64-unknown-linux-gnu -fopenmp-simd -emit-llvm -o - {{.*}} | FileCheck -check-prefix=SIMD {{.*}}
 
 ; REDEFINE: %{cflags} = -triple x86_64-apple-darwin10.6.0
 ; REDEFINE: %{fcflags} = -check-prefix=NO-SIMD
 ; RUN: %{check}
-; CHECK: %clang_cc1 -verify -fopenmp -fopenmp-version=51 -triple x86_64-apple-darwin10.6.0 -emit-llvm -o - {{.*}} | FileCheck -check-prefix=NO-SIMD {{.*}}
+; CHECK:# | %clang_cc1 -verify -fopenmp -fopenmp-version=51 -triple x86_64-apple-darwin10.6.0 -emit-llvm -o - {{.*}} | FileCheck -check-prefix=NO-SIMD {{.*}}
 
 ; REDEFINE: %{cflags} = -triple x86_64-unknown-linux-gnu
 ; REDEFINE: %{fcflags} = -check-prefix=NO-SIMD
 ; RUN: %{check}
-; CHECK: %clang_cc1 -verify -fopenmp -fopenmp-version=51 -triple x86_64-unknown-linux-gnu -emit-llvm -o - {{.*}} | FileCheck -check-prefix=NO-SIMD {{.*}}
+; CHECK:# | %clang_cc1 -verify -fopenmp -fopenmp-version=51 -triple x86_64-unknown-linux-gnu -emit-llvm -o - {{.*}} | FileCheck -check-prefix=NO-SIMD {{.*}}
 
 ; CHECK: Passed: 1
diff --git a/llvm/utils/lit/tests/Inputs/shtest-define/expansion-order.txt b/llvm/utils/lit/tests/Inputs/shtest-define/expansion-order.txt
@@ -7,7 +7,7 @@
 #
 # REDEFINE: %{global:greeting}=Hello
 # RUN: %{global:echo}
-# CHECK: GLOBAL: Hello World
+# CHECK:# | GLOBAL: Hello World
 
 # We can redefine the test suite config's substitutions multiple times.  Again,
 # the expansion order remains the same (%{global:echo} before %{global:greeting}
@@ -17,7 +17,7 @@
 # REDEFINE: %{global:greeting}=Goodbye %{global:what}
 # REDEFINE: %{global:what}=Sleep
 # RUN: %{global:echo}
-# CHECK: GLOBAL: Goodbye Sleep Sleep
+# CHECK:# | GLOBAL: Goodbye Sleep Sleep
 
 # A new local substitution is prepended to the substitution list so that it can
 # depend on all substitutions that were defined previously, including those from
@@ -26,7 +26,7 @@
 # DEFINE: %{local:greeting}=Hey %{global:what}
 # DEFINE: %{local:echo}=echo "LOCAL: %{local:greeting} %{global:what}"
 # RUN: %{local:echo}
-# CHECK: LOCAL: Hey Sleep Sleep
+# CHECK:# | LOCAL: Hey Sleep Sleep
 
 # As for substitutions from the test suite config, redefining local
 # substitutions should not change the expansion order.  Again, the expansion
@@ -36,6 +36,6 @@
 # REDEFINE: %{local:greeting}=So Long %{global:what}
 # REDEFINE: %{global:what}=World
 # RUN: %{local:echo}
-# CHECK: LOCAL: So Long World World
+# CHECK:# | LOCAL: So Long World World
 
 # CHECK: Passed: 1