[lit] Fix some issues from --per-test-coverage #65242

jdenny-ornl · 2023-09-04T03:41:14Z

Key issue

When lit is configured to use windows cmd as an external shell, it appears that all RUN lines are effectively commented out. The problematic change landed on July 26, 2023, so it's surprising it hasn't come up yet. I'm afraid I don't have a windows setup, so I could use some help to verify what's happening.

Details

In the case that lit is configured to use an external shell, D154280 (landed in 64d1954 on July 26, 2023) causes %dbg(RUN: at line N) to be expanded in RUN lines early and in a manner that is specific to sh-like shells. As a result, later code in lit that expands it in a shell-specific manner is useless.

That sh-like expansion uses the : command as follows:

: 'RUN: at line N'; original-commands

In sh-like shells, the : command simply discards its arguments. However, in windows cmd, : indicates a goto label. That appears to effectively comment out the rest of the line, including the original commands from the RUN line.

I am not aware of any complaints about this change. Did I miss them? Are all tests still passing and so no one noticed? Lit's own test suite has some tests that normally fail if RUN lines don't execute. Is no one running lit's test suite with windows cmd as a lit external shell? Or is no one using windows cmd as a lit external shell at all anymore?

Another issue

D154280 doesn't implement --per-test-coverage for lit's internal shell.

Fix

This patch fixes the above problems by implementing --per-test-coverage before selecting the shell (internal or external) and by leaving %dbg(RUN: at line N) unexpanded. Thus, it is expanded later in a shell-specific manner, as before D154280.

I would like to understand whether windows cmd as a lit external shell is worthwhile to support anymore.

jh7370 · 2023-09-04T07:06:35Z

What is %dbg actually used for? I couldn't find any references to it in the main LLVM tests, nor is there any reference to it in the Testing Guide docs, so I suspect I'm missing something crucial to understanding this situation...

I would like to understand whether windows cmd as a lit external shell is worthwhile to support anymore.

We probably should, purely on the basis that people who develop on Windows might be using this setup. That being said, I'm fairly confident that Windows users by default use the internal shell. I have not verified that though.

mstorsjo · 2023-09-04T07:47:08Z

I would like to understand whether windows cmd as a lit external shell is worthwhile to support anymore.

We probably should, purely on the basis that people who develop on Windows might be using this setup. That being said, I'm fairly confident that Windows users by default use the internal shell. I have not verified that though.

FWIW, I'm not a regular Windows user, I only run it in VMs and similar for testing things, but whenever I do and want to test things in a native Windows environment (as opposed to running in msys2 bash or similar), I do it with cmd. I've never really gotten familiar with powershell.

And whenever tools like python etc, programmatically execute some command string in a way that it is parsed by a shell, that shell is cmd, not powershell afaik.

mstorsjo · 2023-09-04T10:44:58Z

I haven't been so much involved in the recent Lit changes, and it's been a while since I poked at these things, so it took me a little bit of fiddling around to get back on track about what the actual status quo is here. I'll try to summarize my findings, please doublecheck if I've missed something:

Status quo

ShTest defaults to execute_external=False (i.e., internal shell) if no parameter has been passed: https://github.com/llvm/llvm-project/blob/llvmorg-18-init/llvm/utils/lit/lit/formats/shtest.py#L22
The LLVM testsuite initializes this based on llvm_config.use_lit_shell - https://github.com/llvm/llvm-project/blob/llvmorg-18-init/llvm/test/lit.cfg.py#L21. The vast majority of testsuites in LLVM also copies this behaviour.
The LLVM testsuite defaults to self.use_lit_shell = False, except that on Windows it defaults to self.use_lit_shell = True. This can be overridden with the environment variable LIT_USE_INTERNAL_SHELL. https://github.com/llvm/llvm-project/blob/llvmorg-18-init/llvm/utils/lit/lit/llvm/config.py#L23-L52 I.e. if people don't explicitly opt in to it by setting LIT_USE_INTERNAL_SHELL=0, people will be using the internal shell on Windows
The OpenMP testsuite initializes ShTest with the defaults, i.e. always using the internal shell: https://github.com/llvm/llvm-project/blob/llvmorg-18-init/openmp/runtime/test/lit.cfg#L42
compiler-rt's tests have custom setup logic that doesn't reuse the same setup as most other tools with llvm_config.use_lit_shell, but has essentially equivalent logic here: https://github.com/llvm/llvm-project/blob/llvmorg-18-init/compiler-rt/test/lit.common.cfg.py#L113-L128
The libcxx testsuite (which is reused for libcxxabi and libunwind) defaults to the internal shell, without any option for overriding it: https://github.com/llvm/llvm-project/blob/llvmorg-18-init/libcxx/utils/libcxx/test/format.py#L384-L387
If using an external shell for running tests, this will be bash, if bash was found in the path - if not, it will be the win32 cmd.exe: https://github.com/llvm/llvm-project/blob/llvmorg-18-init/llvm/utils/lit/lit/TestRunner.py#L1068-L1069
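
The selection logic summarized in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not lit's actual code: the helper names `use_internal_shell` and `pick_external_shell_on_windows` are hypothetical, and the way the environment variable's value is parsed is simplified.

```python
def use_internal_shell(platform, env):
    """Decide whether the internal shell is used (hypothetical helper).

    `platform` is a sys.platform-style string; `env` is a dict-like
    environment such as os.environ.
    """
    # Default described in the thread: internal shell on Windows only.
    default = platform == "win32"
    value = env.get("LIT_USE_INTERNAL_SHELL")
    if not value:
        return default
    # Assumption: these spellings disable it; anything else enables it.
    return value.strip().lower() not in ("0", "false", "no", "off")


def pick_external_shell_on_windows(bash_path):
    """Pick the external shell on Windows: bash if found in PATH, else cmd.exe.

    `bash_path` would typically come from shutil.which("bash").
    """
    return bash_path if bash_path else "cmd.exe"
```

Under this sketch, a Windows user reaches cmd.exe only by setting `LIT_USE_INTERNAL_SHELL=0` and not having bash in the path, which matches the "jumping through hoops" observation.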

As additional context: The libcxx testsuite used to use the external shell a few years ago, but this was problematic for running the tests on Windows, it didn't really work right there (I don't have exact details). This was changed in 39bbfb7 - before that, running the libcxx tests required making sure that bash was available in path.

I.e., unless jumping through hoops, none of these testsuites really run things with the win32 cmd.exe.

Breakage

Running things with win32 cmd.exe as external shell did use to work, but currently doesn't. This actually seems to have broken in 1041a96 already. If running a simple llvm shell test with LIT_USE_INTERNAL_SHELL=0, before that, I got a successful run:

Script:
--
echo 'RUN: at line 3' > nul &&   c:\code\llvm-project\llvm\build\bin\llvm-mc.exe -triple aarch64-pc-win32 -filetype=obj C:/code/llvm-project/llvm/test/MC/AArch64/seh.s | c:\code\llvm-project\llvm\build\bin\llvm-readobj.exe -S -r -u - | c:\code\llvm-project\llvm\build\bin\filecheck.exe C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
echo 'RUN: at line 8' > nul &&   c:\code\llvm-project\llvm\build\bin\llvm-mc.exe -triple aarch64-pc-win32 -filetype=asm C:/code/llvm-project/llvm/test/MC/AArch64/seh.s | c:\code\llvm-project\llvm\build\bin\llvm-mc.exe -triple aarch64-pc-win32 -filetype=obj - | c:\code\llvm-project\llvm\build\bin\llvm-readobj.exe -S -r -u - | c:\code\llvm-project\llvm\build\bin\filecheck.exe C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
--
Exit Code: 0

After that commit, it instead breaks like this:

Script:
--
echo 'RUN: at line 3' > nul &&
echo 'RUN: at line 8' > nul &&
--
Exit Code: 255

Command Output (stderr):
--
The syntax of the command is incorrect.

It further breaks at 64d1954 as you've noticed, since the RUN lines essentially just become no-op goto labels.

Script:
--
: 'RUN: at line 3';   c:\code\llvm-project\llvm\build\bin\llvm-mc.exe -triple aarch64-pc-win32 -filetype=obj C:/code/llvm-project/llvm/test/MC/AArch64/seh.s | c:\code\llvm-project\llvm\build\bin\llvm-readobj.exe -S -r -u - | c:\code\llvm-project\llvm\build\bin\filecheck.exe C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
: 'RUN: at line 8';   c:\code\llvm-project\llvm\build\bin\llvm-mc.exe -triple aarch64-pc-win32 -filetype=asm C:/code/llvm-project/llvm/test/MC/AArch64/seh.s | c:\code\llvm-project\llvm\build\bin\llvm-mc.exe -triple aarch64-pc-win32 -filetype=obj - | c:\code\llvm-project\llvm\build\bin\llvm-readobj.exe -S -r -u - | c:\code\llvm-project\llvm\build\bin\filecheck.exe C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
--
Exit Code: 0

(It prints this even if injecting an error in the test.)

Furthermore, 09b6e45 changes this case further, making it just print this:

PASS: LLVM :: MC/AArch64/seh.s (1 of 1)
Exit Code: 0

(Even when executing with -a -v -v.)

Separate side note; running the LLVM testsuite with bash as external executor (if it happens to be available in path) also seems to fail:

Exit Code: 127

Command Output (stderr):
--
+ : 'RUN: at line 3'
+ c:codellvm-projectllvmbuildbinllvm-mc.exe -triple aarch64-pc-win32 -filetype=obj C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
C:\code\\llvm-project\llvm\build\test\MC\AArch64\Output\seh.s.script: line 1: c:codellvm-projectllvmbuildbinllvm-mc.exe: command not found
+ c:codellvm-projectllvmbuildbinllvm-readobj.exe -S -r -u -
C:\code\\llvm-project\llvm\build\test\MC\AArch64\Output\seh.s.script: line 1: c:codellvm-projectllvmbuildbinllvm-readobj.exe: command not found
+ c:codellvm-projectllvmbuildbinfilecheck.exe C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
C:\code\\llvm-project\llvm\build\test\MC\AArch64\Output\seh.s.script: line 1: c:codellvm-projectllvmbuildbinfilecheck.exe: command not found

--

(This seems to have been broken throughout this time period at least. However before 39bbfb7, when libcxx tests were using the external executor, they did work when executed with bash as external executor.)

Conclusion

It certainly looks like this configuration is severely broken. I'm not aware of any parts of the configurations that use it by default, and even then, it's generally problematic, so I'm not sure if it's worth spending effort on fixing. (The suggestion that was made somewhere around these discussions, to move most things towards using the internal shell even on unix, sounds to me like a good direction in general.)

That said; for echoing executed commands, when using the internal executor, it's probably very valuable if they're printed in a form that is easy to copypaste and execute in cmd.exe (where possible - at least for the majority of simple cases).

mstorsjo · 2023-09-04T10:49:32Z

FWIW, this PR doesn't entirely fix running the tests with win32 cmd.exe either; with this PR, it fails like this:

Exit Code: 255

Command Output (stdout):
--

C:\code\llvm-project\llvm\build\test\MC\AArch64>echo 'RUN: at line 3' > nul &&

--
Command Output (stderr):
--
The syntax of the command is incorrect.

jdenny-ornl · 2023-09-04T13:46:08Z

What is %dbg actually used for? I couldn't find any references to it in the main LLVM tests, nor is there any reference to it in the Testing Guide docs, so I suspect I'm missing something crucial to understanding this situation...

It's something like an internal substitution. When parsing a RUN line, lit inserts it before the RUN line's commands like this:

%dbg(RUN: at line N) commands

Later, lit expands it based on the shell (internal, external sh-like shell, external windows cmd, etc.). For example, for sh-like shells:

: 'RUN: at line N'; commands
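
As a rough illustration of the two steps just described (insert the marker while parsing, expand it later per shell), here is a minimal sketch. It is not lit's implementation: `PDBG_RE` is a simplified stand-in for `kPdbgRegex`, and `insert_pdbg` and `expand_pdbg` are hypothetical names. The `cmd` form is modeled on the `echo 'RUN: at line 3' > nul &&` output quoted earlier in this thread.

```python
import re

# Assumption: simplified stand-in for lit's kPdbgRegex, not the real definition.
PDBG_RE = re.compile(r"%dbg\(([^)'\"]*)\)(.*)")


def insert_pdbg(line_number, commands):
    """Prefix a RUN line's commands with the unexpanded %dbg marker."""
    return f"%dbg(RUN: at line {line_number}) {commands}"


def expand_pdbg(line, shell):
    """Expand the %dbg marker in a form suited to the given shell."""
    m = PDBG_RE.fullmatch(line)
    if m is None:
        return line  # No marker (e.g. a preamble command): leave untouched.
    msg, rest = m.group(1), m.group(2).lstrip()
    if shell == "sh":
        # In sh-like shells, ':' is a no-op that discards its arguments.
        return f": '{msg}'; {rest}"
    if shell == "cmd":
        # In cmd, ':' would start a label, so an echo to nul is used instead.
        return f"echo '{msg}' > nul && {rest}"
    raise ValueError(f"unknown shell: {shell}")
```

Expanding early with the `sh` form, regardless of the shell chosen later, is the situation this PR describes: nothing is left for the `cmd` branch to expand.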

jdenny-ornl · 2023-09-04T14:15:16Z

I haven't been so much involved in the recent Lit changes, and it's been a while since I poked at these things, so it took me a little bit of fiddling around to get back on track about what the actual status quo is here. I'll try to summarize my findings, please doublecheck if I've missed something:

Thanks for the nice summary. It looks right to me.

Running things with win32 cmd.exe as external shell did use to work, but currently doesn't. This actually seems to have broken in 1041a96 already.

You're right, and that landed in April, 2022, so windows cmd support has been broken a while. That part is the reason why the current PR wasn't sufficient to restore correct behavior, as you point out in your later comment. It looks straightforward to fix, but maybe doing so isn't worthwhile.

Separate side note; running the LLVM testsuite with bash as external executor (if it happens to be available in path) also seems to fail:
Exit Code: 127

Command Output (stderr):
--
+ : 'RUN: at line 3'
+ c:codellvm-projectllvmbuildbinllvm-mc.exe -triple aarch64-pc-win32 -filetype=obj C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
C:\code\\llvm-project\llvm\build\test\MC\AArch64\Output\seh.s.script: line 1: c:codellvm-projectllvmbuildbinllvm-mc.exe: command not found
+ c:codellvm-projectllvmbuildbinllvm-readobj.exe -S -r -u -
C:\code\\llvm-project\llvm\build\test\MC\AArch64\Output\seh.s.script: line 1: c:codellvm-projectllvmbuildbinllvm-readobj.exe: command not found
+ c:codellvm-projectllvmbuildbinfilecheck.exe C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
C:\code\\llvm-project\llvm\build\test\MC\AArch64\Output\seh.s.script: line 1: c:codellvm-projectllvmbuildbinfilecheck.exe: command not found

--
(This seems to have been broken throughout this time period at least. However before 39bbfb7, when libcxx tests were using the external executor, they did work when executed with bash as external executor.)

It looks like the llvm-mc substitution isn't quoting its value correctly.
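
The effect can be reproduced with Python's `shlex` module, which approximates POSIX shell tokenization (it is not bash itself, so treat this as an illustration only). The path below is taken from the trace above; whether `shlex.quote` is the right fix for lit's substitutions is an open question that this sketch does not answer.

```python
import shlex

tool = r"c:\code\llvm-project\llvm\build\bin\llvm-mc.exe"

# Unquoted: each backslash escapes the next character and is then dropped.
unquoted = shlex.split(f"{tool} -triple aarch64-pc-win32")

# Quoted: single quotes keep the backslashes literal.
quoted = shlex.split(f"{shlex.quote(tool)} -triple aarch64-pc-win32")

print(unquoted[0])  # c:codellvm-projectllvmbuildbinllvm-mc.exe
print(quoted[0])    # c:\code\llvm-project\llvm\build\bin\llvm-mc.exe
```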

Conclusion

It certainly looks like this configuration is severely broken. I'm not aware of any parts of the configurations that use it by default, and even then, it's generally problematic, so I'm not sure if it's worth spending effort on fixing. (The suggestion that was made somewhere around these discussions, to move most things towards using the internal shell even on unix, sounds to me like a good direction in general.)

Agreed. What if we raise a python exception on the windows cmd code path in lit for now? If we still see no complaints for a few weeks, we can think about removing the code path entirely.

That said; for echoing executed commands, when using the internal executor, it's probably very valuable if they're printed in a form that is easy to copypaste and execute in cmd.exe (where possible - at least for the majority of simple cases).

If RUN lines are written for lit's internal shell, I'm not sure if they are generally guaranteed to work in windows cmd. Do you have a specific change in mind here?

mstorsjo · 2023-09-04T17:10:14Z

(This seems to have been broken throughout this time period at least. However before 39bbfb7, when libcxx tests were using the external executor, they did work when executed with bash as external executor.)

It looks like the llvm-mc substitution isn't quoting its value correctly.

Yep, probably - I guess this goes the same for all the llvm tools that have substitutions set up.

Conclusion

It certainly looks like this configuration is severely broken. I'm not aware of any parts of the configurations that use it by default, and even then, it's generally problematic, so I'm not sure if it's worth spending effort on fixing. (The suggestion that was made somewhere around these discussions, to move most things towards using the internal shell even on unix, sounds to me like a good direction in general.)

Agreed. What if we raise a python exception on the windows cmd code path in lit for now? If we still see no complaints for a few weeks, we can think about removing the code path entirely.

I guess that sounds reasonable. Although I'm not sure how often odd downstreams integrate from upstream, but things have been quite broken for quite some time anyway, so I guess it's pretty safe to say that this is dead code.

That said; for echoing executed commands, when using the internal executor, it's probably very valuable if they're printed in a form that is easy to copypaste and execute in cmd.exe (where possible - at least for the majority of simple cases).

If RUN lines are written for lit's internal shell, I'm not sure if they are generally guaranteed to work in windows cmd. Do you have a specific change in mind here?

Not really, no - I'm mostly commenting on it from the point of view that I remember seeing somewhere else in these discussions (regarding formatting of the echo RUN stuff I think). As long as most commands, in the form command1 | command2 work and can be copypasted as such, we're probably just fine. (And I have no reason to believe that wouldn't be the case.)

RoboTux

LGTM

hnrklssn · 2023-09-05T10:59:29Z

What is %dbg actually used for? I couldn't find any references to it in the main LLVM tests, nor is there any reference to it in the Testing Guide docs, so I suspect I'm missing something crucial to understanding this situation...

It's something like an internal substitution. When parsing a RUN line, lit inserts it before the RUN line's commands like this:
%dbg(RUN: at line N) commands
Later, lit expands it based on the shell (internal, external sh-like shell, external windows cmd, etc.). For example, for sh-like shells:
: 'RUN: at line N'; commands

Ah, thanks for the explanation. Then it's used a lot more than I thought.

Before <https://reviews.llvm.org/D154984> and <https://reviews.llvm.org/D156954>, lit reported full RUN lines in a `Script:` section. Now, in the case of lit's internal shell, it's the execution trace that includes them. However, if lit is configured to use an external shell (e.g., bash, windows `cmd`), they aren't reported at all. A fix was requested at the following:

* <https://reviews.llvm.org/D154984#4627605>
* <https://discourse.llvm.org/t/rfc-improving-lits-debug-output/72839/35?u=jdenny-ornl>

This patch does not correctly address the case when the external shell is windows `cmd`. As discussed at <llvm#65242>, it's not clear whether that's a use case that people still care about, and it seems to be generally broken anyway.

jdenny-ornl · 2023-09-11T15:02:00Z

I noticed that the python formatting check failed so I looked into that, the problem is the file llvm/utils/lit/tests/per-test-coverage.py

Thanks for pointing that out. I noticed it too but made the bad assumption that something was wrong with the pre-commit check itself, and I ran out of time to investigate. This is the diagnostic I saw after a traceback:

black.parsing.InvalidInput: Cannot parse: 12:13: t-cfg.py ({{[^)]*}})

I didn't see llvm/utils/lit/tests/per-test-coverage.py mentioned anywhere. Did I miss it?

In contrast, when I run black on the command line on my local system, I see:

$ black llvm/utils/lit/tests/per-test-coverage.py 
error: cannot format llvm/utils/lit/tests/per-test-coverage.py: Cannot parse: 12:13: t-cfg.py ({{[^)]*}})

Oh no! 💥 💔 💥
1 file failed to reformat.

Is there any way to get the pre-commit check to mention the name of the problematic file?

(By the way, I have noticed that many python files were formatted with >80 columns, probably the black default of 88. That doesn't fit in the 80-column editor/terminal windows I use for most of LLVM. Is there any good reason not to follow the LLVM coding standard for source code width in python files? I see no exception there for python, or did I misunderstand something? Black just needs a -l80 command-line option.)

This file is not valid python; neither black nor python can parse it. Since that's the case, can we change the extension on it so that we don't have to create a lot of exceptions in our CI and pre-commit hook?

Lit's tests use the .py suffix. Most just contain comments that contain lit/FileCheck directives. Some contain actual python code called by RUN lines. The above test contains garbage outside of a comment, so the test works fine, but it's invalid as python. I'll push a commit to this PR to fix it. Thanks again for pointing me in the right direction.

tru · 2023-09-11T15:08:54Z

I noticed that the python formatting check failed so I looked into that, the problem is the file llvm/utils/lit/tests/per-test-coverage.py

Thanks for pointing that out. I noticed it too but made the bad assumption that something was wrong with the pre-commit check itself, and I ran out of time to investigate. This is the diagnostic I saw after a traceback:
Is there any way to get the pre-commit check to mention the name of the problematic file?

I'll look into this, it also took me time to figure it out.

(By the way, I have noticed that many python files were formatted with >80 columns, probably the black default of 88. That doesn't fit in the 80-column editor/terminal windows I use for most of LLVM. Is there any good reason not to follow the LLVM coding standard for source code width in python files? I see no exception there for python, or did I misunderstand something? Black just needs a -l80 command-line option.)

This was discussed when we wrote the coding style for python and then we thought it was just better to use black's default to avoid having people need to run black with arguments (since it won't read a config file).

jdenny-ornl · 2023-09-11T15:46:10Z

(By the way, I have noticed that many python files were formatted with >80 columns, probably the black default of 88. That doesn't fit in the 80-column editor/terminal windows I use for most of LLVM. Is there any good reason not to follow the LLVM coding standard for source code width in python files? I see no exception there for python, or did I misunderstand something? Black just needs a -l80 command-line option.)

This was discussed when we wrote the coding style for python and then we thought it was just better to use black's default to avoid having people need to run black with arguments

OK, but the LLVM coding standard appears to contradict that approach, as far as I can tell:

"Write your code to fit within 80 columns." (From https://llvm.org/docs/CodingStandards.html#source-code-width.)
"The Python code within the LLVM repository should adhere to the formatting guidelines outlined in PEP-8." (From https://llvm.org/docs/CodingStandards.html#python-version-and-source-code-formatting).
"Limit all lines to a maximum of 79 characters." (From https://peps.python.org/pep-0008/#maximum-line-length.)

The LLVM coding standard's advice on using black is:

For consistency and to limit churn, code should be automatically formatted with the black utility. Black allows changing the formatting rules based on major version. In order to avoid unnecessary churn in the formatting rules we currently use black version 23.x in LLVM.

Combining all the above, I'm led to believe I should be passing options to black, but it's unclear which ones.

I'm sorry I didn't participate in the original discussion. If it's too late to change things, can we clarify the LLVM coding standard?

(since it won't read a config file).

See https://black.readthedocs.io/en/stable/usage_and_configuration/the_basics.html#configuration-via-a-file.

For example, in llvm/utils/lit/pyproject.toml , I appended the following:

[tool.black]
line-length = 80

And it worked for me.

tru · 2023-09-11T15:52:29Z

(By the way, I have noticed that many python files were formatted with >80 columns, probably the black default of 88. That doesn't fit in the 80-column editor/terminal windows I use for most of LLVM. Is there any good reason not to follow the LLVM coding standard for source code width in python files? I see no exception there for python, or did I misunderstand something? Black just needs a -l80 command-line option.)

This was discussed when we wrote the coding style for python and then we thought it was just better to use black's default to avoid having people need to run black with arguments

OK, but the LLVM coding standard appears to contradict that approach, as far as I can tell:

"Write your code to fit within 80 columns." (From https://llvm.org/docs/CodingStandards.html#source-code-width.)

"The Python code within the LLVM repository should adhere to the formatting guidelines outlined in PEP-8." (From https://llvm.org/docs/CodingStandards.html#python-version-and-source-code-formatting).

"Limit all lines to a maximum of 79 characters." (From https://peps.python.org/pep-0008/#maximum-line-length.)

The LLVM coding standard's advice on using black is:

For consistency and to limit churn, code should be automatically formatted with the black utility. Black allows changing the formatting rules based on major version. In order to avoid unnecessary churn in the formatting rules we currently use black version 23.x in LLVM.

Combining all the above, I'm led to believe I should be passing options to black, but it's unclear which ones.

I'm sorry I didn't participate in the original discussion. If it's too late to change things, can we clarify the LLVM coding standard?

(since it won't read a config file).

See https://black.readthedocs.io/en/stable/usage_and_configuration/the_basics.html#configuration-via-a-file.

For example, in llvm/utils/lit/pyproject.toml , I appended the following:
[tool.black]
line-length = 80
And it worked for me.

I think that's pretty new then. We tried this back during the original discussion without luck. But feel free to either update the docs to reflect the reality or post to discourse suggesting a policy change.

I have no strong feelings either way as long as we don't have to remember to manually change options. A change here would also mean a lot of reformatting which is a bit unnecessary churn.

jdenny-ornl · 2023-09-11T16:01:52Z

But feel free to either update the docs to reflect the reality or post to discourse suggesting a policy change.

I'll plan to raise it in the original RFC before proposing any changes.

Thanks for all your work in this area. It is much appreciated.

PR has changed since the review.

jdenny-ornl · 2023-09-11T20:41:49Z

I'm not trying to rush reviewers, but I want it to be clear that this patch is still in need of review due to recent changes. See #65242 (comment) for a summary. I'm afraid the recent discussion of black might have buried that comment. Maybe I'm just not used to LLVM PR reviews yet.

RoboTux · 2023-09-12T08:26:05Z

I'm not trying to rush reviewers, but I want it to be clear that this patch is still in need of review due to recent changes. See #65242 (comment) for a summary. I'm afraid the recent discussion of black might have buried that comment. Maybe I'm just not used to LLVM PR reviews yet.

I saw it but was waiting on the updated version fixing the python issue before giving it another look. I shall look at it today.

RoboTux · 2023-09-12T09:22:06Z

llvm/utils/lit/lit/TestRunner.py

-                keyword=keyword, line_number=line_number
-            )
-            assert re.match(
-                kPdbgRegex + "$", pdbg


Why is the $ dropped in the buildPdbgCommand?

Good question.

Originally, my goal was for the assert to mimic how kPdbgRegex will actually be used later when expanding %dbg substitutions. That's the usage that the assert is trying to verify will work correctly. Thus, this patch changes the assert in two ways: the searched string contains all of %dbg(...) cmd-line instead of just %dbg(...), and it does not use $.

However, now that you ask, I think the assert should be stricter. For example, D154987 proposes to extend kPdbgRegex to permit newlines, which might appear in lit.run(cmd) in PYTHON directives, as in the example in that review's summary. Using $ in the assert here would have caught that existing deficiency of kPdbgRegex: its .* doesn't match beyond the first newline.

Instead of $, what do you think of using re.fullmatch in the assert? That is, buildPdbgCommand expects kPdbgRegex to match the entire string it's building.

I went ahead and applied that change and wrote a commit log to explain it.
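
To make the difference concrete, here is a small sketch comparing the three checks on a single-line and a multi-line string. `PDBG_RE` is a simplified stand-in for `kPdbgRegex` (an assumption, not the real definition), and the `re.DOTALL` variant is just one way to let `.*` cross newlines; it is not necessarily what D154987 proposes.

```python
import re

# Assumption: simplified stand-in for lit's kPdbgRegex.
PDBG_RE = r"%dbg\(([^)'\"]*)\)(.*)"

single = "%dbg(RUN: at line 1) echo a"
multi = "%dbg(PYTHON: at line 1) echo a\necho b"

# On a single line, all three checks agree.
assert re.match(PDBG_RE, single)
assert re.match(PDBG_RE + "$", single)
assert re.fullmatch(PDBG_RE, single)

# re.match alone accepts the multi-line string but only matches a prefix:
# '.*' stops at the first newline, silently dropping "echo b".
prefix_only = re.match(PDBG_RE, multi)
assert prefix_only.group(2) == " echo a"

# Both stricter forms reject it, exposing the limitation.
assert re.match(PDBG_RE + "$", multi) is None
assert re.fullmatch(PDBG_RE, multi) is None

# With DOTALL, '.*' also matches newlines and the whole string is covered.
whole = re.fullmatch(PDBG_RE, multi, re.DOTALL)
assert whole.group(2) == " echo a\necho b"
```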

LGTM but does it mean you'll just rebase and push to the repo using pure git? If using the github tool AFAIK you can only merge and squash on the LLVM project so the commit message won't show up unless you modify the PR message. I might be wrong though, I haven't interacted much with github.

At least from the web UI you have the chance to edit the final commit message. So you can combine them there if you're ok with it all being one commit.

(otherwise yes manually push some of it)

Thanks for the review!

Yes, I performed most steps from the git command line: squash commits and commit logs together, rebase onto the latest main, and force push to the PR. I then clicked "Squash and merge" and copied and pasted my new commit log there. The commit log it initially offered was the original comment I posted for this PR, but the focus of the PR has shifted since then. It would be nicer if it offered the current commit logs, squashed in the same way a git merge --squash would squash them.

I don't know how to accomplish the rebase onto main via the github web UI... even though it seems like that's almost always going to be required given how frequently new commits show up on main. That means I needed to work from the command line anyway, so it seemed more practical to handle squashing there too.

Please let me know if I've overlooked best practices for using the github web UI.

From https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/configuring-pull-request-merges/about-merge-methods-on-github#squashing-your-merge-commits it seems squash and merge is actually a squash and rebase (it makes it clear it's a fast forward).

The web UI squash-and-merge will automatically rebase to current HEAD.

Thanks for the clarifications. For some reason, I thought it previously rejected one when I wasn't caught up to main, but perhaps I had conflicts then. I'll try again next time.

Yep, it worked fine in another PR that had no conflicts.

RoboTux

LGTM

D154280 (landed in 64d1954 in July, 2023) implements `--per-test-coverage` (which can also be specified via `lit_config.per_test_coverage`). However, it has a few issues, which the current patch addresses:

1. D154280 implements `--per-test-coverage` only for the case that lit is configured to use an external shell. The current patch extends the implementation to lit's internal shell.
2. In the case that lit is configured to use an external shell, regardless of whether `--per-test-coverage` is actually specified, D154280 causes `%dbg(RUN: at line N)` to be expanded in RUN lines early and in a manner that is specific to sh-like shells. As a result, later code in lit that expands it in a shell-specific manner is useless as there's nothing left to expand. The current patch cleans up the implementation to avoid useless code.
3. Because of issue 2, D154280 corrupts support for windows `cmd` as an external shell (effectively comments out all RUN lines with `:`). The current patch happens to fix that particular corruption by addressing issue 2. However, D122569 (landed in 1041a96 in April, 2022) had already broken support for windows `cmd` as an external shell (discards RUN lines when expanding `%dbg(RUN: at line N)`). The current patch does not attempt to fix that bug. For further details, see the PR discussion of the current patch.

The current patch addresses the above issues by implementing `--per-test-coverage` before selecting the shell (internal or external) and by leaving `%dbg(RUN: at line N)` unexpanded there. Thus, it is expanded later in a shell-specific manner, as before D154280.

This patch introduces `buildPdbgCommand` into lit's implementation to encapsulate the process of building (or rebuilding in the case of the `--per-test-coverage` implementation) a full `%dbg(RUN: at line N) cmd` line and asserting that the result matches `kPdbgRegex`. It also cleans up that and all other uses of `kPdbgRegex` to operate on the full line with `re.fullmatch` not `re.match`. This change better reflects the intention in every case, but it is expected to be NFC because `kPdbgRegex` ends in `.*` and thus avoids the difference between `re.fullmatch` and `re.match`. The only caveat is that `.*` does not match newlines, but RUN lines cannot contain newlines currently, so this caveat currently shouldn't matter in practice.

The original `--per-test-coverage` implementation avoided accumulating `export LLVM_PROFILE_FILE={profile}` insertions across retries (due to `ALLOW_RETRIES`) by skipping the insertion if `%dbg(RUN: at line N)` was not present and thus had already been expanded. However, the current patch makes sure the insertions also happen for commands without `%dbg(RUN: at line N)`, such as preamble commands or some commands from other lit test formats. Thus, the current patch implements a different mechanism to avoid accumulating those insertions (see code comments).
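
For readers who want a picture of what the commit message describes, the following is a minimal sketch and not the patch itself. `K_PDBG_REGEX` is a simplified stand-in for `kPdbgRegex`, the snake_case function names and the `{test_name}{index}.profraw` naming are hypothetical, and returning a fresh list is only one possible way to avoid accumulating insertions across retries. The patch's actual mechanism is documented in its code comments.

```python
import re

# Assumption: simplified stand-in for lit's kPdbgRegex.
K_PDBG_REGEX = r"%dbg\(([^)'\"]*)\)(.*)"


def build_pdbg_command(msg, cmd):
    """Build a full '%dbg(msg) cmd' line and check it matches the regex."""
    res = f"%dbg({msg}) {cmd}"
    assert re.fullmatch(K_PDBG_REGEX, res), f"unexpected %dbg form: {res}"
    return res


def add_coverage_exports(commands, test_name):
    """Return a new command list with one profile export per command.

    The %dbg marker is left unexpanded so it can be expanded later in a
    shell-specific manner. The input list is not modified, so calling this
    again for a retry does not stack a second export on each command.
    """
    result = []
    for index, line in enumerate(commands):
        m = re.fullmatch(K_PDBG_REGEX, line)
        cmd = m.group(2).lstrip() if m else line
        cmd = f"export LLVM_PROFILE_FILE={test_name}{index}.profraw; {cmd}"
        # Commands without a marker (e.g. preamble commands) get the export too.
        result.append(build_pdbg_command(m.group(1), cmd) if m else cmd)
    return result
```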

xgupta · 2023-09-14T14:59:55Z

Thank you @jdenny-ornl for fixing the issues.

jdenny-ornl · 2023-09-14T16:02:34Z

> Thank you @jdenny-ornl for fixing the issues.

By the way, thanks for the feature!

Before <https://reviews.llvm.org/D154984> and <https://reviews.llvm.org/D156954>, lit reported full RUN lines in a `Script:` section. Now, in the case of lit's internal shell, it's the execution trace that includes them. However, if lit is configured to use an external shell (e.g., bash, windows `cmd`), they aren't reported at all. A fix was requested at the following:

* <https://reviews.llvm.org/D154984#4627605>
* <https://discourse.llvm.org/t/rfc-improving-lits-debug-output/72839/35?u=jdenny-ornl>

This patch does not address the case when the external shell is windows `cmd`. As discussed at <llvm#65242>, it's not clear whether that's a use case that people still care about, and it seems to be generally broken anyway.

In PR llvm#65242 (landed as 9e739fd), I claimed that RUN lines cannot contain newlines. Actually, they can after substitution expansion. More generally, a lit config file can define substitutions or preamble commands containing newlines. While both of those cases seem unlikely in practice, [D154987](https://reviews.llvm.org/D154987) proposes PYTHON directives where it seems very likely. Regardless of the use case, without this patch, such newlines break expansion of `%dbg(RUN: at line N)`, and the fix is simple.
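
The interaction between newlines and a pattern ending in `.*` can be reproduced in isolation. The sketch below uses a hypothetical stand-in for `kPdbgRegex`, not the pattern from lit, and `re.DOTALL` is just one way to let the tail of the pattern span newlines; the actual fix in lit may be written differently.

```python
import re

# Hypothetical stand-ins for lit's kPdbgRegex; not copied from lit.
SINGLE_LINE = re.compile(r"%dbg\(([^)'\"]*)\)(.*)")
MULTI_LINE = re.compile(r"%dbg\(([^)'\"]*)\)(.*)", re.DOTALL)


def split_with_match(pattern, line):
    """Return (dbg, rest) using re.match, or None if the prefix does not match."""
    m = pattern.match(line)
    return m.groups() if m else None


def split_with_fullmatch(pattern, line):
    """Return (dbg, rest) using re.fullmatch, or None if the whole line does not match."""
    m = pattern.fullmatch(line)
    return m.groups() if m else None


one_line = "%dbg(RUN: at line 3) echo first"
two_lines = "%dbg(RUN: at line 3) echo first\necho second"

# Without newlines, the trailing `.*` makes match and fullmatch agree.
assert split_with_match(SINGLE_LINE, one_line) == split_with_fullmatch(SINGLE_LINE, one_line)

# With a newline, match silently drops everything after it ...
assert split_with_match(SINGLE_LINE, two_lines) == ("RUN: at line 3", " echo first")
# ... while fullmatch rejects the line outright.
assert split_with_fullmatch(SINGLE_LINE, two_lines) is None

# Letting `.` match newlines keeps the whole command.
assert split_with_fullmatch(MULTI_LINE, two_lines) == (
    "RUN: at line 3",
    " echo first\necho second",
)
```

Either failure mode is a problem for `%dbg(RUN: at line N)` expansion: a truncated match loses commands, and a failed match leaves the marker unexpanded.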

jdenny-ornl requested review from rnk, mstorsjo, MaskRay, delcypher, yln, ldionne, DavidSpickett, banach-space, asavonic, AaronBallman, Endilll, hnrklssn, jh7370 and xgupta September 4, 2023 03:41

jdenny-ornl requested a review from a team as a code owner September 4, 2023 03:41

hnrklssn reviewed Sep 4, 2023

llvm/utils/lit/lit/TestRunner.py

jdenny-ornl mentioned this pull request Sep 4, 2023

[lit] Echo full RUN lines in case of external shells #65267

Merged

banach-space removed their request for review September 4, 2023 18:51

RoboTux previously approved these changes Sep 5, 2023

jdenny-ornl force-pushed the lit-windows-skips-run branch from 21b4d57 to e756b3a Compare September 11, 2023 16:08

RoboTux reviewed Sep 12, 2023

RoboTux approved these changes Sep 14, 2023

jdenny-ornl force-pushed the lit-windows-skips-run branch from e2c0d8c to c0fa7bc Compare September 14, 2023 14:05

jdenny-ornl merged commit 9e739fd into llvm:main Sep 14, 2023
2 of 3 checks passed

This was referenced Sep 14, 2023

[LSAN][NFC] Add a new line to a log kstoimenov/llvm-project#5

Closed

[LSAN][NFC] Add a new line to a log kstoimenov/llvm-project#6

Closed

jdenny-ornl mentioned this pull request Sep 14, 2023

[lit] Echo full RUN lines in case of external shells #66408

Merged

jdenny-ornl mentioned this pull request Oct 1, 2023

[lit] Fix shell commands with newlines #67898

Merged
