[lit] Fix some issues from --per-test-coverage #65242

Merged
merged 1 commit into from Sep 14, 2023

Conversation

jdenny-ornl
Collaborator

Key issue

When lit is configured to use windows cmd as an external shell, it appears that all RUN lines are effectively commented out. The problematic change landed on July 26, 2023, so it's surprising it hasn't come up yet. I don't have a windows setup, so I could use some help to verify what's happening.

Details

In the case that lit is configured to use an external shell, D154280 (landed in 64d1954 on July 26, 2023) causes %dbg(RUN: at line N) to be expanded in RUN lines early and in a manner that is specific to sh-like shells. As a result, later code in lit that expands it in a shell-specific manner is useless.

That sh-like expansion uses the : command as follows:

: 'RUN: at line N'; original-commands

In sh-like shells, the : command simply discards its arguments. However, in windows cmd, : indicates a goto label. That appears to effectively comment out the rest of the line, including the original commands from the RUN line.
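
To make the difference concrete, here is a minimal Python sketch of shell-specific %dbg expansion. The function name `expand_dbg`, the regex, and the `shell` parameter are hypothetical and are not lit's actual code; the two output forms are the ones quoted in this thread.

```python
import re

# Hypothetical pattern for illustration; lit's real kPdbgRegex may differ.
PDBG = re.compile(r"%dbg\(([^)]*)\)\s*(.*)")


def expand_dbg(line, shell):
    """Expand a leading %dbg(...) marker for the given shell kind."""
    m = PDBG.fullmatch(line)
    if m is None:
        return line  # no marker, nothing to expand
    msg, cmd = m.groups()
    if shell == "sh":
        # ':' discards its arguments in sh-like shells.
        return f": '{msg}'; {cmd}"
    if shell == "cmd":
        # In cmd, a leading ':' would start a goto label, so use echo.
        return f"echo '{msg}' > nul && {cmd}"
    raise ValueError(f"unknown shell kind: {shell}")


line = "%dbg(RUN: at line 3) llvm-mc -filetype=obj seh.s"
print(expand_dbg(line, "sh"))
print(expand_dbg(line, "cmd"))
```

If the sh form is produced before the shell is known, the cmd branch never sees a %dbg marker, which is the situation described above.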

I am not aware of any complaints about this change. Did I miss them? Are all tests still passing and so no one noticed? Lit's own test suite has some tests that normally fail if RUN lines don't execute. Is no one running lit's test suite with windows cmd as a lit external shell? Or is no one using windows cmd as a lit external shell at all anymore?

Another issue

D154280 doesn't implement --per-test-coverage for lit's internal shell.

Fix

This patch fixes the above problems by implementing --per-test-coverage before selecting the shell (internal or external) and by leaving %dbg(RUN: at line N) unexpanded. Thus, it is expanded later in a shell-specific manner, as before D154280.

I would like to understand whether windows cmd as a lit external shell is worthwhile to support anymore.

@jh7370
Collaborator

jh7370 commented Sep 4, 2023

What is %dbg actually used for? I couldn't find any references to it in the main LLVM tests, nor is there any reference to it in the Testing Guide docs, so I suspect I'm missing something crucial to understanding this situation...

I would like to understand whether windows cmd as a lit external shell is worthwhile to support anymore.

We probably should, purely on the basis that people who develop on Windows might be using this setup. That being said, I'm fairly confident that Windows users by default use the internal shell. I have not verified that though.

@mstorsjo
Member

mstorsjo commented Sep 4, 2023

I would like to understand whether windows cmd as a lit external shell is worthwhile to support anymore.

We probably should, purely on the basis that people who develop on Windows might be using this setup. That being said, I'm fairly confident that Windows users by default use the internal shell. I have not verified that though.

FWIW, I'm not a regular Windows user, I only run it in VMs and similar for testing things, but whenever I do and want to test things in a native Windows environment (as opposed to running in msys2 bash or similar), I do it with cmd. I've never really gotten familiar with powershell.

And whenever tools like python programmatically execute some command string in a way that it is parsed by a shell, that shell is cmd, not powershell, afaik.

@mstorsjo
Member

mstorsjo commented Sep 4, 2023

I haven't been so much involved in the recent Lit changes, and it's been a while since I poked at these things, so it took me a little bit of fiddling around to get back on track about what the actual status quo is here. I'll try to summarize my findings; please double-check if I've missed something:

Status quo

As additional context: The libcxx testsuite used to use the external shell a few years ago, but this was problematic for running the tests on Windows; it didn't really work right there (I don't have exact details). This was changed in 39bbfb7 - before that, running the libcxx tests required making sure that bash was available in path.

I.e., unless jumping through hoops, none of these testsuites really run things with the win32 cmd.exe.

Breakage

Running things with win32 cmd.exe as external shell used to work, but currently doesn't. This actually seems to have broken in 1041a96 already. If running a simple llvm shell test with LIT_USE_INTERNAL_SHELL=0, before that, I got a successful run:

Script:
--
echo 'RUN: at line 3' > nul &&   c:\code\llvm-project\llvm\build\bin\llvm-mc.exe -triple aarch64-pc-win32 -filetype=obj C:/code/llvm-project/llvm/test/MC/AArch64/seh.s | c:\code\llvm-project\llvm\build\bin\llvm-readobj.exe -S -r -u - | c:\code\llvm-project\llvm\build\bin\filecheck.exe C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
echo 'RUN: at line 8' > nul &&   c:\code\llvm-project\llvm\build\bin\llvm-mc.exe -triple aarch64-pc-win32 -filetype=asm C:/code/llvm-project/llvm/test/MC/AArch64/seh.s | c:\code\llvm-project\llvm\build\bin\llvm-mc.exe -triple aarch64-pc-win32 -filetype=obj - | c:\code\llvm-project\llvm\build\bin\llvm-readobj.exe -S -r -u - | c:\code\llvm-project\llvm\build\bin\filecheck.exe C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
--
Exit Code: 0

After that commit, it instead breaks like this:

Script:
--
echo 'RUN: at line 3' > nul &&
echo 'RUN: at line 8' > nul &&
--
Exit Code: 255

Command Output (stderr):
--
The syntax of the command is incorrect.

It further breaks at 64d1954 as you've noticed, since the RUN lines essentially just become no-op goto labels.

Script:
--
: 'RUN: at line 3';   c:\code\llvm-project\llvm\build\bin\llvm-mc.exe -triple aarch64-pc-win32 -filetype=obj C:/code/llvm-project/llvm/test/MC/AArch64/seh.s | c:\code\llvm-project\llvm\build\bin\llvm-readobj.exe -S -r -u - | c:\code\llvm-project\llvm\build\bin\filecheck.exe C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
: 'RUN: at line 8';   c:\code\llvm-project\llvm\build\bin\llvm-mc.exe -triple aarch64-pc-win32 -filetype=asm C:/code/llvm-project/llvm/test/MC/AArch64/seh.s | c:\code\llvm-project\llvm\build\bin\llvm-mc.exe -triple aarch64-pc-win32 -filetype=obj - | c:\code\llvm-project\llvm\build\bin\llvm-readobj.exe -S -r -u - | c:\code\llvm-project\llvm\build\bin\filecheck.exe C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
--
Exit Code: 0

(It prints this even if injecting an error in the test.)

Furthermore, 09b6e45 changes this case further, making it just print this:

PASS: LLVM :: MC/AArch64/seh.s (1 of 1)
Exit Code: 0

(Even when executing with -a -v -v.)

Separate side note; running the LLVM testsuite with bash as external executor (if it happens to be available in path) also seems to fail:

Exit Code: 127

Command Output (stderr):
--
+ : 'RUN: at line 3'
+ c:codellvm-projectllvmbuildbinllvm-mc.exe -triple aarch64-pc-win32 -filetype=obj C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
C:\code\\llvm-project\llvm\build\test\MC\AArch64\Output\seh.s.script: line 1: c:codellvm-projectllvmbuildbinllvm-mc.exe: command not found
+ c:codellvm-projectllvmbuildbinllvm-readobj.exe -S -r -u -
C:\code\\llvm-project\llvm\build\test\MC\AArch64\Output\seh.s.script: line 1: c:codellvm-projectllvmbuildbinllvm-readobj.exe: command not found
+ c:codellvm-projectllvmbuildbinfilecheck.exe C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
C:\code\\llvm-project\llvm\build\test\MC\AArch64\Output\seh.s.script: line 1: c:codellvm-projectllvmbuildbinfilecheck.exe: command not found

--

(This seems to have been broken throughout this time period at least. However before 39bbfb7, when libcxx tests were using the external executor, they did work when executed with bash as external executor.)

Conclusion

It certainly looks like this configuration is severely broken. I'm not aware of any parts of the configurations that use it by default, and even then, it's generally problematic, so I'm not sure if it's worth spending effort on fixing. (The suggestion that was made somewhere around these discussions, to move most things towards using the internal shell even on unix, sounds to me like a good direction in general.)

That said; for echoing executed commands, when using the internal executor, it's probably very valuable if they're printed in a form that is easy to copypaste and execute in cmd.exe (where possible - at least for the majority of simple cases).

@mstorsjo
Member

mstorsjo commented Sep 4, 2023

FWIW, this PR doesn't entirely fix running the tests with win32 cmd.exe either; with this PR, it fails like this:

Exit Code: 255

Command Output (stdout):
--

C:\code\llvm-project\llvm\build\test\MC\AArch64>echo 'RUN: at line 3' > nul &&

--
Command Output (stderr):
--
The syntax of the command is incorrect.

@jdenny-ornl
Collaborator Author

What is %dbg actually used for? I couldn't find any references to it in the main LLVM tests, nor is there any reference to it in the Testing Guide docs, so I suspect I'm missing something crucial to understanding this situation...

It's something like an internal substitution. When parsing a RUN line, lit inserts it before the RUN line's commands like this:

%dbg(RUN: at line N) commands

Later, lit expands it based on the shell (internal, external sh-like shell, external windows cmd, etc.). For example, for sh-like shells:

: 'RUN: at line N'; commands
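
As a rough illustration of the first step (inserting the marker while parsing), here is a small Python sketch. The helper name `add_dbg_markers` is made up, and the parsing is deliberately simplified (no line continuations, no other directives); it is not how lit's parser is actually written.

```python
import re


def add_dbg_markers(test_source):
    """Return RUN commands, each prefixed with an unexpanded %dbg marker."""
    commands = []
    for line_number, text in enumerate(test_source.splitlines(), start=1):
        m = re.search(r"RUN:(.*)", text)
        if m:
            cmd = m.group(1).strip()
            # The marker stays unexpanded here; a later, shell-specific
            # step decides what it turns into.
            commands.append(f"%dbg(RUN: at line {line_number}) {cmd}")
    return commands


source = "; RUN: llvm-mc %s | FileCheck %s\n; CHECK: foo\n; RUN: echo done\n"
for cmd in add_dbg_markers(source):
    print(cmd)
```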

@jdenny-ornl
Collaborator Author

jdenny-ornl commented Sep 4, 2023

I haven't been so much involved in the recent Lit changes, and it's been a while since I poked at these things, so it took me a little bit of fiddling around to get back on track about what the actual status quo is here. I'll try to summarize my findings; please double-check if I've missed something:

Thanks for the nice summary. It looks right to me.

Running things with win32 cmd.exe as external shell used to work, but currently doesn't. This actually seems to have broken in 1041a96 already.

You're right, and that landed in April, 2022, so windows cmd support has been broken for a while. That part is the reason why the current PR wasn't sufficient to restore correct behavior, as you point out in your later comment. It looks straightforward to fix, but maybe doing so isn't worthwhile.

Separate side note; running the LLVM testsuite with bash as external executor (if it happens to be available in path) also seems to fail:

Exit Code: 127

Command Output (stderr):
--
+ : 'RUN: at line 3'
+ c:codellvm-projectllvmbuildbinllvm-mc.exe -triple aarch64-pc-win32 -filetype=obj C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
C:\code\\llvm-project\llvm\build\test\MC\AArch64\Output\seh.s.script: line 1: c:codellvm-projectllvmbuildbinllvm-mc.exe: command not found
+ c:codellvm-projectllvmbuildbinllvm-readobj.exe -S -r -u -
C:\code\\llvm-project\llvm\build\test\MC\AArch64\Output\seh.s.script: line 1: c:codellvm-projectllvmbuildbinllvm-readobj.exe: command not found
+ c:codellvm-projectllvmbuildbinfilecheck.exe C:/code/llvm-project/llvm/test/MC/AArch64/seh.s
C:\code\\llvm-project\llvm\build\test\MC\AArch64\Output\seh.s.script: line 1: c:codellvm-projectllvmbuildbinfilecheck.exe: command not found

--

(This seems to have been broken throughout this time period at least. However before 39bbfb7, when libcxx tests were using the external executor, they did work when executed with bash as external executor.)

It looks like the llvm-mc substitution isn't quoting its value correctly.

Conclusion

It certainly looks like this configuration is severely broken. I'm not aware of any parts of the configurations that use it by default, and even then, it's generally problematic, so I'm not sure if it's worth spending effort on fixing. (The suggestion that was made somewhere around these discussions, to move most things towards using the internal shell even on unix, sounds to me like a good direction in general.)

Agreed. What if we raise a python exception on the windows cmd code path in lit for now? If we still see no complaints for a few weeks, we can think about removing the code path entirely.
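
Roughly what I have in mind, as a sketch only: the function name, parameters, and message below are all hypothetical, and lit's real shell selection is structured differently.

```python
def select_external_shell(bash_path=None, on_windows=False):
    """Pick the external shell command, refusing the windows cmd path."""
    if bash_path:
        return [bash_path]
    if on_windows:
        # The code path under discussion: fail loudly instead of
        # silently running nothing.
        raise NotImplementedError(
            "windows cmd as an external shell is currently unsupported"
        )
    return ["/bin/sh"]
```

Anyone who still depends on this path would then get an immediate, explicit error rather than tests that appear to pass.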

That said; for echoing executed commands, when using the internal executor, it's probably very valuable if they're printed in a form that is easy to copypaste and execute in cmd.exe (where possible - at least for the majority of simple cases).

If RUN lines are written for lit's internal shell, I'm not sure if they are generally guaranteed to work in windows cmd. Do you have a specific change in mind here?

@mstorsjo
Member

mstorsjo commented Sep 4, 2023

(This seems to have been broken throughout this time period at least. However before 39bbfb7, when libcxx tests were using the external executor, they did work when executed with bash as external executor.)

It looks like the llvm-mc substitution isn't quoting its value correctly.

Yep, probably - I guess this goes the same for all the llvm tools that have substitutions set up.

Conclusion

It certainly looks like this configuration is severely broken. I'm not aware of any parts of the configurations that use it by default, and even then, it's generally problematic, so I'm not sure if it's worth spending effort on fixing. (The suggestion that was made somewhere around these discussions, to move most things towards using the internal shell even on unix, sounds to me like a good direction in general.)

Agreed. What if we raise a python exception on the windows cmd code path in lit for now? If we still see no complaints for a few weeks, we can think about removing the code path entirely.

I guess that sounds reasonable. I'm not sure how often odd downstreams integrate from upstream, but things have been quite broken for quite some time anyway, so I guess it's pretty safe to say that this is dead code.

That said; for echoing executed commands, when using the internal executor, it's probably very valuable if they're printed in a form that is easy to copypaste and execute in cmd.exe (where possible - at least for the majority of simple cases).

If RUN lines are written for lit's internal shell, I'm not sure if they are generally guaranteed to work in windows cmd. Do you have a specific change in mind here?

Not really, no - I'm mostly commenting on it from the point of view that I remember seeing somewhere else in these discussions (regarding formatting of the echo RUN stuff I think). As long as most commands, in the form command1 | command2 work and can be copypasted as such, we're probably just fine. (And I have no reason to believe that wouldn't be the case.)

@banach-space banach-space removed their request for review September 4, 2023 18:51
RoboTux
RoboTux previously approved these changes Sep 5, 2023
Contributor

@RoboTux RoboTux left a comment

LGTM

@hnrklssn
Member

hnrklssn commented Sep 5, 2023

What is %dbg actually used for? I couldn't find any references to it in the main LLVM tests, nor is there any reference to it in the Testing Guide docs, so I suspect I'm missing something crucial to understanding this situation...

It's something like an internal substitution. When parsing a RUN line, lit inserts it before the RUN line's commands like this:

%dbg(RUN: at line N) commands

Later, lit expands it based on the shell (internal, external sh-like shell, external windows cmd, etc.). For example, for sh-like shells:

: 'RUN: at line N'; commands

Ah, thanks for the explanation. Then it's used a lot more than I thought.

jdenny-ornl added a commit to jdenny-ornl/llvm-project that referenced this pull request Sep 5, 2023
Before <https://reviews.llvm.org/D154984> and
<https://reviews.llvm.org/D156954>, lit reported full RUN lines in a
`Script:` section.  Now, in the case of lit's internal shell, it's the
execution trace that includes them.  However, if lit is configured to
use an external shell (e.g., bash, windows `cmd`), they aren't
reported at all.

A fix was requested at the following:

* <https://reviews.llvm.org/D154984#4627605>
* <https://discourse.llvm.org/t/rfc-improving-lits-debug-output/72839/35?u=jdenny-ornl>

This patch does not correctly address the case when the external shell
is windows `cmd`.  As discussed at
<llvm#65242>, it's not clear
whether that's a use case that people still care about, and it seems
to be generally broken anyway.

@jdenny-ornl
Collaborator Author

I noticed that the python formatting check failed, so I looked into that. The problem is the file llvm/utils/lit/tests/per-test-coverage.py

Thanks for pointing that out. I noticed it too but made the bad assumption that something was wrong with the pre-commit check itself, and I ran out of time to investigate. This is the diagnostic I saw after a traceback:

black.parsing.InvalidInput: Cannot parse: 12:13: t-cfg.py ({{[^)]*}})

I didn't see llvm/utils/lit/tests/per-test-coverage.py mentioned anywhere. Did I miss it?

In contrast, when I run black on the command line on my local system, I see:

$ black llvm/utils/lit/tests/per-test-coverage.py 
error: cannot format llvm/utils/lit/tests/per-test-coverage.py: Cannot parse: 12:13: t-cfg.py ({{[^)]*}})

Oh no! 💥 💔 💥
1 file failed to reformat.

Is there any way to get the pre-commit check to mention the name of the problematic file?

(By the way, I have noticed that many python files were formatted with >80 columns, probably the black default of 88. That doesn't fit in the 80-column editor/terminal windows I use for most of LLVM. Is there any good reason not to follow the LLVM coding standard for source code width in python files? I see no exception there for python, or did I misunderstand something? Black just needs a -l80 command-line option.)

This file is not valid python; neither black nor python can parse it. Since that's the case, can we change the extension on it so that we don't have to create a lot of exceptions in our CI and pre-commit hook?

Lit's tests use the .py suffix. Most just contain comments that contain lit/FileCheck directives. Some contain actual python code called by RUN lines. The above test contains garbage outside of a comment, so the test works fine, but it's invalid as python. I'll push a commit to this PR to fix it. Thanks again for pointing me in the right direction.
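
For anyone wanting to check such a file locally, a quick sketch using the standard `ast` module as a rough proxy for "parses as python". The helper name and both sample strings are made up for illustration; the second one only mimics the kind of FileCheck pattern that ended up outside a comment.

```python
import ast


def is_valid_python(source):
    """Return True if the text parses as Python source."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Directives inside comments: fine as Python.
ok = "# RUN: echo hello | FileCheck %s\n# CHECK: hello\n"
# A FileCheck-style pattern outside a comment (made up for illustration).
bad = "# CHECK: first\nsome-cfg.py ({{[^)]*}})\n"

print(is_valid_python(ok), is_valid_python(bad))
```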

@tru
Collaborator

tru commented Sep 11, 2023

I noticed that the python formatting check failed, so I looked into that. The problem is the file llvm/utils/lit/tests/per-test-coverage.py

Thanks for pointing that out. I noticed it too but made the bad assumption that something was wrong with the pre-commit check itself, and I ran out of time to investigate. This is the diagnostic I saw after a traceback:

Is there any way to get the pre-commit check to mention the name of the problematic file?

I'll look into this, it also took me time to figure it out.

(By the way, I have noticed that many python files were formatted with >80 columns, probably the black default of 88. That doesn't fit in the 80-column editor/terminal windows I use for most of LLVM. Is there any good reason not to follow the LLVM coding standard for source code width in python files? I see no exception there for python, or did I misunderstand something? Black just needs a -l80 command-line option.)

This was discussed when we wrote the coding style for python and then we thought it was just better to use black's default to avoid having people need to run black with arguments (since it won't read a config file).

@jdenny-ornl
Collaborator Author

(By the way, I have noticed that many python files were formatted with >80 columns, probably the black default of 88. That doesn't fit in the 80-column editor/terminal windows I use for most of LLVM. Is there any good reason not to follow the LLVM coding standard for source code width in python files? I see no exception there for python, or did I misunderstand something? Black just needs a -l80 command-line option.)

This was discussed when we wrote the coding style for python and then we thought it was just better to use black's default to avoid having people need to run black with arguments

OK, but the LLVM coding standard appears to contradict that approach, as far as I can tell:

  1. "Write your code to fit within 80 columns." (From https://llvm.org/docs/CodingStandards.html#source-code-width.)
  2. "The Python code within the LLVM repository should adhere to the formatting guidelines outlined in PEP-8." (From https://llvm.org/docs/CodingStandards.html#python-version-and-source-code-formatting).
  3. "Limit all lines to a maximum of 79 characters." (From https://peps.python.org/pep-0008/#maximum-line-length.)

The LLVM coding standard's advice on using black is:

For consistency and to limit churn, code should be automatically formatted with the black utility. Black allows changing the formatting rules based on major version. In order to avoid unnecessary churn in the formatting rules we currently use black version 23.x in LLVM.

Combining all the above, I'm led to believe I should be passing options to black, but it's unclear which ones.

I'm sorry I didn't participate in the original discussion. If it's too late to change things, can we clarify the LLVM coding standard?

(since it won't read a config file).

See https://black.readthedocs.io/en/stable/usage_and_configuration/the_basics.html#configuration-via-a-file.

For example, in llvm/utils/lit/pyproject.toml , I appended the following:

[tool.black]
line-length = 80

And it worked for me.

@tru
Collaborator

tru commented Sep 11, 2023

(By the way, I have noticed that many python files were formatted with >80 columns, probably the black default of 88. That doesn't fit in the 80-column editor/terminal windows I use for most of LLVM. Is there any good reason not to follow the LLVM coding standard for source code width in python files? I see no exception there for python, or did I misunderstand something? Black just needs a -l80 command-line option.)

This was discussed when we wrote the coding style for python and then we thought it was just better to use black's default to avoid having people need to run black with arguments

OK, but the LLVM coding standard appears to contradict that approach, as far as I can tell:

  1. "Write your code to fit within 80 columns." (From https://llvm.org/docs/CodingStandards.html#source-code-width.)
  2. "The Python code within the LLVM repository should adhere to the formatting guidelines outlined in PEP-8." (From https://llvm.org/docs/CodingStandards.html#python-version-and-source-code-formatting).
  3. "Limit all lines to a maximum of 79 characters." (From https://peps.python.org/pep-0008/#maximum-line-length.)

The LLVM coding standard's advice on using black is:

For consistency and to limit churn, code should be automatically formatted with the black utility. Black allows changing the formatting rules based on major version. In order to avoid unnecessary churn in the formatting rules we currently use black version 23.x in LLVM.

Combining all the above, I'm led to believe I should be passing options to black, but it's unclear which ones.

I'm sorry I didn't participate in the original discussion. If it's too late to change things, can we clarify the LLVM coding standard?

(since it won't read a config file).

See https://black.readthedocs.io/en/stable/usage_and_configuration/the_basics.html#configuration-via-a-file.

For example, in llvm/utils/lit/pyproject.toml , I appended the following:

[tool.black]
line-length = 80

And it worked for me.

I think that's pretty new then. We tried this back during the original discussion without luck. But feel free to either update the docs to reflect the reality or post to discourse suggesting a policy change.

I have no strong feelings either way as long as we don't have to remember to manually change options. A change here would also mean a lot of reformatting, which is a bit of unnecessary churn.

@jdenny-ornl
Collaborator Author

But feel free to either update the docs to reflect the reality or post to discourse suggesting a policy change.

I'll plan to raise it in the original RFC before proposing any changes.

Thanks for all your work in this area. It is much appreciated.

avillega pushed a commit to avillega/llvm-project that referenced this pull request Sep 11, 2023
@jdenny-ornl jdenny-ornl dismissed RoboTux’s stale review September 11, 2023 20:36

PR has changed since the review.

@jdenny-ornl
Collaborator Author

I'm not trying to rush reviewers, but I want it to be clear that this patch is still in need of review due to recent changes. See #65242 (comment) for a summary. I'm afraid the recent discussion of black might have buried that comment. Maybe I'm just not used to LLVM PR reviews yet.

@RoboTux
Contributor

RoboTux commented Sep 12, 2023

I'm not trying to rush reviewers, but I want it to be clear that this patch is still in need of review due to recent changes. See #65242 (comment) for a summary. I'm afraid the recent discussion of black might have buried that comment. Maybe I'm just not used to LLVM PR reviews yet.

I saw it but was waiting on the updated version fixing the python issue before giving it another look. I shall look at it today.

keyword=keyword, line_number=line_number
)
assert re.match(
kPdbgRegex + "$", pdbg
Contributor

Why is the $ dropped in the buildPdbgCommand?

Collaborator Author

Good question.

Originally, my goal was for the assert to mimic how kPdbgRegex will actually be used later when expanding %dbg substitutions. That's the usage that the assert is trying to verify will work correctly. Thus, this patch changes the assert in two ways: the searched string contains all of %dbg(...) cmd-line instead of just %dbg(...), and it does not use $.

However, now that you ask, I think the assert should be stricter. For example, D154987 proposes to extend kPdbgRegex to permit newlines, which might appear in lit.run(cmd) in PYTHON directives, as in the example in that review's summary. Using $ in the assert here would have caught that existing deficiency of kPdbgRegex: its .* doesn't match beyond the first newline.

Instead of $, what do you think of using re.fullmatch in the assert? That is, buildPdbgCommand expects kPdbgRegex to match the entire string it's building.
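
To show the difference I mean, here is a small self-checking Python sketch. `PDBG_REGEX` is a hypothetical stand-in for kPdbgRegex (the real pattern may differ), and the `re.DOTALL` line is only my illustration of what permitting newlines could look like, not what D154987 actually does.

```python
import re

# Hypothetical stand-in for kPdbgRegex; the real pattern may differ.
PDBG_REGEX = r"%dbg\(([^)]*)\)(.*)"

one_line = "%dbg(RUN: at line 1) echo a"
two_lines = "%dbg(RUN: at line 1) echo a\necho b"
trailing_newline = "%dbg(RUN: at line 1) echo a\n"

# Without an anchor, re.match accepts any string with a matching prefix.
assert re.match(PDBG_REGEX, two_lines)
# Both '$' and re.fullmatch reject text after the first newline,
# because '.*' stops there.
assert re.match(PDBG_REGEX + "$", two_lines) is None
assert re.fullmatch(PDBG_REGEX, two_lines) is None
# '$' also matches just before a trailing newline; re.fullmatch does not.
assert re.match(PDBG_REGEX + "$", trailing_newline)
assert re.fullmatch(PDBG_REGEX, trailing_newline) is None
# With re.DOTALL, '.*' crosses newlines and the whole string matches.
assert re.fullmatch(PDBG_REGEX, two_lines, flags=re.DOTALL)
assert re.fullmatch(PDBG_REGEX, one_line)
```

So `re.fullmatch` is at least as strict as `$` here, and it states the intent (match the entire string) directly.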

Collaborator Author

I went ahead and applied that change and wrote a commit log to explain it.

Contributor

LGTM but does it mean you'll just rebase and push to the repo using pure git? If using the github tool AFAIK you can only merge and squash on the LLVM project so the commit message won't show up unless you modify the PR message. I might be wrong though, I haven't interacted much with github.

Collaborator

At least from the web UI you have the chance to edit the final commit message. So you can combine them there if you're ok with it all being one commit.

(otherwise yes manually push some of it)

Collaborator Author

Thanks for the review!

Yes, I performed most steps from the git command line: squash commits and commit logs together, rebase onto the latest main, and force push to the PR. I then clicked "Squash and merge" and copied and pasted my new commit log there. The commit log it initially offered was the original comment I posted for this PR, but the focus of the PR has shifted since then. It would be nicer if it offered the current commit logs, squashed in the same way a git merge --squash would squash them.

I don't know how to accomplish the rebase onto main via the github web UI... even though it seems like that's almost always going to be required given how frequently new commits show up on main. That means I needed to work from the command line anyway, so it seemed more practical to handle squashing there too.

Please let me know if I've overlooked best practices for using the github web UI.

Contributor

Collaborator

The web UI squash-and-merge will automatically rebase to current HEAD.

Collaborator Author

Thanks for the clarifications. For some reason, I thought it previously rejected one when I wasn't caught up to main, but perhaps I had conflicts then. I'll try again next time.

Collaborator Author

Yep, it worked fine in another PR that had no conflicts.

Contributor

@RoboTux RoboTux left a comment

LGTM

D154280 (landed in 64d1954 in July, 2023) implements
`--per-test-coverage` (which can also be specified via
`lit_config.per_test_coverage`).  However, it has a few issues, which
the current patch addresses:

1. D154280 implements `--per-test-coverage` only for the case that lit
   is configured to use an external shell.  The current patch extends
   the implementation to lit's internal shell.

2. In the case that lit is configured to use an external shell,
   regardless of whether `--per-test-coverage` is actually specified,
   D154280 causes `%dbg(RUN: at line N)` to be expanded in RUN lines
   early and in a manner that is specific to sh-like shells.  As a
   result, later code in lit that expands it in a shell-specific
   manner is useless as there's nothing left to expand.  The current
   patch cleans up the implementation to avoid useless code.

3. Because of issue 2, D154280 corrupts support for windows `cmd` as
   an external shell (effectively comments out all RUN lines with
   `:`).  The current patch happens to fix that particular corruption
   by addressing issue 2.  However, D122569 (landed in 1041a96 in
   April, 2022) had already broken support for windows `cmd` as an
   external shell (discards RUN lines when expanding `%dbg(RUN: at
   line N)`).  The current patch does not attempt to fix that bug.
   For further details, see the PR discussion of the current patch.
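
The difference between sh-like and `cmd` expansion described in items 2 and 3 can be illustrated with a small sketch. This is not lit's code: `PDBG_RE`, `expand_dbg`, and the `cmd` form `echo '...' > nul && ...` are assumptions chosen for illustration only. The point is that the `: 'RUN: at line N';` prefix is only meaningful to sh-like shells, so expansion has to be deferred until the shell is known.

```python
import re

# Hypothetical stand-in for lit's kPdbgRegex; the real pattern may differ.
PDBG_RE = r"%dbg\(([^)'\"]*)\)\s*(.*)"


def expand_dbg(line, shell):
    """Expand a leading %dbg(...) marker in a shell-specific way.

    `shell` is "sh" or "cmd"; both expansions are illustrative assumptions.
    """
    m = re.fullmatch(PDBG_RE, line)
    if m is None:
        return line  # No marker, nothing to expand.
    msg, cmd = m.group(1), m.group(2)
    if shell == "sh":
        # In sh-like shells, `:` is a no-op that discards its arguments.
        return f": '{msg}'; {cmd}"
    if shell == "cmd":
        # In cmd, a leading `:` starts a label, so a different no-op is needed.
        return f"echo '{msg}' > nul && {cmd}"
    raise ValueError(f"unknown shell: {shell}")


run_line = "%dbg(RUN: at line 3) echo hello"
sh_line = expand_dbg(run_line, "sh")
cmd_line = expand_dbg(run_line, "cmd")
```

Expanding early with the `sh` form and then handing the result to `cmd` is the failure mode item 3 describes.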

The current patch addresses the above issues by implementing
`--per-test-coverage` before selecting the shell (internal or
external) and by leaving `%dbg(RUN: at line N)` unexpanded there.
Thus, it is expanded later in a shell-specific manner, as before
D154280.

This patch introduces `buildPdbgCommand` into lit's implementation to
encapsulate the process of building (or rebuilding in the case of the
`--per-test-coverage` implementation) a full `%dbg(RUN: at line N)
cmd` line and asserting that the result matches `kPdbgRegex`.  It also
cleans up that and all other uses of `kPdbgRegex` to operate on the
full line with `re.fullmatch` rather than `re.match`.  This change
better reflects the intention in every case, but it is expected to be
NFC because `kPdbgRegex` ends in `.*`, which makes `re.fullmatch` and
`re.match` succeed on the same inputs.  The only caveat is that `.*`
does not match newlines, but RUN lines currently cannot contain
newlines, so this caveat shouldn't matter in practice.
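
The `re.match` versus `re.fullmatch` point can be checked with a short sketch. `PDBG_RE` below is a hypothetical pattern that merely ends in `.*` like the description of `kPdbgRegex`; the real regex in lit may differ, and `match_kinds` is a helper invented for this example.

```python
import re

# Hypothetical pattern ending in `.*`; not necessarily lit's kPdbgRegex.
PDBG_RE = r"%dbg\(([^)'\"]*)\)(.*)"


def match_kinds(line, flags=0):
    """Return (re.match succeeded, re.fullmatch succeeded) for `line`."""
    return (
        re.match(PDBG_RE, line, flags) is not None,
        re.fullmatch(PDBG_RE, line, flags) is not None,
    )


single = "%dbg(RUN: at line 1) echo hello"
multi = "%dbg(RUN: at line 1) echo hello\necho world"

single_result = match_kinds(single)            # both succeed
multi_result = match_kinds(multi)              # only re.match succeeds
dotall_result = match_kinds(multi, re.DOTALL)  # both succeed again
```

On a single-line input the two calls agree because the trailing `.*` consumes the rest of the line. They diverge only when the input contains a newline, which is the caveat noted above.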

The original `--per-test-coverage` implementation avoided accumulating
`export LLVM_PROFILE_FILE={profile}` insertions across retries (due to
`ALLOW_RETRIES`) by skipping the insertion if `%dbg(RUN: at line N)`
was not present and thus had already been expanded.  However, the
current patch makes sure the insertions also happen for commands
without `%dbg(RUN: at line N)`, such as preamble commands or some
commands from other lit test formats.  Thus, the current patch
implements a different mechanism to avoid accumulating those
insertions (see code comments).
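
One way to avoid accumulating insertions across retries is to rebuild the command list from the untouched originals on every attempt. The sketch below illustrates that idea only; it is not the mechanism the patch uses (see its code comments for that), and `PDBG_RE`, `add_profile_exports`, and the profile file naming are all hypothetical.

```python
import re

# Hypothetical stand-in for lit's kPdbgRegex; the real pattern may differ.
PDBG_RE = r"%dbg\(([^)'\"]*)\)\s*(.*)"


def add_profile_exports(original_commands, profile_template):
    """Return a new list with one export inserted per command.

    The input list is never mutated, so calling this again for a retry
    starts from the pristine commands and cannot stack up exports.
    """
    updated = []
    for index, line in enumerate(original_commands):
        export = "export LLVM_PROFILE_FILE=" + profile_template.format(index=index)
        m = re.fullmatch(PDBG_RE, line)
        if m:
            # Keep %dbg(...) at the front so it can still be expanded later.
            updated.append(f"%dbg({m.group(1)}) {export}; {m.group(2)}")
        else:
            # Commands without the marker (e.g. preamble commands) also get it.
            updated.append(f"{export}; {line}")
    return updated


original = ["%dbg(RUN: at line 1) echo a", "echo preamble"]
first_attempt = add_profile_exports(original, "t{index}.profraw")
retry_attempt = add_profile_exports(original, "t{index}.profraw")
```

Because each attempt derives its commands from `original`, the retry produces the same lines as the first attempt instead of lines with two exports.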
@jdenny-ornl jdenny-ornl merged commit 9e739fd into llvm:main Sep 14, 2023
2 of 3 checks passed
@xgupta
Contributor

xgupta commented Sep 14, 2023

Thank you @jdenny-ornl for fixing the issues.

kstoimenov pushed a commit to kstoimenov/llvm-project that referenced this pull request Sep 14, 2023
@jdenny-ornl
Collaborator Author

> Thank you @jdenny-ornl for fixing the issues.

By the way, thanks for the feature!

ZijunZhaoCCK pushed a commit to ZijunZhaoCCK/llvm-project that referenced this pull request Sep 19, 2023
jdenny-ornl added a commit to jdenny-ornl/llvm-project that referenced this pull request Sep 19, 2023
Before <https://reviews.llvm.org/D154984> and
<https://reviews.llvm.org/D156954>, lit reported full RUN lines in a
`Script:` section.  Now, in the case of lit's internal shell, it's the
execution trace that includes them.  However, if lit is configured to
use an external shell (e.g., bash, windows `cmd`), they aren't
reported at all.

A fix was requested at the following:

* <https://reviews.llvm.org/D154984#4627605>
* <https://discourse.llvm.org/t/rfc-improving-lits-debug-output/72839/35?u=jdenny-ornl>

This patch does not address the case when the external shell is
windows `cmd`.  As discussed at
<llvm#65242>, it's not clear
whether that's a use case that people still care about, and it seems
to be generally broken anyway.
jdenny-ornl added a commit that referenced this pull request Sep 19, 2023
jdenny-ornl added a commit to jdenny-ornl/llvm-project that referenced this pull request Oct 1, 2023
In PR llvm#65242 (landed as 9e739fd), I
claimed that RUN lines cannot contain newlines.  Actually, they can
after substitution expansion.  More generally, a lit config file can
define substitutions or preamble commands containing newlines.  While
both of those cases seem unlikely in practice,
[D154987](https://reviews.llvm.org/D154987) proposes PYTHON directives
where it seems very likely.

Regardless of the use case, without this patch, such newlines break
expansion of `%dbg(RUN: at line N)`, and the fix is simple.
jdenny-ornl added a commit that referenced this pull request Oct 3, 2023
qihangkong pushed a commit to rvgpu/llvm that referenced this pull request Apr 18, 2024
Before <https://reviews.llvm.org/D154984> and
<https://reviews.llvm.org/D156954>, lit reported full RUN lines in a
`Script:` section. Now, in the case of lit's internal shell, it's the
execution trace that includes them. However, if lit is configured to use
an external shell (e.g., bash, windows `cmd`), they aren't reported at
all.

A fix was requested at the following:

* <https://reviews.llvm.org/D154984#4627605>
* <https://discourse.llvm.org/t/rfc-improving-lits-debug-output/72839/35?u=jdenny-ornl>

This patch does not correctly address the case when the external shell
is windows `cmd`. As discussed at
<llvm/llvm-project#65242>, it's not clear
whether that's a use case that people still care about, and it seems to
be generally broken anyway.