Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix forward-slash path seperator in lldb and add lldb-vscode.exe to release #266

Closed
nidefawl opened this issue Mar 24, 2022 · 8 comments
Closed

Comments

@nidefawl
Copy link

lldb-vscode.exe together with lldb/tools/lldb-vscode/package.json and the few files in lldb-vscode/syntaxes is enough to get debugging in VSCode without Microsofts Cpp or thirdparty extensions working.
This works without the MI interface, by directly communicating to VSCodes debugger API.

But file+linenumber breakpoints don't resolve when launching a target. Function breakpoints do work.

I spent some time debugging, and it turned out to be a path-seperator problem.
My code is built using forwards slashes in the filepaths.

After lldb loads the Dwarf symbols the source file paths are getting transformed to /my/project/build/c:/my/project/src/somefile.cpp

I made these 2 changes, and it now constructs the correct path for source files.

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
index 5487f709d223..f44ca1705027 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
@@ -771,7 +771,7 @@ removeHostnameFromPathname(llvm::StringRef path_from_dwarf) {

   // check whether we have a windows path, and so the first character is a
   // drive-letter not a hostname.
-  if (host.size() == 1 && llvm::isAlpha(host[0]) && path.startswith("\\"))
+  if (host.size() == 1 && llvm::isAlpha(host[0]) && (path.startswith("\\") || path.startswith("/")))
     return path_from_dwarf;

   return path;
diff --git a/lldb/source/Utility/FileSpec.cpp b/lldb/source/Utility/FileSpec.cpp
index 24f8c2b1c23f..01e1183a9195 100644
--- a/lldb/source/Utility/FileSpec.cpp
+++ b/lldb/source/Utility/FileSpec.cpp
@@ -311,7 +311,7 @@ llvm::Optional<FileSpec::Style> FileSpec::GuessPathStyle(llvm::StringRef absolut
   if (absolute_path.startswith(R"(\\)"))
     return Style::windows;
   if (absolute_path.size() >= 3 && llvm::isAlpha(absolute_path[0]) &&
-      absolute_path.substr(1, 2) == R"(:\)")
+      (absolute_path.substr(1, 2) == R"(:\)" || absolute_path.substr(1, 2) == ":/"))
     return Style::windows;
   return llvm::None;
 }

I haven't run the unittests for FileSpec::GuessPathStyle.

Is it feasible to get this change upstream and to include lldb-vscode.exe in future releases?

@mstorsjo
Copy link
Owner

lldb-vscode.exe together with lldb/tools/lldb-vscode/package.json and the few files in lldb-vscode/syntaxes is enough to get debugging in VSCode without Microsofts Cpp or thirdparty extensions working. This works without the MI interface, by directly communicating to VSCodes debugger API.

But file+linenumber breakpoints don't resolve when launching a target. Function breakpoints do work.

I spent some time debugging, and it turned out to be a path-seperator problem. My code is built using forwards slashes in the filepaths.

After lldb loads the Dwarf symbols the source file paths are getting transformed to /my/project/build/c:/my/project/src/somefile.cpp

I made these 2 changes, and it now constructs the correct path for source files.

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
index 5487f709d223..f44ca1705027 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
@@ -771,7 +771,7 @@ removeHostnameFromPathname(llvm::StringRef path_from_dwarf) {

   // check whether we have a windows path, and so the first character is a
   // drive-letter not a hostname.
-  if (host.size() == 1 && llvm::isAlpha(host[0]) && path.startswith("\\"))
+  if (host.size() == 1 && llvm::isAlpha(host[0]) && (path.startswith("\\") || path.startswith("/")))
     return path_from_dwarf;

   return path;
diff --git a/lldb/source/Utility/FileSpec.cpp b/lldb/source/Utility/FileSpec.cpp
index 24f8c2b1c23f..01e1183a9195 100644
--- a/lldb/source/Utility/FileSpec.cpp
+++ b/lldb/source/Utility/FileSpec.cpp
@@ -311,7 +311,7 @@ llvm::Optional<FileSpec::Style> FileSpec::GuessPathStyle(llvm::StringRef absolut
   if (absolute_path.startswith(R"(\\)"))
     return Style::windows;
   if (absolute_path.size() >= 3 && llvm::isAlpha(absolute_path[0]) &&
-      absolute_path.substr(1, 2) == R"(:\)")
+      (absolute_path.substr(1, 2) == R"(:\)" || absolute_path.substr(1, 2) == ":/"))
     return Style::windows;
   return llvm::None;
 }

I haven't run the unittests for FileSpec::GuessPathStyle.

Is it feasible to get this change upstream and

Thanks for looking into this and providing a provisional patch!

Sure, I can try to upstream these changes. They seem straightforward and correct to me. The unittests for FileSpec::GuessPathStyle do pass, and it's easy to extend them to include testing for this case too. Not quite as sure how to go about creating a test for the dwarf part of the patch though.

Is it possible to trigger and notice the same issue with commandline lldb? I tried to compile a file with a hardcoded path to the source with both path styles, e.g. clang c:\code\test.cpp and clang c:/code/test.cpp, and at least commandline lldb managed to open and view the source for code in that file just fine.

Btw when poking around with this, I noticed another similar-looking issue. If you build a C++ test example (I built test/hello-exception.cpp from this repo) and inspect the executable with llvm-dwarfdump -debug-info (not included in this toolchain distribution unfortunately, but commonly available on linux), I find things like this:

0x00001323:       DW_TAG_enumeration_type
                    DW_AT_type  (cu + 0x19d0 "int")
                    DW_AT_name  ("float_denorm_style")
                    DW_AT_byte_size     (0x04)
                    DW_AT_decl_file     ("C:\code\debugtest/C:\code/llvm-mingw-13\include\c++\v1\limits")
                    DW_AT_decl_line     (135)

So something, somewhere, fails to realize that the appended path is absolute. The same issue seems to be present both with the LLVM 13 based release and with the 14 based release (where path separators have been changed to generally prefer forward slashes). If you'd happen to have time, can you manage to pinpoint where this happens? If not I can try to look into it at some point...

to include lldb-vscode.exe in future releases?

Sure, I can do that. It's been discussed before, but last comment on that topic I heard was that it shouldn't be needed as the lldb-vscode extension comes with its own binary. But if there's a usecase for setting up that extension manually, it's quite straightforward to include lldb-vscode.exe in the packages.

@nidefawl
Copy link
Author

                DW_AT_decl_file     ("C:\code\debugtest/C:\code/llvm-mingw-13\include\c++\v1\limits")

I did some tests and I cannot reproduce this issue. Did you build it with a 13.x release?
I tested building with the latest 14.0 release and with my own 14.0 build. I inspected with a llvm-dwarfdump.exe from my build and llvm-dwarfdump-9, 12 and 14 on ubuntu.
I did compilation from different sibling directories, and I even tried building next to the toolchain directory.

It mixes forward and backslashes, but all of the paths are valid and correct.

@mstorsjo
Copy link
Owner

                DW_AT_decl_file     ("C:\code\debugtest/C:\code/llvm-mingw-13\include\c++\v1\limits")

I did some tests and I cannot reproduce this issue. Did you build it with a 13.x release?

I tried both with a 13.x and a 14.x based release.

But it seems rather elusive. I can reproduce it on a machine running Windows 11 on ARM64, running both the aarch64 and x86_64 versions of the toolchain (with the x86_64 version of the toolchain running emulated). But it doesn't reproduce on a VM running Windows 10 (21H2) on x86_64, nor does it reproduce on Windows Server 2019 (1809) on x86_64.

To reproduce it, on the one system where it happens, the key seems to be running the compile process from a directory that isn't parent of the toolchain - i.e. having the toolchain unpacked as e.g. c:\code\llvm-mingw-14 but running the compile process from c:\code\debugtest.

One exact procedure to reproduce it is e.g. this:

wget https://github.com/mstorsjo/llvm-mingw/releases/download/20220323/llvm-mingw-20220323-ucrt-x86_64.zip
unzip llvm-mingw-20220323-ucrt-x86_64.zip
mkdir debugtest
cd debugtest
wget https://raw.githubusercontent.com/mstorsjo/llvm-mingw/master/test/hello-exception.cpp
../llvm-mingw-20220323-ucrt-x86_64/bin/clang++ -c hello-exception.cpp -g
llvm-dwarfdump -debug-info hello-exception.o | grep decl_file

So apparently this is some case of code trying to deduce whether two paths share a common path prefix or not, which apparently behaves differently depending on version/platform of Windows. So I guess I'll have to dig in myself if others can't reproduce this...

mstorsjo added a commit to llvm/llvm-project that referenced this issue Mar 26, 2022
In practice, Windows paths can use either backslashes or forward slashes.

This fixes an issue reported downstream at
mstorsjo/llvm-mingw#266.

Differential Revision: https://reviews.llvm.org/D122389
@mstorsjo
Copy link
Owner

I pushed the LLDB fix upstream in llvm/llvm-project@b548f58.

In 0a23956 I pushed a commit to keep the lldb-vscode.exe executable in the distribution packages.

@nidefawl
Copy link
Author

Very nice.
I will look into setting up a VM for testing Windows ARM64.
Does lldb in general work on ARM64 Windows?

I spent some more time implementing missing windows features in lldb-vsode and fixing issues.
i.e. https://reviews.llvm.org/D99974 does not pipe the processes stdout/stderr streams correctly.
In general the handling of the target process std I/O is diverging a lot on windows.
lldb logs where not forwarded to VSCodes console.
And I fixed an issue where VSCode shows the target as halted on startup.

I will see if I can get my changes reviewed and merged the official way once I managed to test my changes at least on linux x64 and windows x64.

@mstorsjo
Copy link
Owner

Does lldb in general work on ARM64 Windows?

Some basics work (so if you're debugging a crash or similar, you can probably use it to get hints about what's happening), but I'm not so sure about anything more fancy, and it probably has quite little real-world testing. Backtraces on crashes, breakpoints, stepping do work, at least in simple tests. Watchpoints - most probably not.

I will see if I can get my changes reviewed and merged the official way once I managed to test my changes at least on linux x64 and windows x64.

Awesome! I probably don't have much to add in reviews about such patches (and I'm not authoritative to approve them), but feel free to include me as subscriber to follow the progress.

@mstorsjo
Copy link
Owner

mstorsjo commented Mar 26, 2022

But it seems rather elusive. I can reproduce it on a machine running Windows 11 on ARM64, running both the aarch64 and x86_64 versions of the toolchain (with the x86_64 version of the toolchain running emulated). But it doesn't reproduce on a VM running Windows 10 (21H2) on x86_64, nor does it reproduce on Windows Server 2019 (1809) on x86_64.

Actually, this was a total red herring, sorry for the noise. The issue was that I was inspecting the object files with the version of llvm-dwarfdump that ships with Ubuntu 18.04 (in WSL), and that's a rather ancient LLVM 6.0.0. With a more modern version of llvm-dwarfdump it looks ok.

@mstorsjo
Copy link
Owner

Anyway, I think this issue can be closed now? The LLDB bugfix has been upstreamed, llvm-mingw is modified to include lldb-vscode in releases, and my red herring has been analyzed and dismissed.

SquallATF pushed a commit to SquallATF/llvm-project that referenced this issue Jul 29, 2022
In practice, Windows paths can use either backslashes or forward slashes.

This fixes an issue reported downstream at
mstorsjo/llvm-mingw#266.

Differential Revision: https://reviews.llvm.org/D122389
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
In practice, Windows paths can use either backslashes or forward slashes.

This fixes an issue reported downstream at
mstorsjo/llvm-mingw#266.

Differential Revision: https://reviews.llvm.org/D122389
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants