lldb test suite fails to JIT expressions after update to Ubuntu Jammy ubuntu3.3 #68987

Closed
DavidSpickett opened this issue Oct 13, 2023 · 4 comments · Fixed by #69932

@DavidSpickett
Collaborator

DavidSpickett commented Oct 13, 2023

TLDR: LLDB for some reason unloads important information about ld-linux, which prevents us from correctly calling mmap on Arm/Thumb.

Long report for background; I'll add the important follow-up questions in a comment after this.

Since I updated our 32 bit Arm lldb bot container from:

$ ldd --version
ldd (Ubuntu GLIBC 2.35-0ubuntu3.1) 2.35

To:

$ ldd --version
ldd (Ubuntu GLIBC 2.35-0ubuntu3.3) 2.35

The Arm lldb bot has been failing with:

Testing Time: 493.49s
  Unsupported      :  485
  Passed           : 2264
  Expectedly Failed:   25
  Unresolved       :    1
  Failed           :  101

Tests fail with various things along the theme of:

error: Can't evaluate the expression without a running target due to: Interpreter doesn't handle one of the expression's opcodes

First happened here, the changes are unrelated:
https://lab.llvm.org/buildbot/#/builders/17/builds/44298

For reasons I still cannot explain, the fix for this Ubuntu issue (related to gdb) on Jammy caused this failure:
https://bugs.launchpad.net/ubuntu/+source/gdb/+bug/1927192

The funny thing is, back when this was fixed on Bionic we also saw this, figured it was a bug there, and moved the bot to Jammy. Now the GDB issue is fixed on Jammy too and lldb has broken again. So clearly we are in the wrong here.

The background of that bug is not so important once I explain what lldb is doing, though it is along the same lines:
it was preventing gdb from putting breakpoints in ld-linux.

Reproducer:

$ cat /tmp/test.c
int fn() { return 99; }

int main() { return 0; }
$ ./bin/lldb /tmp/test.o -o "process launch -m" -o "expr -- fn()"
Process 275137 launched: '/tmp/test.o' (arm)
(lldb) expr -- fn()
error: Can't evaluate the expression without a running target due to: Interpreter doesn't handle one of the expression's opcodes

Here's what happens.

  • LLDB wants to run an expression, the first choice is to JIT it.
  • To JIT it we need to allocate some memory to JIT into.
  • First choice for that is the _M packet sent to lldb-server.
  • This packet is only implemented for x86_64.
  • We fall back to instead doing a direct call to mmap.
  • mmap is looked up, we pick one of the symbols found and call PrepareTrivialCall.
  • PrepareTrivialCall must set up the PC to point to the entry of the function,
    set up the arguments according to the ABI and possibly change the CPSR (current program status register) to select ARM or Thumb mode.
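
The fallback order in the list above can be sketched as a small decision function. This is only a model of the bullet points: the names `AllocStrategy`, `TargetCaps` and `ChooseJitAllocation` are hypothetical and are not LLDB's API.

```cpp
// Hypothetical model of the fallback order listed above. None of these
// names are LLDB's real API; they only mirror the bullet points.
enum class AllocStrategy { MemoryPacket, CallMmap, Unavailable };

struct TargetCaps {
  bool server_handles_M_packet; // does lldb-server implement "_M" here?
  bool mmap_symbol_found;       // did symbol lookup find an mmap to call?
};

// Pick the first strategy that is available, in the order from the list.
constexpr AllocStrategy ChooseJitAllocation(TargetCaps caps) {
  return caps.server_handles_M_packet ? AllocStrategy::MemoryPacket
         : caps.mmap_symbol_found     ? AllocStrategy::CallMmap
                                      : AllocStrategy::Unavailable;
}
```

On the Arm bot the first field would be false and the second true, so the model picks `CallMmap`, which is the path that then goes through PrepareTrivialCall.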

And here is where the fun starts.

On 32 bit Arm we have two code modes, ARM and Thumb; Thumb is the compressed instruction set. If we look at the mmap symbol
that lldb chooses, it's in ./arm-linux-gnueabihf/ld-linux-armhf.so.3.

0001686c <__mmap>:
   1686c:       b530            push    {r4, r5, lr}
   1686e:       9d04            ldr     r5, [sp, #16]
   16870:       f3c5 040b       ubfx    r4, r5, #0, #12
   16874:       b954            cbnz    r4, 1688c <__mmap+0x20>

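See those instruction addresses? The encodings are 2 bytes wide and the second instruction sits at 0x1686e, which is only 2 byte aligned. ARM instructions are always 4 bytes wide and 4 byte aligned, so this must be a Thumb mode function (Thumb only requires 2 byte alignment). Note that the start address 0x1686c itself is 4 byte aligned.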

Anyway, the point is that the usual trick of "bit 0 or 1 is set means Thumb" doesn't work here, since the symbol doesn't
have the bottom bits set. This should be fine because we can look up the function's type from the symbol, or the type of the section (it is not important which here).
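
A minimal sketch of that address-only trick shows why it cannot settle the question for this symbol. `IsaHint` and `HintFromAddress` are hypothetical names for illustration, not anything in lldb.

```cpp
#include <cstdint>

// What the low bits of an address alone can tell us on 32 bit Arm.
enum class IsaHint { Thumb, NotArm, Unknown };

constexpr IsaHint HintFromAddress(std::uint32_t addr) {
  return (addr & 1u)   ? IsaHint::Thumb    // bit 0 set: marked as Thumb
         : (addr & 2u) ? IsaHint::NotArm   // not 4 byte aligned, so not ARM
                       : IsaHint::Unknown; // 4 byte aligned: could be either
}
```

For the `__mmap` symbol value 0x1686c this returns `Unknown`, because the value is 4 byte aligned and has bit 0 clear. The second instruction's address, 0x1686e, returns `NotArm`.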

Great, do we do that? We try to, but it doesn't work. There are no loaded sections for ld-linux for us to find the mmap in.

(lldb) memory region --all
[0x0000000000000000-0x0000000000400000) ---
[0x0000000000400000-0x0000000000401000) r-x /tmp/test.o PT_LOAD[0]
[0x0000000000401000-0x0000000000410000) ---
[0x0000000000410000-0x0000000000411000) r-- /tmp/test.o
[0x0000000000411000-0x0000000000412000) rw- /tmp/test.o PT_LOAD[1]
[0x0000000000412000-0x00000000f7e90000) ---
[0x00000000f7e90000-0x00000000f7f9c000) r-x /usr/lib/arm-linux-gnueabihf/libc.so.6 PT_LOAD[0]
[0x00000000f7f9c000-0x00000000f7fac000) --- /usr/lib/arm-linux-gnueabihf/libc.so.6
[0x00000000f7fac000-0x00000000f7fae000) r-- /usr/lib/arm-linux-gnueabihf/libc.so.6
[0x00000000f7fae000-0x00000000f7faf000) rw- /usr/lib/arm-linux-gnueabihf/libc.so.6
[0x00000000f7faf000-0x00000000f7fb9000) rw-
[0x00000000f7fb9000-0x00000000f7fbf000) ---
[0x00000000f7fbf000-0x00000000f7fde000) r-x /usr/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3
[0x00000000f7fde000-0x00000000f7fe2000) ---
[0x00000000f7fe2000-0x00000000f7fe4000) rw-
[0x00000000f7fe4000-0x00000000f7fec000) ---
[0x00000000f7fec000-0x00000000f7fed000) r-x [sigpage]
[0x00000000f7fed000-0x00000000f7fef000) r-- /usr/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3
[0x00000000f7fef000-0x00000000f7ff0000) rw- /usr/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3
[0x00000000f7ff0000-0x00000000fffcf000) ---
[0x00000000fffcf000-0x00000000ffff0000) rw- [stack]
[0x00000000ffff0000-0x00000000ffff1000) r-x [vectors]
[0x00000000ffff1000-0xffffffffffffffff) ---

In the output above, anything with a name after the file name is a section we know about.
So we have some for test.o and libc.so.6 but nothing for ld-linux.

This is why we are able to resolve the fake return address we give for mmap, which is the symbol
_start from the libc. We do have a section for that, so when we resolve it the Thumb bit is placed correctly.

For mmap we have no idea, so the address comes back unchanged and PrepareTrivialCall decides it must
be ARM. It doesn't set the T bit in CPSR, jumps to mmap and immediately SIGILLs because we're trying to run Thumb code in ARM mode.
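
The two steps described here can be modelled roughly as follows. This is a sketch under assumptions, not lldb's code: `MakeCallable` and `CpsrForCall` are hypothetical names, and the enum only mirrors the idea of an address class. The one hard fact used is that the T bit is bit 5 of the Arm CPSR.

```cpp
#include <cstdint>

// Hypothetical stand-in for the class lldb derives from a loaded section.
enum class AddressClass { Unknown, Code, CodeAlternateISA };

constexpr std::uint32_t kCpsrThumbBit = 1u << 5; // T bit of the Arm CPSR

// Resolving an address: only a known Thumb address gets bit 0 set.
// With no section information the address comes back unchanged.
constexpr std::uint32_t MakeCallable(std::uint32_t addr, AddressClass cls) {
  return cls == AddressClass::CodeAlternateISA ? (addr | 1u) : addr;
}

// Choosing the mode for the call: bit 0 of the resolved address decides
// whether the T bit is set or cleared.
constexpr std::uint32_t CpsrForCall(std::uint32_t cpsr, std::uint32_t callable) {
  return (callable & 1u) ? (cpsr | kCpsrThumbBit) : (cpsr & ~kCpsrThumbBit);
}
```

With `AddressClass::Unknown` the address is returned as-is and the T bit ends up clear, which is the ARM mode jump into Thumb code that produces the SIGILL.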

So, let's just load the sections for ld-linux, right?

Well, it turns out we actually do, then we throw them away. This patch fixes the whole issue (for Arm at least):

diff --git a/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
index 85d7ae9dac75..7a228d0d9ebf 100644
--- a/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
+++ b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
@@ -459,15 +459,6 @@ void DynamicLoaderPOSIXDYLD::RefreshModules() {
         } else if (module_sp == interpreter_sp) {
           // Module already loaded.
           continue;
-        } else {
-          // If this is a duplicate instance of ld.so, unload it.  We may end
-          // up with it if we load it via a different path than before
-          // (symlink vs real path).
-          // TODO: remove this once we either fix library matching or avoid
-          // loading the interpreter when setting the rendezvous breakpoint.
-          UnloadSections(module_sp);
-          loaded_modules.Remove(module_sp);
-          continue;
         }
       }

Now you can run expressions successfully. If we look at the memory regions again:

(lldb) memory region --all
[0x0000000000000000-0x0000000000400000) ---
[0x0000000000400000-0x0000000000401000) r-x /tmp/test.o PT_LOAD[0]
[0x0000000000401000-0x0000000000410000) ---
[0x0000000000410000-0x0000000000411000) r-- /tmp/test.o
[0x0000000000411000-0x0000000000412000) rw- /tmp/test.o PT_LOAD[1]
[0x0000000000412000-0x00000000f7e90000) ---
[0x00000000f7e90000-0x00000000f7f9c000) r-x /usr/lib/arm-linux-gnueabihf/libc.so.6 PT_LOAD[0]
[0x00000000f7f9c000-0x00000000f7fac000) --- /usr/lib/arm-linux-gnueabihf/libc.so.6
[0x00000000f7fac000-0x00000000f7fae000) r-- /usr/lib/arm-linux-gnueabihf/libc.so.6
[0x00000000f7fae000-0x00000000f7faf000) rw- /usr/lib/arm-linux-gnueabihf/libc.so.6
[0x00000000f7faf000-0x00000000f7fb9000) rw-
[0x00000000f7fb9000-0x00000000f7fbf000) ---
[0x00000000f7fbf000-0x00000000f7fde000) r-x /usr/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3 PT_LOAD[0]
[0x00000000f7fde000-0x00000000f7fe2000) ---
[0x00000000f7fe2000-0x00000000f7fe4000) rw-
[0x00000000f7fe4000-0x00000000f7fe8000) ---
[0x00000000f7fe8000-0x00000000f7fe9000) r-- objc_imageinfo
[0x00000000f7fe9000-0x00000000f7fea000) rw- .bss
[0x00000000f7fea000-0x00000000f7feb000) r-x .text
[0x00000000f7feb000-0x00000000f7fec000) rwx
[0x00000000f7fec000-0x00000000f7fed000) r-x [sigpage]
[0x00000000f7fed000-0x00000000f7fef000) r-- /usr/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3
[0x00000000f7fef000-0x00000000f7ff0000) rw- /usr/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3 PT_LOAD[1]
[0x00000000f7ff0000-0x00000000fffcf000) ---
[0x00000000fffcf000-0x00000000ffff0000) rw- [stack]
[0x00000000ffff0000-0x00000000ffff1000) r-x [vectors]
[0x00000000ffff1000-0xffffffffffffffff) ---

Now the ld-linux regions at 0x00000000f7fbf000 and 0x00000000f7fef000 have sections associated with them. mmap is at 0xf7fd586c.

Before patch:

(lldb) memory region 0xf7fd586c
[0x00000000f7fbf000-0x00000000f7fde000) r-x /usr/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3

After patch:

(lldb) memory region 0xf7fd586c
[0x00000000f7fbf000-0x00000000f7fde000) r-x /usr/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3 PT_LOAD[0]
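
As a worked check of the numbers above, here is a small sketch that looks up mmap's address in the ld-linux regions from the "after patch" listing. The `Region` type and helper names are hypothetical. It also assumes, as the numbers suggest, that the symbol value 0x1686c from the disassembly is an offset from the start of the first ld-linux region.

```cpp
#include <cstdint>

struct Region {
  std::uint32_t start; // inclusive
  std::uint32_t end;   // exclusive
  const char *section; // nullptr when the listing shows no section name
};

// The three ld-linux regions copied from the "after patch" listing.
constexpr Region kLdLinuxAfterPatch[] = {
    {0xf7fbf000u, 0xf7fde000u, "PT_LOAD[0]"},
    {0xf7fed000u, 0xf7fef000u, nullptr},
    {0xf7fef000u, 0xf7ff0000u, "PT_LOAD[1]"},
};
constexpr int kRegionCount =
    sizeof(kLdLinuxAfterPatch) / sizeof(kLdLinuxAfterPatch[0]);

// Index of the region containing addr, or -1 if none of them does.
constexpr int FindRegion(std::uint32_t addr) {
  for (int i = 0; i < kRegionCount; ++i)
    if (addr >= kLdLinuxAfterPatch[i].start && addr < kLdLinuxAfterPatch[i].end)
      return i;
  return -1;
}

// Distance of addr from the start of the given region.
constexpr std::uint32_t OffsetInRegion(std::uint32_t addr, int index) {
  return addr - kLdLinuxAfterPatch[index].start;
}
```

`FindRegion(0xf7fd586c)` selects the first entry, and the offset within it is 0x1686c, matching the `__mmap` symbol value in the disassembly (0xf7fbf000 + 0x1686c = 0xf7fd586c).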

Having a section allows us to resolve the address and set the correct mode. Its AddressClass goes from eUnknown to eCodeAlternateISA (alternate being Thumb here). PrepareTrivialCall sets CPSR.T correctly and it all works.

I have no idea why the test suite ever passed on previous versions of Jammy, and am unable to test it because lldb appears to try to treat the older Jammy's ld-linux as ARM code, meaning I can't start a program because breaking inside of it doesn't work (again, no idea how the test suite managed to run).

This unloading of ld-linux happens on AArch64 also but there we have no reason to need the details of the mmap symbol. It's enough to know its address.

This unloading was added by 5535582:

"The change in RefreshModules ensures we don't broadcast the loaded
notification for the dynamic loader (ld.so) module more than once."

The reason lldb believes it has already loaded the ld-linux (or at least, told the user it has) is that m_interpreter_address in Dyld is set from AUXV_AT_BASE in DynamicLoaderPOSIXDYLD::EvalSpecialModulesStatus. This happens before any shared objects have been looked at.

I also confirmed that there is only one point at which ld-linux is added. So at least for this distro, there are not multiple copies of it that we have to ignore.

@DavidSpickett DavidSpickett self-assigned this Oct 13, 2023
@DavidSpickett
Collaborator Author

Immediate action from me will be to make the buildbot silent.

@DavidSpickett
Collaborator Author

@labath Do you remember what the intent was with 5535582?

I do not know what these broadcasts are, or what harm broadcasting one about ld-linux twice would do. Is there a way I can write a test case to observe them?

@clayborg You were in this area recently, any ideas here?

(and I did confirm that your recent change 07c215e is not to blame here)

@DavidSpickett
Collaborator Author

DavidSpickett commented Oct 16, 2023

Thinking more about it, I just need to draw the distinction between m_interpreter_address being set from auxv and it being set from a loaded SO, once I can confirm that the value set from auxv does not generate one of these broadcasts.

Turns out we load the interpreter first in DynamicLoaderPOSIXDYLD::LoadInterpreterModule, so it is loaded from the SO twice. I will have to figure out how to not destroy the information from the first one in the process of unloading the second.

DavidSpickett added a commit to DavidSpickett/llvm-project that referenced this issue Oct 23, 2023
Fixes llvm#68987

Early on we load the interpreter (most commonly ld-linux) in
LoadInterpreterModule. Then later when we get the first DYLD
rendezvous we get a list of libraries that commonly includes ld-linux
again.

Previously we would load this duplicate, see that it was a duplicate,
and unload it.

Problem was that this unloaded the section information of the first
copy of ld-linux. On platforms where you can place a breakpoint
using only an address, this wasn't an issue.

On ARM you have ARM and Thumb modes. We must know which one the section
we're breaking in is, otherwise we'll go there in the wrong mode and SIGILL.
This happened on ARM when lldb tried to call mmap during expression
evaluation.

To fix this, I am making the assumption that the base address we see
in the module prior to loading can be compared with what we know
the interpreter base address is. Then we don't have to load the
module to know we can ignore it.

This fixes the lldb test suite on Ubuntu versions where
https://bugs.launchpad.net/ubuntu/+source/gdb/+bug/1927192 has been fixed.
Which was recently done on Jammy.
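
The idea in the commit message, deciding from the base address alone whether an entry is the interpreter, can be sketched like this. It is a model under assumptions rather than the actual patch: `SOEntry`, `IsInterpreterDuplicate` and `CountModulesToLoad` are hypothetical names, and the sample list assumes the interpreter base equals the start of the first ld-linux region in the listings (0xf7fbf000).

```cpp
#include <cstdint>

// Hypothetical stand-in for one library entry from the DYLD rendezvous.
struct SOEntry {
  std::uint32_t base_addr;
  const char *path;
};

// Illustrative list only: base addresses taken from the region listings.
constexpr SOEntry kRendezvousList[] = {
    {0xf7e90000u, "/usr/lib/arm-linux-gnueabihf/libc.so.6"},
    {0xf7fbf000u, "/usr/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3"},
};

// An entry at the known interpreter base is the copy we already loaded,
// so it can be skipped without loading (and later unloading) it.
constexpr bool IsInterpreterDuplicate(const SOEntry &entry,
                                      std::uint32_t interpreter_base) {
  return entry.base_addr == interpreter_base;
}

// Count the entries that still need to be loaded.
constexpr int CountModulesToLoad(const SOEntry *entries, int count,
                                 std::uint32_t interpreter_base) {
  int to_load = 0;
  for (int i = 0; i < count; ++i)
    if (!IsInterpreterDuplicate(entries[i], interpreter_base))
      ++to_load;
  return to_load;
}
```

Because the duplicate is never loaded in this model, nothing has to be unloaded afterwards, so the section information from the first copy of ld-linux stays intact.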
DavidSpickett added a commit that referenced this issue Oct 25, 2023
…ing them (#69932)

Fixes #68987
