Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb] Be conversative about setting highmem address masks #90533

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jasonmolenda
Copy link
Collaborator

The most common case with address masks/addressable bits is that we have one value/mask that applies across the entire address space, for code and data. On AArch64, we can have separate masks for high and low address spaces. In the normal "one mask for all memory" case, we use the low memory masks in Process, and we use the virtual-addressable-bits setting for the user to override that value. When high memory has a different set of masks, those become active in Process and the user can use highmem-virtual-addressable-bits to override only the highmem value, if it is wrong.

This patch is handling a case where a gdb stub sent incorrect address masks, for both high and low memory:

(lldb) process plugin packet send qHostInfo
  packet: qHostInfo
response: cputype:16777228;cpusubtype:0;endian:little;ptrsize:8;low_mem_addressing_bits:64;high_mem_addressing_bits:64;

When in fact this target was using 47 bits of addressing. This qHostInfo response set the Process low and high address masks for code and data so that no bit-clearing is performed. The user tried to override this with the virtual-addressable-bits setting, and was surprised when it had no affect on code running in high memory. Because the high memory code mask now has a setting (all bits are addressable) and they needed to use the very-uncommon highmem-virtual-addressable-bits setting.

When we have an unset high memory address mask, and we are told to set low- and high-memory to the same new address mask, maintain the high memory mask as unset in Process. The same thing is done with the SBProcess::SetAddressMask API when the user specifies lldb.eAddressMaskRangeAll.

Added a new test case to TestAddressMasks.py to test that the low-memory override works correctly. Also fixed a bug in the testsuite that I did where I added @skipIf(archs=["arm"]) to avoid a problem on the Linaro 32-bit arm Ubuntu CI bot. I forgot that this is a regex, and so the tests were being skipped entirely on any arm.* system.

The most common case with address masks/addressable bits is that
we have one value/mask that applies across the entire address space,
for code and data.  On AArch64, we can have separate masks for high
and low address spaces.  In the normal "one mask for all memory"
case, we use the low memory masks in Process, and we use the
`virtual-addressable-bits` setting for the user to override that
value.  When high memory has a different set of masks, those become
active in Process and the user can use `highmem-virtual-addressable-bits`
to override only the highmem value, if it is wrong.

This patch is handling a case where a gdb stub sent incorrect address
masks, for both high and low memory:

```
(lldb) process plugin packet send qHostInfo
  packet: qHostInfo
response: cputype:16777228;cpusubtype:0;endian:little;ptrsize:8;low_mem_addressing_bits:64;high_mem_addressing_bits:64;
```

When in fact this target was using 47 bits of addressing.  This
qHostInfo response set the Process low and high address masks for
code and data so that no bit-clearing is performed.  The user tried
to override this with the virtual-addressable-bits setting, and was
surprised when it had no affect on code running in high memory.
Because the high memory code mask now has a setting (all bits are
addressable) and they needed to use the very-uncommon
highmem-virtual-addressable-bits setting.

When we have an *unset* high memory address mask, and we are told
to set low- and high-memory to the same new address mask, maintain
the high memory mask as unset in Process.  The same thing is done
with the SBProcess::SetAddressMask API when the user specifies
lldb.eAddressMaskRangeAll.

Added a new test case to TestAddressMasks.py to test that the
low-memory override works correctly.  Also fixed a bug in the
testsuite that I did where I added `@skipIf(archs=["arm"])`
to avoid a problem on the Linaro 32-bit arm Ubuntu CI bot.  I
forgot that this is a regex, and so the tests were being skipped
entirely on any arm.* system.
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 29, 2024

@llvm/pr-subscribers-lldb

Author: Jason Molenda (jasonmolenda)

Changes

The most common case with address masks/addressable bits is that we have one value/mask that applies across the entire address space, for code and data. On AArch64, we can have separate masks for high and low address spaces. In the normal "one mask for all memory" case, we use the low memory masks in Process, and we use the virtual-addressable-bits setting for the user to override that value. When high memory has a different set of masks, those become active in Process and the user can use highmem-virtual-addressable-bits to override only the highmem value, if it is wrong.

This patch is handling a case where a gdb stub sent incorrect address masks, for both high and low memory:

(lldb) process plugin packet send qHostInfo
  packet: qHostInfo
response: cputype:16777228;cpusubtype:0;endian:little;ptrsize:8;low_mem_addressing_bits:64;high_mem_addressing_bits:64;

When in fact this target was using 47 bits of addressing. This qHostInfo response set the Process low and high address masks for code and data so that no bit-clearing is performed. The user tried to override this with the virtual-addressable-bits setting, and was surprised when it had no affect on code running in high memory. Because the high memory code mask now has a setting (all bits are addressable) and they needed to use the very-uncommon highmem-virtual-addressable-bits setting.

When we have an unset high memory address mask, and we are told to set low- and high-memory to the same new address mask, maintain the high memory mask as unset in Process. The same thing is done with the SBProcess::SetAddressMask API when the user specifies lldb.eAddressMaskRangeAll.

Added a new test case to TestAddressMasks.py to test that the low-memory override works correctly. Also fixed a bug in the testsuite that I did where I added @<!-- -->skipIf(archs=["arm"]) to avoid a problem on the Linaro 32-bit arm Ubuntu CI bot. I forgot that this is a regex, and so the tests were being skipped entirely on any arm.* system.


Full diff: https://github.com/llvm/llvm-project/pull/90533.diff

3 Files Affected:

  • (modified) lldb/source/API/SBProcess.cpp (+22-4)
  • (modified) lldb/source/Target/Process.cpp (+16-4)
  • (modified) lldb/test/API/python_api/process/address-masks/TestAddressMasks.py (+18-3)
diff --git a/lldb/source/API/SBProcess.cpp b/lldb/source/API/SBProcess.cpp
index c37c111c5a58e0..a0654d23e67efd 100644
--- a/lldb/source/API/SBProcess.cpp
+++ b/lldb/source/API/SBProcess.cpp
@@ -1298,7 +1298,12 @@ void SBProcess::SetAddressMask(AddressMaskType type, addr_t mask,
     case eAddressMaskTypeCode:
       if (addr_range == eAddressMaskRangeAll) {
         process_sp->SetCodeAddressMask(mask);
-        process_sp->SetHighmemCodeAddressMask(mask);
+        // If the highmem mask is the default-invalid,
+        // don't change it, keep everything consistently
+        // using the lowmem all-address-space mask.
+        if (process_sp->GetHighmemCodeAddressMask() !=
+            LLDB_INVALID_ADDRESS_MASK)
+          process_sp->SetHighmemCodeAddressMask(mask);
       } else if (addr_range == eAddressMaskRangeHigh) {
         process_sp->SetHighmemCodeAddressMask(mask);
       } else {
@@ -1308,7 +1313,12 @@ void SBProcess::SetAddressMask(AddressMaskType type, addr_t mask,
     case eAddressMaskTypeData:
       if (addr_range == eAddressMaskRangeAll) {
         process_sp->SetDataAddressMask(mask);
-        process_sp->SetHighmemDataAddressMask(mask);
+        // If the highmem mask is the default-invalid,
+        // don't change it, keep everything consistently
+        // using the lowmem all-address-space mask.
+        if (process_sp->GetHighmemDataAddressMask() !=
+            LLDB_INVALID_ADDRESS_MASK)
+          process_sp->SetHighmemDataAddressMask(mask);
       } else if (addr_range == eAddressMaskRangeHigh) {
         process_sp->SetHighmemDataAddressMask(mask);
       } else {
@@ -1319,8 +1329,16 @@ void SBProcess::SetAddressMask(AddressMaskType type, addr_t mask,
       if (addr_range == eAddressMaskRangeAll) {
         process_sp->SetCodeAddressMask(mask);
         process_sp->SetDataAddressMask(mask);
-        process_sp->SetHighmemCodeAddressMask(mask);
-        process_sp->SetHighmemDataAddressMask(mask);
+        // If the highmem masks are the default-invalid,
+        // don't change them, keep everything consistently
+        // using the lowmem all-address-space masks.
+        if (process_sp->GetHighmemDataAddressMask() !=
+                LLDB_INVALID_ADDRESS_MASK ||
+            process_sp->GetHighmemCodeAddressMask() !=
+                LLDB_INVALID_ADDRESS_MASK) {
+          process_sp->SetHighmemCodeAddressMask(mask);
+          process_sp->SetHighmemDataAddressMask(mask);
+        }
       } else if (addr_range == eAddressMaskRangeHigh) {
         process_sp->SetHighmemCodeAddressMask(mask);
         process_sp->SetHighmemDataAddressMask(mask);
diff --git a/lldb/source/Target/Process.cpp b/lldb/source/Target/Process.cpp
index 30c240b064b59c..a75d7b12520742 100644
--- a/lldb/source/Target/Process.cpp
+++ b/lldb/source/Target/Process.cpp
@@ -6473,9 +6473,21 @@ void Process::SetAddressableBitMasks(AddressableBits bit_masks) {
   }
 
   if (high_memory_addr_bits != 0) {
-    addr_t high_addr_mask =
-        AddressableBits::AddressableBitToMask(high_memory_addr_bits);
-    SetHighmemCodeAddressMask(high_addr_mask);
-    SetHighmemDataAddressMask(high_addr_mask);
+    // If the same high and low mem address bits were specified,
+    // and we don't have a highmem setting for code and data currently,
+    // don't set the highmem masks.
+    // When we have separate high- and low- masks, the user
+    // setting `virtual-addressable-bits` only overrides the low
+    // memory masks, which most users would be surprised by.
+    // Leave the high memory masks unset, to make it clear that only the
+    // low memory masks are active.
+    if (high_memory_addr_bits != low_memory_addr_bits ||
+        m_highmem_code_address_mask != LLDB_INVALID_ADDRESS_MASK ||
+        m_highmem_data_address_mask != LLDB_INVALID_ADDRESS_MASK) {
+      addr_t high_addr_mask =
+          AddressableBits::AddressableBitToMask(high_memory_addr_bits);
+      SetHighmemCodeAddressMask(high_addr_mask);
+      SetHighmemDataAddressMask(high_addr_mask);
+    }
   }
 }
diff --git a/lldb/test/API/python_api/process/address-masks/TestAddressMasks.py b/lldb/test/API/python_api/process/address-masks/TestAddressMasks.py
index 152776efc726f2..7908a46c0fb38c 100644
--- a/lldb/test/API/python_api/process/address-masks/TestAddressMasks.py
+++ b/lldb/test/API/python_api/process/address-masks/TestAddressMasks.py
@@ -19,7 +19,7 @@ def reset_all_masks(self, process):
         self.runCmd("settings set target.process.virtual-addressable-bits 0")
         self.runCmd("settings set target.process.highmem-virtual-addressable-bits 0")
 
-    @skipIf(archs=["arm"])  # 32-bit arm ABI hardcodes Code mask, is 32-bit
+    @skipIf(archs=["arm$"])  # 32-bit arm ABI hardcodes Code mask, is 32-bit
     def test_address_masks(self):
         self.build()
         (target, process, t, bp) = lldbutil.run_to_source_breakpoint(
@@ -80,7 +80,6 @@ def test_address_masks(self):
     # AArch64 can have different address masks for high and low memory, when different
     # page tables are set up.
     @skipIf(archs=no_match(["arm64", "arm64e", "aarch64"]))
-    @skipIf(archs=["arm"])  # 32-bit arm ABI hardcodes Code mask, is 32-bit
     def test_address_masks_target_supports_highmem_tests(self):
         self.build()
         (target, process, t, bp) = lldbutil.run_to_source_breakpoint(
@@ -113,7 +112,7 @@ def test_address_masks_target_supports_highmem_tests(self):
     # On most targets where we have a single mask for all address range, confirm
     # that the high memory masks are ignored.
     @skipIf(archs=["arm64", "arm64e", "aarch64"])
-    @skipIf(archs=["arm"])  # 32-bit arm ABI hardcodes Code mask, is 32-bit
+    @skipIf(archs=["arm$"])  # 32-bit arm ABI hardcodes Code mask, is 32-bit
     def test_address_masks_target_no_highmem(self):
         self.build()
         (target, process, t, bp) = lldbutil.run_to_source_breakpoint(
@@ -132,3 +131,19 @@ def test_address_masks_target_no_highmem(self):
         self.runCmd("settings set target.process.highmem-virtual-addressable-bits 42")
         self.assertEqual(0x0000000000007694, process.FixAddress(0x00265E950001F694))
         self.assertEqual(0xFFFFFFFFFFFFF694, process.FixAddress(0xFFA65E950000F694))
+
+    # On most targets where we have a single mask for all address range, confirm
+    # that the high memory masks are ignored.
+    @skipIf(archs=no_match(["arm64", "arm64e", "aarch64"]))
+    def test_address_unset_highmem_masks_stay_unset(self):
+        self.build()
+        (target, process, t, bp) = lldbutil.run_to_source_breakpoint(
+            self, "break here", lldb.SBFileSpec("main.c")
+        )
+        self.reset_all_masks(process)
+
+        process.SetAddressableBits(
+            lldb.eAddressMaskTypeAll, 64, lldb.eAddressMaskRangeLow
+        )
+        self.runCmd("settings set target.process.virtual-addressable-bits 47")
+        self.assertEqual(0xFFFFFE0044580BC4, process.FixAddress(0xFFE8FE0044580BC4))

@DavidSpickett
Copy link
Collaborator

DavidSpickett commented Apr 30, 2024

On the face of it

When we have an unset high memory address mask, and we are told to set low- and high-memory to the same new address mask, maintain the high memory mask as unset in Process. The same thing is done with the SBProcess::SetAddressMask API when the user specifies lldb.eAddressMaskRangeAll.

Seems to itself be confusing and if you're doing this to address

When high memory has a different set of masks, those become active in Process and the user can use highmem-virtual-addressable-bits to override only the highmem value, if it is wrong.

You're adding one gotcha to avoid another.

However, I think the result of the highmem mask being unset is the same as setting both masks to the same value. Except that virtual-addressable-bits will now effectively apply to both masks, until the user sets only the high mask to some value. At that point, highmem-virtual-addressable-bits must be used to modify the high mem value.

Correct?

So this doesn't actually change anything for an API user that was setting address masks for high and low all at once. They'll still think both are the same value, but internally, lldb tries high, sees that it's unset, and falls back to the low mask.

@jasonmolenda
Copy link
Collaborator Author

On the face of it

When we have an unset high memory address mask, and we are told to set low- and high-memory to the same new address mask, maintain the high memory mask as unset in Process. The same thing is done with the SBProcess::SetAddressMask API when the user specifies lldb.eAddressMaskRangeAll.

Seems to itself be confusing and if you're doing this to address

When high memory has a different set of masks, those become active in Process and the user can use highmem-virtual-addressable-bits to override only the highmem value, if it is wrong.

You're adding one gotcha to avoid another.

However, I think the result of the highmem mask being unset is the same as setting both masks to the same value. Except that virtual-addressable-bits will now effectively apply to both masks, until the user sets only the high mask to some value. At that point, highmem-virtual-addressable-bits must be used to modify the high mem value.

Correct?

So this doesn't actually change anything for an API user that was setting address masks for high and low all at once. They'll still think both are the same value, but internally, lldb tries high, sees that it's unset, and falls back to the low mask.

I'm not thrilled about where I'm ending up.

For the vast majority of our users, the separate address masks for high and low memory will never occur, it's an unusual environment that sets up the MMU this way, and operates in both halves of memory. The goal of this patch is to keep the high memory masks unset (set to LLDB_INVALID_MASK) unless someone sets them to a value distinct from the "low memory" masks.

I almost think about changing these masks in Process to be a class which holds either {code, data} or {low code, low data, high code, high data} internally, with methods to get/set the masks and the requirement that high and low be set simultaneously when they're both going to be specified so we can detect if it's genuinely a split-address space situation.

@DavidSpickett
Copy link
Collaborator

Seems logical to me, keep the 99% use case simple and folks who know they need a high mem mask have to put in the extra effort to really consider it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants