Skip to content

Conversation

sedymrak
Copy link
Contributor

@sedymrak sedymrak commented Oct 15, 2025

Fix the RegisterValue::SetValueFromData method so that it works also for 128-bit registers that contain integers.

Without this change, the RegisterValue::SetValueFromData method does not work correctly
for 128-bit registers that contain (signed or unsigned) integers.


Steps to reproduce the problem:

(1)

Create a program that writes a 128-bit number to a 128-bit registers xmm0. E.g.:

#include <stdint.h>

int main() {
  __asm__ volatile (
      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
      :
      : [lo]"r"(0x7766554433221100),
        [hi]"r"(0xffeeddccbbaa9988)
  );
  return 0;
}

(2)

Compile this program with LLVM compiler:

$ $YOUR/clang -g -o main main.c

(3)

Modify LLDB so that when it will be reading value from the xmm0 register, instead of assuming that it is vector register, it will treat it as if it contain an integer. This can be achieved e.g. this way:

diff --git a/lldb/source/Utility/RegisterValue.cpp b/lldb/source/Utility/RegisterValue.cpp
index 0e99451c3b70..a4b51db3e56d 100644
--- a/lldb/source/Utility/RegisterValue.cpp
+++ b/lldb/source/Utility/RegisterValue.cpp
@@ -188,6 +188,7 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     break;
   case eEncodingUint:
   case eEncodingSint:
+  case eEncodingVector:
     if (reg_info.byte_size == 1)
       SetUInt8(src.GetMaxU32(&src_offset, src_len));
     else if (reg_info.byte_size <= 2)
@@ -217,23 +218,6 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     else if (reg_info.byte_size == sizeof(long double))
       SetLongDouble(src.GetLongDouble(&src_offset));
     break;
-  case eEncodingVector: {
-    m_type = eTypeBytes;
-    assert(reg_info.byte_size <= kMaxRegisterByteSize);
-    buffer.bytes.resize(reg_info.byte_size);
-    buffer.byte_order = src.GetByteOrder();
-    if (src.CopyByteOrderedData(
-            src_offset,          // offset within "src" to start extracting data
-            src_len,             // src length
-            buffer.bytes.data(), // dst buffer
-            buffer.bytes.size(), // dst length
-            buffer.byte_order) == 0) // dst byte order
-    {
-      error = Status::FromErrorStringWithFormat(
-          "failed to copy data for register write of %s", reg_info.name);
-      return error;
-    }
-  }
   }
 
   if (m_type == eTypeInvalid)

(4)

Rebuild the LLDB.

(5)

Observe what happens how LLDB will print the content of this register after it was initialized with 128-bit value.

$YOUR/lldb --source ./main
(lldb) target create main
Current executable set to '.../main' (x86_64).
(lldb) breakpoint set --file main.c --line 11
Breakpoint 1: where = main`main + 45 at main.c:11:3, address = 0x000000000000164d
(lldb) settings set stop-line-count-before 20
(lldb) process launch
Process 2568735 launched: '.../main' (x86_64)
Process 2568735 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x000055555555564d main`main at main.c:11:3
   1   	#include <stdint.h>
   2   	
   3   	int main() {
   4   	  __asm__ volatile (
   5   	      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
   6   	      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
   7   	      :
   8   	      : [lo]"r"(0x7766554433221100),
   9   	        [hi]"r"(0xffeeddccbbaa9988)
   10  	  );
-> 11  	  return 0;
   12  	}
(lldb) register read --format hex xmm0
    xmm0 = 0x7766554433221100ffeeddccbbaa9988

You can see that the upper and lower 64-bit wide halves are swapped.

Fix the "RegisterValue::SetValueFromData" so that it works
also for 128-bit registers that contain integers.
@llvmbot
Copy link
Member

llvmbot commented Oct 15, 2025

@llvm/pr-subscribers-lldb

Author: Matej Košík (sedymrak)

Changes

Fix the "RegisterValue::SetValueFromData" so that it works also for 128-bit registers that contain integers.

Without this change, the RegisterValue::SetValueFromData method does not work correctly
for 128-bit registers that contain (signed or unsigned) integers.

The original version displays the content of the 128-bit registers so that its lower and upper 64-bit halves are swapped.


Full diff: https://github.com/llvm/llvm-project/pull/163646.diff

1 Files Affected:

  • (modified) lldb/source/Utility/RegisterValue.cpp (+1-1)
diff --git a/lldb/source/Utility/RegisterValue.cpp b/lldb/source/Utility/RegisterValue.cpp
index 0e99451c3b700..12c349a143c0f 100644
--- a/lldb/source/Utility/RegisterValue.cpp
+++ b/lldb/source/Utility/RegisterValue.cpp
@@ -199,7 +199,7 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     else if (reg_info.byte_size <= 16) {
       uint64_t data1 = src.GetU64(&src_offset);
       uint64_t data2 = src.GetU64(&src_offset);
-      if (src.GetByteOrder() == eByteOrderBig) {
+      if (src.GetByteOrder() == eByteOrderLittle) {
         int128.x[0] = data1;
         int128.x[1] = data2;
       } else {

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understood this, the code is trying to make the order inside the APInt always:
least significant 64-bit element, most significant 64-bit element

Because internally APInt expects this no matter the platform endian? Haven't verified that but it's implied.

The first element you'd get from a big endian extractor is actually the most significant, so the two elements need to be reversed.

Which was being done but the wrong way around. Little endian is fine, big endian needs to be swapped.

How did you hit this? I thought I could reach it by running an expression on AArch64 that used a v register, but because I need to use float or double to do that, it's always marked as a float type and takes that path.

The fix seems correct but I'd like to be able to test this. Though it will likely only get run with little endian, it'd be something.

Might be able to make something synthetic via the SBAPI.

@sedymrak
Copy link
Contributor Author

sedymrak commented Oct 16, 2025

Steps to reproduce the problem:

(1)

Create a program that writes a 128-bit number to a 128-bit registers xmm0. E.g.:

#include <stdint.h>

int main() {
  __asm__ volatile (
      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
      :
      : [lo]"r"(0x7766554433221100),
        [hi]"r"(0xffeeddccbbaa9988)
  );
  return 0;
}

(2)

Compile this program with LLVM compiler:

$ $YOUR/clang -g -o main main.c

(3)

Modify LLDB so that when it will be reading value from the xmm0 register, instead of assuming that it is vector register, it will treat it as if it contain an integer. This can be achieved e.g. this way:

diff --git a/lldb/source/Utility/RegisterValue.cpp b/lldb/source/Utility/RegisterValue.cpp
index 0e99451c3b70..a4b51db3e56d 100644
--- a/lldb/source/Utility/RegisterValue.cpp
+++ b/lldb/source/Utility/RegisterValue.cpp
@@ -188,6 +188,7 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     break;
   case eEncodingUint:
   case eEncodingSint:
+  case eEncodingVector:
     if (reg_info.byte_size == 1)
       SetUInt8(src.GetMaxU32(&src_offset, src_len));
     else if (reg_info.byte_size <= 2)
@@ -217,23 +218,6 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     else if (reg_info.byte_size == sizeof(long double))
       SetLongDouble(src.GetLongDouble(&src_offset));
     break;
-  case eEncodingVector: {
-    m_type = eTypeBytes;
-    assert(reg_info.byte_size <= kMaxRegisterByteSize);
-    buffer.bytes.resize(reg_info.byte_size);
-    buffer.byte_order = src.GetByteOrder();
-    if (src.CopyByteOrderedData(
-            src_offset,          // offset within "src" to start extracting data
-            src_len,             // src length
-            buffer.bytes.data(), // dst buffer
-            buffer.bytes.size(), // dst length
-            buffer.byte_order) == 0) // dst byte order
-    {
-      error = Status::FromErrorStringWithFormat(
-          "failed to copy data for register write of %s", reg_info.name);
-      return error;
-    }
-  }
   }
 
   if (m_type == eTypeInvalid)

(4)

Observe what happens how LLDB will print the content of this register after it was initialized with 128-bit value.

$YOUR/lldb --source ./main
(lldb) target create main
Current executable set to '.../main' (x86_64).
(lldb) breakpoint set --file main.c --line 11
Breakpoint 1: where = main`main + 45 at main.c:11:3, address = 0x000000000000164d
(lldb) settings set stop-line-count-before 20
(lldb) process launch
Process 2568735 launched: '.../main' (x86_64)
Process 2568735 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x000055555555564d main`main at main.c:11:3
   1   	#include <stdint.h>
   2   	
   3   	int main() {
   4   	  __asm__ volatile (
   5   	      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
   6   	      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
   7   	      :
   8   	      : [lo]"r"(0x7766554433221100),
   9   	        [hi]"r"(0xffeeddccbbaa9988)
   10  	  );
-> 11  	  return 0;
   12  	}
(lldb) register read --format hex xmm0
    xmm0 = 0x7766554433221100ffeeddccbbaa9988

You can see that the upper and lower 64-bit wide halves are swapped.

@DavidSpickett
Copy link
Collaborator

Thanks for the reproducer. Please include it in the PR description as well, as it may be the closest thing we get to a test case.

I'm still not clear how you found this, but if the answer is:

  • You spotted it by reading code
  • You have a downstream target with registers like this
  • You are working on some NDA'd thing

Then the conclusion is the same. Hard to test this with an upstream target.

We have faked targets in testing before, so I think we could fake an AArch64 where V registers were not vectors, as this type is set in the XML sent from the debug server. lldb/test/API/functionalities/gdb_remote_client/TestXMLRegisterFlags.py is an example of this.

Then again, I found that there's no public API for this call. Running an expression on a fake target like that could work but it'd be a lot of mock stuff to implement (for a start, we'd need to pretend it can JIT code).

We have some tests that put a simulated debug sever in front of a real one, but that's too much effort for this fix.

Let me think about it for a bit. If I don't come up with anything then I'll approve this, as it does look correct to me.

@DavidSpickett DavidSpickett changed the title [lldb] fix the "RegisterValue::SetValueFromData" method [lldb] fix the "RegisterValue::SetValueFromData" method for 128-bit integer registers Oct 16, 2025
@sedymrak
Copy link
Contributor Author

sedymrak commented Oct 16, 2025

Thanks for the reproducer. Please include it in the PR description as well, as it may be the closest thing we get to a test case.

I am not sure if I understand correctly what you meant I should do.
I have edited my first post on this matter to include the instructions how one can reproduce the problem.
Is this what you meant?

I'm still not clear how you found this, but if the answer is:

* You spotted it by reading code

* You have a downstream target with registers like this

* You are working on some NDA'd thing

The problem floated to the surface when we adapted LLDB for debugging programs written in a custom programming language that supports non-standard integer sizes for registers as well as variables that, in this case, are treated as registers.

Then the conclusion is the same. Hard to test this with an upstream target.

I am not aware of any platform where this bug could be directly reproduced.

I am not sure how much it would help, but I plan (tomorrow) to look at related unit-tests. I need to know:

  • Why they did not scream when I changed the RegisterValue::SetValueFromData method?
  • Can I add a test that will cover the faulty branch?

At worst I will learn why it cannot be done...

We have faked targets in testing before, so I think we could fake an AArch64 where V registers were not vectors, as this type is set in the XML sent from the debug server. lldb/test/API/functionalities/gdb_remote_client/TestXMLRegisterFlags.py is an example of this.

Then again, I found that there's no public API for this call. Running an expression on a fake target like that could work but it'd be a lot of mock stuff to implement (for a start, we'd need to pretend it can JIT code).

We have some tests that put a simulated debug sever in front of a real one, but that's too much effort for this fix.

Let me think about it for a bit. If I don't come up with anything then I'll approve this, as it does look correct to me.

+1

@DavidSpickett
Copy link
Collaborator

Is this what you meant?

Yes, thanks.

Why they did not scream when I changed the RegisterValue::SetValueFromData method?

I doubt this ever had coverage because no target ever tried this. Those unit tests look promising though, if we can set the type of the register value.

@DavidSpickett
Copy link
Collaborator

DavidSpickett commented Oct 17, 2025

My idea with the fake target is going to be way too much work, the unittests is a better idea.

I got this to fail:

diff --git a/lldb/unittests/Utility/RegisterValueTest.cpp b/lldb/unittests/Utility/RegisterValueTest.cpp
index e0863a41605e..55fb3a2b05ad 100644
--- a/lldb/unittests/Utility/RegisterValueTest.cpp
+++ b/lldb/unittests/Utility/RegisterValueTest.cpp
@@ -7,6 +7,9 @@
 //===----------------------------------------------------------------------===//
 
 #include "lldb/Utility/RegisterValue.h"
+#include "lldb/Utility/DataExtractor.h"
+#include "lldb/lldb-enumerations.h"
+#include "lldb/lldb-private-types.h"
 #include "gtest/gtest.h"
 #include <optional>
 
@@ -54,3 +57,28 @@ TEST(RegisterValueTest, GetScalarValue) {
             Scalar((APInt(128, 0xffeeddccbbaa9988ull) << 64) |
                    APInt(128, 0x7766554433221100)));
 }
+
+TEST(RegisterValueTest, SetValueFromData) {
+  RegisterValue rv;
+  RegisterInfo ri {
+    "uint128_register", nullptr, 16, 0, lldb::Encoding::eEncodingUint,
+    lldb::Format::eFormatDefault, {0, 0, 0, LLDB_INVALID_REGNUM, 0},
+    nullptr, nullptr, nullptr
+  };
+
+  uint64_t src[] = {0x7766554433221100, 0xffeeddccbbaa9988};
+  DataExtractor src_extractor(src, 16, lldb::ByteOrder::eByteOrderLittle, 8);
+  rv.SetValueFromData(ri, src_extractor, /*src_offset=*/0, /*partial_data_ok=*/false);
+
+  // Copied from above.
+  const auto &Get = [](const RegisterValue &V) -> std::optional<Scalar> {
+    Scalar S;
+    if (V.GetScalarValue(S))
+      return S;
+    return std::nullopt;
+  };
+
+  EXPECT_EQ(Get(rv),
+    Scalar((APInt(128, 0xffeeddccbbaa9988ull) << 64) |
+                   APInt(128, 0x7766554433221100ull)));
+}

You can use that with some cleanup if you can confirm that your change fixes it. May need repeating for little and big endian extractors, though that may also swap the order within the uint64s so watch out for that.

@sedymrak
Copy link
Contributor Author

I came to the same conclusion a little later than you.
The test I was experimented with:

--- a/lldb/unittests/Utility/RegisterValueTest.cpp
+++ b/lldb/unittests/Utility/RegisterValueTest.cpp
@@ -7,6 +7,8 @@
 //===----------------------------------------------------------------------===//
 
 #include "lldb/Utility/RegisterValue.h"
+#include "lldb/lldb-private-types.h"
+#include "lldb/Utility/DataExtractor.h"
 #include "gtest/gtest.h"
 #include <optional>
 
@@ -54,3 +56,21 @@ TEST(RegisterValueTest, GetScalarValue) {
             Scalar((APInt(128, 0xffeeddccbbaa9988ull) << 64) |
                    APInt(128, 0x7766554433221100)));
 }
+
+TEST(RegisterValueTest, SetValueFromData) {
+  uint8_t bytes[] = {0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
+                     0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff};
+  lldb::offset_t length = sizeof(bytes);
+  lldb_private::DataExtractor data(bytes, length, lldb::eByteOrderLittle,
+                                   sizeof(nullptr));
+  lldb_private::RegisterInfo reg_info;
+  reg_info.byte_size = 16;
+  reg_info.byte_offset = 0;
+  reg_info.encoding = lldb::eEncodingUint;
+  RegisterValue rv128;
+  EXPECT_TRUE(rv128.SetValueFromData(reg_info, data, 0, false).Success());
+  Scalar s128;
+  EXPECT_TRUE(rv128.GetScalarValue(s128));
+  EXPECT_TRUE(s128 == Scalar((APInt(128, 0xffeeddccbbaa9988ull) << 64) |
+                             APInt(128, 0x7766554433221100ull)));
+}

nicely failed on the main branch but it succeed in the the branch that contains the fix.

I will think about it and push something today.

@github-actions
Copy link

github-actions bot commented Oct 17, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good just one refactoring suggestion.

I like the design of having one target value and changing the inputs. I've tried to do it the other way around in the past and it always seemed harder to comprehend.

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I can merge it for you if needed, let me know.

@sedymrak
Copy link
Contributor Author

LGTM.

I can merge it for you if needed, let me know.

That would be great. If you can, please do that.

@DavidSpickett DavidSpickett changed the title [lldb] fix the "RegisterValue::SetValueFromData" method for 128-bit integer registers [lldb] Fix the "RegisterValue::SetValueFromData" method for 128-bit integer registers Oct 20, 2025
@DavidSpickett
Copy link
Collaborator

This commit will be authored by 99839051+sedymrak@users.noreply.github.com.

Before I can merge this, please update your GitHub email settings - https://llvm.org/docs/DeveloperPolicy.html#email-addresses

(your email does show up in Co-authored-by: but we need it as the author at minimum)

@sedymrak
Copy link
Contributor Author

This commit will be authored by 99839051+sedymrak@users.noreply.github.com.

Before I can merge this, please update your GitHub email settings - https://llvm.org/docs/DeveloperPolicy.html#email-addresses

(your email does show up in Co-authored-by: but we need it as the author at minimum)

No problem.
I've looked at these settings and I have turned the "Keep my email addresses private" setting OFF.
(It was set to "ON" previously).

@DavidSpickett DavidSpickett merged commit b2ad90b into llvm:main Oct 20, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants