[lldb] Fix the "RegisterValue::SetValueFromData" method for 128-bit integer registers #163646

sedymrak · 2025-10-15T21:58:08Z

Fix the RegisterValue::SetValueFromData method so that it works also for 128-bit registers that contain integers.

Without this change, the RegisterValue::SetValueFromData method does not work correctly
for 128-bit registers that contain (signed or unsigned) integers.

Steps to reproduce the problem:

(1)

Create a program that writes a 128-bit number to a 128-bit registers xmm0. E.g.:

#include <stdint.h>

int main() {
  __asm__ volatile (
      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
      :
      : [lo]"r"(0x7766554433221100),
        [hi]"r"(0xffeeddccbbaa9988)
  );
  return 0;
}

(2)

Compile this program with LLVM compiler:

$ $YOUR/clang -g -o main main.c

(3)

Modify LLDB so that when it will be reading value from the xmm0 register, instead of assuming that it is vector register, it will treat it as if it contain an integer. This can be achieved e.g. this way:

diff --git a/lldb/source/Utility/RegisterValue.cpp b/lldb/source/Utility/RegisterValue.cpp
index 0e99451c3b70..a4b51db3e56d 100644
--- a/lldb/source/Utility/RegisterValue.cpp
+++ b/lldb/source/Utility/RegisterValue.cpp
@@ -188,6 +188,7 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     break;
   case eEncodingUint:
   case eEncodingSint:
+  case eEncodingVector:
     if (reg_info.byte_size == 1)
       SetUInt8(src.GetMaxU32(&src_offset, src_len));
     else if (reg_info.byte_size <= 2)
@@ -217,23 +218,6 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     else if (reg_info.byte_size == sizeof(long double))
       SetLongDouble(src.GetLongDouble(&src_offset));
     break;
-  case eEncodingVector: {
-    m_type = eTypeBytes;
-    assert(reg_info.byte_size <= kMaxRegisterByteSize);
-    buffer.bytes.resize(reg_info.byte_size);
-    buffer.byte_order = src.GetByteOrder();
-    if (src.CopyByteOrderedData(
-            src_offset,          // offset within "src" to start extracting data
-            src_len,             // src length
-            buffer.bytes.data(), // dst buffer
-            buffer.bytes.size(), // dst length
-            buffer.byte_order) == 0) // dst byte order
-    {
-      error = Status::FromErrorStringWithFormat(
-          "failed to copy data for register write of %s", reg_info.name);
-      return error;
-    }
-  }
   }
 
   if (m_type == eTypeInvalid)

(4)

Rebuild the LLDB.

(5)

Observe what happens how LLDB will print the content of this register after it was initialized with 128-bit value.

$YOUR/lldb --source ./main
(lldb) target create main
Current executable set to '.../main' (x86_64).
(lldb) breakpoint set --file main.c --line 11
Breakpoint 1: where = main`main + 45 at main.c:11:3, address = 0x000000000000164d
(lldb) settings set stop-line-count-before 20
(lldb) process launch
Process 2568735 launched: '.../main' (x86_64)
Process 2568735 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x000055555555564d main`main at main.c:11:3
   1   	#include <stdint.h>
   2   	
   3   	int main() {
   4   	  __asm__ volatile (
   5   	      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
   6   	      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
   7   	      :
   8   	      : [lo]"r"(0x7766554433221100),
   9   	        [hi]"r"(0xffeeddccbbaa9988)
   10  	  );
-> 11  	  return 0;
   12  	}
(lldb) register read --format hex xmm0
    xmm0 = 0x7766554433221100ffeeddccbbaa9988

You can see that the upper and lower 64-bit wide halves are swapped.

Fix the "RegisterValue::SetValueFromData" so that it works also for 128-bit registers that contain integers.

llvmbot · 2025-10-15T21:58:40Z

@llvm/pr-subscribers-lldb

Author: Matej Košík (sedymrak)

Changes

Fix the "RegisterValue::SetValueFromData" so that it works also for 128-bit registers that contain integers.

Without this change, the RegisterValue::SetValueFromData method does not work correctly
for 128-bit registers that contain (signed or unsigned) integers.

The original version displays the content of the 128-bit registers so that its lower and upper 64-bit halves are swapped.

Full diff: https://github.com/llvm/llvm-project/pull/163646.diff

1 Files Affected:

(modified) lldb/source/Utility/RegisterValue.cpp (+1-1)

diff --git a/lldb/source/Utility/RegisterValue.cpp b/lldb/source/Utility/RegisterValue.cpp
index 0e99451c3b700..12c349a143c0f 100644
--- a/lldb/source/Utility/RegisterValue.cpp
+++ b/lldb/source/Utility/RegisterValue.cpp
@@ -199,7 +199,7 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     else if (reg_info.byte_size <= 16) {
       uint64_t data1 = src.GetU64(&src_offset);
       uint64_t data2 = src.GetU64(&src_offset);
-      if (src.GetByteOrder() == eByteOrderBig) {
+      if (src.GetByteOrder() == eByteOrderLittle) {
         int128.x[0] = data1;
         int128.x[1] = data2;
       } else {

DavidSpickett

If I understood this, the code is trying to make the order inside the APInt always:
least significant 64-bit element, most significant 64-bit element

Because internally APInt expects this no matter the platform endian? Haven't verified that but it's implied.

The first element you'd get from a big endian extractor is actually the most significant, so the two elements need to be reversed.

Which was being done but the wrong way around. Little endian is fine, big endian needs to be swapped.

How did you hit this? I thought I could reach it by running an expression on AArch64 that used a v register, but because I need to use float or double to do that, it's always marked as a float type and takes that path.

The fix seems correct but I'd like to be able to test this. Though it will likely only get run with little endian, it'd be something.

Might be able to make something synthetic via the SBAPI.

sedymrak · 2025-10-16T16:07:29Z

Steps to reproduce the problem:

(1)

Create a program that writes a 128-bit number to a 128-bit registers xmm0. E.g.:

#include <stdint.h>

int main() {
  __asm__ volatile (
      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
      :
      : [lo]"r"(0x7766554433221100),
        [hi]"r"(0xffeeddccbbaa9988)
  );
  return 0;
}

(2)

Compile this program with LLVM compiler:

$ $YOUR/clang -g -o main main.c

(3)

Modify LLDB so that when it will be reading value from the xmm0 register, instead of assuming that it is vector register, it will treat it as if it contain an integer. This can be achieved e.g. this way:

diff --git a/lldb/source/Utility/RegisterValue.cpp b/lldb/source/Utility/RegisterValue.cpp
index 0e99451c3b70..a4b51db3e56d 100644
--- a/lldb/source/Utility/RegisterValue.cpp
+++ b/lldb/source/Utility/RegisterValue.cpp
@@ -188,6 +188,7 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     break;
   case eEncodingUint:
   case eEncodingSint:
+  case eEncodingVector:
     if (reg_info.byte_size == 1)
       SetUInt8(src.GetMaxU32(&src_offset, src_len));
     else if (reg_info.byte_size <= 2)
@@ -217,23 +218,6 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     else if (reg_info.byte_size == sizeof(long double))
       SetLongDouble(src.GetLongDouble(&src_offset));
     break;
-  case eEncodingVector: {
-    m_type = eTypeBytes;
-    assert(reg_info.byte_size <= kMaxRegisterByteSize);
-    buffer.bytes.resize(reg_info.byte_size);
-    buffer.byte_order = src.GetByteOrder();
-    if (src.CopyByteOrderedData(
-            src_offset,          // offset within "src" to start extracting data
-            src_len,             // src length
-            buffer.bytes.data(), // dst buffer
-            buffer.bytes.size(), // dst length
-            buffer.byte_order) == 0) // dst byte order
-    {
-      error = Status::FromErrorStringWithFormat(
-          "failed to copy data for register write of %s", reg_info.name);
-      return error;
-    }
-  }
   }
 
   if (m_type == eTypeInvalid)

(4)

Observe what happens how LLDB will print the content of this register after it was initialized with 128-bit value.

$YOUR/lldb --source ./main
(lldb) target create main
Current executable set to '.../main' (x86_64).
(lldb) breakpoint set --file main.c --line 11
Breakpoint 1: where = main`main + 45 at main.c:11:3, address = 0x000000000000164d
(lldb) settings set stop-line-count-before 20
(lldb) process launch
Process 2568735 launched: '.../main' (x86_64)
Process 2568735 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x000055555555564d main`main at main.c:11:3
   1   	#include <stdint.h>
   2   	
   3   	int main() {
   4   	  __asm__ volatile (
   5   	      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
   6   	      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
   7   	      :
   8   	      : [lo]"r"(0x7766554433221100),
   9   	        [hi]"r"(0xffeeddccbbaa9988)
   10  	  );
-> 11  	  return 0;
   12  	}
(lldb) register read --format hex xmm0
    xmm0 = 0x7766554433221100ffeeddccbbaa9988

You can see that the upper and lower 64-bit wide halves are swapped.

DavidSpickett · 2025-10-16T17:07:21Z

Thanks for the reproducer. Please include it in the PR description as well, as it may be the closest thing we get to a test case.

I'm still not clear how you found this, but if the answer is:

You spotted it by reading code
You have a downstream target with registers like this
You are working on some NDA'd thing

Then the conclusion is the same. Hard to test this with an upstream target.

We have faked targets in testing before, so I think we could fake an AArch64 where V registers were not vectors, as this type is set in the XML sent from the debug server. lldb/test/API/functionalities/gdb_remote_client/TestXMLRegisterFlags.py is an example of this.

Then again, I found that there's no public API for this call. Running an expression on a fake target like that could work but it'd be a lot of mock stuff to implement (for a start, we'd need to pretend it can JIT code).

We have some tests that put a simulated debug sever in front of a real one, but that's too much effort for this fix.

Let me think about it for a bit. If I don't come up with anything then I'll approve this, as it does look correct to me.

sedymrak · 2025-10-16T21:25:56Z

Thanks for the reproducer. Please include it in the PR description as well, as it may be the closest thing we get to a test case.

I am not sure if I understand correctly what you meant I should do.
I have edited my first post on this matter to include the instructions how one can reproduce the problem.
Is this what you meant?

I'm still not clear how you found this, but if the answer is:
* You spotted it by reading code

* You have a downstream target with registers like this

* You are working on some NDA'd thing

The problem floated to the surface when we adapted LLDB for debugging programs written in a custom programming language that supports non-standard integer sizes for registers as well as variables that, in this case, are treated as registers.

Then the conclusion is the same. Hard to test this with an upstream target.

I am not aware of any platform where this bug could be directly reproduced.

I am not sure how much it would help, but I plan (tomorrow) to look at related unit-tests. I need to know:

Why they did not scream when I changed the RegisterValue::SetValueFromData method?
Can I add a test that will cover the faulty branch?

At worst I will learn why it cannot be done...

We have faked targets in testing before, so I think we could fake an AArch64 where V registers were not vectors, as this type is set in the XML sent from the debug server. lldb/test/API/functionalities/gdb_remote_client/TestXMLRegisterFlags.py is an example of this.

Then again, I found that there's no public API for this call. Running an expression on a fake target like that could work but it'd be a lot of mock stuff to implement (for a start, we'd need to pretend it can JIT code).

We have some tests that put a simulated debug sever in front of a real one, but that's too much effort for this fix.

Let me think about it for a bit. If I don't come up with anything then I'll approve this, as it does look correct to me.

+1

DavidSpickett · 2025-10-17T07:20:23Z

Is this what you meant?

Yes, thanks.

Why they did not scream when I changed the RegisterValue::SetValueFromData method?

I doubt this ever had coverage because no target ever tried this. Those unit tests look promising though, if we can set the type of the register value.

DavidSpickett · 2025-10-17T09:16:06Z

My idea with the fake target is going to be way too much work, the unittests is a better idea.

I got this to fail:

diff --git a/lldb/unittests/Utility/RegisterValueTest.cpp b/lldb/unittests/Utility/RegisterValueTest.cpp
index e0863a41605e..55fb3a2b05ad 100644
--- a/lldb/unittests/Utility/RegisterValueTest.cpp
+++ b/lldb/unittests/Utility/RegisterValueTest.cpp
@@ -7,6 +7,9 @@
 //===----------------------------------------------------------------------===//
 
 #include "lldb/Utility/RegisterValue.h"
+#include "lldb/Utility/DataExtractor.h"
+#include "lldb/lldb-enumerations.h"
+#include "lldb/lldb-private-types.h"
 #include "gtest/gtest.h"
 #include <optional>
 
@@ -54,3 +57,28 @@ TEST(RegisterValueTest, GetScalarValue) {
             Scalar((APInt(128, 0xffeeddccbbaa9988ull) << 64) |
                    APInt(128, 0x7766554433221100)));
 }
+
+TEST(RegisterValueTest, SetValueFromData) {
+  RegisterValue rv;
+  RegisterInfo ri {
+    "uint128_register", nullptr, 16, 0, lldb::Encoding::eEncodingUint,
+    lldb::Format::eFormatDefault, {0, 0, 0, LLDB_INVALID_REGNUM, 0},
+    nullptr, nullptr, nullptr
+  };
+
+  uint64_t src[] = {0x7766554433221100, 0xffeeddccbbaa9988};
+  DataExtractor src_extractor(src, 16, lldb::ByteOrder::eByteOrderLittle, 8);
+  rv.SetValueFromData(ri, src_extractor, /*src_offset=*/0, /*partial_data_ok=*/false);
+
+  // Copied from above.
+  const auto &Get = [](const RegisterValue &V) -> std::optional<Scalar> {
+    Scalar S;
+    if (V.GetScalarValue(S))
+      return S;
+    return std::nullopt;
+  };
+
+  EXPECT_EQ(Get(rv),
+    Scalar((APInt(128, 0xffeeddccbbaa9988ull) << 64) |
+                   APInt(128, 0x7766554433221100ull)));
+}

You can use that with some cleanup if you can confirm that your change fixes it. May need repeating for little and big endian extractors, though that may also swap the order within the uint64s so watch out for that.

sedymrak · 2025-10-17T09:35:37Z

I came to the same conclusion a little later than you.
The test I was experimented with:

--- a/lldb/unittests/Utility/RegisterValueTest.cpp
+++ b/lldb/unittests/Utility/RegisterValueTest.cpp
@@ -7,6 +7,8 @@
 //===----------------------------------------------------------------------===//
 
 #include "lldb/Utility/RegisterValue.h"
+#include "lldb/lldb-private-types.h"
+#include "lldb/Utility/DataExtractor.h"
 #include "gtest/gtest.h"
 #include <optional>
 
@@ -54,3 +56,21 @@ TEST(RegisterValueTest, GetScalarValue) {
             Scalar((APInt(128, 0xffeeddccbbaa9988ull) << 64) |
                    APInt(128, 0x7766554433221100)));
 }
+
+TEST(RegisterValueTest, SetValueFromData) {
+  uint8_t bytes[] = {0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
+                     0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff};
+  lldb::offset_t length = sizeof(bytes);
+  lldb_private::DataExtractor data(bytes, length, lldb::eByteOrderLittle,
+                                   sizeof(nullptr));
+  lldb_private::RegisterInfo reg_info;
+  reg_info.byte_size = 16;
+  reg_info.byte_offset = 0;
+  reg_info.encoding = lldb::eEncodingUint;
+  RegisterValue rv128;
+  EXPECT_TRUE(rv128.SetValueFromData(reg_info, data, 0, false).Success());
+  Scalar s128;
+  EXPECT_TRUE(rv128.GetScalarValue(s128));
+  EXPECT_TRUE(s128 == Scalar((APInt(128, 0xffeeddccbbaa9988ull) << 64) |
+                             APInt(128, 0x7766554433221100ull)));
+}

nicely failed on the main branch but it succeed in the the branch that contains the fix.

I will think about it and push something today.

github-actions · 2025-10-17T12:48:44Z

✅ With the latest revision this PR passed the C/C++ code formatter.

DavidSpickett

Looking good just one refactoring suggestion.

I like the design of having one target value and changing the inputs. I've tried to do it the other way around in the past and it always seemed harder to comprehend.

lldb/unittests/Utility/RegisterValueTest.cpp

DavidSpickett

LGTM.

I can merge it for you if needed, let me know.

sedymrak · 2025-10-18T10:30:33Z

LGTM.

I can merge it for you if needed, let me know.

That would be great. If you can, please do that.

DavidSpickett · 2025-10-20T10:16:23Z

This commit will be authored by 99839051+sedymrak@users.noreply.github.com.

Before I can merge this, please update your GitHub email settings - https://llvm.org/docs/DeveloperPolicy.html#email-addresses

(your email does show up in Co-authored-by: but we need it as the author at minimum)

sedymrak · 2025-10-20T11:41:23Z

This commit will be authored by 99839051+sedymrak@users.noreply.github.com.

Before I can merge this, please update your GitHub email settings - https://llvm.org/docs/DeveloperPolicy.html#email-addresses

(your email does show up in Co-authored-by: but we need it as the author at minimum)

No problem.
I've looked at these settings and I have turned the "Keep my email addresses private" setting OFF.
(It was set to "ON" previously).

[lldb] fix the "RegisterValue::SetValueFromData" method

b2fb392

Fix the "RegisterValue::SetValueFromData" so that it works also for 128-bit registers that contain integers.

sedymrak requested a review from JDevlieghere as a code owner October 15, 2025 21:58

llvmbot added the lldb label Oct 15, 2025

DavidSpickett reviewed Oct 16, 2025

View reviewed changes

DavidSpickett changed the title ~~[lldb] fix the "RegisterValue::SetValueFromData" method~~ [lldb] fix the "RegisterValue::SetValueFromData" method for 128-bit integer registers Oct 16, 2025

[lldb] add tests for the "RegisterValue::SetValueFromData" method

cc54084

[lldb] fixing formatting

fdae92f

DavidSpickett reviewed Oct 17, 2025

View reviewed changes

lldb/unittests/Utility/RegisterValueTest.cpp Show resolved Hide resolved

Matej Košík added 2 commits October 17, 2025 15:49

[lldb] refactoring

592b8f8

[lldb] fixing formatting

08ddc99

DavidSpickett reviewed Oct 17, 2025

View reviewed changes

lldb/unittests/Utility/RegisterValueTest.cpp Outdated Show resolved Hide resolved

lldb/unittests/Utility/RegisterValueTest.cpp Outdated Show resolved Hide resolved

lldb/unittests/Utility/RegisterValueTest.cpp Outdated Show resolved Hide resolved

Matej Košík added 2 commits October 17, 2025 16:09

[lldb] refactoring

812a183

[lldb] refactoring

86e81ee

DavidSpickett approved these changes Oct 17, 2025

View reviewed changes

DavidSpickett changed the title ~~[lldb] fix the "RegisterValue::SetValueFromData" method for 128-bit integer registers~~ [lldb] Fix the "RegisterValue::SetValueFromData" method for 128-bit integer registers Oct 20, 2025

DavidSpickett merged commit b2ad90b into llvm:main Oct 20, 2025
10 checks passed

[lldb] Fix the "RegisterValue::SetValueFromData" method for 128-bit integer registers #163646

[lldb] Fix the "RegisterValue::SetValueFromData" method for 128-bit integer registers #163646

Uh oh!

Conversation

sedymrak commented Oct 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Oct 15, 2025

Uh oh!

DavidSpickett left a comment

Choose a reason for hiding this comment

Uh oh!

sedymrak commented Oct 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DavidSpickett commented Oct 16, 2025

Uh oh!

sedymrak commented Oct 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DavidSpickett commented Oct 17, 2025

Uh oh!

DavidSpickett commented Oct 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sedymrak commented Oct 17, 2025

Uh oh!

github-actions bot commented Oct 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DavidSpickett left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

DavidSpickett left a comment

Choose a reason for hiding this comment

Uh oh!

sedymrak commented Oct 18, 2025

Uh oh!

DavidSpickett commented Oct 20, 2025

Uh oh!

sedymrak commented Oct 20, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

sedymrak commented Oct 15, 2025 •

edited

Loading

sedymrak commented Oct 16, 2025 •

edited

Loading

sedymrak commented Oct 16, 2025 •

edited

Loading

DavidSpickett commented Oct 17, 2025 •

edited

Loading

github-actions bot commented Oct 17, 2025 •

edited

Loading