Skip to content

Conversation

@tgs-sc
Copy link
Contributor

@tgs-sc tgs-sc commented Oct 21, 2025

While testing baremetal lldb, I came across a situation that if an instruction could not be disassembled, lldb will print nothing as an output which might be a bit strange. I added at least printing warning in this case.

@tgs-sc
Copy link
Contributor Author

tgs-sc commented Oct 21, 2025

@Michael137, @JDevlieghere, Hi! Can you please look at this patch?

@llvmbot
Copy link
Member

llvmbot commented Oct 21, 2025

@llvm/pr-subscribers-lldb

@llvm/pr-subscribers-backend-risc-v

Author: Timur Golubovich (tgs-sc)

Changes

While testing baremetal lldb, I came across a situation that if an instruction could not be disassembled, lldb will print nothing as an output which might be a bit strange. I added at least printing warning in this case.


Full diff: https://github.com/llvm/llvm-project/pull/164413.diff

4 Files Affected:

  • (modified) lldb/source/Core/DumpDataExtractor.cpp (+3-1)
  • (added) lldb/test/API/riscv/decode-failure/Makefile (+3)
  • (added) lldb/test/API/riscv/decode-failure/TestDecodeFailure.py (+32)
  • (added) lldb/test/API/riscv/decode-failure/main.c (+4)
diff --git a/lldb/source/Core/DumpDataExtractor.cpp b/lldb/source/Core/DumpDataExtractor.cpp
index 37dffc72d76ac..904643754574d 100644
--- a/lldb/source/Core/DumpDataExtractor.cpp
+++ b/lldb/source/Core/DumpDataExtractor.cpp
@@ -157,7 +157,9 @@ static lldb::offset_t DumpInstructions(const DataExtractor &DE, Stream *s,
         exe_scope->CalculateExecutionContext(exe_ctx);
         disassembler_sp->GetInstructionList().Dump(
             s, show_address, show_bytes, show_control_flow_kind, &exe_ctx);
-      }
+      } else if (number_of_instructions)
+        s->Printf("warning: failed to decode instructions at 0x%" PRIx64 ".",
+                  addr);
     }
   } else
     s->Printf("invalid target");
diff --git a/lldb/test/API/riscv/decode-failure/Makefile b/lldb/test/API/riscv/decode-failure/Makefile
new file mode 100644
index 0000000000000..10495940055b6
--- /dev/null
+++ b/lldb/test/API/riscv/decode-failure/Makefile
@@ -0,0 +1,3 @@
+C_SOURCES := main.c
+
+include Makefile.rules
diff --git a/lldb/test/API/riscv/decode-failure/TestDecodeFailure.py b/lldb/test/API/riscv/decode-failure/TestDecodeFailure.py
new file mode 100644
index 0000000000000..a3b9621ad35f8
--- /dev/null
+++ b/lldb/test/API/riscv/decode-failure/TestDecodeFailure.py
@@ -0,0 +1,32 @@
+"""
+Test the 'memory read' command when decoding instruction failures.
+"""
+
+import lldb
+import lldbsuite.test.lldbutil as lldbutil
+
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+
+
+class MemoryReadTestCase(TestBase):
+    NO_DEBUG_INFO_TESTCASE = True
+
+    def build_run_stop(self):
+        self.build()
+        lldbutil.run_to_source_breakpoint(
+            self, "// break here", lldb.SBFileSpec("main.c")
+        )
+
+    @skipIf(archs=no_match("^riscv.*"))
+    def test_memory_read(self):
+        """Test the 'memory read' command with instruction format."""
+        self.build_run_stop()
+
+        # asume that 0xffffffff is invalid instruction in RISC-V
+        # (lldb) memory read --format instruction `&my_insns[0]`
+        # warning: failed to decode instructions at 0x
+        self.expect(
+            "memory read --format instruction --count 1 `&my_insns[0]`",
+            substrs=["failed to decode instructions at"],
+        )
diff --git a/lldb/test/API/riscv/decode-failure/main.c b/lldb/test/API/riscv/decode-failure/main.c
new file mode 100644
index 0000000000000..af2e69a76bd89
--- /dev/null
+++ b/lldb/test/API/riscv/decode-failure/main.c
@@ -0,0 +1,4 @@
+int main(int argc, const char *argv[]) {
+  char my_insns[4] = {0xff, 0xff, 0xff, 0xff};
+  return 0; // break here
+}

Comment on lines 160 to 162
} else if (number_of_instructions)
s->Printf("warning: failed to decode instructions at 0x%" PRIx64 ".",
addr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
} else if (number_of_instructions)
s->Printf("warning: failed to decode instructions at 0x%" PRIx64 ".",
addr);
} else if (number_of_instructions) {
s->Printf("failed to decode instructions at 0x%" PRIx64 ".",
addr);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there precedent for writing errors or warnings to this stream? Depending on how this method is used, this might be fine or it might result in something weird like instruction = warnings: failed .... The print below just says "invalid target". Following that pattern also sidesteps the question of whether this is a warning or an error.

FWIW, even if we always dump the stream to the CommandReturnObject, it may still be nice to separate this out so we can write it to the dedicated error or warning stream (so it gets prefixed correctly with color). But that's a bigger change than what you're doing here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there precedent for writing errors or warnings to this stream?

Actually, I saw printing warning in such way in lldb/source/Process/Target so I decided to do the same. In the DumpDataExtractor there is already such code, so I think there is no much diffrenece between warning and error

    case eFormatBoolean:
      if (item_byte_size <= 8)
        s->Printf("%s", DE.GetMaxU64Bitfield(&offset, item_byte_size,
                                             item_bit_size, item_bit_offset)
                            ? "true"
                            : "false");
      else {
        s->Printf("error: unsupported byte size (%" PRIu64
                  ") for boolean format",
                  (uint64_t)item_byte_size);
        return offset;
      }
      break;

so we can write it to the dedicated error or warning stream

Yes, at the beginning I wanted to report this information in caller of DumpDataExtractor, but 1) There are many places and in some of them return value is ignored 2) Following the logic that an error/warning should be processed in the place where it was received, it should be processed right here.

lldb::offset_t lldb_private::DumpDataExtractor(...)
...
  if (item_format == eFormatInstruction)
    return DumpInstructions(DE, s, exe_scope, start_offset, base_addr,
                            item_count);
...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JDevlieghere, Hi! What do you think about the current state?

@tgs-sc tgs-sc force-pushed the users/tgs-sc/lldb-empty-output-on-print-insn branch from 3c3239c to 5b17397 Compare October 21, 2025 18:29
@tgs-sc tgs-sc force-pushed the users/tgs-sc/lldb-empty-output-on-print-insn branch 4 times, most recently from 8a8e095 to aab9c5f Compare October 22, 2025 14:19
Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for helping get the test working on AArch64. Just small comments now from me.

Jonas had some comments idk if they got addressed so I leave it to him to check and approve if so.

While testing baremetal lldb, I came across a situation that if an instruction
could not be disassembled, lldb will print nothing as an output which might be
a bit strange. I added at least printing warning in this case.
@tgs-sc tgs-sc force-pushed the users/tgs-sc/lldb-empty-output-on-print-insn branch from aab9c5f to e923a77 Compare October 22, 2025 14:50
@lenary
Copy link
Member

lenary commented Oct 23, 2025

How does this work now that LLDB doesn't stop on the first unknown instruction? I thought it now printed <unknown> for that and kept going, but maybe that's not right?

@tgs-sc
Copy link
Contributor Author

tgs-sc commented Oct 23, 2025

@lenary, Hi!

How does this work now that LLDB doesn't stop on the first unknown instruction? I thought it now printed for that and kept going, but maybe that's not right?

Since this commit (eb6da94), lldb now really prints <unknown> opcode much more often. But still, there are some situations when it can not do this. These situations differ in different architectures, but a common situation is when fetching length failed (so, disasm even doesn't know when next instruction starts). You can check DumpInstructions(...) function in lldb/source/Core/DumpDataExtractor.cpp to see more detailed pipeline.

@tgs-sc
Copy link
Contributor Author

tgs-sc commented Oct 23, 2025

@DavidSpickett, @JDevlieghere, I think that this patch can be merged.

@DavidSpickett
Copy link
Collaborator

@lenary the fact we can reach this state may be a sign that something is wrong elsewhere, that's true. Maybe something should be skipping forward to try and re-sync.

Someone can try to fix that but in the meantime, I'm ok with this PR making the error a bit more clear.

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jonas is probably busy preparing for US LLVM Dev now, and at the event all next week.

You removed the "warning: " prefix from the message and that seemed to be the crux of his comment. You're now following the local pattern at least, like the "invalid target" message below it.

So this LGTM and any other feedback can come post-commit.

@tgs-sc
Copy link
Contributor Author

tgs-sc commented Oct 23, 2025

Maybe something should be skipping forward to try and re-sync.

I think that, in general, a situation is always possible in which it is unclear how to decode the instruction length, so it is hard to imagine how this can be improved. Now, there are two phases: decode and disassemble. If the decode fails, disassembling makes no sense, and we will see a newly added warning. If the disassemble fails, we will see an <unknown> opcode.

Jonas is probably busy preparing for US LLVM Dev now, and at the event all next week.

You removed the "warning: " prefix from the message and that seemed to be the crux of his comment. You're now following the local pattern at least, like the "invalid target" message below it.

So this LGTM and any other feedback can come post-commit.

@DavidSpickett, ok, so can you probably merge this as I don't have commit access?

@DavidSpickett DavidSpickett merged commit 5d7da0a into llvm:main Oct 23, 2025
10 checks passed
@DavidSpickett
Copy link
Collaborator

I think that, in general, a situation is always possible in which it is unclear how to decode the instruction length, so it is hard to imagine how this can be improved.

For llvm-objdump and Arm Thumb they have something that will keep jumping forward to find a point that finally decodes. Maybe there's something special about Thumb that helps them do that. Probably something like N byte encodings are always N byte aligned (later Thumb can be 16 or 32-bit)

Anyway, no requirement to pursue that.

dvbuka pushed a commit to dvbuka/llvm-project that referenced this pull request Oct 27, 2025
…4413)

While testing baremetal lldb, I came across a situation that if an
instruction could not be disassembled, lldb will print nothing as an
output which might be a bit strange. I added at least printing warning
in this case.
Lukacma pushed a commit to Lukacma/llvm-project that referenced this pull request Oct 29, 2025
…4413)

While testing baremetal lldb, I came across a situation that if an
instruction could not be disassembled, lldb will print nothing as an
output which might be a bit strange. I added at least printing warning
in this case.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants