Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] Support printing immediate of RISCV MCInst in hexadecimal format #74053

Merged
merged 1 commit into from
Dec 15, 2023

Conversation

MouseSplinter
Copy link
Contributor

Enable the llvm-objdump to disassemble the immediate of RISCV instruction in hexadecimal format with --print-imm-hex flag.

@llvmbot llvmbot added backend:RISC-V mc Machine (object) code labels Dec 1, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Dec 1, 2023

@llvm/pr-subscribers-lld-elf
@llvm/pr-subscribers-lld
@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-mc

Author: Wang Yaduo (MouseSplinter)

Changes

Enable the llvm-objdump to disassemble the immediate of RISCV instruction in hexadecimal format with --print-imm-hex flag.


Full diff: https://github.com/llvm/llvm-project/pull/74053.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp (+4-4)
  • (added) llvm/test/MC/RISCV/print-imm-hex.s (+42)
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp
index a6f3f7f8d18e069..195dda0b8b14059 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp
@@ -91,7 +91,7 @@ void RISCVInstPrinter::printOperand(const MCInst *MI, unsigned OpNo,
   }
 
   if (MO.isImm()) {
-    markup(O, Markup::Immediate) << MO.getImm();
+    markup(O, Markup::Immediate) << formatImm(MO.getImm());
     return;
   }
 
@@ -113,7 +113,7 @@ void RISCVInstPrinter::printBranchOperand(const MCInst *MI, uint64_t Address,
       Target &= 0xffffffff;
     markup(O, Markup::Target) << formatHex(Target);
   } else {
-    markup(O, Markup::Target) << MO.getImm();
+    markup(O, Markup::Target) << formatImm(MO.getImm());
   }
 }
 
@@ -128,7 +128,7 @@ void RISCVInstPrinter::printCSRSystemRegister(const MCInst *MI, unsigned OpNo,
   else if (SysReg && SysReg->haveRequiredFeatures(STI.getFeatureBits()))
     markup(O, Markup::Register) << SysReg->Name;
   else
-    markup(O, Markup::Register) << Imm;
+    markup(O, Markup::Register) << formatImm(Imm);
 }
 
 void RISCVInstPrinter::printFenceArg(const MCInst *MI, unsigned OpNo,
@@ -212,7 +212,7 @@ void RISCVInstPrinter::printVTypeI(const MCInst *MI, unsigned OpNo,
   // or non-zero in bits 8 and above.
   if (RISCVVType::getVLMUL(Imm) == RISCVII::VLMUL::LMUL_RESERVED ||
       RISCVVType::getSEW(Imm) > 64 || (Imm >> 8) != 0) {
-    O << Imm;
+    O << formatImm(Imm);
     return;
   }
   // Print the text form.
diff --git a/llvm/test/MC/RISCV/print-imm-hex.s b/llvm/test/MC/RISCV/print-imm-hex.s
new file mode 100644
index 000000000000000..dd35b5c423a9d15
--- /dev/null
+++ b/llvm/test/MC/RISCV/print-imm-hex.s
@@ -0,0 +1,42 @@
+# RUN: llvm-mc %s -triple=riscv32 -riscv-no-aliases -show-encoding -mattr=+v \
+# RUN:     | FileCheck -check-prefixes=CHECK-ASM %s
+# RUN: llvm-mc %s -triple=riscv32 -riscv-no-aliases -show-encoding -mattr=+v --print-imm-hex \
+# RUN:     | FileCheck -check-prefixes=CHECK-ASM-HEX %s
+# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+v < %s \
+# RUN:     | llvm-objdump -M no-aliases --mattr=+v -d -r - \
+# RUN:     | FileCheck -check-prefixes=CHECK-OBJ %s
+# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+v < %s \
+# RUN:     | llvm-objdump -M no-aliases --mattr=+v --print-imm-hex -d -r - \
+# RUN:     | FileCheck -check-prefixes=CHECK-OBJ-HEX %s
+
+# CHECK-ASM: beq s1, s1, 102
+# CHECK-ASM-HEX: beq s1, s1, 0x66
+# CHECK-OBJ: beq s1, s1, 0x66
+# CHECK-OBJ-HEX: beq s1, s1, 0x66
+beq s1, s1, 102
+
+_sym:
+# CHECK-ASM: beq s1, s1, _sym
+# CHECK-ASM-HEX: beq s1, s1, _sym
+# CHECK-OBJ: beq s1, s1, 0x4
+# CHECK-OBJ-HEX: beq s1, s1, 0x4
+beq s1, s1, _sym
+
+# CHECK-ASM: lw a0, 97(a2)
+# CHECK-ASM-HEX: lw a0, 0x61(a2)
+# CHECK-OBJ: lw a0, 97(a2)
+# CHECK-OBJ-HEX: lw a0, 0x61(a2)
+lw a0, 97(a2)
+
+# CHECK-ASM: csrrwi t0, 4095, 31
+# CHECK-ASM-HEX: csrrwi t0, 0xfff, 0x1f
+# CHECK-OBJ: csrrwi t0, 4095, 31
+# CHECK-OBJ-HEX: csrrwi t0, 0xfff, 0x1f
+csrrwi t0, 0xfff, 31
+
+
+# CHECK-ASM: vsetvli a2, a0, 255
+# CHECK-ASM-HEX: vsetvli a2, a0, 0xff
+# CHECK-OBJ: vsetvli a2, a0, 255
+# CHECK-OBJ-HEX: vsetvli a2, a0, 0xff
+vsetvli a2, a0, 0xff

Copy link
Contributor

@wangpc-pp wangpc-pp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! One question: is this behavior the same as current GNU toolchain?

@MouseSplinter
Copy link
Contributor Author

LGTM! One question: is this behavior the same as current GNU toolchain?

Yes, GNU objdump print the immediates in hexadecimal format by default, and the llvm-objdump will do so with this patch. :)

@wangpc-pp
Copy link
Contributor

LGTM! One question: is this behavior the same as current GNU toolchain?

Yes, GNU objdump print the immediates in hexadecimal format by default, and the llvm-objdump will do so with this patch. :)

Then I have no question now, please go ahead and merge this PR. :-)

Enable the llvm-objdump to disassemble the immediate of RISCV
instruction in hexadecimal format with --print-imm-hex flag.
@topperc
Copy link
Collaborator

topperc commented Dec 6, 2023

LGTM! One question: is this behavior the same as current GNU toolchain?

Yes, GNU objdump print the immediates in hexadecimal format by default, and the llvm-objdump will do so with this patch. :)

gnu objdump for RISC-V is inconsistent. Looking at a dump from 2.41 I see
Shift by immediate uses hex.
addi/andi/ori/xori/li use decimal
lui uses hex.
load/store offset use decimal.
label offsets on branches use hex
vmv.v.i uses decimal

@MouseSplinter
Copy link
Contributor Author

gnu objdump for RISC-V is inconsistent. Looking at a dump from 2.41 I see Shift by immediate uses hex. addi/andi/ori/xori/li use decimal lui uses hex. load/store offset use decimal. label offsets on branches use hex vmv.v.i uses decimal

Thanks for your correction, and do you think we should conform to gnu's behavior? Or just be the same as other architecture in llvm.

@topperc
Copy link
Collaborator

topperc commented Dec 6, 2023

gnu objdump for RISC-V is inconsistent. Looking at a dump from 2.41 I see Shift by immediate uses hex. addi/andi/ori/xori/li use decimal lui uses hex. load/store offset use decimal. label offsets on branches use hex vmv.v.i uses decimal

Thanks for your correction, and do you think we should conform to gnu's behavior? Or just be the same as other architecture in llvm.

We can conform to other architectures.

Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@MouseSplinter
Copy link
Contributor Author

gnu objdump for RISC-V is inconsistent. Looking at a dump from 2.41 I see Shift by immediate uses hex. addi/andi/ori/xori/li use decimal lui uses hex. load/store offset use decimal. label offsets on branches use hex vmv.v.i uses decimal

Thanks for your correction, and do you think we should conform to gnu's behavior? Or just be the same as other architecture in llvm.

We can conform to other architectures.

Thanks.

Copy link
Contributor

@asb asb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed change LGTM, thanks.

@MouseSplinter MouseSplinter merged commit 3dde0d0 into llvm:main Dec 15, 2023
3 checks passed
@vitalybuka
Copy link
Collaborator

@dyung
Copy link
Collaborator

dyung commented Dec 15, 2023

MouseSplinter added a commit that referenced this pull request Dec 15, 2023
vitalybuka added a commit that referenced this pull request Dec 15, 2023
…imal format" (#75561)

Reverts #74053

Breaks https://lab.llvm.org/buildbot/#/builders/5/builds/39291

Co-authored-by: Wang Yaduo <wangyaduo@linux.alibaba.com>

Issue #75563
@MaskRay
Copy link
Member

MaskRay commented Dec 15, 2023

LGTM

When merging a change, especially with a lot of test updates, always ensure that the rebased version still works...

Sometimes, only through rebasing can one notice that the patch needs adjustments to more code or tests due to its interaction with another landed patch:

Another landed patch added some tests that would be changed by the current patch.
Another landed patch added a new use of the function that is renamed by the current patch.

--

You can rebase locally and test, but not push an amended commit to the branch, then use the github to merge.

vitalybuka pushed a commit that referenced this pull request Dec 15, 2023
…mat (#74053)

Enable the llvm-objdump to disassemble the immediate of RISCV
instruction in hexadecimal format with --print-imm-hex flag.
@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 15, 2023

gnu objdump for RISC-V is inconsistent. Looking at a dump from 2.41 I see Shift by immediate uses hex. addi/andi/ori/xori/li use decimal lui uses hex. load/store offset use decimal. label offsets on branches use hex vmv.v.i uses decimal

Thanks for your correction, and do you think we should conform to gnu's behavior? Or just be the same as other architecture in llvm.

We can conform to other architectures.

AArch64 uses decimal, as a data point, as do the various MIPS targets, PowerPC and x86, at least for a simple load with an offset.

To me this makes the output less user-friendly, and it's quite annoying churn for downstream. I've been perfectly happy with the output as it stands... Not to mention that it now deviates from what appears to be the standard practice.

So personally I'm extremely anti this patch.

@topperc
Copy link
Collaborator

topperc commented Dec 15, 2023

gnu objdump for RISC-V is inconsistent. Looking at a dump from 2.41 I see Shift by immediate uses hex. addi/andi/ori/xori/li use decimal lui uses hex. load/store offset use decimal. label offsets on branches use hex vmv.v.i uses decimal

Thanks for your correction, and do you think we should conform to gnu's behavior? Or just be the same as other architecture in llvm.

We can conform to other architectures.

AArch64 uses decimal, as a data point, as do the various MIPS targets, PowerPC and x86, at least for a simple load with an offset.

To me this makes the output less user-friendly, and it's quite annoying churn for downstream. I've been perfectly happy with the output as it stands... Not to mention that it now deviates from what appears to be the standard practice.

So personally I'm extremely anti this patch.

AArch64 and X86 both use formatImm as far as I can see. How do they end up defaulting to decimal and RISC-V is defaulting to hex?

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 15, 2023

gnu objdump for RISC-V is inconsistent. Looking at a dump from 2.41 I see Shift by immediate uses hex. addi/andi/ori/xori/li use decimal lui uses hex. load/store offset use decimal. label offsets on branches use hex vmv.v.i uses decimal

Thanks for your correction, and do you think we should conform to gnu's behavior? Or just be the same as other architecture in llvm.

We can conform to other architectures.

AArch64 uses decimal, as a data point, as do the various MIPS targets, PowerPC and x86, at least for a simple load with an offset.
To me this makes the output less user-friendly, and it's quite annoying churn for downstream. I've been perfectly happy with the output as it stands... Not to mention that it now deviates from what appears to be the standard practice.
So personally I'm extremely anti this patch.

AArch64 and X86 both use formatImm as far as I can see. How do they end up defaulting to decimal and RISC-V is defaulting to hex?

Oh they've all been changed, that's awful. Is there an RFC about this tree-wide behavioural change that, IMO, is a regression?

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 15, 2023

Note that this makes clang -S inconsistent with llvm-objdump -d; the former continues to use friendly decimal for things, whilst the latter uses hex, which isn't so user-friendly and feels rather against the spirit of disassembling things to look as close to human assembly as possible.

@topperc
Copy link
Collaborator

topperc commented Dec 15, 2023

gnu objdump for RISC-V is inconsistent. Looking at a dump from 2.41 I see Shift by immediate uses hex. addi/andi/ori/xori/li use decimal lui uses hex. load/store offset use decimal. label offsets on branches use hex vmv.v.i uses decimal

Thanks for your correction, and do you think we should conform to gnu's behavior? Or just be the same as other architecture in llvm.

We can conform to other architectures.

AArch64 uses decimal, as a data point, as do the various MIPS targets, PowerPC and x86, at least for a simple load with an offset.
To me this makes the output less user-friendly, and it's quite annoying churn for downstream. I've been perfectly happy with the output as it stands... Not to mention that it now deviates from what appears to be the standard practice.
So personally I'm extremely anti this patch.

AArch64 and X86 both use formatImm as far as I can see. How do they end up defaulting to decimal and RISC-V is defaulting to hex?

Oh they've all been changed, that's awful. Is there an RFC about this tree-wide behavioural change that, IMO, is a regression?

AArch64 made the change to use formatImm in 2016 http://reviews.llvm.org/D16929 Not sure what llvm-objdump defaulted to at the time.

@topperc
Copy link
Collaborator

topperc commented Dec 15, 2023

gnu objdump for RISC-V is inconsistent. Looking at a dump from 2.41 I see Shift by immediate uses hex. addi/andi/ori/xori/li use decimal lui uses hex. load/store offset use decimal. label offsets on branches use hex vmv.v.i uses decimal

Thanks for your correction, and do you think we should conform to gnu's behavior? Or just be the same as other architecture in llvm.

We can conform to other architectures.

AArch64 uses decimal, as a data point, as do the various MIPS targets, PowerPC and x86, at least for a simple load with an offset.
To me this makes the output less user-friendly, and it's quite annoying churn for downstream. I've been perfectly happy with the output as it stands... Not to mention that it now deviates from what appears to be the standard practice.
So personally I'm extremely anti this patch.

AArch64 and X86 both use formatImm as far as I can see. How do they end up defaulting to decimal and RISC-V is defaulting to hex?

Oh they've all been changed, that's awful. Is there an RFC about this tree-wide behavioural change that, IMO, is a regression?

AArch64 made the change to use formatImm in 2016 http://reviews.llvm.org/D16929 Not sure what llvm-objdump defaulted to at the time.

Here's where objdump default changed https://reviews.llvm.org/D136972

@MouseSplinter
Copy link
Contributor Author

LGTM

When merging a change, especially with a lot of test updates, always ensure that the rebased version still works...

Sometimes, only through rebasing can one notice that the patch needs adjustments to more code or tests due to its interaction with another landed patch:

Another landed patch added some tests that would be changed by the current patch. Another landed patch added a new use of the function that is renamed by the current patch.

--

You can rebase locally and test, but not push an amended commit to the branch, then use the github to merge.

Thanks for your guidance too much.

qihangkong pushed a commit to rvgpu/rvgpu-llvm that referenced this pull request Apr 23, 2024
…imal format" (#75561)

Reverts llvm/llvm-project#74053

Breaks https://lab.llvm.org/buildbot/#/builders/5/builds/39291

Co-authored-by: Wang Yaduo <wangyaduo@linux.alibaba.com>

Issue #75563
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants