Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hexagon update #1338

Closed
wants to merge 25 commits into from
Closed

Hexagon update #1338

wants to merge 25 commits into from

Conversation

Rot127
Copy link
Member

@Rot127 Rot127 commented Jul 20, 2021

Your checklist for this pull request

  • I've read the guidelines for contributing to this repository
  • I made sure to follow the project's coding style
  • I've added tests that prove my fix is effective or that my feature works (if possible)
  • I've updated the documentation and the rizin book with the relevant information (if needed)

Detailed description

This is an update of the hexagon asm and analysis plugin. It used the LLVM source to generate the plugin code.

This pull request only synchronizes the source code with the rz-hexagon repo.
For more details please refer to the pull request over there: rizinorg/rz-hexagon#4

Test plan

asm tests
The tests for the asm plugin are updated as well, since some of the old ones where incorrect.
The asm tests in this pull requests were written and disassembled with the Hexagon IDE tools.

analysis tests
Those tests got not updated (unfortunately my dev time is up for this month) and fail at the moment. I confirmed the correct working of the hardware loops, jumps and calls manually.

Closing issues
closes rizinorg/rz-hexagon#13

None

librz/analysis/p/analysis_hexagon.c Outdated Show resolved Hide resolved
@XVilka
Copy link
Member

XVilka commented Jul 21, 2021

Failed test:

[XX] db/analysis/hexagon hexagon jumps
RZ_NOPLUGINS=1 rizin -escr.utf8=0 -escr.color=0 -escr.interactive=0 -N -Qc 's sym.main
af
afx
pifc
' bins/elf/analysis/hexagon-hello-loop
-- stdout
--- expected
+++ actual
@@ -1,8 +1,11 @@
-c 0x00005124 -> 0x00005128 jump 0x5128
-c 0x00005134 -> 0x00005138 jump 0x5138
-C 0x00005138 -> 0x000050e0 call sym.pHello
-C 0x0000513c -> 0x000050f8 call sym.pWorld
-c 0x00005140 -> 0x00005144 jump 0x5144
-c 0x00005150 -> 0x00005128 jump 0x5128
-call sym.pHello
-call sym.pWorld
+c 0x00005124 -> 0x00005128 [     jump 0x5128 
+c 0x00005134 -> 0x00005138 [     jump 0x5138 
+C 0x00005138 -> 0x000050e0 [     call sym.pHello 
+C 0x0000513c -> 0x000050f8 [     call sym.pWorld 
+c 0x00005140 -> 0x00005144 [     jump 0x5144 
+c 0x00005150 -> 0x00005128 [     jump 0x5128 
+c 0x00005168 -> 0x00005170 [     jump sym.thread_create_extended 
+C 0x000051c8 -> 0x000062e6 / call sym.free 
+\ call sym.pHello 
+[     call 0x50f8 
+/ call sym.free 

-- stderr
Cannot determine entrypoint, using 0x00005000.
WARNING: No calling convention defined for this file, analysis may be inaccurate.

Would be awesome to not show the block start in case of only one pi instruction, I think. Not sure if possible though.

@Rot127
Copy link
Member Author

Rot127 commented Aug 3, 2021

@XVilka Just to clarify. Do you mean:

A

/     call sym.free
\     call sym.pHello 
[     call 0x50f8 
/     call sym.free 

should be printed as

/     call sym.free
\     call sym.pHello 
      call 0x50f8 
/     call sym.free 

This wouldn't be a problem. And now that I see it I like it even more.

Or do you mean:

B

/     call sym.free
\     call sym.pHello 
[     call 0x50f8 
/     call sym.free 
>>> Disassembly ends <<<

should be printed as:

/     call sym.free
\     call sym.pHello 
[     call 0x50f8 
      call sym.free 
>>> Disassembly ends <<<

Off the top of my hat I would say this is harder to implement. Since the plugin needs to track which is the last instruction currently disassembled.

@XVilka
Copy link
Member

XVilka commented Aug 13, 2021

@Rot127 sorry, didn't notice your question before. What I meant, it would be nice not to show block boundaries depending if it's only one instruction instead of multiple.

@Rot127
Copy link
Member Author

Rot127 commented Aug 16, 2021

@XVilka Unfortunately this is probably a bit too much of a hustle. hex_set_pkt_info() does not know whether it disassembles only a single instruction or an instruction packet.

If you know any way to pass the current block size to the plugin, we could check in hex_set_pkt_info() whether the block size is currently 1 and omit the prefix.
(Besides: is my assumption correct that the block size is one if I execute pi 1 in rizin?)

@wargio
Copy link
Member

wargio commented Aug 16, 2021

you can keep a context structure, if needed.

@Rot127
Copy link
Member Author

Rot127 commented Aug 17, 2021

@XVilka @wargio After looking into it a bit, I would add this feature not now. The current way of identifying instructions packets (and prepend the prefix to the assembly) is a little buggy anyways (rizinorg/rz-hexagon#18).

So I would like to focus more at things like rizinorg/rz-hexagon#14, rizinorg/rz-hexagon#13 or rizinorg/rz-hexagon#5 at the moment. They promise more of the basic functionality.

@XVilka
Copy link
Member

XVilka commented Aug 17, 2021

@Rot127 ok, fine for me. As long as it's tracked in the issue, we can postpone.

@wargio
Copy link
Member

wargio commented Aug 17, 2021

ok for me too

@wargio
Copy link
Member

wargio commented Aug 19, 2021

i have a small request. can you please split the main function in hexagon_disas.c into a function having a switch case calling another function? this will fix ASAN building issues.

@Rot127
Copy link
Member Author

Rot127 commented Aug 25, 2021

What binary are you testing?

@wargio
Copy link
Member

wargio commented Aug 25, 2021

[XX] db/analysis/hexagon hexagon jumps
RZ_NOPLUGINS=1 rizin -escr.utf8=0 -escr.color=0 -escr.interactive=0 -N -Qc 's sym.main
af
aar
afx
s 0x000050e4
pi 2
s 0x0000539c
pi 4
/ai 0xdead
/ai 0xbeef
/ai 0xfffffff8
/ai 0x1c00
' bins/elf/analysis/hexagon-hello-loop
-- stdout
--- expected
+++ actual
@@ -14,34 +14,34 @@
 0x00000b80   # 4: [     R0.h = #0xdead 
 0x00000b84   # 4: [     R0.l = #0xbeef 
 0x00001e68   # 4: [     R4.l = #0xbeef 
+0x0000342c   # 4: | if (!R23) memh(P3++#-0x8) = R11.h 
+0x0000342e   # 4: | if (!R7) memw(P3++#-0x8) = R11 
 0x00005128   # 4: [     R2 = memw(R30+##-0x8) 
 0x00005144   # 4: [     R2 = memw(R30+##-0x8) 
 0x0000514c   # 4: [     memw(R30+##-0x8) = R2 
 0x0000564c   # 4: [     R2 = memw(R0+##-0x8) 
 0x00005654   # 4: [     if (R17.new) P0 = add(R16,##-0x8) 
 0x00005740   # 4: [     R17 = and(R2,##-0x8) 
-0x00005840   # 4: / R4 = add(R2,##-0x8) 
-0x00005970   # 4: \ if (R2.new) P2 = add(R2,##-0x8) 
+0x00005840   # 4: | R4 = add(R2,##-0x8) 
+0x00005970   # 4: [     if (R2.new) P2 = add(R2,##-0x8) 
 0x000059d0   # 4: [     R1 = and(R1,##-0x8) 
 0x00006394   # 4: [     R4 = and(R4,##-0x8) 
 0x000063f8   # 4: [     R4 = and(R4,##-0x8) 
 0x0000641c   # 4: [     R4 = and(R4,##-0x8) 
 0x00006510   # 4: [     R4 = and(R4,##-0x8) 
 0x00006534   # 4: [     R4 = and(R4,##-0x8) 
-0x00007134   # 4: / R3 = mux(P0,#-0x8,#-0x8) 
+0x00007134   # 4: | R3 = mux(P0,#-0x8,#-0x8) 
 0x00007b4e   # 4: | R0 = add(R2,##-0x8) 
 0x00007eae   # 4: | R0 = add(R2,##-0x8) 
 0x00007ee0   # 4: | R17 = add(R17,##-0x8) 
-0x00008484   # 4: \ R2 = add(R16,add(R2,##-0x8)) 
-0x0000848c   # 4: / R2 = add(R2,##-0x8) 
-0x00008598   # 4: \ memd(R2+##-0x8) = R1:0  < endloop0
+0x00008484   # 4: [     R2 = add(R16,add(R2,##-0x8)) 
+0x0000848c   # 4: | R2 = add(R2,##-0x8) 
+0x00008598   # 4: [     memd(R2+##-0x8) = R1:0 
 0x000087b8   # 4: [     memd(R3+##-0x8) = R1:0 
 0x00009b64   # 4: [     R2 = memw(R18+##-0x8) 
-0x0000342c   # 4: / if (!R23) memh(P3++#-0x8) = R11.h 
-0x0000342e   # 4: | if (!R7) memw(P3++#-0x8) = R11 
+0x00000aac   # 4: | immext(##0x1c00) 
+0x00000ab4   # 4: | immext(##0x1c00) 
+0x00000b44   # 4: | immext(##0x1c00) 
 0x00005170   # 4: | immext(##0x1c00) 
 0x00005178   # 4: | immext(##0x1c00) 
 0x00005180   # 4: | immext(##0x1c00) 
-0x00000aac   # 4: | immext(##0x1c00) 
-0x00000ab4   # 4: | immext(##0x1c00) 
-0x00000b44   # 4: | immext(##0x1c00) 

-- stderr
Cannot determine entrypoint, using 0x00005000.
WARNING: No calling convention defined for this file, analysis may be inaccurate.
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
WARNING: (../librz/asm/arch/hexagon/hexagon.c:104):hex_get_double_regs: code should not be reached
...........

@Rot127
Copy link
Member Author

Rot127 commented Aug 25, 2021

That is weird. The test worked fine for me. Anything I might have done wrong?

@Rot127
Copy link
Member Author

Rot127 commented Aug 25, 2021

@wargio 2aca254 should remove the warnings.
The failing test is pretty weird. It finds all instructions which have the value -8 in it, but it searches from the lowest to the highest address.
On my machine it doesn't. The instructions are ordered as in the test file.
If it is fine for you I could simply upload the test file again. With the instructions in the correct order.

@Rot127
Copy link
Member Author

Rot127 commented Aug 25, 2021

Besides fixing the problems with the test I think I won't add any functionality anymore. The pull request is already huge.
Of cause, if you see any "must haves" before merging it I am happy to add them.

@wargio
Copy link
Member

wargio commented Aug 25, 2021

it is fine for you I could simply upload the test file again. With the instructions in the correct order.

sure

@Rot127
Copy link
Member Author

Rot127 commented Aug 26, 2021

The tests are currently failing all the time, because each build produces different assembly (see blow).

The instructions are correct, but their assignment to different instruction packages or the endloop1 attribute doesn't work very well at the moment (see issue: rizinorg/rz-hexagon#18).
Unfortunately I do not have time to fix this at the moment. Maybe in the next two weeks but I don't want to promise too much.

macos-meson-clang-tests

[XX] db/analysis/hexagon hexagon jumps
RZ_NOPLUGINS=1 rizin -escr.utf8=0 -escr.color=0 -escr.interactive=0 -N -Qc 's sym.main
af
aar
afx
s 0x000050e4
pi 2
s 0x0000539c
pi 4
/ai 0xdead
/ai 0xbeef
/ai 0xfffffff8
/ai 0x1c00
' bins/elf/analysis/hexagon-hello-loop
-- stdout
--- expected
+++ actual
@@ -14,9 +14,11 @@
 0x00000b80   # 4: [     R0.h = #0xdead 
 0x00000b84   # 4: [     R0.l = #0xbeef 
 0x00001e68   # 4: [     R4.l = #0xbeef 
+0x00002f05   # 4: [     R2 = add(R2in,##-0x8) ; R2 = memb(R16+#0x7) 
+0x00003005   # 4: [     R18 = add(R18in,##-0x8) ; R18 = memb(R16+#0x7) 
 0x0000342c   # 4: / if (!R23) memh(P3++#-0x8) = R11.h 
 0x0000342e   # 4: | if (!R7) memw(P3++#-0x8) = R11 
-0x00005128   # 4: \ R2 = memw(R30+##-0x8) 
+0x00005128   # 4: \ R2 = memw(R30+##-0x8)  < endloop01
 0x00005144   # 4: [     R2 = memw(R30+##-0x8) 
 0x0000514c   # 4: [     memw(R30+##-0x8) = R2 
 0x0000564c   # 4: [     R2 = memw(R0+##-0x8) 
@@ -36,11 +38,11 @@
 0x00007ee0   # 4: | R17 = add(R17,##-0x8) 
 0x00008484   # 4: \ R2 = add(R16,add(R2,##-0x8)) 
 0x0000848c   # 4: / R2 = add(R2,##-0x8) 
-0x00008598   # 4: \ memd(R2+##-0x8) = R1:0  < endloop0 
+0x00008598   # 4: \ memd(R2+##-0x8) = R1:0  < endloop0
 0x000087b8   # 4: [     memd(R3+##-0x8) = R1:0 
 0x00009b64   # 4: [     R2 = memw(R18+##-0x8) 
-0x00000aac   # 4: | immext(##0x1c00) 
-0x00000ab4   # 4: / immext(##0x1c00) 
+0x00000aac   # 4: / immext(##0x1c00) 
+0x00000ab4   # 4: | immext(##0x1c00) 
 0x00000b44   # 4: | immext(##0x1c00) 
 0x00005170   # 4: | immext(##0x1c00) 
 0x00005178   # 4: | immext(##0x1c00) 

ubuntu-tcc-test

[XX] db/analysis/hexagon hexagon jumps
RZ_NOPLUGINS=1 rizin -escr.utf8=0 -escr.color=0 -escr.interactive=0 -N -Qc 's sym.main
af
aar
afx
s 0x000050e4
pi 2
s 0x0000539c
pi 4
/ai 0xdead
/ai 0xbeef
/ai 0xfffffff8
/ai 0x1c00
' bins/elf/analysis/hexagon-hello-loop
-- stdout
--- expected
+++ actual
@@ -14,33 +14,33 @@
 0x00000b80   # 4: [     R0.h = #0xdead 
 0x00000b84   # 4: [     R0.l = #0xbeef 
 0x00001e68   # 4: [     R4.l = #0xbeef 
-0x0000342c   # 4: / if (!R23) memh(P3++#-0x8) = R11.h 
+0x0000342c   # 4: | if (!R23) memh(P3++#-0x8) = R11.h 
 0x0000342e   # 4: | if (!R7) memw(P3++#-0x8) = R11 
-0x00005128   # 4: \ R2 = memw(R30+##-0x8) 
+0x00005128   # 4: [     R2 = memw(R30+##-0x8) 
 0x00005144   # 4: [     R2 = memw(R30+##-0x8) 
 0x0000514c   # 4: [     memw(R30+##-0x8) = R2 
 0x0000564c   # 4: [     R2 = memw(R0+##-0x8) 
 0x00005654   # 4: [     if (R17.new) P0 = add(R16,##-0x8) 
 0x00005740   # 4: [     R17 = and(R2,##-0x8) 
-0x00005840   # 4: / R4 = add(R2,##-0x8) 
-0x00005970   # 4: \ if (R2.new) P2 = add(R2,##-0x8) 
+0x00005840   # 4: | R4 = add(R2,##-0x8) 
+0x00005970   # 4: [     if (R2.new) P2 = add(R2,##-0x8) 
 0x000059d0   # 4: [     R1 = and(R1,##-0x8) 
 0x00006394   # 4: [     R4 = and(R4,##-0x8) 
 0x000063f8   # 4: [     R4 = and(R4,##-0x8) 
 0x0000641c   # 4: [     R4 = and(R4,##-0x8) 
 0x00006510   # 4: [     R4 = and(R4,##-0x8) 
 0x00006534   # 4: [     R4 = and(R4,##-0x8) 
-0x00007134   # 4: / R3 = mux(P0,#-0x8,#-0x8) 
+0x00007134   # 4: | R3 = mux(P0,#-0x8,#-0x8) 
 0x00007b4e   # 4: | R0 = add(R2,##-0x8) 
 0x00007eae   # 4: | R0 = add(R2,##-0x8) 
 0x00007ee0   # 4: | R17 = add(R17,##-0x8) 
-0x00008484   # 4: \ R2 = add(R16,add(R2,##-0x8)) 
-0x0000848c   # 4: / R2 = add(R2,##-0x8) 
-0x00008598   # 4: \ memd(R2+##-0x8) = R1:0  < endloop0 
+0x00008484   # 4: [     R2 = add(R16,add(R2,##-0x8)) 
+0x0000848c   # 4: | R2 = add(R2,##-0x8) 
+0x00008598   # 4: [     memd(R2+##-0x8) = R1:0 
 0x000087b8   # 4: [     memd(R3+##-0x8) = R1:0 
 0x00009b64   # 4: [     R2 = memw(R18+##-0x8) 
 0x00000aac   # 4: | immext(##0x1c00) 
-0x00000ab4   # 4: / immext(##0x1c00) 
+0x00000ab4   # 4: | immext(##0x1c00) 
 0x00000b44   # 4: | immext(##0x1c00) 
 0x00005170   # 4: | immext(##0x1c00) 
 0x00005178   # 4: | immext(##0x1c00) 

@wargio
Copy link
Member

wargio commented Aug 26, 2021

make a clean build and run the tests again on your machine

@Rot127
Copy link
Member Author

Rot127 commented Aug 26, 2021

Implemented the packet tracking. So the tests should pass now.

@wargio
Copy link
Member

wargio commented Aug 26, 2021

i would highly suggest you to triplecheck that the local variables that are defined in the generated code are always initialized.

Comment on lines 32 to 33
op->jump = op->fail = -1;
op->ptr = op->val = -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UT64_MAX

@Rot127
Copy link
Member Author

Rot127 commented Sep 1, 2021

i would highly suggest you to triplecheck that the local variables that are defined in the generated code are always initialized.

Could you give a hint which variable you are speaking of? I looked through the code again and again but couldn't find some not initialized ones. Sorry, I might be at a loss.

@wargio
Copy link
Member

wargio commented Sep 1, 2021

i would highly suggest you to triplecheck that the local variables that are defined in the generated code are always initialized.

Could you give a hint which variable you are speaking of? I looked through the code again and again but couldn't find some not initialized ones. Sorry, I might be at a loss.

based on the behavior i thought it was a problem related to uninitialized variables.. but if there are none, then i really have no clue..

@Rot127
Copy link
Member Author

Rot127 commented Sep 7, 2021

Closed in favor of #1614

@Rot127 Rot127 closed this Sep 7, 2021
@Rot127 Rot127 deleted the hexagon-update branch May 21, 2023 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RzAnalysisOp.val won't get set
3 participants