HowTo correctly trace x86 CPU instructions? #2173

Open
Alexco500 opened this issue Apr 25, 2025 · 5 comments
Comments

Alexco500 commented Apr 25, 2025

Hi there,
I have some rather unusual x86 code that needs to run in Unicorn. The code needs 32-bit protected mode and switches to 16-bit segments. When I execute it, I get an "unmapped memory" error, but I don't know why. So I started single-stepping through Unicorn in a debugger, but as soon as the translation block is executed, I end up with the error. In other words, I can't single-step into ret = tcg_qemu_tb_exec(env, tb_ptr); at line 60 of cpu_exec.c.

The situation is as follows:
Register Dump

EAX: 0x00272024   EBX: 0x00000064   ECX: 0x00000027   EDX: 0x00000000
ESI: 0x00042068   EDI: 0x0004202c   EBP: 0x0000202c   ESP: 0x00042024
EIP: 0x00000006   EFL: 0x00000002
  Reserved (Always 1) (Bit 1)


GS: 0x0000   ES: 0x0027   FS: 0x150b
DS: 0x0027   SS: 0x0027   CS: 0x000f

00010006  ff 5e 00           call far [bp]
00010009  66 ea 0b 01 02 00 5b 00  jmp far 0x005B:0x0002010B
00010011  8c d1              mov cx, ss
00010013  8e d9              mov ds, cx
00010015  8e c1              mov es, cx
00010017  ff 5e 00           call far [bp]

So the call far [bp] at 0x10006 triggers the memory error:

Tracing instruction at 0x10000, instruction size = 0x2
--- EFLAGS is 0x2
Tracing instruction at 0x10002, instruction size = 0x2
--- EFLAGS is 0x2
Tracing instruction at 0x10004, instruction size = 0x2
--- EFLAGS is 0x2
Tracing instruction at 0x10006, instruction size = 0x3
--- EFLAGS is 0x2
mem invalid, type 19  @ 0x00000006, address 0x00082068

Failed on uc_emu_start() with error returned 6: Invalid memory read (UC_ERR_READ_UNMAPPED)

PC: 10006 == 0000000f:00000006
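
The numeric codes in the output above can be mapped back to symbolic names. The following is a minimal sketch: the tables are transcribed by hand from the uc_mem_type and uc_err enums in unicorn.h (an assumption that should be checked against the header of the Unicorn version in use), and describe_fault is a hypothetical helper, not part of Unicorn. The code segment base of 0x10000 is inferred from the "PC: 10006 == 0000000f:00000006" line.

```python
# Values transcribed from unicorn.h (assumption: verify against your header).
UC_MEM_TYPES = {
    16: "UC_MEM_READ",
    17: "UC_MEM_WRITE",
    18: "UC_MEM_FETCH",
    19: "UC_MEM_READ_UNMAPPED",
    20: "UC_MEM_WRITE_UNMAPPED",
    21: "UC_MEM_FETCH_UNMAPPED",
}
UC_ERRORS = {
    6: "UC_ERR_READ_UNMAPPED",
    7: "UC_ERR_WRITE_UNMAPPED",
    8: "UC_ERR_FETCH_UNMAPPED",
}


def describe_fault(mem_type, address, cs_base, eip):
    """Hypothetical helper: turn hook arguments into a readable line."""
    name = UC_MEM_TYPES.get(mem_type, f"unknown ({mem_type})")
    # The linear PC is the code segment base plus the offset in EIP.
    return f"{name} at 0x{address:08x}, linear PC 0x{cs_base + eip:x}"


print(describe_fault(19, 0x82068, 0x10000, 0x6))
# prints: UC_MEM_READ_UNMAPPED at 0x00082068, linear PC 0x10006
```

So "type 19" is a read from unmapped memory, which matches the UC_ERR_READ_UNMAPPED error code 6 reported by uc_emu_start().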

I don't know why the error occurs on an access to address 0x82068. Judging by the registers, the call should read its operand from 0x4202c, which is inside the stack. The stack looks like this:

0x00042040:    0x00000000
0x0004203c:    0x00000053
0x00042038:    0x00000053
0x00042034:    0x00000053
0x00042030:    0x00042038
0x0004202c:    0xffff0037
0x00042028:    0x00272054
0x00042024:    0x00010000
0x00042024    ESP

So in theory it should use ffff:0037 as the target of the call, but it fails before getting there.
Any idea how I can dig deeper into this?
Maybe the GDT/LDT setup is wrong?
Can I enable some kind of trace for further debugging?
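
The expected operand address and call target described above can be checked by hand. This is a minimal sketch, assuming a 16:16 far pointer (the call has no operand-size prefix in a 16-bit code segment) and assuming an SS base of 0x40000, which is only implied by the expected address 0x4202c and is not stated in the thread. decode_far16 is a hypothetical helper.

```python
import struct


def decode_far16(dword):
    """Split a 16:16 far pointer that was read as a little-endian dword."""
    # In memory the 16-bit offset comes first, followed by the selector.
    offset, selector = struct.unpack("<HH", struct.pack("<I", dword))
    return selector, offset


EBP = 0x0000202C
SS_BASE = 0x40000  # assumption: implied by the expected address 0x4202c

# 16-bit addressing uses only BP, the low 16 bits of EBP.
operand_addr = (SS_BASE + (EBP & 0xFFFF)) & 0xFFFFFFFF
selector, offset = decode_far16(0xFFFF0037)

print(f"operand at 0x{operand_addr:x}, target {selector:04x}:{offset:04x}")
# prints: operand at 0x4202c, target ffff:0037
```

The faulting address in the log, 0x82068, does not match this operand address, so the failing read happens somewhere other than the fetch of the far pointer itself.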

@Alexco500
Author

Small update:
If I remove the LDT, the code does not even reach address 0x10000; it crashes with a general protection fault (GPF) while trying to jump to that 16-bit segment.
So yes, it seems that this issue is related to the LDT.
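
To see why the LDT is involved, it helps to decode the selectors from the register dump. A minimal sketch using the standard x86 selector layout (bits 0-1 are the RPL, bit 2 is the table indicator, bits 3-15 are the index); decode_selector is a hypothetical helper name.

```python
def decode_selector(sel):
    """Decode an x86 segment selector into its three fields."""
    return {
        "index": sel >> 3,                       # entry number in the table
        "table": "LDT" if sel & 0x4 else "GDT",  # TI bit
        "rpl": sel & 0x3,                        # requested privilege level
        "descriptor_offset": sel & ~0x7,         # byte offset of the entry
    }


for name, sel in (("CS", 0x000F), ("SS", 0x0027), ("call target", 0xFFFF)):
    d = decode_selector(sel)
    print(f"{name}=0x{sel:04x}: index {d['index']}, {d['table']}, RPL {d['rpl']}")
# prints:
# CS=0x000f: index 1, LDT, RPL 3
# SS=0x0027: index 4, LDT, RPL 3
# call target=0xffff: index 8191, LDT, RPL 3
```

CS, SS and the far call target all have the TI bit set, so each of them is resolved through the LDT rather than the GDT.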

@wtdcode
Member

wtdcode commented Apr 30, 2025

Sorry for the late reply.

First, have you tried the dev branch? If so, could you provide a reproduction?

If not, you can enable Unicorn's internal logging by building with cmake .. -DUNICORN_LOGGING=y

@Alexco500
Author

Okay, I have now tested dev as well, with the same results. But I found the issue: the entry in the LDT was wrong. So this is not a Unicorn issue.
I would still like some kind of feature/enhancement to easily trace CPU instructions and how execution proceeds.
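
Since the root cause was a wrong LDT entry, a quick way to sanity-check an entry is to encode it and decode it again. This is a minimal sketch of the legacy 8-byte segment descriptor layout; the helper names and the example values (base 0x40000, limit 0xFFFF, access byte 0xF3) are hypothetical and are not the actual entries from this setup.

```python
import struct


def pack_descriptor(base, limit, access, flags):
    """Encode a legacy 8-byte GDT/LDT segment descriptor."""
    raw = limit & 0xFFFF                 # limit bits 0-15
    raw |= (base & 0xFFFFFF) << 16       # base bits 0-23
    raw |= (access & 0xFF) << 40         # access byte (P, DPL, S, type)
    raw |= ((limit >> 16) & 0xF) << 48   # limit bits 16-19
    raw |= (flags & 0xF) << 52           # flags nibble (G, D/B, L, AVL)
    raw |= ((base >> 24) & 0xFF) << 56   # base bits 24-31
    return struct.pack("<Q", raw)


def unpack_descriptor(entry):
    """Decode the 8 bytes back into base, limit, access and flags."""
    (raw,) = struct.unpack("<Q", entry)
    return {
        "base": ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24),
        "limit": (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16),
        "access": (raw >> 40) & 0xFF,
        "flags": (raw >> 52) & 0xF,
    }


# Hypothetical 16-bit ring-3 data segment, for illustration only.
entry = pack_descriptor(base=0x40000, limit=0xFFFF, access=0xF3, flags=0x0)
decoded = unpack_descriptor(entry)
print(entry.hex(), hex(decoded["base"]), hex(decoded["limit"]))
```

Running unpack_descriptor on the bytes actually written to the LDT makes a wrong base or limit visible before the emulator starts.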

@wtdcode
Member

wtdcode commented May 10, 2025

> Okay, I have now tested dev as well, with the same results. But I found the issue: the entry in the LDT was wrong. So this is not a Unicorn issue. I would still like some kind of feature/enhancement to easily trace CPU instructions and how execution proceeds.

That's exactly what the UNICORN_LOGGING build option and the UNICORN_LOG_LEVEL environment variable do.
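
For a program that embeds a Unicorn build compiled with logging enabled, the environment variable has to be set before the process starts. A minimal sketch, assuming the variable is named UNICORN_LOG_LEVEL and that a mask of 0xFFFFFFFF enables all log categories (the value is an assumption, not something stated in this thread); unicorn_logging_env and ./my_emulator are hypothetical names.

```python
import os


def unicorn_logging_env(level="0xFFFFFFFF", base_env=None):
    """Return a copy of the environment with the log level set."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: the mask value; adjust it to the categories you need.
    env["UNICORN_LOG_LEVEL"] = level
    return env


env = unicorn_logging_env()
print(env["UNICORN_LOG_LEVEL"])
# The environment can then be passed to the emulator process, for example:
# subprocess.run(["./my_emulator"], env=env)  # hypothetical binary
```

Without a build configured with -DUNICORN_LOGGING=y, setting the variable alone is not expected to produce any output.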

@Alexco500
Author

Ouch, I forgot to set the environment variables correctly, sorry.
Now I get lots of debug output:

insn_idx=3 ---- 0000000000010006 0000000000000000
 1:  ext16u_i64 tmp2,rbp  dead: 1
 2:  add_i64 tmp2,tmp2,ss_base  dead: 1 2
 3:  ext32u_i64 tmp2,tmp2  dead: 1
 4:  qemu_ld_i64 tmp1,tmp2,leuw,1
 5:  movi_i64 tmp11,$0x128008e00
 6:  movi_i32 tmp14,$0x0
 7:  call check_exit_request_x86_64,$0x0,$0,tmp11,tmp14  dead: 0 1
 8:  movi_i64 tmp11,$0x2
 9:  add_i64 tmp2,tmp2,tmp11  dead: 1 2
 10:  ext32u_i64 tmp2,tmp2  dead: 1
 11:  qemu_ld_i64 tmp0,tmp2,leuw,1  dead: 1
 12:  movi_i64 tmp11,$0x128008e00
 13:  movi_i32 tmp14,$0x0
 14:  call check_exit_request_x86_64,$0x0,$0,tmp11,tmp14  dead: 0 1
 15:  mov_i32 tmp5,tmp0  dead: 1
 16:  movi_i32 tmp14,$0x0
 17:  movi_i64 tmp11,$0x9
 18:  call lcall_protected,$0x0,$0,env,tmp5,tmp1,tmp14,tmp11  dead: 0 1 2 3 4
 19:  call lookup_tb_ptr,$0x6,$1,tmp15,env  dead: 1
 20:  goto_ptr tmp15  dead: 0

*** TCG before optimization:
 0:  movi_i64 tmp11,$0x128008e00
 1:  movi_i32 tmp12,$0x0
 2:  call check_exit_request_x86_64,$0x0,$0,tmp11,tmp12

 insn_idx=0 ---- 000000001fff0037 0000000000000000
 1:  movi_i64 tmp0,$0x1
 2:  mov_i64 tmp1,rax mem_base=0x110048390 
 3:  movi_i32 tmp5,$0x1
 4:  mov_i32 tmp6,tmp1
 5:  call outb,$0x0,$0,env,tmp5,tmp6

I guess explanations of this output can be found in the QEMU docs?
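
As a reading aid for the first block of the dump, the TCG ops for call far [bp] can be mirrored step by step. This is a minimal sketch, not QEMU code: the SS base of 0x40000 and the memory contents are assumptions taken from the stack dump earlier in the thread, and far_call_operands is a hypothetical helper.

```python
def ext16u(x):
    return x & 0xFFFF          # zero-extend from 16 bits


def ext32u(x):
    return x & 0xFFFFFFFF      # zero-extend from 32 bits


def load16(mem, addr):
    """Little-endian 16-bit load, like qemu_ld_i64 with 'leuw'."""
    return mem[addr] | (mem[addr + 1] << 8)


def far_call_operands(rbp, ss_base, mem):
    tmp2 = ext16u(rbp)               # op 1: 16-bit addressing uses BP only
    tmp2 = ext32u(tmp2 + ss_base)    # ops 2-3: add the SS base
    tmp1 = load16(mem, tmp2)         # op 4: new offset
    tmp2 = ext32u(tmp2 + 2)          # ops 8-10: advance by two bytes
    tmp0 = load16(mem, tmp2)         # op 11: new selector
    return tmp0, tmp1                # handed to lcall_protected in op 18


# Bytes of the dword 0xffff0037 at 0x4202c, as shown in the stack dump.
mem = {0x4202C: 0x37, 0x4202D: 0x00, 0x4202E: 0xFF, 0x4202F: 0xFF}
selector, offset = far_call_operands(0x202C, 0x40000, mem)
print(f"{selector:04x}:{offset:04x}")
# prints: ffff:0037
```

The next block in the log starts at 0x1fff0037, which would be consistent with offset 0x37 in a segment whose base is 0x1fff0000; that base is an inference from the log, not something stated in the thread.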
