Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libinjector: skip kernel mode trap frame #1639

Closed

Conversation

BonusPlay
Copy link
Contributor

@BonusPlay BonusPlay commented Apr 9, 2023

Refer to CERT-Polska/drakvuf-sandbox#748 for more info why this is an issue.

@@ -292,6 +292,12 @@ static event_response_t wait_for_target_process_cb(drakvuf_t drakvuf, drakvuf_tr
goto done;
}

if (bp_addr & (1LL << 63))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer checking whether address is kernel-mode with VMI_GET_BIT(addr, 47)

@drakvuf-jenkins
Copy link
Collaborator

Can one of the admins verify this patch?

@tklengyel
Copy link
Owner

It looks like the function where you are adding this check will need to be broken up into smaller chunks to reduce its cognitive complexity.

@BonusPlay
Copy link
Contributor Author

BonusPlay commented May 3, 2023

The threshold is way too low. I'd rather write custom system for applying patches to drakvuf on drakvuf-sandbox side and not upstream fixes rather than refactor each end every function to get below "25 cognitive complexity".

@BonusPlay BonusPlay closed this May 3, 2023
@BonusPlay BonusPlay deleted the fix/ignore-kernel-trap-frame branch May 3, 2023 19:40
@tklengyel
Copy link
Owner

That's disappointing considering the treshold is the standard used by clang-tidy and you would only have to refactor the function you are changing. But hey, you do you, up to you to maintain and test your fork.

@BonusPlay
Copy link
Contributor Author

Such techniques only check for language complexity, without any understanding of underlying logic. You expecting contributors to refactor entire libinjector for windows to comply with random number when fix is a documented single line fix?

> ../src/libinjector/debug_helpers.c:110:6: warning: function 'print_hex' has cognitive complexity of 28 (threshold 25) [readability-function-cognitive-complexity]

Feel free to ping me when someone does refactor print_hex because it's overly complex.

@tklengyel
Copy link
Owner

entire libinjector

I don't know where you are getting this from, only a single function needs to be reworked in a fairly straight forward way - taking parts of it and moving to standalone functions. But you seem to be very frustrated with the fact that there is a CI in your way and because of that don't see that the changes required to get passed it are both reasonably small and also would help you in the long term.

@BonusPlay
Copy link
Contributor Author

BonusPlay commented May 3, 2023

only a single function needs to be reworked

No, not really. I understand that you want to to use active contributors to "reduce code complexity" when they commit new changes to drakvuf, but this is wrong way to do it.

  1. complexity checks do not understand code complexity, readability nor maintainability. Yes, I could split this function into 2 smaller ones, commit, check if I managed to fit under arbitrary number, and if not - forcefully split code into illogical subfunctions. Why not include a CI check for max number of lines a function can have? How about 8. Why 8? Noone knows...
  2. even if I split this function into 2 smaller ones and fit under arbitrary number CI said, it would leave this part of the code less readable, because it would stand out from rest of the functions. Great way to make code less maintainable. The proper way would be to start refactoring the module as a whole.
  3. I'm sure we won't introduce ANY bugs by modifying complex C functions which use goto as their guarantee of not-leaking stuff.
  4. I'm sure helper 5LoC functions which take absolutely random arguments (variables from start of the function) and are used in only one place (well, duh) make code more readable.
  5. Don't event get me started on reducing code complexity in major callback functions (complexity 471). Those functions are probably never to be touched again, because otherwise you're up for additional +20h of refactoring. Even if you want to fix 1 line.

Honestly this feels like a "management newest idea". Let's introduce this arbitrary number and track if it goes higher (or lower in this case) without looking at consequences. But hey, if the number goes down it has to be good, right?

@tklengyel
Copy link
Owner

tklengyel commented May 4, 2023

I understand you have a strong belief that this check to be arbitary but you are wrong. Yes, the complexity threshold of 25 is a number picked out of a hat but if you read the whitepaper you realize it is actually a very reasonable and sufficiently high margin.

You are making guesses when you say splitting complex functions into smaller ones will introduce bugs. Also why would it increase maintainence cost? Smaller functions are easier to understand thus maintain. I did the math on over a 1000 repositories on github and each percent increase of complex code introduces ~1.8 new bugs (using clang as baseline for findig bugs). So unless you can back up your claim with data I will just say stop guessing. My data is available here https://intel.github.io/srs/scans/2023.04.18-4669109317.

So, prove me wrong! Show me the data that backs up your claims. If you just want to argue that the CI is making you do work you don't wanna do, fair enough, that is true. You understand the logic behind it to do this incrementally as people are working on the code base. So the CI check is staying unless a critical vulnerability necessitates patching something fast.

@psrok1
Copy link
Contributor

psrok1 commented Aug 14, 2023

@tklengyel @BonusPlay: I see that you guys have really fascinating discussion, but bug is still there. I tried to setup Drakvuf this week and got the same timeouts on injector. In the same time, new possibly related issues are bubbling up: CERT-Polska/drakvuf-sandbox#808. After applying my patch, problem seems to be fixed.

@tklengyel I agree that function is too complex and I guess it would be nice to split code from if (!injector->is32bit) condition into separate functions. On the other hand, I see that there are comments about "unknown reasons" for using different methods for 64-bit and 32-bit that could be caused by another simple bug and split won't be necessary:

     * This method is workable on 32-bit Windows as well but finding the trapframe
     * sometimes fail for yet unknown reasons.

So I would like to split my work into two simple steps:

  1. Merge a simple patch that fixes a problem (I can open another PR if @BonusPlay doesn't like to discuss it)
  2. Do the rework on this function in another PR that allows to use single method for making a trap in user-mode. If it's really impossible, I would do a refactor to make this function less complex.

Let me know what you think about it 😉

@tklengyel
Copy link
Owner

@psrok1 sounds reasonable to me

@psrok1
Copy link
Contributor

psrok1 commented Sep 12, 2023

Ok, I came back from holiday, made some debugging and looked for that "unknown reason" mentioned in comment. So as we already know: wait_for_target_process_cb uses int3 trap on KTRAP_FRAME.Rip for x64 Windows and memtraps on all user-only pages for x86 Windows (https://github.com/tklengyel/drakvuf/blob/main/src/libinjector/win/win_injector.c#L315)

I checked why the first method doesn't work for x86 as well and the reason is that syscalls on x86 Windows are made through system call dispatcher ntdll!KiFastSystemCall that puts ring 3 return address in EDX and makes a syscall via SYSENTER instruction (https://www.geoffchappell.com/studies/windows/km/cpu/sep.htm)

But it appears that syscall doesn't return directly to EDX but to the instruction after SYSENTER labeled as ntdll!KiFastSystemCallRet and that place appears every time in KTRAP_FRAME.Eip. As it is a single point where all system calls returns, if we put int3 trap there, it will catch all system call exits including syscalls made by injector methods themselves, which probably wasn't intended.

So current injection handler for x86 traps on 0x77576a04 which in my case is ret instruction from ntdll!NtWaitForMultipleObjects:

1694523161.347977 CR3 cb on vCPU 0: 0xbdece240
1694523161.348114 CR3 cb on vCPU 0: 0xbdece0a0
1694523161.348176 CR3 cb on vCPU 0: 0x185000
1694523161.348228 CR3 cb on vCPU 0: 0xbdece200
1694523161.348285 INT3 Callback @ 0x77576a04. CR3 0xbdece200. vcpu 0. TID 868
1694523161.348293 INT3 received but '\Device\HarddiskVolume2\Windows\System32\svchost.exe' PID (860) doesn't match target process (1588)
1694523161.348308 Switching altp2m and to singlestep on vcpu 0
1694523161.348363 reset trap on vCPU 0, switching altp2m 0->1
1694523161.348417 CR3 cb on vCPU 0: 0xbdece3c0
1694523161.348502 INT3 Callback @ 0x77576a04. CR3 0xbdece3c0. vcpu 0. TID 2544
1694523161.348510 Writing to allocated virtual memory to allocate physical memory..
1694523161.348515 Payload is at: 0x78f0000
1694523161.348744 Switching altp2m and to singlestep on vcpu 0
1694523161.348785 reset trap on vCPU 0, switching altp2m 0->1
1694523161.349535 INT3 Callback @ 0x77576a04. CR3 0xbdece3c0. vcpu 0. TID 2544
1694523161.349544 Expanding shell...
1694523161.349758 Switching altp2m and to singlestep on vcpu 0
1694523161.349801 reset trap on vCPU 0, switching altp2m 0->1
1694523161.349848 INT3 Callback @ 0x77576a04. CR3 0xbdece3c0. vcpu 0. TID 2544
1694523161.349856 Env expand status: 1e
1694523161.350473 Reading expanded variable
1694523161.350496 Converting target to UTF-8
1694523161.350505 Expanded: C:\Windows\system32\ntdll.dll
1694523161.350511 Opening file...
1694523161.351629 Switching altp2m and to singlestep on vcpu 0
1694523161.351663 reset trap on vCPU 0, switching altp2m 0->1
1694523161.351940 INT3 Callback @ 0x77576a04. CR3 0xbdece3c0. vcpu 0. TID 2544
1694523161.351950 File create result b74
1694523161.351998 Reading file...
1694523161.352188 Switching altp2m and to singlestep on vcpu 0
1694523161.352231 reset trap on vCPU 0, switching altp2m 0->1
1694523161.352898 INT3 Callback @ 0x77576a04. CR3 0xbdece3c0. vcpu 0. TID 2544
1694523161.352908 File read result: 1
1694523161.352914 Reading Payload chunk
1694523161.356415 Reading next chunk
1694523161.356629 Switching altp2m and to singlestep on vcpu 0
1694523161.356671 reset trap on vCPU 0, switching altp2m 0->1
1694523161.356988 INT3 Callback @ 0x77576a04. CR3 0xbdece3c0. vcpu 0. TID 2544
1694523161.356997 File read result: 1
1694523161.357003 Reading Payload chunk
1694523161.357377 Reading next chunk
1694523161.357572 Switching altp2m and to singlestep on vcpu 0
1694523161.357615 reset trap on vCPU 0, switching altp2m 0->1
1694523161.357684 INT3 Callback @ 0x77576a04. CR3 0xbdece3c0. vcpu 0. TID 2544
1694523161.357692 File read result: 1
1694523161.357698 Reading Payload chunk
1694523161.357905 Finishing
1694523161.358014 Switching altp2m and to singlestep on vcpu 0
1694523161.358052 reset trap on vCPU 0, switching altp2m 0->1
1694523161.358127 INT3 Callback @ 0x77576a04. CR3 0xbdece3c0. vcpu 0. TID 2544
1694523161.358135 Close handle RAX: 0x1
1694523161.358160 File operation executed OK
1694523161.358212 Removing breakpoint trap from 0x69edda04.
1694523161.358247 Removed memtrap for GFN 0x69edd in altp2m view 1
1694523161.358258 Removed memtrap for GFN 0xff00c in altp2m view 1
1694523161.358281 DRAKVUF polling loop finished
1694523161.358287 Finished injection loop
{"Plugin": "inject", "TimeStamp": "1694523161.358299", "Method": "ReadFile", "Status": "Success", "ProcessName": "C:\\Windows\\system32\\ntdll.dll", "Arguments": "", "InjectedPid": 0, "InjectedTid": 0}

but if we unify code and use method from x64, it traps on 0x775770b4 (KiFastSystemCallRet) which causes PrematureBreak due to failing step methods:

1694523579.989903 Breakpoint VA 0x775770b4 -> PA 0x69fde0b4
1694523579.989923 Physmap populated? 1
1694523579.989960 Copied trapped page to new location
1694523579.989963 Activating remapped gfns in the altp2m views!
1694523579.990002         Trap added @ PA 0x69fde0b4 RPA 0xff02a0b4 Page 434142 for entry.
1694523579.990018 Got return address 0x775770b4 from trapframe and it's now trapped!
1694523579.990099 INT3 Callback @ 0x775770b4. CR3 0xbdece3c0. vcpu 0. TID 2544
1694523579.990106 Target TID not provided by the user, pinning TID to 2544
1694523579.990109 Saving registers
1694523579.990310 Switching altp2m and to singlestep on vcpu 0
1694523579.990359 reset trap on vCPU 0, switching altp2m 0->1
1694523579.990469 INT3 Callback @ 0x775770b4. CR3 0xbdece3c0. vcpu 0. TID 2544
1694523579.990478 Writing to allocated virtual memory to allocate physical memory..
1694523579.990480 Payload is at: 0x0
1694523579.990654 Switching altp2m and to singlestep on vcpu 0
1694523579.990707 reset trap on vCPU 0, switching altp2m 0->1
1694523579.990808 INT3 Callback @ 0x775770b4. CR3 0xbdece3c0. vcpu 0. TID 2544
1694523579.990814 Expanding shell...
1694523579.991002 Switching altp2m and to singlestep on vcpu 0
1694523579.991046 reset trap on vCPU 0, switching altp2m 0->1
1694523579.991104 INT3 Callback @ 0x775770b4. CR3 0xbdece3c0. vcpu 0. TID 2544
1694523579.991110 Failed to expand environment variables!
1694523579.991112 Exiting prematurely
1694523579.991168 Removing breakpoint trap from 0x69fde0b4.
1694523579.991189 Removed memtrap for GFN 0x69fde in altp2m view 1
1694523579.991195 Removed memtrap for GFN 0xff02a in altp2m view 1
1694523579.991224 DRAKVUF polling loop finished
1694523579.991226 Finished injection loop
1694523579.991236 Injection premature break
1694523579.991256 Finished with injection. Ret: 0.
1694523579.991259 Injector freed
1694523579.991277 close_vmi starting
1694523580.000980 close_vmi finished
{"Plugin": "inject", "TimeStamp": "1694523579.991239", "Method": "ReadFile", "Status": "PrematureBreak"}

The only thing I don't understand is why KiFastSystemCallRet is actually worse place for making another injection step than ret instruction from NtWaitForMultipleObjects. Let's look at handle_readfile_x64 (which actually works for x86 as well):

switch (base_injector->step)
{
case STEP1: // allocate virtual memory
{
// save registers
PRINT_DEBUG("Saving registers\n");
memcpy(&injector->x86_saved_regs, info->regs, sizeof(x86_registers_t));
if (!setup_virtual_alloc_stack(injector, info->regs))
{
PRINT_DEBUG("Failed to setup virtual alloc for passing inputs!\n");
return cleanup(drakvuf, info);
}
info->regs->rip = injector->exec_func;
return VMI_EVENT_RESPONSE_SET_REGISTERS;
}

STEP1 calls VirtualAlloc by preparing arguments on stack and setting EIP and ESP. Then I guess we're resuming execution and waiting for another trap on that thread.

case STEP2: // write payload to virtual memory
{
// any error checks?
PRINT_DEBUG("Writing to allocated virtual memory to allocate physical memory..\n");
injector->payload_addr = info->regs->rax;
PRINT_DEBUG("Payload is at: 0x%lx\n", injector->payload_addr);
if (!setup_memset_stack(injector, info->regs))
{
PRINT_DEBUG("Failed to setup memset stack for passing inputs!\n");
return cleanup(drakvuf, info);
}
info->regs->rip = injector->memset;
return VMI_EVENT_RESPONSE_SET_REGISTERS;
}

and we're landing in STEP2 where we're setting up memset based on returned address from VirtualAlloc via EAX.

Finally there is my question: I understand why KiFastSystemCallRet doesn't work, as it traps probably just after in NtAllocateVirtualMemory syscall that returns address via argument. Expected behavior is to have a trap after proper translation made by VirtualAlloc that returns address by EAX. But why the trap on completely unrelated function NtWaitForMultipleObjects works? Is there any additional trap I'm missing or context is somehow preserved? @BonusPlay @tklengyel @manorit2001

@manorit2001
Copy link
Contributor

Hi @psrok1,

2 cents..

From the blog article that you have shared about KiFastSystemCall, it does look like it ain't reliable that much given in the blog itself they've written, I might be wrong but it does seem like they are not guaranteeing the return path from KiFastSystemCall.

Note that KiFastSystemCall is not required to use SYSENTER, and that even if it does use SYSENTER, the kernel does not necessarily return by executing SYSEXIT. What concerns the kernel is only that if it is entered at the address it has programmed into the machine-specific registers as the ring 0 target address for SYSENTER, then it returns to user mode at whatever address is in the SystemCallReturn member. How it gets there is nobody’s business but the kernel’s. If it wants to get there by executing an IRET, it may.

Regarding your mentioned point -

But it appears that syscall doesn't return directly to EDX but to the instruction after SYSENTER labeled as ntdll!KiFastSystemCallRet and that place appears every time in KTRAP_FRAME.Eip. As it is a single point where all system calls returns, if we put int3 trap there, it will catch all system call exits including syscalls made by injector methods themselves, which probably wasn't intended.

We do want to catch the calls made by injector themselves as that is how injector functions. You do a syscall, now to get the status of the syscall that happened you need to hit a trap right after the syscall, if the trap that you have set ain't reliable as it seems like in-case of x86 trap being set by trying to hook onto ntdll!KiFastSystemCallRet, then you won't hit the trap right after the syscall, when the trap is hitting later then you might be in some different context altogether causing those premature breaks in the injector.

From the looks of it when you are setting a trap in (NtWaitForMultipleObjects) instead of the trap in kernel syscall ret in x86, we can be sure that the call does return to that specific trap and the injector is correctly trapping right after the syscall returns to the it.

@psrok1
Copy link
Contributor

psrok1 commented Sep 13, 2023

Ok, I found what I was missing 🤔

https://github.com/tklengyel/drakvuf/blob/main/src/libinjector/injector_stack.c#L503-L507

Return address is set along with arguments, soVirtualAlloc is returning back to the initial trap (in my case somewhere in NtWaitForMultipleObjects). If we get any additional, unexpected trap within apicall made by injector, injection process will break.

@manorit2001 Thanks for explanation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants