Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HLIL optimization is aggressive. #3049

Closed
Stolas opened this issue Mar 28, 2022 · 11 comments
Closed

HLIL optimization is aggressive. #3049

Stolas opened this issue Mar 28, 2022 · 11 comments
Labels
Component: Core Issue needs changes to the core Core: HLIL Issue involves High Level IL Effort: Medium Issue should take < 1 month Impact: Medium Issue is impactful with a bad, or no, workaround Type: Bug Issue is a non-crashing bug with repro steps

Comments

@Stolas
Copy link
Contributor

Stolas commented Mar 28, 2022

Binary Ninja version 3.0.3233 Personal Build e250f0a3.

Rootme System 1:

#include <unistd.h>
#include <sys/types.h>
#include <stdlib.h>
#include <stdio.h>
 
int main()
{
 
  int var;
  int check = 0x04030201;
  char buf[40];
 
  fgets(buf,45,stdin);
 
  printf("\n[buf]: %s\n", buf);
  printf("[check] %p\n", check);
 
  if ((check != 0x04030201) && (check != 0xdeadbeef))
    printf ("\nYou are on the right way!\n");
 
  if (check == 0xdeadbeef)
   {
     printf("Yeah dude! You win!\nOpening your shell...\n");
     setreuid(geteuid(), geteuid());
     system("/bin/bash");
     printf("Shell closed! Bye.\n");
   }
   return 0;
}

Results in the following HLIL:

08048546  int32_t main(int32_t argc, char** argv, char** envp)

0804854d      void* const var_4 = __return_addr
08048555      int32_t* var_14 = &argc
0804857d      void var_4c
0804857d      fgets(buf: &var_4c, n: 0x2d, fp: *stdin)
08048593      printf(format: "\n[buf]: %s\n", &var_4c)
080485a8      printf(format: "[check] %p\n", 0x4030201)
0804863b      return 0
@Stolas
Copy link
Contributor Author

Stolas commented Mar 28, 2022

ch13.zip

@fuzyll fuzyll added Core: HLIL Issue involves High Level IL Type: Bug Issue is a non-crashing bug with repro steps labels Mar 28, 2022
@Stolas
Copy link
Contributor Author

Stolas commented Mar 28, 2022

Fyi expected output (IDA)

int __cdecl main(int argc, const char **argv, const char **envp)
{
  __uid_t v3; // esi
  __uid_t v4; // eax
  char s[40]; // [esp+0h] [ebp-44h] BYREF
  int v7; // [esp+28h] [ebp-1Ch]
  int *p_argc; // [esp+38h] [ebp-Ch]

  p_argc = &argc;
  v7 = 67305985;
  fgets(s, 45, stdin);
  printf("\n[buf]: %s\n", s);
  printf("[check] %p\n", v7);
  if ( v7 != 67305985 && v7 != -559038737 )
    puts("\nYou are on the right way!");
  if ( v7 == -559038737 )
  {
    puts("Yeah dude! You win!\nOpening your shell...");
    v3 = geteuid();
    v4 = geteuid();
    setreuid(v4, v3);
    system("/bin/bash");
    puts("Shell closed! Bye.");
  }
  return 0;
}

@plafosse plafosse added Effort: Medium Issue should take < 1 month Impact: Medium Issue is impactful with a bad, or no, workaround Component: Core Issue needs changes to the core labels Apr 4, 2022
@themaks
Copy link

themaks commented Apr 8, 2022

In your binary, if you declare var_4c as a char[0x2d] (which it kind of is according to the fgets n argument), the HLIL looks like what you expected.

image

Without this, there would have been no cross reference to the "check" variable, so it seems pretty legitimate to me that BN treats it as constant and eliminates dead branches.

PS: By the way, an optimizing compiler would perform the same optimizations, since buffer overflows are undefined behavior. Here is your code compiled with -O3:
image

@Stolas
Copy link
Contributor Author

Stolas commented Apr 8, 2022

Reply, you are missing the point of the issue in question.

IDA finds the var_4c size without any interaction, Binary Ninja does not.
That is all.

@psifertex
Copy link
Member

If the only difference in analysis is the creation of a stack array then this is a duplicate of #1637 so it's a useful comment, thanks! I will double check and if I can confirm I will close this after this comment links it to the other issue as extra samples are always helpful but we wouldn't need an extra issue.

@Stolas
Copy link
Contributor Author

Stolas commented Apr 8, 2022

Well its related for sure, yet assigning it a buffer size of 0x2d (45) would be incorrect as the buffer size of 0x28 (40).
When setting to 0x28 HLIL is still to aggressive with filtering the 'unused' values.

08048546  int32_t main(int32_t argc, char** argv, char** envp)

0804854d      void* const var_4 = __return_addr
08048555      int32_t* var_14 = &argc
0804857d      char var_4c[0x28]
0804857d      fgets(buf: &var_4c, n: 0x2d, fp: *stdin)
08048593      printf(format: "\n[buf]: %s\n", &var_4c)
080485a8      printf(format: "[check] %p\n", 0x4030201)
0804863b      return 0

@themaks
Copy link

themaks commented Apr 8, 2022

The thing is the source code is "incorrect" from the start, so we could argue that the buffer size is in fact 45, not 40, since it is the size used by fgets. There is not a unique way to "correcty" decompile the program.

I guess if we compile the following program, we would have more or less the same binary, thus only having a 45 bytes buffer :

#include <unistd.h>
#include <sys/types.h>
#include <stdlib.h>
#include <stdio.h>
#define CHECK *(int*)&buf[40]

int main()
{
 
  int var;
  char buf[45];

  fgets(buf,45,stdin);
 
  printf("\n[buf]: %s\n", buf);
  printf("[check] %p\n", CHECK);
 
  if ((CHECK != 0x04030201) && (CHECK != 0xdeadbeef))
    printf ("\nYou are on the right way!\n");
 
  if (CHECK == 0xdeadbeef)
   {
     printf("Yeah dude! You win!\nOpening your shell...\n");
     setreuid(geteuid(), geteuid());
     system("/bin/bash");
     printf("Shell closed! Bye.\n");
   }
   return 0;
}

Personally, I think it is a good thing that HLIL eliminates dead branches "aggressively" the way it does, to me it is one aspect where BN is better than IDA, because the control flow presented in the decompiled view directly reflects the information provided by the user (i.e. user variable values, or in your case, variable types and sizes).

EDIT: however, like you said, the problem is that BN does not come up with any size for the var_4c during analysis, despite its use in fgets for instance.

@plafosse
Copy link
Member

plafosse commented Apr 8, 2022

Additionally it would be nice if we offered some way to allow people to see that branches have been eliminated. Also it would be nice if we supported "volatile" for stack variables to tell HLIL that it should assume that value can change at anytime.

@CouleeApps
Copy link
Member

Issue tracking dead branch display: #2219

@plafosse
Copy link
Member

plafosse commented Jan 5, 2024

This issue has been largely addressed due to improved stack variable aliasing. The detection of the non-constant stack arrays still exists but the original purpose of this issue has been addressed. This is the current output of Binary Ninja
image

@plafosse plafosse closed this as completed Jan 5, 2024
@plafosse
Copy link
Member

plafosse commented Jan 5, 2024

Follow: #2570 for updates on stack array creation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Core Issue needs changes to the core Core: HLIL Issue involves High Level IL Effort: Medium Issue should take < 1 month Impact: Medium Issue is impactful with a bad, or no, workaround Type: Bug Issue is a non-crashing bug with repro steps
Projects
None yet
Development

No branches or pull requests

6 participants