This issue was moved to a discussion.

How to make up for the problem that the decompiler cannot restore the assignment and judgment #5005

Closed
lddfg opened this issue Feb 18, 2023 · 2 comments
lddfg commented Feb 18, 2023

Describe the question
In a simple test case, I found that the decompiler dropped an assignment and a conditional ("judgment") statement. Why does this happen, and how can I avoid it?

To Reproduce
Use code like the following:

#include<stdio.h>

int main() {
    int a;
    scanf("%d", &a);
    if(a == 3) {
        a = 4;
    }
    printf("tet: %d\n", a);
    return 0;
}

Compile it:

gcc test.c -o test2

Import it into Ghidra and run auto-analysis.
ghidraIssue

Expected behavior
The decompiled code should contain the if branch and the assignment.
They are missing for the code above, and the assignments are also missing for code like the following.

#include<stdio.h>

int main() {
    int a = 3, b = 4, c = 5;
    printf("tet: %d\n", a);
    return 0;
}

The decompiled code will look like this:

undefined4 entry(void)

{
  int in_w1;
  
  __stubs::_printf("tet: %d\n",in_w1);
  return 0;
}

Environment (please complete the following information):

  • OS: [macOS 12.5]
  • Java Version: [openjdk 17.0.6]
  • Ghidra Version: [10.2.3]
  • Ghidra Origin: [GitHub releases]

Additional context
The computer is: [MacBook Air with M1 chip]

gcc --version
Apple clang version 13.1.6 (clang-1316.0.21.2.5)
Target: arm64-apple-darwin21.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

file test2
test2: Mach-O 64-bit executable arm64

Thanks.

@LukeSerne
Contributor

The decompiler optimises the if-check away because it thinks the if-check has no effect. It thinks this because the wrong calling convention is specified for the scanf and printf functions. From the assembly, it appears that the first argument (a pointer to the "format" string) is passed in x0, and the remaining arguments are passed on the stack. This seems similar to the calling convention described in #3927. This is a calling convention that Ghidra doesn't support or recognise by default, so you have to do some extra work to get the decompiler output to appear correctly.
You should go to the function address for printf or scanf and then right-click the function name in the decompiler window. Click Edit Function Signature. In the pop-up window, tick Use Custom Storage and manually set the correct storage (first argument goes into x0, the second argument goes into Stack[0x0]). If you do this for both printf and scanf, the decompiler output for the main function should be similar to the output in the screenshot below.


image
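To illustrate why the split between x0 and the stack matters: printf and scanf are variadic, so only the format string is a named parameter and everything after it is a variadic argument. As far as I understand Apple's arm64 ABI, named arguments go in registers while variadic arguments are placed on the stack, which matches what the assembly shows here. The sketch below uses a hypothetical helper, `sum_ints` (not part of the original program), to show which arguments fall into each group.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper for illustration only.
 * `count` is the single named parameter (passed like printf's format
 * pointer); every argument after it is variadic and is read through
 * va_arg. On Apple arm64 those variadic arguments are expected to be
 * on the stack (assumption based on Apple's ABI documentation). */
int sum_ints(int count, ...)
{
    assert(count >= 0);

    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);   /* fetch the next variadic int */
    }
    va_end(args);

    return total;
}
```

If the decompiler assumes that all of these arguments arrive in registers, it cannot see that the caller's stack stores feed the call, so it treats them as dead and removes them along with the if-check.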


Some notes:


Regarding your second example, I'd say that the decompiled code is correct. While the decompiled code does not contain the assignments to b and c, both of these variables are only assigned to and never used. This means that the compiler may have completely removed those assignments, making them impossible to reconstruct. Alternatively, it could be the decompiler that optimises the assignments out to make the decompiled code as readable as possible. While it might be nice in individual cases to still show those assignments, in general there are many more such assignments that the compiler might introduce. If the decompiler also showed those, the "real" operations the function performs would be needlessly obscured.
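A minimal sketch of the point about unused assignments, using hypothetical function names: the two functions below have the same observable behaviour, so either a compiler or a decompiler is free to turn the first into the second. Nothing here is taken from Ghidra's implementation; it only illustrates dead-store removal in plain C.

```c
#include <assert.h>

/* As in the second example: b and c are assigned but never read. */
int with_dead_stores(void)
{
    int a = 3, b = 4, c = 5;
    (void)b;   /* only silences unused-variable warnings */
    (void)c;
    return a;
}

/* The same function after the dead stores to b and c are removed. */
int without_dead_stores(void)
{
    int a = 3;
    return a;
}

/* Both versions return the same value, so no caller can tell them apart. */
void check_equivalent(void)
{
    assert(with_dead_stores() == without_dead_stores());
}
```

Because the stores to b and c never influence a return value, an argument, or memory that is read later, dropping them loses no information about what the function does.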


All in all, I don't think this issue should remain open in its current form. Maybe the issue could be rephrased as "It is not possible to set the custom storage using the 'Override signature' feature", although it might be cleaner to just open a new issue for that.

@lddfg
Author

lddfg commented Apr 23, 2023

Thank you for your response. As a beginner, I only recently learned about function signatures and didn't initially consider that it might be the issue.

Regarding the first example, I had already deleted the program and currently do not have an M1 Mac for testing (though I will test it in the future). Previously, I had tested it on my 32-bit Raspberry Pi system and the decompiled output was fine (as seen in the attached image), so at the time I thought it was an issue specific to the M1.
image

As for the second example, I find it unacceptable that the function creates temporary variables and uses them, yet the assigned values are discarded in the output. I think this is a loss of detail: when reading the decompiled code, I would have to go back to the assembly code to find the current value of a variable, which could be time-consuming.

Of course, I will verify the first question and follow up with the details as soon as possible. I would love to discuss the second question with you and explore ways to alleviate the issue.

@NationalSecurityAgency NationalSecurityAgency locked and limited conversation to collaborators Apr 25, 2023
@ryanmkurtz ryanmkurtz converted this issue into discussion #5264 Apr 25, 2023
@ryanmkurtz ryanmkurtz removed the Status: Triage Information is being gathered label Apr 25, 2023
