Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function name annotation annotator implemented #123

Merged
merged 15 commits into from Jul 11, 2020
Merged

Function name annotation annotator implemented #123

merged 15 commits into from Jul 11, 2020

Conversation

NirmalManoj
Copy link
Member

@NirmalManoj NirmalManoj commented Jul 3, 2020

Detailed description
This PR implements an annotator that will annotate function names with a new type of annotation (R_CODE_ANNOTATION_TYPE_FUNCTION_NAME) introduced by PR #17204 in radare2. This will be required for implementing some features in the decompiler widget in Cutter.

I have also implemented offset annotation for the currently decompiled function's name in this PR.

...

Test plan

  1. Fetch PR #17204 in radare2 and compile it before compiling this.
  2. Make sure the output is working as expected from the JSON output (check r2 PR first for making sure JSON output is implemented correctly for function names).
  3. Check code and other things that should be checked.

...

Closing issues

...

@karliss
Copy link
Member

karliss commented Jul 3, 2020

Same as with r2 changes, and cutter should better go into decompiler-refactoring branch until the whole changes are more complete and API is stabilized.

@karliss
Copy link
Member

karliss commented Jul 3, 2020

Since json output should contains this you can easily create a test for this.

@karliss
Copy link
Member

karliss commented Jul 3, 2020

What kind of actions are you planning to implement using this annotation?

@NirmalManoj
Copy link
Member Author

NirmalManoj commented Jul 3, 2020

@karliss The next action I am going to add is an action to rename functions. This annotation is required for it. Other than this, I don't have anything else in mind where we will need this necessarily. But I believe it will be useful in some other place.

@karliss
Copy link
Member

karliss commented Jul 3, 2020

Take a look at help for afn command used to rename function. It is `afn[?] name [addr] rename name for function at address (change flag too)". It would be more reliable to use address of the function directly instead of the name. While the name can be evaluated to obtain address it's messy and you need the correct name. There is function name, real name, flag name, demangled name. The address of function referenced at position of code would also have other uses.

void AnnotateFunctionName(ANNOTATOR_PARAMS){
auto func_name = node.child_value();
Copy link
Member Author

@NirmalManoj NirmalManoj Jul 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not related to the code here, making this comment to start a nested discussion)

It would be more reliable to use address of the function directly instead of the name. While the name can be evaluated to obtain address it's messy and you need the correct name. There is function name, real name, flag name, demangled name. The address of function referenced at position of code would also have other uses.

I didn't think that renaming using the name would get messy. But initially, I wanted to have the address in the function name annotations, just so that it might become useful later. But unfortunately, we don't have this information in XML. So I am not sure how I could do it @karliss @thestr4ng3r

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/radareorg/radare2/blob/master/libr/include/r_anal.h#L277-L280
Does this name always refer to the 'function name' and not real name/flag name/demangled name.

Copy link
Member

@karliss karliss Jul 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The address can be clearly seen in the xml.

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@karliss I am unable to find it till now. We have the address of the current function in the XML(and I knew we could get that). But how can I find the addresses of all the functions in the decompiled code from the XML?

Copy link
Member

@thestr4ng3r thestr4ng3r Jul 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NirmalManoj Can you post an example xml here for reference?

Copy link
Member

@karliss karliss Jul 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a) How did you came to conclusion that IPTR_FSPEC is the correct address space kind?
b) did you to dump all the varnodes you get with their properties: type, whatever offset returns, the id displayed in xml, anything else potentially interesting by printing it to terminal?
c) The offsets in xml looked reasonable, take a look a t how xml is printed

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a) In space.hh in Ghidra,

/// \brief Fundemental address space types
///
/// Every address space must be one of the following core types
enum spacetype {
  IPTR_CONSTANT = 0,	       ///< Special space to represent constants
  IPTR_PROCESSOR = 1,	       ///< Normal spaces modelled by processor
  IPTR_SPACEBASE = 2,	       ///< addresses = offsets off of base register
  IPTR_INTERNAL = 3,	       ///< Internally managed temporary space
  IPTR_FSPEC = 4,	       ///< Special internal FuncCallSpecs reference
  IPTR_IOP = 5,                ///< Special internal PcodeOp reference
  IPTR_JOIN = 6		       ///< Special virtual space to represent split variables
};

From this and the code that I have read, I guessed IPTR_FSPEC will be the only spacetype for function calls. So I tried it.
c) I was trying to find a way to get the offsets without extracting it from the XML. I did check saveXml functions (but not ruling out the possibility that I won't get anything from it in future)
b) I haven't done all of it. I will try it now.

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used a debugger to see the offsets of all the varnodes inside ctx->func.vbank.loc_tree. I will attach the binary below. It has 25 nodes under varnodes tag in XML output and it displays the offsets of all those correctly except those that give us function offsets. (Binary)
( Attached image shows the location that I am talking about clearly)
Screenshot from 2020-07-07 18-48-29 Screenshot from 2020-07-07 18-53-32

It will be great if someone with experience of Ghidra API could help me in understanding it more. I have spent a lot of time in the past two days trying to figure this out. @karliss @thestr4ng3r

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have finally found a way to extract offsets reliably.

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@karliss @thestr4ng3r This PR is ready for more review.

@NirmalManoj
Copy link
Member Author

NirmalManoj commented Jul 5, 2020

global_variable.xml.txt
@thestr4ng3r Couldn't upload this file under that comment. After Karliss's comment, I did notice that the offsets are present under varnodes(but they are not directly connected by a reference), but I want help in parsing it.

@NirmalManoj
Copy link
Member Author

NirmalManoj commented Jul 7, 2020

a.zip
This is the binary mentioned in #123 (comment)

NirmalManoj added 4 commits Jul 7, 2020
1.formatted code
2. removed unnecessary parts
offset annotation for function name, replaced auto keyword, removed useless parts
// Code below makes an offset annotation for the function name(for the currently decompiled function)
RCodeAnnotation offsetAnnotation = {};
offsetAnnotation.type = R_CODE_ANNOTATION_TYPE_OFFSET;
offsetAnnotation.offset.offset = annotation.function_name.offset;
out->push_back(offsetAnnotation);
Copy link
Member Author

@NirmalManoj NirmalManoj Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the currently pushed commits, I have also implemented offset annotation for the currently decompiled function's name. See gif
offset_annotation_for_function_name

@NirmalManoj NirmalManoj changed the base branch from master to decompiler-refactoring Jul 8, 2020
return;
}
PcodeOp *op = opit->second;
FuncCallSpecs *call_func_spec = ctx->func->getCallSpecs(const_cast<PcodeOp *>(op));
Copy link
Member

@karliss karliss Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you decide to do a const cast here?

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FuncCallSpecs *getCallSpecs(const PcodeOp *op) const;
Because the argument is const PcodeOp. Now I think I should remove it.

Copy link
Member

@karliss karliss Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code should almost never contain contain const_casts. Valid uses cases for it are very rare. In this case you are not even changing the type using the cast. You might want to read a bit more about how const works.

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I understood why it's useless in this case.

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed const cast.

@NirmalManoj NirmalManoj changed the title function name annotation annotator implemented Function name annotation annotator implemented Jul 9, 2020
.travis.yml Outdated Show resolved Hide resolved
}
else
{
annotation.function_name.name = strdup("INVALID: Doesn't look like a function.");
Copy link
Member

@thestr4ng3r thestr4ng3r Jul 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just return without adding an annotation in these cases instead of adding an invalid one. Same for the other "INVALID" ones.

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change this as suggested

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made all the suggested changes.

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, I misread this suggestion. I will change it now. All the other changes are as suggested.

Copy link
Member Author

@NirmalManoj NirmalManoj Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Changed as suggested.

src/CodeXMLParse.h Outdated Show resolved Hide resolved
src/CodeXMLParse.cpp Outdated Show resolved Hide resolved
src/CodeXMLParse.cpp Outdated Show resolved Hide resolved
src/CodeXMLParse.cpp Outdated Show resolved Hide resolved
@NirmalManoj NirmalManoj requested a review from karliss Jul 10, 2020
@NirmalManoj NirmalManoj requested a review from thestr4ng3r Jul 10, 2020
NirmalManoj added 4 commits Jul 10, 2020
some still left, pushing to see result in CI
color coded output
@NirmalManoj
Copy link
Member Author

NirmalManoj commented Jul 10, 2020

@thestr4ng3r @karliss I have fixed tests. Jessie is still failing because of reasons I don't understand. I think this PR is good to merge now.

@thestr4ng3r
Copy link
Member

thestr4ng3r commented Jul 11, 2020

Looks like jessie is using r2 master. But that shouldn't be a blocker.

@thestr4ng3r thestr4ng3r merged commit afeca94 into rizinorg:decompiler-refactoring Jul 11, 2020
2 of 3 checks passed
@NirmalManoj NirmalManoj moved this from In progress to Done in Improving Decompiler Widget (GSoC) Jul 15, 2020
thestr4ng3r pushed a commit that referenced this pull request Aug 11, 2020
thestr4ng3r pushed a commit that referenced this pull request Aug 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants