Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IL String Rendering and Auto-detection Disabled for Read+Write Sections/Segments #2971

Closed
0bs3n opened this issue Feb 15, 2022 · 6 comments
Closed
Assignees
Labels
Effort: Low Issue should take < 1 week Impact: Medium Issue is impactful with a bad, or no, workaround
Milestone

Comments

@0bs3n
Copy link

0bs3n commented Feb 15, 2022

Version and Platform (required):

  • Binary Ninja Version: 3.0.3233-stable, 3.0.3255-dev
  • OS: Manjaro Linux
  • OS Version: Up to date

Bug Description:
The detection and IL rendering of ascii strings is dependent on the read/write flags of the segment and section. As far as I can tell, if a segment is loaded with SegmentFlag.SegmentWritable from a view plugin, no string detection is performed, and no string rendering is seen in IL referencing those strings. This is not desirable for many file formats which do not feature a defined distinction between writable and read-only data, such as ELF's .data/.rodata distinction. For example, the bFLT (Binary Flat) format just has one big .data section, with writable and read-only data intermixed, and no markers indicating which is which. This behavior is maintained even if the string data variables are manually defined as char/const char */[].

Confusingly, this behavior regarding segments does not occur if the segment is not mapped at all in the view plugin, and instead is mapped manually from the console once the view (with all the other segments) has been loaded into the binaryninja UI. In this case, issuing a bv.add_user_segment or bv.add_auto_segment with SegmentFlag.SegmentWritable does result in ascii strings being rendered in IL, though the data variable types are not automatically set to const char[] as they would if the segment was loaded read-only from the view.

Furthermore, regardless of segment flags, if a section is defined such that is contains the ascii strings being referenced/rendered in the IL which has ReadWriteDataSectionSemantics, the rendering of the strings in IL is removed and the variables are instead displayed as &data_12345. This is true both when defining sections from a view plugin, as well as manually from the UI console. Also, this behavior holds true even when the ascii strings in question have been manually defined as char/const char */[].

Steps To Reproduce:
Reproduction of this issue is a little unwieldy, as it requires fiddling with segments/sections before/after the associated view plugin has complete initialization. The most easily reproducible PoC will be to use two plugins I've developed for the bFLT file format and Blackfin architecture, which can be found here (bFLT) and here (blackfin, and opening the binary file seen in these screenshots, which I can provide on request (github won't allow me to attach it to this issue)

  1. Open the binary which has the .data section and associated segment loaded read-write
  2. Navigate to the main function (0x10000134) and observe that calls to puts (0x10000170) take char * as arguments, but they are not rendered
  3. Manually set the types of the data variables being passed to puts to const char [], observe no change to IL even after reanalysis
  4. Remove the .data section with bv.remove_auto_section(".data")
  5. redefine the .data section with bv.add_user_section(".data", 0x100011a0, 0x1e0, SectionSemantics.ReadOnlyDataSectionSemantics)
  6. Observe that strings are now rendered properly, but control flow analysis has become incorrect due to interpreting data in that section as read-only when it is not.

Alternatively, for a generic binary/view:

  1. Open a binary in mapped view, and define segments and sections such that ascii strings being referenced by code (such as a call to puts) are within a read/write data section.
  2. Observe that no string rendering takes place.

Expected Behavior:
The same behavior which applies to read-only segments and sections for the detection and rendering of ascii strings in IL should apply to read/write segments and sections as well. Alternatively, if there is some reason this is not desired by the BN team, it should be possible to specify something like SectionSemantics.MixedDataSectionSemantics when defining a section to indicate to Binary Ninja that the section is writable and thus control flow decisions should not be made based on the contents of that section, but may still contain constant values for the sake of string rendering and similar functionality.

A scenario in which an ascii string is passed to a call to puts where that string is actually writable and cannot be determined at analysis time is obviously a possibility (though the practice of setting a char buffer to some valid string before changing it is uncommon), and in that case the rendering of the string in the function call would be incorrect. So perhaps this behavior as the default for ReadWriteSectionSemantics isn't the way to go, but there are plenty of file formats which do require this behavior and it should be possible to specify.

Screenshots:
Main function where a global in .data is compared to a constant, and one of two strings (also in .data) are printed depending on the result
20220215_11h36m48s_grim
Setting the .data section as ReadOnly does fix the string rendering, but now control flow is broken since the Binary Ninja interprets the value to compare as constant
20220215_11h36m15s_grim

Additional Information:
Possibly related to #2969

@plafosse plafosse added Impact: Medium Issue is impactful with a bad, or no, workaround and removed Impact: Medium Issue is impactful with a bad, or no, workaround labels Feb 21, 2022
@plafosse plafosse assigned plafosse and bpotchik and unassigned plafosse Feb 21, 2022
@bpotchik bpotchik added Effort: Low Issue should take < 1 week Impact: Medium Issue is impactful with a bad, or no, workaround labels May 12, 2022
@bpotchik bpotchik removed their assignment May 12, 2022
@plafosse plafosse added this to the 3.2 (Windows) milestone Jun 28, 2022
@0cyn
Copy link
Member

0cyn commented Aug 10, 2022

This should be resolved in builds >= 3.1.3617

@0cyn 0cyn closed this as completed Aug 10, 2022
@0bs3n
Copy link
Author

0bs3n commented Aug 10, 2022

Could you explain the resolution so I can test? I've just updated to 3.1.3617-dev and I don't see a change in behavior. Thanks!

@0cyn
Copy link
Member

0cyn commented Aug 10, 2022

Interesting, we now convert constants to constant pointers in MLIL if they point to any allocated address, which should be inlining it here; could you send the binary? (you should be able to upload it here if it's wrapped in a .zip file; if not, something like https://bashupload.com/ also works fine)

@0cyn 0cyn reopened this Aug 10, 2022
@0bs3n
Copy link
Author

0bs3n commented Aug 10, 2022

Sure, I've attached a zip including the binary in question and also the required plugins for analyzing the file in BN. libarch-blackfin.so and binaryninja-bflt are the plugins, and lighthouse the binary. I'm assuming here that the architecture shouldn't matter, but let me know if this isn't a suitable PoC.

In the binary, 0x10000fdc is main(), and the calls to sub_10000ce4 you can find there all take a char * as their first argument and are a good example of the behavior.

(If you'd rather not load an .so from a stranger you can pull and build the blackfin plugin from source here)

string_render_poc.zip

@0cyn
Copy link
Member

0cyn commented Aug 12, 2022

Alright, this should be actually fixed in builds >= 3.1.3624 😊

@0cyn 0cyn closed this as completed Aug 12, 2022
@0bs3n
Copy link
Author

0bs3n commented Aug 15, 2022

Thanks! I just want to note that after this change (which did effectively force the rendering of probable strings for read/write sections/segments) made it so that the bug described in #2969 (which originally only affected read-only sections/segments) now also affects read-write sections/segments, breaking rendering of structures or other types which happen to have a string at offset zero.

I've attached a bndb of the same "lighthouse" binary supplied previously which has structures defined to demonstrate the issue, as well as a plugin built for the most recent API version. bndb_and_plugin.zip

Here are some images as well:

Before the fix, see address 0x10000ff8
20220815_10h06m38s_grim

After the fix, see address 0x10000ff8
20220815_10h07m58s_grim

The fix here did address this issue so I think it should remain closed, and thanks very much for taking care of it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Effort: Low Issue should take < 1 week Impact: Medium Issue is impactful with a bad, or no, workaround
Projects
None yet
Development

No branches or pull requests

4 participants