demangle_ms bugs/improvements #1653
Comments
Thanks for all the details here. Demangler problems have actually been a common complaint. I'll start to look at these. |
Any chance of improved support being in the next release? I don't expect everything to be fixed perfectly at once, but even some slow improvements really would help out. |
Two more issues I've come across: Part of the name stored in type
Hidden retptr when returning structures
|
Another issue: Array extents demangled in reverse order
|
Another minor issue: int/long ambiguity. Because the demangler uses the fixed-width integer types, it loses the difference between
|
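To illustrate the int/long ambiguity described above, here is a minimal sketch. It assumes the standard MSVC type codes (`H` for `int`, `J` for `long`, and so on) and the fact that both types are 32 bits wide on Windows. `fixed_width_name` is a hypothetical stand-in for a lossy rendering and is not Binary Ninja's actual code.

```python
# MSVC mangling codes for a few built-in integer types.
# Each entry: code -> (C type name, width in bytes on Windows, is_signed)
MSVC_INT_CODES = {
    "F": ("short", 2, True),
    "G": ("unsigned short", 2, False),
    "H": ("int", 4, True),
    "I": ("unsigned int", 4, False),
    "J": ("long", 4, True),
    "K": ("unsigned long", 4, False),
    "_J": ("__int64", 8, True),
    "_K": ("unsigned __int64", 8, False),
}


def fixed_width_name(code):
    """Hypothetical lossy rendering: keep only width and signedness."""
    _, width, signed = MSVC_INT_CODES[code]
    prefix = "" if signed else "u"
    return f"{prefix}int{width * 8}_t"


def c_type_name(code):
    """Rendering that keeps the original C type name."""
    return MSVC_INT_CODES[code][0]


if __name__ == "__main__":
    for code in ("H", "J"):
        print(code, c_type_name(code), fixed_width_name(code))
```

Both `H` and `J` collapse to the same fixed-width name, so a symbol re-mangled from that rendering could not recover which one was originally used.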
Seems like demangling some special symbols broke at some point:
|
Creating a task list for these:
|
Here are a couple more I just found:
I'm not sure the PDB is correct about these:
|
Here's another one that is wrong and messy:
And one that binja doesn't get at all:
Checked using |
So I had a significant number of failures with names that weren't demangling, and I ran a variety of them through undname.exe from VS 2019. I'm not sure which parts of the mangled names aren't being handled and are actually useful, so I apologize for the size of this list. All of the following failed on 3.1.3718
|
|
I've fixed many of these issues as of 3.1.3761. I'm not closing this issue, but I am removing the milestone. |
The results are significantly better in the binaries I've been working with. I appreciate the time you've spent on it @plafosse. Here are another couple of examples that are problematic on an x86_64 binary compiled with Visual Studio: I picked these two because they're much shorter than many of the functions that I posted above, and fixing issues with these smaller symbols may have wider-reaching effects. |
As of 3.2.3963-dev, virtual offset thunk functions are also handled. These are generally of the form foo::bar`something{1234,5678}' |
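For anyone post-processing these names, here is a minimal sketch that splits a name of the form shown above into its parts. The regular expression and the `split_thunk_name` helper are assumptions made for illustration, based only on the example form in this comment, and they do not reflect how the demangler itself represents these thunks.

```python
import re

# Matches names of the form  base`kind{first,second}'
# e.g. foo::bar`something{1234,5678}'
THUNK_RE = re.compile(
    r"^(?P<base>.+?)`(?P<kind>[^`{']+)\{(?P<first>\d+),(?P<second>\d+)\}'$"
)


def split_thunk_name(name):
    """Return (base, kind, first, second), or None if the form does not match."""
    m = THUNK_RE.match(name)
    if m is None:
        return None
    return m["base"], m["kind"], int(m["first"]), int(m["second"])


if __name__ == "__main__":
    print(split_thunk_name("foo::bar`something{1234,5678}'"))
    print(split_thunk_name("foo::bar"))
```

Names that do not carry the backtick-and-braces suffix return `None`, so the helper can be applied to every symbol without filtering first.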
Here's another one from the Minecraft Bedrock server:
FWIW LLVM gets this one properly. Maybe we should investigate using their MS demangler at some point. |
As of builds 3.5.4276, names of the form |
Here are a few more from notepad.exe:
|
Addressed in 3.5.4468 though not exactly a demangler problem: some PDB symbol names start with a DEL ( |
Do you know the reason for that by any chance? If not does anyone else here know? |
Nope, I have no clue why they do that. This only happens on non-mangled names though. It looks like the various null thunk data symbols stored around the module imports are the only places that have this, so it may be related to that. I'm not about to dive into reversing msvc to see why it is generated though. |
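For older builds, or for scripts that read PDB names directly, a workaround for the DEL-prefixed names discussed above could look like the sketch below. It assumes the prefix is a single DEL character (0x7F) at the very start of the name. `strip_del_prefix` is a hypothetical helper, and the sample name is only an illustration of the null thunk data symbols mentioned here.

```python
DEL = "\x7f"  # ASCII DEL, assumed to appear once at the start of the name


def strip_del_prefix(name):
    """Return (clean_name, had_prefix) for a raw PDB symbol name."""
    if name.startswith(DEL):
        return name[len(DEL):], True
    return name, False


if __name__ == "__main__":
    # Illustrative name only; real symbols will vary per module.
    raw = DEL + "KERNEL32_NULL_THUNK_DATA"
    print(strip_del_prefix(raw))
    print(strip_del_prefix("main"))
```

Returning the flag alongside the cleaned name keeps the information that the prefix was present, in case it turns out to be meaningful.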
Here's another one. It looks like the backref is getting eaten somewhere and it cannot reference it:
|
Both of these failed to demangle in BNinja but were able to be demangled fine by undname
I was a bit surprised to see this one fail; it was mangled by MSVC 14.38
|
Now included, even if nobody here asked for it: bare names of the construction
I've only seen these used in RTTI TypeDescriptor name fields, so if you're working on an RTTI plugin this may be of use :) |
I really appreciate seeing movement in this area @CouleeApps :) |
As of 3.6.4615, a bug in template back-references has been fixed. This has fixed the following:
Couple slight deviations from MSVC/LLVM but I think they are just thisptr insertion and lack of type system support for public/protected/static status. |
That change improved things significantly, thanks a ton @CouleeApps :) Now, I really do hate to be that type of person but I found another symbol that undname.exe demangles that BNinja fails on...
|
A couple more complex cases for good measure :'(
->
and
->
Courtesy of GraalVM Native Image :( |
3.6.4629 added slightly more stuff for type info names since I found the code in LLVM that parses them and was able to expand our parser to cover the cases. I don't think anyone will actually find one of these, but technically now you can demangle const and volatile type names too, so |
Some basic names with the construction |
Where did you find mangled symbols that use this structure? |
There are a couple examples in this thread, and someone sent me a few |
Here's another one I've found: C++ type conversion operators:
Notably, the type name for the thisptr is an empty string. |
Binary Ninja Version: 2.0.2138-dev, d031c340
Platform: Windows 10 Version 1903
After using demangle_ms a lot lately, here's a list of problems I have come across:
Vftables have type_class TypeClass.NamedTypeReferenceClass
Multiple inheritance vftables produce invalid types and contain the parent in the type tokens instead of the name
Certain symbols contain the attributes in their type
Attributes (static, virtual, access modifier, etc.) could really do with being returned separately instead of discarding them or making them part of the type string.
Functions with no parameters have unnecessary void parameter
Functions not given a calling convention (#1390)
Non-static member functions are missing hidden 'this' parameter
Checking whether the symbol is a non-static function with an access modifier (public/protected/private) seems to be the best way to determine if the parameter is needed.
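The check described above can be sketched as follows. This is a rough heuristic that works on undname-style demangled text rather than on Binary Ninja's type objects. `needs_this_param` is a hypothetical helper, and the sample strings are illustrative, not taken from a real binary.

```python
import re

# An undname-style member declaration starts with its access modifier,
# e.g. "public: void __thiscall Foo::bar(int)".
ACCESS_RE = re.compile(r"^(public|protected|private):\s*(.*)$")


def needs_this_param(undname_text):
    """Guess whether a demangled symbol needs a hidden 'this' parameter."""
    m = ACCESS_RE.match(undname_text.strip())
    if m is None:
        return False  # no access modifier: not a class member
    rest = m.group(2)
    if "(" not in rest:
        return False  # no parameter list: a data member, not a function
    return not rest.startswith("static ")  # static members have no 'this'


if __name__ == "__main__":
    samples = [
        "public: void __thiscall Foo::bar(int)",
        "public: static int __cdecl Foo::make(void)",
        "void __cdecl free_func(void)",
    ]
    for text in samples:
        print(needs_this_param(text), text)
```

The heuristic depends on the attributes being available at all, which is one more reason to return them separately from the type string.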
Cannot demangle member function pointers
Not sure how feasible it is to create types for these since the size of member function pointers varies.
Incorrect demangled names