Ghidra does not correctly import dylibs from dyld shared caches #682

Closed
pwmoore opened this issue Jun 12, 2019 · 15 comments

@pwmoore

pwmoore commented Jun 12, 2019

Describe the bug
iOS devices ship with all shared libraries (dylibs) in a unified cache. Ghidra seems to offer the feature to import individual dylibs from a shared cache, but does not seem to parse them correctly.

To Reproduce
Steps to reproduce the behavior:

  1. Open a new project
  2. Choose "File->Import File"
  3. Select the dyld_shared_cache_arm64[e] file.
  4. Choose File System.
  5. Select the dylib to import (e.g. Foundation)
  6. Continue normal import steps

Expected behavior
IDA Pro successfully imports a dylib from a shared cache as if it were an individual dylib file. It correctly locates the Mach-O header and processes the dylib.

Screenshots
The first screenshot shows IDA 7.2's listing of Foundation.framework from an iPhone XS 12.3.1 shared cache. The cache can be retrieved from the IPSW here.

Screen Shot 2019-06-11 at 11 10 15 PM

Note that IDA has correctly located the dylib's Mach-O header and parsed it correctly. However, in Ghidra, the load address isn't a Mach-O header at all.
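
For context, a loader typically finds a dylib's Mach-O header inside the cache by translating the image's virtual address into a file offset through the cache's mapping table. The following is a minimal Python sketch of that translation, assuming the `dyld_cache_mapping_info` fields (address, size, fileOffset) from Apple's dyld_cache_format.h. The `Mapping` type, the function name, and the mapping values are hypothetical and are not read from the cache in this report; this is not Ghidra's or IDA's implementation.

```python
from typing import NamedTuple, Sequence


class Mapping(NamedTuple):
    """One cache mapping: a VM range backed by a file range (assumed layout)."""
    address: int      # VM address where the region is mapped
    size: int         # length of the region in bytes
    file_offset: int  # offset of the region inside the cache file


def vm_to_file_offset(mappings: Sequence[Mapping], vm_addr: int) -> int:
    """Translate a VM address to a cache file offset via the mapping table."""
    for m in mappings:
        if m.address <= vm_addr < m.address + m.size:
            return vm_addr - m.address + m.file_offset
    raise ValueError(f"address {vm_addr:#x} is not covered by any mapping")


# Hypothetical mapping table, for illustration only.
EXAMPLE_MAPPINGS = [
    Mapping(address=0x180000000, size=0x40000000, file_offset=0x0),
    Mapping(address=0x1C0000000, size=0x10000000, file_offset=0x40000000),
]
```

If the import starts reading at the wrong offset, for example by treating the VM address as a file offset, the bytes at the load address will not be a Mach-O header, which would be consistent with the symptom described above.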

Screen Shot 2019-06-12 at 9 15 41 AM

In the next two screenshots I am trying to show the NSLog() function. The correct function is displayed in IDA here:

Screen Shot 2019-06-12 at 9 21 40 AM

But Ghidra has just placed the NSLog() label in the middle of some other code.

Screen Shot 2019-06-12 at 9 21 18 AM

Environment (please complete the following information):

  • OS: macOS 10.14.5
  • Java Version: 11.0.3
  • Ghidra Version: 9.0.4

pwmoore added the "Type: Bug" label Jun 12, 2019
ryanmkurtz self-assigned this Jun 12, 2019

@ryanmkurtz
Collaborator

Thanks for reporting this. It's something we are aware of and it's in the queue. I chose to implement a new loader for the entire DYLD cache first, since things like jtool do a good job of extracting DYLIBs, which can then be fed into Ghidra. The new loader will be released in Ghidra 9.1, but it's currently in master if you want to try it out...I would love some feedback. Its main limitation right now is speed, since there are so many embedded DYLIBs to analyze.

I do plan on investigating this issue though.

@pwmoore
Author

pwmoore commented Jun 12, 2019

Awesome, I'll try and get the master branch going sometime this week and let you know how it goes.

@pwmoore
Author

pwmoore commented Jun 24, 2019

So I started playing around a little with the master branch and loading an entire dyld cache, and I have to say I'm impressed. It managed to load the entire cache... I had to first bump up the amount of memory, and as expected it's a little slow, but it did it, which is something IDA can't do. :) I'm still playing around with it, but I did notice that it doesn't seem to be picking up symbol names. See the attached screenshot for NSLog (the same function as above). Should I open a new issue for this or any other issues with the master branch?

Thanks for the awesome work so far on this, I'm excited to see where you guys take it.

Screen Shot 2019-06-24 at 1 29 46 AM

@ryanmkurtz
Collaborator

ryanmkurtz commented Jun 24, 2019

Thanks for the positive feedback! Sure, I think new issues that pertain to the DYLD loader would be best...we can keep this issue about the DYLD filesystem and extracting individual DYLIBs from it.

Regarding the symbols though...there can be so many that it takes a really long time for the symbol tree and symbol table to display them. It looks like your symbol tree is currently trying to filter them. But if the function you are on should currently be labeled as NSLog, that is a new issue that I can look into.

@ryanmkurtz
Collaborator

Hmmm, all the symbol addresses seem to have been truncated to 32 bits. When I fix that, hopefully it will fix the behavior you are seeing.
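
To illustrate the kind of truncation described here: arm64 cache addresses sit above 4 GiB, so keeping only the low 32 bits moves a symbol to an unrelated location. The following is a small Python sketch, not Ghidra's code; the helper names are made up, and the address is only an example value (it matches the __TEXT vmaddr in the hexdump posted further down this thread).

```python
MASK_32 = 0xFFFFFFFF


def truncate_to_32_bits(addr: int) -> int:
    """Keep only the low 32 bits, as a 64-to-32-bit narrowing would."""
    return addr & MASK_32


def fits_in_32_bits(addr: int) -> bool:
    """True when narrowing to 32 bits would not change the address."""
    return addr == truncate_to_32_bits(addr)


example_addr = 0x18814F000  # example 64-bit address above 4 GiB
truncated = truncate_to_32_bits(example_addr)
print(f"{example_addr:#x} -> {truncated:#x}")
```

A check like `fits_in_32_bits` is one way a loader could catch this class of bug early instead of silently placing labels at the wrong address.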

@ryanmkurtz
Collaborator

The symbol issue should be fixed now.

@blacktop

blacktop commented Sep 1, 2019

Another way to split the dyld cache is to use Xcode. However, when I load the split dylib into Ghidra, it does not recognize it as a Mach-O or know what processor to use.

jtool split:

$ head -c 100 dyld_shared_cache.JavaScriptCore|hexdump -C

00000000  cf fa ed fe 0c 00 00 01  02 00 00 00 06 00 00 00  |................|
00000010  16 00 00 00 d8 0e 00 00  85 00 11 c2 00 00 00 00  |................|
00000020  19 00 00 00 68 03 00 00  5f 5f 54 45 58 54 00 00  |....h...__TEXT..|
00000030  00 00 00 00 00 00 00 00  00 f0 14 88 01 00 00 00  |................|
00000040  00 90 d3 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  00 90 d3 00 00 00 00 00  05 00 00 00 05 00 00 00  |................|
00000060  0a 00 00 00                                       |....|
00000064
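
The first 32 bytes of the jtool dump above can be decoded as a `mach_header_64`. The following Python sketch does that, assuming the standard little-endian layout of eight 32-bit fields from Apple's loader.h. The constant values are taken from that header as I understand it, and the function name is made up for illustration.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF        # 64-bit Mach-O magic
CPU_TYPE_ARM64 = 0x0100000C     # CPU_TYPE_ARM | CPU_ARCH_ABI64
MH_DYLIB = 0x6                  # filetype for a dynamic library
MH_DYLIB_IN_CACHE = 0x80000000  # flag bit (assumed value from loader.h)

# First 32 bytes of the jtool-split JavaScriptCore dump shown above.
HEADER_HEX = (
    "cffaedfe 0c000001 02000000 06000000 "
    "16000000 d80e0000 850011c2 00000000"
)

FIELD_NAMES = (
    "magic", "cputype", "cpusubtype", "filetype",
    "ncmds", "sizeofcmds", "flags", "reserved",
)


def parse_mach_header_64(data: bytes) -> dict:
    """Decode a little-endian mach_header_64 from the start of data."""
    fields = struct.unpack_from("<8I", data, 0)
    return dict(zip(FIELD_NAMES, fields))


header = parse_mach_header_64(bytes.fromhex(HEADER_HEX))
for name in FIELD_NAMES:
    print(f"{name:<11}{header[name]:#x}")
```

Decoded this way, the header is a valid 64-bit arm64 dylib with 22 load commands. The top flag bit (0x80000000) is set in this dump and clear in the Xcode-split header below (flags end in `c2` versus `42`), which is the only difference between the two headers' first 32 bytes.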

Xcode split:

$ head -c 4200 System/Library/Frameworks/JavaScriptCore.framework/JavaScriptCore|hexdump -C
00000000  ca fe ba be 00 00 00 01  01 00 00 0c 00 00 00 02  |................|
00000010  00 00 10 00 01 03 d0 00  00 00 00 0c 00 00 00 00  |................|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000  cf fa ed fe 0c 00 00 01  02 00 00 00 06 00 00 00  |................|
00001010  16 00 00 00 d8 0e 00 00  85 00 11 42 00 00 00 00  |...........B....|
00001020  19 00 00 00 68 03 00 00  5f 5f 54 45 58 54 00 00  |....h...__TEXT..|
00001030  00 00 00 00 00 00 00 00  00 f0 14 88 01 00 00 00  |................|
00001040  00 90 d3 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00001050  00 90 d3 00 00 00 00 00  05 00 00 00 05 00 00 00  |................|
00001060  0a 00 00 00 00 00 00 00                           |........|

Notice that Xcode seems to be wrapping the Mach-O in a CAFEBABE header?
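
For reference, 0xCAFEBABE is the magic number of a fat (universal) binary header, which is stored big-endian and lists where each architecture slice starts. The following Python sketch decodes the first 28 bytes of the Xcode dump above under that assumption, using the `fat_header` and `fat_arch` layouts from Apple's fat.h; the function name is made up for illustration.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # big-endian magic of a fat (universal) binary

# First 28 bytes of the Xcode-split JavaScriptCore dump shown above.
FAT_HEX = "cafebabe 00000001 0100000c 00000002 00001000 0103d000 0000000c"


def parse_fat_archs(data: bytes) -> list:
    """Return the fat_arch entries (as dicts) of a 32-bit fat header."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError(f"not a fat binary: magic {magic:#x}")
    archs = []
    for i in range(nfat_arch):
        # Each fat_arch is five big-endian 32-bit fields (20 bytes).
        cputype, cpusubtype, offset, size, align = struct.unpack_from(
            ">5I", data, 8 + 20 * i
        )
        archs.append({
            "cputype": cputype,
            "cpusubtype": cpusubtype,
            "offset": offset,   # file offset of this slice
            "size": size,       # slice length in bytes
            "align": align,     # alignment as a power of two
        })
    return archs


archs = parse_fat_archs(bytes.fromhex(FAT_HEX))
for arch in archs:
    print({key: hex(value) for key, value in arch.items()})
```

The single slice it reports starts at offset 0x1000, which is exactly where the `cf fa ed fe` Mach-O header appears in the dump.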

I have heard that Xcode only splits the dyld well enough to attach a debugger to, so the branch islands etc etc aren't handled and symbols might not be fixed. It would be amazing if Ghidra could handle this as I waited for many hours for Ghidra to load the entire dyld_shared_cache only to have my instance crash :(

@blacktop

blacktop commented Sep 1, 2019

UPDATE: I tried using Xcode 11 beta and it seems to not wrap it in the CAFEBABE so... never mind 😖


ryanmkurtz removed their assignment Jan 28, 2020

@tmm1

tmm1 commented Jul 26, 2020

Is the best way to use Ghidra still to either load the entire cache or use another tool to split it first?

@charlesomer

Is it still only possible to decompile a framework successfully by loading the whole cache? If so, does anyone know of a working guide to extract frameworks?

@ryanmkurtz
Collaborator

You could use something like jtool to extract the component you care about, and then import it into Ghidra.

@charlesomer

charlesomer commented Dec 30, 2020

Does jtool still work? I can't get jtool or jtool2 to extract a framework from the dyld_shared_cache_arm64e or dyld_shared_cache_x86_64 found on macOS. I'm using this command for jtool2: jtool2 -e AirPlayReceiver ./dyld_shared_cache_x86_64 which results in errors similar to: Warning: File is likely truncated (or header corrupt?) Binding opcodes falls outside file.

@ryanmkurtz
Collaborator

I haven't tested jtool in a long time so it's definitely possible this solution will no longer work.

@charlesomer

Yeah, it looks like (at least for what I’m trying) jtool doesn’t work with the current macOS cache, which leaves few options by the looks of it. I tried increasing the RAM allocated to 15GB, and unfortunately Ghidra still couldn’t open the entire cache :(

@mrakers
Contributor

mrakers commented Feb 16, 2021

@ryanmkurtz @charlesomer
This is a bit late but the best way to extract dylibs from the cache is to use the sample binary Apple provides. You simply compile the sample binary found at the bottom of the file found here - https://opensource.apple.com/source/dyld/dyld-750.5/launch-cache/dsc_extractor.cpp.auto.html

Note: This is by no means a perfect solution, and it is annoying because it requires you to have a Mac.

ryanmkurtz self-assigned this Feb 17, 2021
ryanmkurtz added this to the 10.0 milestone May 18, 2021