Teach memray to patch libraries that are part of the macOS linker cache #401
Codecov Report
Patch coverage has no change and project coverage change: -0.06%

@@            Coverage Diff             @@
##             main     #401      +/-   ##
==========================================
- Coverage   85.00%   84.94%   -0.06%
==========================================
  Files          29       29
  Lines        3621     3621
==========================================
- Hits         3078     3076       -2
- Misses        543      545       +2
// Constants for ARM64 architecture
constexpr uint32_t ADRP_MASK = 0x9F000000;
constexpr uint32_t ADRP_INSTRUCTION = 0x90000000;
constexpr uint32_t ADRP_ARG_MASK1 = 0x00FFFFE0;
constexpr uint32_t ADRP_ARG_SHIFT1 = 3;
constexpr uint32_t ADRP_ARG_MASK2 = 0x60000000;
constexpr uint32_t ADRP_ARG_SHIFT2 = 29;
constexpr uint32_t ADRP_ARG_NEGATIVE = 0x00800000;
constexpr int32_t ADRP_ARG_SIGN_EXTEND = 0xFFF00000;
constexpr uint32_t ADD_INSTRUCTION_MASK = 0xDFC00000;
constexpr uint32_t ADD_INSTRUCTION = 0x91000000;
constexpr uint32_t ADD_ARG_MASK = 0x00000FFF;

// Constants for x86_64 architecture
constexpr uint32_t JMP_INSTRUCTION = 0x25ff;
constexpr size_t X86_64_PLT_ENTRY_SIZE = 6;
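For readers trying to follow what the ARM64 constants above encode, here is a minimal sketch of how they could be combined to recover the address a stub refers to. It assumes the stub begins with an `ADRP` followed by a 64-bit `ADD` immediate, which is what the constants suggest. The function names (`adrp_page_delta`, `add_immediate`, `stub_target`) are hypothetical and not taken from this PR, and the final code in the PR may be organized differently.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: decode the signed page count held in an ADRP instruction.
// The masks and shifts are the values quoted in the diff above.
inline std::optional<int64_t> adrp_page_delta(uint32_t instr)
{
    if ((instr & 0x9F000000u) != 0x90000000u) {
        return std::nullopt;  // not an ADRP
    }
    // immhi sits in bits 5..23; shifting right by 3 moves it to bits 2..20.
    uint32_t imm = (instr & 0x00FFFFE0u) >> 3;
    // immlo sits in bits 29..30 and supplies bits 0..1.
    imm |= (instr & 0x60000000u) >> 29;
    // Bit 23 of the instruction is the sign bit of the 21-bit immediate.
    if (instr & 0x00800000u) {
        imm |= 0xFFF00000u;
    }
    return static_cast<int64_t>(static_cast<int32_t>(imm));
}

// Hypothetical helper: decode the 12-bit immediate of a 64-bit, unshifted
// add-immediate. The mask from the diff leaves the flag-setting bit unchecked.
inline std::optional<uint32_t> add_immediate(uint32_t instr)
{
    if ((instr & 0xDFC00000u) != 0x91000000u) {
        return std::nullopt;
    }
    return (instr >> 10) & 0x00000FFFu;
}

// Combine both: ADRP yields a 4 KiB page relative to the page of the
// instruction itself, and ADD supplies the offset inside that page.
inline std::optional<uint64_t>
stub_target(uint64_t stub_addr, uint32_t adrp, uint32_t add)
{
    auto pages = adrp_page_delta(adrp);
    auto offset = add_immediate(add);
    if (!pages || !offset) {
        return std::nullopt;
    }
    uint64_t page_base = stub_addr & ~uint64_t{0xFFF};
    return page_base + static_cast<uint64_t>(*pages * 4096) + *offset;
}
```

Returning `std::optional` keeps the "this is not the instruction we expected" case explicit, so a caller can skip stubs that do not match the assumed pattern instead of patching a bogus address.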
Personally, I don't find these named constants particularly helpful. Each one is used in only a single function, their names are super arbitrary (ADRP_ARG_SHIFT2 is a terrible name), and having to jump around in the code to reference them makes it harder to understand this code than it would be if you could just see the constants' values at the point where they're used.
I've taken a swing at making this code more readable - as readable as possible, at least. I also made sort of a lot of changes, considering that I can't easily test this myself - sorry 😅
I believe I've left all of the implementation the same, and just rearranged and renamed things to make what's happening clearer for future readers, but I'll need you to check my work.
After my changes, this all LGTM, so if my changes all seem correct to you, I think this is good to land.
Since macOS Ventura there have been changes in the way symbols are included in libraries in the shared linker cache. Specifically, the symbols that we care about are now stored in a Procedure Linkage Table (PLT) that doesn't have an associated Global Offset Table (GOT) in a separate DATA segment that can be easily read.

To understand this change, let's first explain the purpose of the PLT and the GOT. In dynamic linking, the PLT is a data structure used to handle the resolution of function calls to external symbols. It contains a sequence of instructions that redirect the control flow to the appropriate symbol implementation. The PLT acts as a dynamic dispatch table, allowing the resolution of symbols at runtime.

On the other hand, the GOT is a data structure used for position-independent code. It stores addresses of external symbols that are referenced by the shared object. The addresses in the GOT are initially set to point to stub functions, which are small pieces of code responsible for resolving the actual addresses of the symbols during runtime. When a function is called, the stub code in the PLT is executed, looks up the address in the GOT and redirects the control flow to it.

In previous versions of macOS, the symbols of interest were stored in a GOT that resided in a separate DATA segment that could be easily read and modified. This allowed us to know immediately where to patch by reading the metadata in the relevant DATA section. However, starting from macOS Ventura, the symbols of interest are no longer stored in a GOT that we can easily locate by reading DATA segments.

This commit adds functionality to patch symbols in the __stubs/__auth_stubs PLT section of shared libraries that are part of the shared cache. This is necessary because the symbols in this section point to a GOT that cannot be accessed directly through data sections. The commit implements the logic to analyze the assembly code of the PLT stubs and calculate the address of the GOT entry.
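As an illustration of the stub analysis described above, the x86_64 case can be sketched as follows. This assumes each stub is a single 6-byte RIP-relative indirect jump (bytes `FF 25` followed by a 32-bit displacement), which matches the `0x25ff` opcode and the entry size of 6 in the diff. The function name `x86_64_stub_slot` is hypothetical and not part of this PR.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: given the bytes of one stub and the address it is
// mapped at, return the address of the pointer slot the stub jumps through.
inline std::optional<uint64_t>
x86_64_stub_slot(const uint8_t* stub, uint64_t stub_addr)
{
    // `jmp *disp32(%rip)` is encoded as FF 25 <disp32>; read as a
    // little-endian 16-bit value the opcode bytes are 0x25ff.
    if (stub[0] != 0xFF || stub[1] != 0x25) {
        return std::nullopt;
    }
    // Assemble the little-endian displacement byte by byte so the code
    // does not depend on the host's endianness or alignment rules.
    uint32_t raw = static_cast<uint32_t>(stub[2])
                 | static_cast<uint32_t>(stub[3]) << 8
                 | static_cast<uint32_t>(stub[4]) << 16
                 | static_cast<uint32_t>(stub[5]) << 24;
    int64_t disp = static_cast<int32_t>(raw);
    // RIP-relative addressing is relative to the *next* instruction,
    // which starts 6 bytes after the beginning of the stub.
    return stub_addr + 6 + static_cast<uint64_t>(disp);
}
```

The displacement is signed, so the slot may lie before or after the stub in memory.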
The patching process is performed for each hooked function in the section, applying the necessary modifications to redirect the symbols to the corresponding hook functions. This allows for proper interception and manipulation of the function calls in system libraries (like the C++ standard template library) that are part of the linker shared cache.

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
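The per-function patching loop described in the commit message can be sketched, in a deliberately simplified form, as below. The `StubEntry` type, the `patch_stubs` function, and the hook table are all hypothetical stand-ins rather than Memray's actual data structures, and the sketch writes to ordinary memory. Patching a real mapped library would presumably also require making the page that holds each slot writable first.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record for one stub: the symbol it resolves and the
// address of the pointer slot that the stub jumps through.
struct StubEntry
{
    std::string symbol;
    void** slot;
};

// For every stub whose symbol has a registered hook, overwrite the slot
// with the hook's address. Returns how many slots were changed, so that
// running the function a second time is a no-op that reports zero.
inline std::size_t
patch_stubs(const std::vector<StubEntry>& stubs,
            const std::unordered_map<std::string, void*>& hooks)
{
    std::size_t patched = 0;
    for (const StubEntry& stub : stubs) {
        auto it = hooks.find(stub.symbol);
        if (it == hooks.end() || *stub.slot == it->second) {
            continue;  // not a hooked symbol, or already patched
        }
        *stub.slot = it->second;
        ++patched;
    }
    return patched;
}
```

Skipping slots that already hold the hook keeps the operation idempotent, which matters if the same library image is visited more than once.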