shadowhook v2.0.1

Released by @caikelun on 15 Jun 11:46 (tag v2.0.1, commit 854c775)

Announcement

1. Added compatibility with Android 17.

  • Supported Android OS versions: Android 4.1 - 17 QPR1 Beta 4.

New Features

1. Added debugging information for hook/intercept operations.

  • A "trace" data item has been added to the existing "operation records". It is a compact plain-text string that records debug information for the current operation, such as the original and replacement instructions of an inline hook, and the addresses and instructions of each trampoline.
  • Added the tools/record_parser.py script to parse "operation records" (including the new trace data); it supports parsing one or multiple records at a time.

Bug Fixes

1. Fixed an intermittent ANR bug.

  • This bug was introduced in version 2.0.0.
  • An ANR could occur if one thread was executing dlclose() while another was executing shadowhook_hook_sym_name() within the same process.

2. Fixed a bug where threads could not enter the shared-mode proxy function after executing pthread_key_clean_all().

  • In the bionic pthread implementation, pthread_exit() is called during thread termination; this invokes pthread_key_clean_all(), which performs four rounds of TLS cleanup. After pthread_key_clean_all() returns, pthread_exit() proceeds to call functions like munmap().
  • Shared-mode proxy functions rely on TLS to store context information, but shadowhook's own TLS data was cleared during the first round of pthread_key_clean_all(). If a proxy function was entered after this point, the original shared-mode logic could no longer work correctly. To minimize potential side effects, earlier versions skipped the "proxy function" in this situation and called the "original function" directly.
  • Scope of impact: in addition to the munmap() call within pthread_exit(), this also covers any function called from the TLS destructor functions that run during the final three rounds of TLS cleanup.
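
The multi-round cleanup described above can be reproduced with plain POSIX thread-specific keys. The following is a minimal sketch, not shadowhook code: ctx_key, user_key, and context_visible_in_round2() are hypothetical names, with ctx_key standing in for a hook library's per-thread context. A second key re-arms itself in round 1 to force a second round, then checks whether the context is still there.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t ctx_key;   /* stands in for a hook library's per-thread context */
static pthread_key_t user_key;  /* stands in for some other component's TLS slot */
static int ctx_seen_in_round2 = -1; /* -1: not run, 0: context gone, 1: still there */

/* Round 1: the context is freed; its slot was already reset to NULL. */
static void ctx_destructor(void *p) { free(p); }

static void user_destructor(void *p) {
  int *round = p;
  if (*round == 1) {
    /* Re-arm the slot so the runtime performs another cleanup round. */
    *round = 2;
    pthread_setspecific(user_key, round);
    return;
  }
  /* Round 2: a hooked function called from here would find no context. */
  ctx_seen_in_round2 = (pthread_getspecific(ctx_key) != NULL);
  free(round);
}

static void *worker(void *arg) {
  (void)arg;
  int *round = malloc(sizeof *round);
  void *ctx = malloc(16);
  assert(round != NULL && ctx != NULL);
  *round = 1;
  pthread_setspecific(ctx_key, ctx);
  pthread_setspecific(user_key, round);
  return NULL; /* returning from the start routine triggers TLS cleanup */
}

/* Returns 1 if the context was still visible in cleanup round 2, else 0. */
int context_visible_in_round2(void) {
  pthread_t t;
  int rc = pthread_key_create(&ctx_key, ctx_destructor);
  assert(rc == 0);
  rc = pthread_key_create(&user_key, user_destructor);
  assert(rc == 0);
  rc = pthread_create(&t, NULL, worker, NULL);
  assert(rc == 0);
  pthread_join(t, NULL);
  pthread_key_delete(user_key);
  pthread_key_delete(ctx_key);
  return ctx_seen_in_round2;
}
```

Calling context_visible_in_round2() returns 0: by the time the second round runs, the context slot has been cleared, which is the window in which a shared-mode proxy function has no TLS context to work with.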

3. Fixed a bug where the target ELF might not be found on Android 6.0.

  • On Android 6.0, if an APK has android:extractNativeLibs=false set, the ELF pathname returned by dl_iterate_phdr() does not include the !/lib/<ABI>/<lib_name> suffix. This caused the target ELF to be missed when using dl_iterate_phdr(). We fixed this issue by dynamically parsing the ELF .dynamic section in memory.

4. Fixed a bug in shared mode where proxy functions could not be entered after the stack exceeded 16 frames.

  • This bug was introduced in version 2.0.0.
  • In shared mode, to prevent recursive calls, each thread uses a stack structure to track proxy functions that have been entered but have not yet returned. In version 2.0.0, the stack size constant SH_HUB_STACK_FRAME_MAX was reduced from 127 to 16 for memory optimization; this violated an implicit API contract, preventing entry into proxy functions in certain scenarios. In the current version, SH_HUB_STACK_FRAME_MAX has been reverted to 127.
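
To illustrate why the frame limit matters, here is a minimal sketch of a per-thread recursion guard. It is an assumption about the general technique, not shadowhook's actual implementation: hub_enter(), hub_leave(), and FRAME_MAX are hypothetical names, and the caller is assumed to bypass the proxy and call the original function whenever hub_enter() returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FRAME_MAX 127 /* mirrors the restored SH_HUB_STACK_FRAME_MAX value */

typedef struct {
  size_t depth;
  const void *frames[FRAME_MAX]; /* proxies entered but not yet returned */
} hub_stack_t;

static _Thread_local hub_stack_t g_stack;

/* Returns true if the proxy may be entered, false if the caller should
 * skip it (stack full, or the proxy is already active on this thread). */
bool hub_enter(const void *proxy) {
  hub_stack_t *s = &g_stack;
  if (s->depth >= FRAME_MAX) return false;     /* no room to track the frame */
  for (size_t i = 0; i < s->depth; i++)
    if (s->frames[i] == proxy) return false;   /* recursive re-entry */
  s->frames[s->depth++] = proxy;
  return true;
}

/* Pops the most recently entered proxy when it returns. */
void hub_leave(void) {
  assert(g_stack.depth > 0);
  g_stack.depth--;
}
```

With a limit of 16, the seventeenth nested proxy on a thread would be refused even though it is not a recursive call, which matches the symptom described above.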

5. Fixed an intermittent memory corruption bug.

  • This bug was introduced in version 2.0.0.
  • When the gap size of the final memory page of an arm64 ELF executable segment fell within the range of [4, 16) bytes, the hooking process would overwrite the subsequent memory area, leading to subtle issues or crashes.
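
As a worked example of the condition above, the sketch below computes the gap and checks it against the affected range. It assumes that "gap" means the unused bytes between the end of the executable segment and the end of its last memory page; last_page_gap() and gap_in_affected_range() are hypothetical helper names, not part of shadowhook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Unused bytes between the end of a segment and the end of its last page. */
uint64_t last_page_gap(uint64_t seg_start, uint64_t seg_size, uint64_t page_size) {
  assert(page_size != 0);
  uint64_t rem = (seg_start + seg_size) % page_size;
  return rem == 0 ? 0 : page_size - rem;
}

/* True for the half-open range [4, 16) named in the release note. */
bool gap_in_affected_range(uint64_t gap) { return gap >= 4 && gap < 16; }
```

For a segment that starts at 0x1000 and is 0xff8 bytes long with 4096-byte pages, the segment ends at 0x1ff8, so the gap is 8 bytes and falls inside the affected range.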

Improvements

1. Improved the execution speed of shadowhook_dlopen().

  • We have implemented caching for frequently accessed system libraries such as libart.so. Subsequent calls to shadowhook_dlopen() for these libraries now retrieve information from the cache, eliminating the need to acquire the linker's global mutex lock.
  • As a result of this optimization, APIs that involve symbol lookup, such as shadowhook_hook_sym_name(), also execute faster.

2. Improved concurrency for hook/intercept APIs.

  • In previous versions, all hook or intercept API calls were executed serially. Starting with this version, calls targeting different addresses are executed concurrently (though calls targeting the same address remain serial). This significantly alleviates thread blocking and waiting when multiple threads perform hook or intercept operations during app startup.
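
One common way to get this behavior is lock striping keyed by the target address. The release note does not say how shadowhook implements it, so the following is only a sketch of the general idea; LOCK_STRIPES, stripe_for(), lock_target(), and unlock_target() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_STRIPES 64

static pthread_mutex_t g_locks[LOCK_STRIPES];
static pthread_once_t g_once = PTHREAD_ONCE_INIT;

static void init_locks(void) {
  for (size_t i = 0; i < LOCK_STRIPES; i++) pthread_mutex_init(&g_locks[i], NULL);
}

/* Maps a target address to a stripe; the same address always maps to the
 * same stripe. Low bits are dropped because instructions are 4-byte aligned
 * on arm64. */
size_t stripe_for(const void *addr) {
  return (size_t)(((uintptr_t)addr >> 2) % LOCK_STRIPES);
}

/* Locks the stripe that guards addr and returns it for the later unlock. */
pthread_mutex_t *lock_target(const void *addr) {
  pthread_once(&g_once, init_locks);
  pthread_mutex_t *m = &g_locks[stripe_for(addr)];
  int rc = pthread_mutex_lock(m);
  assert(rc == 0);
  (void)rc;
  return m;
}

void unlock_target(pthread_mutex_t *m) { pthread_mutex_unlock(m); }
```

Operations on the same address always serialize on one mutex, while operations on addresses in different stripes proceed in parallel. Two different addresses can still land on the same stripe and serialize, so a per-address lock table would be needed to make every pair of distinct addresses fully concurrent.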

3. Optimized specific atomic operations and memory ordering.

  • In previous versions, the implementation of certain atomic operations and memory ordering was imprecise; in isolated cases, this could theoretically lead to issues on the arm64 architecture.

4. Optimized global virtual memory usage.

  • The total virtual memory usage of the stack in the hub module has been reduced from 4M bytes to 3M bytes.
