feat(background-thread): add restart(mode, reason) and fix iOS reload crash#54
Conversation
… crash - New TurboModule method `restart(mode, reason)` to replace `react-native-restart`: `mode='ui'` reloads the main JS runtime only (bg stays hot — language/devSettings), `mode='all'` reloads both runtimes (OTA install/switch, resetData). - iOS: sequence SharedRPC quiesce → optional bg host release → `RCTTriggerReloadCommandListeners` on the main thread; AppDelegate.hostDidStart re-arms the main SharedRPC listener and re-spawns the bg host for `mode='all'`. - Android: invalidate SharedRPC listeners via new JNI `nativeInvalidateSharedRpc` then process-restart (parity with react-native-restart-newarch). - Fix iOS EXC_BAD_ACCESS on main runtime reload: SharedRPC carries a per-listener `alive` atomic flipped synchronously in `invalidate(runtimeId)`, and executor lambdas on both iOS sites now capture `RCTInstance` `__weak` so a torn-down instance no-ops cleanly instead of dispatching into invalidated memory. - Bump `@onekeyfe/react-native-background-thread` to 3.0.32.
…('ui')
Replaces the v1 placeholder where Android did a full-process restart for
both modes. Now matches iOS semantics: `mode='ui'` keeps the bg runtime
hot, only the main JS runtime reloads.
- Capture `mainReactHost` lazily in `installSharedBridgeInMainRuntime` via
`(applicationContext as ReactApplication).reactHost`. Stays null on
non-bridgeless hosts.
- `restart()` with `mode='ui'`: after SharedRPC invalidate('main'),
resolve the promise, then post to UI thread and call
`ReactHost.reload(reason)`. JS bootstrap re-invokes `installSharedBridge`
which refreshes `mainRuntimePtr` and re-arms the "main" SharedRPC
listener — symmetric with iOS where AppDelegate.hostDidStart does the
same job.
- Fallback to process restart when `mainReactHost` is null or
`reload()` throws — keeps semantics consistent across host setups.
- `mode='all'` unchanged: process restart via `Runtime.exit +
makeRestartActivityTask`. OTA install/switch wants a clean re-read of
both bundles from disk, and Android has a clean self-relaunch path that
iOS lacks, so we don't pay the sequencing complexity of soft-reloading
both runtimes.
- Clear `mainReactHost` in `destroy()` to match other lifecycle fields.
Not yet verified end-to-end — needs a build against the host app to
exercise: bridgeless detection, listener re-arm timing, bg-to-main RPC
during the reload window. Filed for follow-up testing.
There was a problem hiding this comment.
Pull request overview
This PR introduces a new restart(mode, reason) TurboModule API for coordinated runtime restarts (replacing direct react-native-restart usage) and hardens SharedRPC/runtime teardown paths to prevent iOS reload crashes.
Changes:
- Add
restart(mode, reason)to the BackgroundThread TurboModule surface and wire it through iOS/Android implementations. - Fix iOS reload
EXC_BAD_ACCESSby adding SharedRPC per-listener liveness invalidation and switching executor lambdas to__weakRCTInstancecaptures. - Add Android JNI hook to invalidate SharedRPC listeners before restart, and bump the module/changelog to
3.0.32.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| native-modules/react-native-background-thread/src/NativeBackgroundThread.ts | Adds restart(mode, reason) to the TurboModule spec with usage docs. |
| native-modules/react-native-background-thread/package.json | Bumps module version to 3.0.32. |
| native-modules/react-native-background-thread/ios/BackgroundThreadManager.mm | Implements restart sequencing (SharedRPC invalidate → optional bg teardown → RCTTriggerReloadCommandListeners) and weak-capture executor. |
| native-modules/react-native-background-thread/ios/BackgroundThreadManager.h | Declares restartWithMode:reason:completion: with documentation. |
| native-modules/react-native-background-thread/ios/BackgroundThread.mm | Exposes restart(mode, reason) as a TurboModule Promise method. |
| native-modules/react-native-background-thread/ios/BackgroundThread.h | Adds the restart method declaration. |
| native-modules/react-native-background-thread/ios/BackgroundRunnerReactNativeDelegate.mm | Uses weak-capture executor for bg runtime SharedRPC installation. |
| native-modules/react-native-background-thread/cpp/SharedRPC.h | Adds per-listener alive flag and declares invalidate(runtimeId). |
| native-modules/react-native-background-thread/cpp/SharedRPC.cpp | Implements listener invalidation and double-check liveness in notify dispatch. |
| native-modules/react-native-background-thread/android/src/main/java/com/backgroundthread/BackgroundThreadModule.kt | Wires TurboModule restart to the manager. |
| native-modules/react-native-background-thread/android/src/main/java/com/backgroundthread/BackgroundThreadManager.kt | Implements restart flow and JNI invalidation call before process restart. |
| native-modules/react-native-background-thread/android/src/main/cpp/cpp-adapter.cpp | Adds nativeInvalidateSharedRpc JNI implementation calling SharedRPC::invalidate. |
| CHANGELOG.md | Documents 3.0.32 feature + iOS crash fix details and version bump. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Reviewer flagged a mix of integration-contract gaps, an Android error- propagation bug, and a couple of C++ housekeeping items. Fixes: iOS — host-integration contract (#1, high) - Document the post-reload contract explicitly on `installSharedBridgeInMainRuntime:` and `restartWithMode:` headers. The host AppDelegate's `hostDidStart:` must re-invoke both the SharedBridge install (every reload) and `startBackgroundRunner` (mode='all' only). - `restartWithMode:` now arms a one-shot `RCTJavaScriptDidLoadNotification` observer that fires ~1.5s after the new main bridge loads: * mode='all': self-respawns the bg runtime if the host AppDelegate didn't (idempotent via the existing `isStarted` guard, so a correctly- wired host pays nothing). * Both modes: logs `[BTLogger error:]` if the host failed to re-call `installSharedBridgeInMainRuntime:`. Tracked via a new atomic flag `mainSharedBridgeInstalled` flipped in the install lambda. Surfaces integration omissions in production logs instead of silent RPC drop. JS spec — cross-platform asymmetry (#2, high) - `NativeBackgroundThread.ts` JSDoc now spells out: iOS preserves the process across `mode='all'` (soft reload); Android kills it via `Runtime.exit(0)`. Callers depending on native process-level state (timers, singletons, in-memory caches, foreground services) must not assume cross-platform survival. Android — false-success promise resolution (#3, medium) - `triggerProcessRestart` returns `Boolean` so failure (intent not resolvable, `startActivity` `SecurityException`) propagates instead of being swallowed; caller rejects the Promise with `BG_RESTART_ERROR`. - `mode='ui'` soft-reload moves `promise.resolve(null)` inside the `Handler.post` success branch — a synchronous `ReactHost.reload()` throw now routes through the same fallback-then-reject path instead of reporting false success because resolve already happened. SharedRPC — clean up dead listener entries (#6, low) - `invalidate(runtimeId)` now erases the listener entry from `listeners_` instead of leaving it with `alive=false`. Already-dispatched executor lambdas hold their own `shared_ptr<alive>` snapshot so erasing is race-safe; it only prevents NEW `notifyOtherRuntime()` snapshots from picking up the dead entry. Stops a permanently-dead listener from hanging around when a `mode='all'` restart never gets its post-reload re-install. - `install()`'s defensive dedup loop is now documented as the legacy / no-prior-invalidate fallback path (#7, low). Reviewer #4 (no tests) and #2's Android-side TODO (soft-reload both runtimes via `ReactHost.reload`) remain follow-ups — flagged in PR description, not in scope here.
Round-1 self-heal observer hinged on RCTJavaScriptDidLoadNotification, which reviewer flagged as unreliable in bridgeless / NewArch. The fix also surfaced two secondary hazards (observer leak when notification never fires; cross-generation flag misreporting on concurrent restarts). All three close together by dropping the notification entirely. iOS — replace notification observer with plain dispatch_after (#1, #2, high) - The two signals the post-reload check needs (mainSharedBridgeInstalled, isStarted) are this module's own state. No external timing anchor is required; the notification was only ever an attempt at one. A direct dispatch_after on the main queue at kRestartHealthCheckDelaySeconds (3s, constant — empirical 6× margin over baseline hostDidStart chain on low-end devices) eliminates the bridgeless-fire-or-not gamble AND the observer-leak/cross-generation misfire that the observer pattern introduced. iOS — generation counter for concurrent restart() safety (#3, medium) - Adds `restartGeneration` (nonatomic; main-thread-only by construction). Each restartWithMode: bumps it and the health-check captures its own `myGen`; on fire, it bails if `restartGeneration != myGen` so a later restart's flag reset can't make an earlier restart's check misreport. iOS — self-respawn loses custom OTA entry URL (#6, low) - The bg self-respawn path falls through `startBackgroundRunner` → default URL (`background.bundle` / DEBUG URL), losing any custom entry URL the host set via startBackgroundRunnerWithEntryURL: (typically an OTA-resolved bundle path) because reactNativeFactoryDelegate is released for mode='all'. Added an explicit `[BTLogger warn:]` so this trade-off is visible in logs when the host's AppDelegate wiring is broken AND OTA paths matter. Android — soften over-confident resolve-delivery comment (#4, medium) - Round-1 comment claimed "the JS thread is still live and can deliver the success callback before being destroyed". Closer to truth: the CallInvoker is bound to the outgoing ReactInstance which reload invalidates very quickly, and `await restart(...)` callers' continu- ations are typically superseded by the reload. Position-of-resolve only matters for the synchronous-throw path. Updated the comment. Not changed: - 1.5s → 3s on the health-check delay is the right call but isn't itself a magic-number fix — `kRestartHealthCheckDelaySeconds` is a named constant with rationale comment. Polling-until-flag-flips was considered and skipped: extra complexity for a one-line log output. Still pending end-to-end verification on the OneKey host app: - mode='ui' / mode='all' on iOS bridgeless: confirm health-check logs fire only when integration is actually broken (deliberate AppDelegate.hostDidStart: omission test). - Android: confirm BG_RESTART_ERROR Promise rejection on synthetic startActivity failure (e.g. invalid intent).
Round-2 left three open concerns reviewer flagged as merge-blockers for the OTA path plus a memory-ordering nit. Fixes: iOS — cache lastEntryURL for OTA-safe self-respawn (#1, medium) - Round-2's self-respawn fell through to `startBackgroundRunner` → default `background.bundle`. On OTA-equipped hosts that's a moduleId- mismatch crash waiting to happen: main runs the OTA-updated bundle, bg loads the IPA's bundled bg bundle, next cross-runtime RPC crashes. - Add `lastEntryURL` (nonatomic, copy) cached unconditionally inside `startBackgroundRunnerWithEntryURL:`. Self-respawn now replays the cached URL via `startBackgroundRunnerWithEntryURL:` instead of the default, keeping the bg moduleId table aligned with main. - Falls back to the default-URL path only when no URL was ever cached (host never called start before triggering restart('all') — implausible in practice), with the original warn log preserved. iOS — two-stage health-check tolerates slow devices (#2, medium) - The 3s `dispatch_after` was measured from restart() dispatch time, not new-host ready. On low-end devices an OTA-multi-bundle reload chain can eat most of that budget, leaving the host's hostDidStart no time to flip the flags — false self-respawn / false error log. - Extract the health-check into `scheduleHealthCheckForRestart:isAll: generation:retried:`. Stage 1 at +3s decides: * Both halves healthy → log OK, done. * Main ready but bg not (mode='all') → STABLE signal that the host didn't re-call startBackgroundRunner; self-respawn now. * Main NOT ready → could be a slow device; reschedule stage 2. Stage 2 at +6s total is the final verdict; whatever's still missing is logged as integration failure (and bg self-healed if it can be). Generation check at each stage entry bails on supersession. - Added detailed stage1/stage2 diagnostic logs so production logs distinguish "host omitted the call" from "host was just slow". iOS — isStarted nonatomic + cross-thread reads (#3, medium) - Pre-existing nonatomic property is now read on main thread by the health-check while writes happen on caller's thread (the module's public surface does not pin start... to main). One-line change to (atomic, assign, readwrite); inline comment documents the TOCTOU race on concurrent first-time starts as a separate pre-existing concern. Not changed: - TOCTOU race in start... (concurrent first-time call → double init) is real but pre-existing and out of scope here; flagged in the new property comment so the next maintainer sees it. - kRestartHealthCheckDelaySeconds stays at 3.0s — total budget is now 6s via stage 2, which the reviewer's worst-case "1.5–2s reload + slow hostDidStart" easily fits inside. Health-check method declared in the class extension as a forward declaration so call-site readability is preserved (definition lives after restartWithMode: where the restart flow naturally starts).
…lish) Round-3 left four low-risk items reviewer recommended resolving for consistency / debuggability. All four close cleanly: #1 — lastEntryURL → atomic - Same rationale as the round-3 isStarted change: written from caller's thread (public API doesn't pin start... to main), read on main by the health-check. NSString* assignment isn't atomic under ObjC ARC, so cross-thread readers could observe a torn pointer or stale value. One-line change to (atomic, copy); pre-existing nonatomic was an inconsistency with the round-3 fix. #2 — race-debug warn on startBackgroundRunnerWithEntryURL: early return - The early-return short-circuit (`if (self.isStarted) return;`) is the natural collision point for the self-respawn-vs-host-async-start race: self-respawn at stage 2 boots with the cached URL, then the host's late async start call comes in with a different URL and gets silently dropped. Added a `[BTLogger warn:]` that fires when the requested URL differs from the active one — production traces now show "bg is on URL X but Y was requested" instead of a mystery. #3 — stage 1 always defers bg self-respawn to stage 2 - Round-3 stage 1 had a fast-path that self-respawned immediately when it saw mainReady=YES but bgReady=NO, on the assumption that mainReady was a stable signal the host wasn't going to call start. That's true for hosts whose hostDidStart: synchronously calls both install AND start, but hosts that gate startBackgroundRunner on async work (feature flag fetch, login, network) can be mainReady=YES while the real start is still inflight — the fast-path would race them and the host's late start would be dropped silently (#2 makes that case loud now, but losing the URL is still wrong). Stage 1 now always reschedules stage 2 if anything is missing; stage 2 (+~3s) is sized to give async host paths time to land. Cost is +3s on the actual self-respawn moment, only on broken-integration paths. #4 — treat bundled default "background.bundle" as no-real-cache - startBackgroundRunnerWithEntryURL: caches unconditionally, including the bundled default when host calls the no-arg startBackgroundRunner in release builds. Round-3 self-respawn's "cachedURL.length > 0" branch was treating that as a real cache and skipping the OTA- mismatch warn — but a host that bootstrapped with the default and never swapped to an OTA URL still has the same crash risk. Now filters: `cachedURL.length > 0 && ![cachedURL isEqualTo String:@"background.bundle"]`. Hosts that immediately call startBackgroundRunnerWithEntryURL:OTAUrl overwrite the cache so this filter only triggers in the genuinely-no-custom-URL case. #5 (cosmetic explicit ivar init) is deliberately skipped — reviewer flagged it as not necessary.
Replace the bare `mode: string` parameter on the TurboModule spec with a
typed `RestartMode = 'ui' | 'all'` string-literal union exported from
NativeBackgroundThread.ts.
Why a literal union (not a TS enum):
- Zero runtime cost — a regular `enum` compiles to a JS object that
ships with the bundle; a literal union erases to nothing.
- RN TurboModule codegen's supported-type whitelist is primitives,
Object/Array, Promise, and string-literal unions. Plain TS `enum` is
not on that list; behaviour across RN versions is unstable and best
avoided in spec files.
- Consumers can `import type { RestartMode } from '...'` and pass a
plain string; no runtime import dependency.
Migration note: if a downstream consumer was using
`EAppRestartMode` (TS string enum from `@onekeyhq/shared`) as the value,
its members aren't assignable to a string-literal union by default.
Two options at the call site:
1. `restart(EAppRestartMode.UI as RestartMode, reason)`.
2. Migrate `EAppRestartMode` to a string-literal union / `as const`
object so the two stay aligned.
JSDoc updated to reference {@link RestartMode}; the native code still
validates at runtime (it's the source of truth for which strings the
platforms accept), so any cast that smuggles in an unknown mode is
caught and rejects the Promise.
…C.cpp std::remove_if is used in both install() and invalidate(); the symbol was only resolved transitively via SharedRPC.h's STL chain, which is fragile across toolchain/header changes. Make the dependency explicit so a future header diet on SharedRPC.h cannot break the build.
|
Copilot 两条评论均已处理: 1. 2. Android
|
- iOS: align isStarted public-header declaration with .mm redeclaration
(atomic, readonly). The previous nonatomic/atomic mismatch trips
Clang -Wproperty-attribute-mismatch and fails CI under -Werror.
- src: re-export RestartMode from the package entry so the JSDoc-promised
`import type { RestartMode } from '@onekeyfe/react-native-background-thread'`
actually resolves; consumers were getting a TS2305 against the prepared
index.d.ts.
- iOS: refresh the scheduleHealthCheckForRestart: docblock — earlier rounds
removed the stage-1 self-respawn-on-bg-not-ready short-circuit, but the
function-level comment still described it. Comment now matches the
unconditional-reschedule behaviour the body actually implements.
|
三条 findings 均已修复,提交 8080d00: [P2] iOS [P2] [P3] iOS health-check 文档与实现不一致 — |
- Android restart('ui'): handle async reload-task fault, not just the
synchronous throw. ReactHost.reload(reason) returns TaskInterface<Void>
in RN 0.83 and the public TaskInterface surface only exposes
waitForCompletion + isFaulted/isCancelled/getError (no continueWith).
Watch the task on a short-lived daemon thread; on fault, cancellation,
or 15s timeout, post back to the main thread and trigger the existing
process-restart fallback. Without this, an async reload fault leaves
SharedRPC main listener invalidated and main runtime torn down with no
rebuild path — app would be permanently broken until manual relaunch.
- src/NativeBackgroundThread.ts: fix the Prettier error that was the
only true exit-1 lint failure (loadSegmentInBackground signature
collapsed onto one line). Also drop the two stale eslint-disable
directives in SharedRPC.ts / SharedStore.ts now that the underlying
no-var hits no longer fire (warning-level but flagged in the audit).
- src/NativeBackgroundThread.ts JSDoc: replace the stale 'post-reload
observer' wording with 'host hostDidStart: + dispatch_after health-
check fallback' so the public docs match the implementation (which
explicitly avoids RCTJavaScriptDidLoadNotification because of its
unreliable timing in bridgeless / NewArch).
|
三条 findings 均已修复,提交 0c99595: [P2] Android [P2] 包级 lint Prettier error — [P3] 公开 TS 文档「post-reload observer」措辞陈旧 — |
The 3.0.32 release-it commit (0247ebf) updated the yarn.lock entry for background-thread's bundle-update peerDep but missed updating the package.json itself, breaking CI's `yarn install --immutable`. Past releases have hit the same shape — the peerDep version is bumped on every bundle-update release, and any miss in either file desyncs the lockfile. The version floor wasn't load-bearing in practice (bundle-update hasn't shipped a breaking API change for background-thread integration since the floor was first introduced), so drop the floor and use `*`. This eliminates the release-it sync hazard going forward and keeps the runtime contract (consumer must have bundle-update installed) intact via the peerDep entry.
Three buttons on BackgroundThreadTestPage:
- restart('ui'): exercises the main-runtime soft reload path. iOS goes
through RCTTriggerReloadCommandListeners; Android goes through
ReactHost.reload (bridgeless) with the new TaskInterface watcher
falling back to process restart on async fault.
- restart('all'): on iOS, soft-reloads main and re-spawns bg; on
Android, process-restarts via Runtime.exit + makeRestartActivityTask.
- bogus mode: drives the synchronous validation reject path so the
log shows a visible BG_RESTART_ERROR rejection.
Imports RestartMode from the package entry to verify the new re-export
is reachable from a real consumer (the previously-broken JSDoc claim).
cocoapods 1.15.x (which the old Gemfile constraint `>= 1.13, != 1.15.0, != 1.15.1` resolves to) crashes on Ruby 3.4 with `cannot load such file -- kconv` — kconv was removed from the Ruby 3.4 stdlib. 1.16.2 dropped the direct kconv dependency, but xcodeproj still caps CFPropertyList to `< 4.0`, and CFPropertyList 3.x has a leftover `require 'kconv'` at file head (no actual Kconv calls). Pin cocoapods to ~> 1.16.2, relax xcodeproj to >= 1.27.0 (1.16.x requires it), and add a tiny local `kconv-shim` gem with an empty `kconv.rb` so the orphan require in CFPropertyList resolves. `bundle exec pod install` is now reproducible across machines and locks the toolchain to a known-good version.
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
The example target had no DEVELOPMENT_TEAM set (intentionally — it's per-developer and never committed). With Automatic signing, xcodebuild errored "Signing for 'example' requires a development team" even on simulator runs (`yarn ios`, fresh-clone build, CI), because Xcode 14+ asks for a team regardless of SDK when CODE_SIGN_STYLE is Automatic. Add `CODE_SIGNING_ALLOWED[sdk=iphonesimulator*] = NO` on both Debug and Release configs of the example target. Simulator builds now skip signing entirely (the binary doesn't need it to run in the sim); device builds (iphoneos SDK) still require a team, which devs can configure locally in Xcode without committing personal identifiers.
…igning Previous attempt (cfee72c) set CODE_SIGNING_ALLOWED=NO for the iphonesimulator SDK to bypass the 'requires a development team' error. That worked for the example target's main binary but it also cleared signing on the embedded frameworks Copy-Frameworks step — GPChannelSDKCore.framework (vendored by react-native-lite-card) and hermesvm.framework ended up unsigned. iOS 26 simulator dyld refuses to load unsigned frameworks at startup, so the app SIGABRT'd at launch with 'Library not loaded: GPChannelSDKCore'. Replace the all-off switch with ad-hoc signing on simulator: - CODE_SIGN_STYLE = Manual (skip automatic team resolution) - CODE_SIGN_IDENTITY[sdk=iphonesimulator*] = '-' (Sign to Run Locally) Simulator builds still get a valid (ad-hoc) signature on the main binary and every embedded framework, so dyld accepts them. No team is required because Manual style doesn't try to provision. Device (iphoneos) builds remain untouched — devs configure their own team in Xcode locally as before.
Nitro hybrid view methods can be invoked off the UI thread (JS thread on iOS, arbitrary thread on Android). UIKit's becomeFirstResponder / resignFirstResponder and Android's requestFocus / clearFocus / IMM calls must run on the main thread. On iOS the previous implementation crashed with SIGTRAP via FrontBoardServices' assertBarrierOnQueue. Android would raise CalledFromWrongThreadException. Wrap both focus() and blur() with a main-thread dispatch (Thread.isMainThread short-circuit on iOS, Looper.getMainLooper() + View.post on Android). The isDisposed check is re-run inside the posted block on Android to avoid touching a recycled view.
Summary
restart(mode, reason)TurboModule 方法,用以替代直接调用react-native-restart:mode='ui'只重启主 JS runtime(bg 保留热状态,用于切语言 / DevSettings),mode='all'重启两个 runtime(OTA 切包 / resetData 等磁盘 bundle 变化场景)。reason透传至RCTTriggerReloadCommandListeners和宿主日志,便于生产环境归因。EXC_BAD_ACCESS:SharedRPC 新增 per-listeneralive原子标志,并在invalidate(runtimeId)时同步翻转;两处 iOS executor lambda 改为__weak捕获RCTInstance,让被销毁的 instance 安全 no-op。Android 同步加入 C++ 防御,作为 defense-in-depth(Android 实测不会复现该崩溃)。BackgroundThreadManager.restartWithMode:reason:completion:在主线程串行执行 SharedRPC quiesce → 可选释放 bg host(mode='all') →RCTTriggerReloadCommandListeners;AppDelegate.hostDidStart重启后重新挂回 "main" SharedRPC 监听,mode='all'时还会重新启动 bg host。BackgroundThreadManager.restart(...)通过新增的 JNInativeInvalidateSharedRpc使 SharedRPC 监听失效,然后进程级重启(与react-native-restart-newarch行为对齐)。TODO:后续切到ReactHost.reload(reason)让mode='ui'时 bg runtime 保持热状态。@onekeyfe/react-native-background-thread至 3.0.32(其他包不受影响)。Test plan
EXC_BAD_ACCESS,且 reload 后 SharedRPC 仍可双向通信mode='all'),验证 bg host 被重新拉起、moduleId 表一致restart('ui'/'all', reason),验证 SharedRPC 监听被失效、进程重启后行为与react-native-restart-newarch对齐mode传入非法值时 Promise 被 reject