Skip to content

register_wallet: partial-commit on restore-phase failure (persisted record orphaned) #3736

@Claudius-Maginificent

Description

@Claudius-Maginificent

Summary

PlatformWalletManager::register_wallet (in packages/rs-platform-wallet/src/manager/wallet_lifecycle.rs, around lines 215–360) is not fully atomic. The function persists the registration changeset before running the second-phase restore work (load_persisted() and initialize_from_persisted(...)). If either of those later steps fails, the in-memory wallet is removed and Err(WalletCreation) is returned, but the durable record committed earlier is left on disk — there is no compensating delete in the persister API today.

Result: callers observe an apparent registration failure even though the wallet metadata + account snapshot are already committed and will be rehydrated by the next load_from_persistor() on startup. The API advertises all-or-nothing semantics but the durable side-effects are only partially rolled back.

Surfaced by @thepastaclaw on the #3659 review for 40177b48 — see the "register_wallet can return Err after leaving the wallet durably persisted" suggestion.

User Story

As a wallet integrator I want register_wallet to either fully commit a wallet (durable + in-memory) or leave no trace, so that I don't have to teach my UI to distinguish between "creation failed, retry safely" and "creation half-succeeded, the next launch will resurrect this wallet behind my back."

Failure-mode matrix

Step What runs On failure today
1. wm.insert_wallet() In-memory key-wallet register ? early return — clean
2. persister.store() Durable write remove_wallet; clean (fixed by #3659)
3. load_persisted() Rebuild address provider from local persister remove_wallet + return Errdisk record orphaned
4. initialize_from_persisted() / initialize() Hydrate balances / address provider remove_wallet + return Errdisk record orphaned
5. self.wallets.insert() Public registry (no failure path)
6. identity().sync() Background discovery logged, swallowed, returns Ok

The orphan-record problem applies to steps 3 and 4. Most-likely failure source is step 4 if it does any network I/O (DAPI / SPV / identity probe); step 3 is mostly local-compute and rare.

Severity

Medium. The bug requires:

  1. Step 2 (persister.store) succeeds
  2. Step 3 or 4 fails (most plausibly step 4, if it touches network)
  3. The user retries — which now hits WalletAlreadyExists at step 1, because the orphan record is still on disk

Net effect: user-visible "can't recreate, can't recover" state with no clean public API to escape. Not fund-loss, but it strands the wallet from the UI's perspective.

Recommended approaches (any one would close the gap)

  1. Two-phase persist (preferred). persister.store_pending(wallet_id, changeset) → restore phase → persister.promote(wallet_id). On step-3/4 failure: persister.discard_pending(wallet_id) removes the orphan. Cleanest semantics.

  2. Defer persister.store() to the very end. Run steps 3 and 4 from the in-memory registration_changeset, then persist last. Equivalent to two-phase without API change to the persister trait. Simpler, but requires steps 3/4 to be runnable from in-memory state (verify before committing to this).

  3. Compensating delete. Add persister.delete(wallet_id) and call it from the step-3/4 rollback arms. Cheapest patch; leaves the asymmetric "persist before restore" shape in place.

  4. Documented manual recovery (interim). Add a purge_orphaned_record(wallet_id) admin/repair API and surface a clear error message on the WalletAlreadyExists retry path ("wallet persisted but failed to initialize — run X"). Cheapest stopgap; doesn't actually fix atomicity but makes the failure mode recoverable without code changes to integrators.

Acceptance criteria

  • Triggering a load_persisted or initialize_from_persisted failure during register_wallet leaves zero on-disk artefacts.
  • A retry of the same registration after a step-3/4 failure succeeds (no WalletAlreadyExists).
  • New in-crate regression test injecting a fault at each rollback site.

Out of scope

  • The pre-existing duplication of rollback blocks at lines 287–288 / 328–329 / 343–344 (a separate nitpick in the same review — should be extracted to a rollback_inserted_wallet helper, but lives in its own follow-up).

References

🤖 Co-authored by Claudius the Magnificent AI Agent

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions