Skip to content

Gracefully reset the StateProxy internals on error#6169

Merged
masenf merged 2 commits intomainfrom
masenf/state-proxy-error-recovery
Mar 12, 2026
Merged

Gracefully reset the StateProxy internals on error#6169
masenf merged 2 commits intomainfrom
masenf/state-proxy-error-recovery

Conversation

@masenf
Copy link
Collaborator

@masenf masenf commented Mar 11, 2026

If we catch an error during StateProxy.__aenter__, we have to manually unwind the internal state changes to allow the proxy to be used again.

If we catch an error during `StateProxy.__aenter__`, we have to manually unwind
the internal state changes to allow the proxy to be used again.
@codspeed-hq
Copy link

codspeed-hq bot commented Mar 11, 2026

Merging this PR will not alter performance

✅ 8 untouched benchmarks


Comparing masenf/state-proxy-error-recovery (c97268a) with main (3c11451)

Open in CodSpeed

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR fixes a correctness bug in StateProxy where an error raised inside __aenter__ (e.g. a CancelledError from the underlying modify_state lock acquisition) left internal fields (_self_actx, _self_mutable, _self_actx_lock_holder) in a dirty state, making the proxy permanently unusable. The fix wraps the __aenter__ body in a try/except that manually calls __aexit__ to unwind state, and refactors __aexit__ to skip calling the inner context manager's exit when _self_mutable is False (i.e. when __aenter__ never fully completed).

Key changes:

  • __aenter__: wrapped in try/except (Exception, asyncio.CancelledError) that delegates to __aexit__ for cleanup on failure.
  • __aexit__: removed the early-return-on-None guard; now conditionally calls the inner __aexit__ only when _self_mutable is True, then always performs cleanup.
  • tests/units/conftest.py: new mock_app fixture wires a real App into the mocked module for proxy-level unit tests.
  • tests/units/istate/test_proxy.py: new test_state_proxy_recovery test validates re-entry after CancelledError from modify_state.__aenter__.

Issue found: The new __aexit__ uses contextlib.suppress(Exception) to silence errors from the inner __aexit__, but asyncio.CancelledError is a BaseException (not Exception) in Python 3.8+ and will not be suppressed. If the inner modify_state context manager's exit raises CancelledError (e.g. during state delta emission over the socket), all cleanup lines after the with block are skipped and _self_actx_lock is never released — causing a deadlock on the next async with self call. The original code's try/finally pattern covered this correctly.

Confidence Score: 3/5

  • Safe to merge with caution — the core intent is correct, but the __aexit__ refactor introduces a lock-leak regression for CancelledError.
  • The fix correctly addresses the deadlock on re-entry after __aenter__ failure. However, replacing try/finally in __aexit__ with contextlib.suppress(Exception) creates a new failure mode: if the inner context manager's __aexit__ raises asyncio.CancelledError (a BaseException, not Exception), the lock is never released. The test suite also does not cover this scenario.
  • reflex/istate/proxy.py — specifically the __aexit__ method's use of contextlib.suppress(Exception) instead of try/finally.

Important Files Changed

Filename Overview
reflex/istate/proxy.py Core fix wraps __aenter__ body in try/except to call __aexit__ on failure; however __aexit__ now uses contextlib.suppress(Exception) which won't catch asyncio.CancelledError (BaseException in Python 3.8+), risking a permanent lock deadlock if the inner context manager's exit raises CancelledError.
tests/units/conftest.py Adds a mock_app fixture that wires a real App instance into the mocked app module, enabling StateProxy unit tests that need a running app context.
tests/units/istate/test_proxy.py Adds test_state_proxy_recovery which validates re-entry after a CancelledError from modify_state.__aenter__. Does not cover the case where modify_state.__aexit__ raises CancelledError, which is the scenario that exposes the lock-leak bug in __aexit__.

Sequence Diagram

sequenceDiagram
    participant BG as Background Task
    participant SP as StateProxy
    participant Lock as asyncio.Lock
    participant MCM as modify_state ctx mgr

    BG->>SP: async with state_proxy (aenter)
    SP->>Lock: acquire()
    Lock-->>SP: acquired

    SP->>MCM: modify_state().__aenter__()
    alt aenter raises Exception/CancelledError
        MCM-->>SP: raises error
        SP->>SP: __aexit__(*sys.exc_info()) [cleanup]
        SP->>Lock: release()
        SP-->>BG: re-raises error
        BG->>SP: async with state_proxy (retry — works!)
    else aenter succeeds
        MCM-->>SP: mutable_state
        SP->>SP: _self_mutable = True, __wrapped__ = substate
        SP-->>BG: returns self (mutable)
        BG->>SP: async with state_proxy (aexit)
        SP->>MCM: __aexit__() [emit delta, clean state]
        alt aexit raises CancelledError ⚠️
            MCM-->>SP: CancelledError (BaseException)
            Note over SP: contextlib.suppress(Exception)<br/>does NOT catch CancelledError!
            SP-->>BG: lock never released — DEADLOCK
        else aexit succeeds
            MCM-->>SP: ok
            SP->>Lock: release()
        end
    end
Loading

Last reviewed commit: fd05a8b

@masenf masenf merged commit 52a351c into main Mar 12, 2026
47 checks passed
@masenf masenf deleted the masenf/state-proxy-error-recovery branch March 12, 2026 20:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants