Summary: _record_reliability_event performs a read-modify-write on the reliability summary without holding a lock across the whole operation. It calls _load_reliability_summary() (reads from disk), modifies the in-memory dict, then calls _maybe_save_reliability_summary() (writes to disk). Two concurrent script executions can both read the same old summary, each apply its own result, and the second write silently overwrites the first, so one execution's metrics are lost entirely.
Location: app.py, lines 1493–1568 (_record_reliability_event function).
Steps to reproduce:
- Start the DevShell server under a multi-threaded WSGI server, so that two requests to /api/scripts/run can be handled concurrently.
- Execute two scripts simultaneously so both finish at nearly the same time.
- Check the reliability summary (GET /api/reliability/summary); observe that total_runs incremented by 1 instead of 2, or that one execution's failure/success is not reflected.
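Because the race depends on timing, it may not show up on every attempt against the real server. The interleaving itself can be illustrated in isolation with a minimal sketch that forces both threads to read before either writes. Everything here is a hypothetical stand-in: `store` plays the role of the on-disk summary, and `record_event_unlocked` mimics the load → modify → save shape described above, not the actual code in app.py.

```python
import threading


def demo_lost_update():
    # Hypothetical stand-in for the on-disk reliability summary.
    store = {"total_runs": 0}
    # Barrier makes both threads finish their "load" before either "saves",
    # which is the worst-case interleaving of the unsynchronized cycle.
    both_loaded = threading.Barrier(2)

    def record_event_unlocked():
        snapshot = dict(store)           # load: each thread sees total_runs == 0
        both_loaded.wait()               # wait until the other thread has loaded too
        snapshot["total_runs"] += 1      # modify the private copy
        store.update(snapshot)           # save: overwrites whatever was saved before

    threads = [threading.Thread(target=record_event_unlocked) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store["total_runs"]


if __name__ == "__main__":
    print(demo_lost_update())  # prints 1, although two events were recorded
```

Both threads start from the same snapshot, so the second save replaces the first instead of adding to it.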
Expected: Every completed execution is atomically recorded - concurrent executions each contribute their outcome to the summary without overwriting each other.
Actual: Under concurrent load the read-modify-write cycle is unsynchronized, so one update can silently overwrite another, leaving total_runs, failures, and reliability_score incorrect.
Suggested fix: Wrap the entire read-modify-write body of _record_reliability_event inside the existing _reliability_cache_lock mutex (or introduce a dedicated write lock), so only one thread at a time performs the load → modify → save cycle.
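A minimal sketch of the suggested fix follows, assuming a JSON file as the backing store. The names `SUMMARY_PATH`, `_reliability_write_lock`, `_load_summary`, `_save_summary`, and `record_reliability_event` are hypothetical and only mirror the structure described above; the real helpers in app.py may look different.

```python
import json
import tempfile
import threading
from pathlib import Path

# Hypothetical location and lock; not the actual names or paths used in app.py.
SUMMARY_PATH = Path(tempfile.mkdtemp()) / "reliability_summary.json"
_reliability_write_lock = threading.Lock()


def _load_summary():
    # Read the summary from disk, or start from an empty one.
    if SUMMARY_PATH.exists():
        return json.loads(SUMMARY_PATH.read_text())
    return {"total_runs": 0, "failures": 0}


def _save_summary(summary):
    SUMMARY_PATH.write_text(json.dumps(summary))


def record_reliability_event(success):
    # Hold the lock across the whole load -> modify -> save cycle so that
    # no other thread can read a summary that is about to be overwritten.
    with _reliability_write_lock:
        summary = _load_summary()
        summary["total_runs"] += 1
        if not success:
            summary["failures"] += 1
        _save_summary(summary)


def run_demo(n=20):
    # Start from a clean file so repeated calls give the same totals.
    SUMMARY_PATH.unlink(missing_ok=True)
    threads = [
        threading.Thread(target=record_reliability_event, args=(i % 2 == 0,))
        for i in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return _load_summary()


if __name__ == "__main__":
    print(run_demo())
```

Two caveats apply if this is adapted. If _load_reliability_summary() or _maybe_save_reliability_summary() already acquire _reliability_cache_lock internally and that lock is a plain threading.Lock, wrapping the caller in the same lock would deadlock; a threading.RLock or a dedicated write lock avoids that. Also, a threading lock only serializes threads within one process, so a deployment with several worker processes would still need some form of file-level locking.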
@siddu-k Assign me this issue