Problem
ISOProber._stats is mutated from coroutines dispatched via asyncio.gather without explicit synchronization. This is safe today because asyncio schedules coroutines cooperatively on a single event-loop thread, so they can only interleave at await points, but this guarantee is implicit and undocumented. The project already uses asyncio.to_thread() for matches_for_users in monitor.py:233, establishing a precedent that a future contributor could follow when adding a new blocking data source. If a contributor wraps a new data source call in asyncio.to_thread() and that code path touches _stats, the invariant breaks without any change to ISOProber itself.
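A minimal sketch of the distinction described above, not taken from the project: coroutines run through asyncio.gather stay on the event-loop thread, while anything passed to asyncio.to_thread() runs on a worker thread. The names probe and blocking_source are hypothetical stand-ins, and Python 3.9+ is assumed.

```python
import asyncio
import threading


def blocking_source() -> int:
    # Hypothetical blocking data source call; reports the thread it ran on.
    return threading.get_ident()


async def probe() -> int:
    # Hypothetical probe coroutine; yields once, then reports its thread.
    await asyncio.sleep(0)
    return threading.get_ident()


async def main() -> tuple[int, set[int], int]:
    loop_thread = threading.get_ident()
    # Every coroutine dispatched via gather runs on the event-loop thread.
    probe_threads = set(await asyncio.gather(*(probe() for _ in range(5))))
    # Work handed to to_thread() runs on a separate worker thread.
    worker_thread = await asyncio.to_thread(blocking_source)
    return loop_thread, probe_threads, worker_thread


loop_thread, probe_threads, worker_thread = asyncio.run(main())
```

Any shared state touched inside blocking_source would therefore be accessed outside the single-thread guarantee that the coroutines rely on.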
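Acceptance Criteria
- Docstring or comment on ISOProber._stats documenting the single-thread invariant: "This dict is mutated from async coroutines on the event loop. Thread-safety depends on asyncio cooperative scheduling. Do NOT access from asyncio.to_thread() or thread pool executors."
- CONTRIBUTING.md or docs/architecture.md section documenting the concurrency model: what runs on the event loop, what runs in threads, and the rules for adding new data sources
- threading.Lock guard around _stats mutations as defense-in-depth (low overhead for dict updates, eliminates the implicit contract)
- _stats integrity verified when probe_all() processes concurrent batches
Bugfix bundle: WG21Index / self.papers contract (paperscout_bugfix_bundle_27f91caa.plan.md §5)
- Comment on WG21Index self.papers (or immediately below the class docstring in sources.py): self.papers is replaced wholesale on every refresh(); never mutate in place. This is why len(index.papers) from Bolt / health threads is safe today (no separate paper_count() API is required unless the implementation later mutates in place).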
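To illustrate the self.papers contract (replace wholesale on refresh(), never mutate in place), here is a rough sketch. IndexSketch is a hypothetical stand-in for WG21Index, the dict type and the paper IDs are placeholders, and the "old or new, never half-updated" behavior relies on attribute rebinding being a single reference swap in CPython.

```python
import threading


class IndexSketch:
    """Hypothetical stand-in for WG21Index; only the papers contract is modeled."""

    def __init__(self) -> None:
        # Contract: self.papers is replaced wholesale on every refresh();
        # never mutate it in place.
        self.papers: dict[str, str] = {}

    def refresh(self, fetched: dict[str, str]) -> None:
        # Build the new mapping off to the side, then rebind the attribute in
        # one assignment. A reader sees either the old dict or the new one.
        new_papers = dict(fetched)
        self.papers = new_papers


index = IndexSketch()
index.refresh({"P0001R0": "first", "P0002R0": "second"})
before = index.papers  # a reader's reference to the old snapshot

index.refresh({"P0001R0": "first", "P0002R0": "second", "P0003R0": "third"})

# A reader on another thread (as a health check might be) only calls len().
sizes: list[int] = []
reader = threading.Thread(target=lambda: sizes.append(len(index.papers)))
reader.start()
reader.join()
```

The old snapshot held in before is left untouched by the second refresh(), which is what makes a bare len(index.papers) from another thread safe under this contract.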
Implementation Notes
The lightweight fix is documentation + a defensive lock. The _stats dict is small and updated infrequently (once per probe batch), so lock contention is negligible. The WG21Index docstring at sources.py:33-41 already warns against cross-thread access — extend this pattern to ISOProber. The open-std.org scraper at sources.py:607-649 is the most likely extension point; ensure its integration pattern is documented.
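One possible shape for the defensive lock, as a sketch under stated assumptions rather than the actual ISOProber code: ProberSketch, _record, _stats_lock, stats_snapshot, and the "probed"/"hits" keys are all hypothetical, and probe_all's real signature may differ. The lock is held only for the dict update and never across an await.

```python
import asyncio
import threading


class ProberSketch:
    """Hypothetical stand-in for ISOProber; only _stats handling is modeled."""

    def __init__(self) -> None:
        # Every mutation goes through _stats_lock, so safety no longer
        # depends on asyncio cooperative scheduling alone.
        self._stats = {"probed": 0, "hits": 0}
        self._stats_lock = threading.Lock()

    def _record(self, hit: bool) -> None:
        # Held only for the dict update, never across an await.
        with self._stats_lock:
            self._stats["probed"] += 1
            if hit:
                self._stats["hits"] += 1

    async def _probe_one(self, n: int) -> None:
        await asyncio.sleep(0)  # placeholder for the real network probe
        self._record(hit=(n % 2 == 0))

    async def probe_all(self, batch: range) -> None:
        await asyncio.gather(*(self._probe_one(n) for n in batch))

    def stats_snapshot(self) -> dict[str, int]:
        with self._stats_lock:
            return dict(self._stats)


async def run_concurrent_batches() -> dict[str, int]:
    prober = ProberSketch()
    # Two batches on the event loop plus one update from a worker thread,
    # which is the case the lock is meant to make safe.
    await asyncio.gather(
        prober.probe_all(range(0, 50)),
        prober.probe_all(range(50, 100)),
        asyncio.to_thread(prober._record, True),
    )
    return prober.stats_snapshot()


snapshot = asyncio.run(run_concurrent_batches())
```

The same run_concurrent_batches shape could serve as a starting point for an integrity test: the final counts should equal the number of probes issued, regardless of how the batches interleave.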
References
- Eval finding: Compound-6 (Implicit Concurrency / Extension), T13 residual + T23
- Related files: src/paperscout/sources.py (ISOProber._stats, WG21Index), src/paperscout/monitor.py (asyncio.to_thread usage)