Tracks two CodeRabbit findings on PR #450 that are real but out of scope for Pass 1 (which targets single-worker deployments).
1. Idempotency cache: in-flight reservation
CodeRabbit on tinyagentos/routes/agents.py:211:
The `cache.get()` releases its lock immediately. The actual write (`config.agents.append` + `save_config_locked`) happens without any coordination, and `cache.set()` doesn't run until after. A client retrying the same Idempotency-Key while the first request is mid-execution will find the cache empty, proceed with its own write, and create a second agent. The `-2` slug suffix loop masks this as a silent duplicate rather than an error.
Fix: extend `IdempotencyCache` with reservation semantics, where `try_reserve(key)` returns a sentinel that pending callers can await, then `set()` resolves it. Apply to `add_agent` and `deploy_agent_endpoint`.
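A minimal sketch of the reservation semantics, assuming a single asyncio event loop (Pass 1's single-worker case); class shape and names beyond `try_reserve`/`set` are illustrative, not the actual `IdempotencyCache` API:

```python
import asyncio


class IdempotencyCache:
    """Sketch: the first caller for a key gets RESERVED and does the work;
    concurrent callers for the same key await the stored future and reuse
    the first caller's result instead of writing a duplicate."""

    RESERVED = object()  # sentinel meaning "you own the work for this key"

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._entries: dict[str, asyncio.Future] = {}

    async def try_reserve(self, key: str):
        async with self._lock:
            fut = self._entries.get(key)
            if fut is None:
                # First caller: park a future for later arrivals, then proceed.
                self._entries[key] = asyncio.get_running_loop().create_future()
                return self.RESERVED
        # Retried/concurrent caller: block until the owner's set() resolves it.
        return await fut

    async def set(self, key: str, value) -> None:
        async with self._lock:
            fut = self._entries.get(key)
            if fut is not None and not fut.done():
                fut.set_result(value)


async def demo():
    # Two concurrent requests with the same Idempotency-Key: only one
    # "write" happens, and both callers see the same result.
    cache = IdempotencyCache()
    writes = 0

    async def create_agent(key: str):
        nonlocal writes
        reserved = await cache.try_reserve(key)
        if reserved is not cache.RESERVED:
            return reserved  # duplicate request: reuse the first result
        writes += 1
        await asyncio.sleep(0.01)  # stand-in for the config append + save
        await cache.set(key, f"agent-{writes}")
        return f"agent-{writes}"

    results = await asyncio.gather(create_agent("k"), create_agent("k"))
    return results, writes
```

Keeping resolved futures in `_entries` also preserves the existing replay behavior: a later retry with the same key awaits an already-done future and gets the cached result immediately.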
2. AgentTokensStore: multi-worker race
CodeRabbit on tinyagentos/stores/agent_tokens_store.py:80:
The asyncio.Lock protects only a single process. In multi-worker deployments, concurrent requests across workers can bypass the in-process lock and produce multiple revoked_at IS NULL rows for the same agent_id.
Pass 1 ships a unique partial index (`uniq_agent_active_token`) that catches the race at the DB level: the second worker's INSERT raises. But the current `issue()` path doesn't use `BEGIN IMMEDIATE`, so the failure can happen mid-transaction with a confusing error. Fix: switch `issue()` to `BEGIN IMMEDIATE` and surface a clean error to the caller when the constraint hits.
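A sketch of what the `issue()` fix could look like, assuming the store is SQLite accessed through the stdlib `sqlite3` module; the table and column names here are illustrative, only the `uniq_agent_active_token` index name comes from Pass 1:

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    # Autocommit mode, so the explicit BEGIN IMMEDIATE below owns the txn.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_tokens "
        "(agent_id TEXT, token TEXT, revoked_at TEXT)"
    )
    # Partial unique index: at most one non-revoked token per agent.
    conn.execute(
        "CREATE UNIQUE INDEX IF NOT EXISTS uniq_agent_active_token "
        "ON agent_tokens(agent_id) WHERE revoked_at IS NULL"
    )
    return conn


def issue(conn: sqlite3.Connection, agent_id: str, token: str) -> None:
    """Sketch: BEGIN IMMEDIATE takes the write lock up front, so the
    revoke-then-insert pair is atomic across workers, and the partial-index
    violation is converted into a clean, explicit error."""
    try:
        conn.execute("BEGIN IMMEDIATE")  # acquire write lock before reading
        conn.execute(
            "UPDATE agent_tokens SET revoked_at = CURRENT_TIMESTAMP "
            "WHERE agent_id = ? AND revoked_at IS NULL",
            (agent_id,),
        )
        conn.execute(
            "INSERT INTO agent_tokens (agent_id, token) VALUES (?, ?)",
            (agent_id, token),
        )
        conn.execute("COMMIT")
    except sqlite3.IntegrityError as exc:
        conn.execute("ROLLBACK")
        # uniq_agent_active_token fired: another worker issued a token first.
        raise RuntimeError(
            f"active token already exists for agent {agent_id}"
        ) from exc
```

With `BEGIN IMMEDIATE`, a second worker blocks at the lock rather than interleaving reads, so the constraint only fires on a genuine lost race, and the caller gets one clear error instead of a mid-transaction surprise.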
Why not now
Both fixes target multi-worker deploys. Pass 1 ships single-worker, where the in-process lock plus the DB constraint are sufficient. These land when worker fan-out / horizontal scaling enters scope.