feat: Per-user ingestion coordination to prevent cross-source race conditions #171
anirudhaacharyap wants to merge 1 commit into
Conversation
Introduce UserIngestionCoordinator that serialises ingestion pipeline execution per user_id using async FIFO locks, while allowing different users to proceed in parallel.

- New src/api/ingestion_coordinator.py with lazy lock creation, automatic cleanup, and a clean async context-manager interface
- Wrap /v1/memory/ingest and /v1/memory/batch-ingest routes with the per-user lock (existing global Semaphore(5) retained as backpressure)
- Wrap both server.py test-frontend ingest routes with the coordinator
- Prevents profile overwrites, temporal duplicates, and summary drift caused by concurrent cross-source requests for the same user

Closes #per-user-coordination
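The coordinator itself is not shown in this thread. A minimal sketch of the shape the description implies (lazy per-user `asyncio.Lock`, reference-counted cleanup, async context-manager API) could look like this; the actual `src/api/ingestion_coordinator.py` may differ in details:

```python
import asyncio
from contextlib import asynccontextmanager


class UserIngestionCoordinator:
    """Serialises ingestion per user_id; different users run in parallel."""

    def __init__(self) -> None:
        self._locks: dict[str, asyncio.Lock] = {}
        self._refs: dict[str, int] = {}  # holders + waiters, for cleanup

    @asynccontextmanager
    async def acquire(self, user_id: str):
        # Lazily create this user's lock on first use. setdefault and the
        # ref-count bump happen in one event-loop step (no await between
        # them), so they are atomic with respect to other tasks.
        lock = self._locks.setdefault(user_id, asyncio.Lock())
        self._refs[user_id] = self._refs.get(user_id, 0) + 1
        try:
            async with lock:  # asyncio.Lock wakes waiters in FIFO order
                yield
        finally:
            self._refs[user_id] -= 1
            if self._refs[user_id] == 0:
                # No holder or waiter left: drop the lock so the dict does
                # not grow without bound across many user_ids.
                del self._refs[user_id]
                del self._locks[user_id]
```

Routes would then wrap the pipeline call in `async with coordinator.acquire(user_id): ...`, which gives same-user serialisation without blocking other users.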
Code Review
This pull request introduces a UserIngestionCoordinator to ensure that ingestion tasks are serialized per user, preventing concurrent pipeline runs for the same user_id. Feedback focuses on optimizing the locking order in ingest_memory to prevent system-wide starvation caused by the global semaphore being acquired before the per-user lock. Additionally, it is recommended to apply the global semaphore and structured error handling to the batch_ingest_memory route to maintain consistent backpressure and error reporting.
```python
async with _ingest_semaphore:
    async with _user_coordinator.acquire(user_id):
        result = await asyncio.wait_for(
            pipeline.run(
                user_query=req.user_query,
                agent_response=req.agent_response or "Acknowledged.",
                user_id=user_id,
                session_datetime=req.session_datetime,
                image_url=req.image_url,
                effort_level=req.effort_level,
            ),
            timeout=120.0
        )
```
Acquiring the global _ingest_semaphore before the per-user _user_coordinator lock can lead to system-wide starvation.
If a single user sends multiple concurrent requests, they could fill all available slots in the global semaphore while waiting for their own sequential user lock. This would prevent other users from acquiring a semaphore slot, even if the system has capacity to process their requests. Swapping the order ensures that a user only occupies a global concurrency slot when they are actually ready to run.
Suggested change:

```python
async with _user_coordinator.acquire(user_id):
    async with _ingest_semaphore:
```

And the batch route's pipeline invocation under review:

```python
result = await asyncio.wait_for(
    pipeline.run(
        user_query=item.user_query,
        agent_response=item.agent_response or "Acknowledged.",
        user_id=user_id,
        session_datetime=item.session_datetime,
        image_url=item.image_url,
        effort_level=item.effort_level,
    ),
    timeout=120.0
)
```
The batch_ingest_memory route currently bypasses the _ingest_semaphore, which is intended to provide system-wide backpressure. To maintain consistent concurrency control across the API, each pipeline run within the batch should respect the global semaphore.
Additionally, consider wrapping this route in a try/except block (similar to the single ingest route) to provide structured error responses instead of a generic 500 internal server error if a pipeline run fails.
Suggested change:

```python
async with _ingest_semaphore:
    result = await asyncio.wait_for(
        pipeline.run(
            user_query=item.user_query,
            agent_response=item.agent_response or "Acknowledged.",
            user_id=user_id,
            session_datetime=item.session_datetime,
            image_url=item.image_url,
            effort_level=item.effort_level,
        ),
        timeout=120.0
    )
```
hi @anirudhaacharyap please have a look at the Gemini suggestions
Fixes #152
Problem
Ingestion requests can originate from multiple sources (MCP, SDK, browser extension) simultaneously. When concurrent requests hit the pipeline for the same user, they race against shared state — causing profile overwrites, temporal event duplicates, and stale summary deduplication.
The existing `asyncio.Semaphore(5)` caps total system concurrency but does not distinguish between users.

Solution
Introduces a `UserIngestionCoordinator` that serialises ingestion per `user_id` using async FIFO locks, while allowing different users to proceed fully in parallel.

New: `src/api/ingestion_coordinator.py`
- Per-user `asyncio.Lock` with lazy creation and automatic cleanup
- `async with coordinator.acquire(user_id)` context-manager API
Modified: `src/api/routes/memory.py`
- `/v1/memory/ingest` — pipeline.run wrapped in per-user lock (nested inside existing global semaphore)
- `/v1/memory/batch-ingest` — entire batch acquired under a single user lock, preserving sequential item processing

Modified: `server.py`
- Both test-frontend ingest routes (`/v1/memory/ingest`, `/api/ingest`) wrapped with the coordinator

Guarantees
- Mutual exclusion per user via `asyncio.Lock`
- FIFO ordering via the `asyncio.Lock` waiter queue
- Full parallelism across different `user_id`s
- Global backpressure `Semaphore(5)` retained

Testing
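The FIFO ordering guarantee can be exercised with a short self-contained script: `asyncio.Lock` wakes waiters in the order they arrived, so same-user requests run in submission order.

```python
import asyncio

async def fifo_order():
    lock = asyncio.Lock()
    order = []

    async def job(tag):
        async with lock:
            order.append(tag)
            await asyncio.sleep(0)  # yield while holding the lock

    # Tasks start in order a, b, c; b and c queue on the same lock and
    # are woken in arrival order when the previous holder releases.
    await asyncio.gather(job("a"), job("b"), job("c"))
    return order

print(asyncio.run(fifo_order()))  # -> ['a', 'b', 'c']
```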
7 unit tests at 100% coverage on the coordinator module:
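One such test might assert mutual exclusion directly. A minimal stand-in coordinator is defined inline so the snippet runs on its own; the real module lives in `src/api/ingestion_coordinator.py`, and the test name here is illustrative:

```python
import asyncio
from contextlib import asynccontextmanager

class _StubCoordinator:
    """Minimal stand-in with the same acquire() shape."""
    def __init__(self):
        self._locks = {}

    @asynccontextmanager
    async def acquire(self, user_id):
        async with self._locks.setdefault(user_id, asyncio.Lock()):
            yield

async def same_user_never_overlaps():
    coord = _StubCoordinator()
    active = max_active = 0

    async def ingest():
        nonlocal active, max_active
        async with coord.acquire("user-1"):
            active += 1
            max_active = max(max_active, active)
            await asyncio.sleep(0.01)  # simulate pipeline work
            active -= 1

    await asyncio.gather(*(ingest() for _ in range(5)))
    return max_active  # must be 1: never two concurrent runs per user

assert asyncio.run(same_user_never_overlaps()) == 1
```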