We should implement a dedicated batch ingestion endpoint to handle multiple message pairs in a single HTTP request.
Proposed Solution:
- Create a new `BatchIngestRequest` schema that accepts a list of message pairs.
- Add a POST `/v1/memory/batch-ingest` endpoint in `src/api/routes/memory.py`.
- Use `asyncio.gather()` inside the endpoint to concurrently process the pairs through the ingest pipeline, using the existing `_ingest_semaphore` to limit internal concurrency.
- Return a `BatchIngestResponse` that includes a summary of successes and any failures.
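The flow above could be sketched roughly as follows. This is a minimal, framework-free sketch: `MessagePair`, the `BatchIngestResponse` fields, and the `ingest_pair` stub are assumptions standing in for the project's real schemas and ingest pipeline; only `_ingest_semaphore` and the `asyncio.gather()` pattern come from the proposal itself.

```python
import asyncio
from dataclasses import dataclass, field

# Assumed concurrency limit; in the real code this semaphore already exists.
_ingest_semaphore = asyncio.Semaphore(4)

@dataclass
class MessagePair:
    user: str
    assistant: str

@dataclass
class BatchIngestResponse:
    succeeded: int = 0
    failed: int = 0
    errors: list = field(default_factory=list)

async def ingest_pair(pair: MessagePair) -> None:
    # Placeholder for the existing single-pair ingest pipeline
    # (LLM extraction, graph/vector writes, ...).
    async with _ingest_semaphore:
        await asyncio.sleep(0)  # simulate async I/O

async def batch_ingest(pairs: list[MessagePair]) -> BatchIngestResponse:
    # return_exceptions=True lets one failed pair surface as a result
    # without cancelling the remaining tasks.
    results = await asyncio.gather(
        *(ingest_pair(p) for p in pairs), return_exceptions=True
    )
    resp = BatchIngestResponse()
    for i, r in enumerate(results):
        if isinstance(r, Exception):
            resp.failed += 1
            resp.errors.append({"index": i, "error": str(r)})
        else:
            resp.succeeded += 1
    return resp
```

In the actual endpoint the two dataclasses would be Pydantic models and `batch_ingest` would be the route handler body.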
The /Context page should be updated to use the Batch Ingest API, and an audit of the Global Queue will also be required. The batch endpoint could eventually be optimized to perform bulk inserts into Neo4j and the vector store, drastically reducing database transaction overhead.
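The usual way to bulk-insert into Neo4j is to batch rows into a single `UNWIND` write transaction rather than one transaction per pair. A rough sketch of the payload-building side — the label, property names, and query are illustrative assumptions, not the project's actual graph schema:

```python
# Illustrative only: MessagePair label and properties are assumed, not the
# project's real schema.
BULK_INSERT_QUERY = """
UNWIND $rows AS row
MERGE (m:MessagePair {id: row.id})
SET m.user = row.user, m.assistant = row.assistant
"""

def to_rows(pairs):
    """Flatten (user, assistant) tuples into one UNWIND parameter payload."""
    return [
        {"id": i, "user": user, "assistant": assistant}
        for i, (user, assistant) in enumerate(pairs)
    ]

# With the official neo4j driver this would run as a single write, e.g.:
#   session.run(BULK_INSERT_QUERY, rows=to_rows(pairs))
```

One `UNWIND` call replaces N round-trips, which is where the transaction-overhead savings come from.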
Better error handling is also required: the endpoint should be able to report partial successes (e.g., "48 pairs succeeded, 2 failed due to LLM timeout").
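The partial-success summary could be rendered from per-pair results along these lines (a sketch; the function name and input shape are assumptions):

```python
from collections import Counter

def summarize(succeeded: int, failures: list[str]) -> str:
    """Build a human-readable partial-success summary from per-pair errors."""
    if not failures:
        return f"{succeeded} pairs succeeded"
    # Group failures by error message so repeated causes collapse into one clause.
    counts = Counter(failures)
    detail = ", ".join(f"{n} failed due to {msg}" for msg, n in counts.items())
    return f"{succeeded} pairs succeeded, {detail}"
```

This string would sit alongside the structured per-pair error list in `BatchIngestResponse`, not replace it.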