
Add batch API endpoints for asynchronous LLM processing#46

Open
tbille wants to merge 13 commits into `main` from `add-batch-api-support`

Conversation

Contributor

@tbille tbille commented Apr 15, 2026

Summary

  • Adds five new batch API endpoints under /v1/batches (create, retrieve, cancel, list, results) that proxy batch operations to LLM providers through the any-llm SDK
  • Implements full request/response models, authentication, provider validation (SUPPORTS_BATCH), usage logging, JSONL temp file construction with cleanup, and BatchNotCompleteError → 409 mapping
  • Includes unit tests for Pydantic model validation and integration tests covering auth, error handling, provider errors, and result serialization
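The JSONL temp-file construction with cleanup mentioned above can be sketched as follows. This is illustrative only: `write_batch_jsonl` and its behavior are assumptions, not the gateway's actual helper in `batches.py`.

```python
import json
import os
import tempfile

def write_batch_jsonl(requests: list[dict]) -> str:
    """Write batch request items to a temporary JSONL file and return its path.

    Sketch of the pattern described in the PR summary: one JSON object
    per line, with the temp file removed if writing fails. The caller
    (or a cleanup hook) is responsible for deleting the file afterwards.
    """
    fd, path = tempfile.mkstemp(suffix=".jsonl")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            for req in requests:
                f.write(json.dumps(req) + "\n")
    except Exception:
        os.unlink(path)  # clean up the partial file on failure
        raise
    return path
```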

Endpoints

| Method | Path | Description |
|--------|------|-------------|
| POST | `/v1/batches` | Create a batch from a JSON request array |
| GET | `/v1/batches/{batch_id}?provider=` | Retrieve batch status |
| POST | `/v1/batches/{batch_id}/cancel?provider=` | Cancel a batch |
| GET | `/v1/batches?provider=` | List batches with pagination |
| GET | `/v1/batches/{batch_id}/results?provider=` | Retrieve batch results (409 if not complete) |
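As a usage sketch, a client could build requests against these endpoints as below (stdlib only, no network call). The base URL and request-body field names are assumptions, not the gateway's verified schema.

```python
import json
import urllib.request

BASE = "http://localhost:8000"  # assumption: local gateway address

def build_create_batch_request(model: str, items: list[dict]) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/batches request.

    Field names here are illustrative; check the regenerated
    openapi.json for the actual request schema.
    """
    body = json.dumps({"model": model, "requests": items}).encode()
    return urllib.request.Request(
        f"{BASE}/v1/batches",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )

def build_results_request(batch_id: str, provider: str) -> urllib.request.Request:
    """Build a GET for batch results; the server answers 409 if the batch
    is not yet complete."""
    return urllib.request.Request(
        f"{BASE}/v1/batches/{batch_id}/results?provider={provider}",
        method="GET",
    )
```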

Files Changed

  • src/gateway/api/routes/batches.py — New route module with all five handlers, Pydantic models, usage logging
  • src/gateway/api/main.py — Router registration (standalone mode only)
  • tests/unit/test_batches_route.py — Pydantic validation tests
  • tests/integration/test_batches_endpoint.py — Full integration test suite with mocked SDK
  • docs/public/openapi.json — Regenerated OpenAPI spec
  • pyproject.toml — SDK dependency pinned to local source with batch types

tbille added 11 commits April 15, 2026 16:33
Implement five /v1/batches endpoints (create, retrieve, cancel, list,
results) that proxy batch operations through the any-llm SDK. Includes
JSONL temp file construction for batch creation, provider support
validation (SUPPORTS_BATCH), 409 handling for incomplete batch results,
usage logging, and automatic temp file cleanup.

Registered in standalone mode only, consistent with other LLM proxy
endpoints. Regenerated OpenAPI spec.
…gging, error handling, and missing tests

Address review feedback for the batch API implementation:
- Add FIXME/docstring documenting that retrieve_batch_results uses
  aretrieve_batch as workaround (SDK lacks aretrieve_batch_results)
- Add db parameter to log_batch_usage for spec consistency
- Add usage logging to retrieve_batch_results endpoint
- Handle ValueError from split_model_provider for invalid model format (400)
- Replace deprecated HTTP_422_UNPROCESSABLE_ENTITY with HTTP_422_UNPROCESSABLE_CONTENT
- Move Batch import under TYPE_CHECKING
- Add test_create_batch_invalid_model_format integration test
- Add test_retrieve_batch_results_logs_usage integration test
- Regenerate OpenAPI spec
Import aretrieve_batch_results, BatchNotCompleteError, and Batch
directly instead of guarding behind TYPE_CHECKING. Also import
BackgroundTasks and switch get_db to get_db_if_needed in preparation
for subsequent fixes.

The db parameter was accepted but never used. Remove it from the
function signature and all call sites. Also replace model='batch'
with an empty string in results logging to avoid polluting usage data.

Use get_db_if_needed with AsyncSession | None for consistency with
the chat.py pattern. Add BackgroundTasks to the create_batch signature
per spec.
…oint

Replace workaround that called aretrieve_batch and returned batch
metadata with proper aretrieve_batch_results call that returns
per-request results. Catch BatchNotCompleteError for 409 response.
Serialize BatchResult items matching the spec response shape.
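The 409 mapping and result serialization described in this commit can be sketched in plain Python. The exception class and the field names (`custom_id`, `response`, `error`) are stand-ins for the SDK's `BatchNotCompleteError` and `BatchResult` types, not their actual definitions.

```python
class BatchNotCompleteError(Exception):
    """Stand-in for the any-llm SDK exception; illustrative only."""

def serialize_results(items: list[dict]) -> list[dict]:
    """Shape per-request batch results for the response body (sketch)."""
    return [
        {
            "custom_id": item["custom_id"],
            "response": item.get("response"),
            "error": item.get("error"),
        }
        for item in items
    ]

def results_status(fetch) -> tuple[int, object]:
    """Return (status, body): 200 with serialized results on success,
    409 when the batch is not yet complete."""
    try:
        return 200, serialize_results(fetch())
    except BatchNotCompleteError:
        return 409, {"detail": "Batch is not complete"}
```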
Isolate the test from SDK behavior by mocking split_model_provider to
raise ValueError directly instead of relying on SDK internals.

Mock aretrieve_batch_results instead of aretrieve_batch. Verify the
per-request results response shape with BatchResult, BatchResultItem,
and BatchResultError. Use BatchNotCompleteError for the 409 test.
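The mocking change can be illustrated with `unittest.mock.AsyncMock`. The handler below is a simplified stand-in for the real route, and the exception class mirrors (but is not) the SDK's.

```python
import asyncio
from unittest.mock import AsyncMock

class BatchNotCompleteError(Exception):
    """Stand-in for the SDK exception; illustrative only."""

async def results_handler(aretrieve_batch_results, batch_id: str) -> int:
    """Return the HTTP status the endpoint would produce (sketch)."""
    try:
        await aretrieve_batch_results(batch_id)
        return 200
    except BatchNotCompleteError:
        return 409

# Mock the SDK coroutine directly, as the updated test does, instead of
# mocking aretrieve_batch.
mock_sdk = AsyncMock(side_effect=BatchNotCompleteError())
```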
Add a uv source override to use the local SDK, which includes
aretrieve_batch_results, BatchResult, BatchResultItem,
BatchResultError, and BatchNotCompleteError.

Update the batch results endpoint description to remove the obsolete
workaround note now that the SDK ships batch result types.
@tbille tbille had a problem deploying to integration-tests April 15, 2026 15:36 — with GitHub Actions Failure
CI fails because uv.lock references ../any-llm which doesn't exist
in CI. Switch to git branch source so CI can resolve the dependency.
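The switch from a local path to a git source follows uv's `[tool.uv.sources]` mechanism in pyproject.toml. The repository URL and branch name below are assumptions for illustration; the actual values live in this PR's pyproject.toml.

```toml
# Before (fails in CI: the relative path does not exist there):
# [tool.uv.sources]
# any-llm = { path = "../any-llm" }

# After: resolve the dependency from a git branch so CI can fetch it.
# URL and branch are hypothetical placeholders.
[tool.uv.sources]
any-llm = { git = "https://github.com/example/any-llm", branch = "batch-result-types" }
```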
@tbille tbille temporarily deployed to integration-tests April 15, 2026 15:41 — with GitHub Actions Inactive
@tbille tbille temporarily deployed to integration-tests April 15, 2026 15:45 — with GitHub Actions Inactive
