feat(asyncio): async DB layer, budget strategies, and log writer #35

Merged: tbille merged 1 commit into main from feat/async-db-budget-strategies on Apr 13, 2026


tbille commented Apr 13, 2026

Summary

  • Converts the entire DB layer from sync psycopg2 + Session to async asyncpg + AsyncSession, so contended DB calls yield to the event loop instead of blocking it.
  • Adds pluggable budget_strategy config: for_update (default/legacy), cas (lock-free, recommended), disabled (skip budget checks)
  • Adds pluggable log_writer_strategy config: single (inline, default), batch (background flush for high-throughput)
  • Adds DB pool tuning config: db_pool_size, db_max_overflow, db_pool_timeout, db_pool_recycle
  • Adds new Prometheus metrics for log writer queue depth, batch size, flush duration, and row counts
  • App startup uses FastAPI lifespan for proper async initialization and graceful shutdown
  • Alembic migrations remain sync (psycopg2) for compatibility
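
To illustrate what a lock-free `cas` budget strategy can look like next to `for_update`, here is a minimal compare-and-swap sketch. It is not the code from this PR: the `budgets` table, its columns, and the `try_spend` helper are hypothetical, and stdlib `sqlite3` stands in for the async Postgres session so the snippet runs on its own. The point is that the `UPDATE` only succeeds if the spend value is still the one that was read, so no row lock is held between the read and the write.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with one user (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE budgets ("
        "user_id TEXT PRIMARY KEY, "
        "spend_cents INTEGER NOT NULL, "
        "max_cents INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO budgets VALUES ('alice', 0, 100)")
    conn.commit()
    return conn


def try_spend(conn: sqlite3.Connection, user_id: str, cost_cents: int,
              max_retries: int = 5) -> bool:
    """Charge cost_cents against the budget using compare-and-swap.

    Returns True if the charge was recorded, False if the user is unknown,
    the budget would be exceeded, or every retry lost the race.
    """
    for _ in range(max_retries):
        row = conn.execute(
            "SELECT spend_cents, max_cents FROM budgets WHERE user_id = ?",
            (user_id,),
        ).fetchone()
        if row is None:
            return False
        spend, limit = row
        if spend + cost_cents > limit:
            return False
        # Swap only if spend_cents is unchanged since the read above.
        cur = conn.execute(
            "UPDATE budgets SET spend_cents = ? "
            "WHERE user_id = ? AND spend_cents = ?",
            (spend + cost_cents, user_id, spend),
        )
        conn.commit()
        if cur.rowcount == 1:
            return True
        # Another writer changed the row first: re-read and retry.
    return False
```

With the demo row above (limit of 100 cents), a first charge of 60 is accepted and a second charge of 60 is rejected, leaving the recorded spend at 60. Integer cents are used here so the equality comparison in the `WHERE` clause is exact.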

Ported from mozilla-ai/any-llm#1001

Co-Authored-By: Julian Bright <brightsparc@gmail.com>

…og writer

Swap synchronous psycopg2 + Session for asyncpg + AsyncSession across the
entire gateway. Contended DB calls now yield to the event loop instead of
blocking it.
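
The difference between blocking and yielding can be shown without a database. The sketch below is only an illustration: `time.sleep` stands in for a synchronous driver call and `asyncio.sleep` for an awaited async one, and all function names are hypothetical. A heartbeat task plays the role of other requests sharing the event loop.

```python
import asyncio
import time


async def heartbeat(ticks: list) -> None:
    """Stand-in for other requests served by the same event loop."""
    while True:
        ticks[0] += 1
        await asyncio.sleep(0.01)


async def blocking_query() -> None:
    # Simulates a sync driver call: the whole event loop is stalled.
    time.sleep(0.1)


async def async_query() -> None:
    # Simulates an awaited async driver call: control returns to the loop.
    await asyncio.sleep(0.1)


async def ticks_during(query) -> int:
    """Count how often the heartbeat ran while `query` was in flight."""
    ticks = [0]
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat run once and go to sleep
    before = ticks[0]
    await query()
    progressed = ticks[0] - before
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return progressed


if __name__ == "__main__":
    print("blocking:", asyncio.run(ticks_during(blocking_query)))
    print("async:   ", asyncio.run(ticks_during(async_query)))
```

The blocking variant reports zero heartbeat ticks because nothing else can run until `time.sleep` returns, while the async variant lets the heartbeat run several times during the same wait.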

Key changes:
- create_engine -> create_async_engine with configurable pool sizing
- sessionmaker -> async_sessionmaker(expire_on_commit=False)
- All route handlers, services, repositories converted to async DB calls
- budget_strategy config: 'for_update' (default), 'cas' (lock-free), 'disabled'
- log_writer_strategy config: 'single' (inline), 'batch' (background flush)
- LogWriter Protocol with SingleLogWriter and BatchLogWriter implementations
- DB pool tuning: db_pool_size, db_max_overflow, db_pool_timeout, db_pool_recycle
- App startup uses FastAPI lifespan for async initialization
- Alembic migrations remain sync (psycopg2)
- New Prometheus metrics for log writer queue/batch/flush
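
As a rough sketch of how a `LogWriter` Protocol with `SingleLogWriter` and `BatchLogWriter` implementations could fit together: the class names come from the list above, but the method names (`write`, `close`), the constructor parameters, and the list used as a sink are assumptions made for illustration, not the PR's actual interface. Each inner list appended to the sink stands for one DB write.

```python
import asyncio
from typing import Protocol

_STOP = object()  # sentinel that tells the background task to drain and exit


class LogWriter(Protocol):
    async def write(self, record: dict) -> None: ...
    async def close(self) -> None: ...


class SingleLogWriter:
    """Inline strategy: every record is written immediately."""

    def __init__(self, sink: list) -> None:
        self._sink = sink

    async def write(self, record: dict) -> None:
        self._sink.append([record])

    async def close(self) -> None:
        pass


class BatchLogWriter:
    """Background strategy: queue records and flush them in batches."""

    def __init__(self, sink: list, batch_size: int = 100,
                 flush_interval: float = 1.0) -> None:
        self._sink = sink
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._queue: asyncio.Queue = asyncio.Queue()
        # Must be constructed while an event loop is running.
        self._task = asyncio.create_task(self._run())

    async def write(self, record: dict) -> None:
        self._queue.put_nowait(record)

    async def _run(self) -> None:
        batch: list = []
        while True:
            try:
                item = await asyncio.wait_for(
                    self._queue.get(), timeout=self._flush_interval
                )
            except asyncio.TimeoutError:
                item = None  # interval elapsed: flush whatever is pending
            if item is _STOP:
                break
            if item is not None:
                batch.append(item)
            if batch and (item is None or len(batch) >= self._batch_size):
                self._sink.append(batch)
                batch = []
        if batch:  # graceful shutdown: flush the remainder
            self._sink.append(batch)

    async def close(self) -> None:
        self._queue.put_nowait(_STOP)
        await self._task


async def demo() -> list:
    sink: list = []
    writer = BatchLogWriter(sink, batch_size=2, flush_interval=1.0)
    for i in range(5):
        await writer.write({"request_id": i})
    await writer.close()
    return sink
```

Running `asyncio.run(demo())` groups the five records into batches of two, two, and one, with the last batch flushed by `close()`. The same `close()` hook is where a lifespan shutdown handler would drain pending logs.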

Co-Authored-By: Julian Bright <brightsparc@gmail.com>
tbille had a problem deploying to integration-tests on April 13, 2026 at 15:49 with GitHub Actions (Failure)
tbille merged commit d3a0ba5 into main on Apr 13, 2026
1 of 4 checks passed
brightsparc commented

@tbille looks like you didn't bring over the load testing script. I think it could still be useful; should I create a PR?

tbille commented Apr 14, 2026

@brightsparc good catch. Can you make a PR, I'll review it today

brightsparc commented

> @brightsparc good catch. Can you make a PR, I'll review it today

Here is the PR: #42. I also noticed a small bug when re-running, so I added graceful shutdown to capture all logs.

2 participants