Production-grade, serverless-first template for async RAG with agentic workflows using:
- Gemini via PydanticAI
- PostgreSQL + pgvector
- YAML prompt files
- Typed tools and workflow interfaces
- Keeps interfaces explicit so components are swappable.
- Uses staged workflow state for predictable agentic behavior.
- Avoids framework lock-in for serverless deployment targets.
- `src/`: core code (no nested root package)
- `prompts/`: versionable YAML prompts
- `tests/`: unit tests for core subsystems
- `docs/`: architecture and extension guidance
The runtime flow is designed to be explicit and easy to extend:
- `handle_query()` receives a serverless event.
- `build_workflow()` wires adapters, prompts, tools, and settings.
- `RagAgentWorkflow.run()` executes staged steps:
  - route intent (`router.yaml`)
  - optional tool execution (`expand_filters`)
  - vector retrieval (`pgvector`)
  - answer synthesis (`synthesis.yaml`)
  - optional critique and retry (`critic.yaml`)
- Response returns answer + citations + workflow trace.
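The staged flow above can be sketched roughly as follows. This is a minimal illustration, not the template's actual `RagAgentWorkflow`: the `WorkflowState` class, the stage functions, and `run_workflow` are hypothetical names, the LLM and pgvector calls are replaced by stand-ins, and the optional tool-execution stage is omitted for brevity.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class WorkflowState:
    """Hypothetical state object carried through every stage."""
    query: str
    intent: str = ""
    chunks: list[str] = field(default_factory=list)
    answer: str = ""
    trace: list[str] = field(default_factory=list)


async def route_intent(state: WorkflowState) -> None:
    # Stand-in for an LLM call driven by router.yaml.
    state.intent = "lookup" if "?" in state.query else "chat"
    state.trace.append("route")


async def retrieve(state: WorkflowState) -> None:
    # Stand-in for a pgvector similarity search.
    state.chunks = ["chunk-1", "chunk-2"]
    state.trace.append("retrieve")


async def synthesize(state: WorkflowState) -> None:
    # Stand-in for an LLM call driven by synthesis.yaml.
    state.answer = f"Answer based on {len(state.chunks)} chunks"
    state.trace.append("synthesize")


async def critique(state: WorkflowState) -> bool:
    # Stand-in for critic.yaml: accept any non-empty answer.
    state.trace.append("critique")
    return bool(state.answer)


async def run_workflow(query: str, max_retries: int = 1) -> WorkflowState:
    state = WorkflowState(query=query)
    await route_intent(state)
    await retrieve(state)
    # Re-run synthesis until the critic accepts or retries run out.
    for _ in range(max_retries + 1):
        await synthesize(state)
        if await critique(state):
            break
    return state
```

Because each stage only reads and writes the shared state, the trace records exactly which stages ran, which is what makes the behavior predictable and easy to test.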
Indexing follows a separate path:
- `handle_index()` receives a document event.
- Document is chunked.
- Chunks are embedded.
- Embeddings are upserted into `rag_chunks`.
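As a rough sketch of the indexing path, assuming fixed-size character chunking with overlap (the template's real chunking strategy is not specified here): `chunk_text`, `embed`, and `index_document` are hypothetical names, the embedding is a deterministic placeholder rather than a model call, and a plain dict stands in for the `rag_chunks` table.

```python
import asyncio
import hashlib


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


async def embed(chunk: str, dim: int = 8) -> list[float]:
    # Placeholder embedding: hash bytes scaled to [0, 1].
    # A real adapter would call the embedding model instead.
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [b / 255 for b in digest[:dim]]


async def index_document(doc_id: str, text: str, store: dict) -> int:
    chunks = chunk_text(text)
    # Embed all chunks concurrently.
    vectors = await asyncio.gather(*(embed(c) for c in chunks))
    for i, (chunk, vector) in enumerate(zip(chunks, vectors)):
        # Upsert: the (doc_id, chunk index) key overwrites any earlier row.
        store[(doc_id, i)] = (chunk, vector)
    return len(chunks)
```

Keying rows by document ID and chunk index makes re-indexing the same document idempotent as long as the chunk count does not shrink; a real store would also need to delete stale rows.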
Use this template as a composition root and replace components by interface:
- Keep orchestrator logic in `src/workflows/rag_agent.py`.
- Add or replace providers in `src/adapters/` (LLM, embeddings, vector store).
- Keep prompt behavior in `prompts/*.yaml` instead of hardcoding prompt text.
- Register business tools in `src/tools/builtin.py` and route through `ToolRegistry`.
- Expose deployment entrypoints in `src/handlers/` for each serverless action.
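Replacing components by interface might look something like the sketch below. It assumes a structural `Protocol` for the retrieval dependency; the `Retriever`, `KeywordRetriever`, and `Workflow` names are illustrative and not taken from the template's source.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


class Retriever(Protocol):
    """Hypothetical interface the workflow depends on."""
    async def search(self, query: str, top_k: int) -> list[str]: ...


class KeywordRetriever:
    """Toy in-memory retriever, a stand-in for a pgvector adapter."""

    def __init__(self, docs: list[str]) -> None:
        self.docs = docs

    async def search(self, query: str, top_k: int) -> list[str]:
        words = query.lower().split()
        # Keep documents that contain at least one query word.
        hits = [d for d in self.docs if any(w in d.lower() for w in words)]
        return hits[:top_k]


@dataclass
class Workflow:
    # Any object with a matching async search() method can be injected.
    retriever: Retriever

    async def run(self, query: str) -> list[str]:
        return await self.retriever.search(query, top_k=2)
```

Since the workflow only sees the protocol, tests can inject the in-memory retriever while production wiring injects a database-backed one, with no change to the orchestrator.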
Recommended customization order:
- Define your domain schema in `src/models.py`.
- Add metadata strategy for retrieval filters.
- Update prompt files for your policy/tone/citation style.
- Add domain tools (search APIs, policy checks, calculators).
- Add integration tests for your expected end-to-end behavior.
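For the domain-tools step, a typed tool could be registered along these lines. This is only a sketch: `SketchToolRegistry` is a hypothetical class and does not reflect the API of the template's `ToolRegistry`, and plain dataclasses are used for the input type to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class DiscountInput:
    price: float
    percent: float


def apply_discount(args: DiscountInput) -> float:
    """Example calculator tool."""
    return round(args.price * (1 - args.percent / 100), 2)


class SketchToolRegistry:
    """Hypothetical registry mapping tool names to (input type, function)."""

    def __init__(self) -> None:
        self._tools: dict[str, tuple[type, Callable[[Any], Any]]] = {}

    def register(self, name: str, input_type: type, fn: Callable[[Any], Any]) -> None:
        self._tools[name] = (input_type, fn)

    def call(self, name: str, raw_args: dict[str, Any]) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        input_type, fn = self._tools[name]
        # Building the dataclass rejects missing or unexpected field names
        # (TypeError); it does not check the value types themselves.
        return fn(input_type(**raw_args))


registry = SketchToolRegistry()
registry.register("apply_discount", DiscountInput, apply_discount)
```

Routing every call through one `call()` method gives a single place to add validation, logging, or tracing later.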
- `src/handlers/index_handler.py`: ingest + chunk + embed + upsert
- `src/handlers/query_handler.py`: query + retrieve + synthesize
- `src/workflows/rag_agent.py`: agentic control flow and retries
- `src/prompts/loader.py`: YAML prompt rendering
- `src/tools/registry.py`: typed tool execution
- `src/adapters/gemini.py`: generation and embedding adapter
- `src/adapters/pgvector_store.py`: retrieval store adapter
- Install dependencies: `uv sync --extra dev`
- Copy env file:
  - PowerShell: `Copy-Item .env.example .env`
  - Bash: `cp .env.example .env`
- Ensure PostgreSQL has the `vector` extension enabled.
  - Example: `psql "$PG_DSN" -f sql/001_init_pgvector.sql`
- Run tests: `uv run --extra dev pytest`
- Run a query locally: `uv run python -m src.cli query "What is this repository for?"`
Use:
- `src/handlers/query_handler.py` for query execution
- `src/handlers/index_handler.py` for document indexing
These handlers are plain async functions intended to be wrapped by your cloud runtime adapter.
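One way to wrap such an async handler for a runtime that expects a synchronous entrypoint is sketched below, in the style of an AWS Lambda handler behind an HTTP gateway. The `handle_query` body here is a stand-in (the template's real signature and return shape may differ), and `lambda_entrypoint` is a hypothetical name.

```python
import asyncio
import json
from typing import Any


async def handle_query(event: dict[str, Any]) -> dict[str, Any]:
    # Stand-in for the template's async query handler.
    return {"answer": f"echo: {event['query']}", "citations": [], "trace": []}


def lambda_entrypoint(event: dict[str, Any], context: Any = None) -> dict[str, Any]:
    """Synchronous wrapper: parse the request body, run the async handler."""
    body = json.loads(event.get("body") or "{}")
    # asyncio.run creates a fresh event loop for each invocation.
    result = asyncio.run(handle_query(body))
    return {"statusCode": 200, "body": json.dumps(result)}
```

Keeping the cloud-specific event parsing in the wrapper leaves the handler itself portable across providers.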
See `.env.example` for the complete list. Core variables:
- `GOOGLE_API_KEY`
- `GEMINI_MODEL`
- `PG_DSN`
- `EMBEDDING_DIMENSION`
- `PROMPTS_DIR`
- `RAG_TOP_K`
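A settings loader for these variables could look roughly like this. It is a sketch using only the standard library; the `Settings` class and `load_settings` function are hypothetical, and the choice of which variables are required, as well as the defaults for `PROMPTS_DIR` and `RAG_TOP_K`, are assumptions rather than values from the template.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: these four must be set; the template may treat them differently.
REQUIRED = ("GOOGLE_API_KEY", "GEMINI_MODEL", "PG_DSN", "EMBEDDING_DIMENSION")


@dataclass(frozen=True)
class Settings:
    google_api_key: str
    gemini_model: str
    pg_dsn: str
    embedding_dimension: int
    prompts_dir: str
    rag_top_k: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return Settings(
        google_api_key=env["GOOGLE_API_KEY"],
        gemini_model=env["GEMINI_MODEL"],
        pg_dsn=env["PG_DSN"],
        embedding_dimension=int(env["EMBEDDING_DIMENSION"]),
        prompts_dir=env.get("PROMPTS_DIR", "prompts"),  # assumed default
        rag_top_k=int(env.get("RAG_TOP_K", "5")),  # assumed default
    )
```

Failing fast on missing variables at startup surfaces configuration errors before the first request, which matters in serverless environments where cold starts are the only initialization point.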
- This template favors reliability and clarity over hidden automation.
- You can add tracing backends (OpenTelemetry, vendor APM) without changing workflow contracts.
- CI is provided at `.github/workflows/ci.yml` with `ruff`, `mypy`, and `pytest`.