This document is a standalone overview of what this repository is, why it exists, and how the pieces fit together. For step-by-step setup and teaching material, see TUTORIAL.md. For quick commands, see the root README.md.
This is an educational Python project that demonstrates how to move from classic RAG (“retrieve → answer”) toward Agentic RAG (“route → retrieve → optionally act with tools → reason in steps → answer”), using LlamaIndex and a small FastAPI backend with a minimal web UI.
The domain is marketplace listings, stylistically inspired by Blocket (Swedish classifieds). Blocket is part of Vend. This repo is not an official Blocket or Vend product, and it does not imply any endorsement or production integration.
Through the browser UI or the /chat API, you can ask questions such as:
- Summarize groups of listings (summary-oriented queries).
- Find or compare products (retrieval-oriented queries).
- Ask for answers that combine retrieval with simple decisions (for example, picking a “best value” candidate in the sample dataset).
- Request currency conversion via small deterministic tools (so numeric conversion is not left to pure free-form generation).
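A deterministic conversion helper of the kind described above can be sketched in a few lines. The function name, rate table, and rounding below are illustrative assumptions for this sketch, not the repo's actual `tools/` code:

```python
# Illustrative sketch of a deterministic currency tool. The rates and names
# are assumptions for the example; the repo's tools/ package defines its own
# helpers (and wraps them as LlamaIndex FunctionTools for the agent).

RATES_TO_SEK = {"SEK": 1.0, "EUR": 11.50, "USD": 10.60}  # example rates

def convert_price(amount: float, from_currency: str, to_currency: str) -> float:
    """Convert via SEK so the arithmetic is explicit and reproducible."""
    in_sek = amount * RATES_TO_SEK[from_currency.upper()]
    return round(in_sek / RATES_TO_SEK[to_currency.upper()], 2)

print(convert_price(115.0, "EUR", "USD"))
```

Because the function is pure and deterministic, the agent can delegate the numeric step to it instead of letting the LLM improvise the arithmetic.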
The UI shows two outputs per question:
- Router answer — output from the Router Query Engine, which chooses between a vector (semantic search) path and a summary path.
- Agent answer — output from a tutorial agent that runs an explicit reasoning loop (router context → filter candidates → simple decision → optional tool use → optional LLM polish).
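The agent's explicit step sequence (router context → filter candidates → simple decision → optional tool use → optional LLM polish) can be sketched as plain Python. The field names, keyword matching, and "cheapest wins" rule here are illustrative stand-ins, not the repo's actual agent logic:

```python
# Sketch of an explicit reasoning loop like the one described above.
# Listing fields and the decision rule are assumptions for this example.

def agent_answer(query: str, listings: list[dict]) -> str:
    # 1. Router context: here we simply treat the listings as retrieved context.
    candidates = listings
    # 2. Filter candidates with a naive keyword match.
    hits = [l for l in candidates if query.lower() in l["title"].lower()]
    if not hits:
        return "No matching listings."
    # 3. Simple decision: cheapest hit is picked as "best value".
    best = min(hits, key=lambda l: l["price_sek"])
    # 4. Optional tool use and LLM polish would happen here in the real agent.
    return f"Best value: {best['title']} at {best['price_sek']} SEK"

listings = [
    {"title": "Road bike", "price_sek": 3500},
    {"title": "Mountain bike", "price_sek": 2900},
]
print(agent_answer("bike", listings))
```

Keeping every step as ordinary code is what makes the loop easy to read and to step through in a debugger, which is the tutorial's point.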
| Layer | Technology |
|---|---|
| Web framework | FastAPI |
| ASGI server | Uvicorn |
| LLM / RAG framework | LlamaIndex |
| LLM & embeddings (defaults) | OpenAI (via LlamaIndex integrations) |
| Frontend | Static HTML, CSS, JavaScript (no separate SPA build) |
| Config | python-dotenv (.env) |
```
User (browser or API client)
  │
  ▼
FastAPI (main.py)
  │
  ├── GET /        → chat UI
  ├── GET /health  → startup status
  └── POST /chat   → router answer + agent answer
        │
        ▼
ProductAssistantEngine (services/engine.py)
  │
  ├── load_products()  → JSON listings → LlamaIndex Documents + metadata
  ├── build_indexes()  → VectorStoreIndex + SummaryIndex
  ├── build_router()   → RouterQueryEngine (vector vs summary tools)
  └── build_agent()    → ProductReasoningAgent (reasoning loop + tools)
```
Startup loads `data/products.json`, embeds documents for the vector index, and builds the router and agent once per process. Each `/chat` request then runs both the router and the agent on the user query.
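The "build once, query many times" pattern above can be sketched as follows. The class and method names mirror the diagram, but the bodies are illustrative stand-ins, not the repo's real implementation:

```python
import json

# Sketch of the orchestration: everything heavy is built once at startup,
# and each /chat request reuses it. Bodies are stand-ins for the real
# LlamaIndex index, router, and agent construction.

def load_products(path: str) -> list[dict]:
    """JSON listings -> list of dicts (the real code builds LlamaIndex Documents)."""
    with open(path) as f:
        return json.load(f)

class ProductAssistantEngine:
    def __init__(self, listings: list[dict]):
        self.listings = listings
        self.indexes = self.build_indexes()  # built once per process
        self.router = self.build_router()
        self.agent = self.build_agent()

    def build_indexes(self):
        # Stand-in for VectorStoreIndex + SummaryIndex construction.
        return {"vector": self.listings, "summary": self.listings}

    def build_router(self):
        # Stand-in for the RouterQueryEngine (vector vs. summary paths).
        return lambda q: f"router answer for: {q}"

    def build_agent(self):
        # Stand-in for the ProductReasoningAgent.
        return lambda q: f"agent answer for: {q}"

    def chat(self, query: str) -> dict:
        # Each /chat request runs both the router and the agent.
        return {"router_answer": self.router(query),
                "agent_answer": self.agent(query)}

engine = ProductAssistantEngine([{"title": "Road bike", "price_sek": 3500}])
print(engine.chat("compare bikes"))
```

The one-time construction matters in practice: embedding documents is the slow, billable step, so it belongs in the startup hook rather than in the request path.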
| Path | Role |
|---|---|
| `main.py` | FastAPI app, routes, startup hook |
| `config.py` | LlamaIndex Settings (LLM + embeddings) from `.env` |
| `data/products.json` | Sample listing records (tutorial dataset) |
| `data/sources/` | Example CSV/JSON sources for ingestion |
| `ingestion/` | Loader, schema, importers, optional polite fetcher, CLI pipeline |
| `indexes/` | Vector + summary index construction |
| `router/` | Router Query Engine setup |
| `tools/` | Currency / conversion helpers (+ LlamaIndex `FunctionTool` wrappers) |
| `agents/` | Tutorial reasoning agent |
| `services/engine.py` | Orchestrates loader → indexes → router → agent |
| `frontend/` | Minimal UI |
- Default: the app reads `data/products.json`. Each array element is one listing; each becomes one LlamaIndex `Document` (text + metadata mirroring the JSON fields).
- Phase 2 pipeline: `python -m ingestion.pipeline` can normalize data from CSV or JSON, and optionally run a gated network fetch (domain allowlist, `robots.txt` check, explicit legal-acknowledgement flag). This is designed for responsible experimentation, not bulk scraping.
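The fetch gating described above can be sketched with the standard library's robots.txt parser. The allowlist contents, user-agent string, and flag name below are illustrative assumptions, not the repo's exact pipeline:

```python
# Sketch of a guarded-fetch check: domain allowlist + robots.txt + an explicit
# acknowledgement flag. Details are illustrative, not the repo's exact code.
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

ALLOWED_DOMAINS = {"example.com"}  # assumption: a small explicit allowlist

def may_fetch(url: str, robots_txt: str, legal_ack: bool) -> bool:
    if not legal_ack:                      # explicit acknowledgement flag
        return False
    if urlparse(url).hostname not in ALLOWED_DOMAINS:
        return False
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())      # robots.txt text fetched separately
    return rp.can_fetch("tutorial-bot", url)

robots = "User-agent: *\nDisallow: /private/"
print(may_fetch("https://example.com/listings/1", robots, legal_ack=True))
print(may_fetch("https://example.com/private/x", robots, legal_ack=True))
```

All three gates must pass before any request is made, which is what keeps the pipeline "local-first" by default.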
Answers in the UI reflect whatever is in `products.json` (or the file you point ingestion at). Replacing the sample data with live listings is a matter of feeding the same schema through ingestion or a supported integration, subject to terms of service, robots.txt, and API availability for any real marketplace.
- Multi-document RAG — one `Document` per listing; metadata preserved for filtering and explanation.
- Router Query Engine — routes between summary-style and search-style query engines.
- Tool calling (tutorial) — deterministic functions for exchange rate and price conversion; the agent calls them where relevant.
- Agent reasoning loop — explicit steps in code (not a full production planner), intended for learning and debugging.
- Safe ingestion — local-first imports; optional, guarded fetch path.
- Not a production marketplace integration.
- Not a substitute for legal review of data collection from third-party sites.
- Not a full multi-agent orchestration platform; the agent loop is intentionally small and readable.
| Document | Contents |
|---|---|
| README.md | Prerequisites, setup, run commands, quick links |
| docs/TUTORIAL.md | Full tutorial, workshop script, startup internals, PyCharm debugging |
| This file | High-level project description and architecture |
Use this repo for learning and experimentation. If you fork or present it publicly, keep the sample-data and non-affiliation context clear when referring to Blocket, Vend, or any real marketplace brand.