This document is a standalone overview of what this repository is, why it exists, and how the pieces fit together. For step-by-step setup and teaching material, see TUTORIAL.md. For quick commands, see the root README.md.
This is an educational Python project that demonstrates how to move from classic RAG (“retrieve → answer”) toward Agentic RAG (“route → retrieve → optionally act with tools → reason in steps → answer”), using LlamaIndex and a small FastAPI backend with a minimal web UI.
The domain is marketplace listings, stylistically inspired by Blocket (Swedish classifieds). Blocket is part of Vend. This repo is not an official Blocket or Vend product, and it does not imply any endorsement or production integration.
Through the browser UI or the /chat API, you can ask questions such as:
- Summarize groups of listings (summary-oriented queries).
- Find or compare products (retrieval-oriented queries).
- Ask for answers that combine retrieval with simple decisions (for example, picking a “best value” candidate in the sample dataset).
- Request currency conversion via small deterministic tools (so numeric conversion is not left to pure free-form generation).
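A deterministic conversion helper of the kind described above can be sketched in a few lines. The function name, rate table, and rounding below are illustrative assumptions for this sketch, not the repo's actual `tools/` code:

```python
# Illustrative sketch of a deterministic currency tool. The rates and names
# are assumptions for the example; the repo's tools/ package defines its own
# helpers (and wraps them as LlamaIndex FunctionTools for the agent).

RATES_TO_SEK = {"SEK": 1.0, "EUR": 11.50, "USD": 10.60}  # example rates

def convert_price(amount: float, from_currency: str, to_currency: str) -> float:
    """Convert via SEK so the arithmetic is explicit and reproducible."""
    in_sek = amount * RATES_TO_SEK[from_currency.upper()]
    return round(in_sek / RATES_TO_SEK[to_currency.upper()], 2)

print(convert_price(115.0, "EUR", "USD"))
```

Because the function is pure and deterministic, the agent can delegate the numeric step to it instead of letting the LLM improvise the arithmetic.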
The UI shows two outputs per question:
- Router answer — output from the Router Query Engine, which chooses between a vector (semantic search) path and a summary path.
- Agent answer — output from a tutorial agent that runs an explicit reasoning loop (router context → filter candidates → simple decision → optional tool use → optional LLM polish).
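The agent's explicit step sequence (router context → filter candidates → simple decision → optional tool use → optional LLM polish) can be sketched as plain Python. The field names, keyword matching, and "cheapest wins" rule here are illustrative stand-ins, not the repo's actual agent logic:

```python
# Sketch of an explicit reasoning loop like the one described above.
# Listing fields and the decision rule are assumptions for this example.

def agent_answer(query: str, listings: list[dict]) -> str:
    # 1. Router context: here we simply treat the listings as retrieved context.
    candidates = listings
    # 2. Filter candidates with a naive keyword match.
    hits = [l for l in candidates if query.lower() in l["title"].lower()]
    if not hits:
        return "No matching listings."
    # 3. Simple decision: cheapest hit is picked as "best value".
    best = min(hits, key=lambda l: l["price_sek"])
    # 4. Optional tool use and LLM polish would happen here in the real agent.
    return f"Best value: {best['title']} at {best['price_sek']} SEK"

listings = [
    {"title": "Road bike", "price_sek": 3500},
    {"title": "Mountain bike", "price_sek": 2900},
]
print(agent_answer("bike", listings))
```

Keeping every step as ordinary code is what makes the loop easy to read and to step through in a debugger, which is the tutorial's point.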
| Layer | Technology |
|---|---|
| Web framework | FastAPI |
| ASGI server | Uvicorn |
| LLM / RAG framework | LlamaIndex |
| LLM & embeddings (defaults) | OpenAI (via LlamaIndex integrations) |
| Frontend | Static HTML, CSS, JavaScript (no separate SPA build) |
| Config | python-dotenv (.env) |
```
User (browser or API client)
  │
  ▼
FastAPI (main.py)
  │
  ├── GET /        → chat UI
  ├── GET /health  → startup status
  └── POST /chat   → router answer + agent answer
        │
        ▼
ProductAssistantEngine (services/engine.py)
  │
  ├── load_products()  → JSON listings → LlamaIndex Documents + metadata
  ├── build_indexes()  → VectorStoreIndex + SummaryIndex
  ├── build_router()   → RouterQueryEngine (vector vs summary tools)
  └── build_agent()    → ProductReasoningAgent (reasoning loop + tools)
```
Startup loads `data/products.json`, embeds documents for the vector index, and builds the router and agent once per process. Each `/chat` request then runs both the router and the agent on the user query.
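The "build once, query many times" pattern above can be sketched as follows. The class and method names mirror the diagram, but the bodies are illustrative stand-ins, not the repo's real implementation:

```python
import json

# Sketch of the orchestration: everything heavy is built once at startup,
# and each /chat request reuses it. Bodies are stand-ins for the real
# LlamaIndex index, router, and agent construction.

def load_products(path: str) -> list[dict]:
    """JSON listings -> list of dicts (the real code builds LlamaIndex Documents)."""
    with open(path) as f:
        return json.load(f)

class ProductAssistantEngine:
    def __init__(self, listings: list[dict]):
        self.listings = listings
        self.indexes = self.build_indexes()  # built once per process
        self.router = self.build_router()
        self.agent = self.build_agent()

    def build_indexes(self):
        # Stand-in for VectorStoreIndex + SummaryIndex construction.
        return {"vector": self.listings, "summary": self.listings}

    def build_router(self):
        # Stand-in for the RouterQueryEngine (vector vs. summary paths).
        return lambda q: f"router answer for: {q}"

    def build_agent(self):
        # Stand-in for the ProductReasoningAgent.
        return lambda q: f"agent answer for: {q}"

    def chat(self, query: str) -> dict:
        # Each /chat request runs both the router and the agent.
        return {"router_answer": self.router(query),
                "agent_answer": self.agent(query)}

engine = ProductAssistantEngine([{"title": "Road bike", "price_sek": 3500}])
print(engine.chat("compare bikes"))
```

The one-time construction matters in practice: embedding documents is the slow, billable step, so it belongs in the startup hook rather than in the request path.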
| Path | Role |
|---|---|
| `main.py` | FastAPI app, routes, startup hook |
| `config.py` | LlamaIndex Settings (LLM + embeddings) from `.env` |
| `data/products.json` | Sample listing records (tutorial dataset) |
| `data/sources/` | Example CSV/JSON sources for ingestion |
| `ingestion/` | Loader, schema, importers, optional polite fetcher, CLI pipeline |
| `indexes/` | Vector + summary index construction |
| `router/` | Router Query Engine setup |
| `tools/` | Currency / conversion helpers (+ LlamaIndex `FunctionTool` wrappers) |
| `agents/` | Tutorial reasoning agent |
| `services/engine.py` | Orchestrates loader → indexes → router → agent |
| `frontend/` | Minimal UI |
- Default: the app reads `data/products.json`. Each array element is one listing; each becomes one LlamaIndex `Document` (text + metadata mirroring the JSON fields).
- Phase 2 pipeline: `python -m ingestion.pipeline` can normalize data from CSV or JSON, and optionally run a gated network fetch (domain allowlist, `robots.txt` check, explicit legal-acknowledgement flag). This is designed for responsible experimentation, not bulk scraping.
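The fetch gating described above can be sketched with the standard library's robots.txt parser. The allowlist contents, user-agent string, and flag name below are illustrative assumptions, not the repo's exact pipeline:

```python
# Sketch of a guarded-fetch check: domain allowlist + robots.txt + an explicit
# acknowledgement flag. Details are illustrative, not the repo's exact code.
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

ALLOWED_DOMAINS = {"example.com"}  # assumption: a small explicit allowlist

def may_fetch(url: str, robots_txt: str, legal_ack: bool) -> bool:
    if not legal_ack:                      # explicit acknowledgement flag
        return False
    if urlparse(url).hostname not in ALLOWED_DOMAINS:
        return False
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())      # robots.txt text fetched separately
    return rp.can_fetch("tutorial-bot", url)

robots = "User-agent: *\nDisallow: /private/"
print(may_fetch("https://example.com/listings/1", robots, legal_ack=True))
print(may_fetch("https://example.com/private/x", robots, legal_ack=True))
```

All three gates must pass before any request is made, which is what keeps the pipeline "local-first" by default.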
Answers in the UI reflect whatever is in `products.json` (or the file you point ingestion at). Replacing the sample data with live listings is a matter of feeding the same schema through ingestion or a supported integration, subject to terms of service, robots.txt, and API availability for any real marketplace.
- Multi-document RAG — one `Document` per listing; metadata preserved for filtering and explanation.
- Router Query Engine — routes between summary-style and search-style query engines.
- Tool calling (tutorial) — deterministic functions for exchange rate and price conversion; the agent calls them where relevant.
- Agent reasoning loop — explicit steps in code (not a full production planner), intended for learning and debugging.
- Safe ingestion — local-first imports; optional, guarded fetch path.
- Not a production marketplace integration.
- Not a substitute for legal review of data collection from third-party sites.
- Not a full multi-agent orchestration platform; the agent loop is intentionally small and readable.
| Document | Contents |
|---|---|
| README.md | Prerequisites, setup, run commands, quick links |
| docs/TUTORIAL.md | Full tutorial, workshop script, startup internals, PyCharm debugging |
| This file | High-level project description and architecture |
Use this repo for learning and experimentation. If you fork or present it publicly, keep the sample-data and non-affiliation context clear when referring to Blocket, Vend, or any real marketplace brand.