UrbanFixd - AI Agent Systems Take-Home

A small, production-style AI knowledge assistant that answers questions from a local document dataset. The implementation focuses on clean architecture, local execution, and testability.

Tech Stack

Python + FastAPI
Local vector index with FAISS
sentence-transformers/all-MiniLM-L6-v2 embeddings
Pytest for automated testing

Project Structure

urbanfixd/
  api/routes.py
  ingestion/loader.py
  retrieval/vector_store.py
  agents/qa_agent.py
  tests/
  data/docs/
  main.py
  README.md

Architecture Decisions

Pipeline separation by responsibility
- ingestion/loader.py: load, clean, chunk, and attach metadata.
- retrieval/vector_store.py: embed chunks, persist local FAISS index, retrieve top-k contexts.
- agents/qa_agent.py: reasoning layer that builds a prompt, ranks context sentences, and returns answer + citations.
- api/routes.py: minimal HTTP interface with dependency injection.
Local-first operation
- All data is stored in data/docs.
- FAISS index and metadata are persisted to data/index.
- No paid APIs or external hosted services are required.
Deterministic tests
- Retrieval tests use a fake embedder to avoid network/model download during CI.
- API tests override the QA dependency to keep tests fast and stable.

Data and Metadata

The dataset includes 15 local markdown knowledge pages in data/docs. Each chunk carries metadata:

{ "source": "local_docs", "file": "01_fastapi_async.md", "chunk_id": 0, "checksum": "..." }
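
One way such metadata could be attached during chunking is sketched below. The fixed-size character chunking, the SHA-256 checksum, and the function name `chunk_document` are all assumptions made for illustration; the repository's loader may chunk and hash differently.

```python
import hashlib


def chunk_document(file_name: str, text: str, chunk_size: int = 500) -> list[dict]:
    """Split cleaned text into fixed-size character chunks and attach metadata."""
    cleaned = " ".join(text.split())  # collapse whitespace runs
    chunks = []
    for chunk_id, start in enumerate(range(0, len(cleaned), chunk_size)):
        body = cleaned[start:start + chunk_size]
        chunks.append({
            "text": body,
            "metadata": {
                "source": "local_docs",
                "file": file_name,
                "chunk_id": chunk_id,
                # Assumption: the checksum is a SHA-256 digest of the chunk text.
                "checksum": hashlib.sha256(body.encode("utf-8")).hexdigest(),
            },
        })
    return chunks
```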

API

`GET /health`

Returns service status:

{ "status": "ok" }

`POST /query`

Request:

{ "question": "How does FastAPI handle async requests?" }

Response:

{
  "answer": "FastAPI runs on ASGI servers and async handlers await non-blocking I/O...",
  "sources": [
    { "file": "01_fastapi_async.md", "chunk_id": 0, "score": 0.82 }
  ]
}
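
For trying the endpoint from Python, a minimal standard-library client might look like the sketch below. It assumes the server is running locally on uvicorn's default address, `http://127.0.0.1:8000`; the helper names `build_query_request` and `ask` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumption: uvicorn's default host and port


def build_query_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /query request with a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str) -> dict:
    """Send the request and decode the JSON response; requires a running server."""
    with urllib.request.urlopen(build_query_request(question)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server started, `ask("How does FastAPI handle async requests?")` should return a dictionary with the `answer` and `sources` keys shown in the response above.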

Setup and Run

python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
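uvicorn main:app --reload

The activation command above is for Windows. On macOS or Linux, run `source .venv/bin/activate` instead.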

Tests

pytest -q

Test coverage includes:

The retriever returns the relevant document.
The API endpoint returns a response.
Duplicate documents are ignored.
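
The duplicate check could plausibly be driven by the `checksum` field in the chunk metadata. The sketch below shows that idea under stated assumptions: SHA-256 over whitespace-normalized content, with the first file seen winning. The function name `deduplicate` is hypothetical and the repository may detect duplicates another way.

```python
import hashlib


def deduplicate(documents: dict[str, str]) -> dict[str, str]:
    """Keep the first file seen for each distinct content checksum."""
    seen: set[str] = set()
    unique: dict[str, str] = {}
    for file_name, text in documents.items():
        # Normalize whitespace so trivially reformatted copies hash the same.
        normalized = " ".join(text.split())
        checksum = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if checksum in seen:
            continue  # duplicate content: skip it
        seen.add(checksum)
        unique[file_name] = text
    return unique
```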

Tradeoffs Due to Time Limits

The answer generator uses lightweight extractive reasoning instead of a full local LLM pipeline.
The FAISS index is exact (IndexFlatIP), chosen for simplicity over approximate nearest-neighbor (ANN) configurations.
There is no asynchronous background reindexing yet; the index is built lazily on the first query if it is absent.
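
To make "lightweight extractive reasoning" concrete, here is a minimal sketch that ranks context sentences by word overlap with the question. The scoring rule and the name `rank_sentences` are assumptions for illustration; the actual agent may rank sentences differently.

```python
import re


def rank_sentences(question: str, contexts: list[str], top_n: int = 2) -> list[str]:
    """Return up to top_n context sentences sharing the most words with the question."""
    question_terms = set(re.findall(r"\w+", question.lower()))
    # Split each retrieved context into sentences on terminal punctuation.
    sentences = [
        s.strip()
        for context in contexts
        for s in re.split(r"(?<=[.!?])\s+", context)
        if s.strip()
    ]

    def overlap(sentence: str) -> int:
        return len(question_terms & set(re.findall(r"\w+", sentence.lower())))

    # sorted() is stable, so ties keep their original retrieval order.
    ranked = sorted(sentences, key=overlap, reverse=True)
    return [s for s in ranked[:top_n] if overlap(s) > 0]
```

Sentences with no overlap are dropped, so an unrelated question yields an empty list rather than an arbitrary answer.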

How This Would Scale for Production

Split ingestion/indexing into asynchronous worker jobs.
Move vector storage to a distributed vector database (or a sharded FAISS service).
Add a model-serving layer for embeddings with batching and GPU acceleration.
Add caching for repeated queries and top-k retrieval results.
Add observability: tracing, retrieval metrics, answer quality monitoring, and alerting.
Add auth, rate limiting, request validation hardening, and deployment orchestration.
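
As one small example of the caching item above, repeated queries could be memoized on a normalized form of the question. This is a sketch only: `answer_uncached` is a hypothetical placeholder for the real retrieval and answer pipeline, and the call counter exists purely to show cache hits.

```python
from functools import lru_cache

CALLS = {"count": 0}  # counter used only to demonstrate cache hits


def answer_uncached(question: str) -> str:
    """Placeholder for the real retrieval + answer pipeline (hypothetical)."""
    CALLS["count"] += 1
    return f"answer for: {question}"


@lru_cache(maxsize=1024)
def _cached_answer(normalized_question: str) -> str:
    return answer_uncached(normalized_question)


def answer(question: str) -> str:
    # Normalize case and whitespace so trivially different phrasings share an entry.
    return _cached_answer(" ".join(question.lower().split()))
```

An in-process cache like this is not shared between API instances, so a multi-instance deployment would likely need an external cache instead.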

Final Question: 1M Documents and 500 RPS Redesign

To support 1 million documents and 500 requests/second, the system would be redesigned as separate offline and online planes:

Offline indexing plane
- Stream documents into a durable queue.
- Distributed workers perform cleaning/chunking/embedding.
- Store chunk metadata in object storage plus a relational metadata store.
- Build sharded ANN indexes (IVF/HNSW/PQ) and publish versioned snapshots.
Online query plane
- Stateless API pods behind a load balancer.
- Dedicated retrieval service that fans out queries to vector shards.
- Query understanding/reranking stage for better precision.
- Caching layer for hot queries and precomputed embeddings.
Reliability and operations
- Blue/green index rollout with rollback.
- Circuit breakers and fallback behavior when retrieval is degraded.
- End-to-end tracing and SLOs on latency, retrieval recall, and answer quality.
