therexroder/sec-rag

SEC RAG

SEC RAG is a full-stack Retrieval-Augmented Generation experience tailored for exploring SEC filings. It combines a FastAPI backend, LlamaIndex pipelines, Chroma vector search, and a polished React UI to surface explainable answers with source citations.

Project Highlights

  • End-to-end question answering over bundled 10-K filings with automatic CIK normalization and Gemini-powered reasoning.
  • Real-time streaming chat UX that exposes intermediate steps (chain-of-thought) and relevance-ranked sources.
  • Document ingestion pipeline that cleans raw SEC submissions and syncs them into a Chroma vector store whose schema is kept pgvector-friendly.
  • Typed SDK generated from the FastAPI OpenAPI schema, so the frontend stays in lockstep with the API.
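
As a rough illustration of the CIK normalization mentioned above: SEC Central Index Keys are conventionally written as 10-digit, zero-padded strings. The sketch below is not the repository's code; the function name `normalize_cik` and the accepted input forms (integers, digit strings, an optional "CIK" prefix) are assumptions made for the example.

```python
from typing import Union


def normalize_cik(raw: Union[str, int]) -> str:
    """Return a 10-digit, zero-padded CIK string (hypothetical helper)."""
    text = str(raw).strip()
    # Assumption: tolerate an optional "CIK" prefix such as "CIK0000320193".
    if text.upper().startswith("CIK"):
        text = text[3:]
    if not text.isdigit():
        raise ValueError(f"CIK must contain only digits, got {raw!r}")
    if len(text) > 10:
        raise ValueError(f"CIK has more than 10 digits: {raw!r}")
    # Left-pad with zeros to the conventional 10-digit width.
    return text.zfill(10)


if __name__ == "__main__":
    print(normalize_cik(320193))
    print(normalize_cik("CIK0000320193"))
```

Normalizing once at the API boundary would let every later lookup compare CIKs as plain strings.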

Tech Stack

  • Backend: Python with FastAPI and LlamaIndex, plus Celery-style task orchestration for long-running jobs.
  • Vector Store: ChromaDB (SQLite) with pgvector-friendly schema for portability.
  • Frontend: React 19 + Vite + Tailwind, AI SDK for streaming assistant responses.
  • Tooling: uv for dependency management, Makefile helpers, Docker Compose for local Postgres experiments.

Quick Tour

  1. Launch the API server.

    uv sync
    make run

  2. Fire up the frontend (served via Vite).

    cd frontend
    npm install
    npm run dev

  3. Open the chat interface at the Vite dev URL (default http://localhost:5173) and ask, for example, “What risks did Apple highlight in its 2023 10-K?” The assistant streams the answer with traceable sources.

  4. Explore the documents page to browse indexed filings and trigger embeddings refreshes.

Feature Walkthrough

  • Chat with Citations: Streamed responses, status timeline, chain-of-thought toggle, and document provenance baked into each answer.
  • Document Embedding Pipeline: Structured ingestion cleans raw EDGAR text, chunks filings, and upserts embeddings into Chroma.
  • API-first Architecture: /ask/stream, /documents, and EDGAR enrichment endpoints drive both UI and automation.
  • Developer Experience: Generated SDK clients, comprehensive Makefile targets, and environment loading guardrails in app/main.py.
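
To make the chunking step of the embedding pipeline more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only: the project relies on LlamaIndex, whose splitters work differently, and the name `chunk_text` and the default sizes are assumptions rather than values taken from the repository.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into character windows that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some text twice.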

Live Screenshots

  • Chat Experience

    Chat interface showing status timeline and streamed answer

  • Document Embedding Dashboard

    Documents page showing ingestion form and embedded filings

Getting Started Checklist

  • Populate .env with GOOGLE_API_KEY (see app/main.py).
  • Download or generate SEC filings in data/ for your target tickers.
  • Run make run to start the API and npm run dev (inside frontend/) to start the frontend, then open http://localhost:5173.
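
The first checklist item can be verified before starting the server. The sketch below shows one way such an environment guardrail might look; it is not the code in app/main.py, and the function name `require_google_api_key` is hypothetical. Only the variable name GOOGLE_API_KEY comes from this README.

```python
import os
from typing import Mapping, Optional


def require_google_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured key, or fail fast with a readable error."""
    # Default to the process environment; a mapping can be passed in for tests.
    source = os.environ if env is None else env
    key = source.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GOOGLE_API_KEY is not set; add it to .env before starting the API."
        )
    return key


if __name__ == "__main__":
    try:
        require_google_api_key()
        print("GOOGLE_API_KEY found")
    except RuntimeError as exc:
        print(exc)
```

Failing at startup gives a clearer error than a rejected request on the first question asked.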
