Doc Visualizer is a full-stack application that takes a 10-K financial document and produces dynamic visualizations of its most important insights.
Doc Visualizer is designed to help users quickly analyze large financial documents (like SEC 10-K filings). It does the following:
- Parses the document into text chunks.
- Embeds the chunks using OpenAI’s embeddings API.
- Stores embeddings in Pinecone for quick retrieval.
- Uses GPT to identify key insights from the document.
- Generates chart or module specifications that can be rendered dynamically in the front-end.
The end goal is to turn dense text content into meaningful data visualizations so users can spot important trends, performance metrics, and risk factors more quickly.
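As an illustration of the first step in the list above, here is a minimal sketch of splitting extracted text into overlapping chunks before embedding. The function name `chunk_text`, the character-based windowing, and the default sizes are assumptions made for this example; the project's actual chunking strategy may differ (for instance, it could split by tokens or by section).

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Overlap keeps a sentence that straddles a boundary visible in both
    neighbouring chunks, which tends to help retrieval later on.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

Each returned chunk would then be sent to the embeddings API and stored in the vector database alongside an identifier for the source document.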
Backend
- FastAPI: For the REST API, background tasks, and overall application logic.
- Python: Core language for the backend.
- OpenAI API: The embeddings API for text embeddings, and the GPT API for summarizing documents, generating insights, and producing chart specs.
- Pinecone: Vector database that stores the embeddings for retrieval-augmented generation (RAG).
- Pydantic: For data validation and modeling (e.g., chart specs, insights).
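To make the idea of a validated chart spec concrete, here is a rough sketch of what such a model could look like. It uses a standard-library dataclass as a stand-in for a Pydantic model so that it runs without dependencies. The field names (`title`, `chart_type`, `labels`, `values`), the allowed chart types, and the helper `parse_chart_spec` are all hypothetical; the project's real schema is not shown in this README.

```python
from dataclasses import dataclass

# Hypothetical set of chart types the front-end could render.
ALLOWED_CHART_TYPES = {"bar", "line", "pie"}


@dataclass
class ChartSpec:
    """Illustrative chart specification (not the project's actual schema)."""

    title: str
    chart_type: str
    labels: list[str]
    values: list[float]

    def __post_init__(self) -> None:
        # Reject specs the front-end could not render.
        if self.chart_type not in ALLOWED_CHART_TYPES:
            raise ValueError(f"unsupported chart_type: {self.chart_type!r}")
        if len(self.labels) != len(self.values):
            raise ValueError("labels and values must have the same length")


def parse_chart_spec(raw: dict) -> ChartSpec:
    """Coerce a model-generated dict into a validated ChartSpec."""
    return ChartSpec(
        title=str(raw["title"]),
        chart_type=str(raw["chart_type"]),
        labels=[str(label) for label in raw["labels"]],
        values=[float(value) for value in raw["values"]],
    )


if __name__ == "__main__":
    # Placeholder numbers, not taken from any real filing.
    spec = parse_chart_spec(
        {"title": "Example", "chart_type": "bar", "labels": ["A", "B"], "values": [1, 2]}
    )
    print(spec)
```

Validating at this boundary means a malformed GPT response fails in the backend rather than producing a broken chart in the browser.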
Frontend
- Next.js: Main React framework.
- TypeScript
[WIP] A simplified view:
├── backend
│   └── app
│       ├── routers
│       ├── services
│       └── main.py
└── frontend
    ├── app
    │   ├── api
    │   │   ├── generate-visualization
    │   │   └── upload
    │   ├── upload
    │   └── visualize
    └── components
        ├── charts
        └── visualization
- Clone the Repository
  git clone https://github.com/your-username/doc-visualizer.git
  cd doc-visualizer
- Set Up Backend
  - Create a virtual environment:
    cd backend
    python -m venv venv
    source venv/bin/activate
  - Install dependencies:
    pip install -r requirements.txt
  - Configure environment variables (see .env.example).
  - Start the FastAPI server:
    uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
- Set Up Frontend
  - Navigate to the frontend directory:
    cd frontend
  - Install dependencies:
    npm install
  - Configure environment variables (see .env.example).
  - Run the dev server:
    npm run dev
  - Visit http://localhost:3000.
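Since the setup depends on environment variables, a small startup check can catch a missing key before the first API call fails. The sketch below is one way to do that. The variable names `OPENAI_API_KEY` and `PINECONE_API_KEY` are assumptions for illustration only; the authoritative list is whatever .env.example contains.

```python
import os
from typing import Mapping, Optional, Sequence

# Hypothetical names; replace with the variables listed in .env.example.
REQUIRED_VARS = ("OPENAI_API_KEY", "PINECONE_API_KEY")


def missing_env_vars(
    required: Sequence[str] = REQUIRED_VARS,
    environ: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
    print("Environment looks complete.")
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.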
- Upload a PDF
  - Go to [frontend_base_url]/upload (e.g., http://localhost:3000/upload).
  - Select a PDF (10-K).
  - The system will generate the visualization at [frontend_base_url]/visualize/[doc-id].
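The two routes in this flow follow a simple pattern, which the sketch below spells out for anyone scripting against a running instance. The helper names `upload_url` and `visualize_url` are made up for this example, and the document ID is assumed to be a single path segment.

```python
from urllib.parse import quote


def upload_url(frontend_base_url: str) -> str:
    """Build the upload page URL: [frontend_base_url]/upload."""
    return f"{frontend_base_url.rstrip('/')}/upload"


def visualize_url(frontend_base_url: str, doc_id: str) -> str:
    """Build the visualization page URL: [frontend_base_url]/visualize/[doc-id]."""
    # Percent-encode the ID so it always stays a single path segment.
    return f"{frontend_base_url.rstrip('/')}/visualize/{quote(doc_id, safe='')}"


if __name__ == "__main__":
    base = "http://localhost:3000"
    print(upload_url(base))
    print(visualize_url(base, "abc123"))  # "abc123" is a placeholder ID
```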
Thank you for checking out Doc Visualizer!
For questions, issues, or contributions, please open an issue or a pull request on GitHub. Happy visualizing!