Juris AI is a legal document question-answering application built with a vectorless RAG approach using PageIndex. It extracts text from uploaded PDF legal documents, builds a lightweight page-level index, retrieves the most relevant pages for a user query, and generates grounded answers with page citations.
- Upload and process PDF legal documents
- Extract page-level text using PyMuPDF
- Build a vectorless PageIndex without embeddings or a vector database
- Route user questions to the most relevant document pages
- Generate legal answers using Groq-hosted LLMs
- Cite the pages used to answer each query
- React frontend for document upload and chat
- FastAPI backend with PDF upload and chat endpoints
- Python
- FastAPI
- PyMuPDF
- Groq API
- Pydantic
- React
- Vite
- Axios
- Lucide React
page_index/
├── core/
│ ├── config.py
│ ├── models.py
│ └── page_index.py
├── frontend/
│ ├── public/
│ ├── src/
│ ├── package.json
│ └── vite.config.js
├── uploads/
├── main.py
├── requirements.txt
├── test_bot.py
└── README.md
- A user uploads a PDF legal document.
- The backend extracts text from each page using PyMuPDF.
- A PageIndex is generated from page-level text previews.
- When the user asks a question, the router identifies the most relevant pages from the PageIndex.
- The answer generator uses only the retrieved pages to produce a response.
- The response includes page citations so the answer stays grounded in the source document.
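The flow above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: `PageEntry`, `build_page_index`, `route_query`, `build_prompt`, and the 200-character preview length are hypothetical names and values. The keyword-overlap router is a simple stand-in for the LLM-based routing described above. Page texts are passed in as strings, where the real backend would obtain them from PyMuPDF.

```python
import re
from dataclasses import dataclass

PREVIEW_CHARS = 200  # hypothetical preview length, not taken from the project


@dataclass
class PageEntry:
    page_number: int  # 1-based, so it can be used directly in citations
    preview: str      # short excerpt of the page text


def build_page_index(page_texts, preview_chars=PREVIEW_CHARS):
    """Keep a whitespace-normalized preview per page; no embeddings involved."""
    return [
        PageEntry(page_number=i, preview=" ".join(text.split())[:preview_chars])
        for i, text in enumerate(page_texts, start=1)
    ]


def _tokens(text):
    """Lowercase alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route_query(index, query, top_k=2):
    """Rank pages by keyword overlap with the query.

    Stand-in for the LLM router: pages with no overlap are dropped,
    ties are broken by the lower page number.
    """
    query_tokens = _tokens(query)
    scored = [(len(query_tokens & _tokens(e.preview)), e.page_number) for e in index]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [page for _, page in ranked[:top_k]]


def build_prompt(page_texts, pages, query):
    """Assemble a prompt from only the retrieved pages, tagged for citation."""
    context = "\n\n".join(f"[Page {p}]\n{page_texts[p - 1]}" for p in pages)
    return (
        "Answer using only the pages below and cite page numbers.\n\n"
        f"{context}\n\nQuestion: {query}"
    )


pages = [
    "This Agreement is made between Alpha Ltd and Beta LLC.",
    "Termination. Either party may terminate this Agreement with 30 days notice.",
    "Governing law. This Agreement is governed by the laws of England.",
]
index = build_page_index(pages)
selected = route_query(index, "What are the termination clauses?", top_k=1)
prompt = build_prompt(pages, selected, "What are the termination clauses?")
```

Because the prompt contains only the selected pages, each labeled with its page number, the model has what it needs to cite its sources and nothing else to draw on.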
Create and activate a virtual environment:
python -m venv venv
.\venv\Scripts\activate
Install dependencies:
pip install -r requirements.txt
Create a .env file in the project root:
GROQ_API_KEY=your-groq-api-key
GROQ_MODEL=llama-3.3-70b-versatile
UPLOAD_DIR=./uploads
Run the backend:
python main.py
The backend will run at:
http://localhost:8000
Move into the frontend directory:
cd frontend
Install dependencies:
npm install
Run the development server:
npm run dev
The frontend will run at:
http://localhost:5173
POST /upload
Uploads a PDF document and builds its PageIndex.
POST /chat
Sends a user query for a previously uploaded document.
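A client call to POST /chat might be built as follows, using only the Python standard library. This is a sketch under assumptions: `build_chat_request` is a hypothetical helper, the body is assumed to be sent as JSON with the `document_id` and `query` fields, and the shape of the response is not documented here, so it is printed as-is.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address from the setup steps


def build_chat_request(document_id, query, base_url=BASE_URL):
    """Build (but do not send) a POST /chat request with a JSON body."""
    payload = json.dumps({"document_id": document_id, "query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


request = build_chat_request(
    "uploaded-document-id",
    "What are the termination clauses in this agreement?",
)

# To send it against a running backend:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```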
Example request:
{
"document_id": "uploaded-document-id",
"query": "What are the termination clauses in this agreement?"
}

| Variable | Description |
|---|---|
| GROQ_API_KEY | API key used to access Groq models |
| GROQ_MODEL | Model used for page routing and answer generation |
| UPLOAD_DIR | Directory where uploaded PDFs and generated indexes are stored |
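One way these variables could be read at startup is sketched below. `Settings` and `load_settings` are hypothetical names and may not match what core/config.py does. The fallback values for GROQ_MODEL and UPLOAD_DIR are assumptions copied from the example .env above. The sketch expects the variables to already be present in the process environment and does not parse the .env file itself.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    groq_model: str
    upload_dir: Path


def load_settings(env=None):
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("GROQ_API_KEY", "")
    if not api_key:
        # Fail early rather than at the first model call.
        raise RuntimeError("GROQ_API_KEY is not set")
    return Settings(
        groq_api_key=api_key,
        # Assumed defaults, taken from the example .env values.
        groq_model=env.get("GROQ_MODEL", "llama-3.3-70b-versatile"),
        upload_dir=Path(env.get("UPLOAD_DIR", "./uploads")),
    )


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"GROQ_API_KEY": "your-groq-api-key"})
```

Passing the mapping explicitly keeps the function easy to test without touching real environment variables or secrets.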
- Do not commit .env files or API keys.
- Do not commit uploaded legal documents unless they are public sample files.
- This project is designed for document-grounded assistance and should not be treated as legal advice.