Juris AI - Vectorless Legal RAG with PageIndex

Juris AI is a question-answering application for legal documents, built on a vectorless retrieval-augmented generation (RAG) approach using PageIndex. It extracts text from uploaded PDF legal documents, builds a lightweight page-level index, retrieves the pages most relevant to a user query, and generates grounded answers with page citations.

Features

Upload and process PDF legal documents
Extract page-level text using PyMuPDF
Build a vectorless PageIndex without embeddings or a vector database
Route user questions to the most relevant document pages
Generate legal answers using Groq-hosted LLMs
Cite the pages used to answer each query
React frontend for document upload and chat
FastAPI backend with PDF upload and chat endpoints

Tech Stack

Backend

Python
FastAPI
PyMuPDF
Groq API
Pydantic

Frontend

React
Vite
Axios
Lucide React

Project Structure

page_index/
├── core/
│   ├── config.py
│   ├── models.py
│   └── page_index.py
├── frontend/
│   ├── public/
│   ├── src/
│   ├── package.json
│   └── vite.config.js
├── uploads/
├── main.py
├── requirements.txt
├── test_bot.py
└── README.md

How It Works

1. A user uploads a PDF legal document.
2. The backend extracts text from each page using PyMuPDF.
3. A PageIndex is generated from page-level text previews.
4. When the user asks a question, the router identifies the most relevant pages from the PageIndex.
5. The answer generator uses only the retrieved pages to produce a response.
6. The response includes page citations so the answer stays grounded in the source document.
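
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in core/page_index.py: the names PageEntry, extract_pages, build_page_index, route, and build_prompt are hypothetical, and the 300-character preview length is an assumption. The project routes pages with a Groq-hosted model; the keyword-overlap scoring here is a simple stand-in so the example runs without an API key.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class PageEntry:
    page_number: int  # 1-based, used for citations
    preview: str      # short text preview stored in the index
    text: str         # full page text handed to the answer step


def extract_pages(pdf_path: str) -> list[str]:
    """Return the text of each page using PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so the rest runs without it

    with fitz.open(pdf_path) as doc:
        return [page.get_text() for page in doc]


def build_page_index(pages: list[str], preview_chars: int = 300) -> list[PageEntry]:
    """Build one entry per page from a whitespace-normalized preview."""
    index = []
    for number, text in enumerate(pages, start=1):
        preview = " ".join(text.split())[:preview_chars]
        index.append(PageEntry(number, preview, text))
    return index


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route(query: str, index: list[PageEntry], top_k: int = 2) -> list[PageEntry]:
    """Stand-in router: rank pages by query-term overlap with the preview."""
    terms = tokenize(query)
    scored = [(len(terms & tokenize(entry.preview)), entry) for entry in index]
    scored = [(score, entry) for score, entry in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1].page_number))
    return [entry for _, entry in scored[:top_k]]


def build_prompt(query: str, pages: list[PageEntry]) -> str:
    """Assemble a prompt that contains only the retrieved pages, labeled by number."""
    context = "\n\n".join(f"[Page {p.page_number}]\n{p.text}" for p in pages)
    return (
        "Answer using only the pages below and cite page numbers.\n\n"
        f"{context}\n\nQuestion: {query}"
    )
```

With a real PDF, the flow would be build_page_index(extract_pages("contract.pdf")), then route(query, index), then build_prompt(query, hits) sent to the model. Labeling each page in the prompt is what lets the answer cite page numbers.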

Backend Setup

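Create and activate a virtual environment (the activation command shown is for Windows; on macOS or Linux, use source venv/bin/activate instead):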

python -m venv venv
.\venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Create a .env file in the project root:

GROQ_API_KEY=your-groq-api-key
GROQ_MODEL=llama-3.3-70b-versatile
UPLOAD_DIR=./uploads

Run the backend:

python main.py

The backend will run at:

http://localhost:8000

Frontend Setup

Move into the frontend directory:

cd frontend

Install dependencies:

npm install

Run the development server:

npm run dev

The frontend will run at:

http://localhost:5173

API Endpoints

Upload Document

POST /upload

Uploads a PDF document and builds its PageIndex.
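
The request format for this endpoint is not documented here. The sketch below assumes a multipart/form-data upload with a form field named file, which is a common FastAPI convention but is not confirmed for this project. The helper names build_upload_request and send_upload are hypothetical, and the response is returned as raw text because its schema is not specified.

```python
import uuid
import urllib.request


def build_upload_request(base_url, filename, pdf_bytes, field_name="file"):
    """Build a multipart/form-data POST for /upload (field name is an assumption)."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (
            f'Content-Disposition: form-data; name="{field_name}"; '
            f'filename="{filename}"\r\n'
        ).encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        f"{base_url}/upload",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def send_upload(base_url, pdf_path):
    """Read a PDF from disk, upload it, and return the raw response body."""
    with open(pdf_path, "rb") as handle:
        request = build_upload_request(base_url, pdf_path, handle.read())
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

For example, send_upload("http://localhost:8000", "contract.pdf") would post the file to a locally running backend.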

Chat with Document

POST /chat

Sends a user query about a previously uploaded document, identified by its document_id.

Example request:

{
  "document_id": "uploaded-document-id",
  "query": "What are the termination clauses in this agreement?"
}
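
A minimal client sketch for this request, using only the Python standard library. The helper names build_chat_request and ask are hypothetical. The response schema is not documented here, so the sketch assumes the endpoint returns JSON and passes the parsed body back unchanged.

```python
import json
import urllib.request


def build_chat_request(base_url, document_id, query):
    """Build the JSON POST for /chat using the fields from the example request."""
    payload = {"document_id": document_id, "query": query}
    return urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def ask(base_url, document_id, query):
    """Send the query and return the parsed response (assumed to be JSON)."""
    request = build_chat_request(base_url, document_id, query)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, ask("http://localhost:8000", "uploaded-document-id", "What are the termination clauses in this agreement?") would send the request shown above to a locally running backend.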

Environment Variables

| Variable | Description |
| --- | --- |
| `GROQ_API_KEY` | API key used to access Groq models |
| `GROQ_MODEL` | Model used for page routing and answer generation |
| `UPLOAD_DIR` | Directory where uploaded PDFs and generated indexes are stored |
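
As an illustration of how these variables might be read at startup, here is a small sketch. It is not the contents of core/config.py: Settings and load_settings are hypothetical names, the fallback values simply mirror the example .env file above, and the sketch reads only the process environment (it does not parse the .env file itself).

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    groq_model: str
    upload_dir: Path


def load_settings(env=None):
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("GROQ_API_KEY", "")
    if not api_key:
        # Fail early rather than at the first model call.
        raise RuntimeError("GROQ_API_KEY is not set")
    return Settings(
        groq_api_key=api_key,
        # Fallbacks below are assumptions taken from the example .env values.
        groq_model=env.get("GROQ_MODEL", "llama-3.3-70b-versatile"),
        upload_dir=Path(env.get("UPLOAD_DIR", "./uploads")),
    )
```

Accepting a mapping argument keeps the loader easy to test without touching the real environment.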

Notes

Do not commit .env files or API keys.
Do not commit uploaded legal documents unless they are public sample files.
This project is designed for document-grounded assistance and should not be treated as legal advice.
