RAG Web Application

Project Overview

This project implements a Retrieval-Augmented Generation (RAG) web application in which users create projects by uploading PDFs or attaching files from Google Drive. The system segments documents with a hierarchical chunking approach for efficient semantic search and retrieval, and it is designed to place no fixed limit on the number of files per project.
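
As a rough illustration of hierarchical chunking, the sketch below splits text into parent chunks (paragraphs) and child chunks (groups of sentences), with each child keeping a reference to its parent. The names `Chunk` and `chunk_hierarchically` and the `max_child_chars` parameter are hypothetical; they are not taken from app/services/chunking.py, whose actual strategy is not described here.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    chunk_id: str
    text: str
    level: int                 # 0 = parent (paragraph), 1 = child (sentence group)
    parent_id: Optional[str]   # None for parent chunks


def chunk_hierarchically(text: str, max_child_chars: int = 200) -> List[Chunk]:
    """Split text into paragraph-level parents and sentence-group children."""
    chunks: List[Chunk] = []
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    for p_idx, paragraph in enumerate(paragraphs):
        parent_id = f"p{p_idx}"
        chunks.append(Chunk(parent_id, paragraph, level=0, parent_id=None))

        # Greedily pack sentences into children of roughly max_child_chars.
        # A single sentence longer than the limit becomes its own child.
        sentences = re.split(r"(?<=[.!?])\s+", paragraph)
        buffer = ""
        c_idx = 0
        for sentence in sentences:
            candidate = f"{buffer} {sentence}".strip()
            if buffer and len(candidate) > max_child_chars:
                chunks.append(Chunk(f"{parent_id}.c{c_idx}", buffer, 1, parent_id))
                c_idx += 1
                buffer = sentence
            else:
                buffer = candidate
        if buffer:
            chunks.append(Chunk(f"{parent_id}.c{c_idx}", buffer, 1, parent_id))
    return chunks
```

With this layout, a search can match on small child chunks and then return the parent paragraph as surrounding context.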

The application supports advanced quantitative data retrieval and analytics, integrates OpenAI or Anthropic large language models (LLMs) for query processing, and returns outputs with citations and relevance scores. Every user query and system response is saved into a continuous master file associated with each project, ensuring full conversation retention for auditing and further analysis.
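
One possible way to keep such a per-project master file is an append-only JSON Lines log. The sketch below assumes a hypothetical layout of `<base_dir>/<project_id>/master_log.jsonl` and hypothetical function names; the real storage service in app/services/storage.py may work differently.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List


def append_exchange(base_dir: Path, project_id: str, query: str, response: str) -> Path:
    """Append one query/response pair to the project's master log."""
    log_path = base_dir / project_id / "master_log.jsonl"
    log_path.parent.mkdir(parents=True, exist_ok=True)
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "response": response,
    }
    # Append mode keeps earlier exchanges intact, one JSON object per line.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return log_path


def read_history(base_dir: Path, project_id: str) -> List[Dict[str, str]]:
    """Return all logged exchanges for a project, oldest first."""
    log_path = base_dir / project_id / "master_log.jsonl"
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

An append-only file is easy to audit because earlier records are never rewritten.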

Key Features

Project-based user workspace for document upload and Google Drive attachment
Hierarchical chunking of documents for fine-grained retrieval
Scalable implementation supporting any number of files
Quantitative data parsing and advanced analytics capabilities
Query processing using OpenAI or Anthropic APIs with citation and scoring
Continuous logging of all query outputs and chat history per project

Technology Stack

Backend: Python FastAPI (with routes for upload, project management, query handling)
Frontend: React or Next.js for project UI and conversation interface
Storage: PostgreSQL for projects, queries, logs; Qdrant for vector search
LLM APIs: OpenAI GPT models or Anthropic Claude models
Data Processing: Hierarchical chunking, text and table extraction for retrieval and analytics

Project Structure

Rag_App/
├── app/
│   ├── __init__.py
│   ├── main.py                 # FastAPI application entry point
│   ├── api/
│   │   └── routes/             # API route handlers
│   ├── core/                   # Core configuration and settings
│   ├── models/                 # Database models and schemas
│   ├── services/               # Business logic services
│   │   ├── __init__.py
│   │   ├── chunking.py         # Hierarchical chunking implementation
│   │   ├── llm_openai.py       # OpenAI integration
│   │   ├── llm_anthropic.py    # Anthropic integration
│   │   └── storage.py          # Query/output storage service
│   └── utils/                  # Utility functions
├── data/
│   ├── uploads/                # Uploaded files storage
│   └── processed/              # Processed data and query logs
├── tests/                      # Unit and integration tests
├── frontend/                   # Frontend application (React/Next.js)
├── docs/                       # Additional documentation
├── requirements.txt            # Python dependencies
├── .env.example                # Environment variables template
├── .gitignore                  # Git ignore rules
└── README.md                   # This file

Installation and Setup

1. Clone the repository

git clone https://your-repo-url.git
cd Rag_App

2. Create and activate a Python virtual environment

# Windows
python -m venv venv
.\venv\Scripts\activate

# Linux/macOS
python -m venv venv
source venv/bin/activate

3. Install required Python packages

pip install -r requirements.txt

4. Set up environment variables

Copy the .env.example file to .env and fill in your configuration:

cp .env.example .env

Edit .env and add your API keys and configuration:

OPENAI_API_KEY - Your OpenAI API key
ANTHROPIC_API_KEY - Your Anthropic API key
DATABASE_URL - Your PostgreSQL connection string
QDRANT_HOST - Qdrant vector database host
Other configuration as needed
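
A minimal sketch of how the backend might read these variables at startup follows. The `Settings` class, the `load_settings` function, the `localhost` default for `QDRANT_HOST`, and the rule that at least one LLM key must be present are all assumptions made for illustration, not the project's confirmed behaviour.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    database_url: str
    qdrant_host: str
    openai_api_key: Optional[str]
    anthropic_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from a mapping (the process environment by default)."""
    database_url = env.get("DATABASE_URL")
    if not database_url:
        raise RuntimeError("DATABASE_URL is not set")

    # Treat empty strings the same as missing keys.
    openai_key = env.get("OPENAI_API_KEY") or None
    anthropic_key = env.get("ANTHROPIC_API_KEY") or None
    if not (openai_key or anthropic_key):
        # Assumed rule: at least one LLM provider must be configured.
        raise RuntimeError("Set OPENAI_API_KEY or ANTHROPIC_API_KEY")

    return Settings(
        database_url=database_url,
        qdrant_host=env.get("QDRANT_HOST", "localhost"),  # assumed default
        openai_api_key=openai_key,
        anthropic_api_key=anthropic_key,
    )
```

Note that this reads the process environment only. Populating it from the .env file requires a loader such as python-dotenv, and whether the project uses one is not stated here.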

5. Set up the database

# Create PostgreSQL database
# Run migrations (to be implemented)

6. Run the backend server

uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

The API will be available at http://localhost:8000. API documentation will be available at http://localhost:8000/docs.

7. Start the frontend app (optional)

cd frontend
npm install
npm start

Usage

Create a new project via the frontend or API endpoint /projects/create
Upload PDFs or attach Google Drive files to the project
The system ingests and chunks documents hierarchically for search
Query the project knowledge base via natural language questions
View results with citations and relevance scores
All queries and responses are saved continuously for reference
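
The retrieval and citation steps above can be pictured with a toy ranking function. Cosine similarity is used here as an assumed relevance score; the metric and Qdrant configuration the project actually uses are not stated, and `rank_chunks`, `format_citation`, and the record fields (`source`, `page`, `vector`) are hypothetical.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(
    query_vec: Sequence[float],
    chunks: List[Dict[str, Any]],
    top_k: int = 3,
) -> List[Dict[str, Any]]:
    """Score every chunk against the query and return the top_k best matches."""
    scored = [
        {
            "source": c["source"],
            "page": c["page"],
            "text": c["text"],
            "score": round(cosine_similarity(query_vec, c["vector"]), 4),
        }
        for c in chunks
    ]
    scored.sort(key=lambda r: r["score"], reverse=True)
    return scored[:top_k]


def format_citation(result: Dict[str, Any]) -> str:
    """Render a result as a short citation string with its relevance score."""
    return f'[{result["source"]}, p. {result["page"]}] (score {result["score"]:.2f})'
```

In a real deployment the vector database would perform the similarity search; this brute-force loop only shows how a score and a source reference can travel together with each retrieved chunk.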

API Endpoints

POST /projects/create - Create a new project
POST /projects/{project_id}/upload - Upload files to a project
POST /projects/{project_id}/attach-google-drive - Attach Google Drive files
POST /projects/{project_id}/query - Query the project knowledge base
GET /projects/{project_id}/history - Get query history
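
For orientation, here is a small client-side sketch that builds requests for two of these endpoints with the standard library. The JSON body shape (a single `question` field) is an assumption, since the request schema is not documented here; `build_query_request` and `build_history_request` are hypothetical helper names.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # matches the uvicorn command in the setup steps


def build_query_request(project_id: str, question: str) -> Request:
    """Build (but do not send) a POST request for the query endpoint."""
    # Assumption: the endpoint accepts a JSON body with a "question" field.
    payload = json.dumps({"question": question}).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/projects/{project_id}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_history_request(project_id: str) -> Request:
    """Build (but do not send) a GET request for the history endpoint."""
    return Request(url=f"{BASE_URL}/projects/{project_id}/history", method="GET")
```

Once the backend is running, either request can be sent with `urllib.request.urlopen(...)`.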

Development

Running Tests

pytest tests/

Code Formatting

black app/
flake8 app/

Next Steps

Implement database models and migrations
Create API route handlers in app/api/routes/
Implement vector database integration
Build document processing pipeline
Create frontend UI
Add authentication and authorization
Implement comprehensive testing

Contributing

Contributions are welcome. Please open issues or pull requests for enhancements and bugfixes.

License

Specify your license here.

