A Retrieval-Augmented Generation (RAG) document assistant built with Flask, Pinecone, Gemini Embeddings, Groq API, and Google Gemini. Upload PDFs, DOCX, TXT, or MD files and intuitively chat with them using modern AI models.
- **Multi-format Uploads**: Support for PDF, DOCX, TXT, and Markdown files.
- **Interactive Chat**: Query and chat with your documents using AI.
- **RAG-based Retrieval**: Fast and accurate semantic search using the Pinecone vector database.
- **Multiple LLM Support**: Powered by the Groq API (Llama 3) and Google Gemini.
- **Robust Authentication**: Supports Google OAuth as well as standard email/password login.
- **User Profiles**: Custom profile picture uploads and Google profile picture sync.
- **Data Isolation**: Per-user namespaces in Pinecone for complete privacy.
- **Admin Dashboard**: Admin panel to monitor users and uploaded files.
- **Data Management**: Intuitive UI to delete files and clear vector stores.
- **Responsive UI**: Minimal and modern front-end for a seamless user experience.
- **Lightweight & Cloud-Native**: Zero local ML models; all embeddings and LLM calls are cloud-based API calls, requiring minimal server RAM.
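As a minimal sketch of how the multi-format upload check might work (the helper name is illustrative, not the app's actual code):

```python
from pathlib import Path

# Formats the assistant accepts for ingestion.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md"}

def is_supported(filename: str) -> bool:
    """Return True if the uploaded file has a supported extension."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS

assert is_supported("report.PDF")
assert not is_supported("archive.zip")
```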
| Layer | Technology |
|---|---|
| Backend | Flask (Python) |
| Authentication | Flask-Login + Flask-Dance (Google OAuth) |
| Embeddings | Google Gemini (gemini-embedding-001) |
| Vector Store | Pinecone (Serverless) |
| LLMs | Groq API (Llama 3.3 70B) & Google Gemini |
| User Database | MongoDB Atlas |
| Frontend | HTML, CSS, Vanilla JS |
```
RAG_App/
├── app.py               # Main Flask application & routes
├── models.py            # MongoDB user model & encrypted key storage
├── config.py            # Configuration & env variables
├── requirements.txt     # Python dependencies
├── render.yaml          # Render deployment blueprint
├── Dockerfile           # Docker containerization
├── .env.example         # Environment variable template
├── rag/
│   ├── chunker.py       # Document parsing & chunking logic
│   ├── embeddings.py    # Gemini embeddings + Pinecone upsert
│   ├── retriever.py     # Pinecone semantic search & retrieval
│   └── generator.py     # LLM integration for answer generation
├── templates/
│   ├── index.html       # File management & upload dashboard
│   ├── chat.html        # RAG chat interface
│   ├── login.html       # User login page
│   ├── register.html    # User registration page
│   ├── admin.html       # Admin dashboard
│   └── profile.html     # User profile & API key settings
├── static/              # Static assets (CSS, JS, profile_pics)
├── uploads/             # User-uploaded files (isolated per user)
└── .github/workflows/
    ├── devsecops.yml    # Security scanning pipeline
    └── deploy.yml       # Docker build & GHCR push pipeline
```
```bash
git clone https://github.com/param20h/PDF-Assistant-RAG.git
cd PDF-Assistant-RAG
python -m venv .venv

# Windows
.venv\Scripts\activate
# Linux/Mac
source .venv/bin/activate

pip install -r requirements.txt
```

Create a `.env` file using the template:

```bash
cp .env.example .env
```

Fill in the required server-side variables:
```env
SECRET_KEY=<your-secret-key>
ENCRYPTION_KEY=<your-fernet-key>
MONGO_URI=<your-mongodb-atlas-uri>
GOOGLE_CLIENT_ID=<your-google-client-id>
GOOGLE_CLIENT_SECRET=<your-google-client-secret>
```

Generate keys:
```bash
# SECRET_KEY
python -c "import secrets; print(secrets.token_hex(32))"
# ENCRYPTION_KEY
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
```
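The `ENCRYPTION_KEY` is a Fernet key used by `models.py` for encrypted key storage. A minimal sketch of how per-user API keys could be encrypted before being written to MongoDB (helper names are illustrative; in the real app the key would come from the `ENCRYPTION_KEY` env variable):

```python
from cryptography.fernet import Fernet

# Stand-in for the ENCRYPTION_KEY env variable.
key = Fernet.generate_key()
fernet = Fernet(key)

def encrypt_api_key(plaintext: str) -> bytes:
    """Encrypt a user-supplied API key before storing it."""
    return fernet.encrypt(plaintext.encode())

def decrypt_api_key(token: bytes) -> str:
    """Decrypt a stored API key when it is needed for an API call."""
    return fernet.decrypt(token).decode()

token = encrypt_api_key("sk-example-user-key")
assert decrypt_api_key(token) == "sk-example-user-key"
```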
```bash
python app.py
```

Visit http://localhost:5000 in your web browser.
After registering/logging in, each user must add their own API keys on the Profile page:
| Service | Required? | Where to Get | Notes |
|---|---|---|---|
| Gemini API Key | Required | aistudio.google.com | Free; used for embeddings & chat |
| Pinecone API Key | Required | app.pinecone.io | Free tier available |
| Pinecone Index Name | Required | Pinecone Dashboard | Create: dim 3072, metric cosine |
| Groq API Key | Optional | console.groq.com | For Llama 3 chat generation |
- Create a free account at pinecone.io
- Create a Serverless index with:
  - Dimension: `3072`
  - Metric: `cosine`
- Copy your API key and index name into the Profile page
- Go to the Google Cloud Console: console.cloud.google.com
- Create a new project and navigate to APIs & Services → Credentials
- Click Create Credentials → OAuth Client ID
- Set the Authorized redirect URI to: `http://localhost:5000/login/google/authorized`
- Copy your `Client ID` and `Client Secret` into the `.env` file
- Upload: User uploads a document (PDF, DOCX, TXT, or MD).
- Chunking: The document is parsed and split into manageable textual chunks.
- Embedding: Chunks are converted to 3072-dimensional vectors using `gemini-embedding-001`.
- Vector Storage: Vectors are stored in the user's Pinecone namespace.
- Querying: The user submits a question.
- Retrieval: Pinecone retrieves the most semantically relevant chunks.
- Generation: The retrieved context is passed to the selected LLM (Groq or Gemini) to generate an accurate, grounded answer.
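The chunking and retrieval steps above can be sketched in pure Python (toy 3-d vectors stand in for the real 3072-d Gemini embeddings, and the helper names are illustrative, not the app's actual code):

```python
import math

def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Step 2: split a document into overlapping character chunks."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: the metric the Pinecone index is configured with."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec, chunk_vecs, k=2):
    """Step 6: return indices of the k most similar chunks (done by Pinecone in the app)."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy vectors in place of real embeddings.
vecs = [[1, 0, 0], [0, 1, 0], [0.9, 0.1, 0]]
assert top_k([1, 0, 0], vecs, k=2) == [0, 2]
```

In the app itself, the embedding and search are performed by Gemini and Pinecone; this only illustrates the shape of the pipeline.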
- Push your code to GitHub
- Go to Render → New → Web Service
- Connect your GitHub repository
- Render auto-detects `render.yaml` and configures everything
- Add environment variables: `SECRET_KEY`, `ENCRYPTION_KEY`, `MONGO_URI`, `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET`
- Update the Google OAuth redirect URI to: `https://your-app.onrender.com/login/google/authorized`
- Deploy!
```bash
docker build -t rag-app .
docker run -p 5000:5000 --env-file .env rag-app
```

| Tool | Purpose |
|---|---|
| GitHub Actions | CI/CD pipeline |
| Bandit | SAST: Python security vulnerability scanning |
| Gitleaks | Hardcoded secret and credential detection |
| Trivy | Container and dependency vulnerability checking |
| Snyk | Advanced dependency vulnerability scanning |
| OWASP ZAP | DAST: dynamic web security scanning |
| SonarCloud | Overall code quality and security analysis |
| GHCR | Docker image hosting via GitHub Container Registry |
- Name: Paramjit Singh (param20h)
This project is licensed under the MIT License. Check the LICENSE file for more details.
If you found this project helpful or inspiring, please give it a ⭐!