A full-stack application that combines Graph RAG (Retrieval-Augmented Generation) with Google's Gemini LLM to provide intelligent document querying capabilities. The application uses Neo4j for graph-based storage of document embeddings and features a modern React frontend.
- Document Upload: Support for PDF, DOCX, and TXT files
- Graph-Based Storage: Neo4j graph database for storing document embeddings and relationships
- Vector Search: Efficient similarity search using vector embeddings
- Gemini LLM Integration: Powered by Google's Gemini 1.5 Pro model
- Interactive Chat Interface: Modern UI for querying documents
- Document Management: View and manage uploaded documents
- Source Attribution: Responses include relevant source documents
- FastAPI: High-performance REST API
- Neo4j: Graph database for storing embeddings and relationships
- Sentence Transformers: Local embedding generation
- Google Gemini: LLM for response generation
- LangChain: Text processing and chunking
- React 18: Modern UI framework
- Vite: Fast build tool and dev server
- TailwindCSS: Utility-first styling
- Lucide Icons: Beautiful icon set
- React Markdown: Formatted response rendering
- Python 3.10 or higher
- Node.js 18 or higher
- Docker and Docker Compose (for Neo4j)
- Google API Key (for Gemini)
git clone <repository-url>
cd graphragagent
Start Neo4j using Docker Compose:
docker-compose up -d
Access Neo4j Browser at http://localhost:7474 (username: neo4j, password: password123)
Create a .env file in the root directory:
cp .env.example .env
Edit .env and add your configuration:
GOOGLE_API_KEY=your_google_api_key_here
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=password123
Using Poetry (recommended):
poetry install
Or using pip:
pip install -r requirements.txt
cd frontend
npm install
From the root directory:
# Using Poetry
poetry run python -m backend.main
# Or using Python directly
python -m backend.main
The backend API will be available at http://localhost:8000
In a new terminal, from the frontend directory:
npm run dev
The frontend will be available at http://localhost:3000
- Navigate to the Upload tab
- Drag and drop or click to select a document (PDF, DOCX, or TXT)
- Wait for the document to be processed and chunked
- The system will automatically generate embeddings and store them in Neo4j
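The chunking step can be pictured with a minimal sketch. It assumes plain fixed-size character windows and borrows the documented defaults (CHUNK_SIZE 1000, CHUNK_OVERLAP 200); the project itself uses LangChain for chunking, so `chunk_text` is a hypothetical stand-in for illustration, not the backend's actual splitter.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: a simplified stand-in for a LangChain text splitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means that a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, which tends to help retrieval at the cost of storing some text twice.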
- Navigate to the Chat tab
- Type your question in the input field
- The system will:
  - Generate an embedding for your query
  - Search for relevant document chunks using vector similarity
  - Use Gemini LLM to generate a contextual response
  - Display the response with source attribution
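The query steps above can be sketched end to end in plain Python. This is an illustration under loud assumptions: `toy_embed` is a bag-of-words stand-in for the sentence transformer, the chunks live in a list rather than in Neo4j, and `build_prompt` only assembles the text that would be sent to Gemini. All function names and the `source`/`text` fields are hypothetical.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: word counts (NOT a real sentence transformer)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[dict], top_k: int = 2) -> list[dict]:
    """Rank chunks by similarity to the query and keep the best top_k."""
    q = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c["text"])), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, hits: list[dict]) -> str:
    """Assemble an LLM prompt that labels each chunk with its source."""
    context = "\n".join(f"[{h['source']}] {h['text']}" for h in hits)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
```

Keeping the source label next to each chunk in the prompt is one simple way to let the response cite the documents it drew from.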
- Navigate to the Documents tab
- View all uploaded documents
- Delete documents as needed
- POST /api/upload - Upload a document
- GET /api/documents - List all documents
- DELETE /api/documents/{document_id} - Delete a document
- GET /api/documents/{document_id}/chunks - Get document chunks
- POST /api/chat - Send a chat message and get a response
- GET /health - Health check endpoint
- GET / - API information
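As a rough illustration of calling these endpoints from Python, the sketch below builds requests with the standard library only. The paths come from the list above; the JSON field name `message`, the assumption that responses are JSON, and every function name are guesses for illustration, so check backend/models.py for the real schema.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:8000"  # backend address from the run instructions


def build_chat_request(message: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /api/chat request. The 'message' field name is an assumption."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_delete_request(document_id: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a DELETE /api/documents/{document_id} request."""
    return urllib.request.Request(
        f"{base_url}/api/documents/{quote(document_id, safe='')}",
        method="DELETE",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a request and decode the body, assuming the API replies with JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `send(build_chat_request("What does the report conclude?"))` would perform the call; the builders themselves never touch the network.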
Edit backend/config.py or use environment variables:
- GOOGLE_API_KEY: Your Google API key for Gemini
- NEO4J_URI: Neo4j connection URI
- NEO4J_USER: Neo4j username
- NEO4J_PASSWORD: Neo4j password
- EMBEDDING_MODEL: Sentence transformer model name
- CHUNK_SIZE: Text chunk size (default: 1000)
- CHUNK_OVERLAP: Chunk overlap size (default: 200)
- GEMINI_MODEL: Gemini model name (default: gemini-1.5-pro)
- TEMPERATURE: LLM temperature (default: 0.7)
- MAX_TOKENS: Maximum response tokens (default: 2048)
Create frontend/.env.local for custom API URL:
VITE_API_URL=http://localhost:8000
graphragagent/
├── backend/
│   ├── __init__.py
│   ├── config.py              # Configuration settings
│   ├── main.py                # FastAPI application
│   ├── models.py              # Pydantic models
│   ├── graph_store.py         # Neo4j graph operations
│   ├── embeddings.py          # Embedding generation
│   ├── document_processor.py  # Document processing
│   └── gemini_agent.py        # Gemini LLM integration
├── frontend/
│   ├── src/
│   │   ├── api/
│   │   │   └── client.js      # API client
│   │   ├── components/
│   │   │   ├── ChatInterface.jsx
│   │   │   ├── DocumentUpload.jsx
│   │   │   └── DocumentList.jsx
│   │   ├── App.jsx
│   │   ├── main.jsx
│   │   └── index.css
│   ├── index.html
│   ├── package.json
│   ├── vite.config.js
│   └── tailwind.config.js
├── pyproject.toml             # Poetry dependencies
├── docker-compose.yml         # Neo4j setup
├── .env.example               # Environment template
└── README.md
# Run with auto-reload
poetry run uvicorn backend.main:app --reload
# Run tests
poetry run pytest
# Format code
poetry run black backend/
# Type checking
poetry run mypy backend/
# Development server with hot reload
npm run dev
# Build for production
npm run build
# Preview production build
npm run preview
# Lint code
npm run lint
- Ensure Neo4j is running: docker-compose ps
- Check Neo4j logs: docker-compose logs neo4j
- Verify credentials in the .env file
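If the container looks healthy but the backend still cannot connect, a quick reachability check on the Bolt port can narrow things down. The sketch below uses only the standard library and does not authenticate; `parse_bolt_uri` and `can_connect` are hypothetical helper names, and falling back to port 7687 (Bolt's default) when the URI omits a port is an assumption.

```python
import socket
from urllib.parse import urlparse


def parse_bolt_uri(uri: str) -> tuple[str, int]:
    """Extract host and port from a URI such as bolt://localhost:7687."""
    parsed = urlparse(uri)
    host = parsed.hostname or "localhost"
    port = parsed.port or 7687  # assumption: default Bolt port when omitted
    return host, port


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_connect(*parse_bolt_uri("bolt://localhost:7687"))` returning False points to a port mapping or firewall problem rather than wrong credentials.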
On first run, the sentence transformer model will be downloaded automatically. This may take a few minutes.
- Ensure your Google API key is valid and has Gemini API access
- Check that the key is correctly set in the .env file
This project is licensed under the MIT License.
Contributions are welcome! Please feel free to submit a Pull Request.
For issues and questions, please open an issue on GitHub.