A powerful Graph-based Retrieval Augmented Generation (RAG) system built with Neo4j, Qdrant vector database, semantic caching, and multiple LLM integrations.
GraphRAG combines the power of graph databases with vector search and large language models to create an intelligent knowledge retrieval and generation system. This project leverages Neo4j for graph storage and traversal, Qdrant for hybrid vector search, semantic caching for optimized performance, and integrates with Google Gemini and Cohere LLMs for enhanced question-answering and content generation.
Arabic Medical Data Support: This system includes comprehensive medical disease information in Arabic, making it accessible for Arabic-speaking users to query about symptoms, treatments, medications, and disease relationships in their native language.
- Graph Database Integration: Utilizes Neo4j for efficient knowledge graph storage and querying
- Hybrid Search with Qdrant: Advanced vector search combining dense and sparse vectors for optimal retrieval
- Semantic Caching: Intelligent caching layer that reduces redundant LLM calls and improves response times
- Multi-LLM Support: Integration with Google Gemini and Cohere for diverse AI capabilities
- Multilingual Support: Powered by Cohere's multilingual capabilities, supporting Arabic and other languages
- Arabic Medical Knowledge: Pre-loaded with medical disease data in Arabic for healthcare queries
- Streamlit Frontend: Interactive web interface for easy querying and visualization
- Async Architecture: Built with Python's asyncio for high-performance operations
- Vector Embeddings: Support for multiple embedding models for semantic understanding
- Template Processing: Advanced template parsing for dynamic content generation
- Docker Support: Containerized deployment for easy setup and scalability
```
GraphRAG/
├── frontend/                 # Streamlit-based user interface
├── graphrag/                 # Main application code
│   ├── src/
│   │   ├── controllers/      # API controllers (NLP, Process)
│   │   ├── helpers/          # Utility functions
│   │   ├── models/           # Data models (Neo4j)
│   │   ├── routes/           # API routes
│   │   ├── schemes/          # Data schemas
│   │   └── stores/           # Data storage layers
│   │       ├── llm/          # LLM provider factories
│   │       ├── templates/    # Template parsers
│   │       └── vectordb/     # Vector database providers
│   ├── main.py               # Application entry point
│   └── pipeline.py           # Data processing pipeline
├── GraphRAGDocker/           # Docker configuration
├── requirements.txt          # Python dependencies
└── README.md                 # Project documentation
```
- Python 3.8+
- uv - Fast Python package installer
- Neo4j Database
- Qdrant Vector Database
- Docker (optional, for containerized deployment)
- Google Gemini API Key
- Cohere API Key
- Clone the repository:

```bash
git clone https://github.com/omarsabri125/GraphRAG.git
cd GraphRAG
```

- Install dependencies:

```bash
uv pip install -r requirements.txt
```

- Configure environment variables:
Create a `.env` file with your settings:

```bash
# Neo4j Configuration
NEO4J_URI=<your-neo4j-uri>
NEO4J_USERNAME=<your-username>
NEO4J_PASSWORD=<your-password>

# Qdrant Configuration
QDRANT_URL=<your-qdrant-url>
QDRANT_API_KEY=<your-qdrant-api-key>

# LLM API Keys
GEMINI_API_KEY=<your-gemini-api-key>
COHERE_API_KEY=<your-cohere-api-key>

# Semantic Cache Settings
CACHE_ENABLED=true
CACHE_TTL=3600
SIMILARITY_THRESHOLD=0.95
```

- Run the application:

```bash
python main.py
```

- Run the pipeline (one-time setup):

```bash
python pipeline.py
```

This will process your documents, extract entities and relationships, and populate both Neo4j and Qdrant.
- Build and run with Docker Compose:

```bash
cd GraphRAGDocker
docker-compose up -d
```

After completing the installation and running the pipeline, you can start using the GraphRAG system through the Streamlit interface:
- Start the Backend API:

```bash
uvicorn main:app --host 0.0.0.0 --port 8001 --reload
```

- Launch the Streamlit Frontend (in a new terminal):

```bash
cd frontend
streamlit run app.py
```

The frontend interface will open automatically in your browser at http://localhost:8501.
The Streamlit interface provides:
- Query Interface: Enter your questions in natural language (English or Arabic)
- Real-time Responses: Stream responses as they're generated by the LLM
- Context Visualization: View the retrieved context from the knowledge graph
- Source Attribution: See which entities and relationships were used to generate the answer
- Search Results: Display hybrid search results from Qdrant
- Document Management: Upload and manage documents for processing
- Settings Panel: Configure search parameters and LLM settings
- Arabic Medical Queries: Ask questions about diseases, symptoms, and treatments in Arabic
- Open the Streamlit interface at http://localhost:8501
- Enter your question in the query box (English or Arabic)
- Example (English): "What are the symptoms of diabetes?"
- Example (Arabic): "ما هي أعراض مرض السكري؟"
- The system will:
- Check the semantic cache for similar previous queries
- Perform hybrid search in Qdrant if needed
- Retrieve relevant context from Neo4j graph
- Generate a response using Gemini/Cohere
- Display the answer with source attribution
- View the retrieved context and sources used for the answer
The system includes comprehensive medical information in Arabic covering:
- Diseases (الأمراض): Names, descriptions, and classifications
- Symptoms (الأعراض): Disease symptoms and manifestations
- Treatments (العلاجات): Treatment options and medications
- Medications (الأدوية): Drug names and usage information
- Relationships: Connections between diseases, symptoms, and treatments
Example Arabic Queries:
- "ما هي أعراض مرض الضغط؟" (What are the symptoms of hypertension?)
- "كيف يتم علاج السكري؟" (How is diabetes treated?)
- "ما هي الأدوية المستخدمة لعلاج الربو؟" (What medications are used to treat asthma?)
The application uses a settings management system accessed via `helpers.get_settings()`. Key configuration options include:
- Neo4j connection settings (URI, credentials)
- Qdrant vector database configuration
- LLM provider settings (Gemini, Cohere)
- Semantic cache configuration
- Hybrid search parameters
- Logging levels
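As a rough illustration of how such a settings loader can work, the sketch below reads the `.env`-style keys from the environment. This is a hypothetical stand-in for `helpers.get_settings()`; the real project may use a different mechanism (for example pydantic's `BaseSettings`), and the field set here only mirrors the variables shown earlier.

```python
import os
from dataclasses import dataclass, field

# Hypothetical sketch of the loader behind helpers.get_settings().
# Field names mirror the .env keys documented above.
@dataclass(frozen=True)
class Settings:
    neo4j_uri: str = field(default_factory=lambda: os.environ.get("NEO4J_URI", ""))
    qdrant_url: str = field(default_factory=lambda: os.environ.get("QDRANT_URL", ""))
    cache_enabled: bool = field(
        default_factory=lambda: os.environ.get("CACHE_ENABLED", "true").lower() == "true"
    )
    cache_ttl: int = field(
        default_factory=lambda: int(os.environ.get("CACHE_TTL", "3600"))
    )
    similarity_threshold: float = field(
        default_factory=lambda: float(os.environ.get("SIMILARITY_THRESHOLD", "0.95"))
    )

def get_settings() -> Settings:
    """Build a fresh Settings snapshot from the current environment."""
    return Settings()
```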
The system implements hybrid search combining:
- Dense Vectors: Semantic embeddings for conceptual similarity (works excellently with Arabic)
- Sparse Vectors: Keyword-based matching for precise retrieval
- Fusion Strategy: Combines both approaches for optimal results
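One common fusion strategy is Reciprocal Rank Fusion (RRF), which Qdrant can also perform server-side. The toy sketch below only illustrates the idea with two ranked lists of hypothetical document IDs; it is not the project's actual fusion code.

```python
# Illustrative Reciprocal Rank Fusion over a dense (semantic) and a sparse
# (keyword) result list. Each document scores 1/(k + rank) per list it
# appears in, and the fused ranking sorts by the summed score.
def rrf_fuse(dense_hits, sparse_hits, k=60):
    scores = {}
    for hits in (dense_hits, sparse_hits):
        for rank, doc_id in enumerate(hits):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["doc_3", "doc_1", "doc_7"]   # semantic neighbours
sparse = ["doc_1", "doc_9", "doc_3"]  # keyword matches
fused = rrf_fuse(dense, sparse)       # doc_1 and doc_3 rank first
```

Documents that appear high in both lists dominate the fused ranking, which is why hybrid search tends to beat either method alone.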
Semantic caching reduces latency and API costs by:
- Storing previous query-response pairs with embeddings
- Matching similar queries using vector similarity (supports Arabic queries)
- Returning cached responses for semantically similar questions
- Configurable similarity thresholds and TTL
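The mechanism can be sketched as a small in-memory cache keyed on query embeddings. This is a minimal illustration, not the project's implementation: the real system persists entries in a vector store, and the embeddings here would come from a Cohere/Gemini embedding model.

```python
import math, time

# Minimal in-memory semantic cache sketch. A query hits the cache when a
# stored entry is within TTL and its embedding's cosine similarity to the
# query embedding meets the threshold.
class SemanticCache:
    def __init__(self, threshold=0.95, ttl=3600):
        self.threshold, self.ttl = threshold, ttl
        self.entries = []  # (embedding, response, timestamp)

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    def get(self, query_emb):
        now = time.time()
        for emb, resp, ts in self.entries:
            if now - ts < self.ttl and self._cosine(query_emb, emb) >= self.threshold:
                return resp  # cache hit: the LLM call is skipped
        return None

    def put(self, query_emb, response):
        self.entries.append((query_emb, response, time.time()))
```

A slightly reworded query produces a nearby embedding, so it can still hit the cache even though the exact string differs.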
Google Gemini:
- Used for advanced reasoning and content generation
- Supports multimodal inputs
- Multilingual support including Arabic
- Configured via GEMINI_API_KEY
Cohere:
- Provides powerful embeddings and reranking
- Excellent for multilingual support (100+ languages including Arabic)
- High-quality Arabic text understanding
- Configured via COHERE_API_KEY
The pipeline is a one-time execution process designed for initial data ingestion and processing:
- Entity Extraction: Extracts entities from input documents (including Arabic medical texts) using LLMs (Gemini/Cohere)
- Relationship Extraction: Identifies and extracts relationships between entities
- Neo4j Storage: Stores entities and relationships as a knowledge graph in Neo4j
- Vector Embedding: Generates embeddings for entities and relationships (multilingual embeddings for Arabic)
- Qdrant Injection: Indexes embeddings in Qdrant for hybrid search capabilities
Pipeline Flow:
Input Documents (Arabic/English) → Entity/Relationship Extraction → Neo4j Graph Storage → Vector Embeddings → Qdrant Index
Once the pipeline completes, your data is:
- Structured in Neo4j as a knowledge graph
- Searchable via Qdrant with hybrid vector search
- Ready for RAG queries through the API endpoints in Arabic or English
Note: The pipeline should only be run once during initial setup or when adding new data to the knowledge base.
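The pipeline flow above can be sketched in a few lines. All function names below are illustrative stubs, not the project's API: the extraction step stands in for the Gemini/Cohere prompts, and the toy embedding stands in for a multilingual embedding model.

```python
# High-level sketch of the pipeline: extract entities/relations per document,
# collect graph data (Neo4j in the real code) and embeddings (Qdrant).
def run_pipeline(documents):
    graph, vectors = [], []
    for doc in documents:
        entities, relations = extract_entities_and_relations(doc)  # LLM call
        graph.append((entities, relations))          # -> Neo4j graph storage
        vectors.extend(embed(e) for e in entities)   # -> Qdrant index
    return graph, vectors

def extract_entities_and_relations(doc):
    # Toy stand-in: treat capitalized words as entities, chain them as relations.
    entities = [w for w in doc.split() if w.istitle()]
    relations = [(a, "related_to", b) for a, b in zip(entities, entities[1:])]
    return entities, relations

def embed(text):
    # Toy 2-dimensional embedding for illustration only.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]

graph, vectors = run_pipeline(["Diabetes causes Thirst and Fatigue"])
```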
The application exposes RESTful API endpoints through controllers for querying the knowledge base:
- NLP Controller: Natural language processing and query operations against the knowledge graph (supports Arabic)
- Process Controller: Data processing and management operations
How it works after pipeline execution:
- User sends a query via Streamlit frontend or API (in Arabic or English)
- System checks semantic cache for similar queries
- If not cached, performs hybrid search in Qdrant
- Retrieves relevant context from Neo4j graph
- Generates response using LLMs (Gemini/Cohere)
- Caches the response for future similar queries
- Returns response to user through the frontend
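The request flow above reduces to a cache-then-retrieve-then-generate orchestration. The sketch below uses an exact-match dictionary as a stand-in for the semantic cache, and stubbed `search`/`generate` callables in place of the Qdrant/Neo4j retrieval and the Gemini/Cohere calls.

```python
# Sketch of the post-pipeline query flow: check cache, otherwise retrieve
# context and generate, then cache the answer for future similar queries.
def answer_query(query, cache, search, generate):
    if query in cache:                  # real system: semantic similarity match
        return cache[query], "cache"
    context = search(query)             # hybrid Qdrant search + Neo4j context
    answer = generate(query, context)   # LLM generation
    cache[query] = answer
    return answer, "llm"

cache = {}
search = lambda q: ["diabetes -[has_symptom]-> increased thirst"]
generate = lambda q, ctx: f"Based on {len(ctx)} retrieved facts: increased thirst."
first = answer_query("symptoms of diabetes?", cache, search, generate)
second = answer_query("symptoms of diabetes?", cache, search, generate)
```

The second call returns instantly from the cache, which is where the latency and cost savings come from.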
Interactive web interface that provides:
- User-friendly query interface for asking questions in Arabic or English
- Real-time response streaming from the RAG system
- Visualization of search results and retrieved context
- Document upload and management
- Configuration settings management
One-time execution script that:
- Extracts entities and relationships from documents using LLMs (supports Arabic medical texts)
- Builds a knowledge graph in Neo4j
- Creates vector embeddings and indexes them in Qdrant
- Should be run only during initial setup or data updates
Handle API requests and orchestrate business logic between models and storage layers.
Define data structures and interactions with Neo4j graph database using the Neo4jModel.
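As a hypothetical illustration of what such a model layer might look like, the sketch below only builds parameterized Cypher statements; the real `Neo4jModel` executes them through the Neo4j driver, and the `Disease`/`Symptom` labels and `HAS_SYMPTOM` relationship are assumed, not the project's confirmed schema.

```python
# Hypothetical Neo4jModel-style helper: builds (query, params) pairs that
# a Neo4j session could execute. Labels and relationship types are assumptions.
class Neo4jModel:
    @staticmethod
    def merge_entity(label, name):
        # Parameterized MERGE avoids string-injecting user input into Cypher.
        return f"MERGE (n:{label} {{name: $name}})", {"name": name}

    @staticmethod
    def symptoms_of(disease):
        query = (
            "MATCH (d:Disease {name: $disease})-[:HAS_SYMPTOM]->(s:Symptom) "
            "RETURN s.name AS symptom"
        )
        return query, {"disease": disease}

query, params = Neo4jModel.symptoms_of("diabetes")
```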
Manage different storage backends:
- LLM: Factory pattern for various language model providers (Gemini, Cohere with Arabic support)
- VectorDB: Qdrant vector database operations with hybrid search capabilities
- Templates: Dynamic template parsing and rendering
- Semantic Cache: Intelligent caching layer for query optimization
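The LLM factory pattern mentioned above can be sketched as follows. The provider classes here are stubs for illustration; the real factory would construct clients from the Gemini and Cohere SDKs using the configured API keys.

```python
# Minimal LLM provider factory sketch: map a provider name to a class and
# instantiate it, so callers never depend on a concrete SDK directly.
class GeminiProvider:
    def generate(self, prompt):
        return f"[gemini] {prompt}"   # stub for the real SDK call

class CohereProvider:
    def generate(self, prompt):
        return f"[cohere] {prompt}"   # stub for the real SDK call

class LLMProviderFactory:
    _providers = {"gemini": GeminiProvider, "cohere": CohereProvider}

    @classmethod
    def create(cls, name):
        try:
            return cls._providers[name]()
        except KeyError:
            raise ValueError(f"Unknown LLM provider: {name}")

llm = LLMProviderFactory.create("cohere")
```

Swapping providers then becomes a one-line configuration change rather than a code change.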
Utility functions for configuration management, logging, and common operations.
Run the application in development mode:

```bash
python main.py --dev
```

The application uses Python's built-in logging with configurable levels:

```python
import logging

logging.basicConfig(level=logging.INFO)
```

To add more Arabic medical data to the system:
- Prepare your medical documents in Arabic (PDF, TXT, or other supported formats)
- Place them in the appropriate data directory
- Run the pipeline to process the new data:

```bash
python pipeline.py
```

- The system will extract entities, relationships, and create embeddings for the new Arabic content
- Neo4j for graph database capabilities
- Qdrant for high-performance vector search
- Google Gemini for advanced LLM capabilities with multilingual support
- Cohere for embeddings and excellent multilingual support (including Arabic)
- AsyncIO for Python async support
- The open-source community for various dependencies
For issues, questions, or contributions, please open an issue in the repository.
Note: Make sure to configure all necessary environment variables and API keys before running the application. The system is optimized for Arabic medical queries through Cohere's multilingual capabilities.