GraphRAGEval is an advanced Research Assistant system that leverages Graph Retrieval Augmented Generation (GraphRAG) to analyze and answer questions about Reddit discussions. The system ingests Reddit data into a Neo4j graph database, provides intelligent retrieval mechanisms, and uses local Large Language Models (LLMs) via Ollama for response generation.
Built on the principles of GraphRAG, GraphRAGEval creates a knowledge graph from community discussions, enabling nuanced understanding of complex topics through conversational AI with built-in evaluation frameworks.
- Graph Database: Neo4j with vector embeddings and graph structure
- Retrieval: Hybrid retrieval combining semantic similarity and graph relationships
- LLM Integration: Local models via Ollama (Mistral, Llama, etc.)
- Reinforcement Learning: RLHF-tuned reasoning agents for adaptive behavior
- Evaluation Framework: Comprehensive vero-eval metrics suite
- Frontend: Modern React/Next.js interface with real-time chat
- Backend: FastAPI for high-performance serving
- Intelligent Chat Assistant: Context-aware conversations about Reddit discussions
- Graph-Based Retrieval: Retrieve relevant discussions through graph relationships and similarity
- Performance Evaluation: Built-in evaluation suite with 10+ metrics including BERTScore, ROUGE, Faithfulness
- Adaptive Personas: RLHF-tuned agents that learn from interaction quality
- Knowledge Graph: Structured representation of Reddit communities and discussions
- Containerized Deployment: Docker Compose for easy local setup
- Modern UI: Next.js frontend with dark/light themes
- Hybrid Retrieval: Combines vector similarity (Redis) and graph traversals (Neo4j)
- Real-time RLHF: Agents adapt response quality based on user feedback
- Multi-modal Evaluation: Coverage, faithfulness, relevance, and ranking metrics
- Session Management: Persistent chat history with context tracking
- Background Processing: Async ingestion and evaluation jobs
- Local First: No cloud dependencies - runs entirely on your machine
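The "real-time RLHF" adaptation above can be pictured as a simple feedback loop. The sketch below is illustrative only: the class and method names (`FeedbackAdapter`, `update`) are hypothetical, not the project's actual API, and the update rule is a toy stand-in for whatever tuning the RLHF agents actually perform.

```python
# Hypothetical sketch of a feedback loop: poor user feedback raises a
# quality threshold (stricter retrieval), good feedback relaxes it.
class FeedbackAdapter:
    def __init__(self, threshold: float = 0.6, learning_rate: float = 0.05):
        self.threshold = threshold
        self.learning_rate = learning_rate

    def update(self, feedback_score: float) -> float:
        """Nudge the threshold away from the feedback score (in [0, 1])."""
        self.threshold += self.learning_rate * (self.threshold - feedback_score)
        # Keep the threshold in a valid range
        self.threshold = min(max(self.threshold, 0.0), 1.0)
        return self.threshold
```

With the default threshold of 0.6, a low feedback score (e.g. 0.2) pushes the threshold up, while a high score (e.g. 0.9) pulls it down.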
- Python 3.9+ with pip
- Docker & Docker Compose for Neo4j and Redis
- Ollama for LLM serving (auto-installed via setup script)
- Node.js 18+ for frontend (optional)
1. Clone and Setup

   ```bash
   git clone https://github.com/kliewerdaniel/graphrageval.git
   cd graphrageval
   ```

2. Run Automated Setup

   ```bash
   chmod +x setup.sh
   ./setup.sh
   ```

   This installs dependencies, starts Docker services, pulls Ollama models, and prepares the database.

3. Start the Backend

   ```bash
   python main.py
   ```

   The FastAPI server will start on http://localhost:8000.

4. (Optional) Start the Frontend

   ```bash
   cd frontend
   npm install
   npm run dev
   ```

   Opens at http://localhost:3000.

5. Access Interfaces

   - Web UI: http://localhost:3000
   - API Docs: http://localhost:8000/docs
   - Neo4j Browser: http://localhost:7474 (neo4j/research2025)
   - LLM Playground: http://localhost:11434 (Ollama)
The system includes sample Reddit export data. To ingest new data:
```bash
# Ingest additional Reddit exports
python scripts/ingest_reddit_data.py --input ./your_reddit_export/
```

Ask questions about Reddit discussions:

> What do AI researchers think about the future of AGI?
The system will:
- Analyze your query for discussion-relevance
- Retrieve similar Reddit threads using hybrid graph search
- Generate responses citing specific users and subreddits
- Adapt behavior based on response quality
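The four steps above can be sketched as a single pipeline function. This is a hedged illustration, not the project's real module layout: `answer_query` and its callable parameters are placeholders, and the relevance check is a toy heuristic standing in for the real query analysis.

```python
# Illustrative pipeline: analyze -> retrieve -> generate.
# `retriever` and `generator` are caller-supplied callables, not project APIs.
def answer_query(query: str, retriever, generator) -> dict:
    # 1. Analyze the query for discussion-relevance (toy heuristic here)
    needs_retrieval = len(query.split()) > 2

    # 2. Retrieve similar Reddit threads via hybrid graph search
    context = retriever(query) if needs_retrieval else []

    # 3. Generate a response grounded in the retrieved context
    answer = generator(query, context)

    return {"answer": answer, "sources": context}
```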
```python
import requests

# Chat endpoint
response = requests.post("http://localhost:8000/api/chat", json={
    "query": "Your question here",
    "session_id": "optional-session-id"
})

# Search endpoint
results = requests.get("http://localhost:8000/api/search", params={
    "query": "search term",
    "limit": 10
})

# Evaluation
results = requests.post("http://localhost:8000/api/evaluate", json={
    "dataset_path": "custom_dataset.json"
})
```

See /api/docs for complete API documentation.
```bash
# Test reasoning agent
python scripts/reddit_reasoning_agent.py --query "What are people's opinions on AI alignment?"

# Run evaluation
python evaluation/run_evaluation.py

# Ingest data
python scripts/ingest_reddit_data.py --input ./reddit_export/
```

```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Next.js UI    │     │     FastAPI     │     │   Neo4j Graph   │
│                 │◄───►│     Backend     │◄───►│    Database     │
│ - Chat Interface│     │                 │     │                 │
│ - Real-time     │     │  /api/chat      │     │ - Reddit Nodes  │
│ - Progress      │     │  /api/search    │     │ - Comment Edges │
└─────────────────┘     │  /api/evaluate  │     │ - Vector Indexes│
                        └─────────────────┘     └─────────────────┘
                             │        ▲
                             ▼        │
                ┌─────────────────┐     ┌─────────────────┐
                │   Ollama LLMs   │     │   Redis Cache   │
                │                 │     │                 │
                │ - Local Models  │     │ - Embeddings    │
                │ - RLHF Agents   │     │ - Session Data  │
                └─────────────────┘     └─────────────────┘
                        ▲
                        │
                ┌─────────────────┐
                │    vero-eval    │
                │   Evaluation    │
                │                 │
                │ - 10+ Metrics   │
                │ - Trace Logging │
                │ - Performance   │
                └─────────────────┘
```
- Graph Traversals: Explore Reddit discussion networks
- Vector Similarity: Redis-powered embedding search
- Hybrid Scoring: Combines relevance and relationship strength
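One common way to realize the hybrid scoring described above is a weighted sum of the two signals. The sketch below is a minimal assumption-laden illustration: the `hybrid_score` function and the 0.7/0.3 weighting are hypothetical defaults, not the project's tuned values.

```python
# Minimal sketch of hybrid scoring: blend vector similarity with
# graph relationship strength via a weighted sum (weights are illustrative).
def hybrid_score(vector_sim: float, graph_strength: float,
                 alpha: float = 0.7) -> float:
    """Combine semantic similarity and graph relationship strength."""
    return alpha * vector_sim + (1 - alpha) * graph_strength

def rank_candidates(candidates):
    """candidates: list of (doc_id, vector_sim, graph_strength) tuples."""
    scored = [(doc, hybrid_score(v, g)) for doc, v, g in candidates]
    # Highest combined score first
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With `alpha = 0.7`, a candidate that is semantically close but weakly connected can still outrank a strongly connected but less relevant one.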
- Persona Agents: RLHF-tuned response generation
- Context Integration: Citations and discussion insights
- Adaptive Learning: Behavior adjustment based on feedback
- Retrieval Metrics: Precision@K, Recall@K, Sufficiency
- Generation Metrics: Faithfulness, BERTScore, ROUGE
- Ranking Metrics: MRR, MAP, NDCG
- Custom Metrics: Discussion insight quality
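For orientation, two of the metrics listed above can be implemented directly from their standard definitions. These are textbook sketches, not code extracted from vero-eval:

```python
# Precision@K: fraction of the top-k retrieved items that are relevant.
def precision_at_k(retrieved, relevant, k):
    top_k = retrieved[:k]
    return sum(1 for doc in top_k if doc in relevant) / k

# MRR: mean reciprocal rank of the first relevant item across queries.
def mrr(ranked_lists, relevant_sets):
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for i, doc in enumerate(ranked, start=1):
            if doc in relevant:
                total += 1.0 / i
                break
    return total / len(ranked_lists)
```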
```bash
# Neo4j Configuration
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=research2025

# Redis Configuration
REDIS_URL=redis://localhost:6379

# Ollama Configuration
OLLAMA_HOST=http://localhost:11434

# API Configuration
PORT=8000
```

Modify `data/persona.json` to adjust agent behavior:
```json
{
  "rlhf_thresholds": {
    "retrieval_required": 0.6,
    "minimum_context_overlap": 0.3,
    "formality_level": 0.6,
    "technical_detail_level": 0.7,
    "citation_requirement": 0.8
  }
}
```

The system includes comprehensive evaluation capabilities:
```bash
python evaluation/run_evaluation.py --dataset evaluation/datasets/your_dataset.json
```

Example output:

```json
{
  "retrieval": {
    "PrecisionMetric_summary": {"mean": 0.783, "std": 0.142},
    "RecallMetric_summary": {"mean": 0.654, "std": 0.098},
    "SufficiencyMetric_summary": {"mean": 0.812, "std": 0.076}
  },
  "generation": {
    "FaithfulnessMetric_summary": {"mean": 0.894, "std": 0.054},
    "BERTScoreMetric_summary": {"mean": 0.872, "std": 0.061}
  }
}
```

Create evaluation datasets:
```json
{
  "queries": [
    {
      "query": "Sample research question",
      "persona": "researcher",
      "ground_truth_chunk_ids": ["doc1", "doc2"],
      "reference_answer": "Expected answer",
      "complexity_score": 0.7
    }
  ]
}
```

Run the test suite:

```bash
python -m pytest
./test.sh
```

Test with synthetic datasets and monitor resource usage through the evaluation dashboard.
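Evaluation datasets in the schema shown above can also be generated programmatically. The field names below come from the example; the `make_dataset` helper itself is illustrative and not part of the project's API.

```python
import json

# Illustrative helper that writes an evaluation dataset matching the
# schema shown above (a top-level "queries" list).
def make_dataset(path: str, queries: list) -> None:
    with open(path, "w") as f:
        json.dump({"queries": queries}, f, indent=2)

example = [{
    "query": "Sample research question",
    "persona": "researcher",
    "ground_truth_chunk_ids": ["doc1", "doc2"],
    "reference_answer": "Expected answer",
    "complexity_score": 0.7,
}]
make_dataset("custom_dataset.json", example)
```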
```bash
# Full development environment
docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d

# Run with hot reload
python main.py --reload
```

MIT License - see LICENSE file for details.
- GraphRAG: Inspired by Microsoft's Graph Retrieval Augmented Generation research
- vero-eval: Comprehensive RAG evaluation framework
- Ollama: Local LLM serving made possible
- Neo4j: World's leading graph database
- FastAPI: Modern Python web framework
- Documentation
- Bug Reports
- Discussions
- Contact: your-email@example.com
