PatternRAG

PatternRAG is a Retrieval-Augmented Generation (RAG) system designed to identify non-obvious connections and patterns across documents. It combines vector search, knowledge graph analysis, and LLM reasoning to discover relationships between concepts that traditional RAG systems might miss.

🌟 Key Features

Multi-perspective Retrieval: Expands queries to look for connections across domains
Knowledge Graph Integration: Uses entity and relationship extraction to build a knowledge graph
Pattern Detection: Specialized prompting to identify meaningful patterns and connections
Hierarchical Chunking: Processes documents at both paragraph and sentence levels
OpenAI-compatible API: Drop-in replacement for OpenAI's chat completions API
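
To make the hierarchical chunking idea concrete, here is a minimal sketch of splitting a document at both paragraph and sentence level. It is an illustration only, not PatternRAG's actual implementation: the `Chunk` class, the `hierarchical_chunks` function, and the regex-based sentence splitting are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    level: str   # "paragraph" or "sentence"
    text: str
    parent: int  # index of the paragraph this chunk came from


def hierarchical_chunks(document: str) -> List[Chunk]:
    """Emit one chunk per paragraph, followed by one chunk per sentence in it."""
    chunks: List[Chunk] = []
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]
    for index, paragraph in enumerate(paragraphs):
        chunks.append(Chunk("paragraph", paragraph, index))
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
            if sentence:
                chunks.append(Chunk("sentence", sentence, index))
    return chunks
```

Keeping the `parent` index on every sentence chunk lets a retriever that matches a single sentence pull in the surrounding paragraph for context.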

🚀 Getting Started

Prerequisites

Python 3.8+
An LLM API service such as Ollama
At least 8GB RAM (16GB+ recommended)
10GB+ storage space for document processing
A Docker installation of OpenWebUI (or an equivalent) to serve as a front end

Installation

Clone the repository:

git clone https://github.com/Robert-Beken/PatternRAG.git
cd PatternRAG

Install dependencies:

pip install -r requirements.txt

Install spaCy model:

python -m spacy download en_core_web_sm

Create directories:

mkdir -p data/db data/metadata data/graph documents

Configure settings:

cp config/default_config.yaml config/config.yaml
# Edit config.yaml as needed

Basic Usage

Add documents:

Place your documents in the documents directory or specify a custom location in the config file.

Process documents:

python -m patternrag.ingest --config config/config.yaml

Start the API service:

python -m patternrag.service --config config/config.yaml

Query the system:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "pattern-rag",
    "messages": [
      {"role": "user", "content": "What connections exist between mathematics and music?"}
    ]
  }'
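
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call above. The helper names `build_chat_request` and `ask` are hypothetical, and the response parsing assumes the service returns the usual OpenAI chat completion shape (`choices[0].message.content`), which this README implies but does not spell out.

```python
import json
import urllib.request


def build_chat_request(question: str,
                       base_url: str = "http://localhost:8000",
                       model: str = "pattern-rag") -> urllib.request.Request:
    """Build the POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str) -> str:
    """Send the question to a running service and return the answer text."""
    with urllib.request.urlopen(build_chat_request(question)) as response:
        body = json.load(response)
    # Assumption: the reply follows OpenAI's chat completion schema.
    return body["choices"][0]["message"]["content"]
```

With the API service running, `print(ask("What connections exist between mathematics and music?"))` should print the generated answer.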

📖 Documentation

💡 How It Works

PatternRAG works by:

Document Processing: Documents are loaded, chunked, and embedded into a vector database. Entities and relationships are extracted to build a knowledge graph.
Query Analysis: User queries are analyzed for entities and expanded to look for potential connections across domains.
Multi-angle Retrieval:
- Vector similarity search with expanded queries
- Knowledge graph traversal to find related entities
- Predefined pattern-based searches
Pattern Identification: An LLM analyzes retrieved documents to identify meaningful patterns and connections.
Response Generation: The system synthesizes findings into a coherent response that highlights discovered patterns.
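
The multi-angle retrieval step can be sketched roughly as follows: a bounded graph traversal finds entities related to the query, and results from several retrievers are merged into one de-duplicated list. This is a simplified illustration and not the project's code. The graph is a plain dictionary rather than a NetworkX graph to keep the example self-contained, and `related_entities`, `merge_results`, and the `max_hops` parameter are hypothetical names.

```python
from collections import deque
from typing import Dict, List, Set


def related_entities(graph: Dict[str, Set[str]], start: str,
                     max_hops: int = 2) -> List[str]:
    """Breadth-first traversal returning entities within max_hops of start."""
    seen = {start}
    queue = deque([(start, 0)])
    found: List[str] = []
    while queue:
        entity, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        # Sorted so the output order is deterministic.
        for neighbor in sorted(graph.get(entity, set())):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, depth + 1))
    return found


def merge_results(*result_lists: List[str]) -> List[str]:
    """Combine ranked lists from several retrievers, keeping first-seen order."""
    merged: List[str] = []
    seen: Set[str] = set()
    for results in result_lists:
        for doc_id in results:
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged
```

For example, with edges music → harmony → ratio → mathematics, a two-hop traversal from "music" reaches "harmony" and "ratio" but stops short of "mathematics". Raising `max_hops` widens the search at the cost of more loosely related context.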

🔄 Searching Modes

PatternRAG offers two search modes:

Pattern Mode (default): Full pattern-finding capabilities, query expansion, and connection analysis.
Standard Mode: Simple retrieval without extensive pattern finding. Activate by prefixing your query with "standard search".
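
As an illustration of how a prefix-based mode switch might work, the sketch below checks a query for the "standard search" prefix and strips it before retrieval. The function name `split_mode`, the case-insensitive matching, and the stripping of trailing punctuation after the prefix are assumptions for this example. The service's actual parsing may differ.

```python
from typing import Tuple

STANDARD_PREFIX = "standard search"


def split_mode(query: str) -> Tuple[str, str]:
    """Return (mode, cleaned_query) based on the 'standard search' prefix."""
    stripped = query.strip()
    # Assumption: the prefix is matched case-insensitively.
    if stripped.lower().startswith(STANDARD_PREFIX):
        # Drop the prefix and any separator characters that follow it.
        remainder = stripped[len(STANDARD_PREFIX):].lstrip(" :,-")
        return "standard", remainder
    return "pattern", stripped
```

A query such as "standard search: history of the fugue" would then be routed to simple retrieval, while any query without the prefix stays in the default pattern mode.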

🛠️ Advanced Configuration

Key configuration options include:

Custom Pattern Templates: Define patterns to guide the system's search
Embedding Model: Choose the embedding model for vector search
Chunking Parameters: Adjust document chunking for different document types
LLM Settings: Configure which model to use for reasoning

See the Configuration Guide for detailed options.

👥 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

📜 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

🙏 Acknowledgments

LangChain for the RAG framework
ChromaDB for vector database functionality
NetworkX for knowledge graph capabilities
spaCy for NLP processing
