A powerful RAG (Retrieval-Augmented Generation) chat application that allows you to chat with your documents using local LLMs through Ollama.
- 💬 Interactive Chat Interface - Built with Streamlit for a smooth user experience
- 📄 Document Upload - Support for PDF and TXT files
- 🤖 Local LLM Support - Uses Ollama for complete privacy and offline operation
- 🔍 Vector Search - ChromaDB for efficient document retrieval
- ⚡ Streaming Responses - Real-time response generation
- 🎨 Model Selection - Switch between different Ollama models on the fly
- 📚 Persistent Storage - Documents are stored in a local vector database
- Python 3.10 or higher
- Ollama installed and running locally
- At least one Ollama model downloaded (e.g., `llama3.2:latest`)
- Download and install Ollama from https://ollama.ai/
- Pull a model: `ollama pull llama3.2:latest`
- Verify Ollama is running: `ollama list`
```bash
cd LocalRAG
```

Option A: Using uv (recommended, faster):

```bash
uv pip install --system -r requirements.txt
```

Option B: Using pip:

```bash
pip install -r requirements.txt
```

The application can be configured by editing `config.py` or setting environment variables:
| Setting | Default | Description |
|---|---|---|
| `llm_model` | `llama3.2:latest` | Ollama model to use |
| `text_embedding_model` | `nomic-embed-text` | Embedding model for vector search |
| `chunk_size` | `1512` | Text chunk size for document splitting |
| `chunk_overlap` | `256` | Overlap between chunks |
| `max_context_docs` | `3` | Number of documents to retrieve |
| `temp_folder` | `./_temp` | Temporary file storage |
| `chroma_path` | `./chroma` | Vector database storage path |
| `log_level` | `INFO` | Logging level |
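The settings in the table map naturally onto a single settings object. As a minimal sketch of what `config.py` could look like (the real file may use a different mechanism, e.g. pydantic-settings; names here mirror the table, everything else is an assumption):

```python
# Hypothetical sketch of a settings object matching the table above.
# Each field falls back to its documented default when the
# corresponding environment variable is unset.
import os
from dataclasses import dataclass


@dataclass
class Settings:
    llm_model: str = os.getenv("LLM_MODEL", "llama3.2:latest")
    text_embedding_model: str = os.getenv("TEXT_EMBEDDING_MODEL", "nomic-embed-text")
    chunk_size: int = int(os.getenv("CHUNK_SIZE", "1512"))
    chunk_overlap: int = int(os.getenv("CHUNK_OVERLAP", "256"))
    max_context_docs: int = int(os.getenv("MAX_CONTEXT_DOCS", "3"))
    temp_folder: str = os.getenv("TEMP_FOLDER", "./_temp")
    chroma_path: str = os.getenv("CHROMA_PATH", "./chroma")
    log_level: str = os.getenv("LOG_LEVEL", "INFO")


settings = Settings()
```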
```bash
streamlit run app.py
```

The application will open in your default browser at http://localhost:8501.
1. Upload Documents (Optional)
   - Click the sidebar to expand it
   - Go to "Document Management"
   - Upload PDF or TXT files
   - Wait for processing confirmation
2. Select Model (Optional)
   - In the sidebar under "Model Configuration"
   - Choose from available Ollama models
3. Chat
   - Type your question in the chat input
   - Press Enter to send
   - View streaming responses in real-time
4. Clear History
   - Click "🧹 Clear Chat History" in the sidebar
```
LocalRAG/
├── app.py              # Main Streamlit application
├── query.py            # Query handling and RAG logic
├── embed.py            # Document embedding and processing
├── get_vector_db.py    # Vector database management
├── config.py           # Configuration settings
├── logger_config.py    # Logging configuration
├── requirements.txt    # Python dependencies
├── README.md           # This file
├── chroma/             # Vector database storage (created on first run)
└── _temp/              # Temporary file storage (created on first run)
```
- Document Processing: Uploaded documents are split into chunks and embedded using Ollama's embedding model
- Vector Storage: Embeddings are stored in ChromaDB for efficient retrieval
- Query Processing: User queries are embedded and used to retrieve relevant document chunks
- Response Generation: Retrieved context is sent to the LLM along with the query to generate accurate responses
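The retrieval step in the middle of this pipeline can be sketched in a few lines. In the real app, `embed()` is Ollama's `nomic-embed-text` model and the vector store is ChromaDB; the stand-in below uses a character-frequency embedding so the sketch is self-contained:

```python
# Toy sketch of vector retrieval: rank stored chunks by cosine
# similarity to the query embedding, keep the top k. embed() is a
# stand-in for a real embedding model.
import math


def embed(text):
    # Stand-in embedding: letter-frequency vector over a-z.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, chunks, k=3):
    # Embed the query once, then sort chunks by similarity.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "ollama runs local models",
    "chromadb stores embeddings",
    "streamlit renders the chat ui",
]
top = retrieve("where are embeddings stored", chunks, k=1)
```

The top-k chunks returned here are what gets pasted into the LLM prompt as context in the final step.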
Core dependencies:
- `streamlit` - Web interface
- `langchain` - LLM framework
- `langchain-ollama` - Ollama integration
- `langchain-chroma` - ChromaDB integration
- `langchain-community` - Document loaders
- `chromadb` - Vector database
- `pypdf` - PDF processing
See requirements.txt for the complete list.
Problem: "Could not connect to Ollama API"

Solution:
- Ensure Ollama is running: `ollama serve`
- Check that Ollama is accessible: `curl http://localhost:11434/api/tags`
Problem: `ModuleNotFoundError`

Solution: Reinstall dependencies:

```bash
pip install --upgrade -r requirements.txt
```

Problem: RAG not working even after uploading files

Solution:
- Check that the `chroma/` directory exists
- Verify the embedding model is downloaded: `ollama pull nomic-embed-text`
- Check logs for errors
Problem: Responses take too long

Solution:
- Use a smaller/faster model
- Reduce `max_context_docs` in config
- Ensure Ollama has adequate resources
To verify that all modules load correctly, run:

```bash
python -c "from config import settings; from query import get_query_handler; print('✅ All modules working')"
```

Logs are output to the console. Adjust `log_level` in `config.py` for more or less detail:

- `DEBUG` - Detailed information
- `INFO` - General information (default)
- `WARNING` - Warning messages only
- `ERROR` - Error messages only
- Model Selection: Smaller models (e.g., `llama3.2:3b`) are faster than larger ones
- Document Size: Break large documents into smaller files for faster processing
- Chunk Size: Adjust `chunk_size` based on your documents (larger for books, smaller for articles)
- GPU Acceleration: Ollama automatically uses GPU if available
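To see how `chunk_size` and `chunk_overlap` interact, here is a toy sliding-window splitter (illustrative only; the app's `embed.py` may use a smarter splitter that respects sentence boundaries):

```python
# Minimal sliding-window text splitter: consecutive chunks share
# chunk_overlap characters, so context is not lost at chunk edges.
def split_text(text, chunk_size=1512, chunk_overlap=256):
    step = chunk_size - chunk_overlap  # how far the window advances
    return [
        text[i : i + chunk_size]
        for i in range(0, max(len(text) - chunk_overlap, 1), step)
    ]


# A 4000-character document with the default settings yields three
# chunks; each neighboring pair shares exactly 256 characters.
doc = "".join(chr(97 + i % 26) for i in range(4000))
pieces = split_text(doc)
```

A larger `chunk_size` gives the LLM more context per retrieved chunk but makes retrieval coarser; the overlap keeps sentences that straddle a boundary recoverable from at least one chunk.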
This project is open source and available for personal and commercial use.
Contributions are welcome! Please feel free to submit issues or pull requests.
- Ollama - Local LLM inference
- LangChain - LLM framework
- ChromaDB - Vector database
- Streamlit - Web interface
For issues, questions, or suggestions, please open an issue on the repository.
Happy chatting with your documents! 🚀