This repository contains code and instructions for running your own Firebolt-powered chatbot that uses retrieval-augmented generation (RAG).
- Python 3.12 (for local setup)
- Docker and Docker Compose (for Docker setup)
- GPU with NVIDIA drivers (optional, for better performance)
- Git
Task is a task runner / build tool that aims to be simpler and easier to use than, for example, GNU Make. It's used in this project to simplify common operations.
Install Task using one of the following methods:
```bash
# macOS (via Homebrew)
brew install go-task/tap/go-task

# Linux (via script)
sh -c "$(curl --location https://taskfile.dev/install.sh)" -- -d
```

For other installation methods, see the official Task installation guide.
```bash
# Clone the repository
git clone <repository-url>
cd rag_chatbot

# Run the automated setup script (interactive mode)
./setup.sh

# Or specify setup mode directly:
./setup.sh --docker  # for Docker setup
./setup.sh --local   # for local Python setup
```

The script will:
- Check your system for required dependencies
- Install Ollama if not already installed
- Set up environment files
- Install Python dependencies or build Docker containers
- Download required Ollama models
If you have Task installed:
```bash
# Install dependencies
task install-deps

# Setup Ollama models
task setup-ollama

# Start the server
task start-server
```

- Copy `.env.example` to `.env` and fill in your Firebolt credentials
- Update the GitHub repository paths and chunking strategy configuration in your `.env` file
- Run `task populate` or `python populate_table.py` to populate your vector database
- The system automatically ensures chunking strategy consistency across embedding generation and retrieval (see the sketch below)
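
As a rough illustration of that consistency guarantee, the check could look like the sketch below. The function name and the way the previously used strategy is stored are assumptions for illustration; only the `FIREBOLT_RAG_CHATBOT_CHUNKING_STRATEGY` variable comes from this project's configuration.

```python
import os
import warnings

def check_chunking_consistency(stored_strategy: str | None) -> None:
    """Warn when the configured chunking strategy differs from the one
    already used for the embeddings in the table (illustrative helper)."""
    configured = os.environ.get(
        "FIREBOLT_RAG_CHATBOT_CHUNKING_STRATEGY",
        "recursive_character_text_splitting",
    )
    if stored_strategy is not None and stored_strategy != configured:
        warnings.warn(
            f"Chunking strategy mismatch: embeddings were generated with "
            f"'{stored_strategy}' but '{configured}' is now configured. "
            "Re-populate the table to avoid retrieval inconsistencies."
        )
```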
When using Docker, you can populate the table using the following methods:
Option 1: Using Task (Recommended)
```bash
# This automatically detects Docker vs local setup
task populate
```

Option 2: Direct Docker Command

```bash
# Ensure your Docker services are running
docker compose up -d

# Run populate_table.py inside the container
docker compose exec rag_chatbot python populate_table.py
```

Important Notes for Docker:
- The `FIREBOLT_RAG_CHATBOT_LOCAL_GITHUB_PATH` environment variable should point to your local GitHub repositories directory
- This directory is automatically mounted to `/github` inside the Docker container
- The script will automatically use `/github` as the base path when running in Docker (see the sketch after this list)
- Make sure your document repositories are cloned locally in the `FIREBOLT_RAG_CHATBOT_LOCAL_GITHUB_PATH` directory before running
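
The base-path switch can be pictured with a sketch like the one below. The `/.dockerenv` check is just one common heuristic for detecting a container, and the repository's actual detection logic may differ.

```python
import os

def github_base_path() -> str:
    """Return /github inside the Docker container, otherwise the locally
    configured repositories directory. Illustrative sketch only."""
    if os.path.exists("/.dockerenv"):  # common (not guaranteed) Docker marker
        return "/github"
    return os.environ["FIREBOLT_RAG_CHATBOT_LOCAL_GITHUB_PATH"]
```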
- GPU Support: For Docker GPU support, ensure NVIDIA Docker runtime is installed
- Ollama Models: Models are automatically downloaded but may take time on first run
- Port Conflicts: Default ports are 5000 (web) and 11434 (Ollama)
- Ollama Performance: For better performance with large models, consider the following:
  - On macOS: `OLLAMA_FLASH_ATTENTION="1" OLLAMA_KV_CACHE_TYPE="q8_0" /usr/local/opt/ollama/bin/ollama serve`
  - For production: Use GPU-accelerated or cloud-hosted inference services
- Register for Firebolt
- Set up your account following these instructions
- Create a database by following the Create a Database section
- Create or use an existing engine (Firebolt may have automatically created `my_engine`)
- Create a service account:
  - Follow the service account setup instructions
  - When creating a user, select `Service Account` in the `Assign To` dropdown and `account_admin` for the role
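
Once the service account exists, you can sanity-check the credentials from Python. This is a minimal sketch assuming the `firebolt-sdk` package; consult the SDK documentation for the exact API of the version you install.

```python
from firebolt.client.auth import ClientCredentials
from firebolt.db import connect

# Values correspond to the FIREBOLT_RAG_CHATBOT_* variables configured later.
connection = connect(
    auth=ClientCredentials("<your-service-account-id>", "<your-service-account-secret>"),
    account_name="<your-account-name>",
    database="<your-database-name>",
    engine_name="<your-engine-name>",
)
cursor = connection.cursor()
cursor.execute("SELECT 1")
print(cursor.fetchall())  # a result confirms the engine is reachable
```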
- Python 3.12
- GPU support (recommended): many local machines have GPUs, but for better performance consider a cloud GPU instance
Run the automated script which will handle all dependencies:
```bash
./setup.sh
```

Choose either Docker or local setup when prompted, or specify directly:

```bash
./setup.sh --docker  # for Docker setup
./setup.sh --local   # for local Python setup
```

If you prefer to set up manually:
1. Install Ollama:
   - macOS: `brew install ollama && brew services start ollama`
   - Linux: `curl -fsSL https://ollama.com/install.sh | sh`
   - Windows: Download from ollama.com/download

2. Pull Required Models:

   ```bash
   ollama pull llama3.1
   ollama pull nomic-embed-text
   ```

3. Setup Python Environment:

   ```bash
   python -m venv .venv
   source .venv/bin/activate  # On Windows: .venv\Scripts\activate
   pip install -r requirements.txt
   ```

4. Environment Variables:
   - Copy `.env.example` to `.env`
   - Fill in the required Firebolt credentials and configuration:

   ```bash
   # Firebolt Database Configuration
   FIREBOLT_RAG_CHATBOT_CLIENT_ID=<your-service-account-id>
   FIREBOLT_RAG_CHATBOT_CLIENT_SECRET=<your-service-account-secret>
   FIREBOLT_RAG_CHATBOT_ENGINE=<your-engine-name>
   FIREBOLT_RAG_CHATBOT_DB=<your-database-name>
   FIREBOLT_RAG_CHATBOT_ACCOUNT_NAME=<your-account-name>
   FIREBOLT_RAG_CHATBOT_TABLE_NAME=<your-table-name>
   FIREBOLT_RAG_CHATBOT_LOCAL_GITHUB_PATH=<path-to-your-github-repos>

   # Chunking Strategy Configuration (Environment-Driven)
   FIREBOLT_RAG_CHATBOT_CHUNKING_STRATEGY=recursive_character_text_splitting
   FIREBOLT_RAG_CHATBOT_CHUNK_SIZE=300
   FIREBOLT_RAG_CHATBOT_CHUNK_OVERLAP=50
   FIREBOLT_RAG_CHATBOT_NUM_WORDS_PER_CHUNK=100
   FIREBOLT_RAG_CHATBOT_NUM_SENTENCES_PER_CHUNK=3
   FIREBOLT_RAG_CHATBOT_BATCH_SIZE=150
   ```

   Chunking Strategy Options:
   - `recursive_character_text_splitting` (recommended)
   - `semantic_chunking`
   - `by_paragraph`
   - `by_sentence`
   - `by_sentence_with_sliding_window`
   - `every_n_words`

5. Prepare Documents for RAG:
   - Clone your document repositories locally
   - Update `repo_dict` in `populate_table.py` with your repositories (see the illustrative sketch after these steps)
   - Configure chunking strategy and parameters via environment variables (no code changes needed)
   - Optionally, add file names to `DISALLOWED_FILENAMES` in `constants.py` to exclude them

6. Populate the Vector Database:
   - The script automatically validates chunking strategy consistency to prevent embedding mismatches
   - For local setup: `python populate_table.py`
   - For Docker setup: `task populate` or `docker compose exec rag_chatbot python populate_table.py`
   - Important: The system will warn you if you change chunking strategies on existing embeddings

7. Customize the Chatbot:
   - Modify the prompt in the `run_chatbot()` function in `run_llm.py` to suit your use case
   - Configure chunking strategy and parameters via environment variables in your `.env` file
   - The system automatically ensures consistency between embedding generation and retrieval phases
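
As referenced in step 5, `repo_dict` in `populate_table.py` lists the repositories to index. The snippet below is only a hypothetical illustration of that kind of mapping (assuming local directory names map to repository URLs); check the file itself for the exact structure it expects.

```python
# Hypothetical illustration -- see populate_table.py for the real structure.
repo_dict = {
    # "<local-directory-name>": "<repository-url>"
    "transformers": "https://github.com/huggingface/transformers",
}
```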
For local setup:

```bash
python web_server.py
```

For Docker setup:

```bash
docker-compose up -d
```

Access the web UI at http://127.0.0.1:5000
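
To confirm the server came up, a quick check from Python is enough (this uses the `requests` package, an assumption on our part; the port is the default noted below):

```python
import requests

# Quick smoke test against the default web port.
response = requests.get("http://127.0.0.1:5000", timeout=10)
print(response.status_code)  # expect 200 once the server is ready
```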
- Supported Formats: Only `.docx`, `.txt`, and `.md` files are processed (see the sketch after this list)
- Character Issues: Null characters and certain Unicode values may cause errors in Firebolt tables
- Markdown Syntax: Ensure all Markdown files have valid syntax to prevent errors
- Environment-Driven: All chunking parameters are configurable via environment variables
- Consistency Validation: The system automatically validates chunking strategy consistency before processing
- No Code Changes: Switch between chunking strategies by updating your `.env` file only
- Strategy Mismatch Warning: The system warns when you attempt to mix different chunking strategies in the same database
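
The format and null-character notes above can be enforced up front with a small pre-filter. The helpers below are not part of the repository; they sketch the kind of filtering those notes imply.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".docx", ".txt", ".md"}  # formats the pipeline processes

def iter_supported_files(root: str):
    """Yield only files with supported extensions (illustrative helper)."""
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS:
            yield path

def strip_null_chars(text: str) -> str:
    """Remove null characters, which can cause errors in Firebolt tables."""
    return text.replace("\x00", "")
```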
To toggle between internal/external user access:
- Go to `web_server.py`
- Set `is_customer=True` in the `run_chatbot()` function to restrict access to public documents only
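
For example, the call in `web_server.py` would then pass the flag along. The exact signature lives in `run_llm.py`, so treat this as a hypothetical shape:

```python
from run_llm import run_chatbot  # assumed import path based on the files above

# Hypothetical call shape -- verify run_chatbot()'s actual parameters in run_llm.py.
answer = run_chatbot("How do I create an engine?", is_customer=True)  # public docs only
```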
We have provided an example dataset that you can use to build your chatbot! You can find the dataset at this GitHub repository, which contains documentation for HuggingFace Transformers.