This is a custom RAG (retrieval-augmented generation) system you can use with any model Ollama supports. Upload any PDF, Word, or TXT file into the DataFiles folder and it becomes part of the knowledge base.
I hope you can excuse the troublesome setup process. This has been tested on Linux Mint Cinnamon and macOS; I do not plan to support Windows.
First, ensure that Ollama is installed on your system, then pull the following models:
ollama pull bge-m3
ollama pull llama3.2:latest
Then, install the Python dependencies (fair warning: there are a lot of them):
pip3 install -r requirements.txt
This RAG system uses ChromaDB to store its vector embeddings. Go into the /Custom-RAG/ChatBot/DataFiles directory and make sure .FilesAdded.txt is empty; this file is used to log every file that gets embedded. After that, upload your documents into the folder.
Go back to your ChatBot directory and run:
python3 GenerateDB.py
Once it finishes, you should see a VectorDB directory containing the vector database.
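For reference, the embedding step works roughly like the sketch below. This is not the actual GenerateDB.py; the collection name, chunk size, and plain-text reading are assumptions (the real script also has to extract text from PDF and Word files), but the ChromaDB and Ollama calls are the standard ones:

import os
import ollama
import chromadb

DATA_DIR = "DataFiles"
LOG_FILE = os.path.join(DATA_DIR, ".FilesAdded.txt")

# Persistent ChromaDB store, written to the VectorDB directory
client = chromadb.PersistentClient(path="VectorDB")
collection = client.get_or_create_collection("documents")  # collection name is an assumption

# Skip files already logged in .FilesAdded.txt
done = set(open(LOG_FILE).read().split()) if os.path.exists(LOG_FILE) else set()

for name in os.listdir(DATA_DIR):
    if name.startswith(".") or name in done:
        continue
    # Plain-text read for brevity; PDF/Word files need real text extraction
    text = open(os.path.join(DATA_DIR, name), errors="ignore").read()
    chunks = [text[i:i + 1000] for i in range(0, len(text), 1000)]
    for i, chunk in enumerate(chunks):
        emb = ollama.embeddings(model="bge-m3", prompt=chunk)["embedding"]
        collection.add(ids=[f"{name}-{i}"], embeddings=[emb], documents=[chunk])
    with open(LOG_FILE, "a") as log:
        log.write(name + "\n")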
You can run this from the terminal or through a (really ugly) chat-like UI I made in two days. Note that if Ollama is not already running when you start the program, you have to launch it in the background with:
ollama serve
The backend is written in Flask and the front-end is just pure JS.
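To give an idea of the shape of the backend, here is a minimal sketch. The route name, JSON fields, and answer_question placeholder are assumptions, not the actual app.py:

from flask import Flask, request, jsonify

app = Flask(__name__)

def answer_question(question: str) -> str:
    # Placeholder for the real retrieve -> rerank -> generate pipeline
    return f"You asked: {question}"

@app.route("/chat", methods=["POST"])
def chat():
    # The JS frontend posts the user's message as JSON
    question = request.get_json()["message"]
    return jsonify({"answer": answer_question(question)})

if __name__ == "__main__":
    app.run()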
You can also use any generative model Ollama supports. Just pull the model you want in Ollama and change LANGUAGE_MODEL in ChatBot.py to its name.
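For example, after pulling a model such as Mistral, the line in ChatBot.py would become (the model name here is just an illustration):

LANGUAGE_MODEL = "mistral:latest"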
Run TerminalChat.py:
python3 ./ChatBot/TerminalChat.py
Run app.py in /Custom-RAG:
python3 app.py
Files are embedded with the "bge-m3" model from BAAI. I chose this model because it is fast, multilingual, and can run on almost any modern hardware.
The reranker is the cross-encoder "ms-marco-MiniLM-L-6-v2", called through sentence-transformers.
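Reranking with sentence-transformers looks roughly like this (the query and candidate passages are made up for illustration):

from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How do I reset my password?"
candidates = [
    "Passwords can be reset from the account settings page.",
    "Our office is open Monday through Friday.",
]

# Score each (query, passage) pair; higher scores mean more relevant
scores = reranker.predict([(query, c) for c in candidates])
ranked = sorted(zip(scores, candidates), reverse=True)
print(ranked[0][1])  # best-matching passage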
The generative model is "llama3.2", a fast, decent 3B-parameter model that fits this use case. You can run a larger model such as DeepSeek if you would like to.
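The generation step then feeds the top-ranked chunks into the prompt, roughly like this (the prompt wording and variables are illustrative, not the actual ChatBot.py):

import ollama

context = "Passwords can be reset from the account settings page."
question = "How do I reset my password?"

response = ollama.chat(
    model="llama3.2:latest",
    messages=[
        {"role": "system", "content": "Answer using only this context:\n" + context},
        {"role": "user", "content": question},
    ],
)
print(response["message"]["content"])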
