Python RAG Tutorial: From Basics to Advanced Customization

Welcome to the Python RAG (Retrieval-Augmented Generation) Tutorial! This project is designed as a step-by-step learning journey to help you understand, build, and optimize RAG applications running completely locally.

By following the scripts in order, you will learn how to process documents, generate embeddings, store them in a vector database, and connect them to a local LLM using Ollama.

📁 Project Structure

RAG_Python_Tutorial/
├── data/                   # Contains sample datasets used in the tutorial
│   ├── knowledge_base/     # Text files covering generic Python knowledge
│   ├── pdfs/               # Folder for testing PDF processing
│   ├── python_info.txt     # Single document sample
│   └── sample_doc.txt      # General sample text
├── genai_env/              # Python virtual environment (dependencies)
└── scripts/                # The core tutorial scripts (numbered sequentially)

(Note: As you run the scripts, various chroma_db folders will be generated to store your local vector embeddings).

🧠 System Architecture Overview

To build a solid RAG system, it is crucial to understand the foundational lifecycle. This tutorial implementation structure maps sequentially through these core concepts:

Documents and ingestion → Gathering raw data sources (text files, PDFs, etc.).
Text extraction and cleaning → Parsing the documents to remove noise and extract usable textual data.
Chunking → Breaking the clean text into smaller intelligently grouped pieces.
Embeddings → Converting chunked text into vectors to allow calculation of semantic similarity.
Vector storage → Storing and searching these embeddings efficiently using a database.
Retrieval → Finding the most relevant context vectors for user queries based on semantic representation.
Prompting + answer generation → Feeding the retrieved context to the LLM alongside the user prompt to synthesize an accurate answer.
Evaluation → Analyzing the quality, relevance, and accuracy of generated outputs.
Serving/UI + deployment → Wrapping your operational RAG engine into user-friendly platforms (e.g., Gradio web UI).

🚀 Step-by-Step Guide (File by File)

The tutorial is broken down into sequentially numbered Python scripts located in the scripts/ folder. It is highly recommended to study and run them in order.

Phase 1: The Foundations of RAG

01_embeddings_basics.py: Introduction to creating embeddings. Learn how text is converted into numerical vectors so machines can understand semantic meaning.
02_document_processing.py: Learn how to load, clean, and split (chunk) large text documents into smaller, manageable pieces suitable for vectorization.
03_rag_ollama_basic.py: Your first end-to-end RAG pipeline! Connect document embeddings, a ChromaDB vector store, and a local Ollama LLM to answer questions based on your data.

Phase 2: Improving Retrieval and Handling Data

04_retrieval_strategies.py: Explores advanced retrieval techniques (e.g., semantic search vs. keyword search, similarity thresholds) to ensure the LLM gets the most relevant context.
05_multi_document_rag.py: Scales up the basic pipeline to ingest and query across multiple text documents within the data/knowledge_base/ directory.
05b_rag_ollama_pdf.py: Extends the data ingestion pipeline to handle PDFs instead of just plain text files.

Phase 3: Building a Conversational Chatbot

06_ollama_chatbot_local.py: Upgrades your RAG system into a continuous, interactive terminal chatbot.
06b_add_conversation_memory.py: Adds conversation history (memory) to the chatbot so it can remember previous questions and answers in your chat session.
06c_add_calculate_tokens.py: Introduces token counting and management, crucial for ensuring your prompts and conversational memory don't exceed the LLM's context window limits.

Phase 4: User Interfaces and Advanced Techniques

07_rag_chatbot_ui.py: Moves the chatbot out of the terminal and into a web-based User Interface (using tools like Gradio) for a more user-friendly experience.
08_optimization_techniques.py: Covers advanced optimization strategies to improve the speed, accuracy, and reliability of your RAG outputs.
09_custom_RAG.py: A fully customized, advanced RAG implementation incorporating everything you've learned into a robust, object-oriented pipeline.

� Advice for Learning

To get the most value out of this tutorial, follow this iterative approach:

Read, Run, Check & Edit: Start by reading the script to understand its logic. Then, run the script, review the answers it generates, and critically edit or customize it to test new behaviors.
Capstone Challenge: Once you finish all scripts, challenge yourself by creating a "final project": a completely customized RAG-based local chatbot centered around a dataset (PDFs or Text) of your own choosing!

�🛠️ Setup Instructions

To run this project on your own machine:

Install Ollama: Download and install Ollama. Pull your preferred local model (e.g., ollama run llama3 or ollama run mistral).

Set up Virtual Environment:

# Create a virtual environment (if not already created)
python -m venv genai_env

# Activate it (On Windows)
.\genai_env\Scripts\activate

# Or on macOS/Linux
source genai_env/bin/activate

Install Requirements:
```
pip install -r requirements.txt
```
Run the Scripts: Navigate to the scripts directory and run them one by one to see how the ecosystem works.
```
cd scripts
python 01_embeddings_basics.py
```

⚠️ Important Notes for Testing

Ollama Models: The specific LLMs used in this tutorial (like llama3, mistral, etc.) might be updated or replaced over time. Ensure you have pulled the model required by the active script using ollama pull <model_name>.
Gradio UI: The gradio package used for the web interface in script 07 may undergo API changes in newer versions. If you encounter errors launching the UI, check your package version against the Gradio documentation.

Happy coding and enjoy building your own local AI applications!

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
data		data
scripts		scripts
.gitignore		.gitignore
README.md		README.md
rag_concepts_summary.md		rag_concepts_summary.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Python RAG Tutorial: From Basics to Advanced Customization

📁 Project Structure

🧠 System Architecture Overview

🚀 Step-by-Step Guide (File by File)

Phase 1: The Foundations of RAG

Phase 2: Improving Retrieval and Handling Data

Phase 3: Building a Conversational Chatbot

Phase 4: User Interfaces and Advanced Techniques

� Advice for Learning

�🛠️ Setup Instructions

⚠️ Important Notes for Testing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Python RAG Tutorial: From Basics to Advanced Customization

📁 Project Structure

🧠 System Architecture Overview

🚀 Step-by-Step Guide (File by File)

Phase 1: The Foundations of RAG

Phase 2: Improving Retrieval and Handling Data

Phase 3: Building a Conversational Chatbot

Phase 4: User Interfaces and Advanced Techniques

� Advice for Learning

�🛠️ Setup Instructions

⚠️ Important Notes for Testing

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages