Skip to content

pfalli/RAG_Python_Tutorial

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Python RAG Tutorial: From Basics to Advanced Customization

Welcome to the Python RAG (Retrieval-Augmented Generation) Tutorial! This project is designed as a step-by-step learning journey to help you understand, build, and optimize RAG applications running completely locally.

By following the scripts in order, you will learn how to process documents, generate embeddings, store them in a vector database, and connect them to a local LLM using Ollama.

📁 Project Structure

RAG_Python_Tutorial/
├── data/                   # Contains sample datasets used in the tutorial
│   ├── knowledge_base/     # Text files covering generic Python knowledge
│   ├── pdfs/               # Folder for testing PDF processing
│   ├── python_info.txt     # Single document sample
│   └── sample_doc.txt      # General sample text
├── genai_env/              # Python virtual environment (dependencies)
└── scripts/                # The core tutorial scripts (numbered sequentially)

(Note: As you run the scripts, various chroma_db folders will be generated to store your local vector embeddings).

🧠 System Architecture Overview

To build a solid RAG system, it is crucial to understand the foundational lifecycle. This tutorial implementation structure maps sequentially through these core concepts:

  • Documents and ingestion → Gathering raw data sources (text files, PDFs, etc.).
  • Text extraction and cleaning → Parsing the documents to remove noise and extract usable textual data.
  • Chunking → Breaking the clean text into smaller intelligently grouped pieces.
  • Embeddings → Converting chunked text into vectors to allow calculation of semantic similarity.
  • Vector storage → Storing and searching these embeddings efficiently using a database.
  • Retrieval → Finding the most relevant context vectors for user queries based on semantic representation.
  • Prompting + answer generation → Feeding the retrieved context to the LLM alongside the user prompt to synthesize an accurate answer.
  • Evaluation → Analyzing the quality, relevance, and accuracy of generated outputs.
  • Serving/UI + deployment → Wrapping your operational RAG engine into user-friendly platforms (e.g., Gradio web UI).

🚀 Step-by-Step Guide (File by File)

The tutorial is broken down into sequentially numbered Python scripts located in the scripts/ folder. It is highly recommended to study and run them in order.

Phase 1: The Foundations of RAG

  • 01_embeddings_basics.py: Introduction to creating embeddings. Learn how text is converted into numerical vectors so machines can understand semantic meaning.
  • 02_document_processing.py: Learn how to load, clean, and split (chunk) large text documents into smaller, manageable pieces suitable for vectorization.
  • 03_rag_ollama_basic.py: Your first end-to-end RAG pipeline! Connect document embeddings, a ChromaDB vector store, and a local Ollama LLM to answer questions based on your data.

Phase 2: Improving Retrieval and Handling Data

  • 04_retrieval_strategies.py: Explores advanced retrieval techniques (e.g., semantic search vs. keyword search, similarity thresholds) to ensure the LLM gets the most relevant context.
  • 05_multi_document_rag.py: Scales up the basic pipeline to ingest and query across multiple text documents within the data/knowledge_base/ directory.
  • 05b_rag_ollama_pdf.py: Extends the data ingestion pipeline to handle PDFs instead of just plain text files.

Phase 3: Building a Conversational Chatbot

  • 06_ollama_chatbot_local.py: Upgrades your RAG system into a continuous, interactive terminal chatbot.
  • 06b_add_conversation_memory.py: Adds conversation history (memory) to the chatbot so it can remember previous questions and answers in your chat session.
  • 06c_add_calculate_tokens.py: Introduces token counting and management, crucial for ensuring your prompts and conversational memory don't exceed the LLM's context window limits.

Phase 4: User Interfaces and Advanced Techniques

  • 07_rag_chatbot_ui.py: Moves the chatbot out of the terminal and into a web-based User Interface (using tools like Gradio) for a more user-friendly experience.
  • 08_optimization_techniques.py: Covers advanced optimization strategies to improve the speed, accuracy, and reliability of your RAG outputs.
  • 09_custom_RAG.py: A fully customized, advanced RAG implementation incorporating everything you've learned into a robust, object-oriented pipeline.

� Advice for Learning

To get the most value out of this tutorial, follow this iterative approach:

  1. Read, Run, Check & Edit: Start by reading the script to understand its logic. Then, run the script, review the answers it generates, and critically edit or customize it to test new behaviors.
  2. Capstone Challenge: Once you finish all scripts, challenge yourself by creating a "final project": a completely customized RAG-based local chatbot centered around a dataset (PDFs or Text) of your own choosing!

�🛠️ Setup Instructions

To run this project on your own machine:

  1. Install Ollama: Download and install Ollama. Pull your preferred local model (e.g., ollama run llama3 or ollama run mistral).
  2. Set up Virtual Environment:
    # Create a virtual environment (if not already created)
    python -m venv genai_env
    
    # Activate it (On Windows)
    .\genai_env\Scripts\activate
    
    # Or on macOS/Linux
    source genai_env/bin/activate
  3. Install Requirements:
    pip install -r requirements.txt
  4. Run the Scripts: Navigate to the scripts directory and run them one by one to see how the ecosystem works.
    cd scripts
    python 01_embeddings_basics.py

⚠️ Important Notes for Testing

  • Ollama Models: The specific LLMs used in this tutorial (like llama3, mistral, etc.) might be updated or replaced over time. Ensure you have pulled the model required by the active script using ollama pull <model_name>.
  • Gradio UI: The gradio package used for the web interface in script 07 may undergo API changes in newer versions. If you encounter errors launching the UI, check your package version against the Gradio documentation.

Happy coding and enjoy building your own local AI applications!

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages