# **RAGSkeleton: A Google Colab Tutorial**

**A Modular Framework for Retrieval-Augmented Generation (RAG) Systems**

### **Purpose of This Tutorial**

This tutorial demonstrates how [**RAGSkeleton**](https://github.com/hasan-sayeed/RAGSkeleton) allows you to:

- Parse PDF files, create a vector database, and query it using a RAG pipeline—all within a single framework.
- Use any Hugging Face model for text generation, either locally or via the Hugging Face API.
- Swap between different LLMs (e.g., `meta-llama/Llama-3.2-3B-Instruct` and `meta-llama/Meta-Llama-3-8B-Instruct`) effortlessly.

By default, **RAGSkeleton** is designed as a RAG system for materials science documents. However, users can easily adapt it to any other field by modifying the prompt sent to the LLMs. This modularity makes RAGSkeleton versatile and applicable across diverse domains.

By following this tutorial, you’ll learn how to set up and run a Retrieval-Augmented Generation pipeline on your documents using **RAGSkeleton**, ask questions, and get accurate responses grounded in the document content.

## **Step 1: Install RAGSkeleton**

Install RAGSkeleton directly from PyPI.

In [None]:
!pip install rag-skeleton

## **Step 2: Log In to Hugging Face**

To access any model from the Hugging Face Hub, log in using `huggingface-cli`. This step is required if you plan to use the API for larger models or those not available locally.

In [None]:
!huggingface-cli login

Follow the on-screen instructions to enter your Hugging Face access token. You can obtain the token from your [Hugging Face](https://huggingface.co/) account settings.

## **Step 3: Mount Google Drive**

Mount your Google Drive to access the folder containing your documents. Ensure your PDFs are stored in a folder named `rag_data` at the top level of My Drive, so that it matches the `data_path` defined in the cell.

In [None]:
from google.colab import drive

# Mount Google Drive to access and store your files
drive.mount('/content/drive')

# Define the path to your folder in Google Drive containing the raw PDF files
data_path = "/content/drive/My Drive/rag_data"

# Define the path to store the vector database created from the documents
# This database will be queried by the RAG system to retrieve relevant context for questions
vectordb_path = "/content/drive/My Drive/rag_data/vectordb"
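
Before processing, it can help to confirm that the folder exists and actually contains PDFs. The following is a minimal sketch using only the Python standard library; `list_pdfs` is a hypothetical helper written for this tutorial and is not part of RAGSkeleton.

```python
from pathlib import Path


def list_pdfs(folder):
    """Return the sorted PDF paths directly inside `folder` (hypothetical helper)."""
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"Folder not found: {root}. Is Google Drive mounted?")
    # Compare the extension case-insensitively so 'paper.PDF' is also found
    return sorted(p for p in root.iterdir() if p.is_file() and p.suffix.lower() == ".pdf")
```

Calling `list_pdfs(data_path)` after mounting should return one path per PDF. An empty list suggests the files are in a different folder.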

## **Step 4: Process Documents and Create a Vector Database**

All parsing and indexing are handled by RAGSkeleton's `DataProcessor` module. This step:

- Loads the documents from the `rag_data` folder.
- Splits the documents into smaller chunks.
- Creates a vector database for efficient retrieval.

In [None]:
from rag_skeleton.data_processing import DataProcessor

# Initialize and process documents
data_processor = DataProcessor(data_path=data_path, vectordb_path=vectordb_path)
data_processor.process_and_create_db()
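
To make the "splits the documents into smaller chunks" step concrete, here is a generic sketch of fixed-size chunking with overlap. It is an illustration only: the tutorial does not show which splitter or parameters `DataProcessor` uses internally, so the function name, `chunk_size`, and `overlap` below are assumptions.

```python
def split_into_chunks(text, chunk_size=500, overlap=50):
    """Split `text` into character windows of `chunk_size` that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighboring chunks, which tends to help retrieval.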

## **Step 5: Initialize the RAG Pipeline**

The **RAGPipeline** module of RAGSkeleton combines document retrieval and text generation. It supports both local model loading and model loading via the Hugging Face API. For demonstration purposes, we will use API mode to access a larger model, `meta-llama/Meta-Llama-3-8B-Instruct`, via the Hugging Face Hub.

Explanation:
- **Local Mode (`load_mode='local'`)**:
  - Uses a model available locally on your machine.
  - Suitable for smaller models that fit on your hardware, or when you have the required computational resources.
- **API Mode (`load_mode='api'`)**:
  - Allows access to any model from the Hugging Face Hub.
  - Useful for larger models, such as `meta-llama/Meta-Llama-3-8B-Instruct`, that may not fit on local hardware.
  - **Note**: You must provide your Hugging Face API token using the `api_token` parameter.


**Example: Initialize the Pipeline in API Mode**

Here’s how to initialize the pipeline in API mode:

In [None]:
from rag_skeleton.rag import RAGPipeline

# Initialize the RAG pipeline
rag_pipeline = RAGPipeline(
    vectordb_path=vectordb_path,  # Vector database path defined in Step 3
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",  # Example LLM from Hugging Face
    load_mode="api",  # 'local' for models on your system; 'api' for the Hugging Face API
    api_token='YOUR_API_HERE'   # Replace with your Hugging Face API token
)
rag_pipeline.setup_pipeline()

This will set up the RAG pipeline to query your vector database and generate responses using the specified model from the Hugging Face Hub.
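
Hardcoding a token in a notebook makes it easy to leak when the notebook is shared. One alternative, sketched here under the assumption that you have stored the token in an environment variable named `HF_TOKEN`, is to read it at runtime. `get_hf_token` is a hypothetical helper, not part of RAGSkeleton.

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Read the Hugging Face token from an environment variable (hypothetical helper)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your Hugging Face token.")
    return token
```

You could then pass `api_token=get_hf_token()` when constructing `RAGPipeline` instead of pasting the token into the cell.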

**Modularity: Swapping Models**

RAGSkeleton is designed with modularity in mind, allowing you to easily swap the model used for text generation. For example:

- You could use a smaller model like `meta-llama/Llama-3.2-3B-Instruct` for faster inference with lower hardware requirements.
- Alternatively, you can experiment with newer models from the Hugging Face Hub to improve generation quality.

Simply change the `model_name` parameter in the above code block to your desired model.
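
If you switch models often, it may be convenient to collect the constructor arguments in one place. The sketch below uses only the four `RAGPipeline` parameters shown in this tutorial; `pipeline_kwargs` is a hypothetical helper, and it assumes (based on the note in the API Mode description) that `api_token` is only needed in API mode.

```python
def pipeline_kwargs(model_name, vectordb_path, load_mode="api", api_token=None):
    """Collect the RAGPipeline arguments used in this tutorial (hypothetical helper)."""
    if load_mode not in ("local", "api"):
        raise ValueError("load_mode must be 'local' or 'api'")
    if load_mode == "api" and not api_token:
        raise ValueError("API mode requires api_token")
    kwargs = {
        "vectordb_path": vectordb_path,
        "model_name": model_name,
        "load_mode": load_mode,
    }
    # Assumption: the token is only passed along in API mode
    if load_mode == "api":
        kwargs["api_token"] = api_token
    return kwargs
```

For example, `RAGPipeline(**pipeline_kwargs("meta-llama/Llama-3.2-3B-Instruct", vectordb_path, api_token="YOUR_API_HERE"))` followed by `setup_pipeline()` would build a pipeline around the smaller model.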

## **Step 6: Ask Questions**

The RAG system is now ready to accept your questions interactively. When you ask a question, the system will:

- Retrieve relevant context from the vector database based on your question.
- Generate a response grounded in that context using the chosen LLM.
- Return the paths of the source documents used in the retrieval.
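
The retrieve-then-generate flow can be illustrated with a toy example. This sketch is not how RAGSkeleton works internally: it replaces the vector database and learned embeddings with simple word counts and cosine similarity, and all function names are hypothetical. It only shows the idea of ranking chunks against a question and placing the best ones in the prompt.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase word counts; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question, best first."""
    q = bag_of_words(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, bag_of_words(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Place the retrieved context ahead of the question for the LLM."""
    context = "\n\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

In a real pipeline the prompt produced by a step like `build_prompt` is what the LLM sees, which is why answers stay grounded in the retrieved text.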

**Note:** For this tutorial, we used materials science documents, and the example outputs reflect content from those documents.

**Example Usage:**

Type your questions when prompted, and type `exit` to quit the chatbot.

In [6]:
# Function to interactively ask questions
def interactive_chat():
    print("Welcome to the RAG system! Type 'exit' to end the session.")
    while True:
        question = input("Ask your question: ")
        if question.strip().lower() == "exit":  # ignore surrounding whitespace and case
            print("Exiting the chatbot. Goodbye!")
            break
        response = rag_pipeline.get_response(question)
        print("\nResponse:", response)
        print("\n")

# Start interactive chat
interactive_chat()

Welcome to the RAG system! Type 'exit' to end the session.
Ask your question: Can you explain the environmental impact of traditional materials, especially in terms of energy consumption and pollution?

Response:  The extraction, processing, and disposal of traditional materials like metals, polymers, and ceramics have significant environmental consequences. Some key impacts include:

* Resource Depletion: Extracting non-renewable resources, such as metals and fossil fuels, depletes the Earth's reserves.
* Energy Consumption: Material production is energy-intensive, requiring substantial energy often derived from fossil fuels, which increases greenhouse gas emissions.
* Pollution and Emissions: Processing materials releases pollutants into the air, water, and soil, contributing to air pollution. For example, plastic production emits VOCs and other pollutants, and the cement industry is a significant source of carbon dioxide emissions.
* Waste Generation: Industrial processes generate l