<div style="display: flex; justify-content: flex-start; align-items: center; gap: 15px; margin-bottom: 20px;">
  <a target="_blank" href="https://colab.research.google.com/github.com/SylphAI-Inc/AdalFlow/blob/main/notebooks/tutorials/adalflow_rag_vanilla.ipynb">
    <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
  </a>
  <a href="https://github.com/SylphAI-Inc/AdalFlow/blob/main/tutorials/adalflow_rag_vanilla.py" target="_blank" style="display: flex; align-items: center;">
      <img src="https://github.githubassets.com/images/modules/logos_page/GitHub-Mark.png" alt="GitHub" style="height: 20px; width: 20px; margin-right: 5px;">
      <span style="vertical-align: middle;"> Open Source Code </span>
  </a>
</div>

# 🤗 Welcome to AdalFlow!
## The PyTorch library to auto-optimize any LLM task pipeline

Thanks for trying us out! We're here to provide you with the best LLM application development experience you can dream of 😊 If you have any questions or concerns, [come talk to us on Discord](https://discord.gg/ezzszrRZvT); we're always here to help! ⭐ <i>Star us on <a href="https://github.com/SylphAI-Inc/AdalFlow">GitHub</a></i> ⭐


# Quick Links

Github repo: https://github.com/SylphAI-Inc/AdalFlow

Full Tutorials: https://adalflow.sylph.ai/index.html#.

Deep dive on each API: check out the [developer notes](https://adalflow.sylph.ai/tutorials/index.html).

Common use cases along with the auto-optimization: check out [Use cases](https://adalflow.sylph.ai/use_cases/index.html).

# Author
This notebook was created by community contributor [Ajith](https://github.com/ajithvcoder/).

# Outline

This is a quick introduction to what AdalFlow is capable of. We will cover:

* How to use AdalFlow for RAG

AdalFlow can be used in a generic manner with any API provider, without worrying much about prompts,
model arguments, or parsing results.

**Next: Try our [adalflow-rag-for-documents](https://colab.research.google.com/github.com/SylphAI-Inc/AdalFlow/blob/main/notebooks/tutorials/adalflow_rag_documents.ipynb)**


# Installation

1. Use `pip` to install the `adalflow` Python package. We will need `openai`, `groq`, and `faiss` (CPU version) from the extra packages.

  ```bash
    pip install torch --index-url https://download.pytorch.org/whl/cpu
    pip install sentence-transformers==3.3.1
    pip install adalflow[openai,groq,faiss-cpu]
  ```
2. Set the `openai` and `groq` API keys as environment variables.

### Set Environment Variables

Note: Enter your API keys in the cell below.

In [None]:
%%writefile .env

OPENAI_API_KEY="PASTE-OPENAI_API_KEY_HERE"
GROQ_API_KEY="PASTE-GROQ_API_KEY-HERE"

Overwriting .env


In [1]:
from adalflow.utils import setup_env

# Load environment variables. Make sure OPENAI_API_KEY and GROQ_API_KEY are set in the .env file and that .env is in the current folder
setup_env(".env")
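It can be handy to confirm that both keys were actually loaded before constructing any client. The sketch below reports which expected variables are unset, empty, or still hold the placeholder text from the `.env` cell above. It uses only the standard library; the helper name `missing_api_keys` and the `PASTE-` placeholder check are assumptions made for illustration and are not part of AdalFlow.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_api_keys(
    required: Iterable[str],
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names in `required` that are unset, empty, or still placeholders."""
    env = os.environ if env is None else env
    missing = []
    for name in required:
        value = env.get(name, "").strip()
        # Treat the template text from the .env cell ("PASTE-...") as not set.
        if not value or value.startswith("PASTE-"):
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_api_keys(["OPENAI_API_KEY", "GROQ_API_KEY"])
    print("Missing keys:", missing or "none")
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.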

In [2]:
import os
from typing import List, Dict
import numpy as np
from sentence_transformers import SentenceTransformer
from faiss import IndexFlatL2

from adalflow.components.model_client import GroqAPIClient, OpenAIClient
from adalflow.core.types import ModelType
from adalflow.utils import setup_env



`AdalflowRAGPipeline` is a class that implements a Retrieval-Augmented Generation (RAG) pipeline with adalflow. It integrates:

- Embedding models (e.g., Sentence Transformers) for document and query embeddings.
- FAISS for vector similarity search.
- An LLM client to generate context-aware responses using retrieved documents.
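To make the retrieval step concrete, here is a dependency-free sketch of what a flat L2 index does conceptually: it compares the query against every stored vector and keeps the `k` with the smallest squared Euclidean distance. The helper names `squared_l2` and `flat_l2_search` and the toy vectors are hypothetical; the pipeline itself relies on FAISS and Sentence Transformers rather than this code.

```python
from typing import List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(
    vectors: Sequence[Sequence[float]],
    query: Sequence[float],
    k: int,
) -> List[Tuple[int, float]]:
    """Score the query against every stored vector; return the k closest as (index, distance)."""
    scored = [(i, squared_l2(v, query)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy 2-dimensional "embeddings"; the embeddings used in this notebook have 384 dimensions.
toy_vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
print(flat_l2_search(toy_vectors, [0.9, 1.2], k=2))
```

The query lands closest to the second vector, so index `1` is returned first. Exhaustive search like this is exact but scales linearly with the number of stored documents, which is fine for a handful of sentences.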

In [3]:
class AdalflowRAGPipeline:
    def __init__(self, 
                 model_client = None,
                 model_kwargs = None,
                 embedding_model='all-MiniLM-L6-v2', 
                 vector_dim=384, 
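                 top_k_retrieval=1):
        """
        Initialize RAG Pipeline with embedding and retrieval components

        Args:
            model_client: Model client used for generation (e.g., GroqAPIClient or OpenAIClient)
            model_kwargs (dict): Arguments passed to the model API (e.g., model, temperature, max_tokens)
            embedding_model (str): Sentence transformer model for embeddings
            vector_dim (int): Dimension of embedding vectors (must match the embedding model's output size)
            top_k_retrieval (int): Number of documents to retrieve
        """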
        # Initialize model client for generation
        self.model_client = model_client
        
        # Initialize embedding model
        self.embedding_model = SentenceTransformer(embedding_model)
        
        # Initialize FAISS index for vector similarity search
        self.index = IndexFlatL2(vector_dim)
        
        # Store document texts and their embeddings
        self.documents = []
        self.document_embeddings = []
        
        # Retrieval parameters
        self.top_k_retrieval = top_k_retrieval
        
        # Conversation history and context
        self.conversation_history = ""
        self.model_kwargs = model_kwargs

    def add_documents(self, documents: List[str]):
        """
        Add documents to the RAG pipeline's knowledge base
        
        Args:
            documents (List[str]): List of document texts to add
        """
        for doc in documents:
            # Embed document
            embedding = self.embedding_model.encode(doc)
            
            # Add to index and document store
            self.index.add(np.array([embedding]))
            self.documents.append(doc)
            self.document_embeddings.append(embedding)

    def retrieve_relevant_docs(self, query: str) -> List[str]:
        """
        Retrieve most relevant documents for a given query
        
        Args:
            query (str): Input query to find relevant documents
        
        Returns:
            List[str]: Top k most relevant documents
        """
        # Embed query
        query_embedding = self.embedding_model.encode(query)
        
        # Perform similarity search
        distances, indices = self.index.search(
            np.array([query_embedding]), 
            self.top_k_retrieval
        )
        
        # Retrieve and return top documents
        return [self.documents[i] for i in indices[0]]

    def generate_response(self, query: str):
        """
        Generate a response using retrieval-augmented generation

        Args:
            query (str): User's input query

        Returns:
            The parsed completion from the model client. In the run shown below this is
            a GeneratorOutput whose raw_response field holds the generated text.
        """
        # Retrieve relevant documents
        retrieved_docs = self.retrieve_relevant_docs(query)
        
        # Construct context-aware prompt
        context = "\n\n".join([f"Context Document: {doc}" for doc in retrieved_docs])
        full_prompt = f"""
        Context:
        {context}
        
        Query: {query}
        
        Generate a comprehensive and informative response that:
        1. Uses the provided context documents
        2. Directly answers the query
        3. Incorporates relevant information from the context
        """
        
        # Prepare API arguments
        api_kwargs = self.model_client.convert_inputs_to_api_kwargs(
            input=full_prompt,
            model_kwargs=self.model_kwargs,
            model_type=ModelType.LLM
        )
        
        # Call API and parse response
        response = self.model_client.call(
            api_kwargs=api_kwargs, 
            model_type=ModelType.LLM
        )
        response_text = self.model_client.parse_chat_completion(response)
        
        # Update conversation history
        self.conversation_history += f"\nQuery: {query}\nResponse: {response_text}"
        
        return response_text
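One edge case is worth keeping in mind for `retrieve_relevant_docs`: when an index holds fewer vectors than the requested `k`, FAISS pads the returned index array with `-1`. In Python, `self.documents[-1]` silently returns the last document instead of failing, so an unrelated document could slip into the context. The following is a minimal sketch of a guard; the helper name `select_documents` is hypothetical and is not part of the class above.

```python
from typing import List, Sequence


def select_documents(documents: Sequence[str], indices: Sequence[int]) -> List[str]:
    """Map search result indices to documents, skipping padding and out-of-range values."""
    selected = []
    for i in indices:
        i = int(i)  # accepts NumPy integer types as well as plain ints
        if 0 <= i < len(documents):
            selected.append(documents[i])
    return selected


docs = ["doc about Paris", "doc about food"]
# Simulated search result for k=3 over only two stored documents.
print(select_documents(docs, [1, 0, -1]))
```

Inside the class, the last line of `retrieve_relevant_docs` could then read `return select_documents(self.documents, indices[0])`.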


The `run_rag_pipeline` function demonstrates how to use the AdalflowRAGPipeline for embedding documents, retrieving relevant context, and generating responses:

In [4]:
def run_rag_pipeline(model_client, model_kwargs, documents, queries):
    rag_pipeline = AdalflowRAGPipeline(model_client=model_client, model_kwargs=model_kwargs)

    rag_pipeline.add_documents(documents)

    # Generate responses
    for query in queries:
        print(f"\nQuery: {query}")
        response = rag_pipeline.generate_response(query)
        print(f"Response: {response}")
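The pipeline only relies on three client methods: `convert_inputs_to_api_kwargs`, `call`, and `parse_chat_completion`. That is what lets the same code run against different providers. As a rough illustration, the sketch below runs the same convert, call, parse sequence against an offline stand-in. `EchoClient` and `generate` are hypothetical names, the `"llm"` string stands in for `ModelType.LLM`, and the stand-in returns a plain string, whereas the AdalFlow clients in this notebook return a `GeneratorOutput`.

```python
from typing import Any, Dict, Optional


class EchoClient:
    """Hypothetical offline stand-in exposing the three methods the pipeline calls."""

    def convert_inputs_to_api_kwargs(
        self,
        input: str,
        model_kwargs: Optional[Dict[str, Any]] = None,
        model_type: Any = None,
    ) -> Dict[str, Any]:
        # Merge the prompt into a copy of the provider-specific keyword arguments.
        api_kwargs = dict(model_kwargs or {})
        api_kwargs["prompt"] = input
        return api_kwargs

    def call(self, api_kwargs: Dict[str, Any], model_type: Any = None) -> Dict[str, Any]:
        # A real client would send a network request here; this one echoes the prompt.
        return {"text": f"[{api_kwargs.get('model', 'no-model')}] {api_kwargs['prompt']}"}

    def parse_chat_completion(self, completion: Dict[str, Any]) -> str:
        # Extract the generated text from the raw completion object.
        return completion["text"]


def generate(client: Any, prompt: str, model_kwargs: Dict[str, Any]) -> Any:
    """Run the same convert -> call -> parse sequence used in `generate_response`."""
    api_kwargs = client.convert_inputs_to_api_kwargs(
        input=prompt, model_kwargs=model_kwargs, model_type="llm"
    )
    completion = client.call(api_kwargs=api_kwargs, model_type="llm")
    return client.parse_chat_completion(completion)


print(generate(EchoClient(), "What is RAG?", {"model": "echo-1"}))
```

A stand-in like this can be useful for checking the retrieval and prompt-building logic without spending API credits.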

In [None]:
# setup_env()

# Statements about ajithvcoder are included so that we can verify the LLM answers from these lines only
documents = [
    "ajithvcoder is a good person whom the world knows as Ajith Kumar, ajithvcoder is his nick name that AjithKumar gave himself",
    "The Eiffel Tower is a famous landmark in Paris, built in 1889 for the World's Fair.",
    "ajithvcoder likes Hyderabadi panner dum briyani much.",
    "The Louvre Museum in Paris is the world's largest art museum, housing thousands of works of art.",
    "ajithvcoder has a engineering degree and he graduated on May, 2016."
]

# Questions about ajithvcoder are included so that we can verify
# that the LLM answers only from the lines given above
queries = [
    "Does Ajith Kumar has any nick name ?",
    "What is the ajithvcoder's favourite food?",
    "When did ajithvcoder graduated ?"
]

groq_model_kwargs = {
    "model": "llama-3.2-1b-preview",  # Groq-hosted Llama 3.2 1B (preview)
    "temperature": 0.1,
    "max_tokens": 800,
}

openai_model_kwargs = {
    "model": "gpt-3.5-turbo",  # OpenAI chat model
    "temperature": 0.1,
    "max_tokens": 800,
}

# The example below shows that adalflow can be used in a generic manner with any API provider
# without worrying about prompts and parsing results
model_client = GroqAPIClient()
run_rag_pipeline(model_client, groq_model_kwargs, documents, queries)
run_rag_pipeline(OpenAIClient(), openai_model_kwargs, documents, queries)



Query: Does Ajith Kumar has any nick name ?
Response: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=78, prompt_tokens=122, total_tokens=200), raw_response='Based on the provided context documents, Ajith Kumar, also known as Ajithvcoder, has a nickname that he has given himself. According to the context, Ajithvcoder is his nickname that he has chosen for himself.\n\nTherefore, the answer to the query is:\n\nYes, Ajith Kumar has a nickname that he has given himself, which is Ajithvcoder.', metadata=None)

Query: What is the ajithvcoder's favourite food?
Response: GeneratorOutput(id=None, data=None, error=None, usage=CompletionUsage(completion_tokens=67, prompt_tokens=109, total_tokens=176), raw_response='Based on the provided context document, I can confidently answer the query as follows:\n\nAjithvcoder\'s favourite food is Hyderabadi Panner Dum Briyani.\n\nThis answer is directly supported by the context document, which states: "ajithvcoder li