<a href="https://colab.research.google.com/drive/1mrbwEpyfdthfezsZY3N7rfApGxZEtyty?usp=sharing" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>

###	🔄 What is Self-Adaptive RAG?

Self-Adaptive RAG is an advanced technique that autonomously optimizes its performance over time. It uses machine learning algorithms to continuously analyse its own outputs, user feedback, and performance metrics to refine its retrieval and generation strategies. This system can adjust its parameters, update its knowledge base, and modify its decision-making processes without constant human intervention, allowing it to adapt to changing information landscapes and user needs.

### 🔧 Self-Adaptive RAG Implementation:

1. **Query Complexity Assessment:** We use the LLM to rate the complexity of the query on a scale of 1 to 5.

2. **Adaptive Retrieval Strategy:**

  * For simple queries (complexity <= 2), we use standard retrieval.
  * For moderately complex queries (complexity <= 4), we use query expansion before retrieval.
  * For very complex queries (complexity > 4), we use hypothetical document embedding (HyDE) for retrieval.

3. **Final Response Generation:** Using the retrieved context, we generate a final response to the original query.

# ⚙️ Setup

1. **[LLM](https://groq.com/):** Groq's free Open source LLM endpoints([Groq API Key](https://console.groq.com/keys))
2. **[Vector Store](https://www.pinecone.io/learn/vector-database/):** [ChromaDB](https://www.trychroma.com/)
3. **[Embedding Model](https://qdrant.tech/articles/what-are-embeddings/):** [nomic-embed-text-v1.5](https://www.nomic.ai/blog/posts/nomic-embed-text-v1)
4. **[LLM Framework](https://python.langchain.com/v0.2/docs/introduction/):** LangChain
5. **[Huggingface API Key](https://huggingface.co/settings/tokens)**


# Install required libraries

In [1]:
!pip install -q -U \
     Sentence-transformers==3.0.1 \
     langchain==0.3.19 \
     langchain-groq==0.2.4 \
     langchain-chroma==0.2.2 \
     langchain-community==0.3.18 \
     langchain-huggingface==0.1.2 \
     einops==0.8.1

[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/67.3 kB[0m [31m?[0m eta [36m-:--:--[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m67.3/67.3 kB[0m [31m2.7 MB/s[0m eta [36m0:00:00[0m
[?25h  Installing build dependencies ... [?25l[?25hdone
  Getting requirements to build wheel ... [?25l[?25hdone
  Preparing metadata (pyproject.toml) ... [?25l[?25hdone
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m227.1/227.1 kB[0m [31m14.2 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.5/2.5 MB[0m [31m60.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m611.1/611.1 kB[0m [31m44.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.4/2.4 MB[0m [31m90.4 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m121.9/121.9 kB[0m [31m12.4 MB/s[0m eta [36m0:00

### Import related libraries related to Langchain, HuggingfaceEmbedding

In [2]:
from langchain_groq import ChatGroq
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.prompts import ChatPromptTemplate
from langchain.document_loaders import WebBaseLoader



In [3]:
import os
import getpass

#### Provide a Groq API key. You can create one to access free open-source models at the following link.

[Groq API Creation Link](https://console.groq.com/keys)




In [4]:
os.environ["GROQ_API_KEY"] = getpass.getpass()

··········


### Provide Huggingface API Key. You can create Huggingface API key at following link

[Huggingface API Creation Link](https://huggingface.co/settings/tokens)




In [5]:
os.environ["HF_TOKEN"] = getpass.getpass()

··········


### Step 1: Load and preprocess data code

In [9]:
def load_and_process_data(url):
    # Load data from web
    loader = WebBaseLoader(url)
    data = loader.load()

    # Split text into chunks (Experiment with Chunk Size and Chunk Overlap to get optimal chunking)
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = text_splitter.split_documents(data)

    return chunks

### Step 2: Create vector store code

In [10]:
def create_vector_store(chunks):
    embeddings = HuggingFaceEmbeddings(model_name="nomic-ai/nomic-embed-text-v1.5", model_kwargs = {'trust_remote_code': True})
    vectorstore = Chroma.from_documents(chunks, embeddings)
    return vectorstore

### Step 3: Self-Adaptive RAG related code

1. **Query Complexity Assessment:** We use the LLM to rate the complexity of the query on a scale of 1 to 5.

2. **Adaptive Retrieval Strategy:**

  * For simple queries (complexity <= 2), we use standard retrieval.
  * For moderately complex queries (complexity <= 4), we use query expansion before retrieval.
  * For very complex queries (complexity > 4), we use hypothetical document embedding (HyDE) for retrieval.

3. **Final Response Generation:** Using the retrieved context, we generate a final response to the original query.

In [11]:
def self_adaptive_rag(query, vectorstore, llm):
    # Assess query complexity and type
    assess_prompt = ChatPromptTemplate.from_template(
        "Analyze the following query and provide:\n"
        "1. Complexity (rate from 1-5, where 1 is very simple and 5 is very complex)\n"
        "2. Query type (e.g., factual, analytical, open-ended)\n"
        "3. Suggested retrieval strategy (e.g., standard, query expansion, multi-hop)\n"
        "Query: {query}\n"
        "Analysis:"
    )
    assess_chain = assess_prompt | llm
    try:
        assessment = assess_chain.invoke({"query": query})
        # Parse the assessment (This is a simplified parsing, you might want to make it more robust)
        lines = assessment.content.split('\n')
        complexity = int(lines[0].split(':')[-1].strip())
        query_type = lines[1].split(':')[-1].strip()
        retrieval_strategy = lines[2].split(':')[-1].strip()
    except Exception as e:
        print(f"Error assessing query: {e}")
        complexity, query_type, retrieval_strategy = 3, "unknown", "standard"

    # Adapt retrieval strategy based on assessment
    if retrieval_strategy == "standard" or complexity <= 2:
        docs = vectorstore.similarity_search(query, k=3)
    elif retrieval_strategy == "query expansion" or complexity == 3:
        expand_prompt = ChatPromptTemplate.from_template(
            "Expand the following query with relevant keywords:\nQuery: {query}\nExpanded query:"
        )
        expand_chain = expand_prompt | llm
        expanded_query = expand_chain.invoke({"query": query}).content
        docs = vectorstore.similarity_search(expanded_query, k=4)
    else:  # multi-hop or high complexity
        hop1_prompt = ChatPromptTemplate.from_template(
            "What intermediate question should be answered first to help address this query?\nQuery: {query}\nIntermediate question:"
        )
        hop1_chain = hop1_prompt | llm
        intermediate_query = hop1_chain.invoke({"query": query}).content
        intermediate_docs = vectorstore.similarity_search(intermediate_query, k=2)
        docs = vectorstore.similarity_search(query, k=3)
        docs.extend(intermediate_docs)

    context = "\n\n".join([doc.page_content for doc in docs])

    # Generate response
    response_prompt = ChatPromptTemplate.from_template(
        "You are an AI assistant tasked with answering questions based on the provided context. "
        "The retrieval strategy was adapted based on the query's complexity and type. "
        "Please provide a comprehensive answer to the question, using the context when relevant "
        "and your general knowledge when necessary.\n\n"
        "Query complexity: {complexity}\n"
        "Query type: {query_type}\n"
        "Retrieval strategy: {retrieval_strategy}\n"
        "Context:\n{context}\n\n"
        "Question: {query}\n"
        "Answer:"
    )
    response_chain = response_prompt | llm
    try:
        response = response_chain.invoke({
            "complexity": complexity,
            "query_type": query_type,
            "retrieval_strategy": retrieval_strategy,
            "context": context,
            "query": query
        })
        final_answer = response.content
    except Exception as e:
        print(f"Error generating response: {e}")
        final_answer = "I apologize, but I encountered an error while generating the response."

    return {
        "query": query,
        "complexity": complexity,
        "query_type": query_type,
        "retrieval_strategy": retrieval_strategy,
        "final_answer": final_answer,
        "retrieved_context": context
    }

### Step 4: Create chunk of web data to Chroma Vector Store

In [12]:
llm = ChatGroq(
    model="llama3-8b-8192",
    temperature=0.5
)

# Load and process data
url = "https://en.wikipedia.org/wiki/Artificial_intelligence"
chunks = load_and_process_data(url)

# Create vector store
vectorstore = create_vector_store(chunks)

  from tqdm.autonotebook import tqdm, trange


modules.json:   0%|          | 0.00/255 [00:00<?, ?B/s]

config_sentence_transformers.json:   0%|          | 0.00/140 [00:00<?, ?B/s]

README.md:   0%|          | 0.00/71.6k [00:00<?, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/120 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/2.06k [00:00<?, ?B/s]

configuration_hf_nomic_bert.py:   0%|          | 0.00/1.96k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/nomic-ai/nomic-bert-2048:
- configuration_hf_nomic_bert.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


modeling_hf_nomic_bert.py:   0%|          | 0.00/103k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/nomic-ai/nomic-bert-2048:
- modeling_hf_nomic_bert.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


model.safetensors:   0%|          | 0.00/547M [00:00<?, ?B/s]



tokenizer_config.json:   0%|          | 0.00/1.19k [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/711k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/695 [00:00<?, ?B/s]

1_Pooling%2Fconfig.json:   0%|          | 0.00/286 [00:00<?, ?B/s]

### Step 5: Run Self-Adaptive RAG

This implementation shows the key parts of Self-Adaptive RAG:

1. Dynamic assessment of query complexity
2. Adaptation of retrieval strategy based on query complexity
3. Use of different techniques (standard retrieval, query expansion, HyDE) depending on the query

In [13]:
# Example queries
queries = [
      "What is AI?",
      "How does machine learning contribute to AI development?",
      "Discuss the ethical implications of AI in autonomous weapon systems and their potential impact on international relations."
]

# Run Self-Adaptive RAG for each query
for query in queries:
  print(f"\nQuery: {query}")
  result = self_adaptive_rag(query, vectorstore, llm)
  print(f"Complexity: {result['complexity']}")
  print(f"Query Type: {result['query_type']}")
  print(f"Retrieval Strategy: {result['retrieval_strategy']}")
  print("Final Answer:")
  print(result["final_answer"])
  print("\nRetrieved Context (first 300 characters):")
  print(result["retrieved_context"][:300] + "...")


Query: What is AI?
Error assessing query: invalid literal for int() with base 10: ''
Complexity: 3
Query Type: unknown
Retrieval Strategy: standard
Final Answer:
Based on the provided context, Artificial Intelligence (AI) is defined as "intelligence exhibited by machines, particularly computer systems." It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. AI agents are software entities designed to perceive their environment, make decisions, and take actions autonomously to achieve specific goals, and are used in various applications such as virtual assistants, chatbots, autonomous vehicles, game-playing systems, and industrial robotics.

Retrieved Context (first 300 characters):
Glossary
Glossary
vte
Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines,