# Agentic RAG

This project demonstrates an agentic Retrieval-Augmented Generation (RAG) system that combines LLM-driven agents with vector-based semantic search to answer complex queries. The system breaks a question into subtasks, generates targeted search queries for each subtask, and reasons over the retrieved information to produce a final answer.

Core elements include:

- **Embedding Model (SentenceTransformer)**: Converts text into dense vectors for effective similarity matching.
- **Pinecone Vector Database**: Provides fast and scalable retrieval of relevant document embeddings.
- **Shivaay Agentic Pipeline**: A set of AI agents that plan, search, retrieve, and reason in a modular fashion to produce precise and context-aware answers.

Compared with a single retrieve-then-answer pass, this agentic approach issues several targeted searches per question, which suits multi-part queries in applications such as intelligent assistants and research aids.


### Step 1: Install Required Libraries

Before running the code, ensure you have the necessary Python libraries installed. You can install them using pip:

```bash
pip install requests
pip install pinecone
pip install sentence-transformers
```


### Step 2: Imports and Configuration

We import the necessary libraries:
- `requests`: To make HTTP requests (e.g., calling external APIs).
- `time`: For handling delays or timing operations.
- `Pinecone` and `ServerlessSpec` from `pinecone`: To interact with the Pinecone vector database.
- `SentenceTransformer` from `sentence_transformers`: To generate text embeddings.

We also define important configuration values:
- `PINECONE_API_KEY`: Authentication key for accessing Pinecone.
- `SHIVAAY_API_KEY`: API key for authenticating with the Shivaay AI service.
- `SHIVAAY_API_URL`: The endpoint URL to send chat completion requests.
- `INDEX_NAME`: The name of the Pinecone index we'll use or create.
- `EMBEDDING_MODEL`: Specifies the pre-trained model used for generating embeddings.

This sets up everything needed to connect to external services and begin working with text embeddings and vector search.
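
Hard-coded keys in a notebook are easy to leak when the file is shared. As an alternative, here is a minimal sketch that reads settings from environment variables. It assumes the variables are exported under the same names as the constants above; `load_setting` is a hypothetical helper, not part of the original notebook.

```python
import os


def load_setting(name, default=None):
    """Read a configuration value from the environment.

    Hypothetical helper: variable names are assumed to mirror the
    constants used in this notebook (for example PINECONE_API_KEY).
    """
    value = os.environ.get(name, default)
    if value is None:
        raise RuntimeError(f"Missing required setting: {name}")
    return value


# A non-secret setting can carry a default; secrets should not.
EMBEDDING_MODEL = load_setting(
    "EMBEDDING_MODEL", "sentence-transformers/all-mpnet-base-v2"
)
```

With this in place, `PINECONE_API_KEY = load_setting("PINECONE_API_KEY")` would fail fast with a clear message if the key has not been exported.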

```python
import requests
import time
from pinecone import Pinecone, ServerlessSpec
from sentence_transformers import SentenceTransformer

# Replace the placeholders below with your own credentials and index name.
PINECONE_API_KEY = "YOUR-PINE-API-KEY"
SHIVAAY_API_KEY = "YOUR-SHIVAAY-API-KEY"
SHIVAAY_API_URL = "https://api.futurixai.com/api/shivaay/v1/chat/completions"
INDEX_NAME = "YOUR-PINE-INDEX-NAME"
# all-mpnet-base-v2 produces 768-dimensional embeddings.
EMBEDDING_MODEL = "sentence-transformers/all-mpnet-base-v2"
```

### Step 3: Pinecone Initialization and Model Loading

We initialize the Pinecone vector database and load the embedding model:

- `Pinecone(api_key=PINECONE_API_KEY)`: Authenticates with the Pinecone service using our API key.
- `pc.list_indexes().names()`: Lists all existing indexes. If our target index (`INDEX_NAME`) does not exist, we create it.
- `pc.create_index(...)`: Creates a new index with:
  - `name`: The index name.
  - `dimension`: Set to 768 to match the embedding vector size of our model.
  - `metric`: We use cosine similarity for measuring vector similarity.
  - `ServerlessSpec`: Defines the serverless environment configuration (cloud and region).
- `pc.Index(INDEX_NAME)`: Connects to the index for further operations like upserting or querying.
- `SentenceTransformer(EMBEDDING_MODEL)`: Loads the specified embedding model (here `all-mpnet-base-v2`, which produces 768-dimensional vectors) to convert documents or queries into numerical vectors.

This step ensures our vector storage (Pinecone) and embedding model are ready for use.
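
To make the `metric` setting concrete, here is a small sketch of cosine similarity in plain Python. It only illustrates the formula; Pinecone computes the metric server-side, and the function name `cosine_similarity` is chosen here for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # same direction: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))  # orthogonal: 0.0
```

The dimension check mirrors the constraint on the index: a query vector must have the same length as the `dimension` the index was created with. In the notebook, `model.get_sentence_embedding_dimension()` can be compared against that value before any vectors are upserted.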


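```python
pc = Pinecone(api_key=PINECONE_API_KEY)

# Create the index only if it does not already exist.
if INDEX_NAME not in pc.list_indexes().names():
    pc.create_index(
        name=INDEX_NAME,
        dimension=768,  # must match the embedding size of EMBEDDING_MODEL
        metric="cosine",
        spec=ServerlessSpec(cloud="YOUR-PINE-API-CLOUD", region="YOUR-PINE-API-REGION")
    )
    # Wait until the newly created index is ready to accept requests.
    while not pc.describe_index(INDEX_NAME).status["ready"]:
        time.sleep(1)
index = pc.Index(INDEX_NAME)

model = SentenceTransformer(EMBEDDING_MODEL)
```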

### Step 4: Shivaay Agents for Task-Oriented Processing

This step defines a series of specialized agents that break down a complex question, retrieve relevant documents, and generate an intelligent response using the Shivaay LLM API.

- **`call_shivaay(messages, temperature, max_tokens)`**:
  - Sends the message history (`messages`) to the Shivaay API along with the temperature and token-limit settings.
  - Returns the content of the first choice in the API response.

- **`planning_agent(user_question)`**:
  - Breaks the user’s question into logical subtasks using the LLM.
  - Useful for complex queries that require multiple steps.

- **`search_agent(subtask)`**:
  - Takes a subtask and asks the LLM to generate 2–3 focused search queries.
  - These queries are used to find relevant information in the vector database.

- **`retrieval_agent(queries)`**:
  - Converts each search query to an embedding vector using the model.
  - Searches the Pinecone index with the vector and retrieves the top 3 matching document snippets (`top_k=3`) together with their metadata.
  - Formats the search results for downstream use.

- **`reasoning_agent(question, context)`**:
  - Takes the original user question and the retrieved context as input.
  - Sends them to the LLM to generate a final answer based on reasoning over the context.

These agents work together to mimic a modular RAG (Retrieval-Augmented Generation) pipeline.
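
Because every agent depends on a network call to the LLM, a transient failure in any step stops the whole pipeline. The following is a minimal sketch of a retry wrapper with exponential backoff. `with_retries` and its parameters are hypothetical and not part of the Shivaay API or the original notebook.

```python
import time


def with_retries(func, attempts=3, base_delay=1.0,
                 retry_on=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds or the attempts run out.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between tries.
    `sleep` is injectable so the wrapper can be tested without waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
```

In the notebook this could wrap the helper as `with_retries(lambda: call_shivaay(messages), retry_on=(requests.RequestException,))`, so that only network-level errors are retried.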


```python
def call_shivaay(messages, temperature=0.3, max_tokens=500):
    """Send a chat completion request to Shivaay and return the reply text."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {SHIVAAY_API_KEY}"
    }
    payload = {
        "model": "shivaay",
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens
    }
    # A timeout keeps the notebook from hanging if the API does not respond.
    response = requests.post(SHIVAAY_API_URL, headers=headers, json=payload, timeout=30)
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

def planning_agent(user_question):
    """Ask the LLM to break the question into logical steps."""
    messages = [
        {"role": "system", "content": "Break down the user question into logical steps."},
        {"role": "user", "content": user_question}
    ]
    return call_shivaay(messages)

def search_agent(subtask):
    """Ask the LLM for search queries and return them as a list of strings."""
    messages = [
        {"role": "system", "content": "Generate 2-3 relevant search queries."},
        {"role": "user", "content": subtask}
    ]
    raw = call_shivaay(messages)
    # One query per non-empty line, with leading bullet dashes removed.
    return [q.strip("- ").strip() for q in raw.split("\n") if q.strip()]

def retrieval_agent(queries):
    """Embed each query, search Pinecone, and format the matched snippets."""
    results = []
    for query in queries:
        query_embedding = model.encode(query).tolist()
        pinecone_results = index.query(
            vector=query_embedding,
            top_k=3,
            include_metadata=True
        )
        for match in pinecone_results.matches:
            results.append(f"Relevant document for '{query}': {match.metadata['text']}")
    return "\n".join(results)

def reasoning_agent(question, context):
    """Answer the question using the retrieved context as the system message."""
    messages = [
        {"role": "system", "content": f"Context:\n{context}"},
        {"role": "user", "content": question}
    ]
    return call_shivaay(messages, temperature=0.2)
```
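
When several queries hit the same document, `retrieval_agent` adds that snippet to the context once per query. One possible refinement, sketched below, keeps a single entry per document ID and prefers the highest score. The dict layout with `id`, `score`, and `text` keys is an assumed simplification of Pinecone's match objects, and the scores are illustrative values, not real query results.

```python
def dedupe_matches(matches):
    """Keep one entry per document ID, preferring the highest score.

    `matches` is assumed to be an iterable of dicts with "id", "score",
    and "text" keys (a simplified stand-in for Pinecone match objects).
    """
    best = {}
    for match in matches:
        current = best.get(match["id"])
        if current is None or match["score"] > current["score"]:
            best[match["id"]] = match
    # Highest-scoring documents first.
    return sorted(best.values(), key=lambda m: m["score"], reverse=True)


# Illustrative input: two queries both matched doc2.
matches = [
    {"id": "doc2", "score": 0.62, "text": "The Army participates in UN peacekeeping missions."},
    {"id": "doc3", "score": 0.48, "text": "Modernization includes indigenous weapons development."},
    {"id": "doc2", "score": 0.71, "text": "The Army participates in UN peacekeeping missions."},
]
for match in dedupe_matches(matches):
    print(match["id"], match["score"])
```

Deduplicating before the reasoning step keeps the context shorter, which matters because the context is sent to the LLM as a single system message.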

### Step 5: Embedding and Indexing Custom Documents

In this step, we manually define a small set of knowledge documents about the Indian Army, which will be indexed into Pinecone for semantic search.

- **`documents`**: A list of short factual sentences related to the Indian Army.

- **Embedding**:
  - We use the `SentenceTransformer` model (`model.encode()`) to convert each document into a high-dimensional vector representation that captures its semantic meaning.

- **Indexing into Pinecone**:
  - We prepare the vectors in a format accepted by Pinecone (each with an ID, the vector values, and metadata).
  - `index.upsert()` inserts or updates these vectors into the Pinecone index for retrieval.

Once this step is completed, the semantic search system can match user queries with these documents based on meaning rather than exact keywords.
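
For larger document sets, two adjustments are often useful: IDs derived from the content, so that re-running the cell with a reordered list does not overwrite different documents, and upserts sent in batches. The sketch below shows both under stated assumptions. `stable_id`, `build_vectors`, and `batched` are hypothetical helpers, and the batch size of 100 is an arbitrary illustrative default.

```python
import hashlib


def stable_id(text):
    """Derive a deterministic ID from the document text (assumed scheme)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def build_vectors(documents, embeddings):
    """Pair each document with its embedding in the upsert record format."""
    return [
        {"id": stable_id(doc), "values": list(emb), "metadata": {"text": doc}}
        for doc, emb in zip(documents, embeddings)
    ]


def batched(items, batch_size=100):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

With these helpers, indexing could look like `for batch in batched(build_vectors(documents, embeddings)): index.upsert(vectors=batch)`.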


```python
documents = [
    "The Indian Army has over 1.4 million active personnel.",
    "It was established on 15th August 1947.",
    "The Army participates in UN peacekeeping missions.",
    "Modernization includes indigenous weapons development."
]

# Encode all documents in one call; each embedding becomes a plain list.
embeddings = model.encode(documents).tolist()
# Store the original text as metadata so it can be returned with each match.
vectors = [{"id": f"doc{i}", "values": emb, "metadata": {"text": doc}}
          for i, (doc, emb) in enumerate(zip(documents, embeddings))]
index.upsert(vectors=vectors)

print("✅ Documents indexed!")
```

✅ Documents indexed!


### Step 6: Interactive QA Pipeline Execution

This step executes the full question-answering pipeline using the Shivaay-powered agents:

- **User Input**:
  - The user is prompted to enter a complex, multi-part question.
  
- **Planning Agent**:
  - The input is sent to the `planning_agent`, which breaks it down into logical subtasks.
  - These subtasks are extracted as a list by splitting the plan on line breaks.

- **Search Agent and Retrieval**:
  - For each subtask, the `search_agent` generates multiple semantic search queries.
  - These queries are sent to the `retrieval_agent`, which searches the Pinecone vector index and retrieves the top relevant document snippets.
  - All retrieved contexts are collected for the final reasoning step.

- **Reasoning Agent**:
  - The full set of retrieved contexts and the original question are sent to the `reasoning_agent`, which generates a coherent, synthesized answer using Shivaay AI.

This pipeline runs once per question: it plans the subtasks, searches for relevant knowledge, and reasons over the retrieved documents to generate a context-aware answer.
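
LLM plans usually mix introductory sentences with numbered or bulleted steps, and a plain line split treats every non-empty line as a subtask. A stricter parser is sketched below. It keeps only lines that start with a bullet or a number, and strips Markdown bold markers and trailing colons. `parse_list_items` is a hypothetical helper, and the accepted bullet styles are an assumption about the LLM's output format.

```python
import re

# A list item starts with "-", "*", or "•", or with a number followed by
# "." or ")", and then at least one whitespace character.
_ITEM_PREFIX = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+")


def parse_list_items(raw):
    """Return the text of each bulleted or numbered line in `raw`."""
    items = []
    for line in raw.splitlines():
        match = _ITEM_PREFIX.match(line)
        if not match:
            continue  # skip preamble and other prose lines
        text = line[match.end():].strip().strip("*").strip().rstrip(":")
        if text:
            items.append(text)
    return items


plan = (
    "To answer, we can break it down:\n"
    "\n"
    "1. **Historical Context:**\n"
    "   - Understand the role.\n"
    "2) Early contributions\n"
)
print(parse_list_items(plan))
```

This sketch still returns sub-bullets as separate items. Whether they should be merged into their parent step depends on how fine-grained the searches need to be.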


```python
user_question = input("💬 Enter your complex question: ")

# 1. Plan: ask the LLM to break the question into steps.
print("\n🧠 Planning agent thinking...")
plan = planning_agent(user_question)
print("✅ Plan:\n", plan)

# Every non-empty line of the plan is treated as one subtask.
subtasks = [line.strip("- ").strip() for line in plan.split("\n") if line.strip()]
all_contexts = []

# 2. Search and retrieve: generate queries per subtask and collect snippets.
for i, task in enumerate(subtasks):
    print(f"\n🔍 Processing subtask {i+1}: {task}")
    queries = search_agent(task)
    print("Generated Queries:", queries)

    context = retrieval_agent(queries)
    all_contexts.append(context)
    print(f"📚 Retrieved {len(context.splitlines())} documents")

# 3. Reason: answer the original question over all retrieved context.
print("\n🧠 Reasoning agent compiling final answer...")
final_context = "\n\n".join(all_contexts)
answer = reasoning_agent(user_question, final_context)

print("\n📝 Final Answer:\n", answer)
```
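
To exercise the control flow without calling Pinecone or Shivaay, the same steps can be wrapped in a function that receives the four agents as arguments. This is a sketch under the assumption that each agent keeps the input and output shapes described above. `run_pipeline` and the stub agents are hypothetical.

```python
def run_pipeline(question, plan_fn, search_fn, retrieve_fn, reason_fn):
    """Plan, search, retrieve, and reason, using injected agent functions."""
    plan = plan_fn(question)
    subtasks = [line.strip("- ").strip() for line in plan.split("\n") if line.strip()]
    contexts = []
    for task in subtasks:
        queries = search_fn(task)
        context = retrieve_fn(queries)
        if context:  # skip subtasks that retrieved nothing
            contexts.append(context)
    return reason_fn(question, "\n\n".join(contexts))


# Stub agents for a dry run; they return fixed strings instead of calling APIs.
def stub_plan(question):
    return "- find peacekeeping role\n- find modernization efforts"

def stub_search(task):
    return [task + " query"]

def stub_retrieve(queries):
    return "\n".join(f"doc for {q}" for q in queries)

def stub_reason(question, context):
    return f"{question} | {context}"


print(run_pipeline("Q", stub_plan, stub_search, stub_retrieve, stub_reason))
```

Passing `planning_agent`, `search_agent`, `retrieval_agent`, and `reasoning_agent` in place of the stubs would run the real pipeline through the same code path.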

💬 Enter your complex question: How has the Indian Army contributed to global peace efforts, and how does this align with its modernization initiatives post-independence?

🧠 Planning agent thinking...
✅ Plan:
 To address the question about the Indian Army's contributions to global peace efforts and how these align with its modernization initiatives post-independence, we can break it down into several logical steps:

1. **Historical Context of Indian Army Post-Independence:**
   - Understand the role of the Indian Army during the partition of India in 1947.
   - Examine the initial challenges faced by the newly formed Indian Army in terms of infrastructure, manpower, and equipment.

2. **Early Contributions to Global Peace Efforts:**
   - Identify early instances where the Indian Army participated in United Nations (UN) peacekeeping missions.
   - Discuss the significance of these missions in terms of international relations and the image of India as a peace-loving nation.

3. **Moderniz