# RAG (Retrieval-Augmented Generation) Agent with Anthropic Claude

This notebook implements a RAG system using Anthropic's Claude model with tool-use capabilities. The agent can search through vectorized documents using Pinecone and provide contextual responses based on the retrieved information.

## Key Features:
- Semantic search using Pinecone vector database
- Document retrieval and reading capabilities
- Tool-use framework for structured interactions

## Requirements:
- Anthropic API key
- OpenAI API key (for embeddings)
- Pinecone API key
- Pre-populated Pinecone index (create it in *notebooks/chatbot/vector_store.ipynb*)
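
Before running the cells below, it can help to confirm that the three keys are actually set. The following is a minimal sketch, not part of the original notebook: `missing_env_vars` is a hypothetical helper, and the variable names assume the notebook's `os.getenv` calls plus `ANTHROPIC_API_KEY`, which `anthropic.Anthropic()` reads by default.

```python
import os

# Names the notebook relies on. ANTHROPIC_API_KEY is picked up implicitly by
# anthropic.Anthropic(); the other two are passed explicitly via os.getenv.
REQUIRED_ENV_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "PINECONE_API_KEY")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required API keys are set.")
```

Call it after `dotenv.load_dotenv()` so that values from a local `.env` file are included in the check.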


# Two RAG Implementation Examples

### 1. Full Document Retrieval Implementation
- First searches through vector embeddings to identify relevant documents
- Then retrieves and reads the complete document content
- Provides comprehensive responses based on both the vector search and full document context
- Tends to give more accurate and detailed responses, especially for code implementations

### 2. Vector-Only Search Implementation
- Only uses vector embeddings for information retrieval
- Faster and *cheaper* but less detailed as it doesn't access the full document content
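
This notebook selects between the two modes through the wording of the user query. An alternative, sketched here as an assumption rather than as the notebook's method, is to expose only the search tool when vector-only behavior is wanted, so the model cannot request a full-document read. `tools_for_mode` is a hypothetical helper, and `ALL_TOOLS` is a stand-in for the notebook's tool list that keeps only the names.

```python
# Minimal stand-ins for the notebook's tool definitions; only the names matter here.
ALL_TOOLS = [
    {"name": "search_pinecone"},
    {"name": "read_full_document"},
]


def tools_for_mode(tools, vector_only):
    """Return the tools to expose for the chosen retrieval mode."""
    if vector_only:
        # Vector-only mode: drop everything except the Pinecone search tool.
        return [tool for tool in tools if tool["name"] == "search_pinecone"]
    # Full-document mode: expose every tool unchanged.
    return list(tools)


print([tool["name"] for tool in tools_for_mode(ALL_TOOLS, vector_only=True)])
```

The returned list would be passed as the `tools` argument of the request in place of the full list.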


In [7]:
from pinecone import Pinecone, ServerlessSpec
import os
import dotenv
import json
from tqdm import tqdm
from openai import OpenAI
import anthropic
import tiktoken

# Load environment variables
dotenv.load_dotenv()

# Initialize clients
pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))
client_openai = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
client_anthropic = anthropic.Anthropic()

# Initialize Pinecone index
index = pc.Index("showcase-index")

# Define Available Tools
Configure the tool definitions that Claude can use for searching and retrieving information.


In [None]:

TOOLS = [
    {
        "name": "search_pinecone",
        "description": "find relevant documents in the Pinecone index for openai or anthropic questions",
        "input_schema": {
            "type": "object",
            "properties": {
                "questions_to_search": {
                    "type": "string",
                    "description": "the questions to search for in the Pinecone index",
                }
            },
            "required": ["questions_to_search"],
        },
    },
    {
        "name": "read_full_document",
        "description": "read the full content of a document given its relative path",
        "input_schema": {
            "type": "object",
            "properties": {
                "file_path": {
                    "type": "string",
                    "description": "the relative path of the document to read",
                },
                "questions_to_search": {
                    "type": "string",
                    "description": "full questions to be answered from the context of the document",
                },
            },
            "required": ["file_path"],
        },
    }
]
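
Because the model generates the tool inputs, it can be useful to check them against the schema before dispatching. The sketch below is an illustration only: `validate_tool_input` is a hypothetical helper that covers just the `required` list and string-typed properties used in this notebook, and it is not a full JSON Schema validator.

```python
def validate_tool_input(input_schema, tool_input):
    """Return a list of problems found in tool_input; an empty list means none were found."""
    problems = []
    # Every key listed under "required" must be present.
    for key in input_schema.get("required", []):
        if key not in tool_input:
            problems.append(f"missing required field: {key}")
    # Properties declared as strings must actually be strings when supplied.
    for key, spec in input_schema.get("properties", {}).items():
        if key in tool_input and spec.get("type") == "string":
            if not isinstance(tool_input[key], str):
                problems.append(f"field {key} should be a string")
    return problems


# Same shape as the search_pinecone schema above.
search_schema = {
    "type": "object",
    "properties": {"questions_to_search": {"type": "string"}},
    "required": ["questions_to_search"],
}

print(validate_tool_input(search_schema, {"questions_to_search": "tool use"}))
print(validate_tool_input(search_schema, {}))
```

Returning the problem list as the tool result, instead of raising, gives the model a chance to correct its input on the next turn.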


# Core Utility Functions
Essential functions for semantic search, document reading, and tool processing.


In [None]:


def semantic_search(query: str, index, client, top_k=3):
    """Perform semantic search using the query"""
    query_embedding = client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )
    
    return index.query(
        vector=query_embedding.data[0].embedding,
        top_k=top_k,
        include_metadata=True
    )

def read_full_document(file_path: str) -> str:
    """Read the full content of a document"""
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            return f.read()
    except Exception as e:
        return f"Error reading file {file_path}: {e}"

def process_tool_call(tool_name: str, tool_input: dict, index, client_openai):
    """Process a tool call and return the result"""
    if tool_name == "search_pinecone":
        search_results = semantic_search(
            tool_input['questions_to_search'], 
            index, 
            client_openai
        )
        return {
            "matches": [
                {
                    "id": match.id,
                    "score": match.score,
                    "metadata": match.metadata
                }
                for match in search_results['matches']
            ]
        }
    elif tool_name == "read_full_document":
        return read_full_document(tool_input['file_path'])
    else:
        return f"Unknown tool: {tool_name}"
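
`read_full_document` opens whatever path the model supplies. A more defensive variant, sketched here under the assumption that all indexed documents live under one base directory, refuses paths that resolve outside that directory. `read_document_within` and its `base_dir` parameter are hypothetical and not part of the original notebook.

```python
from pathlib import Path


def read_document_within(base_dir, relative_path):
    """Read a file only if it resolves to a location inside base_dir."""
    base = Path(base_dir).resolve()
    target = (base / relative_path).resolve()
    try:
        # relative_to raises ValueError when target is not under base,
        # which covers "../" escapes and absolute paths.
        target.relative_to(base)
    except ValueError:
        return f"Error: {relative_path} is outside the allowed directory"
    try:
        return target.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as e:
        return f"Error reading file {relative_path}: {e}"
```

Like the original function, it returns an error string rather than raising, so the message can be passed back to the model as the tool result.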

# Main RAG Implementation
The `chat_with_rag` function orchestrates the entire RAG workflow using Claude's tool-use capabilities.


In [8]:
def chat_with_rag(
    query: str,
    client_anthropic,
    index,
    client_openai,
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024
):
    """
    Main function to handle the RAG workflow with tool use
    """
    print(f"\n{'='*50}\nProcessing Query: {query}\n{'='*50}")
    
    # Keep the system prompt in one place so every request in the loop uses it
    system_prompt = """
        You are a helpful assistant that can search the Pinecone index for relevant documents 
        and read the full content of a document given its relative path.
        Start by searching the Pinecone index for relevant documents.
        Then use the document to answer the user's question completely.
        """

    messages_history = [{"role": "user", "content": query}]

    # Initial request
    response = client_anthropic.messages.create(
        model=model,
        max_tokens=max_tokens,
        tools=TOOLS,
        system=system_prompt,
        messages=messages_history
    )

    while response.stop_reason == "tool_use":
        print("\nProcessing tool use request...")

        # Answer every tool_use block in the turn; the API expects one
        # tool_result for each tool_use id
        tool_results = []
        for tool_use in (block for block in response.content if block.type == "tool_use"):
            print(f"Tool requested: {tool_use.name}")
            result = process_tool_call(
                tool_use.name,
                tool_use.input,
                index,
                client_openai
            )
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": json.dumps(result) if isinstance(result, dict) else result
            })

        messages_history.extend([
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": tool_results}
        ])

        # Follow-up request, with the same system prompt as the initial one
        response = client_anthropic.messages.create(
            model=model,
            max_tokens=max_tokens,
            tools=TOOLS,
            system=system_prompt,
            messages=messages_history
        )

    final_response = next(
        (block.text for block in response.content if hasattr(block, "text")),
        None
    )
    
    print(f"\nFinal Response: {final_response}")
    return final_response
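
The loop in `chat_with_rag` runs for as long as the model keeps requesting tools. The sketch below shows one way to cap the number of rounds. It is an illustration under stated assumptions: `run_tool_loop`, `call_model`, and `run_tool` are hypothetical names, and the responses are `SimpleNamespace` stand-ins rather than real API objects, so the example runs without any API key.

```python
from types import SimpleNamespace


def run_tool_loop(first_response, call_model, run_tool, max_rounds=5):
    """Drive a tool-use loop until the model stops asking for tools or the cap is hit.

    call_model(tool_results) returns the next response and run_tool(name, input)
    returns a string. Both are stand-ins for the API request and process_tool_call.
    """
    response = first_response
    rounds = 0
    while response.stop_reason == "tool_use" and rounds < max_rounds:
        # Build one tool_result per tool_use block in this turn.
        tool_results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": run_tool(block.name, block.input),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        response = call_model(tool_results)
        rounds += 1
    return response, rounds


# Fake responses: one that always asks for a tool, one that ends the turn.
tool_block = SimpleNamespace(
    type="tool_use", id="t1", name="search_pinecone",
    input={"questions_to_search": "calculator"},
)
asks_for_tool = SimpleNamespace(stop_reason="tool_use", content=[tool_block])
final = SimpleNamespace(
    stop_reason="end_turn", content=[SimpleNamespace(type="text", text="done")]
)

response, rounds = run_tool_loop(asks_for_tool, lambda results: final, lambda name, args: "ok")
print(rounds, response.content[0].text)
```

If the cap is reached, the returned response still has `stop_reason == "tool_use"`, which the caller can treat as an incomplete answer.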




# Example Usage - Basic Query
Demonstrates vector search followed by full document retrieval.


In [9]:
if __name__ == "__main__":
    query = "Could you please provide me with a simple calculator agent code for Anthropic's tool_use?"
    response = chat_with_rag(
        query=query,
        client_anthropic=client_anthropic,
        index=index,
        client_openai=client_openai
    )

    print(response)


Processing Query: Could you please provide me with a simple calculator agent code for Anthropic's tool_use?

Processing tool use request...
Tool requested: search_pinecone

Processing tool use request...
Tool requested: read_full_document

Final Response: Based on my research, I found that Anthropic provides examples of calculator tool implementation in their cookbook. Let me provide you with a basic calculator tool code example that works with Anthropic's tool_use feature:

Here's a simple calculator tool implementation:

```python
from anthropic import Anthropic

# Initialize the Anthropic client
client = Anthropic()

# Define the calculator tool
calculator_tool = {
    "name": "calculator",
    "description": "A basic calculator that can perform arithmetic operations (addition, subtraction, multiplication, division)",
    "input_schema": {
        "type": "object",
        "properties": {
            "operation": {
                "type": "string",
                "enum": ["add", "
```

# Example Usage - Pinecone-Only Query
This example restricts retrieval to the Pinecone index, without full document retrieval, by stating the restriction in the query itself.

In [5]:
if __name__ == "__main__":
    query = "Could you please provide me with a simple calculator agent code for Anthropic's tool_use, when retreiving info, please only use the pinecone index and don't query from the document?"
    response = chat_with_rag(
        query=query,
        client_anthropic=client_anthropic,
        index=index,
        client_openai=client_openai
    )

    print(response)


Processing Query: Could you please provide me with a simple calculator agent code for Anthropic's tool_use , when retreiving info, please only use the pinecone index and don't query from the document?

Processing tool use request...
Tool requested: search_pinecone

Processing tool use request...
Tool requested: search_pinecone

Final Response: Based on the Anthropic documentation and search results, I can provide you with a template for implementing a simple calculator tool. Here's how you can define a calculator tool for Anthropic's tool_use:

```json
{
  "name": "calculator",
  "description": "A basic calculator that can perform arithmetic operations (addition, subtraction, multiplication, division)",
  "input_schema": {
    "type": "object",
    "properties": {
      "operation": {
        "type": "string",
        "enum": ["add", "subtract", "multiply", "divide"],
        "description": "The arithmetic operation to perform"
      },
      "operands": {
        "type": "array",
   