# Building an Intelligent RAG Agent with OpenAI's Responses API

## Introduction

In this tutorial, we'll build a Retrieval-Augmented Generation (RAG) agent using OpenAI's **Responses API**. Traditional RAG implementations require you to manage separate embedding services and vector databases. The Responses API, combined with OpenAI's hosted vector stores and built-in tools, provides a fully managed alternative that handles:

- **Document Processing**: Automatic parsing, chunking, and embedding
- **Vector Storage**: Managed vector stores with semantic search
- **Data Analysis**: Built-in code interpreter for analytics and visualizations
- **Seamless Integration**: Combine document retrieval with computational capabilities

### What You'll Build

By the end of this tutorial, you'll have created an intelligent analytics agent capable of:
- Analyzing multiple documents using file search (RAG)
- Processing data files with code interpreter
- Generating visualizations and insights
- Creating comprehensive analytics reports
- Answering questions based on your document corpus

### Why Responses API?

The Responses API (replacing the deprecated Assistants API) offers:
- **Simpler architecture**: No threads or assistants to manage
- **Per-request instructions**: Instructions are sent with each request rather than stored on a server-side assistant object
- **Conversation continuity**: Simple `previous_response_id` chaining
- **Unified interface**: One API for both RAG and code execution

Let's get started!

## Setup and Dependencies

First, let's install the required packages:

In [None]:
%pip install -q openai pandas matplotlib seaborn plotly

In [None]:
import os
import getpass

def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("OPENAI_API_KEY")

In [None]:
from openai import OpenAI
import time
import json
from IPython.display import display, HTML, Markdown
import pandas as pd
import numpy as np

# Initialize the OpenAI client
client = OpenAI()

print("OpenAI client initialized successfully!")
print("Ready to build your RAG agent with the Responses API!")

---

## Part 1: Setting Up Your Knowledge Base

### Understanding Vector Stores

Vector stores are the foundation of the RAG (Retrieval-Augmented Generation) approach. They:
- Store document embeddings for semantic search
- Enable efficient retrieval of relevant information
- Support up to 10,000 files per store
- Handle various file formats automatically
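
To make "semantic search" concrete, here is a minimal sketch of the idea behind retrieval: document chunks and the query are represented as vectors, and the chunks closest to the query by cosine similarity are returned. The three-dimensional vectors and the `top_k` helper are made up for illustration. The managed vector store computes real embeddings, and its actual ranking may differ.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunks, k=2):
    """Return the names of the k chunks most similar to the query vector."""
    scored = sorted(
        chunks.items(),
        key=lambda kv: cosine_similarity(query_vec, kv[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]

# Hypothetical toy embeddings (real ones have far more dimensions)
toy_chunks = {
    "pricing section": [0.9, 0.1, 0.0],
    "methodology section": [0.1, 0.9, 0.2],
    "appendix": [0.0, 0.2, 0.9],
}
query = [0.8, 0.3, 0.0]
print(top_k(query, toy_chunks))
```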

Let's create a vector store for our documents:

In [None]:
def create_knowledge_base(name="Analytics Knowledge Base", expiration_days=30):
    """
    Create a vector store for storing and retrieving documents.
    
    Args:
        name: Descriptive name for the vector store
        expiration_days: Days of inactivity before auto-deletion (cost management)
    
    Returns:
        Vector store object with id and metadata
    """
    vector_store = client.vector_stores.create(
        name=name,
        expires_after={
            "anchor": "last_active_at",
            "days": expiration_days
        }
    )
    
    print(f"✓ Created vector store: {name}")
    print(f"  ID: {vector_store.id}")
    print(f"  Auto-expires after {expiration_days} days of inactivity")
    
    return vector_store

# Create our knowledge base
knowledge_base = create_knowledge_base("Research Papers & Reports")

### Uploading Documents to Your Knowledge Base

Now let's create helper functions to upload and manage documents:

In [None]:
def upload_document_to_knowledge_base(file_path, vector_store_id):
    """
    Upload a document and add it to the knowledge base (vector store).
    
    Args:
        file_path: Path to the document file
        vector_store_id: ID of the target vector store
    
    Returns:
        Dictionary with file_id and status
    """
    file_name = os.path.basename(file_path)
    
    try:
        # Upload file to OpenAI
        print(f"Uploading {file_name}...")
        with open(file_path, 'rb') as file:
            file_response = client.files.create(
                file=file,
                purpose='assistants'
            )
        
        # Add to vector store
        print(f"Adding to knowledge base...")
        client.vector_stores.files.create(
            vector_store_id=vector_store_id,
            file_id=file_response.id
        )
        
        print(f"✓ Successfully added {file_name} to knowledge base")
        return {"file_id": file_response.id, "file_name": file_name, "status": "success"}
        
    except Exception as e:
        print(f"✗ Error uploading {file_name}: {str(e)}")
        return {"file_name": file_name, "status": "failed", "error": str(e)}

def upload_multiple_documents(file_paths, vector_store_id):
    """
    Upload multiple documents to the knowledge base.
    
    Args:
        file_paths: List of file paths to upload
        vector_store_id: ID of the target vector store
    
    Returns:
        List of upload results
    """
    results = []
    for file_path in file_paths:
        result = upload_document_to_knowledge_base(file_path, vector_store_id)
        results.append(result)
    
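    # Poll until the vector store reports no files still being processed
    print("\nWaiting for document processing...")
    while client.vector_stores.retrieve(vector_store_id).file_counts.in_progress > 0:
        time.sleep(2)
    print("✓ Document processing finished!")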
    
    return results
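
Because each upload returns a small status dictionary, it can help to separate successes from failures before querying. The following is a sketch only: `summarize_uploads` is a hypothetical helper, and the example IDs and file names are invented to match the shape returned by the upload function above.

```python
def summarize_uploads(results):
    """Split upload results into successful file IDs and failed file names."""
    succeeded = [r["file_id"] for r in results if r["status"] == "success"]
    failed = {
        r["file_name"]: r.get("error", "unknown error")
        for r in results
        if r["status"] == "failed"
    }
    return succeeded, failed

# Hypothetical results in the same shape the upload helper returns
example_results = [
    {"file_id": "file-abc", "file_name": "report.pdf", "status": "success"},
    {"file_name": "missing.pdf", "status": "failed", "error": "No such file"},
]
ok_ids, errors = summarize_uploads(example_results)
print(ok_ids)
print(errors)
```

Keeping the successful file IDs around also makes later cleanup easier.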

### Example: Upload Sample Documents

Let's upload some sample documents to our knowledge base. This example assumes your documents live in a local `pdfs` folder:

In [None]:
# Example: Upload documents from your pdfs folder
# Modify these paths to match your actual documents

sample_documents = [
    './pdfs/arxiv_paper_1.pdf',  # Replace with your actual file paths
    # Add more documents as needed
]

# Upload documents (uncomment when you have actual files)
# uploaded_files = upload_multiple_documents(sample_documents, knowledge_base.id)

# For demonstration, let's show the structure
print("Example upload structure:")
print("uploaded_files = upload_multiple_documents(['file1.pdf', 'file2.pdf'], knowledge_base.id)")

---

## Part 2: Building the RAG Agent with Responses API

### Understanding the Responses API Approach

Compared with the older Assistants API, the Responses API needs far fewer steps:

**Old Way (Assistants API):**
```text
1. Create assistant
2. Create thread
3. Add message to thread
4. Create run
5. Poll for completion
6. Retrieve messages
```

**New Way (Responses API):**
```python
response = client.responses.create(
    model="gpt-4o",
    input="Your question",
    tools=[{"type": "file_search", "vector_store_ids": [vs_id]}]
)
```
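
A follow-up turn reuses the same call and adds `previous_response_id`. The sketch below only assembles the keyword arguments so that the difference between a first and a follow-up request is visible. `build_request` is a hypothetical helper, and the `vs_example` and `resp_example` IDs are placeholders.

```python
def build_request(question, vector_store_id, previous_response_id=None, model="gpt-4o"):
    """Assemble keyword arguments for client.responses.create()."""
    params = {
        "model": model,
        "input": question,
        "tools": [{"type": "file_search", "vector_store_ids": [vector_store_id]}],
    }
    # Only follow-up turns carry a reference to the earlier response
    if previous_response_id is not None:
        params["previous_response_id"] = previous_response_id
    return params

first = build_request("What is the paper about?", "vs_example")
follow_up = build_request(
    "Expand on the second point.",
    "vs_example",
    previous_response_id="resp_example",
)
print("previous_response_id" in first, follow_up["previous_response_id"])
# With a configured client: response = client.responses.create(**first)
```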

Let's create our RAG agent class:

In [None]:
class RAGAgent:
    """
    A RAG (Retrieval-Augmented Generation) agent using OpenAI's Responses API.
    
    This agent can:
    - Search through documents in a vector store
    - Perform data analysis with code interpreter
    - Maintain conversation context
    - Generate comprehensive reports
    """
    
    def __init__(self, vector_store_id, model="gpt-4o"):
        self.vector_store_id = vector_store_id
        self.model = model
        self.conversation_history = []
        self.last_response_id = None
        
    def query(self, question, use_code_interpreter=False, instructions=None):
        """
        Query the knowledge base and get an AI-generated response.
        
        Args:
            question: The user's question
            use_code_interpreter: Whether to enable code interpreter for analysis
            instructions: Custom instructions for this query (optional)
        
        Returns:
            Response object from OpenAI
        """
        # Default instructions for RAG agent
        if instructions is None:
            instructions = """You are an expert research analyst and data scientist.
            Use the file_search tool to find relevant information from the knowledge base.
            Provide accurate, well-sourced answers based on the documents.
            If you're unsure or the information isn't in the documents, say so clearly.
            Always cite your sources when possible."""
        
        # Build tools list
        tools = [{
            "type": "file_search",
            "vector_store_ids": [self.vector_store_id],
            "max_num_results": 5
        }]
        
        if use_code_interpreter:
            tools.append({
                "type": "code_interpreter",
                "container": {"type": "auto"}
            })
        
        # Create response request
        request_params = {
            "input": question,
            "model": self.model,
            "instructions": instructions,
            "tools": tools
        }
        
        # Add previous response ID if continuing conversation
        if self.last_response_id:
            request_params["previous_response_id"] = self.last_response_id
        
        # Make the API call
        response = client.responses.create(**request_params)
        
        # Update conversation state
        self.last_response_id = response.id
        self.conversation_history.append({
            "question": question,
            "response_id": response.id
        })
        
        return response
    
    def display_response(self, response):
        """
        Display the response in a formatted way.
        
        Args:
            response: Response object from OpenAI
        """
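        for item in response.output:
            # Only message items carry content parts; tool-call items do not
            for content in getattr(item, 'content', None) or []:
                if hasattr(content, 'text'):
                    display(Markdown(content.text))
                # Files generated by code interpreter (e.g. charts) are
                # referenced as container_file_citation annotations
                for annotation in getattr(content, 'annotations', None) or []:
                    if getattr(annotation, 'type', None) == 'container_file_citation':
                        print(f"Generated file: {annotation.filename} (file_id: {annotation.file_id})")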
    
    def extract_citations(self, response):
        """
        Extract source citations from the response.
        
        Args:
            response: Response object from OpenAI
        
        Returns:
            Set of cited filenames
        """
        citations = set()
        
        for item in response.output:
            if hasattr(item, 'content'):
                for content in item.content:
                    if hasattr(content, 'annotations'):
                        for annotation in content.annotations:
                            if hasattr(annotation, 'filename'):
                                citations.add(annotation.filename)
        
        return citations
    
    def reset_conversation(self):
        """
        Reset the conversation history and start fresh.
        """
        self.last_response_id = None
        self.conversation_history = []
        print("Conversation reset. Starting fresh!")

print("RAGAgent class defined successfully!")

### Example: Using the RAG Agent

Now let's create an instance of our RAG agent and use it to query our knowledge base:

In [None]:
# Create the RAG agent
rag_agent = RAGAgent(vector_store_id=knowledge_base.id)

print("RAG Agent created and ready!")
print(f"Connected to knowledge base: {knowledge_base.id}")

In [None]:
# Example query 1: Simple question
question1 = "What are the main topics covered in the documents? Provide a summary."

print(f"Question: {question1}\n")
print("=" * 80)

response1 = rag_agent.query(question1)
rag_agent.display_response(response1)

print("\n" + "=" * 80)

# Show citations
citations = rag_agent.extract_citations(response1)
if citations:
    print("\nSources cited:")
    for citation in citations:
        print(f"  - {citation}")
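
Beyond listing sources, it can be useful to see how often each file is cited. The sketch below walks the same `output` → `content` → `annotations` structure as `extract_citations`, but counts occurrences. `count_citations` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in that mimics only the attributes the loop reads, not a real API response.

```python
from collections import Counter
from types import SimpleNamespace

def count_citations(response):
    """Count how many times each filename is cited in a response."""
    counts = Counter()
    for item in response.output:
        for content in getattr(item, "content", None) or []:
            for annotation in getattr(content, "annotations", None) or []:
                filename = getattr(annotation, "filename", None)
                if filename:
                    counts[filename] += 1
    return counts

# Stand-in object for illustration only
fake_response = SimpleNamespace(output=[
    SimpleNamespace(type="file_search_call"),
    SimpleNamespace(type="message", content=[
        SimpleNamespace(text="...", annotations=[
            SimpleNamespace(filename="paper_a.pdf"),
            SimpleNamespace(filename="paper_b.pdf"),
            SimpleNamespace(filename="paper_a.pdf"),
        ])
    ]),
])
print(count_citations(fake_response).most_common())
```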

In [None]:
# Example query 2: Follow-up question (uses conversation context)
question2 = "Can you provide more details about the first topic you mentioned?"

print(f"Follow-up Question: {question2}\n")
print("=" * 80)

response2 = rag_agent.query(question2)
rag_agent.display_response(response2)

print("\n" + "=" * 80)

---

## Part 3: Advanced Analytics with Code Interpreter

### Combining RAG with Data Analysis

One of the most powerful features is combining document retrieval with computational analysis. Let's see how:

In [None]:
# Example: Extract data from documents and visualize
analytics_query = """Based on the documents in the knowledge base, 
extract any numerical data, statistics, or metrics mentioned. 
Then create visualizations to illustrate the key findings.
Provide a comprehensive analysis with charts and insights."""

print(f"Analytics Query:\n{analytics_query}\n")
print("=" * 80)

# Enable code interpreter for this query
response = rag_agent.query(
    analytics_query,
    use_code_interpreter=True,
    instructions="""You are an expert data analyst and research scientist.
    Use file_search to find relevant data in the documents.
    Use code_interpreter to process, analyze, and visualize the data.
    Create clear, informative visualizations.
    Explain your analysis methodology and findings."""
)

rag_agent.display_response(response)

print("\n" + "=" * 80)

### Working with Data Files

You can also upload data files (CSV, Excel, JSON) directly for analysis:

In [None]:
# Create a sample dataset for demonstration
np.random.seed(42)
dates = pd.date_range('2024-01-01', periods=90, freq='D')

sample_data = pd.DataFrame({
    'date': dates,
    'revenue': np.random.randint(50000, 150000, 90) + np.arange(90) * 500,
    'costs': np.random.randint(20000, 80000, 90) + np.arange(90) * 200,
    'customers': np.random.randint(500, 2000, 90) + np.arange(90) * 10,
    'conversion_rate': np.random.uniform(0.02, 0.08, 90)
})

# Save to CSV
data_file_path = 'business_metrics.csv'
sample_data.to_csv(data_file_path, index=False)

print("Sample business metrics dataset created!")
print(f"\nFirst few rows:")
print(sample_data.head())
print(f"\nDataset shape: {sample_data.shape}")

In [None]:
def analyze_data_file(file_path, query, model="gpt-4o"):
    """
    Upload and analyze a data file using the Responses API.
    
    Args:
        file_path: Path to the data file
        query: Analysis request
        model: OpenAI model to use
    
    Returns:
        Tuple of (response object from OpenAI, ID of the uploaded file)
    """
    # Upload the file
    print(f"Uploading {os.path.basename(file_path)}...")
    with open(file_path, 'rb') as file:
        file_response = client.files.create(
            file=file,
            purpose='assistants'
        )
    
    print(f"File uploaded: {file_response.id}")
    
    instructions = """You are an expert business analyst and data scientist.
    Analyze the provided data file thoroughly.
    Create insightful visualizations.
    Identify trends, patterns, and anomalies.
    Provide actionable recommendations based on the data."""
    
    # Make the request. Code interpreter reads files from its container,
    # so the uploaded file is attached through the container's file_ids.
    response = client.responses.create(
        input=query,
        model=model,
        instructions=instructions,
        tools=[{
            "type": "code_interpreter",
            "container": {"type": "auto", "file_ids": [file_response.id]}
        }]
    )
    
    return response, file_response.id

# Analyze the sample data
analysis_request = """Please analyze this business metrics data:
1. Calculate key statistics and trends
2. Create visualizations for revenue, costs, and profit over time
3. Analyze customer growth and conversion rate trends
4. Identify any interesting patterns or anomalies
5. Provide strategic recommendations based on the data"""

print(f"\nAnalyzing data file...\n")
print("=" * 80)

data_response, file_id = analyze_data_file(data_file_path, analysis_request)

# Display results
for item in data_response.output:
    if hasattr(item, 'content'):
        for content in item.content:
            if hasattr(content, 'text'):
                display(Markdown(content.text))

print("\n" + "=" * 80)
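
Figures reported by the model are worth spot-checking locally. Here is a minimal sketch, using only the standard library and a made-up three-row CSV in place of `business_metrics.csv`, that computes total profit and the most profitable day for comparison with the agent's report. `profit_summary` is a hypothetical helper.

```python
import csv
import io

def profit_summary(csv_text):
    """Return (total_profit, date_of_highest_profit) for CSV text
    with date, revenue and costs columns."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    profits = {row["date"]: int(row["revenue"]) - int(row["costs"]) for row in rows}
    best_day = max(profits, key=profits.get)
    return sum(profits.values()), best_day

# Made-up rows for illustration; in the notebook, read business_metrics.csv instead
sample_csv = """date,revenue,costs
2024-01-01,100000,40000
2024-01-02,120000,70000
2024-01-03,90000,20000
"""
total, best = profit_summary(sample_csv)
print(total, best)
```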

---

## Part 4: Building a Complete Analytics Dashboard

Let's combine everything we've learned to create a comprehensive analytics workflow:

In [None]:
class AnalyticsDashboard:
    """
    A comprehensive analytics dashboard combining RAG and data analysis.
    """
    
    def __init__(self, vector_store_id, model="gpt-4o"):
        self.rag_agent = RAGAgent(vector_store_id, model)
        self.model = model
        self.reports = []
    
    def generate_executive_summary(self):
        """
        Generate an executive summary from the knowledge base.
        """
        query = """Generate an executive summary of the key findings and insights 
        from all documents in the knowledge base. Include:
        1. Main topics and themes
        2. Critical findings
        3. Key metrics and data points
        4. Strategic implications"""
        
        print("Generating Executive Summary...\n")
        response = self.rag_agent.query(query, use_code_interpreter=True)
        
        self.reports.append({
            "type": "executive_summary",
            "response_id": response.id
        })
        
        return response
    
    def analyze_trends(self):
        """
        Analyze trends from the documents.
        """
        query = """Identify and analyze key trends mentioned in the documents.
        Extract any time-series data or historical comparisons.
        Create visualizations showing trend progression.
        Provide insights on trend directions and implications."""
        
        print("Analyzing Trends...\n")
        response = self.rag_agent.query(query, use_code_interpreter=True)
        
        self.reports.append({
            "type": "trend_analysis",
            "response_id": response.id
        })
        
        return response
    
    def compare_documents(self):
        """
        Compare and contrast documents in the knowledge base.
        """
        query = """Compare and contrast the different documents in the knowledge base.
        Identify:
        1. Common themes and differences
        2. Contradictions or agreements
        3. Complementary information
        4. Gaps in coverage
        Create a comparison table or chart if appropriate."""
        
        print("Comparing Documents...\n")
        response = self.rag_agent.query(query, use_code_interpreter=True)
        
        self.reports.append({
            "type": "document_comparison",
            "response_id": response.id
        })
        
        return response
    
    def generate_recommendations(self):
        """
        Generate actionable recommendations based on all insights.
        """
        query = """Based on all the information in our knowledge base and our conversation,
        generate actionable recommendations. Include:
        1. Strategic priorities
        2. Tactical next steps
        3. Risk considerations
        4. Success metrics
        Organize recommendations by priority and feasibility."""
        
        print("Generating Recommendations...\n")
        response = self.rag_agent.query(query, use_code_interpreter=False)
        
        self.reports.append({
            "type": "recommendations",
            "response_id": response.id
        })
        
        return response
    
    def get_report_summary(self):
        """
        Get a summary of all generated reports.
        """
        print("Generated Reports:")
        for i, report in enumerate(self.reports, 1):
            print(f"{i}. {report['type'].replace('_', ' ').title()} (ID: {report['response_id']})")

print("AnalyticsDashboard class defined successfully!")

In [None]:
# Create the analytics dashboard
dashboard = AnalyticsDashboard(vector_store_id=knowledge_base.id)

print("Analytics Dashboard initialized!")
print("\nAvailable methods:")
print("  - generate_executive_summary()")
print("  - analyze_trends()")
print("  - compare_documents()")
print("  - generate_recommendations()")

In [None]:
# Example: Generate executive summary
# Uncomment when you have documents in your knowledge base

# summary_response = dashboard.generate_executive_summary()
# dashboard.rag_agent.display_response(summary_response)

print("Example usage:")
print("summary = dashboard.generate_executive_summary()")
print("dashboard.rag_agent.display_response(summary)")
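
The dashboard methods share one conversation, and `generate_recommendations` refers back to earlier turns, so the order of calls matters. One possible way to run them in sequence is sketched below. `run_full_report` and `StubDashboard` are hypothetical. The stub stands in for a real `AnalyticsDashboard` so the example runs without API calls.

```python
def run_full_report(dashboard):
    """Run each dashboard step in order and return {step_name: response}."""
    steps = [
        "generate_executive_summary",
        "analyze_trends",
        "compare_documents",
        "generate_recommendations",  # last, so it can draw on earlier turns
    ]
    results = {}
    for step in steps:
        try:
            results[step] = getattr(dashboard, step)()
        except Exception as exc:
            # Record the failure and continue with the remaining steps
            results[step] = None
            print(f"{step} failed: {exc}")
    return results

class StubDashboard:
    """Stand-in with the same method names as AnalyticsDashboard."""
    def generate_executive_summary(self):
        return "summary"
    def analyze_trends(self):
        raise RuntimeError("no time-series data")
    def compare_documents(self):
        return "comparison"
    def generate_recommendations(self):
        return "recommendations"

report = run_full_report(StubDashboard())
print(list(report.items()))
```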

---

## Part 5: Best Practices and Production Considerations

### Cost Management

In [None]:
def get_vector_store_stats(vector_store_id):
    """
    Get statistics about a vector store.
    """
    vs = client.vector_stores.retrieve(vector_store_id)
    
    print(f"Vector Store: {vs.name}")
    print(f"ID: {vs.id}")
    print(f"Files: {vs.file_counts.completed} completed")
    print(f"Status: {vs.status}")
    print("\nCost considerations (rates at the time of writing; check OpenAI's pricing page):")
    print(f"  - File Search: $2.50 per 1,000 queries")
    print(f"  - Storage: $0.10/GB/day (first GB free)")
    print(f"  - Code Interpreter: $0.03 per session")
    
    if vs.expires_after:
        print(f"\nAuto-deletion: After {vs.expires_after.days} days of inactivity")

# Example usage
get_vector_store_stats(knowledge_base.id)
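
For budgeting, the rates printed by the function above can be turned into a rough monthly estimate. This is a back-of-the-envelope sketch that assumes those rates are still current. `estimate_monthly_cost` is a hypothetical helper, and model token costs are not included.

```python
def estimate_monthly_cost(queries, storage_gb, code_sessions, days=30):
    """Rough estimate in USD from the tool rates quoted in this tutorial."""
    file_search = queries / 1000 * 2.50              # $2.50 per 1,000 queries
    storage = max(storage_gb - 1, 0) * 0.10 * days   # $0.10/GB/day, first GB free
    code_interpreter = code_sessions * 0.03          # $0.03 per session
    return round(file_search + storage + code_interpreter, 2)

# Example workload: 2,000 queries, 3 GB stored, 100 code interpreter sessions
print(estimate_monthly_cost(queries=2000, storage_gb=3, code_sessions=100))
```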

### Error Handling and Retries

In [None]:
from time import sleep

def robust_query(rag_agent, question, max_retries=3, use_code_interpreter=False):
    """
    Query with automatic retry logic.
    
    Args:
        rag_agent: RAGAgent instance
        question: Query string
        max_retries: Maximum number of retry attempts
        use_code_interpreter: Whether to use code interpreter
    
    Returns:
        Response object or None if all retries fail
    """
    for attempt in range(max_retries):
        try:
            response = rag_agent.query(question, use_code_interpreter=use_code_interpreter)
            return response
        
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {str(e)}")
            
            if attempt < max_retries - 1:
                wait_time = (attempt + 1) * 2
                print(f"Retrying in {wait_time} seconds...")
                sleep(wait_time)
            else:
                print("Max retries reached. Query failed.")
                return None

print("Robust query function defined.")
print("Usage: response = robust_query(rag_agent, 'Your question here')")
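
`robust_query` waits a linearly growing interval and returns `None` on failure. A common variation, sketched here as an alternative rather than a replacement, uses exponential backoff with random jitter and re-raises the last error so callers cannot mistake a failure for an empty result. `retry_with_backoff` and `flaky` are hypothetical names, and the `sleep` parameter is injectable so the example runs instantly.

```python
import random
import time

def retry_with_backoff(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as exc:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the last error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay:.1f}s")
            sleep(delay)

# Hypothetical flaky call that fails twice before succeeding
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"

outcome = retry_with_backoff(flaky, sleep=lambda seconds: None)
print(outcome)
```

With the agent from this tutorial, the wrapped call could be `lambda: rag_agent.query("Your question here")`.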

### Cleanup and Resource Management

In [None]:
def cleanup_resources(vector_store_ids=None, file_ids=None):
    """
    Clean up vector stores and files to manage costs.
    
    Args:
        vector_store_ids: List of vector store IDs to delete
        file_ids: List of file IDs to delete
    """
    if vector_store_ids:
        print("Deleting vector stores...")
        for vs_id in vector_store_ids:
            try:
                client.vector_stores.delete(vs_id)
                print(f"  ✓ Deleted vector store: {vs_id}")
            except Exception as e:
                print(f"  ✗ Error deleting {vs_id}: {e}")
    
    if file_ids:
        print("\nDeleting files...")
        for file_id in file_ids:
            try:
                client.files.delete(file_id)
                print(f"  ✓ Deleted file: {file_id}")
            except Exception as e:
                print(f"  ✗ Error deleting {file_id}: {e}")
    
    print("\nCleanup complete!")

# Example cleanup (uncomment when needed)
# cleanup_resources(vector_store_ids=[knowledge_base.id])

print("Cleanup function ready.")
print("Use with caution - this will permanently delete resources!")
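
Cleanup is only as good as your record of what was created. One lightweight option, sketched below, is to track IDs as you go and hand them to `cleanup_resources` at the end. `ResourceTracker` is a hypothetical helper and the IDs are placeholders.

```python
class ResourceTracker:
    """Remember IDs of created resources so they can be cleaned up together."""

    def __init__(self):
        self.vector_store_ids = []
        self.file_ids = []

    def track_vector_store(self, vs_id):
        self.vector_store_ids.append(vs_id)

    def track_file(self, file_id):
        self.file_ids.append(file_id)

    def cleanup_args(self):
        """Keyword arguments suitable for cleanup_resources(**...)."""
        return {
            "vector_store_ids": list(self.vector_store_ids),
            "file_ids": list(self.file_ids),
        }

tracker = ResourceTracker()
tracker.track_vector_store("vs_example")  # placeholder IDs
tracker.track_file("file-abc")
tracker.track_file("file-def")
print(tracker.cleanup_args())
# In the notebook: cleanup_resources(**tracker.cleanup_args())
```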

---

## Conclusion

Congratulations! You've learned how to build a sophisticated RAG (Retrieval-Augmented Generation) agent using OpenAI's Responses API.

### What We Covered

1. **Knowledge Base Setup**
   - Creating and managing vector stores
   - Uploading and organizing documents
   - Cost-effective expiration policies

2. **RAG Agent Development**
   - Building a reusable RAGAgent class
   - Querying documents with file_search
   - Maintaining conversation context
   - Extracting citations and sources

3. **Advanced Analytics**
   - Combining file_search with code_interpreter
   - Analyzing data files
   - Generating visualizations
   - Creating comprehensive reports

4. **Production Best Practices**
   - Error handling and retries
   - Cost management
   - Resource cleanup
   - Dashboard patterns

### Key Advantages of Responses API

Compared to the deprecated Assistants API:
- **Simpler**: No threads, assistants, or complex polling
- **More efficient**: Direct responses without polling loops
- **Flexible**: Per-request instructions with easy conversation chaining
- **Powerful**: Same tools (file_search, code_interpreter) with better UX

### Next Steps

- Upload your own documents and build a custom knowledge base
- Experiment with different query patterns and instructions
- Build specialized agents for your domain
- Integrate with your applications and workflows
- Explore streaming responses for real-time updates

### Resources

- [OpenAI Responses API Documentation](https://platform.openai.com/docs/api-reference/responses)
- [File Search Guide](https://platform.openai.com/docs/guides/tools-file-search)
- [Code Interpreter Guide](https://platform.openai.com/docs/guides/tools-code-interpreter)
- [OpenAI Cookbook](https://cookbook.openai.com/)

### Final Tips

1. **Start small**: Begin with a few documents and expand gradually
2. **Iterate on instructions**: Experiment with different prompts for better results
3. **Monitor costs**: Set expiration policies and clean up unused resources
4. **Handle errors**: Implement robust error handling for production use
5. **Cite sources**: Always encourage citation of source documents

Happy building! 🚀