# Managing Conversations in the OpenAI Responses API

## Introduction

In this lesson, we'll explore how to effectively manage conversations in the OpenAI Responses API. The Responses API is the modern replacement for the Assistants API, providing a streamlined, stateful interface for building conversational AI applications.

### Key Differences from Assistants API

**Assistants API → Responses API Mapping:**
- Assistants → Prompts (instructions)
- Threads → Conversations (implicit, linked via response IDs)
- Runs → Responses
- Run-Steps → Items

**Main Advantages:**
- Server-side conversation state management
- No need to manually track threads and message history
- Simplified API with less boilerplate code
- Built-in tools (web search, file search, code interpreter)
- Conversation forking capabilities

First, let's set up our environment:

In [None]:
%pip install openai==2.6.1

In [1]:
import os
import getpass

def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("OPENAI_API_KEY")

In [2]:
from openai import OpenAI
import time

# Initialize the OpenAI client
client = OpenAI()

## Understanding Conversations in the Responses API

Unlike the Assistants API, which required explicit thread creation and management, the Responses API handles conversations implicitly through response chaining. Each response has a unique ID, and you maintain conversation continuity by referencing the previous response ID.

**Key Concepts:**
- Responses are stored for 30 days by default
- You can disable storage with `store=False`
- Conversations are formed by linking responses via `previous_response_id`
- All prior input tokens remain billable, even when using `previous_response_id`
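
To see why the last point matters, here is a rough back-of-the-envelope sketch. The token counts are made-up illustration values and `billed_input_tokens` is a hypothetical helper, not part of the OpenAI SDK. It assumes each chained request is billed for the whole prior conversation plus the new user message:

```python
def billed_input_tokens(turns):
    """Return the input tokens billed on each request of a chained conversation.

    `turns` is a list of (user_tokens, assistant_tokens) pairs. Assumption: every
    request is billed for all earlier user and assistant tokens as input.
    """
    billed = []
    history = 0  # tokens accumulated from earlier turns
    for user_tokens, assistant_tokens in turns:
        billed.append(history + user_tokens)
        history += user_tokens + assistant_tokens
    return billed

# Hypothetical token counts for a three-turn conversation
turns = [(20, 100), (15, 120), (10, 80)]
per_request = billed_input_tokens(turns)
print(per_request)       # [20, 135, 265]
print(sum(per_request))  # 420
```

Under this assumption the billed input grows with every turn even though each new message is short, which is why long chains are worth monitoring.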

### Creating a Basic Response

Let's start by creating a simple response:

In [46]:
def create_basic_response(user_input):
    """Create a basic response without conversation history."""
    response = client.responses.create(
        model="gpt-4o-mini",
        input=user_input
    )
    print(f"Response ID: {response.id}")
    print(f"Output: {response.output[0].content[0].text}")
    return response

# Create a new response
response = create_basic_response("Tell me a joke about programming.")

Response ID: resp_0ccae9f6789c1683006900eb7736988196a2d7ff5b92582ad1
Output: Why do programmers prefer dark mode?  

Because light attracts bugs!
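
The `response.output[0].content[0].text` pattern works here because the first output item is a message. With reasoning models or tools, `output` may start with other item types, so a more defensive extraction can help. The sketch below runs on stand-in objects built with `SimpleNamespace` rather than a real API response, and `extract_text` is a hypothetical helper. The Python SDK also exposes a `response.output_text` convenience property that serves the same purpose.

```python
from types import SimpleNamespace

def extract_text(response):
    """Concatenate the text of every `output_text` part in a response-like object."""
    chunks = []
    for item in response.output:
        if getattr(item, "type", None) != "message":
            continue  # skip reasoning items, tool calls, etc.
        for part in item.content or []:
            if getattr(part, "type", None) == "output_text":
                chunks.append(part.text)
    return "".join(chunks)

# Stand-in shaped like a reasoning-model response: reasoning item first, then message
fake_response = SimpleNamespace(output=[
    SimpleNamespace(type="reasoning", content=None),
    SimpleNamespace(type="message", content=[
        SimpleNamespace(type="output_text", text="Hello!"),
    ]),
])
print(extract_text(fake_response))  # Hello!
```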


### Continuing Conversations with previous_response_id

To continue a conversation, simply pass the `previous_response_id` parameter. The API automatically retrieves the full conversation history:

In [47]:
def continue_conversation(previous_response_id, user_input):
    """Continue an existing conversation by referencing the previous response."""
    response = client.responses.create(
        model="gpt-4o-mini",
        input=user_input,
        previous_response_id=previous_response_id
    )
    print(f"Response ID: {response.id}")
    print(f"Output: {response.output[0].content[0].text}")
    return response

# Continue the conversation
response_2 = continue_conversation(response.id, "Tell me another one!")

Response ID: resp_0ccae9f6789c1683006900ebd36c488196b537607de9318ec8
Output: Why do Java developers wear glasses?  

Because they don't see sharp!


## Managing Conversation State

### Using the store Parameter

By default, responses are stored on OpenAI's servers (`store=True`). You can disable this, for example for privacy or data-retention reasons:

In [48]:
def create_ephemeral_response(user_input):
    """Create a response that won't be stored on OpenAI's servers."""
    response = client.responses.create(
        model="gpt-4o-mini",
        input=user_input,
        store=False  # Don't store conversation state
    )
    print(f"Response ID: {response.id}")
    print(f"Output: {response.output[0].content[0].text}")
    return response

# Example: Create a response without storing
ephemeral_response = create_ephemeral_response("What's the weather like?")

Response ID: resp_0962954ab5acd61f016900ebf9e6708196bc73dcb150b64700
Output: I can't check real-time weather data, but you can easily find the current weather through a weather website, app, or by asking a smart device. If you tell me your location, I can help guide you on how to find the information!


In [49]:
ephemeral_response

Response(id='resp_0962954ab5acd61f016900ebf9e6708196bc73dcb150b64700', created_at=1761668089.0, error=None, incomplete_details=None, instructions=None, metadata={}, model='gpt-4o-mini-2024-07-18', object='response', output=[ResponseOutputMessage(id='msg_0962954ab5acd61f016900ebfc85508196852b8c6b501d0c06', content=[ResponseOutputText(annotations=[], text="I can't check real-time weather data, but you can easily find the current weather through a weather website, app, or by asking a smart device. If you tell me your location, I can help guide you on how to find the information!", type='output_text', logprobs=[])], role='assistant', status='completed', type='message')], parallel_tool_calls=True, temperature=1.0, tool_choice='auto', tools=[], top_p=1.0, background=False, conversation=None, max_output_tokens=None, max_tool_calls=None, previous_response_id=None, prompt=None, prompt_cache_key=None, reasoning=Reasoning(effort=None, generate_summary=None, summary=None), safety_identifier=None, 
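
With `store=False` there is generally nothing kept server-side to chain from, so a follow-up turn would need the history passed explicitly in `input`. Below is a minimal sketch of that bookkeeping, assuming role-based message dicts like the ones `input` accepts. `LocalHistory` is a hypothetical helper and no API call is made here.

```python
class LocalHistory:
    """Keep conversation turns on the client as role-based message dicts."""

    def __init__(self):
        self.messages = []

    def add_user(self, text):
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text):
        self.messages.append({"role": "assistant", "content": text})

    def as_input(self, next_user_message):
        """Return the full list to pass as `input` for the next request."""
        return self.messages + [{"role": "user", "content": next_user_message}]

history = LocalHistory()
history.add_user("What's the weather like?")
history.add_assistant("I can't check real-time weather data.")
payload = history.as_input("I'm in Lisbon. Where should I look?")
print(len(payload))  # 3
# payload could then be sent with:
# client.responses.create(model="gpt-4o-mini", input=payload, store=False)
```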

### Retrieving Previous Responses

You can retrieve any stored response by its ID:

In [51]:
def retrieve_response(response_id):
    """Retrieve a previously stored response by its ID."""
    fetched_response = client.responses.retrieve(response_id=response_id)
    print(f"Retrieved Response ID: {fetched_response.id}")
    print(f"Output: {fetched_response.output[0].content[0].text}")
    return fetched_response

# Example: Retrieve the first response we created
retrieved = retrieve_response(response.id)

Retrieved Response ID: resp_0ccae9f6789c1683006900eb7736988196a2d7ff5b92582ad1
Output: Why do programmers prefer dark mode?  

Because light attracts bugs!


## Forking Conversations

One powerful feature is the ability to fork conversations - branching from any previous response to explore alternative paths:

In [7]:
def fork_conversation(fork_from_id, user_input):
    """Fork a conversation from a specific response ID."""
    response = client.responses.create(
        model="gpt-4o-mini",
        input=user_input,
        previous_response_id=fork_from_id
    )
    print(f"Forked Response ID: {response.id}")
    print(f"Output: {response.output[0].content[0].text}")
    return response

# Fork from the first response with a different question
forked_response = fork_conversation(
    response.id, 
    "Actually, can you explain what makes that joke funny?"
)

Forked Response ID: resp_0edda0239567c750006900ada4c6c88197af788c11beb3d6cc
Output: Sure! The joke plays on a couple of ideas:

1. **Two Meanings of "Light"**: In programming, "light mode" refers to a bright color scheme, while "dark mode" is a darker color scheme preferred by many programmers for its aesthetic and comfort. 

2. **Bugs and Light**: The joke uses wordplay with the term "bugs." In the context of programming, "bugs" refer to errors or glitches in code. However, "bugs" can also refer to actual insects, which are attracted to light. 

The humor comes from the clever connection between the two meanings, creating an unexpected punchline that resonates with programmers.
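
Because every response points at one parent through `previous_response_id`, a set of chained and forked responses forms a tree. The sketch below models that locally with a plain dictionary. The IDs are hypothetical placeholders and `path_to_root` is an illustrative helper, not an SDK function.

```python
def path_to_root(response_id, parents):
    """Return the chain of response IDs from the first turn to `response_id`."""
    path = []
    current = response_id
    while current is not None:
        path.append(current)
        current = parents[current]
    return list(reversed(path))

# Hypothetical IDs: resp_2 continues resp_1, resp_fork branches from resp_1
parents = {
    "resp_1": None,
    "resp_2": "resp_1",
    "resp_fork": "resp_1",
}
print(path_to_root("resp_2", parents))     # ['resp_1', 'resp_2']
print(path_to_root("resp_fork", parents))  # ['resp_1', 'resp_fork']
```

Both branches share `resp_1` as their history, and neither sees the other's turns.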


## Working with Instructions (System Prompts)

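You can provide instructions to shape the assistant's behavior. Instructions play the same role as a system prompt in Chat Completions. They apply only to the request that sets them and are not carried over through `previous_response_id`, so pass them again on each turn if the behavior should persist: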

In [8]:
def create_response_with_instructions(instructions, user_input):
    """Create a response with custom instructions."""
    response = client.responses.create(
        model="gpt-4o-mini",
        instructions=instructions,
        input=user_input
    )
    print(f"Response: {response.output[0].content[0].text}")
    return response

# Example: Create a response with a specific persona
pirate_response = create_response_with_instructions(
    instructions="You are a helpful coding assistant that talks like a pirate.",
    user_input="How do I declare a variable in Python?"
)

Response: Arrr matey! To declare a variable in Python, ye simply be usin’ an assignment operator. Here's how ye can do it:

1. Choose a name fer yer variable, like `ship` or `treasure`.
2. Use the equals sign `=` to assign a value to it.

Here be an example:

```python
ship = "Black Pearl"
treasure = 1000
```

In this case, `ship` holds the name of yer mighty vessel, and `treasure` be the amount o’ gold doubloons ye be havin'. No need to declare the type; Python be smart enough to figure it out on its own! Arrr!


## Working with Different Content Types

### Text Messages with Multiple Turns

You can structure input with role-based messages for more complex conversations:

In [9]:
def create_multi_turn_response():
    """Create a response with structured message history."""
    response = client.responses.create(
        model="gpt-4o-mini",
        input=[
            {
                "role": "user",
                "content": "I'm learning about data structures."
            },
            {
                "role": "user",
                "content": "Can you explain what a hash table is?"
            }
        ]
    )
    print(f"Response: {response.output[0].content[0].text}")
    return response

# Example with multiple message turns
multi_turn_response = create_multi_turn_response()

Response: Sure! A hash table is a data structure that stores key-value pairs and allows for fast data retrieval. Here’s a breakdown of how it works:

### Key Components:

1. **Keys**: Unique identifiers used to store and retrieve values in the hash table.
2. **Values**: The data associated with keys.

### How It Works:

1. **Hash Function**: When you insert a key-value pair, a hash function takes the key and computes an index (or hash code) in an array where the value will be stored. The hash function is designed to distribute keys uniformly to minimize collisions.

2. **Collisions**: Sometimes, different keys may produce the same index. A collision resolution strategy is necessary to handle this. Common strategies include:
   - **Chaining**: Each index in the array points to a linked list (or another collection) of entries that hash to the same index.
   - **Open Addressing**: If a collision occurs, the table searches for the next available slot according to certain probing strategi

### Messages with Images

The Responses API supports multimodal inputs including images:

In [52]:
def analyze_image(image_url, question):
    """Analyze an image by providing a URL."""
    response = client.responses.create(
        model="gpt-4o",  # Use gpt-4o for vision capabilities
        input=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "input_text",
                        "text": question
                    },
                    {
                        "type": "input_image",
                        "image_url": image_url
                    }
                ]
            }
        ]
    )
    print(f"Response: {response.output[0].content[0].text}")
    return response

# Example: Analyze an image from a public URL
image_response = analyze_image(
    "https://upload.wikimedia.org/wikipedia/commons/thumb/4/4d/OpenAI_Logo.svg/640px-OpenAI_Logo.svg.png",
    "Describe this image in one sentence."
)

Response: This is a monochromatic illustration of a stylized canine with intricate patterns across its body.


![](2025-10-28-16-19-25.png)

In [53]:
image_response

Response(id='resp_067e9c7d4f2348e8006900ecf9097881908b0d15a89510973b', created_at=1761668345.0, error=None, incomplete_details=None, instructions=None, metadata={}, model='gpt-4o-2024-08-06', object='response', output=[ResponseOutputMessage(id='msg_067e9c7d4f2348e8006900ecfb1f3081908d548c5f00407f2d', content=[ResponseOutputText(annotations=[], text='This is a monochromatic illustration of a stylized canine with intricate patterns across its body.', type='output_text', logprobs=[])], role='assistant', status='completed', type='message')], parallel_tool_calls=True, temperature=1.0, tool_choice='auto', tools=[], top_p=1.0, background=False, conversation=None, max_output_tokens=None, max_tool_calls=None, previous_response_id=None, prompt=None, prompt_cache_key=None, reasoning=Reasoning(effort=None, generate_summary=None, summary=None), safety_identifier=None, service_tier='default', status='completed', text=ResponseTextConfig(format=ResponseFormatText(type='text'), verbosity='medium'), top

### Working with Base64 Images

You can also provide images as base64-encoded strings:

In [54]:
import base64

def analyze_local_image(image_path, question):
    """Analyze a local image file by encoding it as base64."""
    with open(image_path, "rb") as image_file:
        image_data = base64.b64encode(image_file.read()).decode('utf-8')
    
    # Determine the MIME subtype from the file extension ("jpg" must map to "jpeg")
    image_format = image_path.rsplit('.', 1)[-1].lower()
    if image_format == "jpg":
        image_format = "jpeg"
    
    response = client.responses.create(
        model="gpt-4o",
        input=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "input_text",
                        "text": question
                    },
                    {
                        "type": "input_image",
                        "image_url": f"data:image/{image_format};base64,{image_data}"
                    }
                ]
            }
        ]
    )
    print(f"Response: {response.output[0].content[0].text}")
    return response

# Example: Analyze a local image (the path must point to an existing file)
local_image_response = analyze_local_image(
    "./2025-10-28-14-22-39.png",
    "Re-create this image as a mermaid graph."
)

Response: Certainly! Here is how you can re-create the image as a Mermaid graph:

```mermaid
graph TD;
    A[input prompt] -->|get_response(prompt)| B[LLM] --> C[output text]
```

This code creates a simple flowchart with the same structure as the image provided.


In [56]:
# from IPython.display import Markdown

# Markdown(local_image_response.output[0].content[0].text)

## Using Built-in Tools

### Web Search Tool

The Responses API includes built-in tools like web search:

In [57]:
def search_web(query):
    """Use the built-in web search tool."""
    response = client.responses.create(
        model="gpt-4o",
        tools=[{"type": "web_search"}],
        input=query
    )
    
    # The response may include tool execution results
    for item in response.output:
        if hasattr(item, 'content'):
            for content in item.content:
                if hasattr(content, 'text'):
                    print(f"Response: {content.text}")
    
    return response

# Example: Search for current information
web_response = search_web("What are some of the OReilly courses from instructor Lucas Soares?")

Response: Here are several O’Reilly live events and sessions featuring instructor **Lucas Soares**:

1. **Building Simple Web Apps with AI Tools**  
   In this beginner-to-intermediate live event, Lucas Soares guides participants through building local web applications using AI tools like Claude and CursorAI, along with HTML, CSS, and JavaScript. Attendees learn to design, implement, and iterate simple projects such as quizzes or habit trackers, and to connect apps to external APIs like weather or stock data. ([oreilly.com](https://www.oreilly.com/live-events/building-simple-web-apps-with-ai-tools/0642572013427/?utm_source=openai))

2. **GenAI Prompt to Product Showdown**  
   Hosted by Lucas Soares, this intermediate-level competition-style session demonstrates how to transform prompt engineering skills into functional minimum viable products (MVPs). Multiple experts participate, and attendees get to vote on the most effective AI-driven solution. ([oreilly.com](https://www.oreilly.c

In [None]:
web_response
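
Web search answers carry their sources as annotations on the output text, which is where the inline links above come from. The sketch below collects them from a stand-in object rather than a live response. `collect_citations` is a hypothetical helper, and the `url_citation` annotation shape (`title`, `url`) is assumed from the SDK's response types.

```python
from types import SimpleNamespace

def collect_citations(response):
    """Return (title, url) pairs for every URL citation in a response-like object."""
    citations = []
    for item in response.output:
        for part in getattr(item, "content", None) or []:
            for annotation in getattr(part, "annotations", None) or []:
                if getattr(annotation, "type", None) == "url_citation":
                    citations.append((annotation.title, annotation.url))
    return citations

# Stand-in shaped like a web search response: tool call item first, then the message
fake = SimpleNamespace(output=[
    SimpleNamespace(type="web_search_call"),
    SimpleNamespace(type="message", content=[
        SimpleNamespace(
            type="output_text",
            text="See the course page.",
            annotations=[
                SimpleNamespace(
                    type="url_citation",
                    title="Example course",
                    url="https://example.com/course",
                ),
            ],
        ),
    ]),
])
print(collect_citations(fake))  # [('Example course', 'https://example.com/course')]
```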

### File Search Tool

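You can enable file search for document retrieval and analysis. The tool searches a vector store that you have already created and populated, so the example below is left commented out and uses a placeholder ID:

In [None]:
# def use_file_search(query):
#     """Use the built-in file search tool."""
#     response = client.responses.create(
#         model="gpt-5-mini",
#         input=query,
#         # Placeholder: replace with the ID of your own vector store
#         tools=[{"type": "file_search", "vector_store_ids": ["<your_vector_store_id>"]}]
#     )
#     # output may start with reasoning or tool-call items, so use output_text
#     print(f"Response: {response.output_text}")
#     return response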

# # Example usage with file search
# # file_search_response = use_file_search("Find information about Python decorators in the documentation.")

### Code Interpreter Tool

Enable the code interpreter for data analysis and code execution:

In [15]:
def use_code_interpreter(query):
    """Use the built-in code interpreter tool."""
    response = client.responses.create(
        model="gpt-4o",
        input=query,
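        tools=[
            {
                "type": "code_interpreter",
                "container": {"type": "auto"}
            }
        ]
    )
    # output may begin with a code_interpreter_call item, so use the SDK's
    # output_text helper instead of indexing output[0]
    print(f"Response: {response.output_text}")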
    return response

# Example: Request data analysis
code_response = use_code_interpreter(
    "Create a Python function to calculate the Fibonacci sequence up to n terms."
)

Response: Here is a Python function that calculates the Fibonacci sequence up to \( n \) terms:

```python
def fibonacci_sequence(n):
    if n <= 0:
        return []
    elif n == 1:
        return [0]
    elif n == 2:
        return [0, 1]
    
    sequence = [0, 1]
    while len(sequence) < n:
        next_value = sequence[-1] + sequence[-2]
        sequence.append(next_value)
    
    return sequence

# Example usage:
n_terms = 10
print(fibonacci_sequence(n_terms))
```

This function checks for edge cases (like when \( n \) is less than or equal to 0) and then computes the rest of the Fibonacci sequence for \( n > 2 \). You can call `fibonacci_sequence` with the desired number of terms to get the sequence up to that number.


## Managing Context Windows and Token Limits

You can cap the length of the generated output with `max_output_tokens`. When the limit is reached, the response is cut off mid-answer, as the outputs below show:

In [59]:
def create_limited_response(user_input):
    """Create a response with token limits."""
    response = client.responses.create(
        model="gpt-4o-mini",
        input=user_input,
        max_output_tokens=200
    )
    # print(f"Response: {response.output[0].content[0].text}")
    return response.output[-1].content[0].text

# Example with token limits
limited_response = create_limited_response(
    "Explain machine learning in simple terms."
)

limited_response

'Machine learning is a branch of artificial intelligence that teaches computers to learn from data and improve their performance on tasks over time, without being explicitly programmed for each specific task. \n\nHere\'s how it works in simple terms:\n\n1. **Data**: You start with a lot of information (data) related to the problem you\'re trying to solve. For example, if you want to teach a computer to recognize cats in photos, you gather many pictures of cats and non-cats.\n\n2. **Learning**: The computer looks at the data and identifies patterns. It might notice that cats usually have pointy ears or whiskers.\n\n3. **Model**: Based on these patterns, the computer creates a "model" that can make predictions. For instance, it can guess whether a new photo contains a cat or not.\n\n4. **Testing**: You then test this model using new data to see how well it works. If it gets a lot of answers right, it means it has learned well.\n\n'
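
The answer above stops before the explanation is finished because it ran into the 200-token cap. A capped response can be detected programmatically from its `status` and `incomplete_details` fields. The sketch below checks stand-in objects rather than live responses. `hit_output_limit` is a hypothetical helper, and the `"incomplete"` / `"max_output_tokens"` values are assumed from the API reference.

```python
from types import SimpleNamespace

def hit_output_limit(response):
    """Return True if the response stopped because it reached max_output_tokens."""
    details = getattr(response, "incomplete_details", None)
    return (
        getattr(response, "status", None) == "incomplete"
        and details is not None
        and getattr(details, "reason", None) == "max_output_tokens"
    )

# Stand-ins for a capped and a normally completed response
capped = SimpleNamespace(
    status="incomplete",
    incomplete_details=SimpleNamespace(reason="max_output_tokens"),
)
finished = SimpleNamespace(status="completed", incomplete_details=None)

print(hit_output_limit(capped))    # True
print(hit_output_limit(finished))  # False
```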

In [61]:
import IPython.display as display
import html

def get_response_text(max_tokens):
    prompt = "Explain the importance of regular exercise."
    try:
        response = client.responses.create(
            model="gpt-4o-mini",
            input=prompt,
            max_output_tokens=max_tokens
        )
        text = response.output[-1].content[0].text
        return text.strip()
    except Exception as e:
        return f"<span style='color:red;'>Error: {html.escape(str(e))}</span>"

token_settings = [50, 100, 200]
results = []
for tokens in token_settings:
    resp_text = get_response_text(tokens)
    results.append((tokens, resp_text))

# Create nice HTML table
html_rows = ["<tr><th>max_output_tokens</th><th>Response</th></tr>"]
for tokens, resp in results:
    html_rows.append(
        f"<tr><td style='font-weight:bold; text-align:center;'>{tokens}</td><td style='padding:8px; background:#f7f7f9; min-width:350px; font-family:monospace;'>{html.escape(resp)}</td></tr>"
    )
table_html = f"""
<div style="border:1px solid #ccc; border-radius:8px; margin:1em 0; overflow:hidden; box-shadow:0 3px 12px #eee;">
  <div style="background:#4078c0; color:#fff; padding:12px; font-size:1.1em; font-weight:bold;">
    <span>Comparison: <code>max_output_tokens</code> Effects</span>
  </div>
  <table style="width:100%; border-collapse:collapse; font-size:1em;">
    {''.join(html_rows)}
  </table>
</div>
"""

display.display(display.HTML(table_html))

max_output_tokens,Response
50,"Regular exercise is crucial for maintaining overall health and well-being. Here are some key reasons why it is important: 1. **Physical Health**: Exercise helps control weight, improves cardiovascular health, strengthens bones and muscles, and enhances flexibility and balance. It"
100,"Regular exercise is crucial for maintaining overall health and well-being. Here are some key reasons why it's important: 1. **Physical Health**: Regular exercise strengthens the heart, improves circulation, and helps manage body weight. It reduces the risk of chronic diseases such as diabetes, cardiovascular diseases, and certain cancers. 2. **Mental Health**: Exercise is known to alleviate symptoms of anxiety and depression. It triggers the release of endorphins, which can enhance mood and relieve stress. 3. **Enhanced"
200,"Regular exercise is crucial for maintaining overall health and well-being. Here are several key reasons highlighting its importance: 1. **Physical Health**: Exercise strengthens the heart, lungs, and muscles, improving cardiovascular fitness and endurance. It helps maintain a healthy weight, reduces the risk of chronic diseases such as diabetes, heart disease, and certain cancers, and supports bone health. 2. **Mental Health**: Physical activity boosts mood by releasing endorphins, alleviating stress and anxiety. Exercise is also linked to improved cognitive function and can help combat depression. 3. **Energy Boost**: Regular activity enhances stamina and energy levels by improving muscle strength and efficiency in the cardiovascular system, leading to increased productivity in daily tasks. 4. **Improved Sleep**: Regular exercise can help regulate sleep patterns, leading to deeper and more restful sleep, which is essential for overall health. 5. **Social Interaction**: Participating in group sports or fitness classes can foster social connections, enhancing emotional support and reducing feelings"


### Automatic Truncation

For long conversations, use automatic truncation to manage context:

A token is a small chunk of text, typically a word or part of a word.

In [39]:
def simulate_long_conversation_with_truncation():
    """
    Simulate a conversation with 5 rounds, then demonstrate automatic truncation in action.
    Each turn continues from the previous response.
    """
    conversation = [
        "Hi, can you explain what artificial intelligence is?",
        "How is machine learning different from AI?",
        "Can you tell me some real-world applications of machine learning?",
        "What are some challenges or pitfalls in developing ML models?",
        "How is deep learning unique compared to other machine learning approaches?"
    ]

    response_ids = []
    previous_id = None

    print("=== Simulating Long Conversation ===")
    for idx, message in enumerate(conversation):
        response = client.responses.create(
            model="gpt-5-mini",
            input=message,
            previous_response_id=previous_id,
        )
        text = response.output[-1].content[0].text
        print(f"Turn {idx+1} Assistant: {text}\n")
        previous_id = response.id
        response_ids.append(previous_id)

    # Now ask for a summary. truncation="auto" lets the API drop older items if the
    # conversation exceeds the context window (unlikely after only five short turns)
    summary_prompt = "Please summarize everything we've discussed so far in three sentences."
    truncated_response = client.responses.create(
        model="gpt-5-mini",
        input=summary_prompt,
        previous_response_id=previous_id,
        truncation="auto"
    )
    print("=== Truncated Summary Response ===")
    print(truncated_response.output[-1].content[0].text)
    return truncated_response

# Run the function to demonstrate truncation
simulate_long_conversation_with_truncation()

=== Simulating Long Conversation ===
Turn 1 Assistant: Short answer
Artificial intelligence (AI) is the study and engineering of computer systems that perform tasks that normally require human intelligence - for example recognizing speech, understanding text, seeing and identifying objects, making decisions, or translating languages.

Longer, plain-English breakdown

- What AI aims to do
  - Build machines or software that can perceive their environment, learn from data or experience, reason, and act to achieve goals.
  - In practice today that usually means automating or assisting specific cognitive tasks.

- Two useful categories
  - Narrow AI (or weak AI): systems that do one task well (e.g., face recognition, translation, chess). This is what nearly all current AI is.
  - General AI (AGI): a hypothetical system with human-like general problem-solving and understanding across domains. AGI does not exist today.

- Main technical approaches (high-level)
  - Symbolic / rule-based AI:

Response(id='resp_040c3ea66128520b006900b11b0d748196979fa5d0de752539', created_at=1761653019.0, error=None, incomplete_details=None, instructions=None, metadata={}, model='gpt-5-mini-2025-08-07', object='response', output=[ResponseReasoningItem(id='rs_040c3ea66128520b006900b11ba07c8196bff8fd0b37f5553c', summary=[], type='reasoning', content=None, encrypted_content=None, status=None), ResponseOutputMessage(id='msg_040c3ea66128520b006900b121434c8196bc3fc43c870251ea', content=[ResponseOutputText(annotations=[], text='Artificial intelligence is the broad field of building systems that perform tasks requiring human-like intelligence, and machine learning is a subset of AI that trains algorithms to learn patterns and make predictions from data rather than being explicitly programmed. Machine learning is used in many real-world applications (recommendations, search, vision, NLP, fraud detection, healthcare, finance, robotics, etc.) but projects commonly stumble on data issues, bias, label
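
Conceptually, truncation trades old context for room in the context window. The sketch below illustrates that idea only. It is not how the API implements `truncation="auto"`, since the items dropped are decided server-side. `drop_oldest_to_fit` is a hypothetical helper, and a whitespace word count stands in for a real tokenizer.

```python
def drop_oldest_to_fit(messages, budget, count_tokens=lambda text: len(text.split())):
    """Drop messages from the start until the total token estimate fits `budget`."""
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > budget:
        kept.pop(0)  # discard the oldest message first
    return kept

messages = [
    "Hi, can you explain what artificial intelligence is?",
    "How is machine learning different from AI?",
    "Please summarize everything we've discussed so far.",
]
# With a budget of 15 "tokens", the oldest message no longer fits
print(drop_oldest_to_fit(messages, budget=15))
```

The trade-off is visible in the result: the summary request survives, but the model can no longer see the first question.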

## Complete Conversation Example

Let's create a complete multi-turn conversation with state management:

In [40]:
def multi_turn_conversation():
    """Demonstrate a complete multi-turn conversation."""
    print("=" * 60)
    print("Starting a multi-turn conversation")
    print("=" * 60)
    
    # Turn 1: Initial response
    response_1 = client.responses.create(
        model="gpt-4o-mini",
        instructions="You are a helpful Python programming tutor.",
        input="I'm new to Python. What should I learn first?"
    )
    print(f"\nTurn 1 - Response ID: {response_1.id}")
    print(f"Assistant: {response_1.output[0].content[0].text}\n")
    
    # Turn 2: Continue conversation
    response_2 = client.responses.create(
        model="gpt-4o-mini",
        input="Can you give me an example of a for loop?",
        previous_response_id=response_1.id
    )
    print(f"Turn 2 - Response ID: {response_2.id}")
    print(f"Assistant: {response_2.output[0].content[0].text}\n")
    
    # Turn 3: Continue further
    response_3 = client.responses.create(
        model="gpt-4o-mini",
        input="What's the difference between a list and a tuple?",
        previous_response_id=response_2.id
    )
    print(f"Turn 3 - Response ID: {response_3.id}")
    print(f"Assistant: {response_3.output[0].content[0].text}\n")
    
    # Turn 4: Fork back to turn 1
    response_4 = client.responses.create(
        model="gpt-4o-mini",
        input="Actually, what about web development with Python instead?",
        previous_response_id=response_1.id  # Fork from first response
    )
    print(f"Turn 4 (Forked from Turn 1) - Response ID: {response_4.id}")
    print(f"Assistant: {response_4.output[0].content[0].text}\n")
    
    print("=" * 60)
    print("Conversation complete")
    print("=" * 60)
    
    return response_1, response_2, response_3, response_4

# Run the complete conversation example
conv_responses = multi_turn_conversation()

Starting a multi-turn conversation

Turn 1 - Response ID: resp_08c771d947e40fae006900b17c2d188196bd5b4a287252ef9b
Assistant: Welcome to the world of Python! Here’s a structured approach to get you started:

### 1. **Basic Syntax**
   - **Variables**: Learn how to create and use variables.
   - **Data Types**: Understand integers, floats, strings, and booleans.
   - **Comments**: How to write comments in your code.

### 2. **Control Structures**
   - **Conditional Statements**: Learn about `if`, `elif`, and `else` statements.
   - **Loops**: Understand `for` loops and `while` loops.

### 3. **Data Structures**
   - **Lists**: Learn how to create and manipulate lists.
   - **Tuples**: Understand immutable sequences.
   - **Dictionaries**: Learn about key-value pairs.
   - **Sets**: Understand unordered collections of unique elements.

### 4. **Functions**
   - **Defining Functions**: Learn to write reusable code blocks using `def`.
   - **Arguments and Return Values**: Understand how t

## Best Practices

### 1. Conversation Management

- **Store response IDs**: Keep track of response IDs to continue conversations
- **Use forking wisely**: Fork conversations to explore alternative paths without losing context
- **Monitor token usage**: Remember that all prior tokens are billable when using `previous_response_id`
- **Set storage preferences**: Use `store=False` for sensitive conversations

### 2. Performance Optimization

- **Use appropriate models**: Use `gpt-4o-mini` for simple tasks, `gpt-4o` for complex reasoning
- **Limit context**: Use `truncation="auto"` for long conversations and cap output length with `max_output_tokens`
- **Cache responses**: Store frequently accessed responses locally to reduce API calls

### 3. Content Best Practices

- **Clear instructions**: Provide clear, specific instructions for consistent behavior
- **Structured input**: Use role-based messages for complex conversations
- **Tool selection**: Choose appropriate tools (web_search, file_search, code_interpreter) based on the task

### 4. Error Handling

Always implement proper error handling:

In [41]:
def safe_create_response(user_input, previous_response_id=None):
    """Create a response with proper error handling."""
    try:
        response = client.responses.create(
            model="gpt-4o-mini",
            input=user_input,
            previous_response_id=previous_response_id
        )
        return response
    except Exception as e:
        print(f"Error creating response: {e}")
        return None

# Example with error handling
safe_response = safe_create_response("Hello, how are you?")
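
Catching every `Exception` and returning `None` hides the difference between transient failures (rate limits, dropped connections) and real bugs. One option is to retry only the transient ones with exponential backoff. The sketch below is generic and runs without the API. `with_retries` is a hypothetical helper and `flaky` is a stand-in for an API call. With the real client you might pass `retriable=(openai.RateLimitError, openai.APIConnectionError)`. The SDK client also has its own `max_retries` setting, so a wrapper like this may not be needed.

```python
import time

def with_retries(fn, retriable=(Exception,), max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `fn()` and retry on `retriable` errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retriable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))

# Demo with a stand-in function that fails twice before succeeding
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"

# The no-op sleep keeps the demo instant; drop it to get real delays
result = with_retries(flaky, retriable=(ConnectionError,), sleep=lambda seconds: None)
print(result)  # ok
```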

## Exercise: Build a Conversational Assistant

Try this exercise to practice working with conversations:

In [62]:
def interactive_conversation():
    """
    Create an interactive conversation loop.
    Type 'quit' to exit.
    """
    print("Starting interactive conversation. Type 'quit' to exit.\n")
    
    last_response_id = None
    
    while True:
        user_input = input("You: ")
        print("===========User Input===========")
        print(user_input)
        print("================================")
        
        if user_input.lower() in ['quit', 'exit', 'q']:
            print("Goodbye!")
            break
        
        try:
            response = client.responses.create(
                model="gpt-5-mini",
                instructions="You are a friendly and helpful assistant.",
                input=user_input,
                previous_response_id=last_response_id
            )
            
            last_response_id = response.id
            print("=======Assistant Response=======")   
            print(response.output[-1].content[0].text)
            print("================================")
            
        except Exception as e:
            print(f"Error: {e}\n")

# Start the interactive conversation (comment this out to skip it)
interactive_conversation()

Starting interactive conversation. Type 'quit' to exit.

HI!
Hi there! How can I help you today?
YOu can tell me what is the meaning of life in one sentence.
The meaning of life is to create and pursue purpose, connection, and joy by growing, loving, and contributing to something larger than yourself.
quit
Goodbye!


## Migration from Assistants API

If you're migrating from the Assistants API, here's a quick reference:

### Assistants API Pattern
```python
# Old way (Assistants API)
assistant = client.beta.assistants.create(
    name="My Assistant",
    instructions="You are helpful.",
    model="gpt-4"
)

thread = client.beta.threads.create()

message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Hello"
)

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id
)
```

### Responses API Pattern
```python
# New way (Responses API)
response = client.responses.create(
    model="gpt-4o",
    instructions="You are helpful.",
    input="Hello"
)

# Continue conversation
response_2 = client.responses.create(
    model="gpt-4o",
    input="How are you?",
    previous_response_id=response.id
)
```

**Benefits of Migration:**
- Simpler API with less boilerplate
- Faster response times
- Built-in state management
- Conversation forking capabilities
- Unified interface for tools and multimodal inputs

## Conclusion

The Responses API provides a streamlined, powerful way to build conversational AI applications. Key takeaways:

1. **Simplified State Management**: No need to manually manage threads and messages
2. **Server-Side Storage**: Responses are stored for 30 days by default
3. **Flexible Continuation**: Use `previous_response_id` to continue or fork conversations
4. **Built-in Tools**: Web search, file search, and code interpreter available out of the box
5. **Multimodal Support**: Handle text, images, and files in the same API
6. **Cost Management**: Control token usage with limits and truncation

The Responses API represents a significant improvement over the Assistants API, offering better performance, simpler code, and more powerful features for building modern AI applications.

## Additional Resources

- [OpenAI Responses API Documentation](https://platform.openai.com/docs/api-reference/responses)
- [OpenAI Cookbook - Responses API Examples](https://cookbook.openai.com/examples/responses_api/responses_example)
- [Migration Guide: Assistants API to Responses API](https://apimagic.ai/blog/switching-assistant-responses-api)