<a href="https://colab.research.google.com/github/Vlad-Enia/NN-LLM-Intro/blob/master/Part%20II%20-%20LLMs/Demos/OpenAI%20-%20Responses%20API.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Responses API
* The Responses API is the newest API from OpenAI, adding
  * **Stateful** and **event-driven architecture**, making it easier to manage chat history and context
  * Built-in tools such as
    * Web search
    * File search (RAG)
    * Computer use
    * Code interpreter (Soon)

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
!pip install --upgrade openai

In [2]:
from openai import OpenAI
from google.colab import userdata

OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
client = OpenAI(api_key=OPENAI_API_KEY)

## Text Generation

Generating text works much as it does in the Chat Completions API, but the syntax is **simpler**.



In [4]:
response = client.responses.create(
  model="gpt-4o-mini",
  input="Tell me a three sentence bedtime story about a unicorn."
)

print("Response ID: " + response.id)
print("Response: " + response.output_text)

Response ID: resp_67ebc133d31c8192b529b3b9d9b8a82807c024617c29c7ed
Response: Once upon a time in a shimmering forest, a gentle unicorn named Luna discovered a hidden glade filled with sparkling stars. Each night, she would sprinkle a bit of her magic on the flowers, making them glow and bringing joy to all the woodland creatures. As they gathered to watch the enchanting display, Luna smiled, knowing that her heart's light was the true magic they all cherished.


Note that every response returned by the API comes with a unique ID.

Generate a new response by running the cell above and you will notice a different ID every time.

### Response Object

In [None]:
{
  "id": "resp_67eaaa98592081928e8564d32c50755607602cd764b32c2b",
  "object": "response",
  "created_at": 1741476542,
  "status": "completed",
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "max_output_tokens": null,
  "model": "gpt-4o-mini",
  "output": [
    {
      "type": "message",
      "id": "resp_67eaaa98592081928e8564d32c50755607602cd764b32c2b",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Once upon a time, in a shimmering forest, a gentle unicorn named Lila discovered a hidden glade where the moonlight danced upon the flowers. Every night, she sprinkled her magical stardust, bringing dreams to all the animals nearby. As they slept peacefully, Lila whispered, \"May your dreams be as bright as the stars above.\"",
          "annotations": []
        }
      ]
    }
  ],
  "parallel_tool_calls": true,
  "previous_response_id": null,
  "reasoning": {
    "effort": null,
    "generate_summary": null
  },
  "store": true,
  "temperature": 1.0,
  "text": {
    "format": {
      "type": "text"
    }
  },
  "tool_choice": "auto",
  "tools": [],
  "top_p": 1.0,
  "truncation": "disabled",
  "usage": {
    "input_tokens": 36,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 87,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 123
  },
  "user": null,
  "metadata": {}
}
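
The `output_text` property used earlier is a convenience provided by the Python SDK; the text itself sits under `output[...].content[...]` in the object above. Below is a minimal sketch of walking that structure by hand, assuming the response has been converted to a plain dict (for example with `response.model_dump()`). `collect_output_text` is a hypothetical helper, not part of the SDK.

```python
def collect_output_text(response: dict) -> str:
    """Concatenate every `output_text` part found in the response's message items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip tool calls and other non-message items
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part["text"])
    return "".join(parts)


# Trimmed-down stand-in for the response object shown above.
sample_response = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "Once upon a time...", "annotations": []}
            ],
        }
    ]
}

print(collect_output_text(sample_response))
```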


### Chat History

Share context across generated responses with the `previous_response_id` parameter. This parameter lets you chain responses and create a threaded conversation.

This is possible because the prompts are stored by OpenAI by default.

We can disable storing by setting the `store` parameter to `false` (`False` in Python).

In [5]:
response = client.responses.create(
    model="gpt-4o-mini",
    input="tell me a joke",
)
print(response.output_text)

print("\n------------------------\n")

second_response = client.responses.create(
    model="gpt-4o-mini",
    previous_response_id=response.id, # pass the ID of the first response as the value of previous_response_id
    input=[{"role": "user", "content": "explain why this is funny."}],
)
print(second_response.output_text)

Why did the scarecrow win an award?

Because he was outstanding in his field!

------------------------

The joke plays on a double meaning and a pun. 

1. **Double Meaning**: "Outstanding" can mean both "exceptional" (as in deserving an award) and literally "standing out" in a field (like a scarecrow would).

2. **Surprise Element**: The punchline leads the listener to realize that the scarecrow isn't recognized for any human-like achievement; instead, it's humorously noted for its literal presence in a field.

Combining wordplay with an unexpected twist makes it funny!
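
For longer conversations, the chaining can be wrapped in a small helper that remembers the last response ID. The sketch below is one possible shape, not an SDK feature: `Conversation` and `FakeResponses` are hypothetical names, and the fake client only exists so the example runs offline. With the real `client` from above, `Conversation(client)` would send the same arguments to `client.responses.create`.

```python
from types import SimpleNamespace


class Conversation:
    """Chains turns by passing the previous response's ID on every call."""

    def __init__(self, client, model="gpt-4o-mini"):
        self.client = client
        self.model = model
        self.last_response_id = None

    def ask(self, text):
        kwargs = {"model": self.model, "input": text}
        if self.last_response_id is not None:
            kwargs["previous_response_id"] = self.last_response_id
        response = self.client.responses.create(**kwargs)
        self.last_response_id = response.id
        return response.output_text


class FakeResponses:
    """Offline stand-in for `client.responses`; records the arguments it receives."""

    def __init__(self):
        self.calls = []

    def create(self, **kwargs):
        self.calls.append(kwargs)
        n = len(self.calls)
        return SimpleNamespace(id=f"resp_{n}", output_text=f"reply {n}")


fake_client = SimpleNamespace(responses=FakeResponses())
chat = Conversation(fake_client)
chat.ask("tell me a joke")
chat.ask("explain why this is funny.")

# The second request carries the ID returned by the first one.
print(fake_client.responses.calls[1]["previous_response_id"])
```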


### Including Instructions

The `instructions` parameter inserts a system (or developer) message as the first item in the model's context.

When used along with `previous_response_id`, the **instructions from a previous response will not be carried over to the next response**.

This makes it simple to swap out system (or developer) messages in new responses.

In [6]:
response = client.responses.create(
  model="gpt-4o-mini",
  instructions="You are a helpful assistant called LISA.",
  input="Hello! My name is Vlad. What is your name?",
)

print(response.output_text)

Hello, Vlad! I'm LISA. How can I assist you today?


### Streaming the Response

The Responses API uses semantic events for streaming. Each event is typed with a predefined schema, so you can listen for events you care about.

Some key lifecycle events are emitted only once, while others are emitted multiple times as the response is generated. Common events to listen for when streaming text are:

1. `response.created`
2. `response.output_text.delta`
3. `response.completed`
4. `error`

Here is what the sequence of event types looks like when streaming text:

In [33]:
response = client.responses.create(
  model="gpt-4o",
  instructions="You are a helpful assistant named LISA.",
  input="Hello! My name is Vlad. What is your name?",
  stream=True
)

for event in response:
  if event.type == "response.output_text.delta":
    print(event.type, "  -->  ", event.delta)
  else:
    print(event.type)

response.created
response.in_progress
response.output_item.added
response.content_part.added
response.output_text.delta   -->   Hello
response.output_text.delta   -->   ,
response.output_text.delta   -->    Vlad
response.output_text.delta   -->   !
response.output_text.delta   -->    My
response.output_text.delta   -->    name
response.output_text.delta   -->    is
response.output_text.delta   -->    L
response.output_text.delta   -->   ISA
response.output_text.delta   -->   .
response.output_text.delta   -->    How
response.output_text.delta   -->    can
response.output_text.delta   -->    I
response.output_text.delta   -->    assist
response.output_text.delta   -->    you
response.output_text.delta   -->    today
response.output_text.delta   -->   ?
response.output_text.done
response.content_part.done
response.output_item.done
response.completed


This is how to stream text responses:

In [7]:
response = client.responses.create(
  model="gpt-4o",
  instructions="You are a helpful assistant named LISA.",
  input="Hello! My name is Vlad. What is your name?",
  stream=True
)

for event in response:
  if event.type == "response.output_text.delta":
    print(event.delta, end="", flush=True)

Hello, Vlad! I'm LISA. How can I assist you today?
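
If the full text is needed after streaming finishes, the deltas can be collected while also reacting to the `error` and `response.completed` events listed earlier. This is a minimal sketch using simulated events so it runs without an API call; `collect_stream_text` is a hypothetical helper, and the `message` attribute on the error event is an assumption.

```python
from types import SimpleNamespace


def collect_stream_text(events):
    """Return the full text assembled from `response.output_text.delta` events."""
    chunks = []
    for event in events:
        if event.type == "response.output_text.delta":
            chunks.append(event.delta)
        elif event.type == "error":
            # Assumption: the error event carries a human-readable `message`.
            raise RuntimeError(getattr(event, "message", "stream error"))
        elif event.type == "response.completed":
            break  # nothing more to read once the response is complete
    return "".join(chunks)


# Simulated events standing in for the iterator returned with stream=True.
simulated_events = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="Hello"),
    SimpleNamespace(type="response.output_text.delta", delta=", Vlad!"),
    SimpleNamespace(type="response.completed"),
]

print(collect_stream_text(simulated_events))
```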

## Image Input
![](https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg)

In [71]:
response = client.responses.create(
    model="gpt-4o-mini",
    input=[
        {
            "role": "user",
            "content": [
                { "type": "input_text", "text": "what is in this image?" },
                {
                    "type": "input_image",
                    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
                }
            ]
        }
    ]
)

print(response.output_text)

The image depicts a scenic landscape featuring a wooden pathway leading through a lush, grassy area. The pathway is bordered by green grass and small shrubs, with trees visible in the background. The sky is bright with patches of clouds, suggesting a clear, sunny day. Overall, the scene conveys a tranquil and natural environment, ideal for walking or enjoying nature.
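
A local image can be sent in the same `input_image` item by encoding it as a base64 data URL instead of a public link. The sketch below only builds the message; it assumes the API accepts data URLs in `image_url`, and `image_message` is a hypothetical helper name.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build a user message that embeds an image as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "input_text", "text": prompt},
            {"type": "input_image", "image_url": f"data:{mime_type};base64,{encoded}"},
        ],
    }


# Placeholder bytes; in practice read them from a file opened in "rb" mode.
message = image_message("what is in this image?", b"\xff\xd8\xff")
print(message["content"][1]["image_url"])
```

The resulting dict can be passed as an element of the `input` list, in place of the message used in the cell above.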


## Built-in Tools

These tools help models access additional context and information from the web or your files.

### Function Calling

You can give the model access to your own custom code through function calling. Based on the system prompt and messages, the model may decide to call these functions — instead of (or in addition to) generating text or audio.

Steps:
1. Define custom function
2. Call the model with the function defined in the `tools` parameter;  
  A function is defined by its schema, informing the model
  * What the function does
  * When to use the function
  * What input arguments to expect
  
  The model returns an object containing the name of the function and input arguments.
3. Parse the model response and call the function with the given arguments.
4. Supply the model with the results so it can incorporate them in the final result.

#### Use Case: Get Weather

Let's look at a use case where we want the model to fetch weather information for a given location.

##### Step 1: define custom function

In [8]:
import requests

def get_weather(latitude, longitude):
    response = requests.get(f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m,wind_speed_10m&hourly=temperature_2m,relative_humidity_2m,wind_speed_10m")
    data = response.json()
    return data['current']['temperature_2m']

##### Step 2: call the model with the function defined

In [9]:
tools = [
    {
        "type": "function",
        "name": "get_weather", # Name of the function defined above
        "description": "Get the current weather in a given location", # Details on when and how to use the function
        "parameters": {     # JSON schema defining the function's input arguments
          "type": "object",
          "properties": {
              "latitude": {"type": "number"},
              "longitude": {"type": "number"}
          },
          "required": ["latitude", "longitude"],
        }
    }
]

input_messages = [{"role": "user", "content": "What's the weather like in Paris today?"}]

response = client.responses.create(
  model="gpt-4o-mini",
  tools=tools,
  input=input_messages,
  tool_choice="auto"
)

print(response.output[0].model_dump_json(indent=2))

{
  "arguments": "{\"latitude\":48.8566,\"longitude\":2.3522}",
  "call_id": "call_78gr8vKzvciNha5kS929fC3z",
  "name": "get_weather",
  "type": "function_call",
  "id": "fc_67ebc39a6aec8192bdeb05b6d156e3ff0d58282f5907372c",
  "status": "completed"
}


Notice that the custom function `get_weather` requires specific coordinates (`latitude` and `longitude`), but we gave the model only a general location (`Paris`).

The model noticed this and automatically determined the coordinates of Paris, thereby providing the correct inputs for the function call.

##### Step 3: parse the model response

In [10]:
import json

tool_call = response.output[0]

func_call = tool_call.name + '(' # name of the function we want to call -> get_weather(

arguments = json.loads(tool_call.arguments) # parse the JSON string of arguments into a dict
for key, value in arguments.items(): # iterate through the arguments and add them to the function call string
  func_call += key + "=" + str(value) + ','

func_call += ')'
print(func_call)


get_weather(latitude=48.8566,longitude=2.3522,)


Call the function with the given arguments.

In [11]:
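# Call the function directly with the parsed arguments; avoid eval() on model-generated text
func_result = get_weather(**arguments)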
print(func_result)

13.6
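
When several functions are exposed, or when the output contains more than one call, a lookup table keyed by function name is a safer way to run them than building and evaluating a code string. The sketch below is one way to do it under stated assumptions: `run_tool_calls`, `REGISTRY`, and `get_weather_stub` are hypothetical names, and the stub replaces the real `get_weather` so the example runs offline.

```python
import json
from types import SimpleNamespace


def get_weather_stub(latitude, longitude):
    """Offline stand-in for `get_weather`; returns a fixed temperature."""
    return 13.6


# Only functions listed here can be invoked on behalf of the model.
REGISTRY = {"get_weather": get_weather_stub}


def run_tool_calls(output_items, registry):
    """Execute every `function_call` item and build the matching result items."""
    results = []
    for item in output_items:
        if item.type != "function_call":
            continue  # e.g. plain assistant messages
        if item.name not in registry:
            raise KeyError(f"model requested unknown function: {item.name}")
        arguments = json.loads(item.arguments)
        value = registry[item.name](**arguments)
        results.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": str(value),
        })
    return results


# Simulated item shaped like the function call printed in Step 2.
simulated_output = [
    SimpleNamespace(
        type="function_call",
        name="get_weather",
        call_id="call_123",
        arguments='{"latitude": 48.8566, "longitude": 2.3522}',
    )
]
print(run_tool_calls(simulated_output, REGISTRY))
```

Each returned dict has the same shape as the `function_call_output` item that is sent back to the model.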


##### Step 4: Supply the model with the results so it can incorporate them in the final result

In [12]:
input_messages = [{"role": "user", "content": "What's the weather like in Paris today?"}]
input_messages.append(tool_call)    # append model's function call message
input_messages.append({             # append result message
    "type": "function_call_output",
    "call_id": tool_call.call_id,
    "output": str(func_result)
})

response_2 = client.responses.create(
    model="gpt-4o-mini",
    input=input_messages,
    tools=tools,
)
print(response_2.output_text)

The current temperature in Paris today is approximately 13.6°C. Would you like to know more about the weather, such as humidity or forecast for the week?


#### Additional configurations

By default the model will determine when and how many tools to use. You can force specific behavior with the `tool_choice` parameter.

* Auto: (Default) Call zero, one, or multiple functions.

  `tool_choice: "auto"`

* Required: Call one or more functions.

  `tool_choice: "required"`

* Forced Function: Call exactly one specific function.

  `tool_choice: {"type": "function", "name": "get_weather"}`

* None: Call no functions, as if none had been passed.

  `tool_choice: "none"`
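
In Python, these options become either a plain string or a dict passed as the `tool_choice` argument of `client.responses.create`. A small sketch of building that value follows; `make_tool_choice` is a hypothetical helper, shown only to make the four shapes explicit.

```python
from typing import Optional, Union


def make_tool_choice(mode: str = "auto", function_name: Optional[str] = None) -> Union[str, dict]:
    """Return a value suitable for the `tool_choice` request parameter."""
    if function_name is not None:
        # Forced function: the model must call exactly this function.
        return {"type": "function", "name": function_name}
    if mode not in ("auto", "required", "none"):
        raise ValueError(f"unsupported tool_choice mode: {mode!r}")
    return mode


print(make_tool_choice())                              # default behavior
print(make_tool_choice("required"))                    # at least one call
print(make_tool_choice(function_name="get_weather"))   # forced function
```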

#### Token Usage
Under the hood, functions are injected into the system message in a syntax the model has been trained on.

This means functions count against the model's context limit and are billed as input tokens.

If you run into token limits, limit the number of functions or the length of the descriptions you provide for function parameters.

More info on [Function Calling](https://platform.openai.com/docs/guides/function-calling?api-mode=responses)

### Web Search

Allows models to search the web for the latest information before generating a response.

Using the Responses API, you can enable web search by configuring it in the `tools` array in an API request to generate content.

Like any other tool, the model can choose to search the web or not based on the content of the input prompt.

##### Output

In [13]:
response = client.responses.create(
    model="gpt-4o-mini",
    tools=[{"type": "web_search_preview"}],
    input="What was a positive news story from today?"
)

In [14]:
from IPython.display import Markdown, display

display(Markdown(response.output_text))

As of April 1, 2025, several positive developments have been reported:

- **Global Work-Life Balance Shift**: An international study revealed that work-life balance has now surpassed pay as the primary motivator for employees worldwide. This trend highlights a growing emphasis on personal well-being over financial compensation. ([positive.news](https://www.positive.news/society/good-news-stories-from-week-04-of-2025/?utm_source=openai))

- **Democratic Republic of Congo's Conservation Efforts**: The DRC has pledged to establish the world's largest protected tropical reserve, covering an area approximately the size of France. This initiative aims to conserve vast tracts of the Congo Basin, a critical carbon sink and biodiversity hotspot. ([positive.news](https://www.positive.news/society/good-news-stories-from-week-04-of-2025/?utm_source=openai))

- **Thailand's Marriage Equality Milestone**: Thailand has become the first Southeast Asian country to legalize same-sex marriage, granting equal rights to same-sex couples, including financial benefits and adoption rights. This landmark legislation was celebrated with a mass wedding in Bangkok. ([positive.news](https://www.positive.news/society/good-news-stories-from-week-04-of-2025/?utm_source=openai))

- **Advancements in Prostate Cancer Detection**: Scientists have developed a groundbreaking ultrasound technique that shows promise in revolutionizing prostate cancer testing. This method utilizes existing clinical ultrasound equipment to produce high-resolution images of the prostate, potentially improving early detection and treatment outcomes. ([positive.news](https://www.positive.news/society/good-news-stories-from-week-02-of-2025/?utm_source=openai))

- **Record Growth in Electric Vehicle Sales**: Global sales of fully electric and plug-in hybrid electric vehicles increased by 25% in 2024, surpassing 17 million units sold. This surge indicates a significant shift towards sustainable transportation options worldwide. ([goodgoodgood.co](https://www.goodgoodgood.co/articles/good-news-this-week-january-25-2025?utm_source=openai))

These stories reflect positive progress in various sectors, including environmental conservation, social equality, healthcare, and sustainable transportation. 

##### Citations

By default, the model's response will include inline citations for URLs found in the web search results.

In addition, more detailed citation data (title, URL, and start/end indices) is available in the `annotations` of the output message:

In [15]:
def print_annotations(response):
  for annotation in response.output[1].content[0].annotations:
    print(annotation.model_dump_json(indent=2))

print_annotations(response)

{
  "end_index": 440,
  "start_index": 330,
  "title": "What went right this week: a global work reset, plus more - Positive News",
  "type": "url_citation",
  "url": "https://www.positive.news/society/good-news-stories-from-week-04-of-2025/?utm_source=openai"
}
{
  "end_index": 857,
  "start_index": 747,
  "title": "What went right this week: a global work reset, plus more - Positive News",
  "type": "url_citation",
  "url": "https://www.positive.news/society/good-news-stories-from-week-04-of-2025/?utm_source=openai"
}
{
  "end_index": 1266,
  "start_index": 1156,
  "title": "What went right this week: a global work reset, plus more - Positive News",
  "type": "url_citation",
  "url": "https://www.positive.news/society/good-news-stories-from-week-04-of-2025/?utm_source=openai"
}
{
  "end_index": 1727,
  "start_index": 1617,
  "title": "What went right this week: 'an epic ocean victory', plus more - Positive News",
  "type": "url_citation",
  "url": "https://www.positive.news/society/g

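The `start_index` and `end_index` fields can be used to recover the exact piece of the answer that each citation covers. The sketch below assumes these indices are character offsets into the message text and that the annotations have been converted to plain dicts; `cited_spans` is a hypothetical helper and the sample data is made up for illustration.

```python
def cited_spans(text, annotations):
    """Return (title, url, cited_text) for every `url_citation` annotation."""
    spans = []
    for annotation in annotations:
        if annotation.get("type") != "url_citation":
            continue
        start, end = annotation["start_index"], annotation["end_index"]
        spans.append((annotation["title"], annotation["url"], text[start:end]))
    return spans


# Made-up text and annotation, shaped like the output above.
sample_text = "Sales rose by 25% in 2024. (example.com)"
sample_annotations = [
    {
        "type": "url_citation",
        "start_index": sample_text.index("("),
        "end_index": len(sample_text),
        "title": "Example title",
        "url": "https://example.com/article",
    }
]

for title, url, cited in cited_spans(sample_text, sample_annotations):
    print(f"{cited} -> {title} ({url})")

```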
##### User Location
To refine search results based on geography, you can specify an approximate user location using country, city, region, and/or timezone.

* The `city` and `region` fields are free text strings, like `Minneapolis` and `Minnesota` respectively.
* The `country` field is a two-letter ISO country code, like `US`.
* The `timezone` field is an IANA timezone like `America/Chicago`.

In [16]:
response = client.responses.create(
    model="gpt-4o-mini",
    tools=[{
        "type": "web_search_preview",
        "user_location": {
            "type": "approximate",
            "country": "RO",
            "city": "Iasi",
            "region": "Iasi",
        }
    }],
    input="What is the weather like today?",
)

display(Markdown(response.output_text))

Today in Iași, Romania, expect mostly cloudy conditions with a high of 61°F (16°C) and a low of 45°F (7°C). The average high temperature in April is around 61.7°F (16.5°C), with average lows near 41.4°F (5.2°C). ([weather-atlas.com](https://www.weather-atlas.com/en/romania/iasi-weather-april?utm_source=openai))

## Weather for Iași, Romania:
Current Conditions: Mostly cloudy, 55°F (13°C)

Daily Forecast:
* Tuesday, April 01: Low: 45°F (7°C), High: 61°F (16°C), Description: Beautiful with a blend of sun and clouds
* Wednesday, April 02: Low: 46°F (8°C), High: 53°F (12°C), Description: Mostly cloudy
* Thursday, April 03: Low: 43°F (6°C), High: 52°F (11°C), Description: A passing morning shower or two; otherwise, cloudy
* Friday, April 04: Low: 43°F (6°C), High: 61°F (16°C), Description: Milder with plenty of sun
* Saturday, April 05: Low: 39°F (4°C), High: 65°F (18°C), Description: Partly sunny with a shower in the afternoon
* Sunday, April 06: Low: 35°F (2°C), High: 53°F (12°C), Description: A stray morning shower; otherwise, cloudy and cooler
* Monday, April 07: Low: 34°F (1°C), High: 47°F (8°C), Description: A few flurries in the morning; otherwise, mostly cloudy
 

##### Token Usage


When using this tool, the `search_context_size` parameter controls how much context is retrieved from the web to help the tool formulate a response.

The tokens used by the search tool **do not affect the context window** of the main model specified in the model parameter in your response creation request.

Choosing a context size impacts:
* **Cost**: Pricing of the search tool varies based on the value of this parameter. Higher context sizes are more expensive. See tool pricing [here](https://platform.openai.com/docs/pricing).
* **Quality**: Higher search context sizes generally provide richer context, resulting in more accurate, comprehensive answers.
* **Latency**: Higher context sizes require processing more tokens, which can slow down the tool's response time.

Available values:

* `high`: Most comprehensive context, highest cost, slower response.
* `medium` (default): Balanced context, cost, and latency.
* `low`: Least context, lowest cost, fastest response, but potentially lower answer quality.
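
The value is set on the tool entry itself, next to `user_location`. Below is a minimal sketch of a builder that validates the context size and the location fields described earlier; `web_search_tool` is a hypothetical helper, and the resulting dict is meant to be placed in the `tools` list.

```python
def web_search_tool(search_context_size="medium", **location):
    """Build a `web_search_preview` tool entry with an optional approximate location."""
    if search_context_size not in ("low", "medium", "high"):
        raise ValueError(f"invalid search_context_size: {search_context_size!r}")
    unknown = set(location) - {"country", "city", "region", "timezone"}
    if unknown:
        raise ValueError(f"unknown location fields: {sorted(unknown)}")

    tool = {"type": "web_search_preview", "search_context_size": search_context_size}
    if location:
        tool["user_location"] = {"type": "approximate", **location}
    return tool


print(web_search_tool("low", country="RO", city="Iasi", region="Iasi"))
```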

### File Search



File search enables models to retrieve information from a knowledge base of previously uploaded files through semantic and keyword search.

By creating vector stores and uploading files to them, you can augment the models' inherent knowledge.

This is essentially what we did in the [RAG demo](https://github.com/Vlad-Enia/NN-LLM-Intro/blob/master/Part%20II%20-%20LLMs/Demos/RAG/RAG_Chromadb.ipynb), but using OpenAI's vector store. When the model decides to use the tool, it automatically calls it, retrieves information from your files, and returns an output.

Steps:
  1. Upload files to OpenAI
  2. Create a vector store
  3. Add the files to the vector store
  4. Query the model by including the vector store in the `tools` parameter

#### Use Case: EV Documentation

We will create a knowledge base containing information about EVs, based on three PDF documents.

##### Step 1: Upload files to OpenAI

We will upload three PDF files containing info regarding EVs.

In [17]:
file_names = [
    'electric_vehicles.pdf',
    'pev_consumer_handbook.pdf',
    'department-for-transport-ev-guide.pdf'
]

# Replace with actual file paths after uploading them to drive
ev_docs_path = userdata.get('EV_DOCS_PATH') + '/'

file_ids = []

for file in file_names:
  with open(ev_docs_path + file, "rb") as f:
    result = client.files.create(file=f, purpose="assistants")
  file_ids.append(result.id)

You can also check out the uploaded files [here](https://platform.openai.com/storage/files).

##### Step 2: Create vector store

In [18]:
vector_store = client.vector_stores.create(
    name="EV_Documentation"
)
print(vector_store.id)

vs_67ebc6d94cf88191bc4d185f4b58d25f


You can also check out the vector store [here](https://platform.openai.com/storage/vector_stores).

##### Step 3: Add files to the vector store

In [19]:
for file_id in file_ids:
  result = client.vector_stores.files.create(
      vector_store_id=vector_store.id,
      file_id=file_id
  )
  print(result)

VectorStoreFile(id='file-JbpzTKMRDr67mj7qKvgkpc', created_at=1743505117, last_error=None, object='vector_store.file', status='in_progress', usage_bytes=0, vector_store_id='vs_67ebc6d94cf88191bc4d185f4b58d25f', attributes={}, chunking_strategy=StaticFileChunkingStrategyObject(static=StaticFileChunkingStrategy(chunk_overlap_tokens=400, max_chunk_size_tokens=800), type='static'))
VectorStoreFile(id='file-Kv1Nu8hagBiXuv8AM9v8sL', created_at=1743505118, last_error=None, object='vector_store.file', status='in_progress', usage_bytes=0, vector_store_id='vs_67ebc6d94cf88191bc4d185f4b58d25f', attributes={}, chunking_strategy=StaticFileChunkingStrategyObject(static=StaticFileChunkingStrategy(chunk_overlap_tokens=400, max_chunk_size_tokens=800), type='static'))
VectorStoreFile(id='file-FWUk8RMnwZW897RbLTtZTp', created_at=1743505119, last_error=None, object='vector_store.file', status='in_progress', usage_bytes=0, vector_store_id='vs_67ebc6d94cf88191bc4d185f4b58d25f', attributes={}, chunking_strate

Adding files to a vector store might take a few minutes, as they are automatically parsed, split into chunks, and embedded into vectors.

You can see the status of the files in the vector store [here](https://platform.openai.com/storage/vector_stores/).
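
The status can also be checked from code before querying, since the files above were still `in_progress` right after being added. The sketch below is a generic polling loop: `wait_until_processed` is a hypothetical helper, and `fetch_status` is any callable returning the current status string. With the real client it could wrap a call such as `client.vector_stores.files.retrieve(...)` and return its `status`, which is an assumption not exercised here.

```python
import time


def wait_until_processed(fetch_status, poll_seconds=2.0, timeout_seconds=300.0):
    """Poll `fetch_status()` until it leaves the `in_progress` state."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status != "in_progress":
            return status  # e.g. "completed" or a failure status
        if time.monotonic() >= deadline:
            raise TimeoutError("file is still being processed")
        time.sleep(poll_seconds)


# Offline demonstration with a scripted sequence of statuses.
scripted = iter(["in_progress", "in_progress", "completed"])
print(wait_until_processed(lambda: next(scripted), poll_seconds=0))
```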

##### Step 4: Query the model by including the vector store in the tools parameter

At the moment, you can search in only one vector store at a time, so you can include only one vector store ID when calling the file search tool.

In [127]:
response = client.responses.create(
    model="gpt-4o-mini",
    input="What are the types of charging stations for EVs?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": [vector_store.id]
    }]
)

display(Markdown(response.output_text))

There are several types of charging stations for electric vehicles (EVs) categorized mainly by the amount of power they deliver and their usage contexts. Here are the primary types:

1. **Level 1 Charging**:
   - Utilizes a standard 120-volt AC outlet.
   - Does not require any special installation beyond a dedicated circuit.
   - Adds about 2 to 5 miles of range per hour of charging.
   - Ideal for home use or when only a household outlet is available.

2. **Level 2 Charging**:
   - Operates on a 240-volt AC power supply, usually installed at homes or workplaces.
   - Requires a dedicated electrical circuit and installation of charging equipment.
   - Adds approximately 10 to 20 miles of range per hour.
   - Commonly used for overnight charging in residential settings.

3. **DC Fast Charging (DCFC)**:
   - Provides DC power directly to the vehicle, bypassing the onboard charger.
   - Can add about 60 to 80 miles of range in 20 minutes or less.
   - Typically found in public charging stations along highways and in urban areas.

4. **Wireless Charging**:
   - Utilizes an electromagnetic field to transfer electricity without a physical connection.
   - Still an emerging technology, it is primarily used in specific applications like transit buses and is being developed for passenger vehicles.

These charging methods offer a range of options depending on the user's needs and the infrastructure available.

Let's check out the citations, as they are included as annotations in the response:

In [128]:
print_annotations(response)

{
  "file_id": "file-LFq9WLEK1g7bpFL4kGHCiK",
  "index": 456,
  "type": "file_citation",
  "filename": "pev_consumer_handbook.pdf"
}
{
  "file_id": "file-LFq9WLEK1g7bpFL4kGHCiK",
  "index": 456,
  "type": "file_citation",
  "filename": "pev_consumer_handbook.pdf"
}
{
  "file_id": "file-LFq9WLEK1g7bpFL4kGHCiK",
  "index": 779,
  "type": "file_citation",
  "filename": "pev_consumer_handbook.pdf"
}
{
  "file_id": "file-LFq9WLEK1g7bpFL4kGHCiK",
  "index": 779,
  "type": "file_citation",
  "filename": "pev_consumer_handbook.pdf"
}
{
  "file_id": "file-LFq9WLEK1g7bpFL4kGHCiK",
  "index": 1041,
  "type": "file_citation",
  "filename": "pev_consumer_handbook.pdf"
}
{
  "file_id": "file-LFq9WLEK1g7bpFL4kGHCiK",
  "index": 1041,
  "type": "file_citation",
  "filename": "pev_consumer_handbook.pdf"
}
{
  "file_id": "file-LFq9WLEK1g7bpFL4kGHCiK",
  "index": 1309,
  "type": "file_citation",
  "filename": "pev_consumer_handbook.pdf"
}
{
  "file_id": "file-LFq9WLEK1g7bpFL4kGHCiK",
  "index": 1309,
  "

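Since the same file is cited many times, a per-file count is often easier to read than the raw list. A minimal sketch, assuming the annotations have been converted to plain dicts (for example with `model_dump()`); `citation_counts` is a hypothetical helper and the sample entries mirror the output above.

```python
from collections import Counter


def citation_counts(annotations):
    """Count `file_citation` annotations per cited file name."""
    return Counter(
        a["filename"] for a in annotations if a.get("type") == "file_citation"
    )


# Sample entries shaped like the annotations printed above.
file_annotations = [
    {"type": "file_citation", "index": 456, "filename": "pev_consumer_handbook.pdf"},
    {"type": "file_citation", "index": 779, "filename": "pev_consumer_handbook.pdf"},
]

for filename, count in citation_counts(file_annotations).items():
    print(f"{filename}: cited {count} times")

```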
#### Include search results in the response
To include search results in the response, you can use the `include` parameter when creating the response.

In [133]:
response = client.responses.create(
    model="gpt-4o-mini",
    input="What are the types of charging stations for EVs?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": [vector_store.id]
    }],
    include=["file_search_call.results"]  # also return the retrieved chunks
)

for result in response.output[0].results:
  print(result.model_dump_json(indent=2))

{
  "attributes": {},
  "file_id": "file-LFq9WLEK1g7bpFL4kGHCiK",
  "filename": "pev_consumer_handbook.pdf",
  "score": 0.8874569173726506,
  "text": "Beard, Portland State University/PIX 19557\n\nhttp://afdc.energy.gov/laws\nhttp://afdc.energy.gov/cleancities/coalitions/coalition_locations.php\nhttp://afdc.energy.gov/cleancities/coalitions/coalition_locations.php\nhttp://naseo.org/members-states\nhttp://naseo.org/members-states\n\n\nPlug-In Electric Vehicle Handbook for Consumers 9\n\nCharging your PEV requires plugging in to electric \nvehicle supply equipment (EVSE). EVs must be charged \nregularly, and charging PHEVs regularly minimizes the \namount of gasoline they consume. Whether publicly \navailable or installed in your home, there are vari-\nous types of charging infrastructure. One important \nvariation is how quickly they can charge a vehicle. This \nsection describes the EVSE options so you can choose \nwhat’s best for you.\n\nCharging Your Vehicle\n\nInlet\n\nConnector\n\n

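The raw chunks are long, so a one-line preview per result can be easier to scan. The sketch below works on plain dicts with the `score`, `filename`, and `text` fields seen above; `preview_results` is a hypothetical helper and the sample data is made up.

```python
def preview_results(results, width=60):
    """Return one line per search result: score, file name, start of the text."""
    lines = []
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    for result in ranked:
        # Collapse newlines and repeated spaces before truncating.
        snippet = " ".join(result["text"].split())[:width]
        lines.append(f"{result['score']:.3f}  {result['filename']}  {snippet}")
    return lines


# Made-up results shaped like the entries above.
sample_results = [
    {"score": 0.5, "filename": "a.pdf", "text": "first\nchunk"},
    {"score": 0.9, "filename": "b.pdf", "text": "second   chunk"},
]

for line in preview_results(sample_results):
    print(line)

```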
#### Limiting the number of results

You can customize the number of results you want to retrieve from the vector stores.

This can help reduce both token usage and latency, but may come at the cost of reduced answer quality.

In [134]:
response = client.responses.create(
    model="gpt-4o-mini",
    input="What are the types of charging stations for EVs?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": [vector_store.id],
        "max_num_results": 2  # retrieve at most two chunks
    }],
    include=["file_search_call.results"]
)

for result in response.output[0].results:
  print(result.model_dump_json(indent=2))

{
  "attributes": {},
  "file_id": "file-LFq9WLEK1g7bpFL4kGHCiK",
  "filename": "pev_consumer_handbook.pdf",
  "score": 0.8998272777067303,
  "text": "EVs typically have more battery capacity than PHEVs, \nso charging a fully depleted EV takes longer than charg-\ning a fully depleted PHEV.\n\nLevel 1\n\nLevel 1 EVSE provides charging through a 120 volt (V) \nAC plug and requires a dedicated branch circuit per \nthe National Electrical Code (NEC). Most, if not all, \nPEVs will come with a portable Level 1 EVSE cord-\nset, which does not require installation of additional \ncharging equipment. Typically, on one end of the cord \nis a standard, three-prong household plug (NEMA 5-15 \nconnector). On the other end is a standard SAE J1772 \nconnector (see Connectors and Plugs section below), \nwhich plugs into the vehicle. \n\nLevel 1 works well for charging at home, work, or when \nonly a 120 V outlet available. Based on the battery type \nand vehicle, Level 1 charging adds about 2 to 5 mil

## Reasoning

Reasoning models think before they answer, producing a long internal chain of thought before responding to the user.

Reasoning models excel in complex problem solving, coding, scientific reasoning, and multi-step planning for agentic workflows.

As with GPT models, OpenAI provides both a smaller, faster model (`o3-mini`) that is less expensive per token, and a larger model (`o1`) that is somewhat slower and more expensive, but can often generate better responses for complex tasks, and generalize better across domains.



In [136]:
prompt = """
Write a bash script that takes a matrix represented as a string with
format '[1,2],[3,4],[5,6]' and prints the transpose in the same format.
"""

response = client.responses.create(
    model="o3-mini",
    reasoning={"effort": "medium"},
    input=[
        {
            "role": "user",
            "content": prompt
        }
    ]
)

print(response.output_text)

Below is one way to do it in a Bash script. In this solution we:

• Accept the matrix string as the first argument (e.g. "[1,2],[3,4],[5,6]")
• Replace the “],[” with a temporary separator (here “;”) and remove the leading “[” and trailing “]” so that we can split the string into rows.
• Split each row by commas to get the cells and store them in a 2D array (using an associative array).
• Build the transpose by swapping rows and columns.
• Print the result in the same format.

Save the following script as, for example, transpose.sh and make it executable.

--------------------------------------------------
#!/bin/bash
# Check if an argument is provided.
if [ "$#" -lt 1 ]; then
    echo "Usage: $0 '<matrix_string>'"
    exit 1
fi

# Input matrix in the format: "[1,2],[3,4],[5,6]"
matrix="$1"

# Convert the input to a format we can easily split:
# • Remove the first '[' and the last ']'
# • Replace "],[" with a semicolon (;) so that we can split the rows.
matrix_clean=$(echo "$matrix" | 

### Reasoning effort

You can set the `effort` field of the `reasoning` parameter to `low`, `medium`, or `high`, where:
* `low`: favors speed and economical token usage.
* `medium` (default): a balance between speed and reasoning accuracy.
* `high`: favors more complete reasoning at the cost of more tokens generated and slower responses.
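
The effect of the effort setting on token usage can be inspected through the `usage` block of the response object, which reports `reasoning_tokens` under `output_tokens_details`. The sketch below assumes reasoning tokens are counted as part of `output_tokens` and works on a plain dict; `split_output_tokens` is a hypothetical helper, and the sample values are taken from the response object shown in the Text Generation section.

```python
def split_output_tokens(usage: dict) -> dict:
    """Separate hidden reasoning tokens from the tokens of the visible answer."""
    output_tokens = usage["output_tokens"]
    reasoning_tokens = usage.get("output_tokens_details", {}).get("reasoning_tokens", 0)
    return {"reasoning": reasoning_tokens, "visible": output_tokens - reasoning_tokens}


# Usage block from the earlier gpt-4o-mini response (no reasoning tokens).
sample_usage = {
    "input_tokens": 36,
    "output_tokens": 87,
    "output_tokens_details": {"reasoning_tokens": 0},
    "total_tokens": 123,
}

print(split_output_tokens(sample_usage))
```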