# OpenAI Responses API

## What is the OpenAI Responses API?

The Responses API, released in March 2025, combines the traditional 
Chat Completions API with the Assistants API and supports:

- **Traditional Chat Completions:** Facilitates seamless conversational AI experiences.
- **Web Search:** Enables real-time information retrieval from the internet.
- **File Search:** Allows searching within files for relevant data.

Accordingly, the Assistants API will be retired in 2026. 

> **For new users, OpenAI recommends using the Responses API instead of the Chat Completions API to leverage its expanded capabilities.**

For a comprehensive comparison between the Responses API and the Chat Completions API, refer to the official OpenAI documentation: 
[Responses vs. Chat Completions](https://platform.openai.com/docs/guides/responses-vs-chat-completions).

## Summary of This Notebook
This notebook provides a hands-on guide for using the **OpenAI Responses API** to analyze tweets. 
It covers essential techniques such as:

- **Creating a vector store** and uploading tweets for semantic search.
- **Using file search** to analyze private datasets.
- **Performing a web search** to retrieve the latest public information.
- **Utilizing stateful responses** to maintain conversation context.
- **Combining file and web search** to enhance retrieval-augmented generation (RAG) applications.

By the end of this notebook, users will be able to integrate OpenAI's Responses API for efficient data retrieval and analysis of structured and unstructured data.

## Install Required Libraries
To use the OpenAI Responses API, install the following library:

- **`openai`**: Provides access to OpenAI's APIs, including the Responses API.

In [1]:
%pip install openai -q

Note: you may need to restart the kernel to use updated packages.


## Import Required Libraries

In [2]:
from IPython.display import Markdown, display
import boto3
from botocore.exceptions import ClientError
import json
import io

## Retrieve Secrets from AWS Secrets Manager

In [3]:
def get_secret(secret_name):
    region_name = "us-east-1"

    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name
    )

    try:
        get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    except ClientError as e:
        raise e

    secret = get_secret_value_response['SecretString']
    
    return json.loads(secret)
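
When the notebook runs outside AWS, an environment variable is one possible fallback for the secret lookup above. The sketch below is illustrative only: `resolve_api_key` and its parameters are hypothetical names that do not appear in the original notebook, and it assumes the secret has the `{"api_key": ...}` shape used in the next cell. `OPENAI_API_KEY` is the variable the OpenAI Python SDK reads by default.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    secret: Optional[Mapping[str, str]] = None,
    env: Mapping[str, str] = os.environ,
    env_var: str = "OPENAI_API_KEY",
) -> str:
    """Return the API key from a secret dict, else from an environment variable.

    Hypothetical helper: `secret` is assumed to look like {"api_key": "..."}.
    """
    if secret and secret.get("api_key"):
        return secret["api_key"]
    key = env.get(env_var, "")
    if not key:
        raise RuntimeError(
            f"No API key found: pass a secret with 'api_key' or set {env_var}."
        )
    return key
```

A call such as `resolve_api_key(get_secret('openai'))` would keep the Secrets Manager path as the first choice and only consult the environment when the secret has no key.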

## Initialize OpenAI Client

In [4]:
from openai import OpenAI
openai_api_key  = get_secret('openai')['api_key']

client = OpenAI(api_key=openai_api_key)

## File Search API

### Introduction to File Search
The file search tool retrieves relevant passages 
from uploaded files by using vector-based indexing. It is particularly useful 
for searching large datasets, extracting insights, and improving retrieval-augmented generation (RAG) applications.

Unlike traditional keyword-based searches, the Responses API uses embeddings 
to identify semantically relevant content, making it ideal for analyzing structured 
and unstructured text data (OpenAI, 2025).

For more details, visit the official OpenAI documentation: 
[File Search in Responses API](https://platform.openai.com/docs/guides/tools-file-search).
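
To make the contrast with keyword search concrete, here is a toy sketch of ranking by embedding similarity. The three-dimensional vectors are made up for illustration, and cosine similarity is simply one common way to compare embeddings; nothing here claims to reproduce how OpenAI computes or ranks its own embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, docs):
    """Return (text, score) pairs sorted from most to least similar."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up "embeddings"; real ones have hundreds or thousands of dimensions.
toy_docs = {
    "New image model released": [0.9, 0.1, 0.0],
    "Stock market closes higher": [0.0, 0.2, 0.9],
    "LLM adds video generation": [0.8, 0.3, 0.1],
}
toy_query = [1.0, 0.2, 0.0]  # stands in for a generative AI question

for text, score in rank_by_similarity(toy_query, toy_docs):
    print(f"{score:.3f}  {text}")
```

Note that none of the toy documents needs to share a keyword with the query: the ranking depends only on how closely the vectors point in the same direction.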

### Create a Vector Store

In [5]:
vector_store = client.vector_stores.create(
    name="my_vector_store"
)
vector_store_id = vector_store.id
print(vector_store_id)

vs_6914bfd30bc081918bd2ad146efc3f58


### Upload Files

In [6]:
with open('tweet_text (1).json', 'rb') as f:
    file = client.files.create(
        file=f,            # file-like object
        purpose="assistants"
    )

file_id = file.id
print(file_id)

file-MyhU7DkWiAcZMfvSFXmJ4k


### Attach File to Vector Store

In [7]:
attach_status = client.vector_stores.files.create(
    vector_store_id=vector_store_id,
    file_id=file_id
)

print(attach_status.id)

file-MyhU7DkWiAcZMfvSFXmJ4k
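
Attaching a file starts indexing, and a search issued before indexing finishes may come back empty. A minimal polling sketch follows. `wait_for_status` is a hypothetical helper, and the status strings (`in_progress`, `completed`, `failed`, `cancelled`) are assumptions about the vector store file object that should be checked against the current API reference. Recent SDK versions also appear to offer a `create_and_poll` helper on `client.vector_stores.files`; verify that against your installed version before relying on it.

```python
import time


def wait_for_status(fetch_status, done=("completed",), failed=("failed", "cancelled"),
                    interval=1.0, timeout=60.0, sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a status listed in `done`.

    Raises RuntimeError on a failed status and TimeoutError once `timeout`
    seconds have passed without reaching a final status.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in done:
            return status
        if status in failed:
            raise RuntimeError(f"Indexing ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"Still {status!r} after {timeout} seconds")
        sleep(interval)


# Usage with the notebook's objects (needs network access, so left commented out):
# wait_for_status(
#     lambda: client.vector_stores.files.retrieve(
#         vector_store_id=vector_store_id, file_id=file_id
#     ).status
# )
```

Passing the status lookup as a callable keeps the helper independent of the SDK, so it can be exercised offline with a stub.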


### Query the Vector Store

In [8]:
query = "the latest development in generativeAI"

In [9]:
search_results = client.vector_stores.search(
    vector_store_id=vector_store_id,
    query=query
)

for result in search_results.data[:5]:
    print(result.content[0].text[:100] + '\n Relevance score: ' + str(result.score))
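
The loop above assumes every hit has at least one content chunk. A slightly more defensive formatter could look like the sketch below. `format_hit` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only mimic the attributes the loop reads (`.content[0].text` and `.score`); they are not real API results.

```python
from types import SimpleNamespace


def format_hit(result, max_chars=100):
    """Build a one-line summary; tolerate hits whose content list is empty."""
    text = result.content[0].text if result.content else ""
    snippet = " ".join(text.split())[:max_chars]  # collapse newlines and repeated spaces
    return f"{result.score:.3f}  {snippet}"


# Stand-in objects for offline illustration only
fake_hits = [
    SimpleNamespace(content=[SimpleNamespace(text="Gen AI\n  is moving fast")], score=0.8123),
    SimpleNamespace(content=[], score=0.4),
]
for hit in fake_hits:
    print(format_hit(hit))
```

With real results, the same function could be applied as `format_hit(result)` inside the loop over `search_results.data`.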

## OpenAI Responses API

### Simple Response

In [10]:
simple_response = client.responses.create(
  model="gpt-4o",
  input=[
      {
          "role": "user",
          "content": query
      }
  ]
)

In [11]:
display(Markdown(simple_response.output_text))

As of the latest updates, here are some significant developments in generative AI:

1. **Advancements in Multimodal Models**: Models like OpenAI's GPT-4 and Google's Gemini are designed to handle both text and images, enhancing their ability to generate more sophisticated and contextually relevant outputs.

2. **Improved Text-to-Image Generation**: Tools like DALL-E 3 and Midjourney have further refined their algorithms to produce more photorealistic and creative images from text prompts.

3. **AI in Content Creation**: There's increasing adoption of generative AI in fields like art, music, and video game design, enabling creators to produce complex works with less manual effort.

4. **Ethical and Regulatory Focus**: With growing concerns over plagiarism, misinformation, and AI ethics, organizations and governments are working on guidelines to ensure responsible use of generative AI technologies.

5. **AI-Driven Personalization**: Companies are leveraging generative models to create personalized marketing content, enhancing customer engagement and targeting.

6. **Expanding Applications in Healthcare**: AI is being used to generate synthetic data to train models for medical diagnostics and drug discovery, accelerating research while protecting patient privacy.

These advancements reflect the rapid progress and expanding capabilities of generative AI across various sectors.

### File Search Response

In [12]:

file_search_response = client.responses.create(
    input= query,
    model="gpt-4o",
    temperature = 0,
    tools=[{
        "type": "file_search",
        "vector_store_ids": [vector_store_id],
    }]
)

In [13]:
display(Markdown(file_search_response.output_text))


The latest developments in generative AI include:

1. **OpenAI's Sora2**: This tool allows users to create cinematic videos from a prompt, incorporating audio, physics, and cameos, offering endless creative possibilities.

2. **GPT-5 and DALL·E 4**: These are being used to create unique multimedia experiences, such as combining quantum fractals with music to produce innovative content.

3. **Generative AI in Business**: AI is being applied across various sectors, from content generation to design systems, reshaping how businesses innovate.

4. **AI in Creative Industries**: Generative AI is transforming creative fields by enabling the production of content like cinematic food commercials without traditional setups.

5. **Enterprise Applications**: IBM's Watsonx is bringing generative AI to enterprises, allowing for the creation of custom large language models to enhance customer engagement and streamline processes.

These developments highlight the expanding role of generative AI in both creative and business applications, driving innovation and efficiency across industries.
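
To see which sources an answer drew on, the annotations attached to the output text can be inspected. The sketch below walks a response converted to a plain dict. `collect_citations` is a hypothetical helper, and the field names (`output`, `content`, `annotations`, `file_citation`, `url_citation`) reflect my understanding of the documented response shape, so treat them as assumptions to verify. The sample payload is hand-written, not real API output.

```python
def collect_citations(response_dict):
    """Gather file and URL citations from a Responses API payload given as a dict."""
    citations = []
    for item in response_dict.get("output", []):
        if item.get("type") != "message":
            continue  # skip tool-call items such as file_search_call
        for part in item.get("content", []):
            for ann in part.get("annotations") or []:
                if ann.get("type") == "file_citation":
                    citations.append(("file", ann.get("filename") or ann.get("file_id")))
                elif ann.get("type") == "url_citation":
                    citations.append(("url", ann.get("url")))
    return citations


# Hand-written sample in the assumed shape; not real API output.
sample = {
    "output": [
        {"type": "file_search_call", "id": "fs_1"},
        {
            "type": "message",
            "content": [
                {
                    "type": "output_text",
                    "text": "Example answer.",
                    "annotations": [
                        {"type": "file_citation", "file_id": "file-abc", "filename": "notes.json"},
                        {"type": "url_citation", "url": "https://example.com/a", "title": "A"},
                    ],
                }
            ],
        },
    ]
}
print(collect_citations(sample))
```

Because the SDK's response objects are Pydantic models, `collect_citations(file_search_response.model_dump())` should produce the dict form this helper expects.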

## Web Search API

### Introduction to Web Search
The OpenAI Web Search tool allows models to retrieve real-time information from the internet. 
This capability is particularly useful for obtaining up-to-date data, fact-checking, and expanding knowledge 
without relying solely on pre-trained information. 

By leveraging OpenAI's web search functionality, the Responses API can fetch external data 
and provide accurate, relevant results in real time (OpenAI, 2025). 
This feature enhances applications that require the latest insights, such as news aggregation, research, 
or dynamic content generation.

For more details, visit the official OpenAI documentation: 
[Web Search in Responses API](https://platform.openai.com/docs/guides/tools-web-search).

### Perform Web Search

In [14]:
web_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input= query,
    tools=[
        {
            "type": "web_search"
        }
    ]
)

In [15]:
display(Markdown(web_search_response.output_text))

Here's a detailed, up-to-date analysis of the **latest developments in Generative AI** as of Wednesday, **November 12, 2025**:

---

## Recent Real-World Deployments and Infrastructure Expansion

- **AWS Skill Builder Launches AI-Powered Training Tools**  
On November 12, 2025, AWS introduced new components (**Meeting Simulator**, **Cohorts Studio**, and **microcredentials**) within its Skill Builder platform. These additions enable AI-enhanced communication skills training and group-based learning, and the beta exam for a professional generative AI certification will open for registrations on **November 18** ([aboutamazon.com](https://www.aboutamazon.com/news/aws/aws-ai-certification-learning-tools-skills-development?utm_source=openai)).

- **UT Austin Doubles Generative AI GPU Capacity**  
As of today, the University of Texas at Austin has **expanded its AI infrastructure**, more than doubling its training capabilities by increasing GPU resources to over 1,000 units. This expansion significantly bolsters its capacity for generative AI research at academic scale ([quantumzeitgeist.com](https://quantumzeitgeist.com/ut-ai-computing/?utm_source=openai)).

- **Cerebras Systems Launches "Cerebras for Nations"**  
On November 11, 2025, Cerebras unveiled **"Cerebras for Nations"**, a global program designed to assist governments in building sovereign AI infrastructure. This empowers countries to develop and scale generative AI systems domestically ([hpcwire.com](https://www.hpcwire.com/aiwire/2025/11/11/cerebras-systems-launches-cerebras-for-nations-to-accelerate-and-scale-sovereign-ai/?utm_source=openai)).

---

## Academic Research and Thought Leadership

- **Georgia Tech Cloud Hub Boosts GenAI Research via Microsoft Support**  
Today, the Georgia Tech Institute for Data Engineering and Science (IDEaS) concluded a call for proposals for its "Cloud Hub" initiative, funded by Microsoft. This targets foundational and applied generative AI research, fostering innovation across the field ([newswise.com](https://www.newswise.com/articles/georgia-tech-cloud-hub-advances-generative-ai-research-with-microsoft-support?utm_source=openai)).

- **Manchester Study Highlights AI in Teacher Training**  
Research from the Manchester Institute of Education reveals early evidence on responsibly embedding generative AI into primary teacher training. The study uses a tool called **TeachMateAI (TMAI)** in the PGCE program, with results from the first year of a three-year longitudinal project ([phys.org](https://phys.org/news/2025-11-ai-teacher.html?utm_source=openai)).

---

## Expanded Use Cases & Ethical Considerations

- **Generative AI Adoption and Labor Market Implications**  
A new analysis indicates GenAI is being adopted more broadly and faster than prior digital technologies, offering notable productivity gains in professional tasks (e.g., writing and coding). However, usage remains skewed toward younger, more educated individuals, raising concerns over equitable distribution of benefits ([voxeu.org](https://voxeu.org/voxeu/columns/generative-ai-uneven-adoption-labour-market-returns-and-policy-implications?utm_source=openai)).

- **AI in Scientific Discovery Accelerating Breakthroughs**  
In industries from drug discovery to climate modeling, AI systems are now acting not just as tools but as architects of research, accelerating processes that once spanned decades to mere hours or minutes ([webpronews.com](https://www.webpronews.com/ais-quantum-leap-reshaping-science-from-labs-to-breakthroughs/?utm_source=openai)).

---

## Generative AI Tools & Capabilities Advancements

- **Google's Pixel Drop AI Enhances On-Device Generative Features**  
Google's November "Pixel Drop AI" includes real-time generative photo editing in messaging apps, AI-powered content summaries, and improved scam detection features, all running on-device ([startuphub.ai](https://www.startuphub.ai/ai-news/ai-research/2025/pixel-drop-ai-google-deepens-on-device-intelligence/?amp=1&utm_source=openai)).

- **Agentic Coding: Google Jules Performance**  
A newly reported development in AI: **Google's Jules**, a coding assistant, performs better than Gemini CLI (though built on the same underlying model), and is competitive with Anthropic's Claude Code and OpenAI's Codex. Users are still advised to critically verify outputs, recognizing these agentic tools can produce inaccurate or invented information ([infoworld.com](https://www.infoworld.com/article/4086269/agentic-coding-with-google-jules.html?utm_source=openai)).

---

## Contextual Perspective: Recent Model and Technical Breakthroughs

- **OpenAI GPT-5 (Released August 7, 2025)**  
  GPT-5 is a multimodal LLM offering unified reasoning and fast-processing capabilities via internal routing between two primary model types ("main" and "thinking"), with variants like mini and nano optimized for speed or reasoning effort. It's available through ChatGPT, Microsoft Copilot, and the OpenAI API ([en.wikipedia.org](https://en.wikipedia.org/wiki/GPT-5?utm_source=openai)).

- **OpenAI o4-mini (Released April 16, 2025)**  
  A lighter reasoning-capable model capable of processing both text and images, accessible via ChatGPT and API, including a "high" variant for enhanced accuracy. It improves tasks such as chemistry-related reasoning but still trails advanced models like DeepSeek-R1 ([en.wikipedia.org](https://en.wikipedia.org/wiki/OpenAI_o4-mini?utm_source=openai)).

- **Runway Gen-4 (Released March 31, 2025)**  
  A text-to-video model generating up to 10-second clips with improved character consistency, motion realism, and camera movement. Targeted at previsualization and concept development, it remains limited by clip length and frame rate constraints ([en.wikipedia.org](https://en.wikipedia.org/wiki/Gen-4_%28AI_image_and_video_model%29?utm_source=openai)).

- **Google's "Nano Banana" (Gemini 2.5 Flash Image, August 2025)**  
  A viral text-to-image tool featuring features like multi-image fusion, subject consistency, and SynthID watermarking. It rapidly attracted over 10 million new Gemini app users and produced hundreds of millions of edits ([en.wikipedia.org](https://en.wikipedia.org/wiki/Nano_Banana?utm_source=openai)).

---

## Synthesis & Outlook

The **latest developments in generative AI (as of November 12, 2025)** are characterized by:

- **Broader institutional and governmental deployment**, from AWS's AI-based learning platforms to national-level AI infrastructure programs like Cerebras for Nations.  
- **Academic integration and research acceleration**, with universities building capacity and launching long-term AI-enhanced education studies.  
- **Diversification of AI use cases**, spanning teaching, scientific research, and infrastructure. Efficiency gains are notable, yet the equitable reach of these gains remains uneven.  
- **Advances in generative tools**, with strong progress in on-device capabilities, agentic coding, and video/image generation through models like GPT-5, Runway Gen-4, and Nano Banana.

All these point toward a transformative moment where generative AI is becoming deeply embedded across society, from education and government to creative industries, while ethical and access-related considerations remain central.

If you'd like deeper insights into any specific area (e.g., a particular tool, policy implications, or academic research), let me know; I'd be happy to explore further.
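
The cited links in the web search output above carry a `utm_source=openai` query parameter. If cleaner URLs are wanted for storage or display, they can be normalized with the standard library. `strip_tracking` is a hypothetical helper written for this illustration; the choice to drop every parameter starting with `utm_` is an assumption, not something the API prescribes.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_tracking(url, prefixes=("utm_",)):
    """Remove query parameters whose names start with any of `prefixes`."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(prefixes)
    ]
    # Rebuild the URL with only the parameters that were kept
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking("https://phys.org/news/2025-11-ai-teacher.html?utm_source=openai"))
```

One caveat: the remaining query parameters are decoded and re-encoded, so unusual escaping in the original query string may come back in a slightly different but equivalent form.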


### Stateful Response

The OpenAI Responses API includes a stateful feature that enables continuity in interactions. 
A stored response can be fetched again by its ID, and passing that ID as `previous_response_id` in a new request lets a conversation persist across multiple queries, 
so users can refine or expand upon previous searches. This is particularly useful for iterative research, 
dynamic content generation, and applications that require follow-up queries based on prior responses.

In [16]:
fetched_response = client.responses.retrieve(response_id=web_search_response.id)
display(Markdown(fetched_response.output_text[:100]))

Here's a detailed, up-to-date analysis of the **latest developments in Generative AI** as of Wednesd

### Continue Query with Web Search

In [None]:
continue_query = 'find different news'

continue_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input= continue_query,
    previous_response_id=web_search_response.id,
    tools=[
        {
            "type": "web_search"
        }
    ]
)

In [None]:
display(Markdown(continue_search_response.output_text))

### Combining File Search and Web Search

This example uses file search to analyze private data and web search to retrieve public, up-to-date information. 
The Responses API allows developers to integrate these tools to enhance retrieval-augmented generation (RAG) applications. 
By combining file search with web search, users can draw on internal knowledge while also retrieving real-time 
information from external sources, producing more comprehensive and current responses. 

In [None]:
combined_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input= query,
    temperature = 0,
    instructions="Retrieve the results from the file search first, and use the web search tool to expand the results with news resources",
    tools=[{
        "type": "file_search",
        "vector_store_ids": [vector_store_id],
    },
        {
            "type": "web_search"
        }
    ]
)

In [None]:
display(Markdown(combined_search_response.output_text))
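
When the tool mix varies between calls, the `tools` list can be assembled by a small function rather than written out each time. `build_tools` is a hypothetical helper. The dict shapes mirror the cells above, and `max_num_results` is included on the assumption that the file search tool accepts it as an optional setting, which is worth confirming in the tool documentation.

```python
def build_tools(vector_store_ids=(), use_web_search=False, max_num_results=None):
    """Assemble the `tools` list for client.responses.create."""
    tools = []
    if vector_store_ids:
        file_tool = {"type": "file_search", "vector_store_ids": list(vector_store_ids)}
        if max_num_results is not None:
            file_tool["max_num_results"] = max_num_results  # assumed optional setting
        tools.append(file_tool)
    if use_web_search:
        tools.append({"type": "web_search"})
    return tools


print(build_tools(["vs_example"], use_web_search=True))
```

The result can be passed directly as `tools=build_tools([vector_store_id], use_web_search=True)` in a call like the combined search above.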

# 🧩 Try It Yourself: Two-Step RAG (Private Data + Combined Search)

## Step 1: Upload & Create Vector Store
1. Upload a short text file (e.g., `my_notes.txt`) to your notebook instance.  
2. Create a **vector store** and **ingest** your uploaded file.  
3. Run a simple test query to verify retrieval:  

In [None]:
# create vector store
vector_store = client.vector_stores.create(
    name="my_vector_store"
)
vector_store_id = vector_store.id
print('vector store id: ', vector_store_id)


In [None]:
# upload files to file search
with open('MLB_pitch_tracking.txt', 'rb') as f:
    file = client.files.create(
        file=f,            # file-like object
        purpose="assistants"
    )
file_id = file.id
print(file_id)

In [None]:
# attach file to vector store
attach_status = client.vector_stores.files.create(
    vector_store_id=vector_store_id,
    file_id=file_id
)
print(attach_status.id)

In [None]:
query = "recent technical and usage developments of mlb pitch tracking system"

In [None]:
search_results = client.vector_stores.search(
    vector_store_id=vector_store_id,
    query=query
)

for result in search_results.data[:5]:
    print(result.content[0].text[:100] + '\n Relevance score: ' + str(result.score))

## Step 2: Combine File Search with Web Search
1. Enable both **file_search** and **web_search** in the Responses API.  
2. Use a prompt that asks the model to merge insights from both sources.  
   > Example: "Using my uploaded notes and the latest web information, summarize the current trends on this topic."  
3. Review how the answer combines **private context** from your file with **current info** from the web.

✅ You've created a RAG system that combines **private** and **public** data for comprehensive, up-to-date analysis.


In [None]:
query = "Using my uploaded notes and the latest web information, summarize the current trends on the mlb pitch tracking system."

In [None]:
# combined search
combined_search_response = client.responses.create(
    input=query,
    model="gpt-4o",
    temperature=0,
    tools=[
        {
            "type": "file_search",
            "vector_store_ids": [vector_store_id],
        },
        {
            "type": "web_search"
        }
    ]
)


In [None]:
display(Markdown(combined_search_response.output_text))