# OpenAI Responses API: Advanced Tweet Analysis with File & Web Search Integration

## What is the OpenAI Responses API?

The Responses API, released in March 2025, combines the capabilities of the traditional
Chat Completions API and the Assistants API. It supports:

- **Traditional Chat Completions:** Facilitates seamless conversational AI experiences.
- **Web Search:** Enables real-time information retrieval from the internet.
- **File Search:** Allows searching within files for relevant data.

Because the Responses API covers these capabilities, OpenAI plans to retire the Assistants API in 2026.

> **For new users, OpenAI recommends using the Responses API instead of the Chat Completions API to leverage its expanded capabilities.**

For a comprehensive comparison between the Responses API and the Chat Completions API, refer to the official OpenAI documentation: 
[Responses vs. Chat Completions](https://platform.openai.com/docs/guides/responses-vs-chat-completions).

## Summary of This Notebook
This notebook provides a hands-on guide for using the **OpenAI Responses API** to analyze tweets. 
It covers essential techniques such as:

- **Connecting to a MongoDB database** to store and retrieve tweets.
- **Extracting tweets** and converting them into a structured format for further analysis.
- **Creating a vector store** and uploading tweets for semantic search.
- **Using file search** to analyze private datasets.
- **Performing web search** to retrieve the latest public information.
- **Utilizing stateful responses** to maintain conversation context.
- **Combining file and web search** to enhance retrieval-augmented generation (RAG) applications.

By the end of this notebook, users will be able to integrate OpenAI's Responses API for efficient data retrieval 
and analysis of structured and unstructured data.

## Install Required Libraries
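To use the OpenAI Responses API and interact with a MongoDB database, we need to install the following libraries:

- **`openai`**: Provides access to OpenAI's APIs, including the Responses API.
- **`pymongo`**: A Python driver for MongoDB to store and retrieve tweets.

The notebook also imports `boto3` to read credentials from AWS Secrets Manager; install it as well if your environment does not already provide it.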

In [1]:
pip install openai pymongo -q

Note: you may need to restart the kernel to use updated packages.


## Import Required Libraries

In [2]:
from IPython.display import Markdown, display
import boto3
from botocore.exceptions import ClientError
import json
import io

## Retrieve Secrets from AWS Secrets Manager

In [3]:
def get_secret(secret_name, region_name="us-east-1"):
    """Fetch a secret from AWS Secrets Manager and parse its JSON payload."""
    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name
    )

    try:
        get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    except ClientError:
        # Re-raise AWS errors (missing secret, denied access) unchanged
        raise

    # The secret is stored as a JSON string, e.g. {"api_key": "..."}
    secret = get_secret_value_response['SecretString']

    return json.loads(secret)

## Connect to MongoDB

In [4]:
import pymongo
from pymongo import MongoClient

mongodb_connect = get_secret('mongodb')['connection_string']

mongo_client = MongoClient(mongodb_connect)
db = mongo_client.demo  # use or create a database named demo
tweet_collection = db.tweet_collection  # use or create a collection named tweet_collection
# Optional: enforce unique tweets with an index on tweet.id
# tweet_collection.create_index([("tweet.id", pymongo.ASCENDING)], unique=True)

## Extract Tweets from MongoDB

In [5]:
# An empty filter matches every document in the collection
query_filter = {}

# Return only the tweet text and omit MongoDB's _id field
projection = {
    'tweet.text': 1,
    '_id': 0
}

result = tweet_collection.find(
    filter=query_filter,
    projection=projection
)

After retrieving tweets from MongoDB, we convert the query result into a list format for easier processing.
The data is then serialized into a JSON-formatted string, ensuring it can be properly stored and shared across different services.
Using `io.BytesIO`, we create an in-memory JSON file, eliminating the need for disk writes.
This approach is particularly useful for applications that require temporary file storage, such as uploading datasets
to OpenAI's file search API or cloud storage for further analysis.

In [6]:
result_list = list(result)

# Convert result list to JSON string
json_data = json.dumps(result_list, default=str, indent=4)

# Create an in-memory JSON file
json_bytes = io.BytesIO(json_data.encode("utf-8"))
# Give the buffer a filename so the upload carries a .json extension
json_bytes.name = "tweet.json"

In [7]:
print('Number of tweets: ',len(result_list))

Number of tweets:  93
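
To see the in-memory file technique in isolation, the sketch below serializes a few documents and reads them back from the buffer. The `sample_docs` list and the `to_json_file` helper are invented for illustration; they only mirror the `tweet.text` projection used above and are not part of the original notebook.

```python
import io
import json

# Hypothetical documents shaped like the projected MongoDB results
sample_docs = [
    {"tweet": {"text": "GenAI is moving fast"}},
    {"tweet": {"text": "New multimodal model released"}},
]

def to_json_file(docs, name="tweet.json"):
    """Serialize docs into an in-memory, named JSON file object."""
    payload = json.dumps(docs, default=str, indent=4)
    buffer = io.BytesIO(payload.encode("utf-8"))
    buffer.name = name  # file-like uploads usually take their filename from .name
    return buffer

json_file = to_json_file(sample_docs)

# Reading the buffer back returns the original documents
round_trip = json.loads(json_file.read().decode("utf-8"))
json_file.seek(0)  # rewind so a later upload starts from the first byte
```

Rewinding with `seek(0)` matters whenever the buffer has been read before it is handed to an upload call; otherwise the upload may start at the end of the stream and send no data.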


## Initialize OpenAI Client

In [8]:
from openai import OpenAI
openai_api_key  = get_secret('openai')['api_key']

client = OpenAI(api_key=openai_api_key)

## File Search API

### Introduction to File Search
The file search tool enables efficient retrieval of relevant information
from uploaded files by leveraging vector-based indexing. This feature is particularly useful
for searching large datasets, extracting insights, and improving retrieval-augmented generation (RAG) applications.

Rather than relying on keyword matching alone, file search uses embeddings
to identify semantically relevant content, making it well suited to analyzing structured
and unstructured text data (OpenAI, 2025).

For more details, visit the official OpenAI documentation: 
[File Search in Responses API](https://platform.openai.com/docs/guides/tools-file-search).
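
As a rough intuition for embedding-based retrieval, the sketch below ranks a few texts by cosine similarity to a query vector. The three-dimensional vectors are made up for illustration, and this is not a description of how OpenAI implements file search internally; real embeddings come from a model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (invented values, one vector per text)
documents = {
    "new multimodal LLM released": [0.9, 0.1, 0.0],
    "bitcoin price discussion": [0.1, 0.9, 0.1],
    "image generation model update": [0.8, 0.0, 0.3],
}
query_vector = [1.0, 0.0, 0.1]  # stands in for "the latest development in genai"

# Rank texts from most to least similar to the query
ranked = sorted(
    documents,
    key=lambda text: cosine_similarity(query_vector, documents[text]),
    reverse=True,
)
```

Neither of the two top-ranked texts shares a keyword with the query; they rank highly only because their vectors point in a similar direction, which is the property semantic search relies on.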

### Create a Vector Store

In [9]:
vector_store = client.vector_stores.create(
    name="tweet_base"
)
vector_store_id = vector_store.id
# print(vector_store_id)

### Upload Tweets File

In [10]:
file = client.files.create(
    file=json_bytes,
    purpose="assistants",
)

file_id = file.id
# print(file_id)

### Attach File to Vector Store

In [11]:
attach_status = client.vector_stores.files.create(
    vector_store_id=vector_store_id,
    file_id=file_id
)

# print(attach_status.id)
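
Attaching a file starts an indexing job, so a search issued immediately afterwards may run before the file is ready. The sketch below is one possible way to wait for it: a generic polling helper (`wait_until_indexed` is a hypothetical name, not an SDK function) that takes any callable returning a status string. It assumes the vector store file reports `in_progress` while indexing and a terminal value such as `completed` or `failed` afterwards.

```python
import time

def wait_until_indexed(get_status, timeout=60.0, interval=1.0):
    """Poll get_status() until it returns a terminal status or time runs out.

    get_status is any zero-argument callable returning a status string.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status != "in_progress":
            return status  # e.g. "completed", "failed", or "cancelled"
        if time.monotonic() >= deadline:
            raise TimeoutError("file was still being indexed at the deadline")
        time.sleep(interval)

# Offline demonstration with a scripted sequence of statuses
statuses = iter(["in_progress", "in_progress", "completed"])
final_status = wait_until_indexed(lambda: next(statuses), timeout=5, interval=0)
```

With the objects defined in this notebook, `get_status` could be `lambda: client.vector_stores.files.retrieve(vector_store_id=vector_store_id, file_id=file_id).status`; check that call against the SDK version you have installed.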

### Query the Vector Store

In [12]:
query = "the latest development in genai"

In [13]:
search_results = client.vector_stores.search(
    vector_store_id=vector_store_id,
    query=query
)

for result in search_results.data[:5]:
    print(result.content[0].text[:100] + '\n Relevant score: ' + str(result.score))

}
    },
    {
        "tweet": {
            "text": "@nic__carter @d_fins I think you misinterpret
 Relevant score: 0.5027333234019439
co/Sn9VghMMOp #GenerativeAI #GenAI #AI"
        }
    },
    {
        "tweet": {
            "text"
 Relevant score: 0.44602316930394487
}
    },
    {
        "tweet": {
            "text": "@GainsAssociates @Brcevik @Gamenessapp How wi
 Relevant score: 0.44541319204336244
Here\u2019s the difference:\n\ud83d\udd39 Generative LLM \u2013 Focused on generating text. (e.g. GP
 Relevant score: 0.43621670349887304
{
        "tweet": {
            "text": "@0wlboom @LeQuacked @Rythayze thats not the point. like AT
 Relevant score: 0.4241516494826711
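
The snippets above are hard to read because each chunk is a raw slice of the indented JSON file. A minimal sketch of a tidier printout follows, assuming only the attributes the loop above already reads (`score` and `content[0].text`). The `summarize_results` helper and the stand-in result objects are hypothetical.

```python
from types import SimpleNamespace

def summarize_results(results, min_score=0.0, width=100):
    """Return (score, snippet) pairs, best match first, at or above min_score."""
    rows = []
    for item in results:
        if item.score < min_score:
            continue
        # Collapse JSON indentation and newlines into single spaces
        snippet = " ".join(item.content[0].text.split())[:width]
        rows.append((round(item.score, 3), snippet))
    return sorted(rows, reverse=True)

# Stand-in objects with invented values, shaped like the search results
fake_results = [
    SimpleNamespace(score=0.41, content=[SimpleNamespace(text='{\n  "text": "off topic"\n}')]),
    SimpleNamespace(score=0.50, content=[SimpleNamespace(text='{\n  "text": "#GenAI news"\n}')]),
]
rows = summarize_results(fake_results, min_score=0.45)
```

In the notebook, `summarize_results(search_results.data, min_score=0.45)` would apply the same filtering to the real results; the threshold is arbitrary and should be tuned against your own data.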


## OpenAI Responses API

### Simple Response

In [14]:
simple_response = client.responses.create(
  model="gpt-4o",
  input=[
      {
          "role": "user",
          "content": query
      }
  ]
)

In [15]:
display(Markdown(simple_response.output_text))

As of the latest updates, generative AI continues to advance rapidly across several areas:

1. **Text and Language Models**: Models like GPT-3.5 and successors have seen improvements in language understanding and generation. These models are being used for a variety of applications, including content creation, customer service, and advanced data analysis.

2. **Image and Video Generation**: Tools such as DALL-E and Midjourney have made strides in generating high-quality images and videos from text descriptions. This is revolutionizing fields like design, media, and entertainment.

3. **Multimodal Models**: Models capable of handling and integrating multiple types of data (text, images, audio) are becoming more proficient, offering more versatile AI solutions that can understand and generate content across different media types.

4. **Ethics and Safety Enhancements**: With the expanding capabilities of generative AI, there is increased focus on ethical use, bias reduction, and enhancing safety measures to ensure responsible deployment.

5. **Real-time Applications**: Generative AI is being incorporated into real-time applications, such as dynamic content generation in video games and interactive media, offering more personalized user experiences.

6. **Industry Integration**: Various industries, including healthcare, finance, and marketing, are leveraging generative AI for innovative solutions, such as personalized medicine, financial modeling, and targeted advertising.

Overall, generative AI is becoming more integrated into everyday technologies, driving innovation while raising important discussions about regulation and social impact.

### File Search Response

In [16]:
file_search_response = client.responses.create(
    input=query,
    model="gpt-4o",
    temperature=0,
    tools=[{
        "type": "file_search",
        "vector_store_ids": [vector_store_id],
    }]
)

In [17]:
display(Markdown(file_search_response.output_text))


Recent developments in generative AI include advancements in multimodal models, which can understand and generate text, images, and more. Examples include GPT-4 and Gemini. Additionally, there is a growing focus on integrating generative AI into various industries, such as healthcare and marketing, to drive innovation and efficiency.

There is also a shift towards cognitive AI, which emphasizes collaboration and strategic thinking beyond simple prompting. Furthermore, the use of generative AI in creative fields, like art and music, continues to spark debate regarding originality and ethical practices.

## Web Search API

### Introduction to Web Search
The OpenAI Web Search tool allows models to retrieve real-time information from the internet. 
This capability is particularly useful for obtaining up-to-date data, fact-checking, and expanding knowledge 
without relying solely on pre-trained information. 

By leveraging OpenAI's web search functionality, the Responses API can fetch external data 
and provide accurate, relevant results in real time (OpenAI, 2025). 
This feature enhances applications that require the latest insights, such as news aggregation, research, 
or dynamic content generation.

For more details, visit the official OpenAI documentation: 
[Web Search in Responses API](https://platform.openai.com/docs/guides/tools-web-search).

### Perform Web Search

In [18]:
web_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input=query,
    tools=[
        {
            "type": "web_search"
        }
    ]
)

In [19]:
display(Markdown(web_search_response.output_text))

Generative AI (GenAI) has experienced significant advancements across various sectors, leading to innovative applications and strategic developments. Here are some of the latest developments:

**Business Applications and Collaborations**

- **Google's AI Integration**: At its Cloud Next conference, Google showcased practical business applications of its AI technology. Collaborations include Mattel's use of Google's BigQuery AI tool to analyze feedback on the Barbie Dreamhouse, demonstrating AI's role in enhancing decision-making processes. ([axios.com](https://www.axios.com/2025/04/09/google-ai-mattel-barbie?utm_source=openai))

- **Amazon's Alexa Transformation**: Amazon is revamping its Alexa digital assistant by integrating generative AI, aiming to evolve it into a personalized concierge capable of performing a wide range of tasks. This transformation focuses on enhancing user experience and functionality. ([ft.com](https://www.ft.com/content/de4c86b8-c744-4051-9255-d34259223160?utm_source=openai))

**Technological Innovations**

- **Nvidia's AI Hardware and Models**: At the GTC 2025 event, Nvidia introduced several innovations, including the Blackwell Ultra AI chips and the Groot N1 AI model for robotics. These developments emphasize Nvidia's commitment to advancing AI hardware and software capabilities. ([tomsguide.com](https://www.tomsguide.com/computing/live/nvidia-gtc-2025-live?utm_source=openai))

- **Tencent's 3D Generation Tools**: Tencent released open-source AI tools capable of converting text and images into 3D visuals. The Hunyuan3D-2.0 technology enables the generation of high-quality 3D visuals in just 30 seconds, catering to designers and game developers. ([reuters.com](https://www.reuters.com/technology/artificial-intelligence/tencent-expands-ai-push-with-open-source-3d-generation-tools-2025-03-18/?utm_source=openai))

**Advancements in AI Models**

- **Google DeepMind's Gemini**: In December 2023, Google DeepMind launched Gemini, a multimodal large language model succeeding LaMDA and PaLM 2. Gemini 2.5, released in March 2025, introduced enhanced reasoning capabilities, marking a significant step toward more sophisticated AI interactions. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Google_DeepMind?utm_source=openai))

**Regulatory Developments**

- **Global AI Regulations**: Governments worldwide are implementing regulations to manage AI development. In the U.S., companies like OpenAI and Google agreed to watermark AI-generated content. The European Union's proposed Artificial Intelligence Act includes requirements for disclosing copyrighted material used in training AI systems. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Generative_artificial_intelligence?utm_source=openai))

**Research and Ethical Considerations**

- **Advancements in Generative AI Research**: Recent research focuses on techniques such as Diffusion Models and Denoising Score Matching, aiming to improve the quality and efficiency of AI-generated content. Ethical considerations, including bias mitigation and data privacy, remain central to ongoing discussions. ([indiaai.gov.in](https://indiaai.gov.in/article/cutting-edge-developments-in-generative-ai-new-frontiers-amp-practical-implications?utm_source=openai))


## Recent Developments in Generative AI:
- [Google touts AI's real-world business model](https://www.axios.com/2025/04/09/google-ai-mattel-barbie?utm_source=openai)
- [Nvidia GTC 2025 - Blackwell Ultra, Groot N1, self-driving cars and more from Jensen Huang's keynote](https://www.tomsguide.com/computing/live/nvidia-gtc-2025-live?utm_source=openai)
- [Tencent expands AI push with open-source 3D generation tools](https://www.reuters.com/technology/artificial-intelligence/tencent-expands-ai-push-with-open-source-3d-generation-tools-2025-03-18/?utm_source=openai) 
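
The links in the answer above can also be collected programmatically. The sketch below assumes the response shape documented by OpenAI at the time of writing: `output` items of type `message` whose content parts carry `annotations` of type `url_citation` with `title` and `url` fields. Verify these attribute names against your SDK version. The helper names and the stand-in response are hypothetical.

```python
from types import SimpleNamespace
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def strip_tracking(url):
    """Drop the utm_source query parameter and keep the rest of the URL."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "utm_source"]
    return urlunsplit(parts._replace(query=urlencode(query)))

def collect_citations(response):
    """Return unique (title, url) pairs from url_citation annotations."""
    seen, citations = set(), []
    for item in response.output:
        if item.type != "message":
            continue  # skip tool-call items such as the search call itself
        for part in item.content:
            for note in getattr(part, "annotations", []):
                if note.type == "url_citation":
                    url = strip_tracking(note.url)
                    if url not in seen:
                        seen.add(url)
                        citations.append((note.title, url))
    return citations

# Stand-in response with invented values, used to exercise the helper offline
demo_response = SimpleNamespace(output=[
    SimpleNamespace(type="web_search_call"),
    SimpleNamespace(type="message", content=[SimpleNamespace(
        type="output_text",
        text="...",
        annotations=[
            SimpleNamespace(type="url_citation", title="Example A",
                            url="https://example.com/a?utm_source=openai"),
            SimpleNamespace(type="url_citation", title="Example A",
                            url="https://example.com/a?utm_source=openai"),
        ],
    )]),
])
citations = collect_citations(demo_response)
```

Calling `collect_citations(web_search_response)` in the notebook would return the deduplicated sources behind the answer, which is handy for building a reference list.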

### Stateful Response

The OpenAI Responses API is stateful, which enables continuity across interactions.
A stored response can be fetched again by its ID, and passing that ID as `previous_response_id`
in a new request continues the conversation, allowing users to refine or expand upon previous searches.
This is particularly useful for iterative research, dynamic content generation, and applications
that require follow-up queries based on prior responses.
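
The chaining pattern can be sketched as a small loop that passes each response's ID as `previous_response_id` on the next request, the same parameter the follow-up cell in this notebook uses. `ask_in_sequence` and the `FakeResponses` stand-in client are hypothetical; the stand-in exists only so the call pattern can be shown without network access.

```python
from types import SimpleNamespace

def ask_in_sequence(client, model, questions, **kwargs):
    """Send questions one by one, linking each to the previous response."""
    previous_id = None
    responses = []
    for question in questions:
        params = dict(model=model, input=question, **kwargs)
        if previous_id is not None:
            params["previous_response_id"] = previous_id
        response = client.responses.create(**params)
        previous_id = response.id
        responses.append(response)
    return responses

# Offline stand-in for the OpenAI client, recording the parameters it receives
class FakeResponses:
    def __init__(self):
        self.calls = []

    def create(self, **params):
        self.calls.append(params)
        return SimpleNamespace(id=f"resp_{len(self.calls)}", output_text="...")

fake_client = SimpleNamespace(responses=FakeResponses())
ask_in_sequence(fake_client, "gpt-4o", ["first question", "follow-up"])
linked_ids = [call.get("previous_response_id") for call in fake_client.responses.calls]
```

With the real `client` from this notebook, extra keyword arguments such as `tools=[{"type": "web_search"}]` would be forwarded unchanged to every request in the chain.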

In [20]:
fetched_response = client.responses.retrieve(response_id=web_search_response.id)
display(Markdown(fetched_response.output_text[:100]))

Generative AI (GenAI) has experienced significant advancements across various sectors, leading to in

### Continue Query with Web Search

In [21]:
continue_query = 'find different news'

continue_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input=continue_query,
    previous_response_id=web_search_response.id,
    tools=[
        {
            "type": "web_search"
        }
    ]
)

In [22]:
display(Markdown(continue_search_response.output_text))

Certainly, here are some recent developments in the field of Generative AI:

**Advancements in AI Models**

- **Anthropic's Claude 3**: In March 2024, Anthropic introduced Claude 3, a family of multimodal generative AI models capable of processing both text and images. The suite includes three models—Haiku, Sonnet, and Opus—each varying in size and efficiency. ([analyticsvidhya.com](https://www.analyticsvidhya.com/blog/2024/12/generative-ai-developments/?utm_source=openai))

- **Meta's LLaMA 3**: In April 2024, Meta released LLaMA 3, its third-generation open-source large language model, available in 8B and 70B parameter sizes. Trained on approximately 15 trillion tokens from publicly available sources, LLaMA 3 demonstrated superior performance in coding, reasoning, and multilingual tasks. ([analyticsvidhya.com](https://www.analyticsvidhya.com/blog/2024/12/generative-ai-developments/?utm_source=openai))

**Integration of AI in Consumer Technology**

- **Apple's Apple Intelligence**: In June 2024, Apple announced the launch of Apple Intelligence as part of the iOS 18.1 update, bringing AI-powered features to iPhones. This includes ChatGPT integration in Siri, visual intelligence, and generative AI-powered photo editing features. ([analyticsvidhya.com](https://www.analyticsvidhya.com/blog/2024/12/generative-ai-developments/?utm_source=openai))

**AI in Healthcare**

- **Generative AI in Medical Imaging**: Generative AI platforms have significantly enhanced medical imaging analysis, particularly in radiology. AI algorithms can analyze MRI scans, X-rays, and CT scans with unprecedented accuracy, detecting subtle anomalies and assisting radiologists in making faster and more accurate diagnoses. ([medium.com](https://medium.com/%40AIreporter/recent-developments-in-generative-ai-platforms-for-healthcare-advancements-shaping-the-future-40e077ec2ed1?utm_source=openai))

**Industry Collaborations and Tools**

- **Google's Code Assist**: Google introduced Code Assist to enhance coding efficiency, leveraging the Gemini 1.5 Pro model. With a record 1 million-token context window, Code Assist provides unprecedented AI-powered code comprehension and transformation capabilities, setting a new industry standard. ([generativeaiassociation.org](https://generativeaiassociation.org/articles/game-changing-advances-in-generative-ai-a-roundup-of-key-innovations/?utm_source=openai))


## Recent Developments in Generative AI:
- [In 2024, artificial intelligence was all about putting AI tools to work](https://apnews.com/article/0b6ab89193265c3f60f382bae9bbabc9?utm_source=openai)
- [Nvidia GTC 2025 - Blackwell Ultra, Groot N1, self-driving cars and more from Jensen Huang's keynote](https://www.tomsguide.com/computing/live/nvidia-gtc-2025-live?utm_source=openai)
- [Tencent expands AI push with open-source 3D generation tools](https://www.reuters.com/technology/artificial-intelligence/tencent-expands-ai-push-with-open-source-3d-generation-tools-2025-03-18/?utm_source=openai) 

### Combining File Search and Web Search

This example uses file search to analyze private data and web search to retrieve public, up-to-date information in a single request.
The Responses API lets developers pass both tools together to enhance retrieval-augmented generation (RAG) applications.
By combining the two, users can draw on internal knowledge while also retrieving real-time
information from external sources, producing comprehensive and current responses.

In [23]:
combined_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input=query,
    temperature=0,
    instructions="Retrieve the results from the file search first, and use the web search tool to expand the results with news resources",
    tools=[
        {
            "type": "file_search",
            "vector_store_ids": [vector_store_id],
        },
        {
            "type": "web_search"
        }
    ]
)

In [24]:
display(Markdown(combined_search_response.output_text))

Recent developments in generative AI include several key advancements and initiatives:

1. **Enterprise Platforms**: Quantiphi's generative AI platform, Baioniq, has been recognized for its excellence, indicating a focus on enterprise-ready solutions.

2. **Collaborations**: Clario is working with AWS to enhance clinical data analysis using generative AI, showcasing the integration of AI in healthcare.

3. **Competitions and Events**: Dubai AI Week 2025 is set to host a major generative AI championship, highlighting the growing interest and investment in AI competitions.

4. **Technological Advancements**: AMD has introduced Amuse 3.0, a generative AI model developed in collaboration with StabilityAI, emphasizing improvements in speed and performance.

5. **Regulatory Discussions**: There is an ongoing call for the development of local laws to govern generative AI, reflecting the need for regulatory frameworks.

6. **Industry Applications**: Generative AI is being used to drive sales and growth in healthcare and pharma, as demonstrated by MASORI Therapeutics.

These developments illustrate the diverse applications and rapid evolution of generative AI across different sectors.
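
The rendered text alone does not show which tools the model actually called. One way to check, sketched under the assumption that each item in `response.output` exposes a `type` such as `file_search_call`, `web_search_call`, or `message`, is to count the items by type. The helper name and the stand-in response below are hypothetical.

```python
from collections import Counter
from types import SimpleNamespace

def count_output_items(response):
    """Count the items in response.output by their type field."""
    return Counter(item.type for item in response.output)

# Stand-in response with invented items, shaped like a combined-tool result
demo_response = SimpleNamespace(output=[
    SimpleNamespace(type="file_search_call"),
    SimpleNamespace(type="web_search_call"),
    SimpleNamespace(type="message"),
])
counts = count_output_items(demo_response)
used_both = counts["file_search_call"] > 0 and counts["web_search_call"] > 0
```

Running `count_output_items(combined_search_response)` in the notebook would reveal whether the instructions led the model to use both tools or only one of them.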