# OpenAI Responses API: Advanced Tweet Analysis with File & Web Search Integration

## What is the OpenAI Responses API?

The Responses API, released in March 2025, combines the capabilities of the traditional 
Chat Completions API and the Assistants API in a single interface. It supports:

- **Traditional Chat Completions:** Facilitates seamless conversational AI experiences.
- **Web Search:** Enables real-time information retrieval from the internet.
- **File Search:** Allows searching within files for relevant data.

Accordingly, the Assistants API will be retired in 2026. 

> **For new users, OpenAI recommends using the Responses API instead of the Chat Completions API to leverage its expanded capabilities.**

For a comprehensive comparison between the Responses API and the Chat Completions API, refer to the official OpenAI documentation: 
[Responses vs. Chat Completions](https://platform.openai.com/docs/guides/responses-vs-chat-completions).

## Summary of This Notebook
This notebook provides a hands-on guide for using the **OpenAI Responses API** to analyze tweets. 
It covers essential techniques such as:

- **Connecting to a MongoDB database** to store and retrieve tweets.
- **Extracting tweets** and converting them into a structured format for further analysis.
- **Creating a vector store** and uploading tweets for semantic search.
- **Using file search** to analyze private datasets.
- **Performing web search** to retrieve the latest public information.
- **Utilizing stateful responses** to maintain conversation context.
- **Combining file and web search** to enhance retrieval-augmented generation (RAG) applications.

By the end of this notebook, users will be able to integrate OpenAI's Responses API for efficient data retrieval 
and analysis of structured and unstructured data.

## Install Required Libraries
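To use the OpenAI Responses API and interact with a MongoDB database, we need to install the following libraries:

- **`openai`**: Provides access to OpenAI's APIs, including the Responses API.
- **`pymongo`**: A Python driver for MongoDB to store and retrieve tweets.

`boto3`, which is imported below to read credentials from AWS Secrets Manager, is not installed here and is assumed to be available in the notebook environment already.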

In [1]:
%pip install openai pymongo -q

Note: you may need to restart the kernel to use updated packages.


## Import Required Libraries

In [2]:
from IPython.display import Markdown, display
import boto3
from botocore.exceptions import ClientError
import json
import io

## Retrieve Secrets from AWS Secrets Manager

In [3]:
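def get_secret(secret_name):
    """Fetch a secret from AWS Secrets Manager and parse its JSON payload into a dict."""
    region_name = "us-east-1"

    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name
    )

    try:
        get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    except ClientError:
        # Nothing to recover from here (missing secret, denied access, ...); re-raise unchanged
        raise

    # The secret is stored as a JSON string, e.g. one with an 'api_key' field
    secret = get_secret_value_response['SecretString']

    return json.loads(secret)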

## Connect to MongoDB

In [4]:
import pymongo
from pymongo import MongoClient
mongodb_connect = get_secret('mongodb')['connection_string']

mongo_client = MongoClient(mongodb_connect)
db = mongo_client.demo  # use or create a database named demo
tweet_collection = db.tweet_collection  # use or create a collection named tweet_collection
# Optional: make sure the collected tweets are unique
# tweet_collection.create_index([("tweet.id", pymongo.ASCENDING)], unique=True)

## Extract Tweets from MongoDB

In [5]:
# An empty filter matches every document in the collection
query_filter = {}
# Return only the tweet text and drop MongoDB's _id field
projection = {
    'tweet.text': 1,
    '_id': 0
}
result = tweet_collection.find(
    filter=query_filter,
    projection=projection
)

After retrieving tweets from MongoDB, we convert the query result into a list format for easier processing.
The data is then serialized into a JSON-formatted string, ensuring it can be properly stored and shared across different services.
Using `io.BytesIO`, we create an in-memory JSON file, eliminating the need for disk writes.
This approach is particularly useful for applications that require temporary file storage, such as uploading datasets
to OpenAI's file search API or cloud storage for further analysis.

In [6]:
result_list = list(result)

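# Convert the result list to a JSON string (default=str covers values json cannot serialize natively)
json_data = json.dumps(result_list, default=str, indent=4)

# Create an in-memory JSON file; the name gives the upload a file name with a .json extension
json_bytes = io.BytesIO(json_data.encode("utf-8"))
json_bytes.name = "tweet.json"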

In [7]:
print('Number of tweets: ',len(result_list))

Number of tweets:  69
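
The pretty-printed JSON array above is convenient, but `json.dumps` escapes non-ASCII characters by default (an emoji becomes a sequence such as `\u26be`), and the indentation and braces add text that carries no meaning for search. As a possible alternative, the sketch below writes one tweet per line as plain UTF-8 text. The helper name `tweets_to_text_file`, the `tweets.txt` file name, and the sample documents are illustrative assumptions, not part of the original notebook; the documents only mirror the shape produced by the projection used above.

```python
import io


def tweets_to_text_file(docs, name="tweets.txt"):
    """Build an in-memory UTF-8 text file with one tweet per line.

    `docs` is expected to look like the MongoDB results above:
    [{"tweet": {"text": "..."}}, ...]
    """
    lines = []
    for doc in docs:
        text = doc.get("tweet", {}).get("text", "")
        # Collapse newlines and runs of whitespace so each tweet stays on one line
        text = " ".join(text.split())
        if text:
            lines.append(text)
    buffer = io.BytesIO("\n".join(lines).encode("utf-8"))
    buffer.name = name  # gives the upload a file name and extension
    return buffer


# Made-up sample documents, for illustration only
sample_docs = [
    {"tweet": {"text": "Series win at home \u26be\nGo Dukes!"}},
    {"tweet": {"text": "JMU lacrosse moves up in the poll"}},
    {"tweet": {}},  # documents without text are skipped
]
text_file = tweets_to_text_file(sample_docs)
print(text_file.getvalue().decode("utf-8"))
```

With the real data, `tweets_to_text_file(result_list)` could be uploaded in place of `json_bytes`; whether this improves retrieval for a given dataset is something to verify rather than assume.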


## Initialize OpenAI Client

In [8]:
from openai import OpenAI
openai_api_key  = get_secret('openai')['api_key']

client = OpenAI(api_key=openai_api_key)

## File Search API

### Introduction to File Search
The file search tool retrieves relevant passages from uploaded files. Files are split into chunks, 
embedded, and indexed in a vector store, which makes the tool useful for searching large datasets, 
extracting insights, and building retrieval-augmented generation (RAG) applications.

Rather than relying on exact keyword matches alone, file search uses embeddings 
to identify semantically relevant content, so it works on both structured 
and unstructured text data (OpenAI, 2025).

For more details, visit the official OpenAI documentation: 
[File Search in Responses API](https://platform.openai.com/docs/guides/tools-file-search).
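
To make the idea of embedding-based ranking concrete, here is a toy sketch. It is not how OpenAI computes or stores embeddings: the three-dimensional vectors are hand-written assumptions (real embeddings come from a model and have far more dimensions), and cosine similarity is used here only as a common way to compare such vectors. The point is that a query can rank a text highly without sharing any keyword with it.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made "embeddings" (assumption, for illustration only)
toy_embeddings = {
    "Dukes win the softball series": [0.9, 0.1, 0.0],
    "Lacrosse team climbs the poll": [0.8, 0.3, 0.1],
    "Campus dining hall opens late": [0.1, 0.2, 0.9],
}
query_embedding = [1.0, 0.2, 0.0]  # stands in for the query "JMU Sports"

# Rank texts from most to least similar to the query
ranked = sorted(
    toy_embeddings,
    key=lambda text: cosine_similarity(query_embedding, toy_embeddings[text]),
    reverse=True,
)
for text in ranked:
    score = cosine_similarity(query_embedding, toy_embeddings[text])
    print(f"{score:.3f}  {text}")
```

None of the texts contains the words "JMU" or "Sports", yet the two sports-related ones rank first because their vectors point in nearly the same direction as the query vector.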

### Create a Vector Store

In [9]:
vector_store = client.vector_stores.create(
    name="tweet_base"
)
vector_store_id = vector_store.id
# print(vector_store_id)

### Upload Tweets File

In [10]:
file = client.files.create(
    file=json_bytes,
    purpose="assistants",
)

file_id = file.id
# print(file_id)

### Attach File to Vector Store

In [11]:
attach_status = client.vector_stores.files.create(
    vector_store_id=vector_store_id,
    file_id=file_id
)

# print(attach_status.id)
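
Attaching a file starts indexing in the background, so a search issued immediately afterwards may run before the file is ready. A minimal sketch of a polling helper follows. `wait_until_indexed` is a hypothetical name, and the sketch assumes that `client.vector_stores.files.retrieve` returns an object whose `status` field is one of `in_progress`, `completed`, `failed`, or `cancelled`, as in the OpenAI Python SDK at the time of writing. The stand-in client at the bottom only simulates those status values so that the example runs without network access.

```python
import time
from types import SimpleNamespace


def wait_until_indexed(client, vector_store_id, file_id, timeout=60.0, interval=1.0):
    """Poll a vector store file until indexing finishes; return the final status."""
    deadline = time.monotonic() + timeout
    while True:
        vs_file = client.vector_stores.files.retrieve(
            vector_store_id=vector_store_id,
            file_id=file_id,
        )
        if vs_file.status != "in_progress":
            return vs_file.status  # "completed", "failed", or "cancelled"
        if time.monotonic() >= deadline:
            raise TimeoutError(f"File {file_id} still indexing after {timeout} seconds")
        time.sleep(interval)


# Stand-in client (assumption, for illustration only): reports "in_progress"
# twice and then "completed", mimicking the shape of the real client.
_statuses = iter(["in_progress", "in_progress", "completed"])
fake_client = SimpleNamespace(
    vector_stores=SimpleNamespace(
        files=SimpleNamespace(
            retrieve=lambda vector_store_id, file_id: SimpleNamespace(status=next(_statuses))
        )
    )
)

final_status = wait_until_indexed(fake_client, "vs_demo", "file_demo", interval=0.01)
print(final_status)
```

With the real client from this notebook, the call would be `wait_until_indexed(client, vector_store_id, file_id)`, and a status other than `completed` is a signal to inspect the file before querying.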

### Query the Vector Store

In [12]:
query = "JMU Sports"

In [13]:
search_results = client.vector_stores.search(
    vector_store_id=vector_store_id,
    query=query
)

for result in search_results.data[:5]:
    print(result.content[0].text[:100] + '\n Relevant score: ' + str(result.score))

\n\n\u26be\ufe0f Series win at home over ODU \n\ud83e\udd4e Series sweep at JMU \n\ud83c\udfbe Ran w
 Relevant score: 0.5977979975579375
}
    },
    {
        "tweet": {
            "text": "Paul Lewis, brother of 2021 CAA Player of the
 Relevant score: 0.5858547976437501
{
        "tweet": {
            "text": "RT @HNlEHupY4Nr6hRM: JMU\u78ef\u5b50\u306b\u3066\uff92\uff
 Relevant score: 0.5654995214103978
\u0628\u0624\u062f\u0649\u2390\n\u22b5A4TU\u22b4\nJmU"
        }
    },
    {
        "tweet": {
   
 Relevant score: 0.5616241447776544
Here's what HC Matt Deggs had to say in part of his opening statement, recapping last week's games\n
 Relevant score: 0.5472576232019247


## OpenAI Responses API

### Simple Response

In [14]:
simple_response = client.responses.create(
  model="gpt-4o",
  input=[
      {
          "role": "user",
          "content": query
      }
  ]
)

In [15]:
display(Markdown(simple_response.output_text))

James Madison University (JMU) offers a diverse range of sports programs at both the varsity and club levels. The Dukes, as their teams are known, compete in several sports, enjoying a strong athletic tradition.

### Varsity Sports
- **Football**: JMU has a successful football program, known for its strong performances in the NCAA.
- **Basketball**: Both men's and women's teams compete at a high level.
- **Soccer**: JMU fields competitive men's and women's soccer teams.
- **Field Hockey, Lacrosse, Softball, Baseball**: These teams are also key components of JMU's athletic offerings.
- **Track and Field, Cross Country, Swimming and Diving**: These sports maintain strong participation and competitive programs.

### Club and Intramural Sports
- JMU boasts numerous club sports ranging from rugby and ultimate frisbee to equestrian.
- Intramural sports provide opportunities for all students to participate in athletics at various levels of competition.

### Facilities
- **Bridgeforth Stadium**: The home of JMU football.
- **Convocation Center/Atlantic Union Bank Center**: Hosts basketball games and other events.
- **Veterans Memorial Park**: Baseball and softball complex.

### Achievements
JMU's sports teams have garnered championships and high rankings, particularly in football and other successful programs like women's lacrosse.

For the most up-to-date information, including schedules and current standings, visiting the official JMU Athletics website is recommended.

### File Search Response

In [16]:

file_search_response = client.responses.create(
    input=query,
    model="gpt-4o",
    temperature=0,
    tools=[{
        "type": "file_search",
        "vector_store_ids": [vector_store_id],
    }]
)

In [17]:
display(Markdown(file_search_response.output_text))


Here are some recent updates about JMU sports:

1. JMU had a series sweep against ODU, extending their win streak to five games.
2. The NACE playoffs featured a match between WVU and JMU in Rocket League.
3. Paul Lewis, brother of JMU legend Matt Lewis, has entered the transfer portal.
4. JMU's lacrosse team is ranked 13th in the latest IWLCA poll.

## Web Search API

### Introduction to Web Search
The OpenAI web search tool lets a model retrieve current information from the internet while generating a response. 
This is useful for obtaining up-to-date data, fact-checking, and answering questions about events 
that postdate the model's training data. 

When the tool is enabled in a Responses API call, the model can decide to run a search 
and ground its answer in the pages it finds (OpenAI, 2025). 
This helps applications that depend on recent information, such as news aggregation, research, 
or dynamic content generation.

For more details, visit the official OpenAI documentation: 
[Web Search in Responses API](https://platform.openai.com/docs/guides/tools-web-search).

### Perform Web Search

In [18]:
web_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input=query,
    tools=[
        {
            "type": "web_search"
        }
    ]
)

In [19]:
display(Markdown(web_search_response.output_text))

James Madison University (JMU) has a robust sports program, known as the JMU Dukes. The university competes in the NCAA Division I and is a member of the Sun Belt Conference as of 2022. Here are some of the key sports offered at JMU:

1. **Football**: The JMU Dukes football team is highly competitive and has won national championships in the past.
2. **Basketball**: Both men's and women's basketball teams are popular and have seen success in their respective leagues.
3. **Soccer**: JMU offers both men's and women's soccer teams that have competitive seasons.
4. **Baseball and Softball**: These teams have competitive programs with strong performances in their conferences.
5. **Other Sports**: Includes lacrosse, tennis, golf, field hockey, and track and field.

If you need detailed information about specific teams or seasons, feel free to ask!

### Stateful Response

The Responses API can keep conversation state on the server. 
Each response has an `id`: passing it to `client.responses.retrieve` fetches the stored response again, 
and passing it as `previous_response_id` in a new request continues the conversation from that point, 
so follow-up queries can refine or expand on earlier answers without resending the history. This is useful for iterative research, 
dynamic content generation, and other applications that rely on follow-up queries.

In [20]:
fetched_response = client.responses.retrieve(response_id=web_search_response.id)
display(Markdown(fetched_response.output_text[:100]))

James Madison University (JMU) has a robust sports program, known as the JMU Dukes. The university c

### Continue Query with Web Search

In [21]:
continue_query = 'find different news'

continue_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input=continue_query,
    previous_response_id=web_search_response.id,
    tools=[
        {
            "type": "web_search"
        }
    ]
)

In [22]:
display(Markdown(continue_search_response.output_text))

James Madison University's (JMU) athletic programs have been experiencing significant developments across various sports. Here's an overview of recent news:

**Football**

- **Season Performance**: The JMU Dukes football team has started the 2024 season strongly, securing victories against Charlotte (30-7) and Gardner-Webb (13-6). ([espn.com](https://www.espn.com/college-football/team/_/id/256/james-madison-dukes/?utm_source=openai))

- **Upcoming Game**: The Dukes are set to face the University of North Carolina on September 21, 2024, aiming to continue their winning streak. ([espn.com](https://www.espn.com/college-football/team/_/id/256/james-madison-dukes/?utm_source=openai))

**Men's Basketball**

- **Roster Overhaul**: Under new head coach Preston Spradlin, the men's basketball team has undergone a significant roster transformation. The team has added 10 transfers and one high school recruit, including notable players like Washington transfer Noah Williams and Syracuse transfer Justin Taylor. This revamped roster aims to build on the previous season's success, where the Dukes won 32 games and reached the second round of the NCAA Tournament. ([dnronline.com](https://www.dnronline.com/sports/level/college/james_madison_university/dukes-have-rebuilt-roster-under-spradlin/article_d340a354-a717-5f91-81fd-c1229bea3cd0.html?utm_source=openai))

**Athletic Administration**

- **Potential Scholarship and Roster Changes**: JMU is preparing for potential changes in scholarship allocations and roster sizes across its 18 varsity sports. These adjustments may lead to a reduction of 28 roster spots, prompting coaches to reconsider team compositions and scholarship distributions. ([dnronline.com](https://www.dnronline.com/sports/dukes-preparing-for-potential-scholarship-roster-changes/article_e35ad5be-1930-5b3e-b203-0db9a70bca18.html?utm_source=openai))

These developments reflect JMU's commitment to enhancing its athletic programs and maintaining competitive excellence across various sports. 
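
The same chaining pattern can be wrapped in a small helper when several follow-ups are needed. The sketch below is one possible shape, not part of the original notebook: `ask_in_sequence` is a hypothetical name, and the stand-in client only records the arguments it receives so that the example runs without calling the API. The only API behavior it relies on is that each response exposes an `id` and that `responses.create` accepts `previous_response_id`, both of which appear in the cells above.

```python
from types import SimpleNamespace


def ask_in_sequence(client, prompts, model="gpt-4o", **create_kwargs):
    """Send prompts one after another, linking each request to the previous response."""
    responses = []
    previous_id = None
    for prompt in prompts:
        params = {"model": model, "input": prompt, **create_kwargs}
        if previous_id is not None:
            # Continue the conversation from the previous response
            params["previous_response_id"] = previous_id
        response = client.responses.create(**params)
        responses.append(response)
        previous_id = response.id
    return responses


# Stand-in client (assumption, for illustration only): records each call
# and returns numbered response ids.
calls = []


def _fake_create(**params):
    calls.append(params)
    return SimpleNamespace(id=f"resp_{len(calls)}", output_text=f"answer to {params['input']!r}")


fake_client = SimpleNamespace(responses=SimpleNamespace(create=_fake_create))

replies = ask_in_sequence(fake_client, ["JMU Sports", "find different news"])
print([reply.id for reply in replies])
print(calls[1]["previous_response_id"])
```

With the real client, extra arguments are passed through unchanged, for example `ask_in_sequence(client, [query, continue_query], tools=[{"type": "web_search"}])`.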

### Combining File Search and Web Search

This example uses file search to analyze private data and web search to retrieve public, more recent data in a single request. 
Both tools are listed in the same `tools` array, and the `instructions` parameter tells the model to consult the file search results first and then expand them with web sources. 
Combining the two lets a retrieval-augmented generation (RAG) application draw on internal knowledge while also pulling in 
current information from external sources. 

In [24]:
combined_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input=query,
    temperature=0,
    instructions="Retrieve the results from the file search first, and use the web search tool to expand the results with news resources",
    tools=[
        {
            "type": "file_search",
            "vector_store_ids": [vector_store_id],
        },
        {
            "type": "web_search"
        }
    ]
)

In [25]:
display(Markdown(combined_search_response.output_text))

Here are some recent updates related to JMU Sports:

1. **Transfer Portal Activity**: Paul Lewis, the brother of JMU legend Matt Lewis, is now in the transfer portal.

2. **Series Sweep**: JMU recently experienced a series sweep by App State, which extended App State's win streak.

3. **Rocket League Playoffs**: JMU's Rocket League team is participating in the NACE playoffs, facing off against WVU.

4. **Football Recruitment**: There have been several visits and recruitment activities involving JMU Football, with prospects expressing excitement about their visits.

If you need more detailed or specific information, feel free to ask!
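
The combined answer above does not show which statements came from the uploaded tweets and which from the web. One way to check is to inspect the annotations attached to the response text. The sketch below assumes the response layout used by the Responses API at the time of writing: `response.output` is a list of items, items of type `message` hold content parts, and each part may carry `annotations` of type `file_citation` (with a `filename`) or `url_citation` (with a `url`). `collect_citations` is a hypothetical helper, and the stand-in response is fabricated purely to exercise it offline.

```python
from types import SimpleNamespace


def collect_citations(response):
    """Return (web_urls, file_names) cited in a Responses API result."""
    web_urls, file_names = [], []
    for item in response.output:
        if item.type != "message":
            continue  # skip tool-call items such as file_search_call / web_search_call
        for part in item.content:
            for annotation in getattr(part, "annotations", None) or []:
                if annotation.type == "url_citation":
                    web_urls.append(annotation.url)
                elif annotation.type == "file_citation":
                    file_names.append(annotation.filename)
    return web_urls, file_names


# Stand-in response (assumption, for illustration only)
fake_response = SimpleNamespace(output=[
    SimpleNamespace(type="file_search_call"),
    SimpleNamespace(type="web_search_call"),
    SimpleNamespace(type="message", content=[
        SimpleNamespace(type="output_text", text="...", annotations=[
            SimpleNamespace(type="file_citation", filename="tweet.json", file_id="file_demo"),
            SimpleNamespace(type="url_citation", url="https://example.com/news", title="Example"),
        ]),
    ]),
])

urls, files = collect_citations(fake_response)
print("Web sources:", urls)
print("File sources:", files)
```

Calling `collect_citations(combined_search_response)` on the real response would show whether the web search tool was used at all; an empty URL list suggests the answer was built from the file search results alone.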