# OpenAI Responses API: Advanced Tweet Analysis with File & Web Search Integration

## What is the OpenAI Responses API?

The Responses API, released in March 2025, combines the capabilities of the traditional 
Chat Completions API and the Assistants API, providing support for:

- **Traditional Chat Completions:** Facilitates seamless conversational AI experiences.
- **Web Search:** Enables real-time information retrieval from the internet.
- **File Search:** Allows searching within files for relevant data.

As a result, OpenAI plans to retire the Assistants API in 2026. 

> **For new users, OpenAI recommends using the Responses API instead of the Chat Completions API to leverage its expanded capabilities.**

For a comprehensive comparison between the Responses API and the Chat Completions API, refer to the official OpenAI documentation: 
[Responses vs. Chat Completions](https://platform.openai.com/docs/guides/responses-vs-chat-completions).

## Summary of This Notebook
This notebook provides a hands-on guide for using the **OpenAI Responses API** to analyze tweets. 
It covers essential techniques such as:

- **Connecting to a MongoDB database** to store and retrieve tweets.
- **Extracting tweets** and converting them into a structured format for further analysis.
- **Creating a vector store** and uploading tweets for semantic search.
- **Using file search** to analyze private datasets.
- **Performing web search** to retrieve the latest public information.
- **Utilizing stateful responses** to maintain conversation context.
- **Combining file and web search** to enhance retrieval-augmented generation (RAG) applications.

By the end of this notebook, users will be able to integrate OpenAI's Responses API for efficient data retrieval 
and analysis of structured and unstructured data.

## Install Required Libraries
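To use the OpenAI Responses API and interact with a MongoDB database, we install the following libraries:

- **`openai`**: Provides access to OpenAI's APIs, including the Responses API.
- **`pymongo`**: A Python driver for MongoDB to store and retrieve tweets.

The notebook also imports `boto3` to read credentials from AWS Secrets Manager; it is not installed here, so it is assumed to be already available in the environment.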

In [1]:
pip install openai pymongo -q

Note: you may need to restart the kernel to use updated packages.


## Import Required Libraries

In [2]:
from IPython.display import Markdown, display
import boto3
from botocore.exceptions import ClientError
import json
import io

## Retrieve Secrets from AWS Secrets Manager

In [3]:
def get_secret(secret_name):
    """Fetch a secret from AWS Secrets Manager and return it as a dict."""
    region_name = "us-east-1"

    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name
    )

    try:
        get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    except ClientError:
        # Re-raise unchanged (e.g. secret not found, access denied)
        raise

    # The secret is stored as a JSON string; parse it into a dict
    secret = get_secret_value_response['SecretString']

    return json.loads(secret)

## Connect to MongoDB

In [4]:
import pymongo
from pymongo import MongoClient

mongodb_connect = get_secret('mongodb')['connection_string']

mongo_client = MongoClient(mongodb_connect)
db = mongo_client.demo  # use or create a database named demo
tweet_collection = db.tweet_collection  # use or create a collection named tweet_collection
# Optional: make sure the collected tweets are unique
# tweet_collection.create_index([("tweet.id", pymongo.ASCENDING)], unique=True)

## Extract Tweets from MongoDB

In [5]:
# An empty filter matches every document in the collection
query_filter = {}
# Return only the tweet text and drop MongoDB's _id field
projection = {
    'tweet.text': 1,
    '_id': 0
}
result = tweet_collection.find(
    filter=query_filter,
    projection=projection
)

After retrieving tweets from MongoDB, we convert the query result into a list format for easier processing.
The data is then serialized into a JSON-formatted string, ensuring it can be properly stored and shared across different services.
Using `io.BytesIO`, we create an in-memory JSON file, eliminating the need for disk writes.
This approach is particularly useful for applications that require temporary file storage, such as uploading datasets
to OpenAI's file search API or cloud storage for further analysis.

In [6]:
result_list = list(result)

# Convert result list to JSON string
json_data = json.dumps(result_list, default=str, indent=4)

# Create an in-memory JSON file
json_bytes = io.BytesIO(json_data.encode("utf-8"))
json_bytes.name = "tweet.json" 
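
The same pattern can be wrapped in a small helper, which also makes it easy to check that the payload round-trips. This is only a minimal sketch: `records_to_json_file` and the sample records are hypothetical names introduced here for illustration and are not part of the notebook.

```python
import io
import json


def records_to_json_file(records, name="tweet.json"):
    """Serialize records to an in-memory, named JSON file object."""
    # default=str converts values json cannot encode natively (e.g. datetimes)
    payload = json.dumps(records, default=str, indent=4).encode("utf-8")
    buffer = io.BytesIO(payload)
    # A name is attached so that upload code can treat the buffer like a file
    buffer.name = name
    return buffer


# Hypothetical sample shaped like the projected MongoDB documents
sample_records = [
    {"tweet": {"text": "first tweet"}},
    {"tweet": {"text": "second tweet"}},
]
sample_file = records_to_json_file(sample_records)

# Reading consumes the stream, so rewind with seek(0) before reusing the object
round_tripped = json.loads(sample_file.read().decode("utf-8"))
sample_file.seek(0)
```

Because a `BytesIO` object keeps a read position, rewinding it is worth remembering whenever the same buffer is read more than once.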

In [7]:
print('Number of tweets: ',len(result_list))

Number of tweets:  100


## Initialize OpenAI Client

In [8]:
from openai import OpenAI
openai_api_key  = get_secret('openai')['api_key']

client = OpenAI(api_key=openai_api_key)

## File Search API

### Introduction to File Search
The file search tool enables efficient retrieval of relevant information 
from uploaded files by leveraging vector-based indexing. This feature is particularly useful 
for searching large datasets, extracting insights, and improving retrieval-augmented generation (RAG) applications.

Unlike traditional keyword-based search, file search uses embeddings 
to identify semantically relevant content, making it well suited to analyzing structured 
and unstructured text data (OpenAI, 2025).
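
To illustrate the idea of embedding-based ranking, the sketch below scores a few documents against a query by cosine similarity. The three-dimensional vectors are made up for illustration; real embeddings come from a model and have far more dimensions, and this is not a description of how OpenAI's file search is implemented internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" (assumed values, chosen by hand)
toy_embeddings = {
    "court ruling on embezzlement": [0.9, 0.1, 0.0],
    "football match result": [0.0, 0.2, 0.9],
    "judge sentences politician": [0.8, 0.3, 0.1],
}
toy_query = [1.0, 0.2, 0.0]

# Rank documents from most to least similar to the query vector
ranked = sorted(
    toy_embeddings,
    key=lambda text: cosine_similarity(toy_query, toy_embeddings[text]),
    reverse=True,
)
```

The two legal-themed texts rank above the sports one even though none of them shares a keyword with the others, which is the behavior semantic search aims for.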

For more details, visit the official OpenAI documentation: 
[File Search in Responses API](https://platform.openai.com/docs/guides/tools-file-search).

### Create a Vector Store

In [9]:
vector_store = client.vector_stores.create(
    name="tweet_base"
)
vector_store_id = vector_store.id
# print(vector_store_id)

### Upload Tweets File

In [10]:
file = client.files.create(
    file=json_bytes,
    purpose="assistants",
)

file_id = file.id
# print(file_id)

### Attach File to Vector Store

In [11]:
attach_status = client.vector_stores.files.create(
    vector_store_id=vector_store_id,
    file_id=file_id
)

# print(attach_status.id)
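
Attaching a file starts chunking and embedding in the background, so a search issued immediately afterwards may find nothing. One way to handle this is to poll the file's status until it is no longer `in_progress`. The helper below is a sketch under that assumption: `wait_until_indexed` is a hypothetical name, and the stub client exists only so the example runs without network access; with the real `client`, the same function would call `client.vector_stores.files.retrieve`.

```python
import time
from types import SimpleNamespace


def wait_until_indexed(client, vector_store_id, file_id, timeout=60.0, poll_interval=1.0):
    """Poll a vector store file until it leaves the 'in_progress' state."""
    deadline = time.monotonic() + timeout
    while True:
        vs_file = client.vector_stores.files.retrieve(
            file_id=file_id, vector_store_id=vector_store_id
        )
        if vs_file.status != "in_progress":
            # Typically 'completed', 'failed', or 'cancelled'
            return vs_file.status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"File {file_id} still in progress after {timeout}s")
        time.sleep(poll_interval)


class _StubFiles:
    """Offline stand-in that reports 'completed' on the third poll."""

    def __init__(self):
        self.calls = 0

    def retrieve(self, file_id, vector_store_id):
        self.calls += 1
        status = "completed" if self.calls >= 3 else "in_progress"
        return SimpleNamespace(status=status)


stub_files = _StubFiles()
stub_client = SimpleNamespace(vector_stores=SimpleNamespace(files=stub_files))
final_status = wait_until_indexed(stub_client, "vs_demo", "file_demo", poll_interval=0.0)
```

In the notebook this would be called as `wait_until_indexed(client, vector_store_id, file_id)` before running the query in the next section.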

### Query the Vector Store

In [27]:
query = "marine le pen prison sentence"

In [28]:
search_results = client.vector_stores.search(
    vector_store_id=vector_store_id,
    query=query
)

for result in search_results.data[:5]:
    print(result.content[0].text[:100] + '\n Relevant score: ' + str(result.score))

\n\nShe can\u2019t run for office for five years\u2026"
        }
    },
    {
        "tweet": {
  
 Relevant score: 0.9040199560269871
co/ct\u2026"
        }
    },
    {
        "tweet": {
            "text": "RT @JordanFlrtn: Il y a 
 Relevant score: 0.9008480464820222
{
        "tweet": {
            "text": "RT @ZeClint: Si marine le pen a enfreint la loi je suis po
 Relevant score: 0.8980308990166704
{
        "tweet": {
            "text": "RT @BonneDroite: Bien vu : @tegnererik avait pr\u00e9dit l
 Relevant score: 0.8957969831839437
Les juges osent tout, parce que la classe p\u2026"
        }
    },
    {
        "tweet": {
       
 Relevant score: 0.8943379769838723


## OpenAI Responses API

### Simple Response

In [29]:
simple_response = client.responses.create(
  model="gpt-4o",
  input=[
      {
          "role": "user",
          "content": query
      }
  ]
)

In [30]:
display(Markdown(simple_response.output_text))

As of my last update, Marine Le Pen, the leader of the National Rally party in France, has not received a prison sentence. However, she has faced legal challenges, including investigations related to the misuse of European Parliament funds and the distribution of violent images on social media. It's essential to check the latest news sources for the most current information, as legal situations can evolve.

### File Search Response

In [31]:

file_search_response = client.responses.create(
    model="gpt-4o",
    input=query,
    temperature=0,
    tools=[{
        "type": "file_search",
        "vector_store_ids": [vector_store_id],
    }]
)

In [32]:
display(Markdown(file_search_response.output_text))


Marine Le Pen has been sentenced to four years in prison and banned from holding public office for five years. This conviction is related to charges of embezzling European Union funds.
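
The file search tool accepts further options. The sketch below builds the request arguments with a cap on the number of retrieved chunks (`max_num_results`) and asks for the raw search results to be included in the response (`include`). Both option names reflect the file search guide as understood at the time of writing and should be checked against the current documentation; `build_file_search_request` is a hypothetical helper, and `"vs_demo"` is a placeholder ID.

```python
def build_file_search_request(query, vector_store_id, model="gpt-4o", max_num_results=5):
    """Assemble keyword arguments for a file-search-backed responses.create call."""
    return {
        "model": model,
        "input": query,
        "tools": [{
            "type": "file_search",
            "vector_store_ids": [vector_store_id],
            # Assumed option: limits how many chunks are passed to the model
            "max_num_results": max_num_results,
        }],
        # Assumed option: returns the retrieved chunks alongside the answer
        "include": ["file_search_call.results"],
    }


request_kwargs = build_file_search_request(
    "marine le pen prison sentence", "vs_demo", max_num_results=3
)
# With a real client and vector store: client.responses.create(**request_kwargs)
```

Keeping the arguments in a plain dictionary makes it easy to log or compare requests before sending them.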

## Web Search API

### Introduction to Web Search
The OpenAI Web Search tool allows models to retrieve real-time information from the internet. 
This capability is particularly useful for obtaining up-to-date data, fact-checking, and expanding knowledge 
without relying solely on pre-trained information. 

By leveraging OpenAI's web search functionality, the Responses API can fetch external data 
and provide accurate, relevant results in real time (OpenAI, 2025). 
This feature enhances applications that require the latest insights, such as news aggregation, research, 
or dynamic content generation.

For more details, visit the official OpenAI documentation: 
[Web Search in Responses API](https://platform.openai.com/docs/guides/tools-web-search).

### Perform Web Search

In [33]:
web_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input=query,
    tools=[
        {
            "type": "web_search"
        }
    ]
)

In [34]:
display(Markdown(web_search_response.output_text))

On March 31, 2025, Marine Le Pen, leader of France's far-right Rassemblement National party, was convicted by a Paris criminal court for embezzling over €4 million in European Parliament funds. She received a four-year prison sentence, with two years to be served under electronic monitoring, a €100,000 fine, and a five-year ban from holding public office, effectively barring her from the 2027 presidential election. ([ft.com](https://www.ft.com/content/2c92f862-0e01-4cab-89a1-f1ebe34bdb24?utm_source=openai))

The court determined that between 2009 and 2016, Le Pen misused EU funds by paying party staff with money allocated for parliamentary assistants. This misappropriation was intended to alleviate the financial burden on her party. ([cadenaser.com](https://cadenaser.com/nacional/2025/03/30/francia-decide-el-futuro-de-le-pen-con-una-sentencia-que-podria-inhabilitarla-para-las-proximas-elecciones-cadena-ser/?utm_source=openai))

Le Pen has announced her intention to appeal the verdict. The appeal process is expected to conclude by the summer of 2026, ahead of the 2027 presidential elections. While the appeal is pending, the prison sentence and fine are suspended; however, the ban from public office remains in effect immediately. ([huffingtonpost.es](https://www.huffingtonpost.es/global/justicia-francesa-preve-resolver-recurso-le-pen-inhabilitacion-2026-presidenciales.html?utm_source=openai))

This ruling has intensified political tensions in France, with Le Pen's supporters denouncing the decision as an attack on democracy. The court, however, emphasized the importance of upholding electoral integrity and the rule of law. ([ft.com](https://www.ft.com/content/2c92f862-0e01-4cab-89a1-f1ebe34bdb24?utm_source=openai))


## Recent Developments in Marine Le Pen's Legal Case:
- [A bombshell judgment on Marine Le Pen](https://www.ft.com/content/2c92f862-0e01-4cab-89a1-f1ebe34bdb24?utm_source=openai)
- [Marine Le Pen election ban worsens French chaos](https://www.reuters.com/breakingviews/marine-le-pen-election-ban-worsens-french-chaos-2025-03-31/?utm_source=openai)
- [La justicia francesa prevé resolver el recurso de Le Pen contra su inhabilitación en 2026, antes de las presidenciales](https://www.huffingtonpost.es/global/justicia-francesa-preve-resolver-recurso-le-pen-inhabilitacion-2026-presidenciales.html?utm_source=openai) 
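
The links in the answer above are also available in structured form. A web search response carries its sources as `url_citation` annotations on the output text, and the sketch below collects them into a de-duplicated list. The helper name `collect_url_citations` is hypothetical, the attribute names follow the Responses API output format as understood here, and the stub response stands in for a real one so the example runs offline.

```python
from types import SimpleNamespace


def collect_url_citations(response):
    """Return (title, url) pairs from url_citation annotations, without duplicates."""
    seen = set()
    citations = []
    for item in response.output:
        # Skip tool-call items; only message items carry annotated text
        if getattr(item, "type", None) != "message":
            continue
        for part in item.content:
            for annotation in getattr(part, "annotations", None) or []:
                if getattr(annotation, "type", None) != "url_citation":
                    continue
                if annotation.url not in seen:
                    seen.add(annotation.url)
                    citations.append((annotation.title, annotation.url))
    return citations


# Offline stand-in shaped like a web search response (placeholder URLs)
stub_response = SimpleNamespace(output=[
    SimpleNamespace(type="web_search_call"),
    SimpleNamespace(type="message", content=[
        SimpleNamespace(type="output_text", annotations=[
            SimpleNamespace(type="url_citation", title="A", url="https://example.com/a"),
            SimpleNamespace(type="url_citation", title="B", url="https://example.com/b"),
            SimpleNamespace(type="url_citation", title="A", url="https://example.com/a"),
        ]),
    ]),
])
sources = collect_url_citations(stub_response)
```

In the notebook, `collect_url_citations(web_search_response)` would return the sources behind the answer shown above.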

### Stateful Response

The OpenAI Responses API can store responses on the server, which enables continuity across interactions. 
A stored response can be fetched again by its ID, and passing that ID as `previous_response_id` in a new request 
continues the conversation with the earlier context, allowing users to refine or expand upon previous searches. 
This is particularly useful for iterative research, dynamic content generation, and applications that require 
follow-up queries based on prior responses.

In [35]:
fetched_response = client.responses.retrieve(response_id=web_search_response.id)
display(Markdown(fetched_response.output_text[:100]))

On March 31, 2025, Marine Le Pen, leader of France's far-right Rassemblement National party, was con

### Continue Query with Web Search

In [36]:
continue_query = 'find different news'

continue_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input=continue_query,
    previous_response_id=web_search_response.id,
    tools=[
        {
            "type": "web_search"
        }
    ]
)

In [37]:
display(Markdown(continue_search_response.output_text))

In addition to the recent conviction of Marine Le Pen for embezzling European Union funds, she has faced other legal challenges:

**Defamation Conviction (October 2023):**
In October 2023, Marine Le Pen was found guilty of defamation against the French NGO Cimade. During a January 2022 interview, she accused Cimade of being "accomplices of smugglers" involved in an "illegal immigration network from the Comoros" in Mayotte. The court ruled that her remarks exceeded permissible limits of freedom of expression and imposed a suspended fine of €500, along with €2,000 in court costs and €1 in damages. ([euronews.com](https://www.euronews.com/2023/10/13/marine-le-pen-found-guilty-of-defamation-after-accusing-french-ngo-of-smuggling-migrants-i?utm_source=openai))

**Campaign Kits Affair (June 2024):**
In June 2024, France's Court of Cassation upheld the conviction of the Rassemblement National (formerly Front National) in the "Campaign Kits Affair." The party was fined €250,000 for orchestrating a financial scheme during the 2012 legislative elections. This scheme involved overcharging candidates for campaign materials, with services provided by the company Riwal being significantly overpriced. The funds were then used to finance future campaigns and benefit entrepreneurs linked to the party. Several party members received various sentences, including suspended prison terms and fines. ([lemonde.fr](https://www.lemonde.fr/societe/article/2024/06/19/la-cour-de-cassation-confirme-la-condamnation-du-rassemblement-national-dans-l-affaire-des-kits-de-campagne_6241414_3224.html?utm_source=openai))

These legal proceedings have significantly impacted Marine Le Pen's political career and the Rassemblement National's standing in French politics.


## Recent Legal Challenges for Marine Le Pen:
- [La Cour de cassation confirme la condamnation du Rassemblement national dans l'affaire des kits de campagne](https://www.lemonde.fr/societe/article/2024/06/19/la-cour-de-cassation-confirme-la-condamnation-du-rassemblement-national-dans-l-affaire-des-kits-de-campagne_6241414_3224.html?utm_source=openai)
- [Ces candidats RN qui ont été condamnés par la justice ou font l'objet de procédures judiciaires](https://www.lemonde.fr/politique/article/2024/07/05/ces-candidats-rn-qui-ont-ete-condamnes-par-la-justice-ou-font-l-objet-de-procedures-judiciaires_6247214_823448.html?utm_source=openai)
- [Affaire des assistants parlementaires du FN : tout comprendre aux enjeux du procès qui s'ouvre à Paris](https://www.lemonde.fr/les-decodeurs/article/2024/09/30/affaire-des-assistants-parlementaires-du-fn-tout-comprendre-aux-enjeux-du-proces-qui-s-ouvre-a-paris_6339923_4355770.html?utm_source=openai) 
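
The follow-up pattern shown in this section extends naturally to longer exchanges: each request passes the ID of the response before it. The sketch below captures that loop. `ask_in_sequence` is a hypothetical helper, and the stub client only records the arguments it receives so that the example runs without calling the API.

```python
from types import SimpleNamespace


def ask_in_sequence(client, questions, model="gpt-4o", tools=None):
    """Send questions one by one, chaining each to the previous response."""
    previous_id = None
    responses = []
    for question in questions:
        kwargs = {"model": model, "input": question}
        if tools:
            kwargs["tools"] = tools
        if previous_id is not None:
            # Link this turn to the prior one so the model sees its context
            kwargs["previous_response_id"] = previous_id
        response = client.responses.create(**kwargs)
        responses.append(response)
        previous_id = response.id
    return responses


class _StubResponses:
    """Offline stand-in that records each create() call."""

    def __init__(self):
        self.calls = []

    def create(self, **kwargs):
        self.calls.append(kwargs)
        return SimpleNamespace(id=f"resp_{len(self.calls)}", output_text="")


stub_responses = _StubResponses()
stub_client = SimpleNamespace(responses=stub_responses)
replies = ask_in_sequence(stub_client, ["first question", "follow-up", "another follow-up"])
```

With the real client, `ask_in_sequence(client, [query, continue_query], tools=[{"type": "web_search"}])` would reproduce the two-step exchange above.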

### Combining File Search and Web Search

The example below uses file search to analyze private data and web search to retrieve public, up-to-date information in a single request. 
The Responses API allows developers to integrate these tools to enhance retrieval-augmented generation (RAG) applications. 
By combining file search with web search, users can draw on internal knowledge while also retrieving real-time 
information from external sources, which helps keep responses comprehensive and current. 

In [38]:
combined_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input=query,
    temperature=0,
    instructions="Retrieve the results from the file search first, and use the web search tool to expand the results with news resources",
    tools=[
        {
            "type": "file_search",
            "vector_store_ids": [vector_store_id],
        },
        {
            "type": "web_search"
        }
    ]
)

In [39]:
display(Markdown(combined_search_response.output_text))

Marine Le Pen has been sentenced to four years in prison and banned from holding public office for five years. This conviction is related to charges of embezzling European Union funds.
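
The answer above is identical to the one from the file-search-only request, which may mean the model did not call web search this time. One way to check is to list the tool-call items in the response output. The sketch below assumes that tool invocations appear as output items whose `type` ends in `_call` (for example `file_search_call` or `web_search_call`); `tool_calls_made` is a hypothetical helper, and the stub response is an offline stand-in.

```python
from types import SimpleNamespace


def tool_calls_made(response):
    """List the output item types that represent tool calls, in order."""
    return [
        item.type
        for item in response.output
        if str(getattr(item, "type", "")).endswith("_call")
    ]


# Offline stand-in: a response in which only file search was invoked
stub_response = SimpleNamespace(output=[
    SimpleNamespace(type="file_search_call"),
    SimpleNamespace(type="message"),
])
calls = tool_calls_made(stub_response)
web_search_used = "web_search_call" in calls
```

Running `tool_calls_made(combined_search_response)` in the notebook would show which tools contributed to the combined answer; if web search is missing, making the instructions more explicit is one option to try.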