# OpenAI Responses API

## What is the OpenAI Responses API?

The Responses API, released in March 2025, combines the traditional 
Chat Completions API and the Assistants API, providing support for:

- **Traditional Chat Completions:** Facilitates seamless conversational AI experiences.
- **Web Search:** Enables real-time information retrieval from the internet.
- **File Search:** Allows searching within files for relevant data.

Accordingly, the Assistants API will be retired in 2026. 

> **For new users, OpenAI recommends using the Responses API instead of the Chat Completions API to leverage its expanded capabilities.**

For a comprehensive comparison between the Responses API and the Chat Completions API, refer to the official OpenAI documentation: 
[Responses vs. Chat Completions](https://platform.openai.com/docs/guides/responses-vs-chat-completions).

## Summary of This Notebook
This notebook provides a hands-on guide for using the **OpenAI Responses API** to analyze tweets. 
It covers essential techniques such as:

- **Creating a vector store** and uploading tweets for semantic search.
- **Using file search** to analyze private datasets.
- **Performing a web search** to retrieve the latest public information.
- **Utilizing stateful responses** to maintain conversation context.
- **Combining file and web search** to enhance retrieval-augmented generation (RAG) applications.

By the end of this notebook, users will be able to integrate OpenAI's Responses API for efficient data retrieval and analysis of structured and unstructured data.

## Install Required Libraries
To use the OpenAI Responses API, we need to install the following library:

- **`openai`**: Provides access to OpenAI's APIs, including the Responses API.

In [1]:
pip install openai -q

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
sparkmagic 0.21.0 requires pandas<2.0.0,>=0.17.1, but you have pandas 2.3.3 which is incompatible.
Note: you may need to restart the kernel to use updated packages.


## Import Required Libraries

In [2]:
from IPython.display import Markdown, display
import boto3
from botocore.exceptions import ClientError
import json
import io

## Retrieve Secrets from AWS Secrets Manager

In [3]:
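def get_secret(secret_name, region_name="us-east-1"):
    """Fetch a secret from AWS Secrets Manager and parse it as JSON."""
    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name
    )

    try:
        get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    except ClientError:
        # Surface AWS errors (missing secret, access denied, ...) to the caller unchanged
        raise

    # The secret is stored as a JSON string, so decode it into a dict
    secret = get_secret_value_response['SecretString']

    return json.loads(secret)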

## Initialize OpenAI Client

In [4]:
from openai import OpenAI
openai_api_key = get_secret('openai')['api_key']

client = OpenAI(api_key=openai_api_key)

## File Search API

### Introduction to File Search
The file search tool enables efficient retrieval of relevant information 
from uploaded files by leveraging vector-based indexing. This feature is particularly useful 
for searching large datasets, extracting insights, and improving retrieval-augmented generation (RAG) applications.

Rather than relying on keyword matching alone, file search uses embeddings 
to identify semantically relevant content, making it well suited to analyzing structured 
and unstructured text data (OpenAI, 2025).

For more details, visit the official OpenAI documentation: 
[File Search in Responses API](https://platform.openai.com/docs/guides/tools-file-search).
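
To make the idea of embedding-based retrieval concrete, here is a minimal sketch that ranks documents by cosine similarity. The three-dimensional vectors and the document labels are invented for illustration; real embeddings come from a model and have far more dimensions, and this is not a description of how the hosted vector store works internally.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy embeddings (not produced by any real model).
toy_docs = {
    "tweet_about_llms": [0.9, 0.1, 0.0],
    "tweet_about_cooking": [0.0, 0.2, 0.9],
    "tweet_about_image_models": [0.7, 0.6, 0.1],
}
toy_query = [1.0, 0.2, 0.0]

ranking = rank_by_similarity(toy_query, toy_docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Documents whose vectors point in a similar direction to the query rank first, even when they share no exact keywords with it.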

### Create a Vector Store

In [5]:
vector_store = client.vector_stores.create(
    name="my_vector_store"
)
vector_store_id = vector_store.id
print(vector_store_id)

vs_6914e300bea88191a0f4abe12b8c6dc7


### Upload Files

In [6]:
with open('tweet_text.json', 'rb') as f:
    file = client.files.create(
        file=f,            # file-like object
        purpose="assistants"
    )

file_id = file.id
print(file_id)

file-QEMuudr6WRbx7QC4F6Y7Wq


### Attach File to Vector Store

In [7]:
attach_status = client.vector_stores.files.create(
    vector_store_id=vector_store_id,
    file_id=file_id
)

print(attach_status.id)

file-QEMuudr6WRbx7QC4F6Y7Wq
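
Attaching a file starts an asynchronous ingestion job, so a search issued immediately afterwards may return no chunks. The sketch below polls the vector store file until its status leaves `in_progress`. The helper name `wait_for_ingestion` and its `timeout` and `interval` defaults are illustrative choices, not part of the SDK, and `client` is assumed to be the OpenAI client created earlier.

```python
import time

def wait_for_ingestion(client, vector_store_id, file_id, timeout=120.0, interval=2.0):
    """Poll a vector store file until processing finishes or the timeout expires.

    Returns the final status string reported by the API.
    """
    deadline = time.monotonic() + timeout
    while True:
        vs_file = client.vector_stores.files.retrieve(
            vector_store_id=vector_store_id,
            file_id=file_id,
        )
        if vs_file.status != "in_progress":
            return vs_file.status  # e.g. "completed", "failed", or "cancelled"
        if time.monotonic() >= deadline:
            raise TimeoutError(f"File {file_id} still in progress after {timeout} seconds")
        time.sleep(interval)
```

A call such as `wait_for_ingestion(client, vector_store_id, file_id)` before querying should return `"completed"` once the file is searchable.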


### Query the Vector Store

In [8]:
query = "the latest development in generativeAI"

In [9]:
search_results = client.vector_stores.search(
    vector_store_id=vector_store_id,
    query=query
)

for result in search_results.data[:5]:
    # Show the first 100 characters of each chunk with its similarity score
    print(f"{result.content[0].text[:100]}\n Relevance score: {result.score}")

## OpenAI Responses API

### Simple Response

In [10]:
simple_response = client.responses.create(
    model="gpt-4o",
    input=[
        {
            "role": "user",
            "content": query
        }
    ]
)

In [11]:
display(Markdown(simple_response.output_text))

As of the latest updates in 2023, generative AI continues to make significant strides in various areas. Key developments include:

1. **Advanced Language Models**: Models like GPT-4 have become more sophisticated, with improved understanding, context handling, and creativity in generating human-like text.

2. **Multimodal Models**: Systems like DALL-E 3 and CLIP are pushing boundaries by integrating text and image generation, allowing for more interactive and versatile AI applications.

3. **Real-Time Applications**: AI-generated content is being used in real-time applications, such as virtual assistants, customer service bots, and real-time translation services.

4. **Ethical and Safe AI**: There is a growing focus on developing ethical AI frameworks to ensure generative AI is used responsibly, addressing concerns such as bias, misinformation, and privacy.

5. **Creative Industries**: AI is becoming a tool for artists, musicians, and writers, aiding in creating artworks, composing music, and writing stories, enhancing creativity with computational power.

6. **Customization and Personalization**: Generative AI is being used to tailor content and experiences to individual preferences, from personalized learning environments to customized marketing strategies.

7. **Improved Training Techniques**: Researchers are working on more efficient training methods, reducing the computational resources needed and improving the scalability of models.

These developments highlight the rapid evolution and expanding capabilities of generative AI, influencing a wide range of industries and applications.

### File Search Response

In [12]:

file_search_response = client.responses.create(
    input=query,
    model="gpt-4o",
    temperature=0,
    tools=[{
        "type": "file_search",
        "vector_store_ids": [vector_store_id],
    }]
)

In [13]:
display(Markdown(file_search_response.output_text))


The latest developments in generative AI include:

1. **Agentic Workflows**: Amazon Web Services is exploring the future of AI with a focus on agentic workflows, which are being showcased alongside startups like NeuralSeek and Tarpit AI.

2. **AI in Supply Chain**: Atos has developed an AI-powered Supply Chain Disruption Analysis using generative AI, SAP BTP, and AWS Bedrock to assess risk and boost resilience.

3. **AI in Content Creation**: Generative AI is being used to create cinematic videos from prompts, incorporating audio and physics, which is a game changer for creators.

4. **Market Growth**: The global generative AI market is expected to reach $1.18 billion this year, highlighting its rapid growth and adoption across various sectors.

5. **Enterprise Applications**: IBM's Watsonx is bringing generative AI to enterprises, allowing teams to build custom large language models for enhanced customer engagement and streamlined processes.

These developments indicate a broadening application of generative AI across industries, from creative content to enterprise solutions.
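
To check which uploaded files an answer drew on, the response's output items can be inspected for citation annotations. The sketch below assumes the documented layout in which `output` contains `message` items whose text parts carry `file_citation` annotations with `file_id` and `filename` fields; the helper name `collect_file_citations` is hypothetical, and the attribute names should be verified against the installed SDK version.

```python
def collect_file_citations(response):
    """Return a sorted list of unique (file_id, filename) pairs cited in a response."""
    citations = set()
    for item in getattr(response, "output", None) or []:
        if getattr(item, "type", None) != "message":
            continue  # skip tool-call items such as "file_search_call"
        for part in getattr(item, "content", None) or []:
            for annotation in getattr(part, "annotations", None) or []:
                if getattr(annotation, "type", None) == "file_citation":
                    citations.add(
                        (annotation.file_id, getattr(annotation, "filename", None))
                    )
    # Sort by file id, then filename (treating a missing filename as empty)
    return sorted(citations, key=lambda c: (c[0], c[1] or ""))
```

For example, `collect_file_citations(file_search_response)` would list the files behind the answer above, if any citations were attached.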

## Web Search API

### Introduction to Web Search
The OpenAI Web Search tool allows models to retrieve real-time information from the internet. 
This capability is particularly useful for obtaining up-to-date data, fact-checking, and expanding knowledge 
without relying solely on pre-trained information. 

By leveraging OpenAI's web search functionality, the Responses API can fetch external data 
and provide accurate, relevant results in real time (OpenAI, 2025). 
This feature enhances applications that require the latest insights, such as news aggregation, research, 
or dynamic content generation.

For more details, visit the official OpenAI documentation: 
[Web Search in Responses API](https://platform.openai.com/docs/guides/tools-web-search).

### Perform Web Search

In [14]:
web_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input=query,
    tools=[
        {
            "type": "web_search"
        }
    ]
)

In [15]:
display(Markdown(web_search_response.output_text))

Here’s a detailed overview of the **latest developments in generative AI** as of November 12, 2025, based on recent news and industry announcements. For clarity, this response is structured into key focus areas, each supported with up-to-date evidence.

---

## Foundational Models & Multimodal Advances

- **OpenAI’s GPT-5**  
  Released on **August 7, 2025**, GPT-5 is the fifth-generation multimodal large language model. It combines reasoning and non-reasoning capabilities and is integrated into ChatGPT, Microsoft Copilot, and accessible via OpenAI’s API, delivering state-of-the-art benchmark performance ([en.wikipedia.org](https://en.wikipedia.org/wiki/GPT-5?utm_source=openai)).

- **Google’s Gemini Enhancements and “Nano Banana”**  
  - Throughout 2025, Google rolled out several advanced variants in its Gemini model family, culminating in Gemini 2.5 Pro (state-of-the-art reasoning and coding capabilities) and Gemini 2.5 Flash (optimized for speed), both becoming generally available in **June 2025** ([en.wikipedia.org](https://en.wikipedia.org/wiki/Gemini_%28language_model%29?utm_source=openai)).  
  - In **August 2025**, Google introduced “Nano Banana” (Gemini 2.5 Flash Image), a viral image editing model notable for photorealistic 3D-style “figurine” effects, subject consistency, multi-image fusion, and invisible SynthID watermarking ([en.wikipedia.org](https://en.wikipedia.org/wiki/Nano_Banana?utm_source=openai)).

- **Runway Gen-4 (Text-to-Video AI)**  
  Launched on **March 31, 2025**, Runway Gen-4 generates 5–10 second video clips from text prompts and reference images, supporting various resolutions (e.g., 720p) and delivering consistent characters within individual clips, a milestone in AI video generation ([en.wikipedia.org](https://en.wikipedia.org/wiki/Gen-4_%28AI_image_and_video_model%29?utm_source=openai)).

- **OpenAI Sora 2**  
  On **September 30, 2025**, OpenAI released Sora 2, a model capable of generating synchronized video with audio, enabling multi-shot consistency. This launch was accompanied by the Sora app, which functions as a social platform for AI-generated short clips, signaling a shift toward socialized content creation, albeit with raised concerns about IP and content moderation ([champaignmagazine.com](https://champaignmagazine.com/2025/10/05/ai-by-ai-weekly-top-5-september-29-october-5-2025/?utm_source=openai)).

- **Other Notable Model Launches**  
  - **Anthropic Claude Haiku 4.5**: A compact model launched in **October 2025** delivering flagship-level reasoning and coding abilities at 4–5× the speed and ~1/3 the cost of larger models. It supports a 200K-token context window and includes agentic capabilities ([voxfor.com](https://www.voxfor.com/what-is-new-in-ai-the-latest-news-from-october-2025/?utm_source=openai)).  
  - **Gemini Diffusion**: A novel experimental model by DeepMind applying diffusion techniques to text generation, enabling simultaneous text production and in-process error correction, achieving speeds up to 1,479 tokens per second ([spglobal.com](https://www.spglobal.com/market-intelligence/en/news-insights/research/generative-ai-digest-a-wave-of-notable-ai-model-launches?utm_source=openai)).  
  - **Alibaba Qwen3 Series**: Released in **April 2025**, this open-source model family (under Apache 2.0) includes dense models up to 32B parameters and MoE variants, delivering robust performance across coding, math, and instruction tasks in both “thinking” and “non-thinking” modes ([spglobal.com](https://www.spglobal.com/market-intelligence/en/news-insights/research/generative-ai-digest-a-wave-of-notable-ai-model-launches?utm_source=openai)).

---

## Specialized Applications & Research Innovations

- **Material Science with Generative AI**  
  - **SCIGEN**: MIT researchers introduced a tool that guides generative AI to design materials following structural “design rules,” accelerating the discovery of novel quantum materials with desired geometric properties ([news.mit.edu](https://news.mit.edu/2025/new-tool-makes-generative-ai-models-likely-create-breakthrough-materials-0922?utm_source=openai)).  
  - **Energy Storage**: NJIT scientists applied AI to explore sustainable alternatives to lithium-ion batteries, advancing material innovation in energy domains ([news.njit.edu](https://news.njit.edu/ai-breakthrough-njit-unlocks-new-materials-replace-lithium-ion-batteries?utm_source=openai)).

- **Personalized Object Localization**  
  MIT developed a training method enabling vision-language generative AI to recognize and localize personalized objects (e.g., a pet named “Snoofkin”) across new scenes, enhancing customization in image generation tasks ([news.mit.edu](https://news.mit.edu/2025/method-teaches-generative-ai-models-locate-personalized-objects-1016?utm_source=openai)).

- **Optical Generative Models**  
  UCLA researchers pioneered optical generative models that leverage light physics rather than electronic computation to create images, pointing toward more sustainable and energy-efficient AI methods ([phys.org](https://phys.org/news/2025-08-optical-generative-ushering-era-sustainable.html?utm_source=openai)).

- **Lightweight 3D Image Modeling**  
  The University of Tennessee at Chattanooga unveiled a lightweight AI model optimized for interpretable 3D image modeling with reduced computational demands, a significant step for efficient visual AI applications ([webpronews.com](https://www.webpronews.com/utcs-lightweight-ai-breakthrough-in-3d-image-modeling/?utm_source=openai)).

- **Sustainable Packaging with AI**  
  Nestlé and IBM jointly released a generative AI tool capable of identifying innovative high-barrier packaging materials, highlighting AI’s growing role in sustainable product development ([nestle.com](https://www.nestle.com/about/research-development/news/ibm-ai-powered-sustainable-packaging?utm_source=openai)).

---

## Adoption Trends, Ethical Considerations & Market Dynamics

- **Enterprise Adoption & Scaling Challenges**  
  According to the 2025 McKinsey Global Survey, generative and agentic AI are increasingly adopted across organizations. High-performing players are defining clear processes for when human oversight is needed, investing substantial portions (>20%) of their digital budgets in AI, yet most are still working to scale pilot projects into broad deployment ([mckinsey.com](https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai?utm_source=openai)).

- **Rapid Global Adoption**  
  A recent report indicates that **78% of organizations now use AI in at least one business function**, up from 55% a year prior, underscoring AI's transition from pilot to mainstream utility ([netguru.com](https://www.netguru.com/blog/ai-adoption-statistics?utm_source=openai)).

- **Market Fragmentation & Supply Considerations**  
  Trade tensions and chip tariffs are driving a geographic shift in AI hardware manufacturing, moving supply chains to Vietnam, Mexico, and India. Concurrently, China is reinforcing a self-sufficient AI ecosystem, marking a bifurcation in generative AI infrastructure and development trends ([globenewswire.com](https://www.globenewswire.com/news-release/2025/09/16/3150581/28124/en/Generative-AI-Market-Report-Highlights-US-Tariffs-Rising-Costs-and-Global-AI-Ecosystem-Shifts.html?utm_source=openai)).

- **Ethical Adoption in Professional Services**  
  At the 2025 Emerging Technology and Generative AI Forum, industry experts emphasized the importance of critical human oversight in ensuring ethical, accountable use of generative AI in professional sectors ([thomsonreuters.com](https://www.thomsonreuters.com/en-us/posts/technology/emerging-technology-generative-ai-forum-ethical-ai-adoption/?utm_source=openai)).

---

## Summary of the Latest Trends

1. **Advanced Multimodal Models**: GPT-5, Gemini 2.5, Sora 2, and others are pushing the boundaries of reasoning, video, and real-time capabilities.
2. **Efficiency & Accessibility**: Smaller, faster models (like Claude Haiku 4.5 and open-source Qwen3) are democratizing powerful AI.
3. **Purpose-Built AI Applications**: Tools like SCIGEN, personalized object localization, and optical image generation highlight AI’s expanding reach.
4. **Rapid Organizational Adoption**: Generative AI is becoming mission-critical, though challenges in scaling and oversight remain.
5. **Global Ecosystem Dynamics**: Supply-chain reconfigurations and regional AI ecosystems are influencing innovation trajectories.
6. **Ethics & Governance**: As generative AI becomes pervasive, ethical deployment and human governance are paramount.

---

If you’d like to explore any of these areas further, be it technical specifics of a model, enterprise strategy implications, or ethical frameworks, I’d be glad to dive deeper!
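
The web search answer embeds its sources as Markdown links, each carrying a `utm_source=openai` query parameter. As a rough sketch, the links can be pulled out of `output_text` and cleaned with the standard library alone. The helper names `strip_tracking` and `cited_sources` are hypothetical, and the regular expression only handles simple `[label](url)` links like the ones shown above.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Matches simple Markdown links of the form [label](http...)
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\((https?://[^\s)]+)\)")

def strip_tracking(url, param="utm_source"):
    """Return the URL with the given query parameter removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != param]
    return urlunsplit(parts._replace(query=urlencode(kept)))

def cited_sources(markdown_text):
    """Return unique cleaned URLs in order of first appearance."""
    seen = []
    for _label, url in LINK_PATTERN.findall(markdown_text):
        clean = strip_tracking(url)
        if clean not in seen:
            seen.append(clean)
    return seen
```

Calling `cited_sources(web_search_response.output_text)` would give a deduplicated source list that is easier to review than the inline citations.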

### Stateful Response

The OpenAI Responses API includes a stateful feature that enables continuity in interactions. 
A stored response can be fetched again by its ID, and passing that ID as `previous_response_id` in a new request lets a conversation persist across multiple queries, 
allowing users to refine or expand upon previous searches. This is particularly useful for iterative research, 
dynamic content generation, and applications that require follow-up queries based on prior responses.
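
For conversations with more than one follow-up, the chaining can be wrapped in a small loop. This is a sketch only: the helper name `run_conversation` is hypothetical, `client` is assumed to be an OpenAI client, and the `gpt-4o` default simply mirrors the model used elsewhere in this notebook.

```python
def run_conversation(client, prompts, model="gpt-4o", tools=None):
    """Send prompts in order, linking each turn to the previous response.

    Returns the list of response objects, one per prompt.
    """
    responses = []
    previous_id = None
    for prompt in prompts:
        kwargs = {"model": model, "input": prompt}
        if tools:
            kwargs["tools"] = tools
        if previous_id is not None:
            # Link this turn to the one before it so context carries over
            kwargs["previous_response_id"] = previous_id
        response = client.responses.create(**kwargs)
        responses.append(response)
        previous_id = response.id
    return responses
```

Only the first request is sent without `previous_response_id`; every later turn refers back to the response immediately before it.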

In [16]:
fetched_response = client.responses.retrieve(response_id=web_search_response.id)
display(Markdown(fetched_response.output_text[:100]))

Here’s a detailed overview of the **latest developments in generative AI** as of November 12, 2025, 

### Continue Query with Web Search

In [17]:
continue_query = 'find different news'

continue_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input= continue_query,
    previous_response_id=web_search_response.id,
    tools=[
        {
            "type": "web_search"
        }
    ]
)

In [18]:
display(Markdown(continue_search_response.output_text))

Here’s a refreshed and richly detailed overview of **recent developments in generative AI**, drawn from diverse, contemporary sources. This summary spans advances in foundational models, hardware innovations, and industry shifts, each with timely citations, and structured for clarity.

---

## Open-Weight Model Releases & Accessibility

- **OpenAI’s gpt-oss-120b and gpt-oss-20b**  
  In **August 2025**, OpenAI released two open-weight models (the first since GPT-2 in 2019) designed to broaden access to generative AI. The lighter 20B model can run locally on consumer hardware (e.g., 16 GB RAM PCs, Snapdragon processors), while the 120B version suits powerful GPUs like NVIDIA RTX and was trained using substantial compute resources. ([windowscentral.com](https://www.windowscentral.com/artificial-intelligence/openai-chatgpt/openai-launches-two-gpt-models-theyre-not-gpt-5-but-they-run-locally-on-snapdragon-pcs-and-nvidia-rtx-gpus?utm_source=openai))

- **AWS Integration of OpenAI Models**  
  In tandem with OpenAI’s release, **Amazon Web Services (AWS)** began offering these open models on Amazon Bedrock and SageMaker platforms. The 120B model is notably more cost-efficient (three times better than Gemini, five times than DeepSeek-R1, and twice than OpenAI’s o4), empowering millions of developers with flexible, economical generative AI access. ([timesofindia.indiatimes.com](https://timesofindia.indiatimes.com/technology/tech-news/amazon-announces-first-ever-availability-of-openai-models-for-its-cloud-customers-company-says-the-addition-of-/articleshow/123125170.cms?utm_source=openai))

---

## Multimodal & Reasoning Model Milestones

- **OpenAI’s GPT-5 (Released August 7, 2025)**  
  GPT-5 delivers integrated reasoning and non-reasoning capabilities as a full-fledged multimodal foundation model. It's available via ChatGPT, Microsoft Copilot, and the OpenAI API, setting new performance benchmarks upon launch. ([en.wikipedia.org](https://en.wikipedia.org/wiki/GPT-5?utm_source=openai))

- **Google’s Gemini Model Enhancements**  
  - **Gemini 2.5 Pro and Gemini 2.5 Flash** debuted in mid-2025, introducing “Deep Think” reasoning and enhanced multimodal fluency. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Gemini_%28language_model%29?utm_source=openai))  
  - **Nano Banana** (Gemini 2.5 Flash Image) launched publicly on **August 26, 2025**, gaining viral popularity for photorealistic “3D figurine” image generation and driving over 10 million new Gemini app users and 200 million image edits in subsequent weeks. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Nano_Banana?utm_source=openai))  

- **Gemini Diffusion**  
  An experimental model from DeepMind that uses diffusion techniques to generate text more quickly (up to **1,479 tokens per second**, compared to ~400 for Gemini 2.5 Flash and ~150 for GPT-4o), marking a breakthrough in the speed of text generation. ([spglobal.com](https://www.spglobal.com/market-intelligence/en/news-insights/research/generative-ai-digest-a-wave-of-notable-ai-model-launches?utm_source=openai))

- **Alibaba’s Qwen Series**  
  - Released **Qwen 2.5-VL-32B-Instruct** in March 2025: a multimodal model excelling at reasoning and visual tasks, open-sourced under Apache 2.0. ([sourceforge.net](https://sourceforge.net/articles/breaking-news-biggest-ai-advances-2025-heres-what-you-need-to-know/?utm_source=openai))  
  - In **April 2025**, Alibaba launched **Qwen3** family: dense and MoE variants with up to 32B parameters, available via open license and featuring reasoning modes and 128K token context windows. ([spglobal.com](https://www.spglobal.com/market-intelligence/en/news-insights/research/generative-ai-digest-a-wave-of-notable-ai-model-launches?utm_source=openai))  
  - On **September 5, 2025**, **Qwen3-Max** was released, surpassing other non-reasoning models in performance. Its thinking/reasoning capability rolled out publicly in early November 2025. ([en.wikipedia.org](https://en.wikipedia.org/wiki/Qwen?utm_source=openai))

---

## Infrastructure: Compute & Hardware Advancements

- **NVIDIA’s Next-Gen AI Chips**  
  At **GTC 2025**, NVIDIA revealed its upcoming chip architectures: **Blackwell Ultra** (coming late 2025), **Vera Rubin** (2026 launch), and **Rubin Ultra** (2027). These advances support the growing need for AI and agentic systems. NVIDIA also introduced open-source tools (including the Isaac GR00T N1 robotics model and the Cosmos AI synthetic data model) and announced the Newton physics engine for robotics simulation. ([apnews.com](https://apnews.com/article/457e9260aa2a34c1bbcc07c98b7a0555?utm_source=openai))

- **Qualcomm’s AI200 and AI250 Accelerators**  
  In **October 2025**, Qualcomm announced plans for new data center inference accelerators: **AI200** (2026) and **AI250** (2027). Built on advanced Hexagon NPUs, these systems offer high memory capacity, micro-tile inferencing, Gen AI encryption, and support for AI framework integration, positioning Qualcomm as a rising competitor to AMD and NVIDIA in AI infrastructure. ([tomshardware.com](https://www.tomshardware.com/tech-industry/artificial-intelligence/qualcomm-unveils-ai200-and-ai250-ai-inference-accelerators-hexagon-takes-on-amd-and-nvidia-in-the-booming-data-center-realm?utm_source=openai))

- **AMD’s MI400 Chips & Helios System**  
  Announced in **mid-2025**, AMD's **Instinct MI400** GPU series and the *Helios* rack-scale architecture (launching 2026) aim to rival NVIDIA’s solutions by unifying thousands of chips into a cohesive AI compute engine, endorsed by OpenAI and representing significant competition in the rack-scale AI hardware space. ([linkedin.com](https://www.linkedin.com/pulse/top-5-generative-ai-news-updates-from-week-24-2025-8th-shankar-k2cgc?utm_source=openai))

---

## Regional & Organizational Developments

- **Tencent’s Open-Source 3D Generation Tools**  
  In **March 2025**, Tencent introduced *Hunyuan3D-2.0*, releasing five open-source models, including fast “turbo” versions that generate 3D visuals in 30 seconds. This move underscores escalating Chinese capabilities in generative AI, particularly for design and game development. ([reuters.com](https://www.reuters.com/technology/artificial-intelligence/tencent-expands-ai-push-with-open-source-3d-generation-tools-2025-03-18/?utm_source=openai))

- **India’s BharatGen LLM & OpenAI Academy Expansion**  
  In **June 2025**, India launched *BharatGen*, a multimodal large language model supporting 22 Indian languages, tailored for localized use in sectors like healthcare and governance. Concurrently, OpenAI revealed its first overseas AI Academy in India, aiming to train developers, educators, and civil servants, backed by substantial computing access and crowdsourced engagement initiatives. ([linkedin.com](https://www.linkedin.com/pulse/top-5-generative-ai-news-updates-from-week-23-2025-1st-shankar-1tj5c?utm_source=openai))

- **Hugging Face Enters Robotics**  
  Hugging Face unveiled two open-source humanoid robots, *HopeJR* and *Reachy Mini*, priced approximately at US $3,000 and $250–300 respectively, expected to ship by end of 2025. These devices are part of the company's mission to democratize robotics innovation and expand beyond software into physical AI. ([linkedin.com](https://www.linkedin.com/pulse/top-5-generative-ai-news-updates-from-week-22-2025-25th-shankar-uahbc?utm_source=openai))

---

### Summary Table (Key Developments)

| Category                     | Highlights |
|------------------------------|------------|
| **Open Models**              | OpenAI’s gpt-oss series; AWS integration |
| **Model Advances**           | GPT-5; Gemini 2.5/Flash/Nano Banana; Gemini Diffusion; Alibaba Qwen series |
| **Hardware & Compute**       | NVIDIA Rubin chips; Qualcomm AI200/250; AMD MI400 & Helios |
| **Regional Innovations**     | Tencent 3D tools; India’s BharatGen & AI Academy |
| **Robotics**                 | Hugging Face’s HopeJR & Reachy Mini |

---

If you're interested in any specific area, whether it's a deep dive into GPU architectures, licensing strategies, regional AI ecosystems, or the future of robotics, feel free to ask. I'm happy to provide more focused analysis!

### Combining File Search and Web Search

This example uses file search to analyze private data and web search to retrieve public or recent data. 
The Responses API allows developers to integrate these tools to enhance retrieval-augmented generation (RAG) applications. 
By combining file search with web search, users can draw on internal knowledge while also retrieving real-time 
information from external sources, producing responses that are both comprehensive and up to date. 
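
When several requests mix the two tools, assembling the `tools` list in one place keeps the calls consistent. The helper below is a hypothetical convenience, not part of the SDK. The optional `max_num_results` setting for file search is included on the assumption that the installed API version supports it, so check the file search guide before relying on it.

```python
def build_tools(vector_store_ids=None, use_web_search=False, max_num_results=None):
    """Assemble the `tools` argument for client.responses.create."""
    tools = []
    if vector_store_ids:
        file_tool = {
            "type": "file_search",
            "vector_store_ids": list(vector_store_ids),
        }
        if max_num_results is not None:
            # Assumed optional cap on the number of retrieved chunks
            file_tool["max_num_results"] = max_num_results
        tools.append(file_tool)
    if use_web_search:
        tools.append({"type": "web_search"})
    return tools
```

For instance, `build_tools([vector_store_id], use_web_search=True)` yields a list with the same two tool entries used in the combined request in this section.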

In [19]:
combined_search_response = client.responses.create(
    model="gpt-4o",  # or another supported model
    input=query,
    temperature=0,
    instructions="Retrieve the results from the file search first, and use the web search tool to expand the results with news resources",
    tools=[
        {
            "type": "file_search",
            "vector_store_ids": [vector_store_id],
        },
        {
            "type": "web_search"
        }
    ]
)

In [20]:
display(Markdown(combined_search_response.output_text))

Recent developments in generative AI include:

1. **Market Growth**: The global generative AI market is expected to reach $1.18 billion this year, highlighting its rapid expansion and the increasing importance of adopting these technologies.

2. **Enterprise Integration**: IBM's Watsonx is bringing generative AI to enterprises, allowing teams to build custom large language models to enhance customer engagement and streamline processes.

3. **Innovation in Content Creation**: Generative AI is being used to create cinematic videos from simple prompts, revolutionizing content creation for marketing and entertainment.

4. **Industry Applications**: Generative AI is reshaping various sectors, from supply chain disruption analysis to financial services, by providing innovative solutions and improving efficiency.

5. **Ethical and Regulatory Considerations**: There is ongoing discussion about the regulation and ethics of generative AI, focusing on issues like deepfakes and content ownership.

These developments indicate a significant shift in how businesses and industries are leveraging AI to innovate and solve complex problems.

# 🧩 Try It Yourself: Two-Step RAG (Private Data + Combined Search)

## Step 1: Upload & Create Vector Store
1. Upload a short text file (e.g., `my_notes.txt`) to your notebook instance.  
2. Create a **vector store** and **ingest** your uploaded file.  
3. Run a simple test query to verify retrieval:  

In [21]:
tiy_vector_store = client.vector_stores.create(
    name="tiy_vector_store"
)
tiy_vector_store_id = tiy_vector_store.id
print("Vector store created:", tiy_vector_store_id)

with open("Drone_Interception_Rate.txt", "rb") as f:
    uploaded_file = client.files.create(
        file=f,
        purpose="assistants"
    )

print("File uploaded:", uploaded_file.id)


attach_result = client.vector_stores.files.create(
    vector_store_id=tiy_vector_store_id,
    file_id=uploaded_file.id,
)
print("File attached to vector store:", attach_result.id)


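# Ingestion is asynchronous: if no chunks are printed below, the file is probably
# still being processed, so wait a few seconds and re-run this search.
test_query = "Summarize the main ideas from my notes."
search_results = client.vector_stores.search(
    vector_store_id=tiy_vector_store_id,
    query=test_query,
)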

print("\nTop retrieved chunks:")
for item in search_results.data[:3]:
    print("-" * 40)
    print(item.content[0].text.strip())

Vector store created: vs_6914e47d5a4481919727f1951253ae44
File uploaded: file-PLZkN5U47s5N4SDu7QtoBp
File attached to vector store: file-PLZkN5U47s5N4SDu7QtoBp

Top retrieved chunks:


## Step 2: Combine File Search with Web Search
1. Enable both **file_search** and **web_search** in the Responses API.  
2. Use a prompt that asks the model to merge insights from both sources.  
   > Example: “Using my uploaded notes and the latest web information, summarize the current trends on this topic.”  
3. Review how the answer combines content from your file with **current info** from the web.

✅ You’ve created a RAG system that combines **private** and **public** data for comprehensive, up-to-date analysis.


In [22]:
query = (
    "Using the uploaded news article as the primary source, and also checking the latest web information, "
    "give me a short, structured summary of the topic (This is an ABC story about the falling Ukrainian interception rate and its effect on the nation). Clearly separate 'From my file' vs 'From the web'."
)

combined_response = client.responses.create(
    model="gpt-4o",
    input=query,
    temperature=0,
    instructions=(
        "First look up relevant passages from the attached vector store. "
        "Then augment with web_search to bring in current/public info. "
        "Present the answer in two sections: 'From my file(s)' and 'From the web'."
    ),
    tools=[
        {
            "type": "file_search",
            "vector_store_ids": [tiy_vector_store_id],
        },
        {
            "type": "web_search"
        }
    ],
)

display(Markdown(combined_response.output_text))

### From my file

The news article discusses the declining interception rates of Ukrainian air defenses against Russian drones and missiles. In October, the interception rate for drones fell to just under 80%, the lowest of 2025, down from over 90% earlier in the year. Similarly, missile interception rates dropped to 54%, the lowest since April. This decline is attributed to several factors, including an increase in the number of drones launched by Russia, potential shortages in munitions, and possibly adverse weather conditions.

The reduced interception rates have significant implications for Ukraine. Russian attacks have increasingly targeted critical infrastructure, particularly energy facilities, leading to widespread blackouts. The Ukrainian government, including President Volodymyr Zelenskyy, has highlighted the inadequacy of the country's air defense systems and the need for more Western support. The cost of intercepting drones with advanced missile systems is high, and Ukraine is struggling to keep up with the growing sophistication and volume of Russian attacks.

### From the web

I will now look up the latest information on this topic. Please hold on.Here is a structured summary of the topic, clearly separated into two sections:

From my file  
- The article reports a decline in Ukraine's air defense interception rates in 2025. Drone interception fell to just under 80% in October, the lowest level of the year, down from over 90% earlier. Missile interception dropped to 54%, the lowest since April.  
- Contributing factors include a surge in Russian drone launches, possible shortages of interceptors, and adverse weather conditions.  
- The decline has had serious consequences: Russian strikes increasingly target critical infrastructure, especially energy facilities, causing widespread blackouts. Ukrainian leaders, including President Zelenskyy, have emphasized the insufficiency of current air defenses and the urgent need for more Western support. The high cost of intercepting drones with advanced missile systems is straining Ukraine's resources.  

From the web  
- In September 2025, Russia's upgraded Iskander-M and Kinzhal missiles, with new terminal-phase maneuvering capabilities, caused Ukraine's ballistic missile interception rate to plummet from around 37% in August to just 6% in September. This shift has been described as a "game-changer," enabling successful strikes on key targets including drone factories and diplomatic offices in Kyiv ([newstarget.com](https://www.newstarget.com/2025-10-06-russia-upgraded-missiles-reduce-ukraine-interception-rate.html?utm_source=openai)).  
- As of October 11, 2025, Ukraine's overall air defense effectiveness stood at approximately 74%, according to Commander-in-Chief Oleksandr Syrskyi. He noted a 1.3-fold increase in Russian airstrikes over the past month and stressed the need to better protect energy and logistics infrastructure ([kyivindependent.com](https://kyivindependent.com/ukrainian-air-defenses-operating-at-74-effectiveness-syrskyi-says/?utm_source=openai)).  
- Ukraine has shifted toward drone-on-drone defense. President Zelenskyy reported a 68% interception rate against Russian Shahed drones using interceptor drones, which cost $3,000–$5,000 each, far less than the $120,000–$150,000 cost of a Shahed ([united24media.com](https://united24media.com/latest-news/zelenskyy-ukraines-interceptors-achieve-68-success-rate-against-russian-shahed-drones-12417?utm_source=openai)).  
- Kyiv has allocated 260 million hryvnias (about $6.2 million) to a drone interceptor program. A pilot phase has already intercepted nearly 550 Russian drones. Plans include establishing a training center and mobile interceptor units ([reuters.com](https://www.reuters.com/business/aerospace-defense/kyiv-allocate-62-million-drone-interceptor-program-2025-07-11/?utm_source=openai)).  
- As of early November 2025, Ukraine is ramping up production of interceptor drones, aiming for 600–800 per day by the end of the month, though this falls short of the initial goal of 1,000 per day ([businessinsider.com](https://www.businessinsider.com/ukraine-interceptor-drone-war-800-production-zelenskyy-2025-11?utm_source=openai)).  
- The broader air war remains intense: in July 2025, Russia launched a record 728 drones and 13 missiles in a single night. Ukrainian defenses intercepted 296 drones and jammed or lost 415, highlighting the scale of the threat and the strain on Ukraine's air defense systems ([apnews.com](https://apnews.com/article/fe3d23673b9b5696bb5097def9ed0775?utm_source=openai)).  
- Overall, Ukraine's air defense is under severe pressure. In May 2025, interception rates for Shahed drones dropped to around 30%, down from over 90% in 2024, due to improved Russian drone tactics, higher altitudes, heavier payloads, and anti-jamming capabilities ([lemonde.fr](https://www.lemonde.fr/en/international/article/2025/05/26/ukraine-s-air-defense-is-struggling-to-keep-up-with-intensifying-russian-strikes_6741666_4.html?utm_source=openai)).

Summary  
Ukraine's interception rates have declined significantly in 2025, particularly against advanced Russian missiles and drone swarms. The fall in effectiveness has led to increased damage to critical infrastructure and heightened civilian risk. In response, Ukraine is innovating with cost-effective interceptor drones, scaling up production, and investing in new defense programs. However, the evolving nature of Russian attacks continues to challenge Ukraine's air defense capabilities.
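
In the rendered answer above, an interim note ("Please hold on.") runs straight into a second, complete summary. A likely explanation is that the model emitted more than one message around its tool calls, and `output_text` simply joins the text of all of them. The sketch below walks the individual output items instead, so each message and its citations can be inspected separately. It assumes each item exposes a `type` (such as `message`, `file_search_call`, or `web_search_call`) and that text parts carry `annotations` of type `url_citation` or `file_citation`. `summarize_output` is a hypothetical helper, and the demo uses a hand-built stand-in object rather than a real API response.

```python
from types import SimpleNamespace


def summarize_output(response):
    """Return (item_types, messages, citations) for a Responses-style result."""
    item_types, messages, citations = [], [], []
    for item in response.output:
        item_types.append(item.type)
        if item.type != "message":
            continue  # tool-call items carry no answer text
        for part in item.content:
            if part.type != "output_text":
                continue
            messages.append(part.text)
            for ann in getattr(part, "annotations", None) or []:
                if ann.type == "url_citation":
                    citations.append(("web", ann.url))
                elif ann.type == "file_citation":
                    citations.append(("file", ann.file_id))
    return item_types, messages, citations


# Stand-in object that mimics the assumed response shape (illustrative data only).
fake_response = SimpleNamespace(output=[
    SimpleNamespace(type="file_search_call"),
    SimpleNamespace(type="message", content=[
        SimpleNamespace(type="output_text", text="Interim note.", annotations=[]),
    ]),
    SimpleNamespace(type="web_search_call"),
    SimpleNamespace(type="message", content=[
        SimpleNamespace(type="output_text", text="Final answer.", annotations=[
            SimpleNamespace(type="url_citation", url="https://example.com/a"),
            SimpleNamespace(type="file_citation", file_id="file-demo"),
        ]),
    ]),
])

item_types, messages, citations = summarize_output(fake_response)
print(item_types)
print(messages)
print(citations)
```

In the notebook, calling `summarize_output(combined_response)` and displaying only the last entry of `messages` would be one way to drop the interim note, provided the real response follows the shape assumed here.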