GPT-5 does not work well compared to 4.1

### Describe the bug
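Used as a drop-in replacement for GPT-4.1, GPT-5 asks for clarification instead of calling tools, sometimes says it will act without actually calling the tool, and is slower at handoffs.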

### Debug information
- Agents SDK version: 0.2.4
- Python version: 3.13

### Repro steps

I've been using GPT-5 as a drop-in replacement for GPT-4.1 for the last hour. It's a disaster. While 4.1 understood my requests, routed to the right agent, and called tools, GPT-5 keeps asking for more specifics and details before it'll call a tool. It's a much worse experience.

For example, I sent in "What is the weather tomorrow night?"

My system prompt has a location in it for each request as well as the current date time and time zone. There is a tool that takes latitude and longitude and a date and gets the hourly forecast.
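For anyone trying to reproduce this, here is a rough, stdlib-only sketch of the setup described above. It is not my actual code: `build_instructions`, `get_hourly_forecast`, `tomorrow_iso`, the coordinates, and the timestamp are all hypothetical stand-ins, and the forecast tool is stubbed out. Registering the tool with the Agents SDK (for example with its `function_tool` decorator) is left out so the snippet runs without an API key.

```python
from datetime import date, datetime, timedelta, timezone


def build_instructions(latitude: float, longitude: float, now: datetime) -> str:
    """Build the per-request system prompt: location, current date/time, time zone."""
    return (
        "You are a helpful assistant.\n"
        f"User location: latitude {latitude}, longitude {longitude}\n"
        f"Current date and time: {now.isoformat()}\n"
        f"Time zone: {now.tzinfo}\n"
    )


def get_hourly_forecast(latitude: float, longitude: float, day: str) -> list[dict]:
    """Stub tool: return 24 placeholder hourly rows for an ISO date (YYYY-MM-DD)."""
    date.fromisoformat(day)  # raises ValueError if the date is malformed
    return [
        {"date": day, "hour": hour, "latitude": latitude,
         "longitude": longitude, "summary": "placeholder"}
        for hour in range(24)
    ]


def tomorrow_iso(now: datetime) -> str:
    """Resolve "tomorrow" from the current local time given in the prompt."""
    return (now + timedelta(days=1)).date().isoformat()


# Hypothetical sample values, only for illustration.
now = datetime(2025, 8, 7, 15, 30, tzinfo=timezone(timedelta(hours=-7)))
instructions = build_instructions(47.6, -122.3, now)
forecast = get_hourly_forecast(47.6, -122.3, tomorrow_iso(now))
```

The point of the sketch is that the prompt already carries everything needed to turn "tomorrow night" into a concrete date and coordinates, so no follow-up question should be required before the tool call.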

GPT-4.1 does this 100% correctly. It gets the hourly forecast and gives a great answer.

GPT-5 asks if I mean tomorrow night (with the date) specifically and doesn't call the tool without more input.

It even sometimes says it'll do something but doesn't actually call the tool.
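One way to confirm the "says it will but doesn't" case is to inspect the items a run produced and check whether any of them is a tool call. The helper below is a minimal sketch under an assumption: that each run item exposes a `type` string and that tool calls are tagged `"tool_call_item"`, which is my understanding of the Agents SDK's run items but should be checked against the installed version. `made_tool_call` is a hypothetical name, and the sample items are plain stand-ins rather than real SDK objects.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def made_tool_call(items: Iterable[Any]) -> bool:
    """Return True if any run item is tagged as a tool call.

    Assumes items carry a `type` attribute and that tool calls use the
    tag "tool_call_item" (an assumption about the SDK's item types).
    """
    return any(getattr(item, "type", None) == "tool_call_item" for item in items)


# Stand-in items: a run that only produced a message, and one that called a tool.
talk_only = [SimpleNamespace(type="message_output_item")]
with_tool = [
    SimpleNamespace(type="tool_call_item"),
    SimpleNamespace(type="tool_call_output_item"),
    SimpleNamespace(type="message_output_item"),
]

print(made_tool_call(talk_only))
print(made_tool_call(with_tool))
```

In a real run the list would come from the run result (for example its `new_items`), so a reply that promises an action while this returns `False` matches the behavior described here.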

This is an unmitigated disaster and GPT-5 doesn't work at all for agents.

It's also very slow at handing off, even though the actual responses come back quickly. Overall, 4.1 is faster.


### Expected behavior
GPT-5 should outperform 4.1 at tool calling, but it understands commands and tools less well, fails to take the correct actions, and is slower than 4.1.


GPT-5 does not work well compared to 4.1 #1397