Realtime history gets corrupted when user speaks multiple times #106

Open
@yusuf-eren

Description

Hi, I was checking out the examples/realtime-next example and noticed a bug:

When I speak multiple times before getting a response from the API, the UI doesn’t show the user’s message properly. Instead, it just displays a placeholder (three dots):

Image

It looks like the transcript becomes null whenever a second (or later) speech input arrives before the response.
Here's the relevant history_updated log I saved to localStorage:

[
    {
        "itemId": "item_BiOeLkjsA8mRMdIaKzbuq",
        "previousItemId": null,
        "type": "message",
        "role": "user",
        "status": "completed",
        "content": [
            {
                "type": "input_audio",
                "audio": null,
                "transcript": "Hello"
            }
        ]
    },
    {
        "itemId": "item_BiOeNJZCBzGFVS415Pe0E",
        "type": "message",
        "role": "assistant",
        "status": "in_progress",
        "content": [
            {
                "type": "audio",
                "transcript": "Top of the morning to you! How can I assist you today?",
                "audio": null
            }
        ]
    },
    {
        "itemId": "item_BiOeZP90TMwMVtovNyI8q",
        "previousItemId": "item_BiOeNJZCBzGFVS415Pe0E",
        "type": "message",
        "role": "user",
        "status": "completed",
        "content": [
            {
                "type": "input_audio",
                "transcript": null,
                "audio": null
            }
        ]
    },
    {
        "itemId": "item_BiOecAps53nugatsqDAdS",
        "type": "message",
        "role": "assistant",
        "status": "in_progress",
        "content": [
            {
                "type": "audio",
                "transcript": "It sounds like you're having some trouble with your Google Calendar. Unfortunately, I can't directly access or manage Google accounts. However, I recommend checking your account settings, making sure you're logged into the correct account, and verifying that your calendar isn't hidden. Do you need any other assistance?",
                "audio": null
            }
        ]
    }
]
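To spot the affected entries in a saved log like the one above, a small filter can help. This is only a sketch: the `LoggedPart` and `LoggedItem` types and the `findMissingUserTranscripts` helper are hypothetical names modeled on the fields visible in the log, not types or functions from the SDK, and the assistant transcript in the sample is trimmed.

```typescript
// Hypothetical shapes modeled on the log above; not the SDK's own types.
type LoggedPart = { type: string; transcript?: string | null; audio?: string | null };
type LoggedItem = {
  itemId: string;
  previousItemId?: string | null;
  type: string;
  role: string;
  status: string;
  content: LoggedPart[];
};

// Returns the ids of user messages that contain an input_audio part
// whose transcript is null or missing.
function findMissingUserTranscripts(history: LoggedItem[]): string[] {
  return history
    .filter((item) => item.type === 'message' && item.role === 'user')
    .filter((item) =>
      item.content.some(
        (part) =>
          part.type === 'input_audio' &&
          (part.transcript === null || part.transcript === undefined),
      ),
    )
    .map((item) => item.itemId);
}

// Sample data copied from the log above (assistant transcript trimmed).
const loggedHistory: LoggedItem[] = [
  {
    itemId: 'item_BiOeLkjsA8mRMdIaKzbuq',
    previousItemId: null,
    type: 'message',
    role: 'user',
    status: 'completed',
    content: [{ type: 'input_audio', audio: null, transcript: 'Hello' }],
  },
  {
    itemId: 'item_BiOeNJZCBzGFVS415Pe0E',
    type: 'message',
    role: 'assistant',
    status: 'in_progress',
    content: [{ type: 'audio', transcript: 'Top of the morning to you!', audio: null }],
  },
  {
    itemId: 'item_BiOeZP90TMwMVtovNyI8q',
    previousItemId: 'item_BiOeNJZCBzGFVS415Pe0E',
    type: 'message',
    role: 'user',
    status: 'completed',
    content: [{ type: 'input_audio', transcript: null, audio: null }],
  },
];

const missingIds = findMissingUserTranscripts(loggedHistory);
console.log(missingIds); // [ 'item_BiOeZP90TMwMVtovNyI8q' ]
```

Only the second user item is reported, which matches the entry rendered as three dots in the UI.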

To work around the issue, I replaced the default history_updated handler in app/page.tsx with a custom listener on the transport_event. This ensures user speech is added to history as soon as the transcription is available:

// Replace only the 'transport_event' handler in app/page.tsx and comment out `session.current.on('history_updated', ...)`
    session.current.on('transport_event', (event) => {
      setEvents((events) => [...events, event]);

      // Handle user speech transcription completed
      if (
        event.type === 'conversation.item.input_audio_transcription.completed'
      ) {
        console.log('User spoke:', event.transcript);
        // Update history with user transcription
        setHistory((prevHistory) => {
          // Check that this item isn't already in history
          const existingItem = prevHistory.find(
            (item) => item.itemId === event.item_id,
          );
          if (existingItem) {
            return prevHistory;
          }

          return [
            ...prevHistory,
            {
              itemId: event.item_id,
              type: 'message',
              role: 'user',
              content: [{ type: 'input_audio', transcript: event.transcript }],
              status: 'completed',
            } as RealtimeItem,
          ];
        });
      }

      // Handle AI speech completed
      if (
        event.type === 'response.content_part.done' &&
        event.part?.type === 'audio'
      ) {
        console.log('AI spoke:', event.part.transcript);
        // Update history with AI transcription
        setHistory((prevHistory) => {
          // Check that this item isn't already in history
          const existingItem = prevHistory.find(
            (item) => item.itemId === event.item_id,
          );
          if (existingItem) {
            return prevHistory;
          }

          return [
            ...prevHistory,
            {
              itemId: event.item_id,
              type: 'message',
              role: 'assistant',
              content: [{ type: 'audio', transcript: event.part.transcript }],
              status: 'completed',
            } as RealtimeItem,
          ];
        });
      }
    });

User interface with my workaround:
Image

Debug information

  • Agents SDK version: 0.0.7
  • Runtime environment: Node.js v22.16.0

Expected behavior

This should ideally be fixed inside the 'history_updated' event emitted by realtimeSession.ts.
It should handle overlapping user input correctly and avoid inserting null transcripts into the conversation history.
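One way to picture the expected result, without replacing the 'history_updated' handler entirely, is to patch null transcripts from a side table keyed by item id. The sketch below is an assumption about how that could look, not the SDK's implementation: `HistoryPart`, `HistoryEntry`, and `applyTranscripts` are hypothetical names, the map would be filled from the `item_id` and `transcript` fields of the 'conversation.item.input_audio_transcription.completed' events seen on 'transport_event', and the transcript string in the example is a made-up placeholder.

```typescript
// Hypothetical shapes modeled on the history_updated log in this issue; not the SDK's own types.
type HistoryPart = { type: string; transcript?: string | null; audio?: string | null };
type HistoryEntry = {
  itemId: string;
  type: string;
  role: string;
  status: string;
  content: HistoryPart[];
};

// Fills in null or missing input_audio transcripts on user messages from a
// side table of transcripts keyed by item id. Items without a known
// transcript, and all non-user items, are returned unchanged.
function applyTranscripts(
  history: HistoryEntry[],
  transcriptsById: Map<string, string>,
): HistoryEntry[] {
  return history.map((item) => {
    const known = transcriptsById.get(item.itemId);
    if (item.role !== 'user' || known === undefined) return item;
    return {
      ...item,
      content: item.content.map((part) =>
        part.type === 'input_audio' &&
        (part.transcript === null || part.transcript === undefined)
          ? { ...part, transcript: known }
          : part,
      ),
    };
  });
}

// Example: the second user item from the log, patched with a placeholder transcript.
const patchedHistory = applyTranscripts(
  [
    {
      itemId: 'item_BiOeZP90TMwMVtovNyI8q',
      type: 'message',
      role: 'user',
      status: 'completed',
      content: [{ type: 'input_audio', transcript: null, audio: null }],
    },
  ],
  new Map([['item_BiOeZP90TMwMVtovNyI8q', 'placeholder transcript']]),
);
console.log(patchedHistory[0]?.content[0]?.transcript);
```

In app/page.tsx this could be called inside the 'history_updated' listener before `setHistory`, so the item order still comes from the session while the transcripts come from the transport events. Whether this matches the root cause inside realtimeSession.ts is not confirmed here.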
