FIX: Fix multi-turn attacks using RealtimeTarget #1638
Merged
romanlutz
approved these changes
Apr 22, 2026
Contributor
Did you check the GUI? Does it work there? It should, just asking.
Description
This PR fixes a bug where multi-turn attacks against `RealtimeTarget` did not work properly because the context from previous turns was lost. The cause was a forced connection close and reconnect to the OpenAI/AzureOpenAI server in `send_text_async` and `send_audio_async`, which discarded all server-side conversation context.
- The forced close and reconnect is removed from `send_text_async` and `send_audio_async`, and the WebSocket connection now persists across turns, preserving server-side conversation state.
- Handling of a leftover `response.done` event: the event is skipped if it is the very first event received (tracked with the `current_turn_event_count` variable) and contains no audio. This occurs when an unconsumed `response.done` event is left in the WebSocket buffer from a prior turn's soft-finish, where an `audio.done` event was received but no `response.done` arrived within the 1-second grace period. The method breaks normally if the event is preceded by other events or if audio data is present.
Tests and Documentation
Added tests for multi-turn attacks with RealtimeTarget and re-ran notebook.
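The `response.done` skip rule from the Description can be illustrated with a minimal sketch. This is not the code from the PR: the function name `collect_turn_audio`, the dict-shaped events, and the `"type"` / `"audio"` keys are hypothetical stand-ins for the real event objects. Only the `current_turn_event_count` name and the skip condition come from the description.

```python
from typing import Iterable


def collect_turn_audio(events: Iterable[dict]) -> bytes:
    """Collect audio for one turn from a stream of simplified event dicts.

    Assumption: each event is a dict with a "type" key and an optional
    "audio" bytes payload. Real server events are richer than this.
    """
    audio = bytearray()
    current_turn_event_count = 0  # counts events seen in this turn

    for event in events:
        current_turn_event_count += 1
        event_type = event.get("type")
        payload = event.get("audio", b"")

        if event_type == "response.done":
            # A response.done that arrives first and carries no audio is
            # treated as a leftover from the previous turn and ignored.
            if current_turn_event_count == 1 and not payload:
                continue
            # Otherwise the turn ends normally.
            audio.extend(payload)
            break

        if event_type == "response.audio.delta":
            audio.extend(payload)

    return bytes(audio)


# A leftover response.done at the start of the turn does not end the turn early.
stale_then_real = [
    {"type": "response.done"},
    {"type": "response.audio.delta", "audio": b"ab"},
    {"type": "response.done"},
]
turn_audio = collect_turn_audio(stale_then_real)
```

Because the count keeps increasing after the skipped event, any later `response.done` in the same turn ends the loop as usual.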