Fix Phonic generate_reply to resolve with the current GenerationCreatedEvent#5147

Merged
theomonnom merged 4 commits intolivekit:mainfrom
Phonic-Co:qiong/fix-generate-reply-await-playout
Mar 19, 2026

@qionghuang6
Contributor

@qionghuang6 qionghuang6 commented Mar 18, 2026

Previously, the Phonic plugin's generate_reply created a dummy generation and resolved the returned future immediately, before Phonic had acted on the request.

This means that when we do await session.generate_reply(instructions="Greet the user, asking about their day."), the await returns too early. With the other plugins, it resolves only when the reply speech actually finishes (or is interrupted by user speech).

With this change, instead of creating a dummy generation, we resolve the generate_reply future the next time assistant_started_speaking is received from Phonic, at which point we create a new generation.
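The deferred resolution described above can be sketched with plain asyncio. This is a minimal illustration, not the plugin's code: DeferredReplySession, GenerationCreated, and on_assistant_started_speaking are hypothetical stand-ins, and no audio or WebSocket traffic is modelled.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass


@dataclass
class GenerationCreated:
    """Hypothetical stand-in for the plugin's GenerationCreatedEvent."""

    generation_id: int


class DeferredReplySession:
    """Toy session showing only the future handling (assumed names)."""

    def __init__(self) -> None:
        self._pending_fut: asyncio.Future[GenerationCreated] | None = None
        self._next_id = 0

    def generate_reply(self) -> asyncio.Future[GenerationCreated]:
        # No dummy generation: remember the future and hand it back unresolved.
        self._pending_fut = asyncio.get_running_loop().create_future()
        return self._pending_fut

    def on_assistant_started_speaking(self) -> None:
        # A real generation starts here, so the pending future can resolve.
        self._next_id += 1
        event = GenerationCreated(self._next_id)
        if self._pending_fut is not None and not self._pending_fut.done():
            self._pending_fut.set_result(event)
        self._pending_fut = None


async def demo() -> tuple[bool, int]:
    session = DeferredReplySession()
    fut = session.generate_reply()
    done_before_event = fut.done()  # still pending at this point
    session.on_assistant_started_speaking()
    event = await fut
    return done_before_event, event.generation_id
```

Running asyncio.run(demo()) in this sketch shows the future still pending right after generate_reply and resolved only once the simulated assistant_started_speaking arrives.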

We can repro this behavior when doing something like:

    logger.info(f"Generating reply, {time.time()}")
    await session.generate_reply(
        instructions="Greet the user, asking about their day.",
    )
    logger.info(f"Reply generated, {time.time()}")

Please see the video demo for a repro:

https://screen.studio/share/00Yuk0vN?state=uploading

@qionghuang6 qionghuang6 marked this pull request as ready for review March 18, 2026 21:44
devin-ai-integration[bot]

This comment was marked as resolved.

@theomonnom
Member

theomonnom commented Mar 19, 2026

If we send multiple generate_reply requests, are we sure the future is picking the right one? It isn't really synchronized with the server, is it?

    self._close_current_generation(interrupted=False)

    if self._pending_generate_reply_fut and not self._pending_generate_reply_fut.done():
        self._pending_generate_reply_fut.cancel()
Member

E.g., the future here is cancelled, but nothing is cancelled on the server side, right?

Contributor Author

Phonic currently doesn't explicitly associate any audio chunks with a certain generate_reply message, but any generate_reply message received over the WebSocket will cease any existing audio generation and begin a new one, so the next assistant_started_speaking message will correspond to that of the latest generate_reply message.

I think in this case I'm just cancelling the future so that it doesn't finish with an exception.
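The "latest generate_reply wins" behaviour discussed in this thread can be illustrated with a small asyncio sketch. It is only a local model under stated assumptions: LatestWinsSession and its methods are hypothetical names, nothing is sent to or cancelled on a server, and the resolved value is just the instruction string rather than a real generation event.

```python
import asyncio


class LatestWinsSession:
    """Toy model of superseding requests; all names are hypothetical."""

    def __init__(self) -> None:
        self._pending_fut = None
        self._pending_instructions = None

    def generate_reply(self, instructions: str) -> asyncio.Future:
        # A newer request supersedes the older one, so the older future is
        # cancelled locally. This sketch does not talk to any server.
        if self._pending_fut is not None and not self._pending_fut.done():
            self._pending_fut.cancel()
        self._pending_fut = asyncio.get_running_loop().create_future()
        self._pending_instructions = instructions
        return self._pending_fut

    def on_assistant_started_speaking(self) -> None:
        # The next speaking event is attributed to the latest request.
        if self._pending_fut is not None and not self._pending_fut.done():
            self._pending_fut.set_result(self._pending_instructions)
        self._pending_fut = None


async def demo_latest_wins():
    session = LatestWinsSession()
    first = session.generate_reply("first")
    second = session.generate_reply("second")
    session.on_assistant_started_speaking()
    return first.cancelled(), await second
```

In this model the first future ends up cancelled and the second resolves on the next speaking event, which mirrors the assumption that the server drops any in-flight generation when a new generate_reply message arrives.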

@theomonnom theomonnom merged commit 3c20a1a into livekit:main Mar 19, 2026
10 checks passed

2 participants