add bey avatar example #1669

Closed
niqodea wants to merge 36 commits into livekit:main from niqodea:add-bey-example

Conversation

@niqodea
Contributor

@niqodea niqodea commented Mar 17, 2025

Add resources for the Beyond Presence API.

  • livekit-agents-bey: plugin to handle API calls and local setup for avatar generation
  • examples/avatar/bey: a basic script demonstrating how to use the API via the plugin

Marking this as a draft since the API is not live yet. Feedback on integration or improvements is welcome!

Note: we reserved the livekit-plugins-bey PyPI package name, let me know if I should add someone from the LiveKit team as owner. 🙏

@changeset-bot

changeset-bot bot commented Mar 17, 2025

⚠️ No Changeset found

Latest commit: 7b8acb3

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@CLAassistant

CLAassistant commented Mar 17, 2025

CLA assistant check
All committers have signed the CLA.

Comment on lines 80 to 109
# allow your local agent to publish transcripts on behalf of the avatar agent
.with_attributes({ATTRIBUTE_PUBLISH_ON_BEHALF: ctx.room.local_participant.identity})
Contributor Author

Not sure this comment is correct

Contributor

I think it's to mark that the avatar agent publishes video and audio on behalf of the agent.

Contributor

@longcw maybe related to our prior discussion on #1424: Is there a canonical way for hiding one of the call participants now such that the user has the experience of a 1:1 call?

How we implemented the plugin here is that only the avatar worker publishes to the room such that the agent can be hidden client side later, and we thought this line above allows the agent to publish transcripts in the name of the avatar.

However, from your comment it sounds like the intended way now is to somehow have the avatar publish audio/video in the name of the agent and then have the avatar worker hidden?

Maybe you can clarify this and give us some suggestions for best practices around this once you review this file

Contributor

Ideally it works the latter way you described: the avatar publishes audio/video in the name of the agent, and the avatar worker is then hidden.

The client can detect this by reading the ATTRIBUTE_PUBLISH_ON_BEHALF attribute and then handle the avatar participant as designed.

Contributor

The benefit is that the other operations related to the agent can remain unchanged, e.g. performing an RPC call or sending text or a file to the agent through a data stream. These operations need a destination participant identity, so using the agent identity is straightforward.
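
A minimal sketch of the client-side handling described in this thread, written in plain Python rather than against a real client SDK. The `Participant` dataclass and the `visible_participants` helper are hypothetical, and the attribute key string is an assumption standing in for the value of `ATTRIBUTE_PUBLISH_ON_BEHALF`. The idea is to hide the participant that publishes on behalf of another one and show its tracks under the identity it publishes for, so that identity stays the destination for RPC calls and data streams.

```python
from dataclasses import dataclass, field

# Assumed value of ATTRIBUTE_PUBLISH_ON_BEHALF; verify it against your SDK version.
PUBLISH_ON_BEHALF = "lk.publish_on_behalf"


@dataclass
class Participant:
    """Hypothetical stand-in for a room participant as seen by a client."""

    identity: str
    attributes: dict[str, str] = field(default_factory=dict)
    tracks: list[str] = field(default_factory=list)


def visible_participants(participants: list[Participant]) -> list[Participant]:
    """Fold each on-behalf publisher into the participant it publishes for."""
    by_identity = {p.identity: p for p in participants}
    hidden: set[str] = set()
    extra_tracks: dict[str, list[str]] = {}

    for p in participants:
        target = p.attributes.get(PUBLISH_ON_BEHALF)
        # Only hide the publisher if the participant it points to is in the room.
        if target and target in by_identity and target != p.identity:
            hidden.add(p.identity)
            extra_tracks.setdefault(target, []).extend(p.tracks)

    result = []
    for p in participants:
        if p.identity in hidden:
            continue
        merged = p.tracks + extra_tracks.get(p.identity, [])
        result.append(Participant(p.identity, dict(p.attributes), merged))
    return result


room = [
    Participant("user-1", tracks=["mic"]),
    Participant("agent-1"),
    Participant("bey-avatar", {PUBLISH_ON_BEHALF: "agent-1"}, ["avatar-audio", "avatar-video"]),
]
shown = visible_participants(room)
print([(p.identity, p.tracks) for p in shown])
# → [('user-1', ['mic']), ('agent-1', ['avatar-audio', 'avatar-video'])]
```

With this mapping the user sees a 1:1 call with `agent-1`, while the avatar participant never appears in the UI.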

Contributor

[image]

I see the advantage of keeping the other operations unchanged, but I guess the main thing we want to achieve is that we can speak to an avatar as a user, right? Do you have an example of how the user could see the video through the "agent worker" in the frontend?

We're using ATTRIBUTE_PUBLISH_ON_BEHALF like in your latest example on the dev branch (https://github.com/livekit/agents/blob/dev-1.0/examples/avatar/agent_worker.py#L48), but in the end we get our "Avatar Worker", which outputs video and audio (see screenshot). The other "agent worker" outputs neither audio nor video and, I guess, is just there to forward the user audio, so we're currently hiding that one (agent worker) on the frontend.

So does the ATTRIBUTE_PUBLISH_ON_BEHALF config not work properly? I.e., should the "agent worker" receive the audio and video we send through the "avatar worker"?

Contributor

It needs some modification on the frontend client. For example, if you are using the LiveKit playground, it will only show the avatar video without another participant (just to show it's feasible).
[image]

I think we will support this in the client SDK, or you can customize the frontend to hide the participant with ATTRIBUTE_PUBLISH_ON_BEHALF.

Contributor

I guess the source of the misunderstanding is that ATTRIBUTE_PUBLISH_ON_BEHALF is not yet handled automatically by the client SDK.

Contributor

@fa9r fa9r Mar 18, 2025

@longcw could you share a code snippet for how this is implemented in the LK playground?

EDIT: I'd also suggest adding this somewhere in your docs / avatar examples / ... so people know the proper way to handle it.

Contributor

Sure, I'll share an example of how to handle ATTRIBUTE_PUBLISH_ON_BEHALF in the client.

Contributor Author

Is this requirements.txt correct? Not sure, other plugins seem to not specify livekit but maybe I am missing something

Comment on lines +1 to +11
# LiveKit Beyond Presence Avatar Example

This example demonstrates how to create an animated avatar using Beyond Presence that responds to audio input using LiveKit's agent system.
The avatar worker generates synchronized video and audio based on received audio input using the Beyond Presence API.

## How it Works

1. The LiveKit agent and the Beyond Presence avatar worker both join into the same LiveKit room as the user.
2. The LiveKit agent listens to the user and generates a conversational response, as usual.
3. However, instead of sending audio directly into the room, the agent sends the audio via WebRTC data channel to the Beyond Presence avatar worker.
4. The avatar worker only listens to the audio from the data channel, generates the corresponding avatar video, synchronizes audio and video, and publishes both back into the room for the user to experience.
Contributor Author

I would be inclined to change "avatar worker" to "avatar agent", since that seems more in line with what it actually is (an agent joining the call). A worker, from what I understood, is a process that takes care of a job and can spawn zero or more agents for the room. WDYT?

Comment on lines +52 to +60
@local_agent_session.output.audio.on("playback_finished")
def on_playback_finished(ev: PlaybackFinishedEvent) -> None:
    logger.info(
        "playback_finished",
        extra={"playback_position": ev.playback_position, "interrupted": ev.interrupted},
    )
Contributor Author

Copied this instruction from other examples, what is its purpose exactly?

Contributor

Here it's just for logging. You can ignore it.
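
To illustrate why the handler can be dropped safely, here is a minimal sketch of the decorator-based registration pattern used in the snippet. `AudioOutput` and `PlaybackFinishedEvent` are hypothetical stand-ins defined here, not the real LiveKit classes. The handler only observes the event, so removing it changes nothing about playback.

```python
import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("avatar-example")


@dataclass
class PlaybackFinishedEvent:
    """Hypothetical stand-in mirroring the two fields logged in the snippet."""

    playback_position: float
    interrupted: bool


class AudioOutput:
    """Hypothetical minimal event emitter; not the real audio output class."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable]] = {}

    def on(self, event: str) -> Callable:
        # Used as a decorator: registers the function and returns it unchanged.
        def register(handler: Callable) -> Callable:
            self._handlers.setdefault(event, []).append(handler)
            return handler

        return register

    def emit(self, event: str, payload: object) -> None:
        # Call every handler registered for this event, in registration order.
        for handler in self._handlers.get(event, []):
            handler(payload)


audio = AudioOutput()
seen: list[PlaybackFinishedEvent] = []


@audio.on("playback_finished")
def on_playback_finished(ev: PlaybackFinishedEvent) -> None:
    logger.info(
        "playback_finished",
        extra={"playback_position": ev.playback_position, "interrupted": ev.interrupted},
    )
    seen.append(ev)  # recorded only so this sketch can be checked


audio.emit("playback_finished", PlaybackFinishedEvent(playback_position=1.5, interrupted=False))
```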

Contributor

@fa9r fa9r Mar 18, 2025

@longcw Any reason why this is included in all avatar examples then? Is there some common use case related to avatars that you would use this for?

If not, I'd suggest keeping the examples minimal and omitting these.

Suggested change
@local_agent_session.output.audio.on("playback_finished")
def on_playback_finished(ev: PlaybackFinishedEvent) -> None:
    logger.info(
        "playback_finished",
        extra={"playback_position": ev.playback_position, "interrupted": ev.interrupted},
    )

Comment on lines +6 to +11
## How it Works

1. The LiveKit agent and the Beyond Presence avatar worker both join into the same LiveKit room as the user.
2. The LiveKit agent listens to the user and generates a conversational response, as usual.
3. However, instead of sending audio directly into the room, the agent sends the audio via WebRTC data channel to the Beyond Presence avatar worker.
4. The avatar worker only listens to the audio from the data channel, generates the corresponding avatar video, synchronizes audio and video, and publishes both back into the room for the user to experience.
Contributor Author

Do you see any problem with this explanation? Please let me know if you think something is wrong or unclear! 🙏
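
The flow in steps 3 and 4 of the README excerpt can be sketched with the standard library alone. This is an illustration under stated assumptions, not the plugin's implementation: an `asyncio.Queue` stands in for the data channel, a list stands in for the tracks published to the room, and the "video frame" is a placeholder string.

```python
import asyncio


async def agent(channel: asyncio.Queue, replies: list[bytes]) -> None:
    """Step 3: send synthesized audio to the avatar worker instead of the room."""
    for chunk in replies:
        await channel.put(chunk)
    await channel.put(None)  # end-of-stream marker (an assumption of this sketch)


async def avatar_worker(channel: asyncio.Queue, room: list[tuple[bytes, str]]) -> None:
    """Step 4: read audio from the channel, pair it with a frame, publish both."""
    while True:
        chunk = await channel.get()
        if chunk is None:
            break
        video_frame = f"frame-for-{len(chunk)}-bytes"  # placeholder for avatar generation
        room.append((chunk, video_frame))  # audio and video are published together


async def main() -> list[tuple[bytes, str]]:
    channel: asyncio.Queue = asyncio.Queue()  # stands in for the data channel
    room: list[tuple[bytes, str]] = []  # stands in for tracks published to the room
    await asyncio.gather(
        agent(channel, [b"\x00" * 320, b"\x00" * 160]),
        avatar_worker(channel, room),
    )
    return room


published = asyncio.run(main())
print([frame for _, frame in published])
# → ['frame-for-320-bytes', 'frame-for-160-bytes']
```

Note that the agent never writes to `room` itself; only the avatar worker publishes, which matches the setup described in the review thread.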

@niqodea
Contributor Author

niqodea commented Mar 19, 2025

@longcw Since you're leading the integration of avatar examples for LK agents, I had a few questions:

  • Do you plan to merge [draft] Avatar integration example for agent 1.0 #1614 before or soon after the release of LK Agents 1.0?
  • Would it make sense to merge this PR into yours before you finalize it? If so, would you be open to taking over the example or plugin files?
  • Looking ahead, how do you see ownership of these components? For instance, would it make sense for the LiveKit team to take over the plugin project and design it as they see fit using our API docs as a reference? Or would it be better for us to maintain ownership and handle the plugin abstractions and API interactions on our end?

Let me know your thoughts. Happy to collaborate to make this as smooth as possible!

@longcw
Contributor

longcw commented Mar 20, 2025

> @longcw Since you're leading the integration of avatar examples for LK agents, I had a few questions:
>
>   • Do you plan to merge [draft] Avatar integration example for agent 1.0 #1614 before or soon after the release of LK Agents 1.0?
>   • Would it make sense to merge this PR into yours before you finalize it? If so, would you be open to taking over the example or plugin files?
>   • Looking ahead, how do you see ownership of these components? For instance, would it make sense for the LiveKit team to take over the plugin project and design it as they see fit using our API docs as a reference? Or would it be better for us to maintain ownership and handle the plugin abstractions and API interactions on our end?
>
> Let me know your thoughts. Happy to collaborate to make this as smooth as possible!

  1. Probably not, so you can create the PR to the dev branch for your plugin.
  2. I think it would be good if you could help maintain the plugin if there is an API change on your server side. I can help review and clean it up. I'll also help update the plugin if there is any change on the agent side.

@niqodea
Contributor Author

niqodea commented Mar 21, 2025

@longcw Thank you, makes sense! I'll rebase this on top of #1364 then.

@niqodea niqodea changed the base branch from longc/avatar-example to dev-1.0 March 21, 2025 14:41
@niqodea niqodea changed the base branch from dev-1.0 to longc/avatar-example March 21, 2025 15:02
@niqodea
Contributor Author

niqodea commented Mar 21, 2025

I updated the PR to:

  1. Specify the correct setup of hiding the avatar agent and have it post audio and video on behalf of the local agent
  2. Remove avatar_agent_name as a configurable parameter in the plugin (since the avatar agent will be hidden anyway)
  3. Other minor refactoring

niqodea and others added 5 commits March 31, 2025 17:25
Co-authored-by: Felix Altenberger <felix@beyondpresence.ai>
Co-authored-by: Lucas Jacobson <lucas@beyondpresence.ai>
Co-authored-by: Nicola De Angeli <nicola@beyondpresence.ai>
@niqodea niqodea marked this pull request as ready for review April 14, 2025 10:09
@niqodea niqodea requested a review from longcw April 14, 2025 10:09
@niqodea
Contributor Author

niqodea commented Apr 14, 2025

Hi @longcw, any action items for me to help merge this into the main examples PR? 🙏

@longcw
Contributor

longcw commented Apr 14, 2025

> Hi @longcw, any action items for me to help merge this into the main examples PR? 🙏

Can you rebase this PR onto main and change the target branch to main?

@niqodea niqodea changed the base branch from longc/avatar-example to main April 15, 2025 06:09
@niqodea
Contributor Author

niqodea commented Apr 15, 2025

@longcw I merged the latest main since the previous history already had a lot of merges which made rebasing a bit difficult, hope that's also ok! The diff should now be meaningful again.

@longcw
Contributor

longcw commented Apr 15, 2025

Thanks @niqodea! I have tested your avatar API with the token and it works well. If you don't mind, I can take this one; I may create a new PR with some cleanup.

@niqodea
Contributor Author

niqodea commented Apr 15, 2025

Sure, go ahead! Thank you!

niqodea and others added 9 commits April 15, 2025 11:10
@github-actions
Contributor

⚠️ Changeset Required

We detected changes in the following package(s) but no changeset file was found. Please add one for proper versioning:

  • livekit-agents

@niqodea
Contributor Author

niqodea commented Apr 15, 2025

@longcw I just merged the latest #1700 which contains a small fix and specifies the correct versions for livekit-plugins-bey.

@longcw longcw mentioned this pull request Apr 17, 2025
@longcw longcw closed this Apr 17, 2025

6 participants