# Deploy a Voice AI bot with Pipecat AI and NIM (and Riva TTS & STT)
In this notebook, we walk through how to build and deploy a voice AI bot using Pipecat AI. We illustrate the basic Pipecat flow with the `meta/llama-3.1-70b-instruct` LLM served through NVIDIA NIM, and Riva for STT (Speech-To-Text) and TTS (Text-To-Speech). However, Pipecat is not opinionated, and other models and TTS/STT services can be swapped in. See the [Pipecat documentation](https://docs.pipecat.ai/server/services/supported-services#supported-services) for other supported services.

Pipecat AI is an open-source framework for building voice and multimodal conversational agents. It simplifies the complex voice-to-voice AI pipeline and lets developers build AI capabilities using open-source, commercial, and custom models.
The framework was developed by Daily, a company that has provided real-time video and audio communication infrastructure since 2016. It is fully vendor neutral and is not tightly coupled to Daily's infrastructure.

## Step 1 - Install dependencies
Here we use Daily for transport, OpenAI for context aggregation, Riva for STT & TTS, and Silero for VAD (Voice Activity Detection). To use a different service, for example Cartesia for TTS, run `pip install "pipecat-ai[cartesia]"`.

In [1]:
!pip install "pipecat-ai[daily,openai,riva,silero]"
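
Once the install finishes, a quick check can confirm that the packages are importable before any bot code runs. The sketch below is a minimal example, not part of Pipecat: `missing_modules` is a hypothetical helper, and the module names in `ASSUMED_MODULES` are assumptions about what the extras provide, so adjust them to your setup.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised for a missing parent package or a module without a spec.
            found = False
        if not found:
            missing.append(name)
    return missing


# Assumed top-level module names; change these to match the extras you installed.
ASSUMED_MODULES = ["pipecat", "daily", "openai"]
print("Missing modules:", missing_modules(ASSUMED_MODULES) or "none")
```

An empty result only shows that the modules can be found; it does not validate credentials or network access.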


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.2[0m[39;49m -> [0m[32;49m24.3.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


## Step 2 - Basic AI Voice bot Pipecat code

In [3]:
# URL of the Daily room used to talk to the NVIDIA NIM bot
DAILY_SAMPLE_ROOM_URL="https://pc-34b1bdc94a7741719b57b2efb82d658e.daily.co/prod-test"
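
The room URL above is hard-coded. If you want to point the bot at your own room without editing the notebook, one option is to read the URL from an environment variable and fall back to the sample room. This is only a sketch: the variable name `DAILY_ROOM_URL` and the helper `resolve_room_url` are hypothetical, and the `https://` check is an assumption about what a valid room URL looks like.

```python
import os

DEFAULT_ROOM_URL = "https://pc-34b1bdc94a7741719b57b2efb82d658e.daily.co/prod-test"


def resolve_room_url(env=os.environ, default=DEFAULT_ROOM_URL):
    """Return the room URL from DAILY_ROOM_URL (hypothetical name) or the default."""
    url = env.get("DAILY_ROOM_URL", default).strip()
    if not url.startswith("https://"):
        raise ValueError(f"Expected an https:// room URL, got: {url!r}")
    return url


DAILY_SAMPLE_ROOM_URL = resolve_room_url()
```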

In [1]:
import os
import sys

from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import LLMMessagesFrame, EndFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.nim import NimLLMService
from pipecat.services.riva import FastPitchTTSService, ParakeetSTTService
from pipecat.transports.services.daily import DailyParams, DailyTransport

async def main():
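    transport = DailyTransport(
        DAILY_SAMPLE_ROOM_URL,  # Daily room to join
        None,  # Meeting token (none is passed for this sample room)
        "NVIDIA NIM",  # Display name of the bot in the room
        DailyParams(
            audio_out_enabled=True,
            vad_enabled=True,
            vad_analyzer=SileroVADAnalyzer(),
            vad_audio_passthrough=True,
        ),
    )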

    stt = ParakeetSTTService(api_key=os.getenv("NVIDIA_API_KEY"))

    llm = NimLLMService(
        api_key=os.getenv("NVIDIA_API_KEY"), model="meta/llama-3.1-70b-instruct"
    )

    tts = FastPitchTTSService(api_key=os.getenv("NVIDIA_API_KEY"))

    messages = [
        {
            "role": "system",
            "content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way that makes a cat pun if it is possible.",
        },
    ]

    context = OpenAILLMContext(messages)
    context_aggregator = llm.create_context_aggregator(context)

    pipeline = Pipeline(
        [
            transport.input(),  # Transport user input
            stt,  # STT
            context_aggregator.user(),  # User responses
            llm,  # LLM
            tts,  # TTS
            transport.output(),  # Transport bot output
            context_aggregator.assistant(),  # Assistant spoken responses
        ]
    )

    task = PipelineTask(pipeline, PipelineParams(allow_interruptions=True))

    @transport.event_handler("on_first_participant_joined")
    async def on_first_participant_joined(transport, participant):
        # Kick off the conversation.
        messages.append({"role": "system", "content": "Please introduce yourself to the user and deliver a cat fact."})
        await task.queue_frames([LLMMessagesFrame(messages)])

    @transport.event_handler("on_participant_left")
    async def on_participant_left(transport, participant, reason):
        print(f"Participant left: {participant}")
        await task.queue_frame(EndFrame())

    runner = PipelineRunner()

    await runner.run(task)
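
The order of the list passed to `Pipeline` mirrors the path of one conversational turn: speech in, text, context update, LLM reply, speech out, context update. The toy sketch below is not Pipecat code and uses no Pipecat classes. It only imitates that ordering with plain functions so the role of the two context aggregator entries is easier to see. Every name in it (`fake_stt`, `fake_llm`, `fake_tts`, `run_turn`) is hypothetical.

```python
from typing import Dict, List

Message = Dict[str, str]


def fake_stt(audio: str) -> str:
    """Stand-in for speech-to-text: strips a pretend 'audio:' prefix."""
    prefix = "audio:"
    return audio[len(prefix):] if audio.startswith(prefix) else audio


def fake_llm(history: List[Message]) -> str:
    """Stand-in for the LLM: echoes the most recent user message."""
    last_user = [m for m in history if m["role"] == "user"][-1]
    return f"You said: {last_user['content']}"


def fake_tts(text: str) -> str:
    """Stand-in for text-to-speech: adds the pretend 'audio:' prefix."""
    return f"audio:{text}"


def run_turn(history: List[Message], user_audio: str) -> str:
    """Run one turn in the same order as the pipeline list above."""
    user_text = fake_stt(user_audio)                          # stt
    history.append({"role": "user", "content": user_text})    # user aggregator
    reply = fake_llm(history)                                 # llm
    bot_audio = fake_tts(reply)                               # tts
    history.append({"role": "assistant", "content": reply})   # assistant aggregator
    return bot_audio


history = [{"role": "system", "content": "You are a helpful LLM in a WebRTC call."}]
bot_audio = run_turn(history, "audio:hello")
print(bot_audio)                     # audio:You said: hello
print([m["role"] for m in history])  # ['system', 'user', 'assistant']
```

In the real pipeline the work is asynchronous and streamed, but the bookkeeping idea is the same: the user's words are added to the context before the LLM runs, and the assistant's reply is added afterwards.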

## Step 3 - Run the bot! Then talk to the bot [HERE](https://pc-34b1bdc94a7741719b57b2efb82d658e.daily.co/prod-test)
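
The bot code reads the API key with `os.getenv("NVIDIA_API_KEY")`, which returns `None` when the variable is unset, so a missing key may only surface later as a service error. A small pre-flight check can make that failure explicit. This is a sketch: `require_env` is a hypothetical helper, not part of Pipecat.

```python
import os


def require_env(name, env=os.environ):
    """Return the value of environment variable `name`, or raise if it is empty."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


try:
    require_env("NVIDIA_API_KEY")
    print("NVIDIA_API_KEY is set")
except RuntimeError as err:
    print(f"Cannot start the bot: {err}")
```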

In [4]:
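import asyncio

import nest_asyncio
from loguru import logger

# Allow asyncio.run() inside the notebook's already-running event loop.
nest_asyncio.apply()

# Remove existing handlers first so each message is logged only once,
# even if this cell is run several times.
logger.remove()
logger.add(sys.stderr, level="DEBUG")
asyncio.run(main())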

2024-12-12 20:43:34.436 | INFO     | pipecat.audio.vad.vad_analyzer:set_params:69 - Setting VAD params to: confidence=0.7 start_secs=0.2 stop_secs=0.8 min_volume=0.6
2024-12-12 20:43:34.438 | DEBUG    | pipecat.audio.vad.silero:__init__:114 - Loading Silero VAD model...
2024-12

Participant left: {'id': '8c9697b5-a58a-40b7-9286-1277fffc8593', 'info': {'isLocal': False, 'joinedAt': 1734057824, 'permissions': {'canAdmin': [], 'canSend': ['screenAudio', 'camera', 'customAudio', 'screenVideo', 'microphone', 'customVideo'], 'hasPresence': True}, 'userName': 'vanessa', 'isOwner': False}}
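
The `on_participant_left` handler prints the entire participant dictionary. For a shorter log line, the name alone may be enough. The sketch below assumes the layout visible in the output above (a top-level `id` and a nested `info.userName`); other keys may differ between versions. `participant_label` is a hypothetical helper.

```python
def participant_label(participant: dict) -> str:
    """Return the participant's user name, falling back to the id."""
    info = participant.get("info", {})
    return info.get("userName") or participant.get("id", "unknown")


# Trimmed copy of the dictionary printed above.
sample = {
    "id": "8c9697b5-a58a-40b7-9286-1277fffc8593",
    "info": {"isLocal": False, "userName": "vanessa", "isOwner": False},
}
print(f"Participant left: {participant_label(sample)}")  # Participant left: vanessa
```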


2024-12-12 20:44:14.036 | DEBUG    | pipecat.transports.base_output:_bot_stopped_speaking:218 - Bot stopped speaking
2024-12-12 20:44:14.041 | INFO     | pipecat.transports.services.daily:leave:435 - Leaving https://pc-34b1bdc94a7741719b57b2efb82d658e.daily.co/prod-test