# Ambient Patient 
This notebook helps you get started with the [Ambient Healthcare Agent for Patients](https://github.com/NVIDIA-AI-Blueprints/ambient-patient) repository, where we have an example implementation of a healthcare agent assisting with the patient intake process via voice interactions powered by NVIDIA ACE Controller, and safeguarded by NVIDIA NeMo Guardrails.

## Prerequisites
- [Docker Compose](https://docs.docker.com/compose/install/)
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
- [NVIDIA API Key](https://build.nvidia.com): This notebook uses NVIDIA NIM microservices hosted on build.nvidia.com for the majority of services. An NVIDIA API Key is required.
- [NGC API Key](https://docs.nvidia.com/ngc/latest/ngc-private-registry-user-guide.html#ngc-api-keys): An NGC API Key is required to utilize the various NGC assets needed in this Developer Example.

## NIMs Utilized
### What is a NIM?
NVIDIA NIM provides containers to self-host GPU-accelerated inferencing microservices for pretrained and customized AI models across clouds, data centers, and RTX™ AI PCs and workstations. NIM microservices expose industry-standard APIs for simple integration into AI applications, development frameworks, and workflows and optimize response latency and throughput for each combination of foundation model and GPU. Learn more about NIMs at https://developer.nvidia.com/nim.
### NIMs in Ambient Patient
- meta/llama-3.3-70b-instruct
- nvidia/llama-3.1-nemoguard-8b-content-safety (Optional for NeMo Guardrails)
- nvidia/llama-3.1-nemoguard-8b-topic-control (Optional for NeMo Guardrails)
- nvidia/magpie-tts-multilingual
- nvidia/parakeet-ctc-1.1b-asr

## Hardware Requirements
This notebook deploys all NIMs utilized in the Ambient Healthcare Agent for Patients locally on this instance.

Running this notebook requires:
### Disk Space

302 GB of disk space

### GPU Requirement

Use | Service(s)| Recommended GPU* 
--- | --- | --- 
[RIVA ASR NIM](https://build.nvidia.com/nvidia/parakeet-ctc-1_1b-asr/modelcard) | `nvidia/parakeet-ctc-1_1b-asr` |  1 x various options including L40, A100, and more (see [modelcard](https://build.nvidia.com/nvidia/parakeet-ctc-1_1b-asr/modelcard))
[RIVA TTS NIM](https://build.nvidia.com/nvidia/magpie-tts-multilingual/modelcard) | `nvidia/magpie-tts-multilingual` | 1 x various options including L40, A100, and more (see [modelcard](https://build.nvidia.com/nvidia/magpie-tts-multilingual/modelcard))
Instruct Model for Agentic Orchestration | `llama-3.3-70b-instruct` | 2 x H100 80GB <br /> or <br />4 x A100 80GB
[NemoGuard Content Safety Model](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-content-safety/modelcard) (Optional for Enabling NeMo Guardrails) | `nvidia/llama-3_1-nemoguard-8b-content-safety` | 1x options including A100, H100, L40S, A6000
[NemoGuard Topic Control Model](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-topic-control/modelcard) (Optional for Enabling NeMo Guardrails) | `nvidia/llama-3_1-nemoguard-8b-topic-control` | 1x options including A100, H100, L40S, A6000
**Total** | Entire Ambient Healthcare Agent for Patients  | 8 x A100 80GB <br /> or other combinations of the above

*For details on optimized configurations for LLMs, please see the documentation [Supported Models for NVIDIA NIM for LLMs](https://docs.nvidia.com/nim/large-language-models/latest/supported-models.html).

Alternatively, if you're interested in utilizing the public NVIDIA NIM microservices hosted on build.nvidia.com for all microservices, follow the [docker compose deploy using public endpoints](https://github.com/NVIDIA-AI-Blueprints/ambient-patient/blob/main/docs/docker-compose-deploy-using-public-endpoints.md) documentation outside of this notebook. 



# Checking Docker Storage Location

Before starting, check the Docker storage location on the Brev instance. Self-deploying the NIMs requires 302 GB of disk space for Docker artifacts, so Docker's storage must point to a location with enough free space.
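
The same check can be sketched in Python with the standard library. This is a minimal illustration, not part of the repository: the helper names `free_gb` and `has_enough_space` are hypothetical, and "GB" is assumed to mean 10**9 bytes.

```python
import shutil

REQUIRED_GB = 302  # disk space figure quoted in this notebook


def free_gb(path: str) -> float:
    """Return free space on the filesystem containing `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 10**9


def has_enough_space(path: str, required_gb: float = REQUIRED_GB) -> bool:
    """True when the filesystem containing `path` has at least `required_gb` free."""
    return free_gb(path) >= required_gb


# "." is used only so the sketch runs anywhere; pass the Docker data-root path instead.
print(f"{free_gb('.'):.0f} GB free, enough for the NIMs: {has_enough_space('.')}")
```

On the Brev instance, calling `has_enough_space("/ephemeral")` would report whether that partition can hold the Docker artifacts.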


In [None]:
# view the disk space of the Brev instance you are using,
# you should see a partition /ephemeral with enough space (more than 302 GB)
!df -h

In [None]:
# next view the content of the docker service file
!cat /etc/docker/daemon.json

If "data-root" is missing, or points to a partition without enough disk space, edit the /etc/docker/daemon.json file so that it contains `"data-root": "/ephemeral"`:
```
{
 ...,
 "data-root": "/ephemeral"
}
```

Open the terminal, and run 
```
sudo nano /etc/docker/daemon.json 
```

to open the file for editing.
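
If you prefer not to edit the JSON by hand, the change can be sketched as a small Python function that takes the current file contents and returns the updated text. The function name `with_data_root` is hypothetical, and the sketch deliberately does not write to `/etc/docker/daemon.json`, since that requires root privileges.

```python
import json


def with_data_root(config_text: str, data_root: str = "/ephemeral") -> str:
    """Return daemon.json text with "data-root" set, keeping all other keys."""
    # An empty or whitespace-only file is treated as an empty configuration.
    config = json.loads(config_text) if config_text.strip() else {}
    config["data-root"] = data_root
    return json.dumps(config, indent=4) + "\n"


# Example with a made-up existing configuration (not the real file on the instance).
example = '{"runtimes": {"nvidia": {"path": "nvidia-container-runtime"}}}'
print(with_data_root(example))
```

Parsing and re-serializing the file, rather than appending text, keeps the result valid JSON whatever keys were already present.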

In [None]:
# view the content of the docker service file to make sure it has the correct setting
!cat /etc/docker/daemon.json

In [None]:
# then restart the docker service
!sudo systemctl restart docker

### Update Submodule for `ambient-patient`
We should currently be in the repo `ambient-healthcare-agents`. Both `ambient-patient` and `ambient-provider` are submodules of this repo. Since we will be working through the content in `ambient-patient`, first initialize and update the submodules to fetch their content.

In [None]:
# this should show the current directory as `.../ambient-healthcare-agents`
%pwd

In [None]:
# update the submodules
!git submodule init && git submodule update --remote --recursive

### Log in to NGC
First, authenticate Docker with nvcr.io.

In [None]:
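import subprocess
import os
from getpass import getpass

# getpass keeps the key out of the notebook output
NGC_API_KEY = getpass("Enter your NGC API key: ")
# pass the key on stdin instead of interpolating it into a shell command
result = subprocess.run(
    ["docker", "login", "nvcr.io", "-u", "$oauthtoken", "--password-stdin"],
    input=NGC_API_KEY, capture_output=True, text=True,
)
print(result.stdout or result.stderr)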


You should see `Login Succeeded`.



## Getting Started 
### 1. Set Environment Variables for Agent Backend
Read through this section [1. Edit the vars.env file to set environment variables](https://github.com/NVIDIA-AI-Blueprints/ambient-patient/tree/main/agent#1-edit-the-varsenv-file-to-set-environment-variables) in the agent/README to configure each of the environment variables needed in [agent/vars.env](ambient-patient/agent/vars.env) for bringing up the agent backend.

- Since we are going to self-host the agent LLM NIM, change both AGENT_LLM_BASE_URL and AGENT_LLM_MODEL from their defaults:
    ```sh
    AGENT_LLM_BASE_URL="http://agent-instruct-llm:8000/v1"
    AGENT_LLM_MODEL="meta/llama-3.3-70b-instruct"
    ```
- To use the NemoGuard NIMs for NeMo Guardrails around your agent LLM, set the config path for NeMo Guardrails:
    ```sh
    NEMO_GUARDRAILS_CONFIG_PATH=nmgr-config-store/patient-intake-nemoguard-self-hosted-nim
    ```
    Note the differences in base_url and model_name in the config.yml files for directories `patient-intake-nemoguard-self-hosted-nim` and `patient-intake-nemoguard`.

Now, open [agent/vars.env](ambient-patient/agent/vars.env) and edit your variables, then save the file.

In [None]:
# navigate to ambient-patient
%cd ambient-patient
%pwd

In [None]:
# Make sure you have opened vars.env and finished editing the variables.
# Now, let's check the vars.env file.
!cat agent/vars.env

Check the output to make sure that your [agent/vars.env](ambient-patient/agent/vars.env) file sets
```
AGENT_LLM_BASE_URL="http://agent-instruct-llm:8000/v1"
AGENT_LLM_MODEL="meta/llama-3.3-70b-instruct"
```
and 
```
NEMO_GUARDRAILS_CONFIG_PATH=nmgr-config-store/patient-intake-nemoguard-self-hosted-nim
```
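
This comparison can also be done programmatically. The sketch below parses simple `KEY=VALUE` lines and reports which of the three expected settings differ. The helper names `parse_env` and `mismatches` are hypothetical, and the parser is an assumption about the file's layout: it handles quoted values and `#` comment lines, but not inline comments or multi-line values.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comment lines."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding double or single quotes from the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


# The three values this notebook asks you to set for self-hosted NIMs.
EXPECTED = {
    "AGENT_LLM_BASE_URL": "http://agent-instruct-llm:8000/v1",
    "AGENT_LLM_MODEL": "meta/llama-3.3-70b-instruct",
    "NEMO_GUARDRAILS_CONFIG_PATH": "nmgr-config-store/patient-intake-nemoguard-self-hosted-nim",
}


def mismatches(env: dict, expected: dict = EXPECTED) -> dict:
    """Return each expected key whose value differs, mapped to the value found (or None)."""
    return {key: env.get(key) for key, value in expected.items() if env.get(key) != value}


# Made-up sample text; in the notebook you would read agent/vars.env instead.
sample = 'AGENT_LLM_MODEL="meta/llama-3.3-70b-instruct"\n'
print(mismatches(parse_env(sample)))
```

An empty dictionary from `mismatches` means all three settings match the values shown above.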

### 2. Deploy Agent LLM NIM Locally
We will first bring up the LLM powering the agent.

In [None]:
# set the GPU IDs for the meta/llama-3.3-70b-instruct NIM
# since we are on a system of A100s, we use 4 GPUs
# if you are on a different compute setup, set the GPU IDs accordingly
os.environ['AGENT_LLM_GPU_ID'] = "0,1,2,3"

In [None]:
# now run the NIM container, which will take a few minutes to pull
try:
    result = subprocess.run(
        ["docker", "compose", "-f", "agent/docker-compose.yaml", "up", "-d", "agent-instruct-llm"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        check=True,
    )
    print(result.stdout[-1000:], flush=True)
except subprocess.CalledProcessError as e:
    print(e.stderr)

In [None]:
# check that the container is running
result = subprocess.run(
    ["docker", "ps", "--format", "table {{.Names}}\t{{.Image}}\t{{.Status}}"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

print(result.stdout)

You should see the name `agent-instruct-llm` in the result:

NAMES  |  IMAGE | STATUS
--- | --- | --- 
agent-instruct-llm  |nvcr.io/nim/meta/llama-3.3-70b-instruct:1.8.5   |Up About a minute (health: starting)

Note: after the image is pulled, it should take less than 20 minutes for the status of the container to change from starting to healthy. You can continue to the next steps 3 and 4 while the status is health: starting.
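
Instead of re-running `docker ps` by hand, the wait can be sketched as a polling loop around `docker inspect`. The helper names `docker_health` and `wait_until_healthy` are hypothetical; the sketch assumes the container defines a health check (as the `health: starting` status above suggests), and the default timeout of 1200 seconds mirrors the 20-minute estimate in the note.

```python
import subprocess
import time
from typing import Callable


def docker_health(name: str) -> str:
    """Ask Docker for a container's health status, e.g. "starting" or "healthy"."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Health.Status}}", name],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def wait_until_healthy(
    name: str,
    timeout_s: float = 1200,
    poll_s: float = 15,
    get_status: Callable[[str], str] = docker_health,
) -> bool:
    """Poll until the container reports "healthy"; return False if the timeout passes first."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status(name) == "healthy":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

For example, `wait_until_healthy("agent-instruct-llm")` blocks until the LLM NIM is ready or 20 minutes have elapsed. The `get_status` parameter exists only so the loop can be exercised without Docker.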

### 3. Deploy NemoGuard NIMs Locally
   
Since we would like to utilize the NemoGuard NIMs for guardrailing, and you have set `NEMO_GUARDRAILS_CONFIG_PATH=nmgr-config-store/patient-intake-nemoguard-self-hosted-nim` in your vars.env file, first set the GPU IDs for the NemoGuard NIMs.

In [None]:
# set the GPU IDs for the NemoGuard NIMs utilized by NemoGuardrails
# since we are on a system of A100s, specify the 5th and 6th GPU
# if you are on a different compute setup, set the GPU IDs accordingly
os.environ['NEMOGUARD_CONTENT_SAFETY_LLM_GPU_ID'] = "4"
os.environ['NEMOGUARD_TOPIC_CONTROL_LLM_GPU_ID'] = "5"

Next, bring up the two NemoGuard NIM deployments:

In [None]:
# this will spin up both NIM containers:
try:
    result = subprocess.run(
        ["docker", "compose", "-f", "agent/docker-compose.yaml", "up", "-d", 
         "nemoguard-content-safety-llm", "nemoguard-topic-control-llm"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        check=True
    )
    print(result.stdout[-1000:], flush=True)
except subprocess.CalledProcessError as e:
    print(e.stderr)

In [None]:
# check that the containers are running
result = subprocess.run(
    ["docker", "ps", "--format", "table {{.Names}}\t{{.Image}}\t{{.Status}}"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

print(result.stdout)

You should see the new names `nemoguard-content-safety-llm` and `nemoguard-topic-control-llm` in the result:

NAMES  | IMAGE                  |      STATUS
--- | --- | --- 
nemoguard-content-safety-llm |nvcr.io/nim/nvidia/llama-3.1-nemoguard-8b-content-safety:1.10.1 |Up 7 minutes (healthy)
nemoguard-topic-control-llm          |nvcr.io/nim/nvidia/llama-3.1-nemoguard-8b-topic-control:1.10.1  |Up 7 minutes (healthy)
...

Note: after the images are pulled, it should take about 5 minutes for the status of the containers to change from starting to healthy. You can continue to the next step 4 while the status is health: starting.

### 4. Deploy Agent Backend App Server

We will be bringing up the `app-server` service in [agent/docker-compose.yaml](https://github.com/NVIDIA-AI-Blueprints/ambient-patient/blob/main/agent/docker-compose.yaml).

Double check that your environment variables are set correctly according to [1. Set Environment Variables for Agent Backend](https://github.com/NVIDIA-AI-Blueprints/ambient-patient/tree/main/agent#1-edit-the-varsenv-file-to-set-environment-variables), then bring up the app-server service:

In [None]:
try:
    result = subprocess.run(
        ["docker", "compose", "-f", "agent/docker-compose.yaml", "up", "--build", "-d", "app-server"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        check=True,
    )
    print(result.stdout[-1000:], flush=True)
except subprocess.CalledProcessError as e:
    print(e.stderr)

The build should take a few minutes. After the image has been built once, you can drop `--build` when bringing the service up again, as long as the source files have not changed and no rebuild is needed:

```sh
docker compose -f agent/docker-compose.yaml up -d app-server
```

After the build finishes, check that the container is running:

In [None]:
result = subprocess.run(
    ["docker", "ps", "--format", "table {{.Names}}\t{{.Image}}\t{{.Status}}"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

print(result.stdout)

You should see the new name `app-server-healthcare-assistant` in the results:

NAMES    |  IMAGE             |        STATUS
--- | --- | --- 
app-server-healthcare-assistant |  app-server-healthcare-assistant:latest         |                  Up 14 seconds
...

Wait about 20 seconds for the server to start, then send a request to the FastAPI application we just created.

Note that after the meta/llama-3.3-70b-instruct LLM is up and running, the first requests may take longer than expected.


In [None]:
import time
import json

import requests
data = {
 "messages": [
    {
      "role": "user",
      "content": "Hi there I want to check in."
    }
  ],
  
  "max_tokens": 256
}

url = "http://0.0.0.0:8081/generate"

start_time = time.time()
with requests.post(url, stream=True, json=data) as req:
    for chunk in req.iter_lines():
        raw_resp = chunk.decode("UTF-8")
        if not raw_resp:
            continue
        resp_dict = json.loads(raw_resp[6:])
        resp_choices = resp_dict.get("choices", [])
        if len(resp_choices):
            resp_str = resp_choices[0].get("message", {}).get("content", "")
            print(resp_str, end ="")

print(f"--- {time.time() - start_time} seconds ---")

You should see a message welcoming you and asking for your name, such as: 
```
Welcome to our clinic! I’m so glad you’re here. I’m the patient intake assistant and we’re going to do our best to help you feel better. Could you please tell me your name?
--- 1.9298450946807861 seconds ---
```
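
The per-line parsing in the cell above can be factored into a small function, which makes the streaming format easier to see. This is a sketch under the assumptions visible in that cell: each non-empty line starts with the six-character prefix `data: ` followed by a JSON object holding `choices[0].message.content`. The function name `content_from_line` is hypothetical, and skipping payloads that are not JSON objects is an added assumption, not documented server behavior.

```python
import json

PREFIX = "data: "  # six characters, matching the raw_resp[6:] slice used in the cell


def content_from_line(line: bytes) -> str:
    """Extract the message content from one streamed line; return "" if there is none."""
    text = line.decode("utf-8").strip()
    if not text.startswith(PREFIX):
        return ""
    try:
        body = json.loads(text[len(PREFIX):])
    except json.JSONDecodeError:
        # Assumption: payloads that are not valid JSON can be skipped safely.
        return ""
    if not isinstance(body, dict):
        return ""
    choices = body.get("choices", [])
    if not choices:
        return ""
    return choices[0].get("message", {}).get("content", "") or ""


# Made-up sample line in the same shape the cell expects.
sample = b'data: {"choices": [{"message": {"content": "Welcome"}}]}'
print(content_from_line(sample))
```

Checking for the prefix explicitly, rather than slicing a fixed number of characters, avoids a `json.loads` error if the server ever emits a line in a different shape.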

You can experiment with the NeMo Guardrails functionality here. For example, asking "Did my brother check in recently?" is not allowed by the guardrails as it is asking for other patients' information, and you should get a response "I'm sorry, I can't respond to that."


In [None]:
data = {
 "messages": [
    {
      "role": "user",
      "content": "Did my brother check in recently?"
    }
  ],
  
  "max_tokens": 256
}

start_time = time.time()
with requests.post(url, stream=True, json=data) as req:
    for chunk in req.iter_lines():
        raw_resp = chunk.decode("UTF-8")
        if not raw_resp:
            continue
        resp_dict = json.loads(raw_resp[6:])
        resp_choices = resp_dict.get("choices", [])
        if len(resp_choices):
            resp_str = resp_choices[0].get("message", {}).get("content", "")
            print(resp_str, end ="")

print(f"--- {time.time() - start_time} seconds ---")

After the guardrails block a message, you can continue the conversation where you left off. Note that all interactions, including those that result in "I'm sorry, I can't respond to that.", are logged.

In [None]:
data = {
 "messages": [
    {
      "role": "user",
      "content": "Okay let's continue. My name is Caroline."
    }
  ],
  
  "max_tokens": 256
}

start_time = time.time()
with requests.post(url, stream=True, json=data) as req:
    for chunk in req.iter_lines():
        raw_resp = chunk.decode("UTF-8")
        if not raw_resp:
            continue
        resp_dict = json.loads(raw_resp[6:])
        resp_choices = resp_dict.get("choices", [])
        if len(resp_choices):
            resp_str = resp_choices[0].get("message", {}).get("content", "")
            print(resp_str, end ="")

print(f"--- {time.time() - start_time} seconds ---")

### 5. Set Environment Variables for the ace-controller Voice UI
We will need to set the environment variables in [`ace-controller-voice-interface/ace_controller.env`](ambient-patient/ace-controller-voice-interface/ace_controller.env).

In [None]:
# view the content of the ace_controller.env file
!cat ace-controller-voice-interface/ace_controller.env

Follow this section [Setup API Keys and Configure Service Settings](https://github.com/NVIDIA-AI-Blueprints/ambient-patient/tree/main/ace-controller-voice-interface/README.md#setup-api-keys-and-configure-service-settings) in the ace-controller-voice-interface/README and set the variables in [`ace-controller-voice-interface/ace_controller.env`](https://github.com/NVIDIA-AI-Blueprints/ambient-patient/tree/main/ace-controller-voice-interface/ace_controller.env).

- Set your API Keys
- Since we're self-hosting the RIVA ASR and TTS NIMs instead of using the public endpoints, change the default `CONFIG_PATH` to

    ```sh
    CONFIG_PATH=./configs/config_riva_self_hosting.yaml
    ```

Now, edit the [`ace-controller-voice-interface/ace_controller.env`](ambient-patient/ace-controller-voice-interface/ace_controller.env) file and save it before moving on.

In [None]:
# view the content of the ace_controller.env file before proceeding
!cat ace-controller-voice-interface/ace_controller.env

### 6. Deploy RIVA NIMs
First, set the GPU IDs for the RIVA NIMs.

In [None]:
# in the Brev launchable with 8 x A100s, we assign the RIVA ASR and TTS NIMs to the 7th and 8th GPUs
# both NIMs could also be deployed on the same A100
# if you are on a different compute setup, set the GPU IDs accordingly
os.environ['RIVA_ASR_NIM_GPU_ID'] = "6"
os.environ['RIVA_TTS_NIM_GPU_ID'] = "7"

Next deploy the RIVA NIMs. 

In the Brev launchable, the RIVA NIMs should have prebuilt model profiles. If you are self-deploying the RIVA NIMs outside the Brev launchable, see the [Known Issues](https://github.com/NVIDIA-AI-Blueprints/ambient-patient/blob/main/docs/known_issues.md) documentation for the RIVA TTS NIM known issue.


In [None]:
# bring up the RIVA ASR AND TTS NIMs
try:
    result = subprocess.run(
        ["docker", "compose", "--profile", "riva-nims-local", "-f", "ace-controller-voice-interface/docker-compose.yml", "up", "-d"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        check=True
    )
    print(result.stdout[-1000:], flush=True)
except subprocess.CalledProcessError as e:
    print(e.stderr)

In [None]:
# check that the containers are up and running
result = subprocess.run(
    ["docker", "ps", "--format", "table {{.Names}}\t{{.Image}}\t{{.Status}}"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

print(result.stdout)

You should see the additional names `voice-agents-webrtc-riva-tts-magpie-1` and `voice-agents-webrtc-riva-asr-parakeet-1` in the result:

NAMES     |        IMAGE       |         STATUS
--- | --- | --- 
voice-agents-webrtc-riva-asr-parakeet-1  | nvcr.io/nim/nvidia/parakeet-1-1b-ctc-en-us:1.3.0  | Up 2 minutes (healthy)
voice-agents-webrtc-riva-tts-magpie-1    | nvcr.io/nim/nvidia/magpie-tts-multilingual:1.3.0   |Up 2 minutes (health: starting)
...

Note: after the images are pulled, it should take about 8 minutes for the status of the containers to change from starting to healthy.


### 7. Stand up a TURN Server
When deploying on cloud providers such as Brev, a TURN server is needed: it relays WebRTC traffic when clients are behind NATs or firewalls that prevent direct peer-to-peer communication.
#### 7.1: Run the TURN server docker container


In [None]:
import requests
# find out the external ip address of the instance
HOST_IP_EXTERNAL = requests.get('https://ifconfig.me').text.strip()
print(HOST_IP_EXTERNAL)

In [None]:
# set the environment variable
os.environ['HOST_IP_EXTERNAL'] = HOST_IP_EXTERNAL

# run the turn server docker container
try:
    result = subprocess.run(
        ["docker", "run", "-d", "--name", "turn-server", "--network=host", "instrumentisto/coturn",
         "-n", "--verbose", "--log-file=stdout", "--external-ip=" + HOST_IP_EXTERNAL, "--listening-ip=0.0.0.0",
         "--lt-cred-mech", "--fingerprint", "--user=admin:admin", "--no-multicast-peers", "--realm=tokkio.realm.org",
         "--min-port=51000", "--max-port=51010"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        check=True,
    )
    print(result.stdout[-1000:], flush=True)
except subprocess.CalledProcessError as e:
    print(e.stderr)

In [None]:
# check that the container is running
result = subprocess.run(
    ["docker", "ps", "--format", "table {{.Names}}\t{{.Image}}\t{{.Status}}"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

print(result.stdout)

You should see the container named `turn-server` in the results:

NAMES    |   IMAGE     |       STATUS
--- | --- | --- 
turn-server    | instrumentisto/coturn   |         Up 25 seconds
...

#### 7.2: Modify ace-controller app configuration
Next, modify the `ace_controller.env` file and `config.ts` file under `ace-controller-voice-interface`. The `config.ts` file will be utilized in the docker build process for the ui-app container from the webrtc_ui example.
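
The `config.ts` replacement that this step performs with `sed` can also be sketched in Python. One reason to consider it: `sed` exits silently when the pattern is not found, whereas this version raises an error. The function name `patch_rtc_config` is hypothetical; the placeholder string and the replacement text are taken from the `sed` command used in this step.

```python
PLACEHOLDER = "export const RTC_CONFIG = {};"


def patch_rtc_config(
    source: str,
    host_ip: str,
    username: str = "admin",
    credential: str = "admin",
    port: int = 3478,
) -> str:
    """Replace the empty RTC_CONFIG with one that lists a single TURN server."""
    if PLACEHOLDER not in source:
        raise ValueError("RTC_CONFIG placeholder not found; the file may already be patched")
    replacement = (
        "export const RTC_CONFIG: ConstructorParameters<typeof RTCPeerConnection>[0] = {\n"
        "    iceServers: [\n"
        "      {\n"
        f'        urls: "turn:{host_ip}:{port}",\n'
        f'        username: "{username}",\n'
        f'        credential: "{credential}",\n'
        "      },\n"
        "    ],\n"
        "  };"
    )
    return source.replace(PLACEHOLDER, replacement)


# 203.0.113.7 is a documentation-only address used here as a stand-in for the instance IP.
print(patch_rtc_config("export const RTC_CONFIG = {};\n", "203.0.113.7"))
```

To apply it, read `ace-controller-voice-interface/config.ts`, pass its contents together with `HOST_IP_EXTERNAL`, and write the result back to the same file.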

In [None]:
# view the content of the existing ace-controller-voice-interface/ace_controller.env
!cat ace-controller-voice-interface/ace_controller.env

In [None]:
# add three relevant env vars to ace-controller-voice-interface/ace_controller.env
!echo -e "\n\nTURN_USERNAME=admin\nTURN_PASSWORD=admin\nTURN_SERVER_URL=turn:$HOST_IP_EXTERNAL:3478" >> ace-controller-voice-interface/ace_controller.env

In [None]:
# check the modified content of the ace-controller-voice-interface/ace_controller.env
!cat ace-controller-voice-interface/ace_controller.env

In [None]:
# next check the content of the existing config.ts file
!cat ace-controller-voice-interface/config.ts

In [None]:
# replace the ice server definition in config.ts
!sed -i "s/export const RTC_CONFIG = {};/export const RTC_CONFIG: ConstructorParameters<typeof RTCPeerConnection>[0] = {\n    iceServers: [\n      {\n        urls: \"turn:$HOST_IP_EXTERNAL:3478\",\n        username: \"admin\",\n        credential: \"admin\",\n      },\n    ],\n  };/" ace-controller-voice-interface/config.ts

In [None]:
# next check the modified content of the config.ts file
!cat ace-controller-voice-interface/config.ts

#### 7.3: Expose ports on your cloud provider instance

On the cloud provider instance, make sure the following ports are exposed:
- 4400
- 7860
- 3478
- 51000-51010 (the range specified by `--min-port` and `--max-port` in the TURN server docker run command)

If on Brev, make sure the ports have been exposed using the `TCP/UDP Ports` section in your web console's `Access` tab. In the end your section should look like this: https://github.com/NVIDIA-AI-Blueprints/ambient-patient/blob/main/docs/images/all_ports_exposed.png
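
Once the services are running, a rough reachability check can be sketched with the standard `socket` module. The function name `is_tcp_port_open` is hypothetical, and the sketch assumes ports 4400, 7860, and 3478 accept TCP connections. It does not probe the 51000-51010 relay range, and a successful TCP connection says nothing about UDP reachability.

```python
import socket


def is_tcp_port_open(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True when a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Ports from the list above that are assumed to accept TCP connections.
TCP_PORTS = [4400, 7860, 3478]


def check_ports(host: str, ports=TCP_PORTS) -> dict:
    """Map each port to whether a TCP connection to it succeeded."""
    return {port: is_tcp_port_open(host, port) for port in ports}
```

Run `check_ports("<brev-instance-ip>")` from your own machine rather than from the instance, since the goal is to confirm the ports are reachable from outside.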


### 8. Deploy the ace-controller Python App and WebRTC UI

In [None]:
try:
    result = subprocess.run(
        ["docker", "compose", "--profile", "ace-controller", "-f", "ace-controller-voice-interface/docker-compose.yml", "up", "--build", "-d"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        check=True,
    )
    print(result.stdout[-1000:], flush=True)
except subprocess.CalledProcessError as e:
    print(e.stderr)


The build should take a few minutes. Remove `--build` from the command if you haven't changed any source files and are bringing up the services again:
```sh
docker compose --profile ace-controller -f ace-controller-voice-interface/docker-compose.yml up -d
```

In [None]:
# check that the containers are running
result = subprocess.run(
    ["docker", "ps", "--format", "table {{.Names}}\t{{.Image}}\t{{.Status}}"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

print(result.stdout)


You should see the new names `voice-agents-webrtc-ui-app-1` and `voice-agents-webrtc-python-app-1` in the docker ps results, for example:


NAMES                  |            IMAGE                       |             STATUS
--- | --- | ---
voice-agents-webrtc-python-app-1       |   voice-agents-webrtc-python-app                                |    Up 6 minutes (healthy)
voice-agents-webrtc-ui-app-1           |   voice-agents-webrtc-ui-app                                    |    Up 6 minutes
turn-server                            |   instrumentisto/coturn                                         |    Up 10 minutes
voice-agents-webrtc-riva-tts-magpie-1  |   nvcr.io/nim/nvidia/magpie-tts-multilingual:1.3.0               |   Up 25 minutes (healthy)
voice-agents-webrtc-riva-asr-parakeet-1  | nvcr.io/nim/nvidia/parakeet-1-1b-ctc-en-us:1.3.0               |   Up 25 minutes (healthy)
app-server-healthcare-assistant         |  app-server-healthcare-assistant:latest                           | Up 39 minutes
nemoguard-content-safety-llm            |  nvcr.io/nim/nvidia/llama-3.1-nemoguard-8b-content-safety:1.10.1  | Up 50 minutes (healthy)
nemoguard-topic-control-llm             |  nvcr.io/nim/nvidia/llama-3.1-nemoguard-8b-topic-control:1.10.1    |Up 50 minutes (healthy)
agent-instruct-llm                      |  nvcr.io/nim/meta/llama-3.3-70b-instruct:1.8.5                 |  Up 59 minutes (healthy)



### 9. Go to the Voice UI in your Web Browser

First, to enable microphone access in Chrome, go to `chrome://flags/` in your Chrome browser, enable "Insecure origins treated as secure", add `http://<brev-instance-ip>:4400` to the list, and restart Chrome.

You can find the Brev instance IP in the section of the web console where you exposed the ports.

Next, go to `http://<brev-instance-ip>:4400` in your browser to visit the voice UI. Upon loading, the page should look like the following: https://github.com/NVIDIA-AI-Blueprints/ambient-patient/blob/main/ace-controller-voice-interface/assets/ui_at_start.png

Please note that after the meta/llama-3.3-70b-instruct LLM is up and running, the first requests may take longer than expected.

#### Troubleshooting
##### Permission Issue
If you're getting an error `Cannot read properties of undefined (reading 'getUserMedia')`, that means you have not enabled microphone access in Chrome. Go to `chrome://flags/`, enable "Insecure origins treated as secure", add `http://<machine-ip>:4400` to the list, and restart Chrome.

![](ambient-patient/docs/images/webpage_permission_error.png)

##### Timeout Issue
If you're getting a timeout issue where the button shows `Connecting...` and then "WebRTC connection failed", double check all the steps in the document. It's likely due to incorrect configurations.

![](ambient-patient/docs/images/webrtc_connection_failed.png)

After setting the correct configurations, make sure to **close the browser tab**, and open a new browser tab to access the application. If that doesn't seem to work, clear your browser cache and open the link again.


### 10. Bring down the services
Bring down all services specified in the agent backend:

In [None]:
!docker compose -f agent/docker-compose.yaml down 

Bring down all services specified in the ace-controller web UI:

In [None]:
!docker compose -f ace-controller-voice-interface/docker-compose.yml down
