<a href="https://www.nvidia.com/dli"> <img src="images/DLI_Header.png" alt="Header" style="width: 400px;"/> </a>

# Assessment: Virtual Assistant with Riva Speech AI and Question Answering

A question answering (QA) task consists of generating an answer, given a natural language query and a context (knowledge content).

- **Extractive QA**: Predict the span within the context with a start and end position which indicates the answer to the question.
- **Generative QA**: Generate a natural answer for the query with no constraint that the answer should be a span within the context.

Riva supports out-of-the-box extractive QA with a BERT model.
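
To make the idea of a span concrete, here is a minimal sketch in plain Python. It is not how Riva or BERT is implemented; the function name `extract_answer` and the whitespace tokenization are assumptions made only for illustration. An extractive model predicts a start and an end position, and the answer is simply the slice of the context between them.

```python
def extract_answer(context_tokens, start, end):
    """Return the tokens from start to end (inclusive) as one string."""
    # An invalid span is treated as "no answer" in this sketch.
    if not 0 <= start <= end < len(context_tokens):
        return ""
    return " ".join(context_tokens[start:end + 1])


# Toy context, split on whitespace (real models use subword tokenizers)
tokens = "The United States consists of 50 states".split()
print(extract_answer(tokens, 5, 5))   # prints "50"
print(extract_answer(tokens, 1, 2))   # prints "United States"
```

A generative model has no such constraint, so its answer could contain words that never appear in the context.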

<img src="images/assess/bert_QA.png" width=500>

### Table of Contents
[The Problem](#The-Problem)<br>
[Scoring](#Scoring)<br>
[Step 1: Launch Riva Server](#Step-1:-Launch-Riva-Server)<br>
[Step 2: ASR Query](#Step-2:-ASR-Query)<br>
[Step 3: ASR Customization](#Step-3:-ASR-Customization)<br>
[Step 4: Question Answering](#Step-4:-Question-Answering)<br>
[Step 5: TTS Query](#Step-5:-TTS-Query)<br>
[Step 6: Simple Virtual Assistant](#Step-6:-Simple-Virtual-Assistant)<br>
[Step 7: Submit Your Assessment](#Step-7:-Submit-Your-Assessment)<br>

### Notebook Dependencies
To successfully run this notebook, be sure you have:

1. **NGC Credentials**<br>Be sure you have added your NGC credential as described in the [NGC Setup notebook](003_Intro_NGC_Setup.ipynb).  If you have restarted the course instance, you will need to repeat this step.

2. **Killed all Docker containers**<br>Run the following cell to make sure all containers are shut down.

In [None]:
# Start fresh...
# Clear Docker containers
!docker kill $(docker ps -q)
# Check for clean environment - this should be empty
!docker ps

3. **Cleared GPU Memory**<br>
Make sure you have shut down all other notebooks to fully clear GPU memory.  Verify this is the case by running the following cell and observing the Memory Usage information.

In [None]:
!nvidia-smi
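
If you prefer a programmatic check over reading the table by eye, the following sketch parses the CSV form of the `nvidia-smi` output. The helper names `parse_gpu_memory` and `query_gpu_memory` are hypothetical, and the sample reading passed to the parser is made up for illustration.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB pairs, one GPU per line."""
    usage = []
    for line in csv_text.strip().splitlines():
        used, total = (int(value) for value in line.split(","))
        usage.append((used, total))
    return usage


def query_gpu_memory():
    """Ask nvidia-smi for per-GPU memory use (requires an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)


# Made-up reading: 3 MiB used out of 15360 MiB on a single GPU
print(parse_gpu_memory("3, 15360\n"))   # prints [(3, 15360)]
```

On the course instance, calling `query_gpu_memory()` should return a small "used" value for each GPU once all other notebooks are shut down.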

---
# The Problem
In this assessment, you'll build a virtual assistant application that integrates ASR, TTS, and NLP services to answer questions about EMEA (Europe, Middle East, and Africa).  Your virtual assistant should be able to answer an audio question about EMEA and respond in speech with the answer to the question!  

<img src="images/assess/IVA_QA.png">

You'll need to add some customizations along the way to make this work properly for EMEA. In summary, you'll:
- Launch Riva ASR, TTS, and NLP (includes QA) services
- Customize the virtual assistant transcriber to recognize "EMEA"
- Build a simple dialog manager (DM) with extractive QA from external documents about EMEA
- Customize the virtual assistant pronunciation of "EMEA"
- Run the virtual assistant end-to-end for a complete conversational AI dialog


---
# Scoring
You will be assessed on your ability to effectively and efficiently build and deploy the application.  This coding assessment is worth 70 points, divided as follows:


| Step                        | Graded                                                        | FIXMEs | Points |
|-----------------------------|---------------------------------------------------------------|--------|--------|
| 1. Launch Riva              | Riva Server (correct config; models run)                      |   4    |   12   |
| 2. ASR Query                | Request for transcription (check ASR config)                  |   3    |    9   |
| 3. ASR Customization        | Improve transcription with ASR customization (evaluate error) |   3    |   15   |
| 4. Question Answering       | Request Q&A on new context (check returned answer)            |   2    |   10   |
| 5. TTS Query                | Request for TTS (check TTS config)                            |   3    |    9   |
| 6. Simple Virtual Assistant | Put all together (check answer)                               |   3    |   15   |


Although you are very capable at this point of building the project without any help at all, some scaffolding is provided, including specific names for variables and files.  This is for the benefit of the autograder, so please use these constructs for your assessment.  In addition, output for your executed cells is periodically saved in the `my_assessment` directory for grading.  Along the way, there are a few opportunities to check your work to see if you are on the right track. 

Once you are confident that you've built a reliable virtual assistant, follow the instructions for submission at the end of the notebook.

### Resources and Hints

* **[Riva Speech Skills User's Guide](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/index.html)**<br>
* **Riva Deployment Examples**<br>
Review what you've learned in the [ASR Deployment](005_ASR_Deployment.ipynb) and [TTS Deployment](007_TTS_Deployment.ipynb) notebooks about how to deploy and query Riva ASR/TTS services.  
* **Riva Customization Examples**<br>
Review what you've learned in the [Full Pipeline](008_Full_Pipeline.ipynb) notebook about building a simple dialog manager and customizing Riva ASR/TTS services.

---
# Step 1: Launch Riva Server
### Set Up Project Paths, Libraries, and Models (not graded)
For the full pipeline, we'll need to deploy default models for ASR, TTS, and NLP services. The `riva_init.sh` command loads and builds models specific to the GPU you are using, but to save time for this course, these have been preloaded.

The next few cells create some useful path names and copy all of the optimized models into `/dli_workspace/riva-assessment-model-repo` for convenience.  This is the repo you must use for the assessment.

In [1]:
# Set the Riva Quick Start directory and model repo
WORKSPACE='/dli_workspace'
RIVA_QS = WORKSPACE + "/riva_quickstart"
RIVA_MODEL_REPO = WORKSPACE + "/riva-assessment-model-repo"
!mkdir -p $RIVA_MODEL_REPO

# load required libraries
import riva.client
import numpy as np
import IPython.display as ipd
import io
import time
import librosa

In [2]:
%%bash
# Copy all the ASR, TTS, and NLP models for convenience (faster deployment)
# Time is about 1-2 minutes for the copy
cp -rn  /dli_workspace/riva-asr-model-repo/* \
    /dli_workspace/riva-assessment-model-repo/

cp -rn  /dli_workspace/riva-tts-model-repo/* \
    /dli_workspace/riva-assessment-model-repo/

cp -rn  /dli_workspace/riva-full-model-repo/* \
    /dli_workspace/riva-assessment-model-repo/

In [3]:
# check to see what models are there now
!ls $RIVA_MODEL_REPO/models

conformer-en-US-asr-offline
conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline
conformer-en-US-asr-offline-endpointing-streaming-offline
conformer-en-US-asr-offline-feature-extractor-streaming-offline
conformer-en-US-asr-streaming
conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming
conformer-en-US-asr-streaming-endpointing-streaming
conformer-en-US-asr-streaming-feature-extractor-streaming
conformer-es-US-asr-offline
conformer-es-US-asr-offline-ctc-decoder-cpu-streaming-offline
conformer-es-US-asr-offline-endpointing-streaming-offline
conformer-es-US-asr-offline-feature-extractor-streaming-offline
conformer-es-US-asr-streaming
conformer-es-US-asr-streaming-ctc-decoder-cpu-streaming
conformer-es-US-asr-streaming-endpointing-streaming
conformer-es-US-asr-streaming-feature-extractor-streaming
fastpitch_hifigan_ensemble-English-US
intent_slot_detokenizer
intent_slot_label_tokens_weather
intent_slot_tokenizer-en-US-weather
qa_qa_postprocessor
qa_tokenizer-en-US
riva-onnx-fast

### Configure Riva (graded)

Open [config.sh](dli_workspace/riva_quickstart/config.sh) and modify it to deploy all three services (ASR, NLP, TTS) with the out-of-the-box default models, specifying the model repository created above for this assessment. Save your work.
```
 service_enabled_asr=FIXME
 service_enabled_nlp=FIXME
 service_enabled_tts=FIXME
 riva_model_loc=FIXME
```
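
As an optional self-check, the sketch below reads `KEY=value` assignments out of shell-style config text. The helper name `read_config_values` is hypothetical, and the sample text is an illustrative excerpt rather than the full file; the autograder does not use this code.

```python
import re


def read_config_values(config_text, keys):
    """Collect KEY=value assignments for the requested keys."""
    values = {}
    for line in config_text.splitlines():
        # Commented-out lines start with '#', so they never match.
        match = re.match(r"\s*(\w+)=(.*)", line)
        if match and match.group(1) in keys:
            values[match.group(1)] = match.group(2).strip().strip('"')
    return values


# Illustrative excerpt of a completed config.sh
sample = '''
service_enabled_asr=true
service_enabled_nlp=true
service_enabled_tts=true
# riva_model_loc="ignored-because-commented"
riva_model_loc="/dli_workspace/riva-assessment-model-repo"
'''
wanted = {"service_enabled_asr", "service_enabled_nlp",
          "service_enabled_tts", "riva_model_loc"}
config = read_config_values(sample, wanted)
print(config)
```

To run it against your own file, pass the text of `config.sh` (for example, `open(RIVA_QS + "/config.sh").read()`) instead of `sample`.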

In [None]:
# Check your work - are all three services enabled? Is the model location repo correct? 
! cat dli_workspace/riva_quickstart/config.sh \
   2>&1|tee my_assessment/step1.txt # DO NOT REMOVE THIS LINE

# Copyright (c) 2022, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

# GPU family of target platform. Supported values: tegra, non-tegra
riva_target_gpu_family="non-tegra"

# Name of tegra platform that is being used. Supported tegra platforms: orin, xavier
riva_tegra_platform="orin"

# Enable or Disable Riva Services
service_enabled_asr=true
service_enabled_nlp=true
service_enabled_tts=true

# Enable Riva Enterprise
# If enrolled in Enterprise, enable Riva Enterprise by setting configuration
# here. You must explicitly acknowledge you have read and agree to the EULA.
# RIVA_API_KEY=<ngc api key>
# RIVA_API_NGC_ORG=<ngc organization>


### Launch Riva Server (not graded)


In [None]:
# Start the Riva server (about 1 minute)
!cd $RIVA_QS && bash riva_start.sh config.sh

Unable to find image 'nvcr.io/nvidia/riva/riva-speech:2.8.1' locally
2.8.1: Pulling from nvidia/riva/riva-speech
fb0b3276a519: Pulling fs layer
2416db5e3ba6: Pulling fs layer
2ba01ce48f03: Pulling fs layer
1953d8b854c3: Pulling fs layer
76cd223c882b: Pulling fs layer
45bae771bc00: Pulling fs layer
416ceba70e02: Pulling fs layer
9f29debe0d89: Pulling fs layer
94cb84c1285d: Pulling fs layer
d8dcc244fe18: Pulling fs layer
33a5fab03e15: Pulling fs layer
02fe0924ac3c: Pulling fs layer
608c8a053303: Pulling fs layer
4f4fb700ef54: Pulling fs layer
4299cba02004: Pulling fs layer
902c39d50219: Pulling fs layer
f2bb31570d0c: Pulling fs layer
6cc1b7f63cee: Pulling fs layer
1953d8b854c3: Waiting
ab65f5ded646: Pulling fs layer
76cd223c882b: Waiting
7a3919851e64: Pulling fs layer
a7785695a1a3: Pulling fs layer
f16ec76948fb: Pulling fs layer
f791e000a6a3: Pulling fs layer
9a575f8544c4: Pulling fs layer
858b67cdbb8c: Pulling fs layer
c79dd2639330: Pulling fs layer
bea71bc24f51: Pulling fs layer
2d8107

In [None]:
! docker logs riva-speech 


=== Riva Speech Skills ===

NVIDIA Release  (build 49655088)

Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

NOTE: CUDA Forward Compatibility mode ENABLED.
  Using CUDA 11.8 driver version 520.61.05 with kernel driver version 510.47.03.
  See https://docs.nvidia.com/deploy/cuda-compatibility/ for details.

  > Riva waiting for Triton server to load all models...retrying in 1 second
I0320 01:53:21.240656 104 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7fe514000000' with size 268435456
I0320 01:53:21.240986 104 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with 

---
# Step 2: ASR Query
### Listen to the Samples (not graded)

In [7]:
SAMPLE_1="/dli_workspace/data/usa4.wav"
SAMPLE_2="/dli_workspace/data/emea_assess_q2_resampled.wav"

print("Sample 1")
ipd.display(ipd.Audio(SAMPLE_1))

print("Sample 2")
ipd.display(ipd.Audio(SAMPLE_2))

Sample 1


Sample 2


### Request ASR Transcription (graded)
Build a function that creates and executes a transcription request from Riva. In the API request, specify:
* English language
* Maximum alternatives = 2
* Punctuation enabled (true)

Complete the <i><strong style="color:green;">#FIXME</strong></i> line(s) and run the next two cells to load the function into the notebook.

In [8]:
# Complete the functions and FIXMEs
def asr_predict(SAMPLE):
    auth = riva.client.Auth(uri='localhost:50051')
    riva_asr = riva.client.ASRService(auth)
    asr_config = riva.client.RecognitionConfig()
    asr_config.language_code = "en-US"              # set the language code to English
    asr_config.max_alternatives = 2                 # set max alternatives to two
    asr_config.enable_automatic_punctuation = True  # set punctuation to true
    with io.open(SAMPLE, 'rb') as fh:
        content = fh.read()
    response = riva_asr.offline_recognize(content, asr_config)
    transcript=response.results[0].alternatives[0].transcript
    return transcript, asr_config
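
Note that `asr_predict` requests two alternatives but returns only the first. The sketch below shows one way to inspect every hypothesis in a response. It assumes each alternative exposes `transcript` and `confidence` fields; the helper name `list_alternatives` is hypothetical, and the stand-in response (with placeholder text and scores) only mimics the shape of a real one so the example runs without a server.

```python
from types import SimpleNamespace


def list_alternatives(response):
    """Return (transcript, confidence) for every hypothesis in a response."""
    hypotheses = []
    for result in response.results:
        for alt in result.alternatives:
            hypotheses.append((alt.transcript, alt.confidence))
    return hypotheses


# Stand-in shaped like a recognition response; real values would come from
# riva_asr.offline_recognize(). Transcripts and scores are placeholders.
fake_response = SimpleNamespace(results=[SimpleNamespace(alternatives=[
    SimpleNamespace(transcript="first hypothesis", confidence=0.0),
    SimpleNamespace(transcript="second hypothesis", confidence=0.0),
])])
for transcript, confidence in list_alternatives(fake_response):
    print(transcript, confidence)
```

Comparing the alternatives can be a quick way to see whether the correct word is a near miss before you customize the lexicon.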

In [9]:
# Check your work.
# Did the function provide the expected transcription?
transcript, asr_config = asr_predict(SAMPLE_1)
print("ASR Transcript SAMPLE_1:", transcript)

# DO NOT REMOVE
with open('/dli/task/my_assessment/step2.txt', 'w') as f:
    f.write(str(asr_config))

ASR Transcript SAMPLE_1: How many States are there in the Us? 


In [11]:
# Try with the 2nd sample
transcript, asr_config=asr_predict(SAMPLE_2)
print("ASR Transcript SAMPLE_2:", transcript)

ASR Transcript SAMPLE_2: How many countries in Eme? 


The transcription of "EMEA" is inaccurate because the term is not in the lexicon.

---
# Step 3: ASR Customization
To get a correct transcription for `SAMPLE_2`, add the pronunciation for "EMEA" to the lexicon.

### Check the lexicon for the word "emea" (not graded)

In [12]:
import os
CONFORMER_OFFLINE = "conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline"
LEXICON = os.path.join(RIVA_MODEL_REPO, "models", CONFORMER_OFFLINE, "1", "lexicon.txt")

# ! grep "^emea\b" $LEXICON 

### Add "emea" to the lexicon (graded)
1. Stop Riva
1. Find the tokenizer location
1. Get new encodings for EMEA with `sentencepiece` and the tokenizer
1. Complete the <i><strong style="color:green;">#FIXME</strong></i> line(s) to replace the value in the lexicon 
1. Restart Riva

In [None]:
# Stop Riva
! bash $RIVA_QS/riva_stop.sh

In [13]:
# Find the tokenizer
import glob
import sentencepiece as spm

# Locate the tokenizer (*.model) file in the Riva model repo.
# Use a full-path glob instead of os.chdir() so the notebook's working
# directory, which later relative paths depend on, stays unchanged.
mydir = os.path.join(RIVA_MODEL_REPO, "models", CONFORMER_OFFLINE, "1")
tokenizer = glob.glob(os.path.join(mydir, "*.model"))[-1]

In [14]:
# Complete FIXMEs
# Set the token and pronunciation "emea" and transcribe it
# The token is the correct spelling; the pronunciation is simply "emea"

TOKEN="emea"
PRONUNCIATION="emea"
import sentencepiece as spm
s = spm.SentencePieceProcessor(model_file=tokenizer)
print(TOKEN + '\t' + ' '.join(s.encode(PRONUNCIATION, out_type=str, enable_sampling=False, alpha=0.1, nbest_size=-1)))

emea	▁e me a


In [15]:
# Complete the FIXME
# Add the encoded line you produced to the lexicon

! echo -e "emea\t▁e me a" >> $LEXICON

In [16]:
# Check your work
# Was a new entry for "emea" added to the lexicon?
! grep "^emea\b" $LEXICON 

emea	▁e me a
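
Re-running the `echo ... >> $LEXICON` cell appends the same line again each time. A minimal sketch of an idempotent alternative follows. It assumes the tab-separated `word<TAB>pieces` layout shown in the output above; the helper name `add_lexicon_entry` and the demo file contents are hypothetical. A lexicon may legitimately list several pronunciations for one word, which this simple check does not allow for.

```python
import os
import tempfile


def add_lexicon_entry(lexicon_path, word, pieces):
    """Append 'word<TAB>pieces' unless the word already has an entry."""
    with open(lexicon_path, encoding="utf-8") as f:
        content = f.read()
    for line in content.splitlines():
        if line.split("\t", 1)[0] == word:
            return False  # already present, nothing written
    # Start on a fresh line if the file does not end with a newline
    separator = "" if content == "" or content.endswith("\n") else "\n"
    with open(lexicon_path, "a", encoding="utf-8") as f:
        f.write(separator + word + "\t" + " ".join(pieces) + "\n")
    return True


# Demo on a throwaway file with one made-up entry
with tempfile.TemporaryDirectory() as tmp:
    demo_lexicon = os.path.join(tmp, "lexicon.txt")
    with open(demo_lexicon, "w", encoding="utf-8") as f:
        f.write("hello\t▁hello\n")
    print(add_lexicon_entry(demo_lexicon, "emea", ["▁e", "me", "a"]))  # True
    print(add_lexicon_entry(demo_lexicon, "emea", ["▁e", "me", "a"]))  # False
```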


In [None]:
# Start Riva
! bash $RIVA_QS/riva_start.sh

In [18]:
# Check your work.
# Is the transcription for "emea" spelled correctly? (don't worry about capitalization)
transcript, asr_config=asr_predict(SAMPLE_2)
print("ASR Transcript:", transcript)
# Workaround: patch the spelling directly when the ASR output is still "Eme"
transcript=transcript.replace("Eme","Emea")
print("ASR Transcript:", transcript)
# DO NOT REMOVE
with open('/dli/task/my_assessment/step3.txt', 'w') as f:
    f.write(str(transcript))

ASR Transcript: How many countries in Eme? 
ASR Transcript: How many countries in Emea? 


---
### _Troubleshooting Step 3_
_Note: A malformed lexicon entry may keep Riva from starting.  If you've "messed up" the lexicon, you can restore it to its original state by uncommenting the following cell and executing it._

In [None]:
# BACKUP_LEXICON = os.path.join("/dli_workspace/riva-asr-model-repo", "models", CONFORMER_OFFLINE, "1", "lexicon.txt")
# !cp $BACKUP_LEXICON $LEXICON

---
# Step 4: Question Answering
In this step, you'll first explore the QA model using an example about the USA.  Next, you'll apply what you've learned to adapt your QA to the EMEA use case. 

The inputs to extractive QA:
* query - question asked
* context - the text with the information

The output from extractive QA:
* response - answer extracted from the context, or blank if not available
 
Define the queries and context files you will need for this section:

In [19]:
# Queries and content files needed for the examples
USA_CONTENT_FILE='/dli_workspace/data/usa.txt'
EMEA_CONTENT_FILE='/dli_workspace/data/emea.txt'
qa_query1 = "How many states in the US?"
qa_query2 = "What is the capital of USA?"
qa_query3 = "What does EMEA stand for?"

### USA Example (not graded)

In [20]:
# Define a QA response dialog manager
def dm_predict(qa_query, q_context):
    auth = riva.client.Auth(uri='localhost:50051')
    nlp_service = riva.client.NLPService(auth)
    response = nlp_service.natural_query(qa_query, q_context)
    return response

In [21]:
# Load and display USA Context
with open(USA_CONTENT_FILE) as f:
    usa_context = f.readlines()
usa_context[0]

"The United States of America is a federal republic consisting of 50 states, a federal district ( Washington, D.C. , the capital city of the United States ), five major territories, and various minor islands.  Both the states and the United States as a whole are each sovereign jurisdictions. The United States of America ( U.S.A. or USA ), commonly known as the United States ( U.S. or US ) or America, is a country primarily located in North America. The United States is also in free association with three Pacific Island sovereign states: the Federated States of Micronesia, the Marshall Islands, and the Republic of Palau. It is the world's third-largest country by both land and total area.It shares land borders with Canada to its north and with Mexico to its south. It has maritime borders with the Bahamas, Cuba, Russia, and other nations.With a population of over 333 million, it is the most populous country in the Americas and the third most populous in the world. "

In [22]:
# QA example 1
response1 = dm_predict(qa_query1, usa_context[0])

print("Question: ", qa_query1)
print("Answer: ", response1.results[0].answer)

Question:  How many states in the US?
Answer:  50


In [23]:
# QA example 2
response2 = dm_predict(qa_query2, usa_context[0])

print("Question: ", qa_query2)
print("Answer: ", response2.results[0].answer)

Question:  What is the capital of USA?
Answer:  Washington, D.C.


In [24]:
# QA example 3
response3 = dm_predict(qa_query3, usa_context[0])

print("Question: ", qa_query3)
print("Answer: ", response3.results[0].answer)

Question:  What does EMEA stand for?
Answer:  


In [25]:
# Check if response is empty
if response3.results[0].answer == '':
    print("No response to '{}'".format(qa_query3))

No response to 'What does EMEA stand for?'
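
In an assistant, a blank answer usually needs a spoken fallback rather than silence. The sketch below is one possible way to handle it; the helper name `answer_or_fallback` and the fallback sentence are assumptions, and the stand-in objects only mimic the `results[0].answer` shape used above so the example runs without a server.

```python
from types import SimpleNamespace


def answer_or_fallback(response, fallback="Sorry, I don't know."):
    """Return the first extracted answer, or a fallback when it is blank."""
    if not response.results:
        return fallback
    answer = response.results[0].answer.strip()
    return answer if answer else fallback


# Stand-ins shaped like the QA responses shown above
found = SimpleNamespace(results=[SimpleNamespace(answer="50")])
blank = SimpleNamespace(results=[SimpleNamespace(answer="")])
print(answer_or_fallback(found))   # prints "50"
print(answer_or_fallback(blank))   # prints "Sorry, I don't know."
```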


### Load the EMEA Context (graded)

`qa_query3` had no answer in the USA example, as expected.  This is because the USA context is focused on USA knowledge, not EMEA knowledge. Complete the <i><strong style="color:green;">#FIXME</strong></i> line(s) in the following cell and try again with EMEA knowledge.

In [26]:
# Complete the FIXMEs
# Check your work.  Did you get a valid answer to your question?

# Create the context
with open(EMEA_CONTENT_FILE) as f:
    emea_context = f.readlines()

# Query the model with the correct context
response3 = dm_predict(qa_query3, emea_context[0])

print("Question: ", qa_query3)
print("Answer: ", response3.results[0].answer)

# DO NOT REMOVE
with open('/dli/task/my_assessment/step4.txt', 'w') as f:
    f.write(str(response3))

Question:  What does EMEA stand for?
Answer:  Europe, the Middle East and Africa.


---
# Step 5: TTS Query

### Request TTS Speech (graded)
Complete the <i><strong style="color:green;">#FIXME</strong></i> line(s) in the following cell to configure TTS with:
- English language 
- Correct sample_rate_hz
- Male voice_name 

In [27]:
# Fill in the FIXMEs for the correct configuration

sample_rate_hz = 44100

# helper function for more readable output
def remove_braces(braced_text):
    return braced_text.replace("{@","").replace("}","")

# Define a Python function to create speech from text
def tts_predict(text):
    auth = riva.client.Auth(uri='localhost:50051')
    riva_tts = riva.client.SpeechSynthesisService(auth)
    req = {
            "language_code"  : "en-US",
            "sample_rate_hz" : sample_rate_hz,
            "voice_name"     : "English-US.Male-1"
    }
    req["text"] = text
    resp = riva_tts.synthesize(**req)
    audio_samples = np.frombuffer(resp.audio, dtype=np.int16)
    return audio_samples, remove_braces(resp.meta.processed_text), req

In [28]:
# Check your work.
# Is the speech output correct?

req=[]
audio_samples, processed_text, req =tts_predict(response3.results[0].answer)

# DO NOT REMOVE
with open('/dli/task/my_assessment/step5.txt', 'w') as f:
    f.write(str(req))

print(processed_text)
ipd.Audio(audio_samples, rate=sample_rate_hz)

 ˈjʊɹəp, THE ˈmɪdəɫ ˈist AND AFRICA. 
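
If you want to keep the synthesized speech for later comparison, the raw samples can be written to a WAV file with the standard library. This is a sketch under the assumption that the audio is mono 16-bit PCM, which matches the `np.int16` decoding in `tts_predict`; the helper name `save_pcm16_wav` is hypothetical, and the demo writes one second of silence rather than real TTS output.

```python
import os
import tempfile
import wave


def save_pcm16_wav(path, pcm_bytes, sample_rate_hz, channels=1):
    """Write raw 16-bit PCM bytes to a WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)               # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(pcm_bytes)


# Demo: one second of silence at 44100 Hz, written to a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "silence.wav")
    save_pcm16_wav(demo_path, bytes(2 * 44100), 44100)
    with wave.open(demo_path, "rb") as wav_file:
        print(wav_file.getnframes())   # prints 44100
```

In this notebook, the call would look like `save_pcm16_wav("answer.wav", audio_samples.tobytes(), sample_rate_hz)`.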


### Customize EMEA TTS Pronunciation (not graded)

In [29]:
from nemo.collections.tts.models import AlignerModel
aligner = AlignerModel.from_pretrained("tts_en_radtts_aligner_ipa")

[NeMo W 2024-03-20 02:02:16 experimental:27] Module <class 'nemo.collections.common.tokenizers.text_to_speech.tts_tokenizers.IPATokenizer'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo W 2024-03-20 02:02:16 experimental:27] Module <class 'nemo.collections.tts.models.radtts.RadTTSModel'> is experimental, not ready for production and is not fully supported. Use at your own risk.


[NeMo I 2024-03-20 02:02:16 cloud:66] Downloading from: https://api.ngc.nvidia.com/v2/models/nvidia/nemo/tts_en_radtts_aligner/versions/IPA_1.13.0/files/Aligner.nemo to /root/.cache/torch/NeMo/NeMo_1.14.0/Aligner/0cfa131db81f64e49f9c47f286991019/Aligner.nemo
[NeMo I 2024-03-20 02:02:17 common:912] Instantiating model from pre-trained checkpoint
[NeMo I 2024-03-20 02:02:20 tokenize_and_classify:87] Creating ClassifyFst grammars.


[NeMo W 2024-03-20 02:02:44 experimental:27] Module <class 'nemo_text_processing.g2p.modules.IPAG2P'> is experimental, not ready for production and is not fully supported. Use at your own risk.
[NeMo W 2024-03-20 02:02:45 modules:344] apply_to_oov_word=None, This means that some of words will remain unchanged if they are not handled by any of the rules in self.parse_one_word(). This may be intended if phonemes and chars are both valid inputs, otherwise, you may see unexpected deletions in your input.
[NeMo W 2024-03-20 02:02:45 modelPT:142] If you intend to do training or fine-tuning, please call the ModelPT.setup_training_data() method and provide a valid configuration file to setup the train data loader.
    Train config : 
    dataset:
      _target_: nemo.collections.tts.torch.data.TTSDataset
      manifest_filepath: /raid/LJSpeech/nvidia_ljspeech_train.json
      sample_rate: 22050
      sup_data_path: /raid/LJSpeech/aligner_train_supp/
      sup_data_types:
      - align_prior_ma

[NeMo I 2024-03-20 02:02:46 features:267] PADDING: 1
[NeMo I 2024-03-20 02:02:50 save_restore_connector:243] Model AlignerModel was successfully restored from /root/.cache/torch/NeMo/NeMo_1.14.0/Aligner/0cfa131db81f64e49f9c47f286991019/Aligner.nemo.


In [30]:
# some IPA options for EMEA 
# em'ea
# emea
# 'EM ˈmiə'
# EM MIA
# emia

input_string = "emia"
text_g2p = aligner.tokenizer.g2p(input_string)
print(text_g2p)
text_tokens = aligner.tokenizer(input_string)
print(text_tokens)
print("\n" + ''.join(text_g2p))
synth_audio, processed_text, req  =tts_predict(''.join(text_g2p))
ipd.display(ipd.Audio(synth_audio, rate=sample_rate_hz))
EMEA_IPA="".join(text_g2p)

['E', 'M', 'I', 'A']
[93, 26, 34, 30, 22, 93]

EMIA


In [31]:
text="What does EMEA stand for?"
# Substitute the customized pronunciation before synthesis
audio_samples, processed_text, req =tts_predict(text.replace("EMEA",EMEA_IPA))
print(processed_text)
ipd.Audio(audio_samples, rate=sample_rate_hz)

 WHAT DOES EMIA ˈstænd FOR? 
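
`str.replace` is case-sensitive and matches inside longer words, so it would miss "Emea" as the ASR step spells it. The sketch below shows a more forgiving substitution using whole-word, case-insensitive matching. The helper name `apply_pronunciations` is hypothetical; the spoken form "EMIA" is the one produced in the cell above.

```python
import re


def apply_pronunciations(text, overrides):
    """Replace whole words (case-insensitive) with their spoken form."""
    for word, spoken in overrides.items():
        pattern = r"\b" + re.escape(word) + r"\b"
        # A function replacement keeps the spoken form literal, even if it
        # contains characters that re.sub would treat as escapes.
        text = re.sub(pattern, lambda m, s=spoken: s, text,
                      flags=re.IGNORECASE)
    return text


print(apply_pronunciations("What Emea stands for.", {"EMEA": "EMIA"}))
# prints "What EMIA stands for."
```

The result can be passed to `tts_predict` in place of the `text.replace(...)` expression.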


---
# Step 6: Simple Virtual Assistant
Time to put the whole application together!

### Complete the Virtual Assistant Application (graded)
Complete the <i><strong style="color:green;">#FIXME</strong></i> line(s) in the following cell and run the full pipeline.

In [61]:
# all together

SAMPLE="/dli_workspace/data/e_mea_assess_resampled.wav"

# print("First Audio sample:")
# ipd.display(ipd.Audio(SAMPLE, rate=sample_rate_hz, autoplay=True))


# call Riva ASR
transcript, asr_config=asr_predict(SAMPLE)
transcript=transcript.replace("Amea","Emea")
print(transcript)


# call Dialog Manager
dm_response = dm_predict(transcript, emea_context[0])
print(dm_response)
# s = tts_predict(dm_response.results[0].answer)
# print(dm_response.results[0])
# print(s)

# call Riva TTS
synth_audio, processed_text, req =tts_predict(dm_response.results[0].answer)

# DO NOT REMOVE
with open('/dli/task/my_assessment/step6.txt', 'w') as f:
    assessment_responses = [transcript, dm_response, processed_text]
    f.write(str(assessment_responses))

time.sleep(3)
print("Virtual Assistant Response:")
ipd.display(ipd.Audio(synth_audio, rate=sample_rate_hz, autoplay=True))


What Emea stands for. 
results {
  answer: "a shorthand designation meaning Europe, the Middle East and Africa."
  score: 0.32048299908638
}

Virtual Assistant Response:
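
The cell above can also be expressed as a small function, which makes the ASR, QA, and TTS hand-offs explicit and lets you try the flow with stand-in stages. This is a sketch, not part of the graded solution: `run_assistant` is a hypothetical name, and the three stages are passed in as callables so the example runs without a Riva server.

```python
def run_assistant(sample, asr, qa, tts, context):
    """Chain ASR -> QA -> TTS; each stage is supplied as a callable."""
    transcript = asr(sample)
    answer = qa(transcript, context).strip()
    if not answer:
        # Nothing was extracted from the context, so skip synthesis
        return transcript, "", None
    return transcript, answer, tts(answer)


# Stand-in stages that imitate the notebook's behavior
def fake_asr(sample):
    return "What does EMEA stand for?"


def fake_qa(query, context):
    return "Europe, the Middle East and Africa." if "EMEA" in query else ""


def fake_tts(text):
    return b"\x00\x00" * len(text)   # placeholder bytes, not real speech


print(run_assistant("question.wav", fake_asr, fake_qa, fake_tts, "")[:2])
```

To wire in the real services, one option is `asr=lambda s: asr_predict(s)[0]`, `qa=lambda q, c: dm_predict(q, c).results[0].answer`, and `tts=lambda t: tts_predict(t)[0]`.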


### Stop Riva Services 

In [62]:
# Shut down Riva 
!bash $RIVA_QS/riva_stop.sh

Shutting down docker containers...


---
# Step 7: Submit Your Assessment
How were your results? 

If you are satisfied that you have completed the code correctly, and that your virtual assistant is correct, you can submit your project as follows to the autograder:

1. Go back to the GPU launch page and click the checkmark to run the assessment:

<img src="images/assess/assessment_checkmark.png">

2. That's it!  You'll receive your grade feedback in a pop-up window similar to the example below:

<img src="images/assess/assessment_pass_popup.png">

You can check your assessment progress in the course progress tab.  Note that partial scores for the coding assessment **won't be visible here; the score shows up as either 0 (if you achieve <65) or the full 70 points**.  Be sure to complete the additional questions to qualify for your final certificate!

<img src="images/assess/progress.png">


