<img src="http://developer.download.nvidia.com/compute/machine-learning/frameworks/nvidia_logo.png" style="width: 90px; float: right;">

# How do I perform Language Translation using Riva NMT APIs with out-of-the-box models?

This tutorial walks you through the basics of using Riva Neural Machine Translation (NMT) Services, specifically covering how to use Riva NMT APIs with out-of-the-box models. We will also cover pipelining Riva Automated Speech Recognition (ASR) and Neural Machine Translation (NMT) APIs.

## NVIDIA Riva Overview

NVIDIA Riva is a GPU-accelerated SDK for building Speech AI applications that are customized for your use case and deliver real-time performance. <br/>
Riva offers a rich set of speech and natural language understanding services such as:

- Automated speech recognition (ASR)
- Text-to-Speech synthesis (TTS)
- Neural Machine Translation (NMT)
- A collection of natural language processing (NLP) services, such as named entity recognition (NER), punctuation, and intent classification.

In this tutorial, we will interact with the Neural Machine Translation (NMT) APIs. We will also cover pipelining Riva ASR and NMT APIs.

For more information about Riva, refer to the [Riva developer documentation](https://developer.nvidia.com/riva).

## Introduction to Language Translation with Riva NMT

Neural machine translation (NMT) is an approach to machine translation based on neural networks. Riva NMT translates text between language pairs: it takes text in one language (the source language) and produces the corresponding text in another language (the target language).  

Riva NMT EA offers multiple models for Machine Translation. These models fall into two model architectures:
1. **Multilingual models** support translating from one source language to multiple target languages or vice versa. For example, the `mnmt_en_deesfr_transformer24x6` model can be used to translate from English to German, Spanish, and French. Multilingual models have several language codes in their name. Use a multilingual model if you need to support multiple languages or want to optimize resource utilization: a single model translates along multiple language pairs, which avoids the overhead of loading one model per pair. By default, use 24x6 multilingual models. If you need to reduce resource consumption further and can accept some loss of translation quality, use a 12x2 multilingual model instead.  
Please note that multilingual models will be added to Riva NMT EA in a future release.
2. **Bilingual models** are used for translation from one source language to one target language. For example, the `en_de_24x6` model can be used to translate from English to German. Bilingual models have a single pair of language codes in their name. Use a bilingual model when you want the best possible performance for a specific language pair direction. Running bilingual models produces faster results compared to running multilingual models. 

To learn more about Riva NMT, refer to the Riva NMT EA documentation.  
For more information about the NMT model architecture and training, refer to the [NeMo NMT documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/machine_translation.html).

<a id='nmt_language_pairs_supported'></a>
### Language Pairs Supported:  
The table below lists the models for all the language pairs supported by the NVIDIA Riva Speech Skills NMT service.  
For each pair, the table gives the language codes, the model name in the Riva Quick Start Guide's `config.sh` file, and the corresponding model name to specify in the API call. Later sections of this tutorial use these names.  

| Language Pair | Model name in `config.sh` | Model name specified during API call |
| ---- | --- | --- |
| English (`en`) to Simplified Chinese (`zh`) | `rmir_en_zh_24x6` | `en_zh_24x6` |
| Simplified Chinese (`zh`) to English (`en`) | `rmir_zh_en_24x6` | `zh_en_24x6` |
| English (`en`) to Russian (`ru`) | `rmir_en_ru_24x6` | `en_ru_24x6` |
| Russian (`ru`) to English (`en`) | `rmir_ru_en_24x6` | `ru_en_24x6` |
| English (`en`) to German (`de`) | `rmir_en_de_24x6` | `en_de_24x6` |
| German (`de`) to English (`en`) | `rmir_de_en_24x6` | `de_en_24x6` |
| English (`en`) to Spanish (`es`) | `rmir_en_es_24x6` | `en_es_24x6` |
| Spanish (`es`) to English (`en`) | `rmir_es_en_24x6` | `es_en_24x6` |
| English (`en`) to French (`fr`) | `rmir_en_fr_24x6` | `en_fr_24x6` |
| French (`fr`) to English (`en`) | `rmir_fr_en_24x6` | `fr_en_24x6` |
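
The naming pattern in the table can be captured in two small helpers. This is only a sketch based on the rows above: the function names `api_model_name` and `bilingual_model_name` are hypothetical and not part of the Riva client, and the `rmir_` prefix and `24x6` suffix rules are inferred from the table rather than taken from a documented contract.

```python
def api_model_name(config_name: str) -> str:
    """Map a `config.sh` model name to the API model name (assumed rule: drop the `rmir_` prefix)."""
    prefix = "rmir_"
    if config_name.startswith(prefix):
        return config_name[len(prefix):]
    return config_name


def bilingual_model_name(source_language: str, target_language: str, size: str = "24x6") -> str:
    """Build the API model name for a bilingual pair, following the pattern seen in the table."""
    return f"{source_language}_{target_language}_{size}"


print(api_model_name("rmir_en_fr_24x6"))
print(bilingual_model_name("de", "en"))
```

Before relying on a generated name, check it against the table, since only the listed pairs are supported.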

#### Requirements and setup

1. Start the Riva Speech Skills server.  
To use the Riva NMT models, we first need to deploy them on the Riva Speech Skills server. Follow the instructions in the Riva Quick Start Guide to deploy the OOTB NMT models on the Riva Speech Skills server before running this tutorial.  
Note: For this tutorial, please deploy the `English (en) to French (fr)` and `German (de) to English (en)` models. The model names corresponding to these language pairs in the Riva Quick Start Guide's `config.sh` can be found in the [table above](#nmt_language_pairs_supported).
Because the later sections of this tutorial pipeline ASR and NMT, we also need to deploy the German (language code `de-DE`) ASR model; the instructions for this are in `config.sh` itself.  
<br>
<br>
2. Install the Riva Client library.   
Follow the steps in 'Running the Riva Client' in the Riva NMT EA Tutorials' [Overview section](https://ngc.nvidia.com/resources/riem1phmzvud:riva:riva_nmt_ea_tutorials) or [README.md](https://ngc.nvidia.com/resources/riem1phmzvud:riva:riva_nmt_ea_tutorials/files?version=2.2.0-ea) to install the Riva Client library.  
<br>
<br>
3. Install additional libraries needed to run this tutorial.  

In [None]:
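# -y answers the confirmation prompt, which cannot be answered interactively from a notebook cell.
!apt-get install -y python3-dev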

In [None]:
''' 
Install Pyaudio. portaudio19-dev is a prerequisite for Pyaudio.
'''
!apt-get update && apt-get install -y python3-pyaudio portaudio19-dev
!python -m pip install pyaudio
# If you run into errors running apt-get commands through Jupyter notebook, run this command directly on your local machine's terminal. You might need sudo access to run this command.
# For alternate options to install PyAudio, please refer to PyAudio documentation - https://people.csail.mit.edu/hubert/pyaudio/

'''
Install librosa.
'''
!apt-get update && apt-get install -y libsndfile1
# If you run into errors running apt-get commands through Jupyter notebook, run this command directly on your local machine's terminal. You might need sudo access to run this command.
!python -m pip install librosa

'''
Install nltk
'''
!python -m pip install nltk

## Language Translation with Riva NMT APIs

Now, let's generate language translations using Riva APIs with OOTB models.

#### Import the Riva client libraries

In [None]:
import riva.client

#### Create a Riva client and connect to the Riva Speech API server

The following URI assumes a local deployment of the Riva Speech API server is on the default port. In case the server deployment is on a different host or via a Helm chart on Kubernetes, use an appropriate URI.

In [None]:
# `Auth` class wraps a gRPC channel.
auth = riva.client.Auth(uri='localhost:50051')

# `NeuralMachineTranslationClient` is for sending requests to a server.
riva_nmt_client = riva.client.NeuralMachineTranslationClient(auth)
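
If the server is not on `localhost`, one option is to read the URI from the environment instead of hard-coding it. The sketch below assumes an environment variable named `RIVA_URI`; that name and the helper `resolve_riva_uri` are made up for this example and are not something the Riva client reads on its own.

```python
import os


def resolve_riva_uri(env=os.environ, default="localhost:50051"):
    """Return the server URI from the (hypothetical) RIVA_URI variable, or the local default."""
    uri = env.get("RIVA_URI", "").strip()
    return uri if uri else default


riva_uri = resolve_riva_uri()
print("Riva server URI:", riva_uri)
# The result would then be passed on as in the cell above:
# auth = riva.client.Auth(uri=riva_uri)
```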

#### Make a gRPC request to the Riva Speech API server 

##### Inference with Bilingual NMT model:

Now, let's make a gRPC request to the Riva Speech server's bilingual NMT model `rmir_en_fr_24x6` (specified as `en_fr_24x6` in the API call) to translate from the source language, English (`en`), to the target language, French (`fr`).

In [None]:
eng_text = (
    "Molecular Biology is the field of biology that studies the composition, structure "
    "and interactions of cellular molecules – such as nucleic acids and proteins – that "
    "carry out the biological processes essential for the cell's functions and maintenance."
)
model_name = 'en_fr_24x6'
source_language = 'en'
target_language = 'fr'

To learn more about `NeuralMachineTranslationClient`, refer to the corresponding [docstring](https://github.com/nvidia-riva/python-clients/blob/db77d1038dff5d2fcbb7bf9e96f8c1ef280710cc/riva/client/nmt.py#L13).  

Now we submit the request to the server.

In [None]:
response = riva_nmt_client.translate([eng_text], model_name, source_language, target_language)
# response.translations is a list of all translations. Each entry corresponds to the
# entry at the same position in the list of input texts passed to translate().

print("English Text: ", eng_text)
# Fetch the translated text from the 1st entry of response.translations
print("Translated French Text: ", response.translations[0].text)
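
Since each translation lines up with the input text at the same position, the two lists can be walked together. The sketch below uses a stand-in object with the same attribute shape as the response (`translations`, each with a `.text`), so it runs without a server; `pair_translations` is a hypothetical helper, not a client function.

```python
from types import SimpleNamespace


def pair_translations(texts, response):
    """Pair each input text with the translation at the same position."""
    translations = [t.text for t in response.translations]
    if len(translations) != len(texts):
        # Guard against a short or empty response before zipping.
        raise ValueError(f"expected {len(texts)} translations, got {len(translations)}")
    return list(zip(texts, translations))


# Stand-in for a server response; a real one would come from riva_nmt_client.translate(...).
fake_response = SimpleNamespace(
    translations=[SimpleNamespace(text="Bonjour"), SimpleNamespace(text="Merci")]
)
for source, target in pair_translations(["Hello", "Thank you"], fake_response):
    print(f"{source} -> {target}")
```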

<br>
Let us look at another example, this time using the Riva Speech server's bilingual NMT model `rmir_de_en_24x6` to translate from the source language, German (`de`), to the target language, English (`en`).

In [None]:
german_text = (
    "Molekularbiologie ist das Gebiet der Biologie, das die Zusammensetzung, Struktur "
    "und Wechselwirkungen von Zellmolekülen – wie Nukleinsäuren und Proteinen – "
    "untersucht, die die biologischen Prozesse ausführen, die für die Funktionen "
    "und den Erhalt der Zelle unerlässlich sind."
)
model_name = 'de_en_24x6'
source_language = 'de'
target_language = 'en'

response = riva_nmt_client.translate([german_text], model_name, source_language, target_language)
print("Response Object from the Riva Server: ", response, "\n")

print("German Text: ", german_text)
print("Translated English Text: ", response.translations[0].text)

##### Riva NMT APIs - Handling large input text:

The Riva NMT API has a maximum input limit of 512 tokens. If an input larger than 512 tokens is provided, the NMT API doesn't return the complete translation:

In [None]:
eng_text = """
The effects of climate change span the impacts on physical environment, ecosystems and human societies due to ongoing human-caused climate change. The future impact of climate change depends on how much nations reduce greenhouse gas emissions and adapt to climate change. Effects that scientists predicted in the past—loss of sea ice, accelerated sea level rise and longer, more intense heat waves—are now occurring. The changes in climate are not expected to be uniform across the Earth. In particular, land areas change more quickly than oceans, and northern high latitudes change more quickly than the tropics. There are three major ways in which global warming will make changes to regional climate: melting ice, changing the hydrological cycle (of evaporation and precipitation) and changing currents in the oceans.
Physical changes include extreme weather, glacier retreat, sea level rise, declines in Arctic sea ice, and changes in the timing of seasonal events (such as earlier spring flowering). Since 1970, the ocean has absorbed more than 90% of the excess heat in the climate system. Even if global surface temperature is stabilized, sea levels will continue to rise and the ocean will continue to absorb excess heat from the atmosphere for many centuries. The uptake of carbon dioxide from the atmosphere is leading to ocean acidification.
Climate change has degraded land by raising temperatures, drying soils and increasing wildfire risk. Recent warming has strongly affected natural biological systems. Species worldwide are migrating poleward to colder areas. On land, species move to higher elevations, whereas marine species find colder water at greater depths. Between 1% and 50% of species on land were assessed to be at substantially higher risk of extinction due to climate change. Coral reefs and shellfish are vulnerable to the combined threat of ocean warming and acidification.
Food security and access to fresh water are at risk due to rising temperatures. Climate change has profound impacts on human health, directly via heat stress and indirectly via the spread of infectious diseases.
The effects of climate change span the impacts on physical environment, ecosystems and human societies due to ongoing human-caused climate change. The future impact of climate change depends on how much nations reduce greenhouse gas emissions and adapt to climate change. Effects that scientists predicted in the past—loss of sea ice, accelerated sea level rise and longer, more intense heat waves—are now occurring. The changes in climate are not expected to be uniform across the Earth. In particular, land areas change more quickly than oceans, and northern high latitudes change more quickly than the tropics. There are three major ways in which global warming will make changes to regional climate: melting ice, changing the hydrological cycle (of evaporation and precipitation) and changing currents in the oceans.
Physical changes include extreme weather, glacier retreat, sea level rise, declines in Arctic sea ice, and changes in the timing of seasonal events (such as earlier spring flowering). Since 1970, the ocean has absorbed more than 90% of the excess heat in the climate system. Even if global surface temperature is stabilized, sea levels will continue to rise and the ocean will continue to absorb excess heat from the atmosphere for many centuries. The uptake of carbon dioxide from the atmosphere is leading to ocean acidification.
Climate change has degraded land by raising temperatures, drying soils and increasing wildfire risk. Recent warming has strongly affected natural biological systems. Species worldwide are migrating poleward to colder areas. On land, species move to higher elevations, whereas marine species find colder water at greater depths. Between 1% and 50% of species on land were assessed to be at substantially higher risk of extinction due to climate change. Coral reefs and shellfish are vulnerable to the combined threat of ocean warming and acidification.
Food security and access to fresh water are at risk due to rising temperatures. Climate change has profound impacts on human health, directly via heat stress and indirectly via the spread of infectious diseases.
The effects of climate change span the impacts on physical environment, ecosystems and human societies due to ongoing human-caused climate change. The future impact of climate change depends on how much nations reduce greenhouse gas emissions and adapt to climate change. Effects that scientists predicted in the past—loss of sea ice, accelerated sea level rise and longer, more intense heat waves—are now occurring. The changes in climate are not expected to be uniform across the Earth. In particular, land areas change more quickly than oceans, and northern high latitudes change more quickly than the tropics. There are three major ways in which global warming will make changes to regional climate: melting ice, changing the hydrological cycle (of evaporation and precipitation) and changing currents in the oceans.
Physical changes include extreme weather, glacier retreat, sea level rise, declines in Arctic sea ice, and changes in the timing of seasonal events (such as earlier spring flowering). Since 1970, the ocean has absorbed more than 90% of the excess heat in the climate system. Even if global surface temperature is stabilized, sea levels will continue to rise and the ocean will continue to absorb excess heat from the atmosphere for many centuries. The uptake of carbon dioxide from the atmosphere is leading to ocean acidification.
Climate change has degraded land by raising temperatures, drying soils and increasing wildfire risk. Recent warming has strongly affected natural biological systems. Species worldwide are migrating poleward to colder areas. On land, species move to higher elevations, whereas marine species find colder water at greater depths. Between 1% and 50% of species on land were assessed to be at substantially higher risk of extinction due to climate change. Coral reefs and shellfish are vulnerable to the combined threat of ocean warming and acidification.
Food security and access to fresh water are at risk due to rising temperatures. Climate change has profound impacts on human health, directly via heat stress and indirectly via the spread of infectious diseases.
The effects of climate change span the impacts on physical environment, ecosystems and human societies due to ongoing human-caused climate change. The future impact of climate change depends on how much nations reduce greenhouse gas emissions and adapt to climate change. Effects that scientists predicted in the past—loss of sea ice, accelerated sea level rise and longer, more intense heat waves—are now occurring. The changes in climate are not expected to be uniform across the Earth. In particular, land areas change more quickly than oceans, and northern high latitudes change more quickly than the tropics. There are three major ways in which global warming will make changes to regional climate: melting ice, changing the hydrological cycle (of evaporation and precipitation) and changing currents in the oceans.
Physical changes include extreme weather, glacier retreat, sea level rise, declines in Arctic sea ice, and changes in the timing of seasonal events (such as earlier spring flowering). Since 1970, the ocean has absorbed more than 90% of the excess heat in the climate system. Even if global surface temperature is stabilized, sea levels will continue to rise and the ocean will continue to absorb excess heat from the atmosphere for many centuries. The uptake of carbon dioxide from the atmosphere is leading to ocean acidification.
Climate change has degraded land by raising temperatures, drying soils and increasing wildfire risk. Recent warming has strongly affected natural biological systems. Species worldwide are migrating poleward to colder areas. On land, species move to higher elevations, whereas marine species find colder water at greater depths. Between 1% and 50% of species on land were assessed to be at substantially higher risk of extinction due to climate change. Coral reefs and shellfish are vulnerable to the combined threat of ocean warming and acidification.
Food security and access to fresh water are at risk due to rising temperatures. Climate change has profound impacts on human health, directly via heat stress and indirectly via the spread of infectious diseases.
"""
model_name = 'en_fr_24x6'
source_language = 'en'
target_language = 'fr'

response = riva_nmt_client.translate([eng_text], model_name, source_language, target_language)
print("English Text: ", eng_text)
print("Translated French Text: ", response.translations[0].text)

As can be seen above, the translated French text cuts off after 512 tokens of the input English text.  
<br>
The best way to handle such large input texts is to split the input text and send the pieces to the NMT API as a list of texts.  
Unfortunately, there is currently no precise way to find the number of tokens in the input text. On WMT test sets, averaged across multiple sentences and models, 1 token maps to about 1.2 characters (this is only an estimate, and the mapping can vary significantly with the model and sentence pair). Based on this estimate, we should keep each input text to less than 615 characters (512 * 1.2). We also need to respect sentence boundaries while splitting the text.  
Let us look at an example of how to handle large text translation.

In [None]:
import nltk
nltk.download('punkt')

from nltk import tokenize

def nmt_large_text_split(input_text, max_chars=615):
    """Split large input text into entries of at most max_chars characters."""

    def nmt_text_split_sentence_splitter(sentence_text, max_chars):
        """Split a sentence on word boundaries if it is longer than max_chars."""
        if len(sentence_text) <= max_chars:
            return [sentence_text]
        sentence_splits = []
        for word in sentence_text.split():
            # The extra 1 accounts for the space that joins the word to the current split.
            if sentence_splits and len(sentence_splits[-1]) + 1 + len(word) <= max_chars:
                sentence_splits[-1] += ' ' + word
            else:
                # A single word longer than max_chars is kept whole.
                sentence_splits.append(word)
        return sentence_splits

    # 1. Split the input text into sentences
    sentences = tokenize.sent_tokenize(input_text)

    # 2. Add input text to nmt_input_texts, ensuring no entry is longer than max_chars
    nmt_input_texts = []
    for sentence in sentences:
        # 2.1. Split the sentence further if it is longer than max_chars
        for piece in nmt_text_split_sentence_splitter(sentence, max_chars):
            # 2.2. Append the piece to the last entry if it fits, otherwise start a new entry
            if nmt_input_texts and len(nmt_input_texts[-1]) + 1 + len(piece) <= max_chars:
                nmt_input_texts[-1] += ' ' + piece
            else:
                nmt_input_texts.append(piece)
    return nmt_input_texts
    

eng_text = """
The effects of climate change span the impacts on physical environment, ecosystems and human societies due to ongoing human-caused climate change. The future impact of climate change depends on how much nations reduce greenhouse gas emissions and adapt to climate change. Effects that scientists predicted in the past—loss of sea ice, accelerated sea level rise and longer, more intense heat waves—are now occurring. The changes in climate are not expected to be uniform across the Earth. In particular, land areas change more quickly than oceans, and northern high latitudes change more quickly than the tropics. There are three major ways in which global warming will make changes to regional climate: melting ice, changing the hydrological cycle (of evaporation and precipitation) and changing currents in the oceans.
Physical changes include extreme weather, glacier retreat, sea level rise, declines in Arctic sea ice, and changes in the timing of seasonal events (such as earlier spring flowering). Since 1970, the ocean has absorbed more than 90% of the excess heat in the climate system. Even if global surface temperature is stabilized, sea levels will continue to rise and the ocean will continue to absorb excess heat from the atmosphere for many centuries. The uptake of carbon dioxide from the atmosphere is leading to ocean acidification.
Climate change has degraded land by raising temperatures, drying soils and increasing wildfire risk. Recent warming has strongly affected natural biological systems. Species worldwide are migrating poleward to colder areas. On land, species move to higher elevations, whereas marine species find colder water at greater depths. Between 1% and 50% of species on land were assessed to be at substantially higher risk of extinction due to climate change. Coral reefs and shellfish are vulnerable to the combined threat of ocean warming and acidification.
Food security and access to fresh water are at risk due to rising temperatures. Climate change has profound impacts on human health, directly via heat stress and indirectly via the spread of infectious diseases.
"""
model_name = 'en_fr_24x6'
source_language = 'en'
target_language = 'fr'

parts = nmt_large_text_split(eng_text)

response = riva_nmt_client.translate(parts, model_name, source_language, target_language)

print("English Text:\n", eng_text)
print("Translated French Text:\n")
for i, translation in enumerate(response.translations):
    print(translation.text)

<div class="alert alert-block alert-warning">
WARNING: Please take into account that you cannot pass more than 8 texts to the model. If you pass more than 8 inputs, then the response will be empty.
</div>
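
One way to stay under this limit when a text splits into more than 8 entries is to send the entries in batches and concatenate the results. The sketch below assumes the limit applies per request. `batched` and `translate_in_batches` are hypothetical helpers, and `translate_fn` stands for any callable that takes a list of texts and returns a list of translated strings; a stand-in is used here so the example runs without a server.

```python
def batched(items, batch_size=8):
    """Yield consecutive slices of items with at most batch_size entries each."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def translate_in_batches(translate_fn, texts, batch_size=8):
    """Translate texts batch by batch, preserving the input order."""
    results = []
    for batch in batched(texts, batch_size):
        results.extend(translate_fn(batch))
    return results


def fake_translate(batch):
    """Stand-in for a server call: upper-cases the text instead of translating it."""
    return [text.upper() for text in batch]


parts = [f"sentence {i}" for i in range(19)]
translated = translate_in_batches(fake_translate, parts)
print(len(translated), "entries translated in", len(list(batched(parts))), "batches")
```

With the Riva client, `translate_fn` could be a small wrapper that calls `riva_nmt_client.translate(batch, model_name, source_language, target_language)` and returns the `.text` of each entry in `response.translations`.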

## Pipelining Riva ASR and Riva NMT APIs

Automated Speech Recognition (ASR) takes an audio stream or audio buffer as input and returns one or more text transcripts, along with additional optional metadata. Speech recognition in Riva is a GPU-accelerated compute pipeline, with optimized performance and accuracy.  
Riva provides state-of-the-art OOTB (out-of-the-box) models and pipelines for multiple languages, like English, Spanish, German, Russian and Mandarin, that can be easily deployed with the Riva Quick Start Scripts. Riva also supports easy customization of the ASR pipeline, in various ways, to meet your specific needs.  
Refer to the [Riva ASR documentation](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/asr/asr-overview.html) for more information. For examples on how to use Riva ASR APIs, refer to the [Riva tutorials repository](https://github.com/nvidia-riva/tutorials).  

In this section, let us look at examples showing how to directly generate translations from audio, by pipelining Riva ASR and Riva NMT APIs.

### Pipelining Riva ASR Offline APIs and Riva NMT APIs

Riva ASR can be used in either streaming mode or offline mode. In streaming mode, a continuous stream of audio is captured and recognized, producing a stream of transcribed text. In offline mode, an audio clip of a set length is transcribed to text.  
<br>
Let us start with an example showing how to pipeline Riva ASR Offline APIs with Riva NMT APIs.

#### Import the Riva client libraries

Let's import some of the required libraries, including the Riva Client libraries.

In [None]:
import io
import IPython.display as ipd

# Riva ASR client import
import riva.client

#### Create a Riva client and connect to the Riva Speech API server

The following URI assumes a local deployment of the Riva Speech API server is on the default port. In case the server deployment is on a different host or via a Helm chart on Kubernetes, use an appropriate URI.

In [None]:
auth = riva.client.Auth(uri="localhost:50051")

asr_service = riva.client.ASRService(auth)
riva_nmt_client = riva.client.NeuralMachineTranslationClient(auth)

#### Pipeline Riva ASR and NMT

Now, let's pipeline Riva ASR and NMT APIs:  
First, we apply ASR to a German `.wav` file in offline mode.  
Next, we translate the German transcript obtained from the Riva ASR call into English with the NMT model `de_en_24x6` (source language `de`, target language `en`).

##### Make Riva ASR gRPC requests to the Riva Speech API server:

In [None]:
'''
Start by loading the audio.
'''
# This example uses a .wav file with LINEAR_PCM encoding.
# read in an audio file from local disk
my_wav_file = "./audio_samples/de-DE_sample.wav"
with open(my_wav_file, 'rb') as fh:
    data = fh.read()

offline_config = riva.client.RecognitionConfig(enable_automatic_punctuation=True, language_code='de-DE', max_alternatives=1)
riva.client.add_audio_file_specs_to_config(offline_config, my_wav_file)

response = asr_service.offline_recognize(data, offline_config)
asr_best_transcript = response.results[0].alternatives[0].transcript
print("German ASR Transcript:", asr_best_transcript)
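
The cell above reads only `response.results[0]`. If a response carries more than one entry in `results` (for example, for longer audio), the top alternatives can be joined before translation. The sketch below is an assumption-labeled illustration: `full_transcript` is a hypothetical helper, and the stand-in object only mimics the attribute shape used above (`results`, `alternatives`, `transcript`).

```python
from types import SimpleNamespace


def full_transcript(response):
    """Join the top alternative of every result into a single string."""
    pieces = []
    for result in response.results:
        if result.alternatives:
            text = result.alternatives[0].transcript.strip()
            if text:
                pieces.append(text)
    return " ".join(pieces)


# Stand-in for an offline ASR response with two segments and one empty result.
fake_asr_response = SimpleNamespace(results=[
    SimpleNamespace(alternatives=[SimpleNamespace(transcript="Guten Morgen. ")]),
    SimpleNamespace(alternatives=[]),
    SimpleNamespace(alternatives=[SimpleNamespace(transcript="Wie geht es Ihnen?")]),
])
print(full_transcript(fake_asr_response))
```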

##### Make Riva NMT gRPC requests to the Riva Speech API server:

In [None]:
'''
We use the German transcript from the ASR call as the input text for translation.
'''
german_text = asr_best_transcript

'''
Now we submit the request to the server.
'''
response = riva_nmt_client.translate([german_text], 'de_en_24x6', 'de', 'en')
print("Translated English Text: ", response.translations[0].text)

### Pipelining Riva ASR Streaming APIs and Riva NMT APIs

Now, let us look at an example showing how to pipeline Riva ASR Streaming APIs with Riva NMT APIs.  

Riva ASR's Streaming APIs generate transcriptions as they receive the streaming chunks of audio, leading to a stream of transcripts. Each transcript in the stream represents the intermediate transcription of the audio that has been received so far. Once a sentence boundary is detected in the audio, the corresponding transcript can be considered the complete transcript for that sentence.  
When pipelining Riva Streaming ASR with Riva NMT, we will generate translations for the complete sentence transcript, as well as the intermediate transcripts.

#### Import the Riva client libraries

Let's import some of the required libraries, including the Riva Client libraries.

In [None]:
import sys
import io
import wave
import librosa
import IPython.display as ipd

import riva.client
import riva.client.audio_io

#### Load audio file

Let's load the audio file that we will stream to the server and display a player for it in the notebook.

In [None]:
my_wav_file = "./audio_samples/de-DE_sample.wav"
audio, sr = librosa.core.load(my_wav_file, sr=None)
with io.open(my_wav_file, 'rb') as fh:
    content = fh.read()
ipd.Audio(my_wav_file)

#### Create Riva ASR and NMT client and connect to the Server

The following URI assumes a local deployment of the Riva Speech API server is on the default port. In case the server deployment is on a different host or via a Helm chart on Kubernetes, use an appropriate URI.

In [None]:
auth = riva.client.Auth(uri="localhost:50051")

asr_service = riva.client.ASRService(auth)
riva_nmt_client = riva.client.NeuralMachineTranslationClient(auth)

#### Create config for Riva Streaming ASR

We create a `RecognitionConfig` object and set it as `config` in a `StreamingRecognitionConfig` object, which gives us `streaming_config`.

In [None]:
offline_config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    max_alternatives=1,
    enable_automatic_punctuation=True,
    verbatim_transcripts=False,
    language_code='de-DE'
)
streaming_config = riva.client.StreamingRecognitionConfig(config=offline_config, interim_results=True)
# Uncomment the following line and edit `add_word_boosting_to_config()` parameters if you wish to use word boosting.
#riva.client.add_word_boosting_to_config(streaming_config, boosted_lm_words=['Begriff'], boosted_lm_score=30.)
riva.client.add_audio_file_specs_to_config(streaming_config, my_wav_file)

#### Define a listen print loop that gets the streaming audio responses and sends to translation

In [None]:
model = 'de_en_24x6'

def listen_print_loop(responses, src_language, target_language, model, asr_only=False):
    num_chars_printed = 0
    prev_utterances = []
    for response in responses:
        if not response.results:
            continue
        result = response.results[0]
        if not result.alternatives:
            continue
        transcript = result.alternatives[0].transcript
        original_transcript = transcript
        if not asr_only:
            transcript = riva_nmt_client.translate(
                [transcript], model, src_language, target_language
            ).translations[0].text
        overwrite_chars = ' ' * (num_chars_printed - len(transcript))
        if not result.is_final:
            sys.stdout.write(">> " + transcript + overwrite_chars + '\r')
            sys.stdout.flush()
            num_chars_printed = len(transcript) + 3
        else:
            print("## " + transcript + overwrite_chars + "\n")
            num_chars_printed = 0
            prev_utterances.append(original_transcript)
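
The loop above sends one NMT request for every ASR response, including interim ones. If only complete sentences need to be translated, the responses can be filtered on `is_final` first. The sketch below is a hypothetical variant, not part of the Riva client: `final_transcripts` is a made-up name, and the stand-in objects only mimic the attributes the loop above reads (`results`, `is_final`, `alternatives`, `transcript`).

```python
from types import SimpleNamespace


def final_transcripts(responses):
    """Yield the top transcript of each result that is marked final."""
    for response in responses:
        for result in response.results:
            if result.is_final and result.alternatives:
                yield result.alternatives[0].transcript


def _result(text, is_final):
    """Build a stand-in streaming result for the demonstration below."""
    return SimpleNamespace(is_final=is_final, alternatives=[SimpleNamespace(transcript=text)])


fake_stream = [
    SimpleNamespace(results=[_result("Guten", False)]),
    SimpleNamespace(results=[]),
    SimpleNamespace(results=[_result("Guten Morgen.", True)]),
]
finals = list(final_transcripts(fake_stream))
print(finals)
```

Each yielded transcript could then be passed to `riva_nmt_client.translate` as in the loop above, at the cost of not showing interim translations.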

#### Run Streaming ASR + NMT on the audio file

Finally, we run pipelined inference on Riva Streaming ASR and NMT on the same audio file.

In [None]:
riva.client.audio_io.list_output_devices()

In [None]:
output_device = None  # use default device
translation_model = 'de_en_24x6'
wav_parameters = riva.client.get_wav_file_parameters(my_wav_file)
sound_callback = riva.client.audio_io.SoundCallBack(
    output_device, wav_parameters['sampwidth'], wav_parameters['nchannels'], wav_parameters['framerate'])
audio_chunk_iterator = riva.client.AudioChunkFileIterator(
    my_wav_file, chunk_n_frames=4800, delay_callback=sound_callback)
response_generator = asr_service.streaming_response_generator(audio_chunk_iterator, streaming_config)
listen_print_loop(response_generator, src_language='de', target_language='en', model=translation_model)
sound_callback.close()

As seen above, we generate a stream of translations corresponding to the stream of transcripts generated by Riva ASR. Each translation is either intermediate or covers a complete sentence, depending on the sentence boundaries in the audio.  
Once the input audio stream ends, we have the final transcript of the entire audio and its translation.
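
The `chunk_n_frames=4800` value passed to `AudioChunkFileIterator` determines how much audio each streamed chunk carries. The sketch below shows the arithmetic; the helper name is hypothetical, and the 16000 Hz sample rate is an assumption for illustration, since the actual rate comes from the `framerate` entry of the WAV file parameters.

```python
def chunk_duration_seconds(chunk_n_frames, framerate):
    """Duration of one audio chunk, given its frame count and the sample rate in Hz."""
    if framerate <= 0:
        raise ValueError("framerate must be positive")
    return chunk_n_frames / framerate


# 16000 Hz is an assumed sample rate; substitute wav_parameters['framerate'] for a real file.
print(chunk_duration_seconds(4800, 16000), "seconds per chunk")
```

Smaller chunks yield more frequent interim results and therefore more translation requests in the pipelined loop.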