# Python Riva API

This tutorial demonstrates how to use the Python Riva API.

## Server

Before running the client part of Riva, set up a server. The simplest
way to do this is to follow the
[quick start guide](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/quick-start-guide.html#local-deployment-using-quick-start-scripts).


## Authentication

Before using Riva services, you need to establish a connection with a server.

In [None]:
import riva_api

uri = "localhost:50051"  # Default value

auth = riva_api.Auth(uri=uri)

## ASR

To instantiate a service, pass a `riva_api.Auth` instance to its constructor.

In [None]:
asr_service = riva_api.ASRService(auth)

For speech recognition, you need to create a recognition config (an instance of `riva_api.RecognitionConfig`).
A detailed description of the config fields is available in the Riva
[documentation](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/riva_asr.proto.html?highlight=max%20alternatives#riva-proto-riva-asr-proto).
If you intend to use streaming recognition, the offline config has to be wrapped into a `riva_api.StreamingRecognitionConfig`.


In [None]:
from copy import deepcopy

# Config for offline (whole-file) recognition.
offline_config = riva_api.RecognitionConfig(
    encoding=riva_api.AudioEncoding.LINEAR_PCM,
    max_alternatives=1,
    enable_automatic_punctuation=True,
    verbatim_transcripts=False,
)
# The streaming config wraps its own copy of the offline config.
streaming_config = riva_api.StreamingRecognitionConfig(config=deepcopy(offline_config), interim_results=True)

You also need to set the frame rate and the number of channels of the audio that is going to be processed. To process the file `examples/en-US_AntiBERTa_for_word_boosting_testing.wav`, use the following code:

In [None]:
my_wav_file = 'examples/en-US_AntiBERTa_for_word_boosting_testing.wav'
riva_api.add_audio_file_specs_to_config(offline_config, my_wav_file)
riva_api.add_audio_file_specs_to_config(streaming_config, my_wav_file)
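To see which values are involved, the frame rate and channel count can be read from a WAV header with the standard `wave` module. The following is a minimal sketch that does not use `riva_api` and is not its implementation; `read_wav_specs`, `write_silent_wav`, and the generated demo file are hypothetical names introduced only for illustration.

```python
import os
import tempfile
import wave


def read_wav_specs(path):
    """Return (frame_rate_hz, num_channels) read from a WAV file header."""
    with wave.open(path, 'rb') as wav:
        return wav.getframerate(), wav.getnchannels()


def write_silent_wav(path, frame_rate_hz=16000, num_channels=1, seconds=1):
    """Write 16-bit silence to `path` (hypothetical helper, used only for this demo)."""
    with wave.open(path, 'wb') as wav:
        wav.setnchannels(num_channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(frame_rate_hz)
        wav.writeframes(b'\x00\x00' * frame_rate_hz * num_channels * seconds)


# Generate a small demo file so the sketch runs without the tutorial audio.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.wav')
write_silent_wav(demo_path)
print(read_wav_specs(demo_path))  # (16000, 1)
```

Calling `read_wav_specs(my_wav_file)` on the tutorial file should report the values that the recognition configs need to match.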

If you intend to use word boosting, use the convenience function `riva_api.add_word_boosting_to_config()` to add boosting parameters to a config.

In [None]:
boosted_lm_words = ['AntiBERTa', 'ABlooper']
boosted_lm_score = 20.0
riva_api.add_word_boosting_to_config(offline_config, boosted_lm_words, boosted_lm_score)
riva_api.add_word_boosting_to_config(streaming_config, boosted_lm_words, boosted_lm_score)

In [None]:
print(offline_config)

In [None]:
print(streaming_config)

## Offline

To run offline speech recognition, read the audio data from a file and pass it to the service.

In [None]:
with open(my_wav_file, 'rb') as fh:
    data = fh.read()

response = asr_service.offline_recognize(data, offline_config)

In [None]:
print(response)

To extract the transcript of the top alternative and its confidence, you may use

In [None]:
print(response.results[0].alternatives[0].transcript)

In [None]:
print(response.results[0].alternatives[0].confidence)
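Indexing `results[0]` raises an `IndexError` when the service returns no results, for example for silent audio. The sketch below collects the top alternative of every result and skips empty ones. It relies only on the `results`, `alternatives`, `transcript`, and `confidence` fields used above; `best_transcripts` is a hypothetical helper, and `fake_response` is a stand-in object built for illustration, not a real Riva response.

```python
from types import SimpleNamespace


def best_transcripts(response):
    """Return (transcript, confidence) for the top alternative of each result."""
    pairs = []
    for result in response.results:
        if not result.alternatives:
            continue  # nothing was recognized for this result
        top = result.alternatives[0]
        pairs.append((top.transcript, top.confidence))
    return pairs


# Stand-in for a recognition response (assumption: same attribute names as above).
fake_response = SimpleNamespace(results=[
    SimpleNamespace(alternatives=[SimpleNamespace(transcript='hello world ', confidence=0.9)]),
    SimpleNamespace(alternatives=[]),
])
print(best_transcripts(fake_response))  # [('hello world ', 0.9)]
```

With a live server, the same helper could be called as `best_transcripts(response)`.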

## Streaming

To imitate audio streaming, use `riva_api.AudioChunkFileIterator`. You can imitate real-time audio by providing a delay callback to the iterator.

In [None]:
wav_parameters = riva_api.get_wav_file_parameters(my_wav_file)
# corresponds to 1 second of audio
chunk_size = wav_parameters['framerate']
with riva_api.AudioChunkFileIterator(
    my_wav_file, chunk_size, delay_callback=riva_api.sleep_audio_length,
) as audio_chunk_iterator:
    for i, chunk in enumerate(audio_chunk_iterator):
        print(i, len(chunk))
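The idea behind chunked reading with a delay can be sketched with the standard library alone. This is an illustration of the technique, not the `riva_api.AudioChunkFileIterator` implementation; `iter_wav_chunks`, its `realtime` flag, and the generated demo file are hypothetical.

```python
import os
import tempfile
import time
import wave


def iter_wav_chunks(path, frames_per_chunk, realtime=False):
    """Yield raw PCM chunks containing at most `frames_per_chunk` frames each."""
    with wave.open(path, 'rb') as wav:
        frame_rate = wav.getframerate()
        bytes_per_frame = wav.getsampwidth() * wav.getnchannels()
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            if realtime:
                # Sleep for as long as the audio in this chunk lasts.
                time.sleep(len(chunk) / bytes_per_frame / frame_rate)
            yield chunk


# Demo file: 2.5 seconds of 16-bit mono silence at 8000 Hz (20000 frames).
demo_wav = os.path.join(tempfile.mkdtemp(), 'chunks.wav')
with wave.open(demo_wav, 'wb') as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(8000)
    wav.writeframes(b'\x00\x00' * 20000)

# One chunk per second of audio; the last chunk is shorter.
chunk_sizes = [len(chunk) for chunk in iter_wav_chunks(demo_wav, 8000)]
print(chunk_sizes)  # [16000, 16000, 8000]
```

Passing `realtime=True` makes the loop take roughly as long as the audio itself, which is the effect a delay callback is meant to produce.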

The audio chunks are then passed to `ASRService.streaming_response_generator()`, which creates a response generator.

In [None]:
audio_chunk_iterator = riva_api.AudioChunkFileIterator(my_wav_file, 4800)
response_generator = asr_service.streaming_response_generator(audio_chunk_iterator, streaming_config)

You can find a description of the streaming response (`StreamingRecognizeResponse`) fields in the Riva [documentation](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/protos/riva_asr.proto.html?highlight=max%20alternatives#riva-proto-riva-asr-proto).

In [None]:
streaming_response = next(response_generator)
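With `interim_results=True`, a stream mixes partial and final results. The sketch below shows one way to keep only the final text. It assumes that each streaming result exposes an `is_final` flag next to `alternatives`, as described in the linked proto documentation; `final_transcript` is a hypothetical helper, and the responses are stand-in objects so that the example runs without a server and does not consume `response_generator`.

```python
from types import SimpleNamespace


def final_transcript(responses):
    """Join the top transcripts of all results marked as final."""
    parts = []
    for response in responses:
        for result in response.results:
            # Assumption: `is_final` marks results that will not change anymore.
            if result.is_final and result.alternatives:
                parts.append(result.alternatives[0].transcript.strip())
    return ' '.join(parts)


def fake_result(text, is_final):
    """Build a stand-in streaming response holding a single result."""
    alternative = SimpleNamespace(transcript=text)
    return SimpleNamespace(results=[SimpleNamespace(alternatives=[alternative], is_final=is_final)])


fake_responses = [
    fake_result('hel', False),
    fake_result('hello wor', False),
    fake_result('Hello world. ', True),
    fake_result('How are you? ', True),
]
print(final_transcript(fake_responses))  # Hello world. How are you?
```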

To show streaming results, it is convenient to use the function `riva_api.print_streaming()`.

In [None]:
riva_api.print_streaming(response_generator, additional_info='time')

If you set a delay callback in the audio chunk iterator and pass `show_intermediate=True` to `riva_api.print_streaming()`, you will be able to watch the transcript forming.

In [None]:
audio_chunk_iterator = riva_api.AudioChunkFileIterator(my_wav_file, 4800, riva_api.sleep_audio_length)
response_generator = asr_service.streaming_response_generator(audio_chunk_iterator, streaming_config)
riva_api.print_streaming(response_generator, show_intermediate=True)