### pipeline example: `TextToAudioPipeline` and `AutomaticSpeechRecognitionPipeline`

> `AutomaticSpeechRecognitionPipeline`: extracts the spoken text contained within some audio.
> https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.AutomaticSpeechRecognitionPipeline
>
> `TextToAudioPipeline`: generates an audio file from an input text and optional other conditional inputs.
> https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.TextToAudioPipeline

![](./static/audio/automatic-speech-recognition.png)

In [1]:
from transformers import pipeline

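# Model cards for the two checkpoints used in this section:
#   https://huggingface.co/suno/bark-small    (text to audio)
#   https://huggingface.co/openai/whisper-base (speech recognition)

# No task is passed, so pipeline() infers it from the model.
my_text_to_audio_pipe = pipeline(model="suno/bark-small")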



In [2]:
BANK = "BANK OF AMERICA"
PERSON = "JOHN SMITH"

bank_message = my_text_to_audio_pipe(f"hello. this is ... {BANK}. calling for ... {PERSON}. Give us all your personal info.")


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:10000 for open-end generation.


In [3]:
bank_message["audio"]

array([[ 0.0075569 ,  0.00753531,  0.00726118, ..., -0.00867635,
        -0.01060779, -0.01193837]], dtype=float32)

Use the `IPython` display helper to play the generated audio inside a Jupyter notebook.

In [4]:
from IPython.display import Audio

sampling_rate = my_text_to_audio_pipe.model.generation_config.sample_rate
Audio(bank_message["audio"], rate=sampling_rate)
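
Outside a notebook there is no `Audio` widget, so it can be handy to write the samples to a file instead. The sketch below is one way to do that with only the standard library. `save_mono_wav` is a hypothetical helper, not part of `transformers`, and it assumes the samples are floats in the range -1.0 to 1.0, as the array printed above suggests. The demo uses a synthetic tone so that it runs without downloading a model.

```python
import array
import math
import os
import tempfile
import wave


def save_mono_wav(samples, sample_rate, path):
    """Write float samples in [-1.0, 1.0] to a 16-bit mono WAV file.

    Hypothetical helper for illustration; returns the number of frames written.
    """
    pcm = array.array("h")  # signed 16-bit integers
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # guard against overshoot
        pcm.append(int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)               # mono
        wav_file.setsampwidth(2)               # 2 bytes = 16 bits per sample
        wav_file.setframerate(int(sample_rate))
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)


# Demo: one second of a 440 Hz tone (the rate here is an arbitrary choice).
demo_rate = 24_000
tone = [0.3 * math.sin(2 * math.pi * 440 * i / demo_rate) for i in range(demo_rate)]
demo_path = os.path.join(tempfile.gettempdir(), "demo_tone.wav")
frames_written = save_mono_wav(tone, demo_rate, demo_path)
print(frames_written, "frames written to", demo_path)
```

With the notebook variables above, the call would look like `save_mono_wav(bank_message["audio"][0], sampling_rate, "bank_message.wav")`, where `[0]` selects the single row of the `(1, N)` array.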

In [5]:
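from transformers import pipeline

# No task is passed here either; pipeline() infers speech recognition from the Whisper checkpoint.
my_audio_to_text_pipe = pipeline(model="openai/whisper-base")

# A file path can be passed directly; decoding the MP3 requires ffmpeg to be installed.
result = my_audio_to_text_pipe("./audio/christmascarol_00_dickens_64kb.mp3")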

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [6]:
from pprint import pprint
pprint(result)

{'text': ' A Christmas Carol by Charles Dickens. This is a Librevox recording. '
         'All Librevox recordings are in the public domain. A Christmas Carol. '
         'Preface. I have endeavoured in this ghostly little book to raise the '
         'ghost of an idea, which will not put my readers out of humour with '
         'themselves, with each other, with a season, or with'}
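
The transcript above stops mid-sentence at "or with". A likely explanation, offered here as an assumption rather than something the notebook confirms, is that Whisper works on 30-second windows, so only the start of the recording was transcribed. The speech recognition pipeline accepts a `chunk_length_s` argument that splits longer audio into chunks. The sketch below wraps that in a hypothetical helper, `transcribe_long_audio`. The import is done inside the function so that defining the helper does not load any model.

```python
def transcribe_long_audio(path, model_name="openai/whisper-base", chunk_length_s=30):
    """Transcribe an audio file in fixed-length chunks.

    Hypothetical helper for illustration; the model name and the 30-second
    chunk length are assumptions taken from the example in this section.
    """
    # Imported lazily so the function can be defined without loading transformers.
    from transformers import pipeline

    asr_pipe = pipeline(
        "automatic-speech-recognition",
        model=model_name,
        chunk_length_s=chunk_length_s,  # split long audio into chunks of this length
    )
    return asr_pipe(path)["text"]


# Example call (downloads the model and needs ffmpeg to decode the MP3):
# text = transcribe_long_audio("./audio/christmascarol_00_dickens_64kb.mp3")
```

Chunked transcription can produce small errors at chunk boundaries, so the result is worth comparing against the first 30 seconds of the unchunked output shown above.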
