# Getting started with Speech AI

*A guide to using Automatic Speech Recognition (ASR), Neural Machine Translation (NMT), and Text-to-Speech (TTS) APIs through an AI Notebook.*

![ASR](./images/asr_diarization_tutorial.png)

## Instructions

The steps are as follows:
- **Making API Requests to Speech AI Models**: Discover how to make API calls to the AI Endpoints models using Python, and start using `ASR`, `NMT`, and `TTS` APIs in this notebook.
- **ASR**: Send an audio file as input and receive a text transcription.
- **NMT**: Translate a given text from one language to another.
- **TTS**: Convert text to human-like speech.

By the end of this Getting Started notebook, you will be able to use `ASR`, `NMT`, and `TTS` APIs in your projects, opening up a wide range of possibilities for Speech AI applications.

Happy learning!

<br>

### 1 - Making API Requests to Speech AI Models

Before continuing, please read the `/docs/00-making-http-request.MD` documentation file to understand how to make HTTP requests using Python.

This will be essential for interacting with the different AI models!
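As a rough illustration of what such a request looks like, here is a minimal sketch that prepares a `POST` request with a JSON body using only the Python standard library. The URL, the token, the `{"text": ...}` payload, and the `Bearer` authentication scheme are all placeholders assumed for the example; the documentation file mentioned above remains the reference for the real values and may use a different HTTP library.

```python
import json
import urllib.request

# Hypothetical placeholders: replace them with the values given in the docs.
ENDPOINT_URL = "https://example.invalid/api/my_model"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"


def build_json_request(url, payload, token):
    """Prepare (without sending) a POST request that carries a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",  # format of the data we send
        "Accept": "application/json",        # format we would like to get back
        "Authorization": f"Bearer {token}",  # assumed authentication scheme
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_json_request(ENDPOINT_URL, {"text": "Hello"}, ACCESS_TOKEN)
print(request.get_method(), request.full_url)

# Sending is one more call, left commented out because the URL above is fake:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(response.status, response.read())
```

Keeping the "build" step separate from the "send" step makes it easy to inspect the headers and body before any network call is made.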

<br>

### ASR Pipeline

Let's start with the Automatic Speech Recognition model.

**You can find all the necessary information, such as the endpoint URL, format of expected input data, and code examples, in the `/docs/01-ASR.MD` documentation file.**

Your task is to:

- Determine the `ASR` model you want to work with among the ones available on [AI Endpoints](https://endpoints.ai.cloud.ovh.net/)
- Get its endpoint `URL`
- Set up the necessary request headers
- Provide the input data expected by the `ASR` model. You can use the audio examples provided in the `/audio_samples/` folder
- Send your request
- Print the response and analyze the audio transcription!
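The checklist above could take roughly the following shape. This is only a sketch under stated assumptions, not the notebook solution: the URL, the token, the `audio/wav` content type, and the idea of sending the raw audio bytes as the request body are all hypothetical, and `/docs/01-ASR.MD` describes what the real model expects. A silent WAV is generated in memory so that the example runs without any file from `/audio_samples/`.

```python
import io
import urllib.request
import wave

# Hypothetical placeholders: take the real values from AI Endpoints and the docs.
ASR_URL = "https://example.invalid/api/asr"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"


def make_silent_wav(seconds=1, sample_rate=16000):
    """Return the bytes of a mono 16-bit silent WAV, used as a stand-in sample."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * sample_rate * seconds)
    return buffer.getvalue()


def build_asr_request(url, audio_bytes, token):
    """Prepare (without sending) a POST request whose body is the audio itself."""
    headers = {
        "Accept": "application/json",        # we want a text transcription back
        "Content-Type": "audio/wav",         # assumed input format
        "Authorization": f"Bearer {token}",  # assumed authentication scheme
    }
    return urllib.request.Request(url, data=audio_bytes, headers=headers, method="POST")


# In the notebook, you would read a real sample instead, e.g. open(path, "rb").read().
audio_bytes = make_silent_wav()
asr_request = build_asr_request(ASR_URL, audio_bytes, ACCESS_TOKEN)
print(f"{asr_request.get_method()} {asr_request.full_url} ({len(asr_request.data)} bytes)")
```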

*You can write the code in the cell below.*

*Please note that the ASR solution is provided in the `/notebook_solutions/asr_basics.ipynb` notebook, but try to tackle the task on your own and ask questions before checking it!*

In [None]:
# Write your code here

# To execute a code cell, select the cell and then click the ▶️ button in the menu above the notebook. You can also select the cell and press `SHIFT + ENTER`.



### NMT Pipeline

Now that you have successfully completed the `ASR` task, let's move on to the Neural Machine Translation model. 

**You can find all the necessary information, such as the NMT endpoint URL, format of expected input data (we are not sending an audio file anymore), and code examples, in the `/docs/02-NMT.MD` documentation file.**

Remember that you'll need to: 

- Determine the `NMT` model you want to work with among the ones available on [AI Endpoints](https://endpoints.ai.cloud.ovh.net/)
- Get its endpoint `URL`
- Set up the necessary request headers
- Adjust the `JSON` request data to suit the newly chosen `NMT` model
- Send your request
- Print the response and analyze the translated text!
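Since the input is now text rather than audio, the main change is the JSON body. The sketch below shows one way to serialize it and to decode a JSON answer while keeping accented characters intact. The field names (`text`, `source_language`, `target_language`, `translation`) and the sample response are invented for illustration; the actual schema is the one given in `/docs/02-NMT.MD`.

```python
import json


def build_nmt_body(text, source_language, target_language):
    """Serialize a translation request as UTF-8 encoded JSON bytes."""
    payload = {
        # Hypothetical field names: check the docs for the real ones.
        "text": text,
        "source_language": source_language,
        "target_language": target_language,
    }
    # ensure_ascii=False keeps non-ASCII characters readable in the body.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def parse_json_response(raw_body):
    """Decode the raw bytes of a JSON response into Python objects."""
    return json.loads(raw_body.decode("utf-8"))


body = build_nmt_body("Held in Antwerp, Belgium.", "en", "fr")
print(body.decode("utf-8"))

# A made-up response, only to show that accented characters survive decoding.
fake_response = '{"translation": "Organisée à Anvers, en Belgique."}'.encode("utf-8")
print(parse_json_response(fake_response))
```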

*You can code that in the cell below.* 

*The solution is provided in the `/notebook_solutions/nmt_basics.ipynb` notebook. Feel free to try the task on your own and ask questions before checking the solution.*

In [None]:
# Here is a text example that you can try to translate from English to another language:
input_text = "Devoxx Belgium is an annual conference for software developers and IT professionals. Held in Antwerp, Belgium, it is one of the largest and most well-known conferences in Europe, attracting thousands of attendees from around the world."

# Write your code here

# To execute a code cell, select the cell and then click the ▶️ button in the menu above the notebook. You can also select the cell and press `SHIFT + ENTER`.



### TTS Pipeline

Great job with the NMT model! 

Let's explore the Text-to-Speech model now. The process is similar to what you have done for the `ASR` and `NMT` sections, but be aware that the `accept` header will be different this time, as we want to generate an audio file, not text.

**You can find all the necessary information, such as the endpoint URL, format of expected input data, and code examples, in the `/docs/03-TTS.MD` documentation file.**
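To make the difference concrete, here is a small sketch of the two parts that change for TTS: the `Accept` header and the handling of a binary response. The `application/octet-stream` value, the `Bearer` scheme, and the stand-in audio bytes are assumptions made for the example; the exact header value to use is the one listed in the TTS documentation file.

```python
import tempfile
from pathlib import Path


def build_tts_headers(token):
    """Headers for a request that sends JSON text and expects audio back."""
    return {
        "Content-Type": "application/json",    # we still send text as JSON
        "Accept": "application/octet-stream",  # assumed value for a binary audio answer
        "Authorization": f"Bearer {token}",    # assumed authentication scheme
    }


def save_audio(audio_bytes, output_path):
    """Write raw response bytes to disk in binary mode and return the path."""
    path = Path(output_path)
    path.write_bytes(audio_bytes)
    return path


headers = build_tts_headers("YOUR_ACCESS_TOKEN")

# Stand-in for the bytes a real TTS response would contain.
fake_audio = b"RIFF....WAVE"
with tempfile.TemporaryDirectory() as folder:
    saved = save_audio(fake_audio, Path(folder) / "speech.wav")
    written_size = saved.stat().st_size

print(headers["Accept"], written_size)
```

The key point is that the response body is treated as bytes and written as-is, rather than being decoded as text or parsed as JSON.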

*You can code that in the cell below.*

*The solution is provided in the `/notebook_solutions/tts_basics.ipynb` notebook. Feel free to try the task on your own and ask questions before checking the solution.*

In [None]:
# Write your code here

# To execute a code cell, select the cell and then click the ▶️ button in the menu above the notebook. You can also select the cell and press `SHIFT + ENTER`.

### Conclusion

Congratulations on completing this introduction to Speech AI models and AI APIs!

You have successfully:

- Discovered the `ASR`, `NMT`, and `TTS` models
- Learned how to work with `APIs` and send `HTTP` requests
- Provided correct `JSON` input data and used the right header settings to achieve the desired output

You are now ready to move on to the next notebook, where you will put your new skills into practice with more advanced Speech AI tasks, such as diarization, subtitle generation, and other Speech AI features.

The `/notebooks/Speech_AI_key_features.ipynb` notebook is waiting for you.