## Finetune your own Speech-to-Text Whisper model on the language of your choice on a GPU, for free!

### Set up the GPU
First, enable a GPU for the notebook:
1. Navigate to Edit→Notebook Settings.
2. Select T4 GPU from the Hardware Accelerator section.
3. Click Save and accept.

Next, confirm that the notebook can connect to the GPU:

In [None]:
import torch

if not torch.cuda.is_available():
    print("GPU NOT available!")
else:
    print("GPU is available!")
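
If the check passes, it can also be useful to see which GPU was assigned and how much memory it has, which may matter when you pick a larger model. The following is a minimal sketch and not part of the original notebook: `describe_gpu` is a hypothetical helper name, and it assumes device index 0 is the GPU assigned to the notebook.

```python
def describe_gpu() -> str:
    """Return a one-line description of the first visible CUDA device."""
    try:
        import torch  # imported here so the helper degrades gracefully
    except ImportError:
        return "PyTorch is not installed in this environment."

    if not torch.cuda.is_available():
        return "No CUDA GPU is visible; check the hardware accelerator setting."

    # Assumption: device index 0 is the GPU assigned to the notebook.
    props = torch.cuda.get_device_properties(0)
    total_gib = props.total_memory / 1024**3
    return f"{props.name} with {total_gib:.1f} GiB of memory"


print(describe_gpu())
```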

### Set up and log in to Hugging Face

The dataset we use for finetuning is Mozilla's [Common Voice](https://commonvoice.mozilla.org/).

We use the Hugging Face (HF) platform to download the Common Voice dataset, to track training and evaluation metrics during finetuning, and to save the final model so that you can use it and share it with others later. Before starting, make sure you:
1. have an HF [account](https://huggingface.co/join)
2. set up a [personal access token](https://huggingface.co/settings/tokens)
3. log in to Hugging Face in this notebook by running the command below and entering your token


In [None]:
!huggingface-cli login
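
If you would rather not paste the token into an interactive prompt, the `huggingface_hub` library also provides a `login` function that accepts a token directly. The sketch below is one possible alternative, assuming you have already exported the token as an environment variable named `HF_TOKEN`; `login_from_env` is a hypothetical helper name, not something the notebook defines.

```python
import os


def login_from_env(var_name: str = "HF_TOKEN") -> bool:
    """Log in to Hugging Face with a token from the environment, if present."""
    token = os.environ.get(var_name)
    if not token:
        print(f"{var_name} is not set; use the interactive login instead.")
        return False

    # Imported lazily so the check above works even without the library.
    from huggingface_hub import login

    login(token=token)
    return True
```

Calling `login_from_env()` returns `True` once the login call has completed, or `False` if the variable is missing, in which case the `huggingface-cli login` command above remains the way to go.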

### Download and install the speech-to-text-finetune package

In [None]:
!git clone https://github.com/mozilla-ai/speech-to-text-finetune.git
%cd speech-to-text-finetune/

In [None]:
!pip install --quiet -e .

In [None]:
from speech_to_text_finetune.finetune_whisper import run_finetuning

### Configure finetuning parameters

In [None]:
# @title Finetuning configuration and hyperparameter setting

model_id = "openai/whisper-tiny"  # @param ["openai/whisper-tiny", "openai/whisper-small", "openai/whisper-medium"]
dataset_id = "mozilla-foundation/common_voice_17_0"
language = "Greek"

repo_name = "colab-test"
make_repo_private = True
test_max_steps = 10
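
Before launching a long job, a quick sanity check of these values can catch typos early. The sketch below is illustrative only and not part of the `speech-to-text-finetune` package: `check_settings` and `ALLOWED_MODELS` are hypothetical names, the model list simply mirrors the options listed in the configuration cell, and the repository-name rule is a rough assumption rather than Hugging Face's full naming policy.

```python
# Assumption: only the three model sizes offered in the configuration cell.
ALLOWED_MODELS = (
    "openai/whisper-tiny",
    "openai/whisper-small",
    "openai/whisper-medium",
)


def check_settings(model_id, language, repo_name, test_max_steps):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if model_id not in ALLOWED_MODELS:
        problems.append(f"Unexpected model_id: {model_id!r}")
    if not language.strip():
        problems.append("language must not be empty")
    # Rough assumption: no spaces or slashes in the bare repository name.
    if not repo_name or " " in repo_name or "/" in repo_name:
        problems.append(f"Suspicious repo_name: {repo_name!r}")
    if not isinstance(test_max_steps, int) or test_max_steps <= 0:
        problems.append("test_max_steps must be a positive integer")
    return problems


issues = check_settings(
    model_id="openai/whisper-tiny",
    language="Greek",
    repo_name="colab-test",
    test_max_steps=10,
)
print(issues or "Settings look fine.")
```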

### Start finetuning job

Note that this might take a while: anywhere from 10 minutes to 10 hours, depending on your model choice and hyperparameter configuration.

In [None]:
run_finetuning(config_path="example_data/config.yaml")