# Benchmarking Results on Malayalam Datasets

## Whisper-Event Leaderboard

The [Hugging Face team](https://huggingface.co/) ran a Whisper event in which participants fine-tuned the Whisper model to achieve state-of-the-art performance for various languages.

During this competition, many models were evaluated on datasets such as [Common Voice](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0).

For Malayalam, the results on the Malayalam subset of Common Voice are as follows:

![Results in common voice](https://user-images.githubusercontent.com/24592806/222974236-44f047ec-e072-4f6a-b49f-ed88afb02999.png)

Models were also evaluated on the Malayalam subset of Google FLEURS:

![Results in FLEURS](https://user-images.githubusercontent.com/24592806/222974253-0fb96dd3-64ae-4ea2-a022-fd3db35cf721.png)


Details are from the [Hugging Face whisper-event leaderboard](https://huggingface.co/spaces/whisper-event/leaderboard).

## Benchmarking on the Common Voice Dataset

In [None]:
import pandas as pd
from malayalam_asr_benchmarking.commonvoice import evaluate_whisper_model_common_voice 

**ASR models to benchmark**

In [None]:
asr_models = ["thennal/whisper-medium-ml",
              "anuragshas/whisper-large-v2-ml",
              "parambharat/whisper-small-ml",
              "DrishtiSharma/whisper-large-v2-malayalam",
              "parambharat/whisper-base-ml",
              "kurianbenoy/whisper_malayalam_largev2",
              "parambharat/whisper-tiny-ml"
             ]

In [None]:
asr_models[-1]

'parambharat/whisper-tiny-ml'

In [None]:
#|eval: false
evaluate_whisper_model_common_voice(asr_models[-1])

Found cached dataset common_voice_11_0 (/home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0)
Loading cached processed dataset at /home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0/cache-374585c2877047e3.arrow
Loading cached processed dataset at /home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0/cache-22670505c562e0d4.arrow


Total time taken: 84.1298975944519
The WER of model: 38.31
The CER of model: 21.93
The model size is: 37.76M
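
The WER (word error rate) and CER (character error rate) reported above are both edit-distance metrics. The source does not show how `evaluate_whisper_model_common_voice` computes them internally, so the following is only a minimal sketch of the standard definitions. It assumes the scores are percentages, that no text normalization is applied, and that CER counts every Unicode code point, including spaces. The function names `edit_distance`, `wer`, and `cer` are illustrative and not part of the package.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (word lists or strings)."""
    prev = list(range(len(hyp) + 1))  # distances for an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a percentage of the reference word count."""
    ref_words = reference.split()
    return 100 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a percentage of the reference length."""
    return 100 * edit_distance(reference, hypothesis) / len(reference)
```

With these definitions, one wrong word in a three-word reference gives a WER of about 33.3, which is why a model can have a much higher WER than CER when its mistakes are small spelling differences inside long words.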


In [None]:
#|eval: false
evaluate_whisper_model_common_voice(asr_models[0])

Found cached dataset common_voice_11_0 (/home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0)
Loading cached processed dataset at /home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0/cache-374585c2877047e3.arrow
Loading cached processed dataset at /home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0/cache-22670505c562e0d4.arrow


Total time taken: 1086.15980052948
The WER of model: 11.56
The CER of model: 5.41
The model size is: 763.86M


In [None]:
#|eval: false
evaluate_whisper_model_common_voice(asr_models[1])

FileNotFoundError: [Errno 2] No such file or directory: '/home/.cache/huggingface/hub/models--anuragshas--whisper-large-v2-ml/refs/main'
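
Because one model failed with a `FileNotFoundError`, it may help to run the remaining models in a loop that records failures instead of stopping. The sketch below is an assumption about how this could be done, not part of `malayalam_asr_benchmarking`: `benchmark_all` is a hypothetical helper, and it only records whether each call succeeded, since the source does not show whether the evaluation function returns its metrics.

```python
def benchmark_all(model_names, evaluate_fn):
    """Call evaluate_fn on each model name and record the outcome.

    A failure in one model (for example a missing cache file) is stored
    as a short message so the remaining models are still evaluated.
    """
    outcomes = {}
    for name in model_names:
        try:
            evaluate_fn(name)
            outcomes[name] = "ok"
        except Exception as err:
            outcomes[name] = f"failed: {type(err).__name__}"
    return outcomes


# Stand-in for the real evaluation function, used only to illustrate the loop.
def fake_evaluate(name):
    if "large" in name:
        raise FileNotFoundError(name)


outcomes = benchmark_all(
    ["parambharat/whisper-tiny-ml", "anuragshas/whisper-large-v2-ml"],
    fake_evaluate,
)
print(outcomes)
```

In the notebook, the same helper could be called as `benchmark_all(asr_models, evaluate_whisper_model_common_voice)`.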

#| hide
## How to store results

In [None]:
#| hide
import pandas as pd
import numpy as np
import matplotlib as mpl

df = pd.DataFrame([[38.0, 2.0, 18.0, 22.0, 21, np.nan],[19, 439, 6, 452, 226,232]],
                  index=pd.Index(['Tumour (Positive)', 'Non-Tumour (Negative)'], name='Actual Label:'),
                  columns=pd.MultiIndex.from_product([['Decision Tree', 'Regression', 'Random'],['Tumour', 'Non-Tumour']], names=['Model:', 'Predicted:']))
df.style

| Model: | Decision Tree | Decision Tree | Regression | Regression | Random | Random |
|---|---|---|---|---|---|---|
| Predicted: | Tumour | Non-Tumour | Tumour | Non-Tumour | Tumour | Non-Tumour |
| Actual Label: | | | | | | |
| Tumour (Positive) | 38.0 | 2.0 | 18.0 | 22.0 | 21 | |
| Non-Tumour (Negative) | 19.0 | 439.0 | 6.0 | 452.0 | 226 | 232.0 |


In [None]:
df.style.format(precision=0, na_rep='MISSING', thousands=" ",
                formatter={('Decision Tree', 'Tumour'): "{:.2f}",
                           ('Regression', 'Non-Tumour'): lambda x: "$ {:,.1f}".format(x*-1e6)
                          })

| | model_name | wer | cer | time_taken | model_size |
|---|---|---|---|---|---|
| 0 | thennal/whisper-medium-ml | 0 | 0 | 0 | 1M |
| 1 | anuragshas/whisper-large-v2-ml | 0 | 0 | 0 | 1M |
| 2 | parambharat/whisper-small-ml | 0 | 0 | 0 | 1M |
| 3 | DrishtiSharma/whisper-large-v2-malayalam | 0 | 0 | 0 | 1M |
| 4 | parambharat/whisper-base-ml | 0 | 0 | 0 | 1M |
| 5 | kurianbenoy/whisper_malayalam_largev2 | 0 | 0 | 0 | 1M |
| 6 | parambharat/whisper-tiny-ml | 0 | 0 | 0 | 1M |


In [None]:
#| hide
# One row per model; the metric columns hold placeholder values for now
df = pd.DataFrame({"model_name": asr_models})

df["wer"] = 0
df["cer"] = 0
df["time_taken"] = 0
df["model_size"] = "1M"

df.head()

| | model_name | wer | cer | time_taken | model_size |
|---|---|---|---|---|---|
| 0 | thennal/whisper-medium-ml | 0 | 0 | 0 | 1M |
| 1 | anuragshas/whisper-large-v2-ml | 0 | 0 | 0 | 1M |
| 2 | parambharat/whisper-small-ml | 0 | 0 | 0 | 1M |
| 3 | DrishtiSharma/whisper-large-v2-malayalam | 0 | 0 | 0 | 1M |
| 4 | parambharat/whisper-base-ml | 0 | 0 | 0 | 1M |
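
As an illustration of how the same columns could hold real measurements, the sketch below fills a DataFrame with the values printed by the two successful runs earlier in this notebook. It assumes the numbers are entered by hand; `results` and `results_df` are illustrative names, and the remaining models are left out because no results for them appear here.

```python
import pandas as pd

# Values copied from the printed output of the two successful runs above.
results = [
    {"model_name": "parambharat/whisper-tiny-ml", "wer": 38.31, "cer": 21.93,
     "time_taken": 84.1298975944519, "model_size": "37.76M"},
    {"model_name": "thennal/whisper-medium-ml", "wer": 11.56, "cer": 5.41,
     "time_taken": 1086.15980052948, "model_size": "763.86M"},
]

# Sort so the model with the lowest WER comes first.
results_df = pd.DataFrame(results).sort_values("wer").reset_index(drop=True)

# Serialize to CSV text; pass a file path to to_csv to write to disk instead.
csv_text = results_df.to_csv(index=False)
print(csv_text)
```

Keeping `model_size` as a string matches the placeholder column above; converting it to a number would be needed before sorting or plotting by size.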
