# Benchmarking Results on Malayalam Datasets

## Whisper-Event Leaderboard

The [Hugging Face team](https://huggingface.co/) ran a Whisper event on fine-tuning the Whisper model to achieve state-of-the-art results for various languages.

During this competition, many models were evaluated on datasets such as [Common Voice](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0).

For Malayalam, the results on the Malayalam subset of Common Voice are as follows:

![Results in common voice](https://user-images.githubusercontent.com/24592806/222974236-44f047ec-e072-4f6a-b49f-ed88afb02999.png)

Models were also evaluated on the Malayalam subset of Google FLEURS:

![Results in FLEURS](https://user-images.githubusercontent.com/24592806/222974253-0fb96dd3-64ae-4ea2-a022-fd3db35cf721.png)


[Details are from the Hugging Face whisper-event leaderboard](https://huggingface.co/spaces/whisper-event/leaderboard)

## Benchmarking on the Common Voice Dataset

In [None]:
import pandas as pd
from tqdm import tqdm

from malayalam_asr_benchmarking.commonvoice import evaluate_whisper_model_common_voice 

### ASR models to benchmark

In [None]:
asr_models = ["thennal/whisper-medium-ml",
              "anuragshas/whisper-large-v2-ml",
              "parambharat/whisper-small-ml",
              "DrishtiSharma/whisper-large-v2-malayalam",
              "parambharat/whisper-base-ml",
              "kurianbenoy/whisper_malayalam_largev2",
              "parambharat/whisper-tiny-ml"
             ]

In [None]:
asr_models[-1]

'parambharat/whisper-tiny-ml'

### Running across all ASR models

In [None]:
#| eval: false
wer_list = []
cer_list = []
model_size_list = []
time_list = []

In [None]:
#| eval: false
for asr in tqdm(asr_models):
    evaluate_whisper_model_common_voice(asr, wer_list, cer_list, model_size_list, time_list)

  0%|          | 0/7 [00:00<?, ?it/s]Found cached dataset common_voice_11_0 (/home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0)
Loading cached processed dataset at /home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0/cache-374585c2877047e3.arrow
Loading cached processed dataset at /home/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/ml/11.0.0/2c65b95d99ca879b1b1074ea197b65e0497848fd697fdb0582e0f6b75b6f4da0/cache-22670505c562e0d4.arrow
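
The loop above passes the same four lists to every call, so each call is expected to append one entry per list. The sketch below illustrates that accumulate-in-place pattern with a hypothetical `evaluate_stub` function and placeholder model names and scores. It is not the implementation of `evaluate_whisper_model_common_voice`, only a stand-in that shows how the lists fill up in model order.

```python
import time


def evaluate_stub(model_name, wer_list, time_list, score_fn):
    """Hypothetical stand-in: score one model and append the results in place."""
    start = time.perf_counter()
    wer = score_fn(model_name)  # a real benchmark would run inference here
    elapsed = time.perf_counter() - start
    wer_list.append(wer)
    time_list.append(elapsed)


# Placeholder names and scores, used only to exercise the pattern.
fake_scores = {"model-a": 10.0, "model-b": 20.0}

stub_wer_list, stub_time_list = [], []
for name in fake_scores:
    evaluate_stub(name, stub_wer_list, stub_time_list, score_fn=fake_scores.get)

print(stub_wer_list)
```

One consequence of this design is that the result lists stay aligned with `asr_models` only if every call completes; if one model fails partway through, the lists may end up with different lengths.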


In [None]:
#| eval: false
wer_list

[11.56, 24.46, 21.65, 26.25, 30.33, 300.7, 38.31]
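
To help read these numbers, here is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly defined: the edit distance between the reference and the predicted text, divided by the reference length, expressed as a percentage. The function names are illustrative, and the sketch assumes no text normalization, which the actual benchmark may apply.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance as a percentage of the reference length (reference must be non-empty)."""
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def wer(reference, hypothesis):
    """Word error rate: tokens are whitespace-separated words."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    """Character error rate: tokens are individual characters."""
    return error_rate(list(reference), list(hypothesis))


print(wer("hello world", "hello world"))
print(wer("hello world", "hello hello hello hello world world world world"))
print(cer("abcd", "abcf"))
```

Because the denominator is the reference length, insertions can push the rate above 100. In the second call, six extra words against a two-word reference give a WER of 300, which is one way a score above 100 can arise.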

### Store results in pandas

In [None]:
#| eval: false
df = pd.DataFrame({"models": asr_models, "wer": wer_list, "cer": cer_list, "model size": model_size_list, "time(s)": time_list})

In [None]:
#| eval: false
df.head(7)

|   | models | wer | cer | model size | time(s) |
|---|--------|-----|-----|------------|---------|
| 0 | thennal/whisper-medium-ml | 11.56 | 5.41 | 763.86M | 924.979711 |
| 1 | anuragshas/whisper-large-v2-ml | 24.46 | 11.64 | 1.54B | 1779.561592 |
| 2 | parambharat/whisper-small-ml | 21.65 | 11.78 | 241.73M | 273.555688 |
| 3 | DrishtiSharma/whisper-large-v2-malayalam | 26.25 | 13.17 | 1.54B | 1773.661774 |
| 4 | parambharat/whisper-base-ml | 30.33 | 16.16 | 72.59M | 96.419609 |
| 5 | kurianbenoy/whisper_malayalam_largev2 | 300.7 | 292.82 | 1.54B | 5034.771624 |
| 6 | parambharat/whisper-tiny-ml | 38.31 | 21.93 | 37.76M | 59.535259 |
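
The table is easier to compare once the rows are ranked by WER and the size strings are turned into numbers. The sketch below does this in plain Python with the values copied from the table above. The `parse_size` helper is hypothetical, and it assumes the "model size" column is a parameter count with `M` for millions and `B` for billions.

```python
# Rows copied from the results table: (model, WER, CER, model size, seconds).
results = [
    ("thennal/whisper-medium-ml", 11.56, 5.41, "763.86M", 924.979711),
    ("anuragshas/whisper-large-v2-ml", 24.46, 11.64, "1.54B", 1779.561592),
    ("parambharat/whisper-small-ml", 21.65, 11.78, "241.73M", 273.555688),
    ("DrishtiSharma/whisper-large-v2-malayalam", 26.25, 13.17, "1.54B", 1773.661774),
    ("parambharat/whisper-base-ml", 30.33, 16.16, "72.59M", 96.419609),
    ("kurianbenoy/whisper_malayalam_largev2", 300.7, 292.82, "1.54B", 5034.771624),
    ("parambharat/whisper-tiny-ml", 38.31, 21.93, "37.76M", 59.535259),
]

# Assumed meaning of the size suffixes.
UNITS = {"M": 1e6, "B": 1e9}


def parse_size(text):
    """Convert a string such as '763.86M' or '1.54B' to a number."""
    return float(text[:-1]) * UNITS[text[-1]]


# Lower WER is better, so sort ascending.
ranked = sorted(results, key=lambda row: row[1])

for name, wer_score, cer_score, size, seconds in ranked:
    millions = parse_size(size) / 1e6
    print(f"{name:42s} WER={wer_score:7.2f}  size={millions:8.2f}M  time={seconds:8.1f}s")
```

In this run the medium-sized model has the lowest WER, and the three 1.54B models do not rank above it, so a larger size does not by itself mean a better score on this subset.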


In [None]:
#| eval: false
df.to_parquet("/home/commonvoice_benchmarking_results.parquet")