# Grammar Scoring Competition

## 1. Approach Overview
The objective of this competition is to predict a grammar score, which is fundamentally a **regression problem**. The approach first converts the provided audio files to transcripts with Whisper, a pre-trained ASR model. The transcripts then serve as fine-tuning data for DeBERTa, a transformer-based model, which is trained to predict the grammar score.

## 2. Preprocessing Steps
The initial preprocessing pipeline is designed to standardize the audio data before feature extraction:
* **Package Installation:** All required libraries (`torchaudio`, `wordfreq`, etc.) are installed.
* **Audio Standardization:** The core `load_and_resample` function ensures all raw audio files are loaded and resampled to a consistent rate of **16,000 Hz**.
* **Mono Conversion:** Multi-channel audio (e.g., stereo) is converted to **mono** (single channel) by averaging the channels, which is standard practice for speech processing.


## 3. Pipeline Architecture
The machine learning pipeline follows a standard supervised learning flow:
1.  **Raw Input:** Audio File + Ground Truth Score.
2.  **Audio Preprocessing:** Resampling (16kHz) and Mono Conversion.
3.  **Speech to text:** The audio is converted to raw text using the ASR model **Whisper**. The generated transcripts are saved as a CSV file.
4.  **Model Training:** Fine-tuning the **transformer** on the saved transcript CSV.
5.  **Evaluation:** Performance is assessed using **Root Mean Square Error (RMSE)**.
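
As a reference for the evaluation step, here is a minimal sketch of how RMSE is computed. The helper name `rmse` and the example scores are illustrative assumptions and are not taken from the competition data.

```python
import math

def rmse(predictions, targets):
    """Root mean square error between two equal-length sequences of scores."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        raise ValueError("need at least one prediction")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical scores, for illustration only.
example_targets = [3.0, 2.5, 4.0, 5.0]
example_predictions = [3.5, 2.0, 4.0, 4.0]
example_rmse = rmse(example_predictions, example_targets)
print(f"RMSE: {example_rmse:.4f}")
```

Because the errors are squared before averaging, one prediction that is off by a full point weighs more than two predictions that are each off by half a point.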
## 4. Evaluation Results
The final results from the model run are presented below.

| Transformer model | Train RMSE | Leaderboard Score |
| :--- | :--- | :--- |
| DeBERTa | 0.16188851 | 0.600 |
| DeBERTa-small | 0.16698851 | 0.599 |
| DeBERTa-large | 0.1449279 | 0.735 |

In [None]:
!pip install wordfreq

Collecting wordfreq
  Downloading wordfreq-3.1.1-py3-none-any.whl.metadata (27 kB)
Collecting ftfy>=6.1 (from wordfreq)
  Downloading ftfy-6.3.1-py3-none-any.whl.metadata (7.3 kB)
Collecting langcodes>=3.0 (from wordfreq)
  Downloading langcodes-3.5.1-py3-none-any.whl.metadata (30 kB)
Collecting locate<2.0.0,>=1.1.1 (from wordfreq)
  Downloading locate-1.1.1-py3-none-any.whl.metadata (3.9 kB)
Downloading wordfreq-3.1.1-py3-none-any.whl (56.8 MB)
Downloading ftfy-6.3.1-py3-none-any.whl (44 kB)
Downloading langcodes-3.5.1-py3-none-any.whl (183 kB)
Downloading locate-1.1.1-py3-none-any.whl (5.4 kB)
Installing collected 

In [None]:
!pip install evaluate

Collecting evaluate
  Downloading evaluate-0.4.6-py3-none-any.whl.metadata (9.5 kB)
Downloading evaluate-0.4.6-py3-none-any.whl (84 kB)
Installing collected packages: evaluate
Successfully installed evaluate-0.4.6


In [None]:
!pip install TorchCodec

Collecting TorchCodec
  Downloading torchcodec-0.9.0-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (11 kB)
Downloading torchcodec-0.9.0-cp312-cp312-manylinux_2_28_x86_64.whl (2.1 MB)
Installing collected packages: TorchCodec
Successfully installed TorchCodec-0.9.0


In [None]:
!pip install transformers==4.57.1

Collecting transformers==4.57.1
  Downloading transformers-4.57.1-py3-none-any.whl.metadata (43 kB)
Downloading transformers-4.57.1-py3-none-any.whl (12.0 MB)
Installing collected packages: transformers
  Attempting uninstall: transformers
    Found existing installation: transformers 4.57.2
    Uninstalling transformers-4.57.2:
      Successfully uninstalled transformers-4.57.2
Successfully installed transformers-4.57.1


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## Visualizing waveform

In [None]:
import matplotlib.pyplot as plt
import torch
import torchaudio
import torchaudio.transforms as T
import torchaudio.functional as F

In [None]:
# Function to handle the actual plotting logic for waveform or spectrogram.
def _plot(waveform, sample_rate, title):
  """
  Internal helper function to plot the waveform or spectrogram.
  Handles conversion to numpy and setting up the plot axes.
  """
  # Convert PyTorch tensor to NumPy array for plotting
  waveform = waveform.numpy()

  num_channels, num_frames = waveform.shape
  # Create a time axis based on the number of frames and sample rate
  time_axis = torch.arange(0, num_frames) / sample_rate

  figure, axes = plt.subplots(num_channels, 1)
  if num_channels == 1:
    axes = [axes]
  for c in range(num_channels):
    # Plot the waveform vs time for the Waveform visualization
    if title == "Waveform":
      axes[c].plot(time_axis, waveform[c], linewidth=1)
      axes[c].grid(True)
    # Plot the spectrogram (frequency vs time)
    else:
      axes[c].specgram(waveform[c], Fs=sample_rate)
    if num_channels > 1:
      axes[c].set_ylabel(f'Channel {c+1}')
  figure.suptitle(title)
  plt.show(block=False)

# Public function to display the audio waveform (amplitude over time).
def plot_waveform(waveform, sample_rate):
  """Plots the time-domain waveform of the audio signal."""
  _plot(waveform, sample_rate, title="Waveform")

# Public function to display the audio spectrogram (frequency content over time).
def plot_specgram(waveform, sample_rate):
  """Plots the spectrogram of the audio signal (currently not used but included for completeness)."""
  _plot(waveform, sample_rate, title="Spectrogram")

In [None]:
# Purpose: Ensure all audio files have the same format:
# ✔ Same sample rate (e.g., 16kHz)
# ✔ Converted to mono (1 channel)

def load_and_resample(path, target_sr=16000):
    # Load the audio file from the specified path, obtaining the waveform tensor and original sample rate (sr).
    waveform, sr = torchaudio.load(path)  # shape: [channels, time]

    # Check if the original sample rate (sr) matches the target rate.
    if sr != target_sr:
        # Initialize the Resample transform from torchaudio.
        resampler = T.Resample(orig_freq=sr, new_freq=target_sr)
        # Apply the resampling transformation to the waveform.
        waveform = resampler(waveform)

    # Convert to mono if the audio has multiple channels (e.g., stereo)
    if waveform.shape[0] > 1:
        # Average the channels along the first dimension to create a single mono channel
        waveform = torch.mean(waveform, dim=0, keepdim=True)

    return waveform, target_sr

In [None]:
file_path = "drive/MyDrive/grammar_scoring/audios/train/audio_1.wav"
waveform, sample = load_and_resample(file_path)
plot_waveform(waveform, sample)
waveform = waveform.squeeze() #always squeeze waveform to avoid dimension related errors

## Speech to Text

### Whisper
Loading and testing the Whisper model.

In [None]:
from transformers import WhisperProcessor, WhisperForConditionalGeneration, pipeline

In [None]:
device = "cuda:0" if torch.cuda.is_available() else "cpu"

In [None]:
# # load model and processor

#The Whisper model is intrinsically designed to work on audio samples of up to 30s in duration.
#However, by using a chunking algorithm, it can be used to transcribe audio samples of up to arbitrary length.
#This is possible through Transformers pipeline method. Chunking is enabled by setting chunk_length_s=30 when instantiating the pipeline.

pipe = pipeline(
  "automatic-speech-recognition",
  model="openai/whisper-medium",
  chunk_length_s=30,
  stride_length_s=2,
  device=device,
)
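
To make the chunking settings above concrete, the sketch below lists which sample ranges a long recording would be split into. It is a simplified illustration of the idea, not the library's exact implementation: the function name `chunk_windows` is hypothetical, and it assumes the 2-second stride is applied on both sides of every chunk, so consecutive windows overlap by 4 seconds.

```python
def chunk_windows(num_samples, sample_rate=16000, chunk_length_s=30, stride_length_s=2):
    """Return (start, end) sample indices of overlapping chunks covering the audio."""
    chunk_len = int(chunk_length_s * sample_rate)
    stride = int(stride_length_s * sample_rate)
    # Each new chunk starts `step` samples after the previous one.
    step = chunk_len - 2 * stride
    if step <= 0:
        raise ValueError("chunk length must be larger than twice the stride")

    windows = []
    start = 0
    while True:
        end = min(start + chunk_len, num_samples)
        windows.append((start, end))
        if end >= num_samples:
            break
        start += step
    return windows

# A hypothetical 70-second recording at 16 kHz.
windows_70s = chunk_windows(70 * 16000)
for start, end in windows_70s:
    print(f"{start / 16000:.0f}s to {end / 16000:.0f}s")
```

Audio shorter than 30 seconds yields a single window, so the chunking settings only matter for longer recordings.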

In [None]:
#this is the main function where transcripts are generated from audio
def transcript(file_name):
  file_path = "drive/MyDrive/grammar_scoring/audios/train/" + file_name
  waveform, sample = load_and_resample(file_path)
  waveform = waveform.squeeze()
  # input_features = processor(waveform, sampling_rate=sample, return_tensors="pt").input_features
  # predicted_ids = model.generate(input_features)
  # transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
  transcription = pipe(waveform, batch_size=8, return_timestamps=True)["text"]
  # print(transcription)
  return transcription


In [None]:
# #testing with an audio file
file_path = "audio_1.wav"
print(transcript(file_path))

### Generating csv
After loading and testing the pre-trained Whisper model, each audio file in the training set was converted to a transcript, and the transcripts were saved as a CSV file.

In [None]:
import os
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

In [None]:
#creating dataframe
df = pd.DataFrame(columns=["filename", "transcript"])

In [None]:
dir_path = "drive/MyDrive/grammar_scoring/audios/train"
for files in os.listdir(dir_path):

  name = os.path.splitext(os.path.basename(files))[0]
  print(name)

  # speech-to-text conversion
  trans = transcript(files)
  print(trans)

  df.loc[len(df)] = {'filename': name, 'transcript': trans}


In [None]:
df.to_csv("drive/MyDrive/grammar_scoring/csvs/transcript_train.csv", index=False)

In [None]:
df

### Data cleaning

In [None]:
import pandas as pd

In [None]:
df_main = pd.read_csv("drive/MyDrive/grammar_scoring/csvs/train.csv")
df_train = pd.read_csv("drive/MyDrive/grammar_scoring/csvs/transcript_train.csv")

In [None]:
df_final = pd.merge(df_train, df_main, on="filename", how="inner")

In [None]:
df_final

| | filename | transcript | label |
| :--- | :--- | :--- | :--- |
| 0 | audio_106 | The best day of my life would definitely be w... | 5.0 |
| 1 | audio_100 | The moment in my life that I cherish the most... | 3.0 |
| 2 | audio_102 | My favorite place to visit has always been re... | 3.0 |
| 3 | audio_10 | Okay, so my role model is someone like my mot... | 3.0 |
| 4 | audio_107 | This goal is important to me because initiall... | 3.5 |
| ... | ... | ... | ... |
| 425 | audio_81 | Hello this is Aisona. Today I will describe a... | 2.5 |
| 426 | audio_89 | My school playground, it was huge. It was ver... | 2.0 |
| 427 | audio_97 | I can hear many cars driving by and lot of pe... | 2.0 |
| 428 | audio_94 | The playground looks like a grain filled with... | 3.5 |


In [None]:
# Removing rows that contain non-English (non-ASCII) characters
# Regex pattern that matches only strings made up entirely of ASCII characters
import re
pattern = re.compile(r'^[\x00-\x7F]*$')

# Function to test each cell (convert to string to avoid errors)
def is_clean(value):
    return bool(pattern.match(str(value)))

# Keep rows where **all columns** satisfy the condition
clean_df = df_final[df_final.apply(lambda row: all(is_clean(x) for x in row), axis=1)]
clean_df

| | filename | transcript | label |
| :--- | :--- | :--- | :--- |
| 0 | audio_106 | The best day of my life would definitely be w... | 5.0 |
| 1 | audio_100 | The moment in my life that I cherish the most... | 3.0 |
| 2 | audio_102 | My favorite place to visit has always been re... | 3.0 |
| 3 | audio_10 | Okay, so my role model is someone like my mot... | 3.0 |
| 4 | audio_107 | This goal is important to me because initiall... | 3.5 |
| ... | ... | ... | ... |
| 425 | audio_81 | Hello this is Aisona. Today I will describe a... | 2.5 |
| 426 | audio_89 | My school playground, it was huge. It was ver... | 2.0 |
| 427 | audio_97 | I can hear many cars driving by and lot of pe... | 2.0 |
| 428 | audio_94 | The playground looks like a grain filled with... | 3.5 |
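
To show what the ASCII-only pattern in the cleaning cell accepts and rejects, here is a small stand-alone sketch. The names `ASCII_ONLY` and `is_ascii_only` and the sample strings are illustrative assumptions; the pattern itself is the one used above.

```python
import re

# Same pattern as in the cleaning cell: the whole string must be ASCII.
ASCII_ONLY = re.compile(r'^[\x00-\x7F]*$')

def is_ascii_only(value):
    """True if the string form of `value` contains only ASCII characters."""
    return bool(ASCII_ONLY.match(str(value)))

samples = [
    "The best day of my life was at the museum.",  # plain English text
    "people who came from their journey. के लिए",    # English followed by Devanagari
    "caf\u00e9 visit",                              # accented Latin letter
    3.5,                                           # numeric label, converted with str()
]
flags = [is_ascii_only(s) for s in samples]
for sample, flag in zip(samples, flags):
    print(flag, repr(sample))
```

Note that the pattern also rejects accented Latin letters and typographic punctuation, so it is stricter than "English only" in the everyday sense.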


In [None]:
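# Note: this starts again from df_final, so the ASCII filter above is not applied to the training data.
clean_df = df_final.drop(columns=["filename"])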
clean_df = clean_df.rename(columns={"label": "labels"})

## Transformer model for regression
A pre-trained DeBERTa model was fine-tuned as a regressor for the grammar scoring task. General steps such as loading and tokenizing follow the Hugging Face documentation.

In [None]:
import pandas as pd
from datasets import Dataset, Value
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer
)
import torch
import evaluate

In [None]:
dataset_train = Dataset.from_pandas(clean_df)
dataset_train = dataset_train.cast_column("labels", Value("float32"))

Casting the dataset:   0%|          | 0/430 [00:00<?, ? examples/s]

In [None]:
model_name = "microsoft/deberta-v3-large"   # recommended for grammar scoring
tokenizer = AutoTokenizer.from_pretrained(model_name)




In [None]:
def tokenize_fn(batch):
    return tokenizer(
        batch["transcript"],
        padding="max_length",
        truncation=True,
        max_length=128,
    )

train_ds = dataset_train.map(tokenize_fn, batched=True)

Map:   0%|          | 0/430 [00:00<?, ? examples/s]

In [None]:
train_ds = train_ds.remove_columns(
    [col for col in train_ds.column_names if col not in ["input_ids","attention_mask","labels"]]
)

train_ds.set_format(type="torch")

In [None]:
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=1,                 # regression
    problem_type="regression"
)


Some weights of DebertaV2ForSequenceClassification were not initialized from the model checkpoint at microsoft/deberta-v3-large and are newly initialized: ['classifier.bias', 'classifier.weight', 'pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [None]:
mse = evaluate.load("mse")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = logits.squeeze()
    return {"mse": mse.compute(predictions=preds, references=labels)["mse"]}


In [None]:
training_args = TrainingArguments(
    output_dir="./grammar_model",
    num_train_epochs=40,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    logging_steps=50,
    save_steps=500,          # optional
    load_best_model_at_end=False,   # important: no eval → cannot pick “best”
)


In [None]:
class RMSETrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # Extract labels
        labels = inputs.pop("labels")

        # Forward pass
        outputs = model(**inputs)
        # squeeze(-1) drops only the trailing size-1 dimension, so a batch of one keeps shape [1]
        logits = outputs.logits.squeeze(-1)

        # MSE loss
        mse = torch.nn.functional.mse_loss(logits, labels)

        # RMSE = sqrt(MSE)
        rmse = torch.sqrt(mse)

        return (rmse, outputs) if return_outputs else rmse

In [None]:
trainer = RMSETrainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    processing_class=tokenizer,  # `tokenizer=` is deprecated in recent transformers releases
)


In [None]:
#epoch = 100
trainer.train()

The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'eos_token_id': 2, 'bos_token_id': 1}.
| Step | Training Loss |
| :--- | :--- |
| 50 | 1.0457 |
| 100 | 0.6449 |
| 150 | 0.4663 |
| 200 | 0.377 |
| 250 | 0.2968 |
| 300 | 0.2807 |
| 350 | 0.253 |
| 400 | 0.215 |
| 450 | 0.2078 |
| 500 | 0.1752 |


TrainOutput(global_step=1080, training_loss=0.2524117405767794, metrics={'train_runtime': 845.8361, 'train_samples_per_second': 20.335, 'train_steps_per_second': 1.277, 'total_flos': 4007318375731200.0, 'train_loss': 0.2524117405767794, 'epoch': 40.0})
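
The `global_step` and throughput in the `TrainOutput` above can be reproduced with simple arithmetic. The sketch below assumes 430 training examples (the count shown in the dataset progress bars), a single device, and no gradient accumulation; the function name `total_optimizer_steps` is hypothetical.

```python
import math

def total_optimizer_steps(num_examples, batch_size, num_epochs):
    """Number of optimizer steps when the last partial batch of each epoch is kept."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * num_epochs

steps = total_optimizer_steps(num_examples=430, batch_size=16, num_epochs=40)
print(steps)

# Samples processed per second = total samples seen / reported train_runtime.
samples_per_second = (430 * 40) / 845.8361
print(round(samples_per_second, 3))
```

Both values agree with the reported `global_step=1080` and `train_samples_per_second` of 20.335.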

In [None]:
trainer.save_model("drive/MyDrive/grammar_scoring/grammar_model_dabert_large4")

## Full model pipeline for inference

In [None]:
#step 1: Whisper for audio to text conversion
import torch
import torchaudio
import torchaudio.transforms as T
import torchaudio.functional as F
from transformers import WhisperProcessor, WhisperForConditionalGeneration, pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"

#waveform load function
def load_and_resample(path, target_sr=16000):
    waveform, sr = torchaudio.load(path)  # shape: [channels, time]
    if sr != target_sr:
        resampler = T.Resample(orig_freq=sr, new_freq=target_sr)
        waveform = resampler(waveform)
    # convert to mono (average channels)
    if waveform.shape[0] > 1:
        waveform = torch.mean(waveform, dim=0, keepdim=True)
    return waveform, target_sr

#Whisper model definition
pipe = pipeline(
  "automatic-speech-recognition",
  model="openai/whisper-medium",
  chunk_length_s=30,
  stride_length_s=2,
  device=device
)

#audio to text function
def transcript(file_name):
  file_path = "drive/MyDrive/grammar_scoring/audios/test/" + file_name
  waveform, sample = load_and_resample(file_path)
  waveform = waveform.squeeze()
  # input_features = processor(waveform, sampling_rate=sample, return_tensors="pt").input_features
  # predicted_ids = model.generate(input_features)
  # transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
  transcription = pipe(waveform, batch_size=8, return_timestamps=True)["text"]
  # print(transcription)
  return transcription


Device set to use cuda:0


In [None]:
# fine-tuned transformer (DeBERTa) for text preprocessing and grammar scoring
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer
)

model_name = "/content/drive/MyDrive/grammar_scoring/grammar_model_dabert_large3"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# def score_batch(sentences):
#     inputs = tokenizer(
#         sentences,
#         return_tensors="pt",
#         padding=True,
#         truncation=True,
#         max_length=128,
#     )

#     with torch.no_grad():
#         outputs = model(**inputs)

#     scores = outputs.logits.squeeze().tolist()
#     return scores

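# Move the fine-tuned model to the same device as its inputs and switch to inference mode.
model.to(device)
model.eval()

def score_text(sentence):
    # Tokenize one transcript; `device` is not a tokenizer argument, so the
    # encoded tensors are moved to the model's device afterwards.
    inputs = tokenizer(
        sentence,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=128,
    ).to(device)

    with torch.no_grad():
        outputs = model(**inputs)

    # The regression head returns a single value per input.
    score = outputs.logits.squeeze().item()
    return score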


In [None]:
import os
import pandas as pd

file_path = "drive/MyDrive/grammar_scoring/audios/test"
test_df =  pd.DataFrame(columns=["filename", "label"])

for files in os.listdir(file_path):
  name = os.path.splitext(os.path.basename(files))[0]
  print(name)
  # speech-to-text conversion
  trans = transcript(files)
  print(trans)
  score = score_text(trans)
  print(score)

  test_df.loc[len(test_df)] = {'filename': name, 'label': score}



audio_106


Using custom `forced_decoder_ids` from the (generation) config. This is deprecated in favor of the `task` and `language` flags/config options.
Transcription using a multilingual Whisper will default to language detection followed by transcription instead of translation to English. This might be a breaking change for your use case. If you want to instead always translate your audio to English, make sure to pass `language='en'`. See https://github.com/huggingface/transformers/pull/28687 for more details.
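
The warning above notes that a multilingual Whisper checkpoint detects the language on its own unless one is specified. A possible way to pin the language is to pass `generate_kwargs` when calling the pipeline, sketched below. The helper name `transcribe_english` is hypothetical, it strips surrounding whitespace (which the original `transcript` function does not), and whether this removes the non-English stretches seen in these transcripts has not been verified here.

```python
def transcribe_english(asr_pipe, waveform, batch_size=8):
    """Call an ASR pipeline while asking Whisper to transcribe in English."""
    result = asr_pipe(
        waveform,
        batch_size=batch_size,
        return_timestamps=True,
        # Forwarded to the model's generate() call by the pipeline.
        generate_kwargs={"language": "en", "task": "transcribe"},
    )
    return result["text"].strip()
```

With the objects defined in this notebook, the call would be `transcribe_english(pipe, waveform)`.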


 The best day of my life would definitely be when I went to the museum with my family. For me, it was one of the best days because it was such a nice time. It was my first time at the museum and my breath was taken away by the architecture, the art itself, and where it's located. It's such a beautiful location and it brought me so much peace. It really helped me. It helped me throughout the days because I felt the happiness because of me being there and with the people I most love and getting matcha. And the air was so crisp and beautiful. The weather was spectacular. There was nothing wrong with that day. And then we went to go eat dinner. a really nice restaurant so even though it's a very simple day but I really loved it.
5.195842266082764
audio_102_1
 My favorite place to visit has always been renting a cabin in southern Indiana or Kentucky. The reason that it's my favorite is because it's very secluded, surrounded by woods. There's a lot of nature around you. It's usually about a 

You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset


3.630868434906006
audio_1
 There could be a lot of different moments that I can say would be my favorite with my family. First off, it's very simple, like going out, eating out at a restaurant, and as long as the family is complete and together, then that is something that I would be really happy about. Another, another example would be when we go to the mall, for example, or any place which is very cold because, you know, here in the Philippines it's very hot, so I wanted to spend time with my family somewhere cold, like a mall.
3.099558115005493
audio_101
 My hobby is DIY or basically do it yourself. What I enjoy most about my hobby is making something out of raw materials. And of course, to do this, you would have to use a various amount of tools. The mediums I love to work with is metal, wood, or just building computers, building bikes, building things out of wood, basically just DIY, finding the little kinks and how things work and then building upon it. This hobby is done by myse

Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation.


 Airport is surrounded by the travelers and tourist people. People are waiting in line to, people are waiting in line for boarding pass and for checking of password. but there is height security and people who came from their journey. वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी के लिए वाज़र्मी क
2.090578079223633
audio_126
 The playground in the school has a wide variety of activities that refer to age groups of 5 to 12. It has rope climbing for the older kids. It has a high stair climber for the big kids and as well for the little kids. It has an area area a soft playground just in case they fall they could be well not hurt and the playground mainly looks li

Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation.


 Floods occur when the water overflows onto normally dry land, often due to heavy rainfall. Them failures are rapid rain, snow melt. They can be categorized into various types, with flash floods being particularly dangerous, developing within minutes and capable of sweeping away everything in their path. Okay, we think is run of floods pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pwede ang pw

Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation.


 បបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបបប of the but also the but the but the but the but the but the but the but the but the but the but the but the but the మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు మాల్లు �
3.0459086894989014
audio_137
 So my biggest goal in life is to be a doctor, to get a PhD. Right now I'm doing my masters, so it's about two years long, it has a two-year duration. So yeah, getting there will be hard because I have to do research, lab work, there's a lot of things that go into it, it's very lengthy, but I'm very, very, very determined to reach it because it will be so important to me as it will enable me to take care of my child. lot that go into it. Well, yeah, achieving i

Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation.


 Yesterday, yesterday last match we have me and my friends are visited the Chinnasong Stadium. We have watched the RCP, Rolchand of Bangalore versus Mumbai. We have me and my friend watched cricket match but unfortunately that match RCP we lost but we enjoyed a lot me and my friends as well as audience but we encouraged it to RCP. at RC we played very well ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముందిక్లు ముం
2.278496742248535
audio_146
 smartphone today is like mini computer which can do wonders so a smartphone is very useful because it contains so many applications so if you talk about the application that I use most it's YouTube a YouTube is an application where you watch videos of different interest like motivation like news like smartphone I will see you in the next one. Bye. news like smartphone availab

Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation.


 మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు మారిలు సినిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సిన సి� also worked to inspire young athletes. Think about how can he
3.0459046363830566
audio_153_1
 My favorite movie is really motivational. It actually allows me to really hope something good about life and also to dream something bigger for me, for my family and for society as well. So it actually allows me to be better, to actually experience life as it is, to be more present in the moment and to be more grateful to whatever that I actually have at the moment to not give up in life, to not give up on one

Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation.


 One of the role models is someone who embodies resilience and empathy. They approach challenges with determination and support others through their kindness and support. This person emphasizes the importance of continuous learning and personal growth, always striving to make a passive impact on those around them. ability to lea- na makikita ng pasip na impact na ang mga ngayon sa mga nangyari. Pagkainan na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na ngayon na

Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation.


 हास्पेक्�ए अपके लि hospital is available. Thank you.
2.0783095359802246
audio_39
 Um, the playground, it looks very, it depends, it could look very big or very small. It just depends on the type of playground and the amount of people, the demographic of the surrounding location, but, uh, it looks very childish. There's very bright colors, um, lots of metal, right? Because, uh, it has to be structured very well because people are going to be climbing on top of it, around it, beside it, every kind of preposition that you want to add to it. Are there any particular games or activities? Well, there's your typical ones, basketball, soccer, there might be some tennis, they're playing tag. There was a specific game when I was a kid that you tie this ball to your ankle and you just, you do a jump rope-esque activity where you hop on that ankle while the ball while the while the other leg that is a that you
3.5389626026153564
audio_43
 My favorite moment with my family is back in 2022 it was m

Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation.


 अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने अपने आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको
3.027499198913574
audio_37
 Imagine a vibrant market place in Manila, bustling with activities, vendors with colorful stalls, or shouting out with their wares, voicely, lively. The air is filled with enticing aroma of spices, fresh produce, and sizzling food. food should your eyes meet with excitement you can navigate through the market and carefully examine everything from the handmade crops to exotic delicacies that the merchants are currently selling to you. I know it is sensory overload but the market offers a lot of variety that you could visit.
3.1602180004119873
audio_47
 India is having every year floods basically

 The one of the most memorable moment, memorable day which I spend with my family was the day when my newborn baby sister was born every day. Everybody in my family was just overjoyed with the birth of my little sister. Then there was fully joy and happy and their surrounding my family was just amazing. I was also a member of the family so I was also... मैं आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आपको आ�
2.799330472946167
audio_32_1
 Ok, my best day of life is by achieving a good marks in the 10th class exam, it is my best day and special day to me and also it is my special day to my parents also. In this, I think when I was writing the exam, it was difficult to me to wri

 Hospital was one of the most important places for all persons because when we got an emergency situation, we want to go to the hospital. For example, if accident or something, injury, we want to go for hospital and the hospital is very important because they are very rushed because of people who suffered injury or any diseases, many rushes are in hospital. We want to ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo

 Ok, my best day of life is by achieving good marks in 10th to class examinations. It is my best day and special day to me and also it is my special day to my parents also. In this, I think when I was writing exams, it was difficult to me to write exams. I cannot feel that it was a bad experience to me. While writing exams, I can think that I want to get a good marks. but after getting the result, I will try it........................... was sick but after getting the result I felt very good by getting a good mox which was which was very very hopeful to me and also very pleasant to me while getting a good mox and my parents also very happy very enjoy which was getting my good mox and my friends also which was very pleasant, I mean, getting a good most, I mean, it was a good feel to లిసింది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది ముంది మ

 It's a pleasant day in the morning. I went to the supermarket to buy some groceries. I was surprised by seeing the crowd at the market. It's a Halloween sale at the market. The people are very excited to buy the groceries, candies mainly. Finally, the Halloween festival is about, mainly about time. మాల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్లల్ల్లల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్లల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్ల్� நான் இங்கே பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க

 మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాలిసిక మాల్ల్ల్ల్ల్ల్ల్ల్లో మాల్ల్ల్ల్ల్లో మాల్ల్ల్ల్లో మాల్ల్ల్ల్లో మాల్ల్ల్ల్లో మాల్ల్ల్లో మాల్ల్ల్లో మాల్ల్ల్లో మాల్ల్ల్లో మాల్ల్ల్లో మాల్ల్ల్లో మాల్ల్ల్లో మాల్ల్ల్లో మాల్ల్ల్లో మాల్ల్ల్లో మాల్ల్ల్లో మాల్ల్ల్లో మాల్ల్ల్లో మాల్ల్లో మాల్ நான் இங்கே பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று நான் பார்க்கலாம் என்று 

 When you go to the airport, the staff is very friendly and when you go to the inside of the airport, there are many foreigners and family are sad because their loved ones are leaving out of the country.
2.4749131202697754
audio_70


 My favorite place is in Andhra Pradesh. It is in Chittoor district. It is the temple of Lord Venkateshwara Swamy. It has 7 hills to visit the temple. It is a good place to visit. ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిలు ముందిరిమానా లాడి ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమానా ఇదిరిమా
2.5732250213623047
audio_74
 A crowded market is easily spotted when you have a lot of small establishments where they sell a lot of things, different things, and they will have food, they will have clothes for sale, they will have a lot of games and cell phones to sell, as well as they will have a lot of people going there to purchase things for chea

In [None]:
test_df.to_csv("output4.csv", index=False)
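Several transcripts in the output above end in repetition loops: the same word or phrase repeated many times, or one character repeated hundreds of times. Such rows could distort a text-based grammar score. The sketch below shows one possible way to flag and trim them before the saved CSV is used. It is not part of the original notebook: the helper name `trim_repetition_loop` and its thresholds are hypothetical and would need tuning against the real transcripts.

```python
import re


def trim_repetition_loop(text, min_repeats=5, max_phrase_words=8, min_char_run=20):
    """Cut a transcript where a repetition loop starts (heuristic sketch).

    Returns (text, was_trimmed). All thresholds are illustrative assumptions.
    """
    trimmed = False

    # 1) One character repeated many times in a row, e.g. "ooooooooo...".
    run = re.search(r"(\S)\1{%d,}" % (min_char_run - 1), text)
    if run:
        text = text[:run.start()].rstrip()
        trimmed = True

    # 2) A phrase of 1..max_phrase_words words repeated back-to-back.
    words = text.split()
    for start in range(len(words)):
        for size in range(1, max_phrase_words + 1):
            phrase = words[start:start + size]
            if len(phrase) < size:
                break  # not enough words left for a phrase of this size
            repeats, pos = 1, start + size
            while words[pos:pos + size] == phrase:
                repeats += 1
                pos += size
            if repeats >= min_repeats:
                # Keep only the words before the loop begins.
                return " ".join(words[:start]), True

    return text, trimmed
```

Assuming the transcripts live in a column of `test_df` (the column name is not shown in this section), the helper could be applied with `Series.map` and the returned flag used to review or drop affected rows. A transcript that is a loop from its first word comes back as an empty string, which marks it as unusable rather than silently keeping it.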