# Converting MP3 to WAV
![Mp3 to Wav](https://www.videoconverterfactory.com/tips/imgs-sns/mp3-to-wav.webp)


Our Automatic Speech Recognition (ASR) models, including Whisper and FineTune, need the audio in WAV format, so the MP3 clips must be converted first. The conversion is required because Kaggle does not support the MP3 audio format; once the clips are WAV files, they can be processed and analyzed on the platform directly.


# Loading Libraries

In [1]:
import os

from joblib import Parallel, delayed
from pydub import AudioSegment
from tqdm.notebook import tqdm



In [2]:
!mkdir -p /tmp/CV15_ASR_dataset

In [3]:
!ls /tmp

CV15_ASR_dataset  tmptc1ro1my.json
clean-layer.sh	  v8-compile-cache-0
conda		  yarn--1687755480762-0.013045315151684056
core-js-banners   yarn--1687755481914-0.6939425404263324
hsperfdata_root   yarn--1687755488161-0.8879838407104355
kaggle.log	  yarn--1687755659204-0.9573163719512416
openmpi		  yarn--1687755660332-0.006012616638098267
package_list	  yarn--1687755666484-0.773890063104473


In [4]:
data = '''{
  "title": "ASR_CV15_Hindi_wav_16000",
  "id": "SakshiRathi77/ASR_CV15_Hindi_wav_16000",
  "licenses": [
    {
      "name": "CC0-1.0"
    }
  ]
}
'''
# Write the Kaggle dataset metadata file; the with-block closes the file even on error.
with open("/tmp/CV15_ASR_dataset/dataset-metadata.json", "w") as text_file:
    text_file.write(data)
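
The same metadata can also be built from a Python dictionary and serialized with the standard json module, which avoids quoting mistakes in a hand-written string. This is only a minimal sketch: the helper name write_dataset_metadata and its parameters are illustrative and not part of the notebook.

```python
import json
import os


def write_dataset_metadata(folder, title, dataset_id, license_name="CC0-1.0"):
    """Write dataset-metadata.json into `folder` and return its path.

    Hypothetical helper: the field layout mirrors the JSON string above.
    """
    metadata = {
        "title": title,
        "id": dataset_id,
        "licenses": [{"name": license_name}],
    }
    # Create the target folder if it does not exist yet.
    os.makedirs(folder, exist_ok=True)
    out_path = os.path.join(folder, "dataset-metadata.json")
    with open(out_path, "w") as f:
        json.dump(metadata, f, indent=2)
    return out_path
```

Calling it with "/tmp/CV15_ASR_dataset", the title, and the dataset id used above should produce a file with the same three fields.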

# Providing root path and output path

In [5]:

ROOT_PATH = "/kaggle/input/cv15-hindi/hi/hi/clips"
OUTPUT_DIR = "/tmp/CV15_ASR_dataset/audio_wav_16000"

In [6]:
# exist_ok=True lets this cell be re-run without raising FileExistsError.
os.makedirs(OUTPUT_DIR, exist_ok=True)

## Converting and Downsampling
The save_fn function converts a single MP3 file to WAV, sets the frame rate to 16000 Hz, and saves the result in the output directory. If a file cannot be converted, its path is printed and processing continues.

In [7]:
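def save_fn(filename):
    """Convert one MP3 clip to a 16 kHz WAV file in OUTPUT_DIR."""
    path = os.path.join(ROOT_PATH, filename)
    # Make sure the output directory exists (safe to call repeatedly).
    os.makedirs(OUTPUT_DIR, exist_ok=True)

    if os.path.exists(path):
        try:
            sound = AudioSegment.from_mp3(path)
            # Resample to 16000 Hz.
            sound = sound.set_frame_rate(16000)
            # Swap the extension rather than assuming it is 4 characters long.
            stem = os.path.splitext(filename)[0]
            sound.export(os.path.join(OUTPUT_DIR, f"{stem}.wav"), format="wav")
        except Exception as err:
            # Report the failing file and keep processing the rest.
            print(f"{path}: {err}")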
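
To confirm that a converted file really has the expected frame rate, its header can be read with Python's standard wave module. The sketch below is an addition for illustration; the function name wav_properties is hypothetical, and it assumes the file is an uncompressed PCM WAV.

```python
import wave


def wav_properties(path):
    """Return (channels, sample_width_bytes, frame_rate, duration_seconds)."""
    with wave.open(path, "rb") as wav_file:
        frame_rate = wav_file.getframerate()
        # Duration is the number of frames divided by frames per second.
        duration = wav_file.getnframes() / frame_rate
        return (
            wav_file.getnchannels(),
            wav_file.getsampwidth(),
            frame_rate,
            duration,
        )
```

For any file written by save_fn, the third element of the returned tuple should be 16000.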

In [8]:
# List the clips in the input directory defined in ROOT_PATH.
audio_files = os.listdir(ROOT_PATH)

In [9]:
%%capture
Parallel(n_jobs=8, backend="multiprocessing")(
    delayed(save_fn)(filename) for filename in tqdm(audio_files)
)
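
Because the cell above runs under %%capture, any paths printed for failed conversions may not be visible. One way to check the result afterwards is to compare the input and output file names. The following is a minimal sketch; find_unconverted is a hypothetical helper, not part of the notebook.

```python
import os


def find_unconverted(source_names, output_names):
    """Return source file names that have no matching .wav in output_names."""
    converted = {
        os.path.splitext(name)[0]
        for name in output_names
        if name.lower().endswith(".wav")
    }
    return sorted(
        name for name in source_names
        if os.path.splitext(name)[0] not in converted
    )
```

In this notebook it could be called as find_unconverted(os.listdir(ROOT_PATH), os.listdir(OUTPUT_DIR)); an empty list would mean every clip has a WAV counterpart.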

# Result stored in zip file

In [10]:
%%capture
!zip -r "./audio_wav_16000.zip" "/tmp/CV15_ASR_dataset/audio_wav_16000/"
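
The archive can also be created from Python with shutil.make_archive instead of the zip command. This is a sketch of an alternative, not the approach used in this notebook; zip_directory is a hypothetical helper, and it stores entries relative to the parent of the zipped folder.

```python
import os
import shutil


def zip_directory(source_dir, archive_base):
    """Zip `source_dir` into `archive_base`.zip and return the archive path.

    Entries are stored relative to the parent of `source_dir`, so the
    archive contains one top-level folder named after that directory.
    """
    source_dir = os.path.abspath(source_dir)
    return shutil.make_archive(
        archive_base,
        "zip",
        root_dir=os.path.dirname(source_dir),
        base_dir=os.path.basename(source_dir),
    )
```

For example, zip_directory(OUTPUT_DIR, "/kaggle/working/audio_wav_16000") would write audio_wav_16000.zip, assuming /kaggle/working is the desired output location.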