# Standard Sample Rate

* What is a standard sample rate?

In audio production, the sample rate (or "sampling rate") is the number of times per second that a sound is sampled, that is, the number of samples per second in a digital recording. The standard sample rate for audio CDs is 44.1 kilohertz (44,100 hertz).
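As a quick illustration of the definition, the sketch below converts a duration into a sample count. The helper name `num_samples` and the constant `CD_SAMPLE_RATE` are made up for this example and are not part of any library.

```python
CD_SAMPLE_RATE = 44_100  # samples per second (Hz), the audio CD standard

def num_samples(duration_seconds: float, sample_rate: int = CD_SAMPLE_RATE) -> int:
    """Number of samples needed to store `duration_seconds` of mono audio."""
    return round(duration_seconds * sample_rate)

print(num_samples(1.0))          # 44100
print(num_samples(2.5))          # 110250
print(num_samples(1.0, 48_000))  # 48000
```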

* Why should we resample?

Some of the audio files were recorded at a rate other than 44,100 Hz. For example, at 48,000 Hz one second of audio is an array of 48,000 samples, whereas at 44,100 Hz the same second is an array of only 44,100 samples. We must therefore convert all audio to the same sampling rate, so that a given duration always corresponds to the same number of samples.
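The mismatch can be reproduced with synthetic data. The sketch below is only an illustration: it uses `scipy.signal.resample_poly` rather than the librosa call used later in this notebook, and the helper `resample_to_target` and the test tones are assumptions made for the example.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly

TARGET_RATE = 44_100  # Hz

def resample_to_target(audio, orig_rate, target_rate=TARGET_RATE):
    """Resample a 1-D signal from orig_rate to target_rate (polyphase filtering)."""
    if orig_rate == target_rate:
        return audio
    g = gcd(orig_rate, target_rate)
    # 48000 -> 44100 reduces to upsampling by 147 and downsampling by 160
    return resample_poly(audio, target_rate // g, orig_rate // g)

# One second of a 440 Hz tone, sampled at two different rates
tone_48k = np.sin(2 * np.pi * 440 * np.arange(48_000) / 48_000)
tone_44k = np.sin(2 * np.pi * 440 * np.arange(44_100) / 44_100)
print(tone_48k.shape, tone_44k.shape)  # (48000,) (44100,)

resampled = resample_to_target(tone_48k, 48_000)
print(resampled.shape)  # (44100,)
```

After resampling, both one-second clips have the same number of samples.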

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
import os
os.chdir("drive/My Drive/AMHARIC")
os.listdir()

['README.md', 'data', 'kaldi-script', 'lang', 'lm', 'meta_data.csv']

In [2]:
import librosa  # audio loading and resampling
import IPython.display as ipd  # in-notebook audio playback
import matplotlib.pyplot as plt
import numpy as np
from scipy.io import wavfile  # reading and writing WAV files
# scikit-learn utilities for the preprocessing pipeline
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer
import warnings
warnings.filterwarnings("ignore")  # silence library warnings in the notebook output

* Create metadata that maps each transcript to the path of its audio file, so that the audio can be loaded and resampled to 44.1 kHz

In [None]:
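import pandas as pd

# The parsing below assumes each line has the form: <s> transcript </s> (file_name)
meta_data = pd.read_csv("data/train/trsTrain.txt", sep="\t", header=None)

def create_meta_data(df: pd.DataFrame, column1: str, column2: str) -> pd.DataFrame:
    """Split each raw line into a transcript (column1) and a WAV path (column2)."""
    df = df.rename(columns={0: column1})  # renamed copy; the caller's frame is left untouched
    # The text after </s> is the file name in parentheses; turn it into a path
    df[column2] = df[column1].apply(lambda x: x.split("</s>")[1].replace("(", "").replace(")", "").strip())
    df[column2] = df[column2].apply(lambda x: "data/train/wav/" + x + ".wav")
    # The text between <s> and </s> is the transcript
    df[column1] = df[column1].apply(lambda x: x.split("</s>")[0].split("<s>")[1].strip())
    return df

pipe = Pipeline(steps=[("metadata", FunctionTransformer(create_meta_data, kw_args={"column1": "Transcript", "column2": "audio"}))])
meta_data = pipe.fit_transform(meta_data)  # keep the transformed frame for the resampling step
meta_data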
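
To see what the parsing does to a single line, here is a minimal sketch. The sample line and the file name `utt_0001` are hypothetical, and `parse_line` is a helper written only for this illustration; it assumes the `<s> transcript </s> (file_name)` layout that the splitting logic above implies.

```python
def parse_line(line: str, wav_dir: str = "data/train/wav"):
    """Split one transcript line into (transcript, wav_path)."""
    transcript_part, name_part = line.split("</s>")
    transcript = transcript_part.split("<s>")[1].strip()
    file_name = name_part.replace("(", "").replace(")", "").strip()
    return transcript, f"{wav_dir}/{file_name}.wav"

example = "<s> hello world </s> (utt_0001)"  # hypothetical line, not taken from the dataset
print(parse_line(example))  # ('hello world', 'data/train/wav/utt_0001.wav')
```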

## Resampling

In [None]:
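# Load every audio file, resampling it to 44.1 kHz while loading
def resample(df, column, target_rate=44100):
    sampled_audio = []
    rates = []
    for path in df[column]:
        # librosa.load resamples to `sr` and converts to mono by default
        audio, rate = librosa.load(path, sr=target_rate)
        sampled_audio.append(audio)
        rates.append(rate)
    return sampled_audio, rates

# Assign the result so the notebook does not print every array
sampled_audio, rates = resample(meta_data, 'audio')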
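
A simple sanity check on the output could look like the sketch below. The helper `clip_durations` is hypothetical, and the synthetic arrays stand in for the lists returned by the resampling step, so that the example runs without the dataset.

```python
import numpy as np

TARGET_RATE = 44_100  # Hz

def clip_durations(sampled_audio, rates, target_rate=TARGET_RATE):
    """Return each clip's duration in seconds, after checking that every rate matches."""
    mismatched = [r for r in rates if r != target_rate]
    if mismatched:
        raise ValueError(f"unexpected sample rates: {sorted(set(mismatched))}")
    return [len(audio) / target_rate for audio in sampled_audio]

# Synthetic stand-ins for the resampled audio and its rates
fake_audio = [np.zeros(44_100, dtype=np.float32), np.zeros(22_050, dtype=np.float32)]
fake_rates = [44_100, 44_100]

durations = clip_durations(fake_audio, fake_rates)
print(durations)  # [1.0, 0.5]
```

Clips of different durations still produce arrays of different lengths; resampling only guarantees that every clip uses the same number of samples per second.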
