# Audio Editor
## This notebook outlines techniques for editing an audio signal by removing long silences

### Import the libraries

In [32]:
from pydub import AudioSegment
import numpy as np

### Read MP3 file using PyDub

In [33]:
audiofile = AudioSegment.from_file("obama.mp3")

In [34]:
audiofile

### Convert the samples into a NumPy array

In [35]:
data_mp3 = np.array(audiofile.get_array_of_samples())

In [36]:
data_mp3.shape

(2744064,)

### Get the frame rate

In [37]:
fs_mp3 = audiofile.frame_rate

In [38]:
fs_mp3

44100

### Find the signal duration

In [39]:
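# Samples are interleaved across channels, so divide by the channel count as well as the frame rate
print(f'Signal Duration = {data_mp3.shape[0] / (audiofile.channels * fs_mp3)} seconds')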

Signal Duration = 31.111836734693878 seconds
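
The division by the number of channels is needed because pydub's `get_array_of_samples()` returns stereo samples interleaved as `[L0, R0, L1, R1, ...]`. The sketch below illustrates this on a synthetic array rather than on `obama.mp3`; the names `interleaved`, `n_channels`, and `frames` are made up for the example.

```python
import numpy as np

fs = 44100          # frame rate in Hz (same value as fs_mp3 above)
n_channels = 2      # assumed stereo; pydub exposes this as audiofile.channels
n_frames = 3 * fs   # three seconds of audio

# Synthetic stand-in for get_array_of_samples(): left = 100, right = -100
interleaved = np.tile(np.array([100, -100], dtype=np.int16), n_frames)

# De-interleave into one row per frame and one column per channel
frames = interleaved.reshape(-1, n_channels)

duration_s = frames.shape[0] / fs
print(frames.shape, duration_s)
```

After the reshape, the array has the same `(frames, channels)` layout that `scipy.io.wavfile.read` returns for a stereo WAV file.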


### Read a WAV file
- Use SciPy's `wavfile` module

In [40]:
from scipy.io import wavfile
fs_wav, data_wav = wavfile.read("obama.wav")

### Get the frame rate

In [41]:
fs_wav

44100

In [42]:
data_wav.shape

(1372032, 2)

### Get the duration of the signal

In [43]:
print(f'Signal Duration = {data_wav.shape[0] / fs_wav} seconds')

Signal Duration = 31.111836734693878 seconds


### Normalize the signal

In [46]:
# Scale the samples to [-1, 1); dividing by 2**15 assumes 16-bit PCM (int16) data
data_wav_norm = data_wav / (2**15)
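
Dividing by `2**15` is only correct for 16-bit samples. A minimal sketch of a normalization that reads the full-scale value from the array's dtype is shown below; `normalize_pcm` is a hypothetical helper, not part of SciPy or pydub, and it only handles signed integer PCM (8-bit WAV data is unsigned and would also need re-centering).

```python
import numpy as np

def normalize_pcm(samples):
    """Scale signed integer PCM samples to floats in [-1.0, 1.0)."""
    if not np.issubdtype(samples.dtype, np.integer):
        # Assume floating-point input is already normalized
        return samples.astype(np.float64)
    # Full scale is 2**15 for int16, 2**31 for int32
    full_scale = float(np.iinfo(samples.dtype).max) + 1.0
    return samples / full_scale

x = np.array([-32768, 0, 16384, 32767], dtype=np.int16)
y = normalize_pcm(x)
print(y)
```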

### Length of the signal

In [47]:
signal_len = len(data_wav_norm)
signal_len

1372032

### Set the segment size in seconds

In [48]:
segment_size_t = 1

### Segment size in samples

In [49]:
segment_size = segment_size_t * fs_wav

### Split audio signal into 1 second segments

In [50]:
# The last segment is shorter than the others, so store the segments in an object array
segments = np.array([data_wav_norm[x:x + segment_size] for x in
                     np.arange(0, signal_len, segment_size)], dtype=object)


### How many segments?

In [51]:
segments.shape[0]

32
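
The count of 32 follows from the signal length: 31 full one-second segments plus a shorter final one. The sketch below reproduces the arithmetic and the split on a dummy mono signal of the same length; `dummy` and `chunks` are illustrative names, and keeping the segments in a plain list is one way to avoid mixing unequal lengths in a single array.

```python
import math
import numpy as np

signal_len = 1372032   # frames in the WAV file (from data_wav.shape above)
segment_size = 44100   # 1 second at 44.1 kHz

n_segments = math.ceil(signal_len / segment_size)
last_len = signal_len - (n_segments - 1) * segment_size
print(n_segments, last_len)  # 32 4932

# Same split on a dummy signal, kept as a plain Python list of arrays
dummy = np.zeros(signal_len)
chunks = [dummy[i:i + segment_size] for i in range(0, signal_len, segment_size)]
```

The final segment holds 4932 frames, roughly 0.11 seconds of audio.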

### Save each segment in a separate file

In [52]:
for iS, s in enumerate(segments):
    start_t, end_t = segment_size_t * iS, segment_size_t * (iS + 1)
    wavfile.write(f"obama_segment_{start_t:d}_{end_t:d}.wav", fs_wav, s)

### Remove pauses using an energy threshold = 50% of the median energy

In [53]:
energies = [(s**2).sum() / len(s) for s in segments]

### Set the threshold to 50% of the median energy

In [54]:
thres = 0.5 * np.median(energies)

### Collect the indices of the segments whose energy is above this threshold

In [55]:
index_of_segments_to_keep = np.where(np.asarray(energies) > thres)[0]

### Get segments that have energies higher than the threshold

In [56]:
segments2 = segments[index_of_segments_to_keep]

### Concatenate the kept segments into a single signal

In [57]:
new_signal = np.concatenate(segments2)
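
The segmentation, energy, and thresholding steps above can be gathered into one function. The following is a sketch under assumptions rather than the notebook's exact code: `remove_low_energy_segments` and its parameters `segment_s` and `ratio` are hypothetical names, and the toy signal uses a 100 Hz frame rate so that the result is easy to inspect.

```python
import numpy as np

def remove_low_energy_segments(signal, fs, segment_s=1.0, ratio=0.5):
    """Drop fixed-length segments whose mean energy is not above ratio * median."""
    size = int(segment_s * fs)
    chunks = [signal[i:i + size] for i in range(0, len(signal), size)]
    # Mean energy per segment: sum of squares divided by the number of frames
    energies = np.array([np.sum(c.astype(np.float64) ** 2) / len(c) for c in chunks])
    keep = energies > ratio * np.median(energies)
    kept = [c for c, k in zip(chunks, keep) if k]
    return np.concatenate(kept), keep

# Toy signal: five one-second segments at fs = 100 Hz; segments 1 and 3 are near-silent
fs = 100
t = np.arange(fs) / fs
loud = 0.5 * np.sin(2 * np.pi * 5 * t)
quiet = 0.001 * np.sin(2 * np.pi * 5 * t)
toy = np.concatenate([loud, quiet, loud, quiet, loud])

cleaned, keep = remove_low_energy_segments(toy, fs)
print(keep, len(cleaned))
```

Because the threshold is relative to the median, the method assumes that more than half of the segments contain speech; if most of the recording is silence, the median itself is low and few segments are removed.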

### Write the file

In [58]:
wavfile.write("obama_processed.wav", fs_wav, new_signal)
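
Since the signal was normalized to floating point, `wavfile.write` stores the result as a floating-point WAV file, which some players may not open. If a 16-bit file is preferred, the signal can be scaled back first. The helper `to_int16` below is a hypothetical sketch that assumes the original audio was 16-bit PCM.

```python
import numpy as np

def to_int16(signal):
    """Convert floats in [-1.0, 1.0) back to 16-bit PCM, clipping any overshoot."""
    scaled = np.round(signal * 2**15)
    return np.clip(scaled, -2**15, 2**15 - 1).astype(np.int16)

x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
pcm = to_int16(x)
print(pcm)  # values: -32768, -16384, 0, 16384, 32767
```

In this notebook the call would then read `wavfile.write("obama_processed.wav", fs_wav, to_int16(new_signal))`.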

## ASSIGNMENT: Take any MP3 file of a speech of your choice that contains pauses, and apply the techniques above to remove them. Become an audio editor!
- Submit the MP3 file of the original speech, the notebook file, and the MP3 file of the edited speech