## Resources
- https://musicinformationretrieval.com/
- https://towardsdatascience.com/music-genre-classification-with-python-c714d032f0d8

In [None]:
!pip show librosa

In [None]:
import librosa

In [None]:
!ls 

In [None]:
!ls -la royalty-free-music

In [None]:
audio_path = 'royalty-free-music/bensound-dubstep.mp3'

In [None]:
music_array , sample_rate = librosa.load(audio_path)

In [None]:
print(type(music_array), type(sample_rate))

In [None]:
print(music_array.shape, sample_rate)

### Note: `librosa.load` returns the audio time series as a NumPy array, resampled by default to 22,050 Hz (about 22 kHz) and mixed down to mono

In [None]:
music_array2 , sample_rate2 = librosa.load(audio_path, sr=44100)

In [None]:
print(music_array2.shape, sample_rate2)

In [None]:
music_array3 , sample_rate_none = librosa.load(audio_path, sr=None)

In [None]:
print(music_array3.shape, sample_rate_none)

## What is Sample Rate:
- The sample rate is the number of samples captured per second of audio, measured in Hz or kHz (for example, 44,100 Hz = 44.1 kHz).
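
To make the definition concrete, the sketch below relates sample count, sample rate, and duration. The helper name `duration_seconds` and the 10-second clip length are illustrative assumptions, not part of librosa.

```python
def duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Return the clip length in seconds for a mono signal."""
    return num_samples / sample_rate


# A hypothetical 10-second clip holds twice as many samples at 44.1 kHz as at 22.05 kHz.
samples_at_22k = 10 * 22050
samples_at_44k = 10 * 44100

print(duration_seconds(samples_at_22k, 22050))
print(duration_seconds(samples_at_44k, 44100))
```

In this notebook the same calculation would be `music_array.shape[0] / sample_rate`, which is why the arrays loaded at different `sr` values have different lengths for the same clip.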

# Playing audio in Jupyter Notebook

In [None]:
import IPython.display as ipd
ipd.Audio(audio_path)

# Audio Signals or Waveform visualization

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import librosa.display

In [None]:
plt.figure(figsize=(15, 4), facecolor=(.9, .9, .9))
librosa.display.waveshow(music_array2, sr=sample_rate2, color='pink')

# Spectrogram Visualization
- A spectrogram is a visual representation of the spectrum of frequencies of sound or other signals as they vary with time. 
- Spectrograms are sometimes called sonographs, voiceprints, or voicegrams. 
- When the data is represented in a 3D plot, they may be called waterfalls. 
- In the 2-D array returned by `librosa.stft`, the first axis is frequency and the second axis is time.
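
As a rough sketch of what that 2-D array looks like, the helpers below estimate the shape of a centered STFT and the frequency each row represents. The function names are hypothetical; the values `n_fft=2048` and `hop_length=512` are assumed to match the `librosa.stft` defaults, and the frame count assumes `center=True`.

```python
def stft_shape(num_samples, n_fft=2048, hop_length=512):
    """Expected (frequency_bins, frames) of a centered STFT."""
    frequency_bins = 1 + n_fft // 2          # bins from 0 Hz up to the Nyquist frequency
    frames = 1 + num_samples // hop_length   # one frame every hop_length samples
    return frequency_bins, frames


def bin_frequency(bin_index, sample_rate, n_fft=2048):
    """Center frequency in Hz of a given STFT row."""
    return bin_index * sample_rate / n_fft


# One second of audio at 44.1 kHz (an illustrative length, not the notebook's clip)
shape = stft_shape(44100)
top_frequency = bin_frequency(shape[0] - 1, 44100)
print(shape, top_frequency)
```

Comparing `stft_shape(len(music_array2))` with `X.shape` is a quick sanity check that the axes are in the order described above.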

In [None]:
X = librosa.stft(music_array2)
Xdb = librosa.amplitude_to_db(abs(X))

In [None]:
plt.figure(figsize=(14, 5))
librosa.display.specshow(Xdb, sr=sample_rate2, x_axis='time', y_axis='hz')
plt.colorbar()

## The graph
- The vertical axis shows frequency, from 0 Hz up to half the sample rate (about 22 kHz at a 44.1 kHz sample rate), and the horizontal axis shows time within the clip.
- Since most of the energy sits at the bottom of the spectrum, we can switch the frequency axis to a logarithmic scale to see it more clearly.

In [None]:
print(music_array2.shape, sample_rate2)

In [None]:
librosa.display.specshow(Xdb, sr=sample_rate2, x_axis='time', y_axis='log')
plt.colorbar()

# Run the default beat tracker

In [None]:
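import numpy as np
tempo, beat_frames = librosa.beat.beat_track(y=music_array2, sr=sample_rate2)
# Newer librosa versions may return tempo as a one-element array, so reduce it to a float first
print('Estimated tempo: {:.2f} beats per minute'.format(float(np.atleast_1d(tempo)[0])))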

## Frames:
- Frames here correspond to short windows of the signal (y), each separated by hop_length = 512 samples. 
- librosa uses centered frames, so that the kth frame is centered around sample k * hop_length.
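
The frame-to-time relationship described above can be sketched in a few lines. This is a simplified stand-in for what `librosa.frames_to_time` computes, assuming `hop_length=512`; the function name `frame_to_seconds` and the example frame indices are made up for illustration.

```python
def frame_to_seconds(frame_index, sample_rate, hop_length=512):
    """Time in seconds of the sample at the center of the given frame."""
    center_sample = frame_index * hop_length
    return center_sample / sample_rate


# Hypothetical frame indices, converted at a 44.1 kHz sample rate
example_frames = [0, 43, 86]
example_times = [frame_to_seconds(f, 44100) for f in example_frames]
print(example_times)
```

Passing the wrong `sample_rate` here shifts every timestamp, which is why the conversion should use the same rate the audio was loaded with.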

In [None]:
# Convert the frame indices of beat events into timestamps
beat_times = librosa.frames_to_time(beat_frames, sr=sample_rate2)

In [None]:
beat_times.shape

In [None]:
beat_times

In [None]:
# Got help from here - https://musicinformationretrieval.com/beat_tracking.html
plt.figure(figsize=(14, 5))
# Pass the sample rate so the time axis lines up with beat_times
librosa.display.waveshow(music_array2, sr=sample_rate2, alpha=0.1)
plt.vlines(beat_times, -1, 1, color='r')
plt.ylim(-1, 1)

# Saving to the local disk

In [None]:
import soundfile as sf

In [None]:
# Write the array loaded earlier with its matching sample rate
sf.write('temp_file_x.wav', music_array, samplerate=sample_rate)

In [None]:
!ls

In [None]:
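# Resample to 48 kHz first so the file plays back at the original speed
sf.write('temp_file_48000.wav', librosa.resample(music_array, orig_sr=sample_rate, target_sr=48000), 48000, 'PCM_24')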

In [None]:
temp_x, temp_sr = sf.read('temp_file_48000.wav')
print(temp_x.shape, temp_sr)
ipd.Audio(temp_x, rate=temp_sr) 