### Build the feature set

### Index
1. [Importing the libraries](#import)
1. [Constants](#constants)
1. [Opening the audio files](#open_files)
1. [`onset_strength` output](#onset_strength)
1. [MFCC output](#mfcc)
1. [Spectral centroid output](#spectral_centroid)
1. [RMS output](#rms)
1. [ZCR output](#zcr)
1. [Mel spectrogram output](#mel_spec)
1. [Tempogram output](#tempo)
1. [Recurrence matrix output](#recurrence)

<a id="import"></a>
* Import the libraries

In [20]:
import librosa
import numpy as np

%load_ext autoreload
%autoreload 2
%run ./S0_util.ipynb

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


<a id="constants"></a>
* Constants

In [21]:
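SPEECH_PATH = "../data/speechs"
MUSIC_PATH = "../data/music"
GLOBO_PATH = "../data/globo"
SEGMENT_DURATION = 3  # seconds; note that the 431-frame SEG shapes printed below correspond to about 5 s at this sample rate and hop length
SAMPLE_RATE = 44100  # Hz
NUM_MFCC = 13  # defined here but not passed to the MFCC calls below, which return 20 coefficients
HOP_LENGTH = 512  # samples between consecutive frames
FRAME_LENGTH = 2048  # samples per analysis frame
SAMPLES_PER_SEGMENT = SAMPLE_RATE * SEGMENT_DURATION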

<a id="open_files"></a>
* Open one file of each type to check what each feature returns

In [22]:
# Music
music1_path = "../data/music/blues/blues.00000.wav"
music1, _ = librosa.load(music1_path, sr=SAMPLE_RATE)

# Soap opera
soap_intro1_path = "../data/globo/8701320.wav"
soap_intro1, _ = librosa.load(soap_intro1_path, sr=SAMPLE_RATE)

# Speech
speech1_path = "../data/speechs/sample-50.wav"
speech1, _ = librosa.load(speech1_path, sr=SAMPLE_RATE)
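
The cells that follow only inspect the first segment of each signal. As a hedged sketch of how a whole signal could be cut into fixed-length segments, the hypothetical helper `split_segments` below drops the trailing remainder and reshapes the rest. A synthetic array stands in for the loaded audio, and the 2-second duration is purely illustrative.

```python
import numpy as np

SAMPLE_RATE = 44100  # Hz, same value as the notebook constant


def split_segments(y, sr, seconds):
    """Cut a 1-D signal into equal segments, discarding the incomplete tail."""
    samples_per_segment = int(sr * seconds)
    n_segments = len(y) // samples_per_segment
    usable = y[: n_segments * samples_per_segment]
    return usable.reshape(n_segments, samples_per_segment)


# Synthetic stand-in for a loaded file: 10 s of silence plus a few extra samples.
y = np.zeros(10 * SAMPLE_RATE + 123, dtype=np.float32)
segments = split_segments(y, SAMPLE_RATE, seconds=2)
print(segments.shape)
```

Each row of `segments` can then be passed to a feature function in place of the `[0:SAMPLES_PER_SEGMENT]` slice.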

<a id="onset_strength"></a>
* Check the shape returned by `onset_strength`

> Music

In [23]:
onset = librosa.onset.onset_strength(y=music1, sr=SAMPLE_RATE)
print("ALL: {}".format(onset.shape))
onset = librosa.onset.onset_strength(y=music1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE)
print("SEG: {}".format(onset.shape))

ALL: (2586,)
SEG: (431,)


> Soap opera

In [67]:
onset = librosa.onset.onset_strength(y=soap_intro1, sr=SAMPLE_RATE)
print("ALL: {}".format(onset[np.newaxis, ...].shape))
onset = librosa.onset.onset_strength(y=soap_intro1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE)
print("SEG: {}".format(onset[np.newaxis, ...].shape))

ALL: (1, 5311)
SEG: (1, 431)


> Speech

In [68]:
onset = librosa.onset.onset_strength(y=speech1, sr=SAMPLE_RATE)
print("ALL: {}".format(onset[np.newaxis, ...].shape))
onset = librosa.onset.onset_strength(y=speech1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE)
print("SEG: {}".format(onset[np.newaxis, ...].shape))

ALL: (1, 1151)
SEG: (1, 431)
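
The frame counts printed so far follow from the hop length alone. As a minimal sketch, assuming librosa's default centered framing (`center=True`), a signal of `n` samples yields `1 + n // hop_length` frames. The helper name `expected_frames` is hypothetical and the durations are illustrative values, not read from the audio files.

```python
SAMPLE_RATE = 44100  # Hz
HOP_LENGTH = 512     # samples between consecutive frames


def expected_frames(n_samples, hop_length=HOP_LENGTH):
    """Frame count for centered framing: one frame per hop, plus the first one."""
    return 1 + n_samples // hop_length


for seconds in (3, 5):
    n = seconds * SAMPLE_RATE
    print(f"{seconds} s -> {n} samples -> {expected_frames(n)} frames")
```

Under this assumption, 431 frames correspond to roughly 5 seconds of audio at 44.1 kHz and 259 frames to 3 seconds, which gives a quick way to check which segment duration produced a given `SEG` shape.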


<a id="mfcc"></a>
* Check the shape returned by `MFCC`

> Music

In [26]:
mfcc = librosa.feature.mfcc(y=music1, sr=SAMPLE_RATE)
print("ALL: {}".format(mfcc.shape))
mfcc = librosa.feature.mfcc(y=music1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE)
print("SEG: {}".format(mfcc.shape))

ALL: (20, 2586)
SEG: (20, 431)


> Soap opera

In [27]:
mfcc = librosa.feature.mfcc(y=soap_intro1, sr=SAMPLE_RATE)
print("ALL: {}".format(mfcc.shape))
mfcc = librosa.feature.mfcc(y=soap_intro1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE)
print("SEG: {}".format(mfcc.shape))

ALL: (20, 5311)
SEG: (20, 431)


> Speech

In [28]:
mfcc = librosa.feature.mfcc(y=speech1, sr=SAMPLE_RATE)
print("ALL: {}".format(mfcc.shape))
mfcc = librosa.feature.mfcc(y=speech1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE)
print("SEG: {}".format(mfcc.shape))

ALL: (20, 1151)
SEG: (20, 431)


<a id="spectral_centroid"></a>
* Check the shape returned by `Spectral Centroid`

> Music

In [29]:
centroid = librosa.feature.spectral_centroid(y=music1 + 0.01, sr=SAMPLE_RATE)
print("ALL: {}".format(centroid.shape))
centroid = librosa.feature.spectral_centroid(y=music1[0:SAMPLES_PER_SEGMENT]+0.01, sr=SAMPLE_RATE)
print("SEG: {}".format(centroid.shape))

ALL: (1, 2586)
SEG: (1, 431)


> Soap opera

In [30]:
centroid = librosa.feature.spectral_centroid(y=soap_intro1 + 0.01, sr=SAMPLE_RATE)
print("ALL: {}".format(centroid.shape))
centroid = librosa.feature.spectral_centroid(y=soap_intro1[0:SAMPLES_PER_SEGMENT]+0.01, sr=SAMPLE_RATE)
print("SEG: {}".format(centroid.shape))

ALL: (1, 5311)
SEG: (1, 431)


> Speech

In [31]:
centroid = librosa.feature.spectral_centroid(y=speech1 + 0.01, sr=SAMPLE_RATE)
print("ALL: {}".format(centroid.shape))
centroid = librosa.feature.spectral_centroid(y=speech1[0:SAMPLES_PER_SEGMENT]+0.01, sr=SAMPLE_RATE)
print("SEG: {}".format(centroid.shape))

ALL: (1, 1151)
SEG: (1, 431)
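
Each column of the centroid output is a single frequency in Hz: the magnitude-weighted mean of the frame's spectrum. The sketch below computes that quantity for one frame with plain NumPy. It applies no window function, so it is an illustration of the definition rather than a reproduction of librosa's values; `frame_centroid` is a hypothetical helper and the test tone is synthetic.

```python
import numpy as np

SAMPLE_RATE = 44100   # Hz
FRAME_LENGTH = 2048   # samples per frame


def frame_centroid(frame, sr):
    """Magnitude-weighted mean frequency (Hz) of a single frame."""
    magnitude = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return float(np.sum(freqs * magnitude) / np.sum(magnitude))


# A sine placed exactly on FFT bin 100, so all of its energy falls in one bin.
bin_index = 100
tone_hz = bin_index * SAMPLE_RATE / FRAME_LENGTH
t = np.arange(FRAME_LENGTH) / SAMPLE_RATE
frame = np.sin(2 * np.pi * tone_hz * t)

centroid_hz = frame_centroid(frame, SAMPLE_RATE)
print(round(tone_hz, 2), round(centroid_hz, 2))
```

For a pure tone the centroid coincides with the tone frequency; for broadband signals it moves toward wherever most of the spectral energy sits.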


<a id="rms"></a>
* Check the shape returned by `RMS`

> Music

In [32]:
rms = librosa.feature.rms(y=music1, frame_length=FRAME_LENGTH, hop_length=HOP_LENGTH, center=True)
print("ALL: {}".format(rms.shape))
rms = librosa.feature.rms(y=music1[0:SAMPLES_PER_SEGMENT], frame_length=FRAME_LENGTH, hop_length=HOP_LENGTH, center=True)
print("SEG: {}".format(rms.shape))

ALL: (1, 2586)
SEG: (1, 431)


> Soap opera

In [33]:
rms = librosa.feature.rms(y=soap_intro1, frame_length=FRAME_LENGTH, hop_length=HOP_LENGTH, center=True)
print("ALL: {}".format(rms.shape))
rms = librosa.feature.rms(y=soap_intro1[0:SAMPLES_PER_SEGMENT], frame_length=FRAME_LENGTH, hop_length=HOP_LENGTH, center=True)
print("SEG: {}".format(rms.shape))

ALL: (1, 5311)
SEG: (1, 431)


> Speech

In [34]:
rms = librosa.feature.rms(y=speech1, frame_length=FRAME_LENGTH, hop_length=HOP_LENGTH, center=True)
print("ALL: {}".format(rms.shape))
rms = librosa.feature.rms(y=speech1[0:SAMPLES_PER_SEGMENT], frame_length=FRAME_LENGTH, hop_length=HOP_LENGTH, center=True)
print("SEG: {}".format(rms.shape))

ALL: (1, 1151)
SEG: (1, 431)
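
To make the `(1, n_frames)` shape concrete, here is a minimal NumPy sketch of frame-wise RMS with centered frames. It assumes zero padding at both ends, which may differ from the padding librosa applies in a given version, so edge values are not guaranteed to match. The helper `frame_rms` and the all-ones test signal are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

FRAME_LENGTH = 2048
HOP_LENGTH = 512


def frame_rms(y, frame_length=FRAME_LENGTH, hop_length=HOP_LENGTH):
    """Root-mean-square energy per frame, returned with shape (1, n_frames)."""
    padded = np.pad(y, frame_length // 2)  # zero padding centers the frames
    frames = sliding_window_view(padded, frame_length)[::hop_length]
    return np.sqrt(np.mean(frames ** 2, axis=1))[np.newaxis, :]


y = np.ones(220500)  # synthetic signal: 5 s of constant amplitude at 44.1 kHz
rms = frame_rms(y)
print(rms.shape)
print(rms[0, 0], rms[0, 200])  # edge frame (half padding) vs. interior frame
```

Interior frames of the constant signal have an RMS of 1, while the first frame is lower because half of it is padding.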


<a id="zcr"></a>
* Check the shape returned by `Zero Crossing Rate`

> Music

In [35]:
zcr = librosa.feature.zero_crossing_rate(y=music1 + 0.01)
print("ALL: {}".format(zcr.shape))
zcr = librosa.feature.zero_crossing_rate(y=music1[0:SAMPLES_PER_SEGMENT] + 0.01)
print("SEG: {}".format(zcr.shape))

ALL: (1, 2586)
SEG: (1, 431)


> Soap opera

In [36]:
zcr = librosa.feature.zero_crossing_rate(y=soap_intro1 + 0.01)
print("ALL: {}".format(zcr.shape))
zcr = librosa.feature.zero_crossing_rate(y=soap_intro1[0:SAMPLES_PER_SEGMENT] + 0.01)
print("SEG: {}".format(zcr.shape))

ALL: (1, 5311)
SEG: (1, 431)


> Speech

In [37]:
zcr = librosa.feature.zero_crossing_rate(y=speech1 + 0.01)
print("ALL: {}".format(zcr.shape))
zcr = librosa.feature.zero_crossing_rate(y=speech1[0:SAMPLES_PER_SEGMENT] + 0.01)
print("SEG: {}".format(zcr.shape))

ALL: (1, 1151)
SEG: (1, 431)
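
The calls above add `0.01` to the signal before measuring zero crossings. The notebook does not say why; a plausible reason is that near-silent passages fluctuate around zero and would otherwise register a high crossing rate. The sketch below illustrates that effect on synthetic low-level noise. The helper `zero_crossing_rate` is a simplified stand-in, not librosa's implementation.

```python
import numpy as np


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frame)
    return float(np.mean(signs[1:] != signs[:-1]))


rng = np.random.default_rng(0)
near_silence = rng.uniform(-0.001, 0.001, size=2048)  # synthetic low-level noise

print(zero_crossing_rate(near_silence))         # many spurious crossings
print(zero_crossing_rate(near_silence + 0.01))  # offset lifts the noise above zero
```

The trade-off is that the offset also hides genuine crossings of any signal quieter than the offset itself.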


<a id="mel_spec"></a>
* Check the shape returned by `Mel Spectrogram`

> Music

In [38]:
mel = librosa.feature.melspectrogram(y=music1, sr=SAMPLE_RATE, n_mels=20, fmax=10000, n_fft=FRAME_LENGTH, hop_length=HOP_LENGTH)
print("ALL: {}".format(mel.shape))
mel = librosa.feature.melspectrogram(y=music1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE, n_mels=20, fmax=10000, n_fft=FRAME_LENGTH, hop_length=HOP_LENGTH)
print("SEG: {}".format(mel.shape))

ALL: (20, 2586)
SEG: (20, 431)


> Soap opera

In [39]:
mel = librosa.feature.melspectrogram(y=soap_intro1, sr=SAMPLE_RATE, n_mels=20, fmax=10000, n_fft=FRAME_LENGTH, hop_length=HOP_LENGTH)
print("ALL: {}".format(mel.shape))
mel = librosa.feature.melspectrogram(y=soap_intro1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE, n_mels=20, fmax=10000, n_fft=FRAME_LENGTH, hop_length=HOP_LENGTH)
print("SEG: {}".format(mel.shape))

ALL: (20, 5311)
SEG: (20, 431)


> Speech

In [40]:
mel = librosa.feature.melspectrogram(y=speech1, sr=SAMPLE_RATE, n_mels=20, fmax=10000, n_fft=FRAME_LENGTH, hop_length=HOP_LENGTH)
print("ALL: {}".format(mel.shape))
mel = librosa.feature.melspectrogram(y=speech1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE, n_mels=20, fmax=10000, n_fft=FRAME_LENGTH, hop_length=HOP_LENGTH)
print("SEG: {}".format(mel.shape))

ALL: (20, 1151)
SEG: (20, 431)
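
All frame-wise features seen so far share the same number of columns for a given segment, which means they can be stacked row-wise into a single matrix. The sketch below shows that step with zero-filled placeholder arrays whose shapes are taken from the `SEG` outputs above. The stacking itself is an assumption about how the features might be combined, not something this notebook performs.

```python
import numpy as np

n_frames = 431  # frame count taken from the SEG outputs

# Placeholders with the shapes reported by each feature function.
onset = np.zeros(n_frames)          # 1-D, so it needs an explicit row axis
mfcc = np.zeros((20, n_frames))
centroid = np.zeros((1, n_frames))
rms = np.zeros((1, n_frames))
zcr = np.zeros((1, n_frames))
mel = np.zeros((20, n_frames))

features = np.vstack([onset[np.newaxis, :], mfcc, centroid, rms, zcr, mel])
print(features.shape)
```

The one-dimensional `onset_strength` output is the only one that needs `np.newaxis` before stacking, which is the same reshaping used in some of the onset cells.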


<a id="tempo"></a>
* Check the shape returned by `tempogram`

> Music

In [41]:
oenv = librosa.onset.onset_strength(y=music1, sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
tempogram = librosa.feature.tempogram(onset_envelope=oenv, sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
print("ALL: {}".format(tempogram.shape))

oenv = librosa.onset.onset_strength(y=music1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
tempogram = librosa.feature.tempogram(onset_envelope=oenv, sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
print("SEG: {}".format(tempogram.shape))

ALL: (384, 2586)
SEG: (384, 431)


> Soap opera

In [42]:
oenv = librosa.onset.onset_strength(y=soap_intro1, sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
tempogram = librosa.feature.tempogram(onset_envelope=oenv, sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
print("ALL: {}".format(tempogram.shape))

oenv = librosa.onset.onset_strength(y=soap_intro1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
tempogram = librosa.feature.tempogram(onset_envelope=oenv, sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
print("SEG: {}".format(tempogram.shape))

ALL: (384, 5311)
SEG: (384, 431)


> Speech

In [43]:
oenv = librosa.onset.onset_strength(y=speech1, sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
tempogram = librosa.feature.tempogram(onset_envelope=oenv, sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
print("ALL: {}".format(tempogram.shape))

oenv = librosa.onset.onset_strength(y=speech1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
tempogram = librosa.feature.tempogram(onset_envelope=oenv, sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
print("SEG: {}".format(tempogram.shape))

ALL: (384, 1151)
SEG: (384, 431)
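
The 384 rows do not depend on the input length: they match the default `win_length=384` of `librosa.feature.tempogram`, with each row corresponding to one autocorrelation lag measured in frames. As a rough sketch, a lag can be converted to a tempo in beats per minute from the sample rate and hop length. The helper `lag_to_bpm` is hypothetical and the lags shown are arbitrary examples.

```python
SAMPLE_RATE = 44100
HOP_LENGTH = 512
WIN_LENGTH = 384  # assumed: the default tempogram window, in frames


def lag_to_bpm(lag_frames, sr=SAMPLE_RATE, hop_length=HOP_LENGTH):
    """Tempo (BPM) whose beat period equals the given lag in frames."""
    if lag_frames == 0:
        return float("inf")  # zero lag has no finite tempo
    return 60.0 * sr / (hop_length * lag_frames)


window_seconds = WIN_LENGTH * HOP_LENGTH / SAMPLE_RATE
print(f"window length: {window_seconds:.2f} s")
for lag in (1, 43, 86, 383):
    print(f"lag {lag:>3} frames -> {lag_to_bpm(lag):.1f} BPM")
```

With these settings the analysis window spans about 4.46 seconds, so for short segments a single window covers most or all of the audio.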


<a id="recurrence"></a>
* Check the shape returned by `Recurrence Matrix`

> Music

In [57]:
chroma = librosa.feature.chroma_cqt(y=music1, sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
chroma_stack = librosa.feature.stack_memory(chroma, n_steps=10, delay=3)
R = librosa.segment.recurrence_matrix(chroma_stack, metric="cosine", mode="affinity")
print("ALL: {}".format(R.shape))

chroma = librosa.feature.chroma_cqt(y=music1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
chroma_stack = librosa.feature.stack_memory(chroma, n_steps=10, delay=3)
R = librosa.segment.recurrence_matrix(chroma_stack, metric="cosine", mode="affinity")
print("SEG: {}".format(R.shape))

ALL: (2586, 2586)
SEG: (431, 431)


> Soap opera

In [58]:
chroma = librosa.feature.chroma_cqt(y=soap_intro1, sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
chroma_stack = librosa.feature.stack_memory(chroma, n_steps=10, delay=3)
R = librosa.segment.recurrence_matrix(chroma_stack, metric="cosine", mode="affinity")
print("ALL: {}".format(R.shape))

chroma = librosa.feature.chroma_cqt(y=soap_intro1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
chroma_stack = librosa.feature.stack_memory(chroma, n_steps=10, delay=3)
R = librosa.segment.recurrence_matrix(chroma_stack, metric="cosine", mode="affinity")
print("SEG: {}".format(R.shape))

ALL: (5311, 5311)
SEG: (431, 431)


> Speech

In [59]:
chroma = librosa.feature.chroma_cqt(y=speech1, sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
chroma_stack = librosa.feature.stack_memory(chroma, n_steps=10, delay=3)
R = librosa.segment.recurrence_matrix(chroma_stack, metric="cosine", mode="affinity")
print("ALL: {}".format(R.shape))

chroma = librosa.feature.chroma_cqt(y=speech1[0:SAMPLES_PER_SEGMENT], sr=SAMPLE_RATE, hop_length=HOP_LENGTH)
chroma_stack = librosa.feature.stack_memory(chroma, n_steps=10, delay=3)
R = librosa.segment.recurrence_matrix(chroma_stack, metric="cosine", mode="affinity")
print("SEG: {}".format(R.shape))

ALL: (1151, 1151)
SEG: (431, 431)
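
Unlike the other features, the recurrence matrix is square in the number of frames, so its size grows quadratically with the audio length. The sketch below builds a dense cosine self-similarity matrix with NumPy to show where the `(n_frames, n_frames)` shape comes from. It is a simplified stand-in, not librosa's nearest-neighbor affinity computation, and random data replaces the stacked chroma (12 chroma bins times 10 memory steps gives 120 rows).

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames = 431
stacked = rng.random((120, n_frames))  # stand-in for the stacked chroma features

# Normalize each frame (column) to unit length; dot products become cosines.
unit = stacked / np.linalg.norm(stacked, axis=0, keepdims=True)
similarity = unit.T @ unit
print(similarity.shape)

# Dense storage cost for the frame counts seen in this notebook.
for t in (431, 1151, 2586, 5311):
    print(f"{t} frames -> {t * t * 8 / 1e6:.1f} MB as float64")
```

The quadratic growth is one practical argument for computing this feature per segment rather than over a whole file.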
