# Beat Recognition (Snooooop)

**Objectives**
1. Understand the musical diversity of rap over the years

**Expected Outcomes**
1. Implement a beat extraction algorithm
1. Know how to use librosa to extract beats
   1. We do not need to build everything from scratch
1. Use DTW to compare beats

In [1]:
from numba import jit

import glob

import librosa
import librosa.display

import IPython.display as ipd

import matplotlib.pyplot as plt

import numpy as np
import os
import pandas as pd

plt.rcParams['axes.labelsize']  = 16
plt.rcParams['axes.titlesize']  = 16
plt.rcParams['legend.fontsize'] = 16
plt.rcParams['xtick.labelsize'] = 16
plt.rcParams['ytick.labelsize'] = 16
plt.rcParams['lines.linewidth'] = 2

plt.ion()

## The Problem

In this notebook we will try to find out whether [Snoop Dogg](https://pt.wikipedia.org/wiki/Snoop_Dogg) is right when he says that rap keeps sounding the same. In the clip below, note how he argues that rap beats have become increasingly homogeneous.

In [2]:
from IPython.display import Video
Video('https://pudding.cool/2018/05/similarity/assets/videos/video_snoop.mp4')

This lab was motivated by the [Pudding Cool](https://pudding.cool/2018/05/similarity/) article, which presents evidence that **pop** music in general has been growing more similar. I recommend reading the article before continuing, since we will follow a similar sequence of steps: we will compare songs released in different years, and the comparison will be based on the novelty curves extracted with [librosa](https://librosa.org/).

To see how to extract novelty with librosa, use this [notebook](https://musicinformationretrieval.com/novelty_functions.html) as an example. You can also refresh what was covered in class with the [FMP Notebooks](https://www.audiolabs-erlangen.de/resources/MIR/FMP/C6/C6.html).

To illustrate the data we will work with, the figure below shows a librosa example that extracts beat **curves** with two different methods: [Onset Strength](https://www.audiolabs-erlangen.de/resources/MIR/FMP/C6/C6S1_OnsetDetection.html) and [Predominant Local Pulse](https://www.audiolabs-erlangen.de/resources/MIR/FMP/C6/C6S3_PredominantLocalPulse.html).

![](https://librosa.org/doc/latest/_images/librosa-beat-plp-1_01.png)

Now imagine a database of songs organized by year. Each song has one or more novelty **curves**. With such a database, you should produce the following plot:
1. For each type of novelty (use the ones from librosa or implement your own)
   1. Sort the songs by release date
   1. From the last (most recent) back to the first
       1. Walk backwards
       1. Compute the similarity. Here you can use Dynamic Time Warping.
       1. Plot
   1. Aggregate the plot above by release decade
   
```
this curve is the mean similarity of everything released in 2010~2019. each point is the decade's mean against the decade before
    |
    v
| x x x x  
|         x 
|           x 
|            x                 o o o o o o o o o <- this is the curve for the 2000s, for example
|             x
|              x x x x x x x x x x x x x x x x x
 -----------------------------------------------
 2019 here                            2010 here
   release year
```
   5. The shape of the curve shows how diverse the decade was
   
That is the lab!
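
The procedure above can be sketched end to end on toy data. The block below is only an illustration under stated assumptions: `dtw_cost`, `mean_cost_by_year`, and `group_by_decade` are hypothetical helper names, the curves are made-up numbers rather than real novelty curves, the DTW is a plain-Python textbook version (the solution further down uses `librosa.sequence.dtw` instead), and dividing the cost by the sum of the two lengths is just one possible normalization.

```python
from collections import defaultdict


def dtw_cost(a, b):
    """Classic DTW between two 1-D sequences with |x - y| as the local cost.

    Returns the cost of the optimal warping path divided by len(a) + len(b)
    (a simple length normalization, chosen for this sketch).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # acc[i][j] = cost of the best path aligning a[:i] with b[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            acc[i][j] = local + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m] / (n + m)


def mean_cost_by_year(curves_by_year, reference):
    """Average DTW cost between the reference curve and every curve of each year."""
    result = {}
    for year in sorted(curves_by_year, reverse=True):  # most recent year first
        costs = [dtw_cost(curve, reference) for curve in curves_by_year[year]]
        result[year] = sum(costs) / len(costs)
    return result


def group_by_decade(cost_by_year):
    """Split {year: cost} into {decade: [(year, cost), ...]}, most recent year first."""
    decades = defaultdict(list)
    for year in sorted(cost_by_year, reverse=True):
        decades[year // 10 * 10].append((year, cost_by_year[year]))
    return dict(decades)


# Toy novelty curves (made-up numbers, not extracted from real songs)
reference = [0, 1, 0, 0, 1, 0]
curves = {
    2019: [[0, 1, 0, 0, 1, 0]],        # identical to the reference
    2018: [[0, 0, 1, 0, 0, 0, 1, 0]],  # same pattern, stretched in time
    2009: [[1, 1, 1, 1, 1, 1]],        # a very different pattern
}
by_year = mean_cost_by_year(curves, reference)
by_decade = group_by_decade(by_year)
```

In this toy case the stretched 2018 curve gets the same cost as the identical 2019 curve, because DTW absorbs the difference in timing, while the flat 2009 curve gets a strictly larger cost. Each list in `by_decade` corresponds to one line in the plot sketched above.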

## Dataset

To tackle our problem, we start with the rap songs that appeared on the Billboard charts in recent years. For this we use [this](https://www.kaggle.com/danield2255/data-on-songs-from-billboard-19992019) dataset. The cell below loads it.

In [3]:
df = pd.read_csv('billboardHot100_1999-2019.csv', index_col=0)
df['Date'] = pd.to_datetime(df['Date'])
df['Week'] = pd.to_datetime(df['Week'])
df.head()

Unnamed: 0,Artists,Name,Weekly.rank,Peak.position,Weeks.on.chart,Week,Date,Genre,Writing.Credits,Lyrics,Features
1,"Lil Nas,",Old Town Road,1,1.0,7.0,2019-07-06,2019-04-05,"Country,Atlanta,Alternative Country,Hip-Hop,Tr...","Jozzy, Atticus ross, Trent reznor, Billy ray c...","Old Town Road Remix \nOh, oh-oh\nOh\nYeah, I'm...",Billy Ray Cyrus
2,"Shawn Mendes, Camila Cabello",Senorita,2,,,2019-07-06,2019-06-21,Pop,"Cashmere cat, Jack patterson, Charli xcx, Benn...",Senorita \nI love it when you call me senorita...,
3,Billie Eilish,Bad Guy,3,2.0,13.0,2019-07-06,2019-03-29,"Hip-Hop,Dark Pop,House,Trap,Memes,Alternative ...","Billie eilish, Finneas","bad guy \nWhite shirt now red, my bloody nose\...",
4,Khalid,Talk,4,3.0,20.0,2019-07-06,2019-02-07,"Synth-Pop,Pop","Howard lawrence, Guy lawrence, Khalid",Talk \nCan we just talk? Can we just talk?\nTa...,
5,"Ed Sheeran, Justin Bieber",I Don't Care,5,2.0,7.0,2019-07-06,2019-05-10,"Canada,UK,Dance,Dance-Pop,Pop","Ed sheeran, Justin bieber, Shellback, Max mart...",I Don't Care \nI'm at a party I don't wanna be...,


Let's drop the columns we do not need.

In [4]:
df = df.drop(['Writing.Credits',
              'Lyrics',
              'Features',
              'Weekly.rank', 
              'Peak.position',
              'Weeks.on.chart',
             ],
             axis='columns')
df.head()

Unnamed: 0,Artists,Name,Week,Date,Genre
1,"Lil Nas,",Old Town Road,2019-07-06,2019-04-05,"Country,Atlanta,Alternative Country,Hip-Hop,Tr..."
2,"Shawn Mendes, Camila Cabello",Senorita,2019-07-06,2019-06-21,Pop
3,Billie Eilish,Bad Guy,2019-07-06,2019-03-29,"Hip-Hop,Dark Pop,House,Trap,Memes,Alternative ..."
4,Khalid,Talk,2019-07-06,2019-02-07,"Synth-Pop,Pop"
5,"Ed Sheeran, Justin Bieber",I Don't Care,2019-07-06,2019-05-10,"Canada,UK,Dance,Dance-Pop,Pop"


Next, I collect the genres of every Billboard song. This code is mostly out of curiosity; I used it to choose the `strings` that represent Rap.

In [5]:
genres = set()
for _, row in df.iterrows():
    genres.update(set(row['Genre'].split(',')))

In [6]:
print('\n'.join(sorted(genres)))

A Cappella
Acoustic
Adult Alternative
Adult Contemporary
African Languages
Afro Trap
Afrobeats
Alternative
Alternative Country
Alternative Dance
Alternative Metal
Alternative Pop
Alternative R&;B
Alternative Rock
Ambient
American Idol
American Underground
Americana
Anime
Art Pop
Art Rock
Atlanta
Aussie Hip-Hop
Australia
Avant Garde
Bachata
Ballad
Baroque Pop
Basketball
Bass Music
Bassline
Bay Area
Bedroom Pop
Beef
Big Band
Blue-Eyed Soul
Bluegrass
Blues
Blues Rock
Bolivia
Boom Bap
Bossa Nova
Bounce
Boy Band
Brasil
Brit Pop
British Rock
Broadway
Bubblegum Pop
Calypso
Canada
Celtic
Chamber Music
Charity
Chart History
Chicago Drill
Children&#39;s Music
Chill
Chillhop
Chillstep
Christian
Christian Metal
Christian Pop
Christian Rock
Christmas
Civil Rights
Classical Crossover
Classical Music
Climate Change
Cloud Rap
Colombia
Comedy
Conscious Hip-Hop
Contemporary Folk
Country
Country Rap
Cover
Crunk
Cuba
DMV
Dab
Dance
Dance-Pop
Dancehall
Danmark
Dark Ambient
Dark Pop
Deep House
Deutscher Rap


Let's filter the Rap, Hip-Hop, R&B, Funk, etc. songs.

In [7]:
hip = df['Genre'].str.contains('Hip-Hop')
rap = df['Genre'].str.contains('Rap')
rb = df['Genre'].str.contains('R&;B')
funk = df['Genre'].str.contains('Funk')

# I think I kept only Rap, but feel free to experiment. I did this to limit the dataset size.
df = df[rap]
df = df.drop('Genre', axis='columns')
df.head()

Unnamed: 0,Artists,Name,Week,Date
1,"Lil Nas,",Old Town Road,2019-07-06,2019-04-05
7,DaBaby,Suge,2019-07-06,2019-03-01
8,Drake,Money In The Grave,2019-07-06,2019-06-15
9,Chris Brown,No Guidance,2019-07-06,2019-06-08
10,Post Malone,Wow.,2019-07-06,2018-12-24


Let's look at the size of the final dataset.

In [8]:
df.shape

(30695, 4)

OK, now we use a group by to count the number of weeks each song appears on the chart. We focus on songs that appear in at least 21 weeks (roughly five months). This leaves us with a small dataset.

Another reason for this choice is to have at least about 10 songs per year, as shown below. The first and last years can be ignored.

In [9]:
songs = df.groupby(['Artists', 'Name']).count()['Week'].sort_values()
songs

Artists               Name                
Sean Kingston         Dumb Love                1
Plies                 Real Hitta               1
Frank Ocean           Chanel                   1
Curren$Y              Bottom Of The Bottle     1
Plies                 Want It, Need It         1
                                              ..
Maroon 5              Girls Like You          52
Florida Georgia Line  Cruise                  54
The Black Eyed Peas   I Gotta Feeling         56
Katy Perry            Dark Horse              57
Shirt                 T                       62
Name: Week, Length: 2473, dtype: int64

Get the date of each song's first entry on the Billboard chart.

In [10]:
year = df.groupby(['Artists', 'Name']).min()['Week'].sort_values()

Keep the most popular songs.


In [11]:
songs = songs[songs >= 21]
songs

Artists               Name                     
Destiny's Child       Soldier                      21
Jason Derulo          Wiggle                       21
Drake                 Find Your Love               21
Kendrick Lamar        Bitch, Don't Kill My Vibe    21
Lil Wayne             She Will                     21
                                                   ..
Maroon 5              Girls Like You               52
Florida Georgia Line  Cruise                       54
The Black Eyed Peas   I Gotta Feeling              56
Katy Perry            Dark Horse                   57
Shirt                 T                            62
Name: Week, Length: 458, dtype: int64

In [12]:
from collections import Counter
years = []
for y in year[songs.index]:
    years.append(y.year)
Counter(years)

Counter({2004: 23,
         2014: 23,
         2010: 26,
         2013: 18,
         2011: 25,
         2002: 21,
         2003: 25,
         2009: 25,
         2017: 36,
         2000: 12,
         2008: 23,
         2018: 33,
         2001: 15,
         2016: 28,
         2006: 20,
         2005: 27,
         2015: 25,
         2007: 23,
         2012: 24,
         2019: 4,
         1999: 2})

### YTMDL

Finally, I downloaded the songs with [YTMDL](https://github.com/deepjyoti30/ytmdl). This tool searches YouTube for a song by its title. Some of the script's options:

1. -q do not prompt the user
1. --choice pick the first song
1. --nolocal do not use the cache
1. --artist artist name
1. --skip-meta skip metadata; it caused many errors
1. --disable-metaadd do not use metadata at all; this improved the results
1. --ignore-errors skip songs that cannot be found

The unnamed argument is the song name. I put everything in a script and ran it to download the songs.

In [13]:
# The code below prepares the song list for download with https://github.com/deepjyoti30/ytmdl
# You can ignore it
with open('yolo.txt', 'w') as yolo:
    for artist, name in songs.index.to_flat_index():
        print(artist,
              name,
              year.loc[artist].loc[name].year,
              file=yolo)
! head yolo.txt

Destiny's Child Soldier 2004
Jason Derulo Wiggle 2014
Drake Find Your Love 2010
Kendrick Lamar Bitch, Don't Kill My Vibe 2013
Lil Wayne She Will 2011
P. Diddy I Need A Girl 2002
Chingy Holidae In 2003
Cam'Ron Hey Ma 2002
Sean Kingston Fire Burning 2009
Mario Just A Friend 2002 2002


In [14]:
# the songs were downloaded with the command
# ytmdl --list yolo.txt --skip-meta -q --disable-metaadd --level DEBUG --ignore-errors

## Browsing the MP3s

Remember that the glob module lets you walk through files.

In [15]:
mp3fpaths = glob.glob('files/*.mp3')[:2]
print("\n".join(mp3fpaths))

files/UNK WALK IT OUT VIDEO.mp3
files/Fade to Black (3-8) Movie CLIP - Dirt Off Your Shoulder (2004) HD.mp3


I processed the ytmdl log to get each file name together with its year. Note above how I put the year in the query; the log has entries such as:

`song_name:  Shirts... And Pants. Pants Pants Pants Pants.  song_meta:  Shirt T 2008`

Here song_meta was my query. This lets me recover the year of each MP3.
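
The log-processing step itself is not shown in this notebook. As a rough sketch of how such a line could be parsed, the block below uses a regular expression built only from the single example entry above; the pattern `LOG_LINE` and the function `parse_log_line` are hypothetical names, and real ytmdl log lines may carry extra fields or prefixes that this pattern does not handle.

```python
import re

# Assumed line layout, based only on the one example entry shown above:
#   song_name:  <downloaded title>  song_meta:  <query text> <year>
LOG_LINE = re.compile(
    r"song_name:\s*(?P<title>.*?)\s*song_meta:\s*(?P<query>.*?)\s+(?P<year>\d{4})\s*$"
)


def parse_log_line(line):
    """Return (downloaded_title, query_without_year, year), or None if no match."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    return match.group("title"), match.group("query"), int(match.group("year"))


example = "song_name:  Shirts... And Pants. Pants Pants Pants Pants.  song_meta:  Shirt T 2008"
parsed = parse_log_line(example)
```

The year is taken from the last four-digit token of the query, so a query whose title itself ends in a year (such as `Mario Just A Friend 2002 2002`) still yields the year that was appended to it.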

In [16]:
nome_ano = pd.read_csv('./arquivo_ano.tsv', sep='\t', names=['file_exists', 'fpath', 'year'])
nome_ano

Unnamed: 0,file_exists,fpath,year
0,True,files/2003 - Freek de Jonge - De Stemming 1 - ...,2003
1,True,files/21. El Perdón - Nicky Jam y Enrique Igle...,2015
2,True,files/21 Savage - a lot (Official Video) ft. J...,2019
3,True,files/21 Savage - Bank Account (Official Audio...,2017
4,True,files/21 Savage & Metro Boomin - X ft Future (...,2016
...,...,...,...
447,True,files/Yo Gotti - Rake It Up ft. Nicki Minaj.mp3,2017
448,True,files/YoungBloodZ - Damn! (Video) ft. Lil' Jon...,2003
449,True,files/Young Money - Bed Rock (Official Music V...,2009
450,True,files/Yung Joc - It's Goin Down (Official Musi...,2006


In [17]:
Counter(nome_ano['year'])

Counter({2003: 25,
         2015: 24,
         2019: 4,
         2017: 36,
         2016: 28,
         2013: 18,
         2012: 24,
         2008: 23,
         2005: 27,
         2004: 22,
         2018: 33,
         2006: 20,
         2001: 15,
         2014: 23,
         2002: 20,
         2007: 22,
         2011: 25,
         2010: 24,
         2009: 25,
         2000: 12,
         1999: 2})

We have a good amount of data!

## Your Solution

From here on.

> Interpret your final plot.

In [18]:
from tqdm.notebook import tqdm
import warnings

import gc

dec = {1990:[i for i in range(1990, 2000)],
       2000:[i for i in range(2000, 2010)],
       2010:[i for i in range(2010, 2020)]}

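def SyncDTW(x, wp):
    # Warp x onto the time axis of the other sequence: for each index i of that
    # sequence, average the samples of x that the warping path aligns with i.
    # Assumes wp is ordered from the end of the path to its start, so wp[0, 1]
    # is the largest matched index.
    new_curve = np.zeros(wp[0,1] + 1)
    x_path = wp[:,0]
    y_path = wp[:,1]
    for i in range(len(new_curve)):
        idxs = x_path[y_path == i]
        cut = x[idxs]
        new_curve[i] = np.mean(cut)
    return new_curve 

def DTWSim(D, wp):
    # Mean of the accumulated-cost matrix D along the warping path wp.
    # Despite the name, this is a cost: lower values mean more similar curves.
    # Vectorized equivalent: np.mean(D[wp[:, 0], wp[:, 1]])
    c = 0
    for i, j in wp:
        c += D[i, j] / wp.shape[0]
    return c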

### RMS Energy

In [19]:
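def rmsEnergy(x, hop_length = 512, frame_length = 1024):
    # Energy-based novelty: frame-wise RMS followed by its half-wave rectified difference.
    # The signal is passed as y= because recent librosa versions require keyword arguments here.
    rmse = librosa.feature.rms(y=x, frame_length=frame_length, hop_length=hop_length).flatten()
    rmse_diff = np.zeros_like(rmse)
    rmse_diff[1:] = np.diff(rmse)
    # Keep only increases in energy (negative differences become zero)
    energy_novelty = np.maximum(0, rmse_diff)
    return energy_novelty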
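
To make the idea behind `rmsEnergy` concrete without needing audio files, here is a dependency-free sketch of the same two steps (frame-wise RMS, then keeping only the increases) on a toy signal. The helper names `frame_rms` and `energy_novelty` and the signal values are made up for illustration; unlike `librosa.feature.rms`, this version does not pad the signal, so frame counts will differ from librosa's.

```python
import math


def frame_rms(signal, frame_length, hop_length):
    """Root-mean-square energy of consecutive frames (no padding)."""
    values = []
    for start in range(0, len(signal) - frame_length + 1, hop_length):
        frame = signal[start:start + frame_length]
        values.append(math.sqrt(sum(s * s for s in frame) / frame_length))
    return values


def energy_novelty(rms):
    """Half-wave rectified first difference: keep only increases in energy."""
    novelty = [0.0]
    for previous, current in zip(rms, rms[1:]):
        novelty.append(max(0.0, current - previous))
    return novelty


# Toy signal: silence with two short bursts (hypothetical values)
signal = [0.0] * 64
for start in (16, 40):
    for k in range(4):
        signal[start + k] = 1.0

rms = frame_rms(signal, frame_length=4, hop_length=4)
novelty = energy_novelty(rms)
peaks = [i for i, v in enumerate(novelty) if v > 0]  # frames where energy rises
```

With these values the novelty curve is zero everywhere except at frames 4 and 10, which are the frames where the two bursts begin. The drops in energy right after each burst are discarded by the rectification.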

In [20]:
## Load All Songs (will it fit ?)
gc.collect()
all_songs = {}
for idx, row in tqdm(nome_ano.iterrows(), total=nome_ano.shape[0] ):
    x, sr = librosa.load(row['fpath'])
    all_songs[row['fpath']] = (x, sr)
    gc.collect()


In [46]:
Sample_2019 = None
gc.collect()

dtw_sims_since2019 = {}
for year in sorted(nome_ano['year'].unique())[::-1]:
    musics_in_year = nome_ano[nome_ano['year'] == year]
    
    # Calc novelty
    novs = []
    for index, row in tqdm(musics_in_year.iterrows(), desc='Loading songs...', total=musics_in_year.shape[0]):
        gc.collect()
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
#             x, Fs = librosa.load(row['fpath']) 
            x, Fs = librosa.load(row['fpath'],duration=120, offset=10) 
        
        novelty = rmsEnergy(x)
        if Sample_2019 is None:
            Sample_2019 = novelty
        else:
            novs.append(novelty)
                
    # DTW between each song of this year and the reference (the first song loaded, from 2019)
    sim = []
    for i, n in tqdm(enumerate(novs), desc='Calc. DTW...', total = len(novs)):
        gc.collect()
        D, wp = librosa.sequence.dtw(n, Sample_2019,  subseq=True)
        sim.append(DTWSim(D, wp))
    
    # Save mean p/ year
    yearSIM = np.mean(sim)
    dtw_sims_since2019[year] = yearSIM
    print(dtw_sims_since2019)
    


{2019: 23.741749223763865, 2018: 19.428206071707574, 2017: 19.191919171448745, 2016: 18.976112641777835, 2015: 18.17009650390947, 2014: 16.920324645006367, 2013: 14.75050134504359, 2012: 15.10364392190534, 2011: 15.894268572977323, 2010: 17.388730735315775, 2009: 18.000416355962162, 2008: 16.464966912471834, 2007: 18.85192059377251, 2006: 19.659454399158438, 2005: 19.457723665311796, 2004: 20.8939520917187, 2003: 19.38903629824538, 2002: 18.24052466078931, 2001: 18.88116428778104, 2000: 19.573191965814818, 1999: 16.131467971557083}


In [22]:


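# Requires dtw_sims_since2019 from the DTW cell above; run that cell first
for d in dec:
    values = []
    # Walk each decade from its most recent year back, skipping years without songs
    for y in dec[d][::-1]:
        if y in dtw_sims_since2019:
            values.append(dtw_sims_since2019[y])

    plt.plot(values, label=d)

plt.xlabel('Position within the decade (most recent year first)')
plt.ylabel('Mean DTW cost to the 2019 reference')
plt.legend(title='Decade')
plt.show()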


NameError: name 'dtw_sims_since2019' is not defined

### spectral_novelty

In [24]:
Sample_2019 = None
gc.collect()

dtw_sims_since2019 = {}
for year in sorted(nome_ano['year'].unique())[::-1]:
    musics_in_year = nome_ano[nome_ano['year'] == year]
    
    # Calc novelty
    novs = []
    for index, row in tqdm(musics_in_year.iterrows(), desc='Loading songs...', total=musics_in_year.shape[0]):
        gc.collect()
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            x, sr = librosa.load(row['fpath']) 
#             x, sr = librosa.load(row['fpath'],duration=120, offset=10) 
        
        novelty = librosa.onset.onset_strength(y=x, sr=sr)
        if Sample_2019 is None:
            Sample_2019 = novelty
        else:
            novs.append(novelty)
                
    # DTW between each song of this year and the reference (the first song loaded, from 2019)
    sim = []
    for i, n in tqdm(enumerate(novs), desc='Calc. DTW...', total = len(novs)):
        gc.collect()
        D, wp = librosa.sequence.dtw(n, Sample_2019,  subseq=True)
        sim.append(DTWSim(D, wp))
    
    # Store the mean for the year
    yearSIM = np.mean(sim)
    dtw_sims_since2019[year] = yearSIM
    print(dtw_sims_since2019)
    


Loading songs...:   0%|          | 0/4 [00:00<?, ?it/s]

Calc. DTW...:   0%|          | 0/3 [00:00<?, ?it/s]

{2019: 2922.078212993727}


Loading songs...:   0%|          | 0/33 [00:00<?, ?it/s]

KeyboardInterrupt: 