## Jupyter and Pandas

Pandas is an open source library that provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

* Pandas: https://pandas.pydata.org/
* Documentation: http://pandas.pydata.org/pandas-docs/stable/
* Tutorials: https://pandas.pydata.org/pandas-docs/stable/getting_started/tutorials.html

In [1]:
import pandas as pd
import nltk

### Importing a dataframe from a file

In [2]:
df = pd.read_csv("../data/music_data.csv")
df.head()

Unnamed: 0,id_billboard,song_name,artist,lyrics,danceability,energy,key,loudness,mode,speechiness,...,instrumentalness,liveness,valence,time,type,id,track_href,analysis_url,duration_ms,time_signature
0,71,Sorry - Originally performed by Beyonce,2016 Dynamo Hitz,,0.699,0.526,2,-13.514,1,0.0373,...,0.00165,0.193,0.148,129.98,audio_features,5rzLBhiS7SJy03ZfcTpUSy,https://api.spotify.com/v1/tracks/5rzLBhiS7SJy...,https://api.spotify.com/v1/audio-analysis/5rzL...,232934,4
1,7,Hello,Adele,"Hello, it's me I was wondering if after all th...",0.481,0.451,5,-6.095,0,0.0347,...,0.0,0.0872,0.287,157.966,audio_features,4sPmO7WMQUAf45kwMOtONw,https://api.spotify.com/v1/tracks/4sPmO7WMQUAf...,https://api.spotify.com/v1/audio-analysis/4sPm...,295493,4
2,26,Send My Love (To Your New Lover),Adele,"Just the guitar. OK, cool. This was all you, n...",0.69,0.524,6,-8.39,0,0.103,...,3e-06,0.17,0.561,164.023,audio_features,4BHzQ9C00ceJxfG16AlNWb,https://api.spotify.com/v1/tracks/4BHzQ9C00ceJ...,https://api.spotify.com/v1/audio-analysis/4BHz...,223080,4
3,83,When We Were Young,Adele,,0.381,0.594,3,-5.97,1,0.0486,...,0.0,0.0925,0.275,143.86,audio_features,7IWkJwX9C0J7tHurTD7ViL,https://api.spotify.com/v1/tracks/7IWkJwX9C0J7...,https://api.spotify.com/v1/audio-analysis/7IWk...,290907,4
4,39,Here,Alessia Cara,,0.376,0.81,0,-4.003,1,0.166,...,0.0,0.073,0.334,123.909,audio_features,664gdARxaClFsoF5SXKOws,https://api.spotify.com/v1/tracks/664gdARxaClF...,https://api.spotify.com/v1/audio-analysis/664g...,199453,4


* The dataframe's columns are available as an attribute of the object

In [3]:
df.columns

Index(['id_billboard', 'song_name', 'artist', 'lyrics', 'danceability',
       'energy', 'key', 'loudness', 'mode', 'speechiness', 'acousticness',
       'instrumentalness', 'liveness', 'valence', 'time', 'type', 'id',
       'track_href', 'analysis_url', 'duration_ms', 'time_signature'],
      dtype='object')
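
Other structural information is exposed the same way. The sketch below uses a small made-up dataframe (the name `toy` and its values are assumptions for illustration, not part of the dataset) to show `columns`, `shape`, and `dtypes` side by side.

```python
import pandas as pd

# Hypothetical toy data; only the column names echo the dataset above.
toy = pd.DataFrame(
    {"song_name": ["A", "B"], "artist": ["X", "Y"], "energy": [0.5, 0.8]}
)

column_names = list(toy.columns)  # column labels, in order
n_rows, n_cols = toy.shape        # (number of rows, number of columns)

print(column_names)    # ['song_name', 'artist', 'energy']
print(n_rows, n_cols)  # 2 3
print(toy.dtypes)      # one dtype per column
```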

* Summary statistics for selected columns can also be viewed with `describe`

In [4]:
df[['danceability', 'energy', 'loudness']].describe()

| | danceability | energy | loudness |
|---|---|---|---|
| count | 60.0 | 60.0 | 60.0 |
| mean | 0.6592 | 0.644883 | -6.1937 |
| std | 0.132919 | 0.154437 | 2.39571 |
| min | 0.376 | 0.275 | -13.514 |
| 25% | 0.57325 | 0.558 | -7.53325 |
| 50% | 0.6555 | 0.6705 | -5.7265 |
| 75% | 0.75825 | 0.761 | -4.66275 |
| max | 0.916 | 0.928 | -2.787 |


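### Removing columns from the original dataset

Columns can be removed with `drop`, which takes column labels; `axis=1` targets columns rather than rows. To remove columns by position, first select their labels from `df.columns`. `drop` returns a new dataframe and leaves `df` unchanged.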

In [5]:
new_df = df.drop(['key', 'mode', 'time', 'track_href', 'analysis_url', 'time_signature', 'id', 'type'], axis=1)
new_df.head()

Unnamed: 0,id_billboard,song_name,artist,lyrics,danceability,energy,loudness,speechiness,acousticness,instrumentalness,liveness,valence,duration_ms
0,71,Sorry - Originally performed by Beyonce,2016 Dynamo Hitz,,0.699,0.526,-13.514,0.0373,0.00101,0.00165,0.193,0.148,232934
1,7,Hello,Adele,"Hello, it's me I was wondering if after all th...",0.481,0.451,-6.095,0.0347,0.336,0.0,0.0872,0.287,295493
2,26,Send My Love (To Your New Lover),Adele,"Just the guitar. OK, cool. This was all you, n...",0.69,0.524,-8.39,0.103,0.0415,3e-06,0.17,0.561,223080
3,83,When We Were Young,Adele,,0.381,0.594,-5.97,0.0486,0.348,0.0,0.0925,0.275,290907
4,39,Here,Alessia Cara,,0.376,0.81,-4.003,0.166,0.0844,0.0,0.073,0.334,199453
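
As a minimal sketch of both routes, the example below removes the same two columns by label and by position. The dataframe `toy` and its values are made up for illustration; only the column names echo the dataset.

```python
import pandas as pd

# Hypothetical toy data, not taken from music_data.csv.
toy = pd.DataFrame({
    "song_name": ["A", "B"],
    "key": [2, 5],
    "mode": [1, 0],
    "energy": [0.5, 0.8],
})

# By label: columns=[...] is equivalent to passing the labels with axis=1.
by_label = toy.drop(columns=["key", "mode"])

# By position: translate positions 1 and 2 into labels, then drop those.
by_position = toy.drop(columns=toy.columns[[1, 2]])

print(list(by_label.columns))        # ['song_name', 'energy']
print(by_label.equals(by_position))  # True
print(list(toy.columns))             # the original still has all four columns
```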


### Column access

* Accessing a single column

In [6]:
new_df['artist']

0                                      2016 Dynamo Hitz
1                                                 Adele
2                                                 Adele
3                                                 Adele
4                                          Alessia Cara
5                                         Ariana Grande
6                                         Ariana Grande
7                             Ariana Grande Nicki Minaj
8                                         Bryson Tiller
9                                         Bryson Tiller
10                                Calvin Harris Rihanna
11                                         Charlie Puth
12                            Charlie Puth Selena Gomez
13                                          Chris Brown
14                                             Coldplay
15                                        Coldplay Seeb
16                                                 Daya
17                                              

* Accessing several columns

In [7]:
new_df[['song_name', 'artist', 'duration_ms']]

| | song_name | artist | duration_ms |
|---|---|---|---|
| 0 | Sorry - Originally performed by Beyonce | 2016 Dynamo Hitz | 232934 |
| 1 | Hello | Adele | 295493 |
| 2 | Send My Love (To Your New Lover) | Adele | 223080 |
| 3 | When We Were Young | Adele | 290907 |
| 4 | Here | Alessia Cara | 199453 |
| 5 | Dangerous Woman | Ariana Grande | 235947 |
| 6 | Into You | Ariana Grande | 244453 |
| 7 | Side To Side | Ariana Grande Nicki Minaj | 226160 |
| 8 | Don't | Bryson Tiller | 198293 |
| 9 | Exchange | Bryson Tiller | 194613 |


* Applying a function to a column

In [8]:
new_df['lyrics'].str.upper()

0                                                   NaN
1     HELLO, IT'S ME I WAS WONDERING IF AFTER ALL TH...
2     JUST THE GUITAR. OK, COOL. THIS WAS ALL YOU, N...
3                                                   NaN
4                                                   NaN
5     OH, YEAH DON'T NEED PERMISSION MADE MY DECISIO...
6     I'M SO INTO YOU, I CAN BARELY BREATHE AND ALL ...
7                                                   NaN
8     DON'T DON'T PLAY WITH HER, DON'T BE DISHONEST ...
9                                                   NaN
10    BABY, THIS IS WHAT YOU CAME FOR LIGHTNING STRI...
11                                                  NaN
12    [CHARLIE PUTH:] WE DON'T TALK ANYMORE WE DON'T...
13                                                  NaN
14                                                  NaN
15                                                  NaN
16                                                  NaN
17                                              
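
The output above keeps `NaN` wherever the lyrics are missing. The sketch below, on a made-up Series (the name `lyrics` and its values are assumptions), shows that the `.str` accessor skips missing entries on its own, while a plain Python function passed to `map` needs `na_action="ignore"` to do the same.

```python
import pandas as pd

# Hypothetical toy column with one missing entry.
lyrics = pd.Series(["Hello, it's me", None, "Just the guitar"])

# Vectorized string method: the missing entry stays missing.
upper_str = lyrics.str.upper()

# Arbitrary Python function: na_action="ignore" skips missing entries,
# otherwise text.upper() would fail on the missing value.
upper_map = lyrics.map(lambda text: text.upper(), na_action="ignore")

print(upper_str.isna().tolist())  # [False, True, False]
print(upper_map.isna().tolist())  # [False, True, False]
print(upper_str[0])               # HELLO, IT'S ME
```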

### Row access

* Accessing a specific row

In [9]:
new_df.loc[50]

id_billboard                                                        9
song_name           CAN'T STOP THE FEELING! (Original Song from Dr...
artist                                              Justin Timberlake
lyrics              I got this feeling inside my bones It goes ele...
danceability                                                    0.667
energy                                                           0.83
loudness                                                       -5.715
speechiness                                                    0.0749
acousticness                                                   0.0123
instrumentalness                                                    0
liveness                                                        0.191
valence                                                         0.716
duration_ms                                                    236002
Name: 50, dtype: object
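
Note that `loc` selects by index label, not by position. With the default index the two coincide, but they diverge once rows are reordered or filtered. A minimal sketch on made-up data (the name `toy` and its values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical toy data.
toy = pd.DataFrame(
    {"song_name": ["A", "B", "C"], "duration_ms": [300000, 180000, 240000]}
)

# Sorting reorders the rows but keeps their original index labels (1, 2, 0).
sorted_toy = toy.sort_values("duration_ms")

by_label = sorted_toy.loc[0]      # the row whose index label is 0
by_position = sorted_toy.iloc[0]  # the first row in the current order

print(by_label["song_name"])     # A
print(by_position["song_name"])  # B
```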

* Accessing one or more rows with a query

In [10]:
new_df[new_df['song_name'] == 'Here']

Unnamed: 0,id_billboard,song_name,artist,lyrics,danceability,energy,loudness,speechiness,acousticness,instrumentalness,liveness,valence,duration_ms
4,39,Here,Alessia Cara,,0.376,0.81,-4.003,0.166,0.0844,0.0,0.073,0.334,199453


In [11]:
new_df.loc[1, 'lyrics'].upper()

"HELLO, IT'S ME I WAS WONDERING IF AFTER ALL THESE YEARS YOU'D LIKE TO MEET TO GO OVER EVERYTHING THEY SAY THAT TIME'S SUPPOSED TO HEAL YA BUT I AIN'T DONE MUCH HEALING HELLO, CAN YOU HEAR ME? I'M IN CALIFORNIA DREAMING ABOUT WHO WE USED TO BE WHEN WE WERE YOUNGER AND FREE I'VE FORGOTTEN HOW IT FELT BEFORE THE WORLD FELL AT OUR FEET THERE'S SUCH A DIFFERENCE BETWEEN US AND A MILLION MILES HELLO FROM THE OTHER SIDE I MUST HAVE CALLED A THOUSAND TIMES TO TELL YOU I'M SORRY FOR EVERYTHING THAT I'VE DONE BUT WHEN I CALL YOU NEVER SEEM TO BE HOME HELLO FROM THE OUTSIDE AT LEAST I CAN SAY THAT I'VE TRIED TO TELL YOU I'M SORRY FOR BREAKING YOUR HEART BUT IT DON'T MATTER, IT CLEARLY DOESN'T TEAR YOU APART ANYMORE HELLO, HOW ARE YOU? IT'S SO TYPICAL OF ME TO TALK ABOUT MYSELF, I'M SORRY I HOPE THAT YOU'RE WELL DID YOU EVER MAKE IT OUT OF THAT TOWN WHERE NOTHING EVER HAPPENED? IT'S NO SECRET THAT THE BOTH OF US ARE RUNNING OUT OF TIME SO HELLO FROM THE OTHER SIDE (OTHER SIDE) I MUST HAVE CALLED 

In [12]:
new_df[new_df['duration_ms'] >= 280000]

Unnamed: 0,id_billboard,song_name,artist,lyrics,danceability,energy,loudness,speechiness,acousticness,instrumentalness,liveness,valence,duration_ms
1,7,Hello,Adele,"Hello, it's me I was wondering if after all th...",0.481,0.451,-6.095,0.0347,0.336,0.0,0.0872,0.287,295493
3,83,When We Were Young,Adele,,0.381,0.594,-5.97,0.0486,0.348,0.0,0.0925,0.275,290907
33,87,All The Way Up (Remix),Fat Joe Remy Ma JAY Z French Montana InfaRed,"[Fat Joe:] Nothing can stop me, I'm all the wa...",0.564,0.717,-6.403,0.397,0.097,0.0,0.493,0.421,284831
41,30,Low Life,Future The Weeknd,[Refrain:] Everybody getting high Getting high...,0.722,0.331,-7.789,0.0725,0.337,0.283,0.146,0.102,313547
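
Conditions like the ones above can be combined. The sketch below uses made-up data (the name `toy`, the artist names, and the values are assumptions) to show two equivalent forms: boolean masks joined with `&`, where each comparison needs its own parentheses, and `DataFrame.query` with a string expression.

```python
import pandas as pd

# Hypothetical toy data.
toy = pd.DataFrame({
    "artist": ["X", "X", "Y"],
    "duration_ms": [295000, 223000, 313000],
})

# Boolean masks: use & (and) or | (or), wrapping each comparison in parentheses.
long_by_x = toy[(toy["artist"] == "X") & (toy["duration_ms"] >= 280000)]

# The same filter written as a query string.
same_rows = toy.query("artist == 'X' and duration_ms >= 280000")

print(long_by_x.index.tolist())  # [0]
print(same_rows.index.tolist())  # [0]
```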


### Creating new columns

In [13]:
# Label each song by duration: "short" up to 200000 ms (200 s), "long" otherwise
new_df['duration_label'] = new_df['duration_ms'].map(lambda x: "short" if x <= 200000 else "long")
new_df.sample(10)

Unnamed: 0,id_billboard,song_name,artist,lyrics,danceability,energy,loudness,speechiness,acousticness,instrumentalness,liveness,valence,duration_ms,duration_label
35,93,All In My Head (Flex),Fifth Harmony Fetty Wap,,0.689,0.791,-5.194,0.053,0.023,0.0,0.0526,0.772,210573,long
18,6,Panda,Desiigner,[Spoken:] This what they all been waitin' for....,0.576,0.766,-4.943,0.449,0.028,2e-06,0.366,0.25,246761,long
57,25,Cold Water (feat. Justin Bieber & MØ),Major Lazer Justin Bieber MØ,[Justin Bieber:] Everybody gets high sometimes...,0.608,0.798,-5.092,0.0432,0.0736,0.0,0.156,0.488,185352,short
3,83,When We Were Young,Adele,,0.381,0.594,-5.97,0.0486,0.348,0.0,0.0925,0.275,290907,long
15,73,Hymn For The Weekend - Seeb Remix,Coldplay Seeb,,0.565,0.849,-3.516,0.0517,0.00868,5e-06,0.12,0.43,212647,long
0,71,Sorry - Originally performed by Beyonce,2016 Dynamo Hitz,,0.699,0.526,-13.514,0.0373,0.00101,0.00165,0.193,0.148,232934,long
11,43,One Call Away,Charlie Puth,,0.667,0.613,-5.353,0.0344,0.403,0.0,0.115,0.479,194453,short
49,31,What Do You Mean?,Justin Bieber,What do you mean? Ohh ohh ohh When you nod you...,0.845,0.567,-8.118,0.0956,0.59,0.00142,0.0811,0.794,205680,long
2,26,Send My Love (To Your New Lover),Adele,"Just the guitar. OK, cool. This was all you, n...",0.69,0.524,-8.39,0.103,0.0415,3e-06,0.17,0.561,223080,long
22,47,Let Me Love You,DJ Snake Justin Bieber,I used to believe We were burnin' on the edge ...,0.476,0.718,-5.309,0.0576,0.0784,1e-05,0.122,0.143,205947,long
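
For a two-way split like this one, `numpy.where` is a common vectorized alternative to `map` with a lambda. This is a sketch on made-up durations, not the approach used in the notebook; the name `toy` and the columns `label_map` and `label_where` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy durations in milliseconds.
toy = pd.DataFrame({"duration_ms": [185352, 200000, 246761]})

# Element-wise version, as in the cell above.
toy["label_map"] = toy["duration_ms"].map(
    lambda x: "short" if x <= 200000 else "long"
)

# Vectorized version: evaluate the condition once for the whole column.
toy["label_where"] = np.where(toy["duration_ms"] <= 200000, "short", "long")

print(toy["label_where"].tolist())  # ['short', 'short', 'long']
```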
