In [1]:
import pandas as pd
import numpy as np
import sqlalchemy as sa

In [1]:
1 == 2 or 3

3

# API Wrappers

At the end of the last class we saw how to build a **wrapper** for an API: a class that encapsulates the authentication and access steps of an API. **Wrappers** let us *separate* the connection layer (how our script communicates with the API) from the data layer (which information we extract). This separation makes our code more robust, both to changes in the API's structure (only the connection layer needs updating) and to changes in data requirements (only the extraction code needs updating).
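
To make the separation concrete, here is a minimal sketch of such a class. Everything in it is hypothetical (the `ExampleApiWrapper` name, the `api.example.com` address, the `artists/<id>` endpoint); it only illustrates keeping the connection details in one place and the data methods in another.

```python
import json
from typing import Callable
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def http_get_json(url: str, headers: dict) -> dict:
    """Connection detail: perform a GET request and decode the JSON body."""
    with urlopen(Request(url, headers=headers)) as response:
        return json.loads(response.read().decode("utf-8"))


class ExampleApiWrapper:
    """Hypothetical wrapper: all names and endpoints here are illustrative."""

    def __init__(self, base_url: str, token: str,
                 transport: Callable[[str, dict], dict] = http_get_json):
        # Connection layer: base URL, credentials, and how requests are sent.
        self._base_url = base_url.rstrip("/")
        self._headers = {"Authorization": f"Bearer {token}"}
        self._transport = transport

    def _get(self, path: str, **params) -> dict:
        # Single place to change if the API's URL scheme or auth changes.
        query = f"?{urlencode(params)}" if params else ""
        return self._transport(f"{self._base_url}/{path}{query}", self._headers)

    # Data layer: one small method per piece of information we want.
    def artist_name(self, artist_id: str) -> str:
        return self._get(f"artists/{artist_id}")["name"]


# Fake transport so the sketch runs without network access.
def fake_transport(url: str, headers: dict) -> dict:
    return {"name": "Example Artist", "requested_url": url}


api = ExampleApiWrapper("https://api.example.com/v1/", "my-token", transport=fake_transport)
print(api.artist_name("123"))
```

Passing the transport as a parameter is a design choice of this sketch: it lets us test the data methods without touching the real API.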

Many APIs already have **wrappers** implemented as libraries, which frees us to focus on extracting and manipulating data rather than on the connection layer! Today we will look at an application built on the `spotipy` **wrapper**, a library that simplifies connecting to and extracting data from the **Spotify** streaming platform.

Some interesting wrappers:

* [Dados Abertos Brasil](https://pypi.org/project/DadosAbertosBrasil/)
* [World Bank economic data](https://pypi.org/project/wbgapi/)
* [Meteostat for meteorological data](https://github.com/meteostat/meteostat-python)
* [Yahoo! Finance](https://pypi.org/project/yfinance/)
* [Extensive list of Python wrappers](https://github.com/realpython/list-of-python-api-wrappers), useful for finding other interesting wrappers.

There are countless pre-built wrappers. If you want to find one for a specific API, your best friend will be Google (`python API api_name`)!

## Handling authentication

As we saw in the last class, access to many APIs is authenticated. It is a good idea to keep authentication keys, such as a `token`, separate from our Python code: if we make the code public, on GitHub for example, we do not want other people using our credentials to access an API!

We can use the `python-dotenv` library (imported as `dotenv`) to *hide* our keys in a file separate from our code:

- Install the library with `!pip install python-dotenv` (the package is published as `python-dotenv`, even though it is imported as `dotenv`)
- Create a text file with the `.env` extension in the same folder as your Python code
- Add your authentication keys using the notation below:
```
API_KEY="l1noPOPAixCPM"
API_SECRET="GraGq0zrGhs1qvbA0xQXsZBKuTkK5MJ"
```
- Call the function `dotenv.load_dotenv('your_file_name.env')`
- Now we can use the `os` library to retrieve the variables written in the file with `os.getenv('API_KEY')`

The idea behind this procedure is that our keys live in a file separate from our code. We can then add the `.env` file to our `.gitignore` so that it is never pushed to our repository!

In [None]:
!pip install python-dotenv

In [2]:
import os
from dotenv import load_dotenv

Now let's load the file `exemplo_aula.env` and read one of its variables (note that `os.getenv` returns `None` when the requested name is not found):

In [3]:
load_dotenv('credentials/exemplo_aula.env')

True

In [6]:
print(os.getenv('api_secret'))

None
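
A silent `None` like the one above is easy to miss. On Linux and macOS environment variable names are case-sensitive, so asking for `api_secret` would not find a variable written as `API_SECRET`; that is one possible explanation for this result. The sketch below uses a hypothetical helper, `require_env`, that fails loudly instead. To stay runnable without a `.env` file it sets the variable by hand, as a stand-in for what `load_dotenv` would do.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.getenv(name)
    if value is None:
        raise KeyError(f"Environment variable {name!r} is not set; check the .env file and the spelling")
    return value


# Stand-in for load_dotenv: set the example key from the notation shown earlier.
os.environ["API_KEY"] = "l1noPOPAixCPM"

print(require_env("API_KEY"))
```

Calling `require_env` with a name that is not defined raises a `KeyError` right away, instead of passing `None` on to the API client.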


# Spotipy

To see how wrappers work we will use the `spotipy` library, a **wrapper for the Spotify API**.

Most **wrappers** are not part of the standard Anaconda installation, so we need to install them with `!pip`.

In [None]:
!pip install spotipy

Besides the library itself, we import the `SpotifyClientCredentials` class to authenticate with the Spotify API.

In [7]:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

## Defining the Problem

Before diving into the `spotipy` library, let's understand what we want to do with the data from this API.

Besides data on artist popularity, most-played songs and so on, the Spotify API provides quantitative data about the songs available on the streaming service. This information is called *Audio Features*.

To understand what the *Audio Features* are, we should read the (extensive) [documentation](https://developer.spotify.com/documentation/web-api/reference/#/operations/get-several-audio-features) of the API.

Our goal is to use this information (available per track) to quantify and differentiate music festivals taking place in Europe in the summer of 2022. The first step is to load a table with the *headliners* of each festival:

In [9]:
import pandas as pd

def read_from_gsheets(spreadsheet_link):
    """
    Transform Google Sheets URL into a CSV file
    """
    working_spreadsheet = spreadsheet_link.replace(
        "/edit?usp=sharing", "/export?format=csv"
    )

    return pd.read_csv(working_spreadsheet)


In [10]:
tb_festivals = read_from_gsheets(
    "https://docs.google.com/spreadsheets/d/1aUiwstZKEENiw3KAT1CCCXcji1iLWdCL_HwngvLxzJc/edit?usp=sharing"
)
tb_festivals.head(10)


Unnamed: 0,festival,headliners
0,Tomorrowland,Martin Garrix
1,Tomorrowland,Armin van Buuren
2,Tomorrowland,Dimitri Vegas & Like Mike
3,Tomorrowland,Marshmello
4,Tomorrowland,Amelie Lens
5,Tomorrowland,Adam Beyer
6,Tomorrowland,Eric Prydz
7,Tomorrowland,NERVO
8,Glastonbury,Billie Eilish
9,Glastonbury,Paul McCartney
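
The `read_from_gsheets` helper above depends on the link ending exactly in `/edit?usp=sharing`. As a hedged alternative, the sketch below builds the same `/export?format=csv` address from the spreadsheet ID, whatever comes after it. The function name `gsheets_csv_url` and the `#gid=0` variation of the link are assumptions made for illustration.

```python
import re


def gsheets_csv_url(spreadsheet_link: str) -> str:
    """Build a CSV export URL from any Google Sheets link that contains the sheet ID."""
    match = re.search(r"/spreadsheets/d/([a-zA-Z0-9_-]+)", spreadsheet_link)
    if match is None:
        raise ValueError(f"Not a Google Sheets link: {spreadsheet_link!r}")
    return f"https://docs.google.com/spreadsheets/d/{match.group(1)}/export?format=csv"


# Same spreadsheet as above, but with a different suffix after the ID.
link = "https://docs.google.com/spreadsheets/d/1aUiwstZKEENiw3KAT1CCCXcji1iLWdCL_HwngvLxzJc/edit#gid=0"
print(gsheets_csv_url(link))
```

The resulting string can be passed to `pd.read_csv` just like `working_spreadsheet` in the original helper.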


## Connecting to Spotify

To connect to the Spotify API we need to create a `CLIENT_ID` and `CLIENT_SECRET` pair. We can do this through the Spotify developer dashboard: https://developer.spotify.com.

Let's store the key pair in a `.env` file and load it with the `load_dotenv` function:

In [11]:
load_dotenv('credentials/spotify.env')

True

Now let's use the `SpotifyClientCredentials` class to create the API's authentication manager. With this manager we can initialize the connection to the API through the `spotipy.Spotify` class:

In [12]:
auth_manager = SpotifyClientCredentials(
    client_id=os.getenv('CLIENT_ID'), 
    client_secret=os.getenv('CLIENT_SECRET')
)
spotify = spotipy.Spotify(client_credentials_manager=auth_manager)

### Investigating Audio Features

Let's look at the *audio features* of a few songs to understand how to extract this information.

First, we need the *URL* of each track on Spotify:

In [13]:
dict_songs = {
    "kate_bush": "https://open.spotify.com/track/75FEaRjZTKLhTrFGsfMUXR?si=26398bff72014b5a",
    "slayer": "https://open.spotify.com/track/4fiOTntQKr24p07FvQDHZE?si=148a03ca6ba844fb",
    "nin": "https://open.spotify.com/track/27tX58NOpv1YKQ0abW7EPy?si=09484ca5e58d454a",
    "cardi_b": "https://open.spotify.com/track/58q2HKrzhC3ozto2nDdN4z?si=6f755b3e29d841ad",
}


Now let's use the `audio_features` method to extract the data for a specific song from its Spotify track URL:

In [17]:
spotify.audio_features(dict_songs["kate_bush"])

[{'danceability': 0.629,
  'energy': 0.547,
  'key': 10,
  'loudness': -13.123,
  'mode': 0,
  'speechiness': 0.055,
  'acousticness': 0.72,
  'instrumentalness': 0.00314,
  'liveness': 0.0604,
  'valence': 0.197,
  'tempo': 108.375,
  'type': 'audio_features',
  'id': '75FEaRjZTKLhTrFGsfMUXR',
  'uri': 'spotify:track:75FEaRjZTKLhTrFGsfMUXR',
  'track_href': 'https://api.spotify.com/v1/tracks/75FEaRjZTKLhTrFGsfMUXR',
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/75FEaRjZTKLhTrFGsfMUXR',
  'duration_ms': 298933,
  'time_signature': 4}]

In [24]:
spotify.audio_features('https://open.spotify.com/track/3VSvkDLaaMUV6nFNcT4nxv?si=9c03f4cca9644cfc')

[{'danceability': 0.308,
  'energy': 0.442,
  'key': 2,
  'loudness': -4.986,
  'mode': 0,
  'speechiness': 0.0301,
  'acousticness': 0.265,
  'instrumentalness': 9.61e-06,
  'liveness': 0.494,
  'valence': 0.536,
  'tempo': 85.387,
  'type': 'audio_features',
  'id': '3VSvkDLaaMUV6nFNcT4nxv',
  'uri': 'spotify:track:3VSvkDLaaMUV6nFNcT4nxv',
  'track_href': 'https://api.spotify.com/v1/tracks/3VSvkDLaaMUV6nFNcT4nxv',
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/3VSvkDLaaMUV6nFNcT4nxv',
  'duration_ms': 197741,
  'time_signature': 3}]
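
In both outputs the `id` and `uri` fields match the last segment of the track link we passed in. The sketch below makes that relationship explicit by converting a track link into a `spotify:track:<id>` URI. The helper name `track_url_to_uri` is hypothetical, and the parsing assumes links of the form shown in `dict_songs`.

```python
from urllib.parse import urlparse


def track_url_to_uri(url: str) -> str:
    """Convert an open.spotify.com track link into a `spotify:track:<id>` URI."""
    # urlparse drops the `?si=...` query string, leaving only the path segments.
    parts = [part for part in urlparse(url).path.split("/") if part]
    if len(parts) < 2 or parts[-2] != "track":
        raise ValueError(f"Not a Spotify track link: {url!r}")
    return f"spotify:track:{parts[-1]}"


kate_bush = "https://open.spotify.com/track/75FEaRjZTKLhTrFGsfMUXR?si=26398bff72014b5a"
print(track_url_to_uri(kate_bush))
```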

## Connecting **Festivals** to **Audio Features**

Now that we have seen how to extract *audio features* from a track URI, let's map the path that connects our *headliners* table to the *audio features*.

We need to *go* from a table of **artist names** to the **track URI** of a few tracks per artist. The first step, then, is to look up information about each artist by name!

### Extracting Artists

To search for *strings* in the Spotify API we can use the `.search()` method. Let's use it to search for a specific artist and inspect the results.

In [25]:
search_result = spotify.search(q="Cardi B", type="artist")
print(search_result)

{'artists': {'href': 'https://api.spotify.com/v1/search?query=Cardi+B&type=artist&offset=0&limit=10', 'items': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/4kYSro6naA4h99UJvo89HB'}, 'followers': {'href': None, 'total': 20630195}, 'genres': ['dance pop', 'pop', 'pop rap', 'rap'], 'href': 'https://api.spotify.com/v1/artists/4kYSro6naA4h99UJvo89HB', 'id': '4kYSro6naA4h99UJvo89HB', 'images': [{'height': 640, 'url': 'https://i.scdn.co/image/ab6761610000e5eb8c2332e6c0ed96d144a91b3f', 'width': 640}, {'height': 320, 'url': 'https://i.scdn.co/image/ab676161000051748c2332e6c0ed96d144a91b3f', 'width': 320}, {'height': 160, 'url': 'https://i.scdn.co/image/ab6761610000f1788c2332e6c0ed96d144a91b3f', 'width': 160}], 'name': 'Cardi B', 'popularity': 81, 'type': 'artist', 'uri': 'spotify:artist:4kYSro6naA4h99UJvo89HB'}, {'external_urls': {'spotify': 'https://open.spotify.com/artist/48Zfxc7ffSnNO0VMbY0Sg8'}, 'followers': {'href': None, 'total': 92}, 'genres': [], 'href': 'https://api.

A rather complex dictionary... We can try to process it with the `json_normalize()` function from the Pandas library:

In [26]:
pd.json_normalize(search_result)

Unnamed: 0,artists.href,artists.items,artists.limit,artists.next,artists.offset,artists.previous,artists.total
0,https://api.spotify.com/v1/search?query=Cardi+...,[{'external_urls': {'spotify': 'https://open.s...,10,https://api.spotify.com/v1/search?query=Cardi+...,0,,104
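
The single-row table above follows from how this kind of flattening works: nested dictionaries become dotted column names, while a list is kept as one value. The sketch below reproduces that behaviour with a small hand-written function. `flatten_dict` is a hypothetical stand-in, not the Pandas implementation, and the sample dictionary is a heavily abridged version of the search result.

```python
def flatten_dict(nested: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dotted keys; lists are kept as single values."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_dict(value, prefix=f"{full_key}."))
        else:
            flat[full_key] = value
    return flat


sample = {"artists": {"items": [{"name": "Cardi B"}], "limit": 10, "total": 104}}

# The whole list ends up packed under the single key "artists.items".
print(flatten_dict(sample))

# Flattening each element of the list instead gives one row per artist.
rows = [flatten_dict(item) for item in sample["artists"]["items"]]
print(rows)
```

This suggests that the useful records live inside `search_result['artists']['items']`, so that list is the part worth inspecting.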


Clearly this is not the result we wanted... We need to inspect the method's output more directly:

In [27]:
search_result.keys()

dict_keys(['artists'])

In [31]:
search_result['artists'].keys()

dict_keys(['href', 'items', 'limit', 'next', 'offset', 'previous', 'total'])

The first element of `items` is the artist we searched for, and it carries a `uri` field. With that URI in hand, we can use the `.artist()` method to fetch just that specific artist:

In [41]:
dict_cardib = search_result['artists']['items'][0]
dict_cardib

{'external_urls': {'spotify': 'https://open.spotify.com/artist/4kYSro6naA4h99UJvo89HB'},
 'followers': {'href': None, 'total': 20630195},
 'genres': ['dance pop', 'pop', 'pop rap', 'rap'],
 'href': 'https://api.spotify.com/v1/artists/4kYSro6naA4h99UJvo89HB',
 'id': '4kYSro6naA4h99UJvo89HB',
 'images': [{'height': 640,
   'url': 'https://i.scdn.co/image/ab6761610000e5eb8c2332e6c0ed96d144a91b3f',
   'width': 640},
  {'height': 320,
   'url': 'https://i.scdn.co/image/ab676161000051748c2332e6c0ed96d144a91b3f',
   'width': 320},
  {'height': 160,
   'url': 'https://i.scdn.co/image/ab6761610000f1788c2332e6c0ed96d144a91b3f',
   'width': 160}],
 'name': 'Cardi B',
 'popularity': 81,
 'type': 'artist',
 'uri': 'spotify:artist:4kYSro6naA4h99UJvo89HB'}

In [42]:
dict_cardib['uri']

'spotify:artist:4kYSro6naA4h99UJvo89HB'

In [44]:
spotify.artist('spotify:artist:4kYSro6naA4h99UJvo89HB')

{'external_urls': {'spotify': 'https://open.spotify.com/artist/4kYSro6naA4h99UJvo89HB'},
 'followers': {'href': None, 'total': 20630195},
 'genres': ['dance pop', 'pop', 'pop rap', 'rap'],
 'href': 'https://api.spotify.com/v1/artists/4kYSro6naA4h99UJvo89HB',
 'id': '4kYSro6naA4h99UJvo89HB',
 'images': [{'height': 640,
   'url': 'https://i.scdn.co/image/ab6761610000e5eb8c2332e6c0ed96d144a91b3f',
   'width': 640},
  {'height': 320,
   'url': 'https://i.scdn.co/image/ab676161000051748c2332e6c0ed96d144a91b3f',
   'width': 320},
  {'height': 160,
   'url': 'https://i.scdn.co/image/ab6761610000f1788c2332e6c0ed96d144a91b3f',
   'width': 160}],
 'name': 'Cardi B',
 'popularity': 81,
 'type': 'artist',
 'uri': 'spotify:artist:4kYSro6naA4h99UJvo89HB'}

### Extracting Tracks

Now we need to turn the artist URI into a set of tracks. Since extracting every song of a given artist could take a long time, we use the `.artist_top_tracks()` method to extract the data of each artist's top 10 tracks:

In [45]:
top_10_tracks = spotify.artist_top_tracks(dict_cardib['uri'])
top_10_tracks

{'tracks': [{'album': {'album_type': 'single',
    'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/4kYSro6naA4h99UJvo89HB'},
      'href': 'https://api.spotify.com/v1/artists/4kYSro6naA4h99UJvo89HB',
      'id': '4kYSro6naA4h99UJvo89HB',
      'name': 'Cardi B',
      'type': 'artist',
      'uri': 'spotify:artist:4kYSro6naA4h99UJvo89HB'},
     {'external_urls': {'spotify': 'https://open.spotify.com/artist/5K4W6rqBFWDnAN6FQUkS6x'},
      'href': 'https://api.spotify.com/v1/artists/5K4W6rqBFWDnAN6FQUkS6x',
      'id': '5K4W6rqBFWDnAN6FQUkS6x',
      'name': 'Kanye West',
      'type': 'artist',
      'uri': 'spotify:artist:5K4W6rqBFWDnAN6FQUkS6x'},
     {'external_urls': {'spotify': 'https://open.spotify.com/artist/3hcs9uc56yIGFCSy9leWe7'},
      'href': 'https://api.spotify.com/v1/artists/3hcs9uc56yIGFCSy9leWe7',
      'id': '3hcs9uc56yIGFCSy9leWe7',
      'name': 'Lil Durk',
      'type': 'artist',
      'uri': 'spotify:artist:3hcs9uc56yIGFCSy9leWe7'}],
    

Another complex dictionary... Once again, let's try to process it with `json_normalize()`:

In [46]:
pd.json_normalize(top_10_tracks)

Unnamed: 0,tracks
0,"[{'album': {'album_type': 'single', 'artists':..."


Once again, not the result we wanted...

In [47]:
top_10_tracks.keys()

dict_keys(['tracks'])

In [48]:
type(top_10_tracks['tracks'])

list

In [50]:
len(top_10_tracks['tracks'])

10

In [51]:
top_10_tracks['tracks'][0].keys()

dict_keys(['album', 'artists', 'disc_number', 'duration_ms', 'explicit', 'external_ids', 'external_urls', 'href', 'id', 'is_local', 'is_playable', 'name', 'popularity', 'preview_url', 'track_number', 'type', 'uri'])

In [52]:
top_10_tracks['tracks'][0]

{'album': {'album_type': 'single',
  'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/4kYSro6naA4h99UJvo89HB'},
    'href': 'https://api.spotify.com/v1/artists/4kYSro6naA4h99UJvo89HB',
    'id': '4kYSro6naA4h99UJvo89HB',
    'name': 'Cardi B',
    'type': 'artist',
    'uri': 'spotify:artist:4kYSro6naA4h99UJvo89HB'},
   {'external_urls': {'spotify': 'https://open.spotify.com/artist/5K4W6rqBFWDnAN6FQUkS6x'},
    'href': 'https://api.spotify.com/v1/artists/5K4W6rqBFWDnAN6FQUkS6x',
    'id': '5K4W6rqBFWDnAN6FQUkS6x',
    'name': 'Kanye West',
    'type': 'artist',
    'uri': 'spotify:artist:5K4W6rqBFWDnAN6FQUkS6x'},
   {'external_urls': {'spotify': 'https://open.spotify.com/artist/3hcs9uc56yIGFCSy9leWe7'},
    'href': 'https://api.spotify.com/v1/artists/3hcs9uc56yIGFCSy9leWe7',
    'id': '3hcs9uc56yIGFCSy9leWe7',
    'name': 'Lil Durk',
    'type': 'artist',
    'uri': 'spotify:artist:3hcs9uc56yIGFCSy9leWe7'}],
  'external_urls': {'spotify': 'https://open.spotify

In [53]:
[track['name'] for track in top_10_tracks['tracks']]

['Hot Shit (feat. Ye & Lil Durk)',
 'WAP (feat. Megan Thee Stallion)',
 'I Like It',
 'South of the Border (feat. Camila Cabello & Cardi B)',
 'Rumors (feat. Cardi B)',
 'Finesse - Remix; feat. Cardi B',
 'Up',
 'Please Me',
 'Shake It (feat. Cardi B, Dougie B & Bory300)',
 'Wild Side (feat. Cardi B)']

In [54]:
[track['uri'] for track in top_10_tracks['tracks']]

['spotify:track:3uJFmluXzYedoJcvhpC1AW',
 'spotify:track:4Oun2ylbjFKMPTiaSbbCih',
 'spotify:track:58q2HKrzhC3ozto2nDdN4z',
 'spotify:track:4vUmTMuQqjdnvlZmAH61Qk',
 'spotify:track:6KgtcmCF9Ky68XC7ezxl3s',
 'spotify:track:3Vo4wInECJQuz9BIBMOu8i',
 'spotify:track:1M4OcYkxAtu3ErzSgDEfoi',
 'spotify:track:0PG9fbaaHFHfre2gUVo7AN',
 'spotify:track:0RkCnqwF8Tfl2QGPZwopyk',
 'spotify:track:2vXgyN14LX2zl7JEASw242']

In [55]:
top10_cardib_uri = [track['uri'] for track in top_10_tracks['tracks']]

In [56]:
top10_cardib_uri

['spotify:track:3uJFmluXzYedoJcvhpC1AW',
 'spotify:track:4Oun2ylbjFKMPTiaSbbCih',
 'spotify:track:58q2HKrzhC3ozto2nDdN4z',
 'spotify:track:4vUmTMuQqjdnvlZmAH61Qk',
 'spotify:track:6KgtcmCF9Ky68XC7ezxl3s',
 'spotify:track:3Vo4wInECJQuz9BIBMOu8i',
 'spotify:track:1M4OcYkxAtu3ErzSgDEfoi',
 'spotify:track:0PG9fbaaHFHfre2gUVo7AN',
 'spotify:track:0RkCnqwF8Tfl2QGPZwopyk',
 'spotify:track:2vXgyN14LX2zl7JEASw242']
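
The list comprehensions above pull out one field at a time. A small sketch of the same idea, keeping several fields per track, is shown below. The function name `top_tracks_to_rows` is hypothetical and the sample response is written by hand: the names and URIs come from the output above, while the popularity values are made up.

```python
def top_tracks_to_rows(top_tracks: dict) -> list:
    """Keep only a few fields from each track of an `artist_top_tracks` response."""
    return [
        {"uri": track["uri"], "name": track["name"], "popularity": track["popularity"]}
        for track in top_tracks["tracks"]
    ]


# Hand-written sample with the same shape as the real response (popularity is made up).
sample_response = {
    "tracks": [
        {"uri": "spotify:track:58q2HKrzhC3ozto2nDdN4z", "name": "I Like It", "popularity": 80, "type": "track"},
        {"uri": "spotify:track:1M4OcYkxAtu3ErzSgDEfoi", "name": "Up", "popularity": 70, "type": "track"},
    ]
}

for row in top_tracks_to_rows(sample_response):
    print(row)
```

A list of flat dictionaries like this can be passed straight to `pd.DataFrame`.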

### Extracting Audio Features

Now, with the URI of each of an artist's top 10 tracks, we can use the `.audio_features()` method to extract the *audio features* of each of these tracks:

In [60]:
spotify.audio_features(top10_cardib_uri[3])

[{'danceability': 0.857,
  'energy': 0.621,
  'key': 9,
  'loudness': -6.376,
  'mode': 0,
  'speechiness': 0.0825,
  'acousticness': 0.148,
  'instrumentalness': 0,
  'liveness': 0.0865,
  'valence': 0.668,
  'tempo': 97.989,
  'type': 'audio_features',
  'id': '4vUmTMuQqjdnvlZmAH61Qk',
  'uri': 'spotify:track:4vUmTMuQqjdnvlZmAH61Qk',
  'track_href': 'https://api.spotify.com/v1/tracks/4vUmTMuQqjdnvlZmAH61Qk',
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/4vUmTMuQqjdnvlZmAH61Qk',
  'duration_ms': 204467,
  'time_signature': 4}]

## Building our DB

To build a DB that can answer our original question, we need to:

1. Consolidate the methods we analyzed in the previous steps;
1. Decide which **entities** our database will represent;
1. Write the code that extracts the necessary information and loads it into the DB.

We can structure our DB around 3 tables:

1. **`headliner`**: table with the relationship between festivals and artists;
1. **`artist`**: table with artist information;
1. **`track`**: table with Audio Features (and other track information).
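
To see how the three tables fit together, here is a minimal sketch of the schema. It uses an in-memory SQLite database purely as a stand-in (the notebook itself targets MySQL), only a few illustrative columns, and invented rows. The point is that `artist_uri` is the key linking `headliner`, `artist` and `track`.

```python
import sqlite3

# In-memory SQLite is a stand-in here; the notebook itself targets MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE artist (
        artist_uri  TEXT PRIMARY KEY,
        artist_name TEXT
    );
    CREATE TABLE headliner (
        festival   TEXT,
        artist_uri TEXT REFERENCES artist (artist_uri)
    );
    CREATE TABLE track (
        uri          TEXT,
        name         TEXT,
        danceability REAL,
        artist_uri   TEXT REFERENCES artist (artist_uri),
        PRIMARY KEY (uri, artist_uri)
    );
    """
)

# Invented rows, only to exercise the joins.
conn.execute("INSERT INTO artist VALUES ('spotify:artist:A', 'Example Artist')")
conn.execute("INSERT INTO headliner VALUES ('Example Festival', 'spotify:artist:A')")
conn.executemany(
    "INSERT INTO track VALUES (?, ?, ?, ?)",
    [("spotify:track:1", "Song 1", 0.5, "spotify:artist:A"),
     ("spotify:track:2", "Song 2", 0.75, "spotify:artist:A")],
)

# Average danceability per festival: headliner -> track through artist_uri.
festival_profile = conn.execute(
    """
    SELECT h.festival, AVG(t.danceability)
    FROM headliner AS h
    JOIN track AS t ON t.artist_uri = h.artist_uri
    GROUP BY h.festival
    """
).fetchall()
print(festival_profile)
```

The composite key on `track` is a choice of this sketch: the same track can appear among the top tracks of more than one artist.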

Before anything else, let's connect to our DB:

In [61]:
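import sqlalchemy as sa

# Load the variables defined in credentials/mysql.env
load_dotenv('credentials/mysql.env')
url_banco = "localhost"  # database host
nome_db = "spotify"  # database name
# Replace `sua_senha` with your own password
conn_str = f"mysql+pymysql://root:sua_senha@{url_banco}/{nome_db}"
engine = sa.create_engine(conn_str)

# Write the raw festivals table to the DB under a scratch name
tb_festivals.to_sql('fiifoofuu', engine, index = False, if_exists = 'replace')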

### `headliner` table

The `headliner` table should record which artists play at which festivals. We could use the table loaded from Google Sheets directly, but then the join between **headliner** and **artist** would rely on the artist's name, which is fragile (a spelling variation is enough to break the match).

Let's use the Spotify API to enrich our original table with each artist's URI:

In [62]:
tb_festivals.head()

Unnamed: 0,festival,headliners
0,Tomorrowland,Martin Garrix
1,Tomorrowland,Armin van Buuren
2,Tomorrowland,Dimitri Vegas & Like Mike
3,Tomorrowland,Marshmello
4,Tomorrowland,Amelie Lens


In [63]:
array_headliner = tb_festivals['headliners'].unique()
array_headliner[0:10]

array(['Martin Garrix', 'Armin van Buuren', 'Dimitri Vegas & Like Mike',
       'Marshmello', 'Amelie Lens', 'Adam Beyer', 'Eric Prydz', 'NERVO',
       'Billie Eilish', 'Paul McCartney'], dtype=object)

In [64]:
dict_uri = dict()
for artista in array_headliner:
    search_result = spotify.search(q=artista, type="artist")
    dict_uri[artista] = search_result['artists']['items'][0]['uri']

IndexError: list index out of range

In [68]:
spotify.search(q=artista, type="artist")

{'artists': {'href': 'https://api.spotify.com/v1/search?query=Fabio+%26+Grooverider+and+The+Outlook+Orchestra&type=artist&offset=0&limit=10',
  'items': [],
  'limit': 10,
  'next': None,
  'offset': 0,
  'previous': None,
  'total': 0}}

In [69]:
dict_uri = dict()
for artista in array_headliner:
    try:
        search_result = spotify.search(q=artista, type="artist")
        dict_uri[artista] = search_result['artists']['items'][0]['uri']
    except IndexError as e:
        dict_uri[artista] = np.nan
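
The `try`/`except` above handles searches that return an empty `items` list. An equivalent approach, sketched below, checks the list explicitly instead of catching the exception. `first_artist_uri` is a hypothetical helper, and the two responses are abridged by hand from the outputs shown earlier.

```python
from typing import Optional


def first_artist_uri(search_result: dict) -> Optional[str]:
    """Return the URI of the first artist in a search response, or None if empty."""
    items = search_result.get("artists", {}).get("items", [])
    return items[0]["uri"] if items else None


# Responses with the same shape as the real ones (abridged by hand).
found = {"artists": {"items": [{"name": "Martin Garrix", "uri": "spotify:artist:60d24wfXkVzDSfLS6hyCjZ"}], "total": 1}}
empty = {"artists": {"items": [], "total": 0}}

print(first_artist_uri(found))
print(first_artist_uri(empty))
```

Either way, taking the first search hit is a heuristic: it assumes the most relevant result is the artist we meant, which is worth spot-checking for ambiguous names.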

In [70]:
dict_uri

{'Martin Garrix': 'spotify:artist:60d24wfXkVzDSfLS6hyCjZ',
 'Armin van Buuren': 'spotify:artist:0SfsnGyD8FpIN4U4WCkBZ5',
 'Dimitri Vegas & Like Mike': 'spotify:artist:73jBynjsVtofjRpdpRAJGk',
 'Marshmello': 'spotify:artist:64KEffDW9EtZ1y2vBYgq8T',
 'Amelie Lens': 'spotify:artist:5Ho1vKl1Uz8bJlk4vbmvmf',
 'Adam Beyer': 'spotify:artist:1btv9qmIpbp7q1ixCYNdHu',
 'Eric Prydz': 'spotify:artist:5sm0jQ1mq0dusiLtDJ2b4R',
 'NERVO': 'spotify:artist:4j5KBTO4tk7up54ZirNGvK',
 'Billie Eilish': 'spotify:artist:6qqNVTkY8uBg9cP3Jd7DAH',
 'Paul McCartney': 'spotify:artist:4STHEaNw4mPZ2tzheohgXB',
 'Kendrick Lamar': 'spotify:artist:2YZyLoL8N0Wb9xBt1NhZWg',
 'Diana Ross': 'spotify:artist:3MdG05syQeRYPPcClLaUGl',
 'Burna Boy': 'spotify:artist:3wcj11K77LjEY1PkEazffa',
 'Herbie Hancock': 'spotify:artist:2ZvrvbQNrHKwjT7qfGFFUW',
 'Pet Shop Boys': 'spotify:artist:2ycnb8Er79LoH2AsR5ldjh',
 'Lorde': 'spotify:artist:163tK9Wjr9P9DmM0AVK7lm',
 'Metallica': 'spotify:artist:2ye2Wgw4gimLv2eAKyk1NB',
 'The Killers': '

In [71]:
tb_festivals['artist_uri'] = tb_festivals['headliners'].map(dict_uri)
tb_festivals.head()

Unnamed: 0,festival,headliners,artist_uri
0,Tomorrowland,Martin Garrix,spotify:artist:60d24wfXkVzDSfLS6hyCjZ
1,Tomorrowland,Armin van Buuren,spotify:artist:0SfsnGyD8FpIN4U4WCkBZ5
2,Tomorrowland,Dimitri Vegas & Like Mike,spotify:artist:73jBynjsVtofjRpdpRAJGk
3,Tomorrowland,Marshmello,spotify:artist:64KEffDW9EtZ1y2vBYgq8T
4,Tomorrowland,Amelie Lens,spotify:artist:5Ho1vKl1Uz8bJlk4vbmvmf


In [72]:
tb_festivals['headliners'].map(dict_uri)

0      spotify:artist:60d24wfXkVzDSfLS6hyCjZ
1      spotify:artist:0SfsnGyD8FpIN4U4WCkBZ5
2      spotify:artist:73jBynjsVtofjRpdpRAJGk
3      spotify:artist:64KEffDW9EtZ1y2vBYgq8T
4      spotify:artist:5Ho1vKl1Uz8bJlk4vbmvmf
                       ...                  
196    spotify:artist:60d24wfXkVzDSfLS6hyCjZ
197    spotify:artist:73A3bLnfnz5BoQjb4gNCga
198    spotify:artist:2o5jDhtHVPhrJdv3cEQ99Z
199    spotify:artist:0CbeG1224FS58EUx4tPevZ
200    spotify:artist:1Cs0zKBU1kc0i8ypK3B9ai
Name: headliners, Length: 201, dtype: object

In [73]:
tb_festivals[tb_festivals['artist_uri'].isna()]

Unnamed: 0,festival,headliners,artist_uri
116,Outlook Origins Festival,Fabio & Grooverider and The Outlook Orchestra,


In [74]:
tb_festivals.dropna()

Unnamed: 0,festival,headliners,artist_uri
0,Tomorrowland,Martin Garrix,spotify:artist:60d24wfXkVzDSfLS6hyCjZ
1,Tomorrowland,Armin van Buuren,spotify:artist:0SfsnGyD8FpIN4U4WCkBZ5
2,Tomorrowland,Dimitri Vegas & Like Mike,spotify:artist:73jBynjsVtofjRpdpRAJGk
3,Tomorrowland,Marshmello,spotify:artist:64KEffDW9EtZ1y2vBYgq8T
4,Tomorrowland,Amelie Lens,spotify:artist:5Ho1vKl1Uz8bJlk4vbmvmf
...,...,...,...
196,Creamfields,Martin Garrix,spotify:artist:60d24wfXkVzDSfLS6hyCjZ
197,Creamfields,Bicep,spotify:artist:73A3bLnfnz5BoQjb4gNCga
198,Creamfields,Tiësto,spotify:artist:2o5jDhtHVPhrJdv3cEQ99Z
199,Creamfields,Timmy Trumpet,spotify:artist:0CbeG1224FS58EUx4tPevZ


In [75]:
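# Drop headliners without a Spotify match and store the result as `headliner`
tb_festivals.dropna(subset=['artist_uri']).to_sql('headliner', engine, index = False, if_exists = 'replace')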

200

### `artist` table

The `artist` table should contain information about the artists present in the `headliner` table. Let's start by selecting only the distinct URIs from the `headliner` table:

In [76]:
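# `engine.execute` was removed in SQLAlchemy 2.0; use an explicit connection instead
with engine.connect() as conn:
    artist_uri = conn.execute(sa.text('SELECT DISTINCT artist_uri FROM headliner')).fetchall()
artist_uri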

[('spotify:artist:60d24wfXkVzDSfLS6hyCjZ',),
 ('spotify:artist:0SfsnGyD8FpIN4U4WCkBZ5',),
 ('spotify:artist:73jBynjsVtofjRpdpRAJGk',),
 ('spotify:artist:64KEffDW9EtZ1y2vBYgq8T',),
 ('spotify:artist:5Ho1vKl1Uz8bJlk4vbmvmf',),
 ('spotify:artist:1btv9qmIpbp7q1ixCYNdHu',),
 ('spotify:artist:5sm0jQ1mq0dusiLtDJ2b4R',),
 ('spotify:artist:4j5KBTO4tk7up54ZirNGvK',),
 ('spotify:artist:6qqNVTkY8uBg9cP3Jd7DAH',),
 ('spotify:artist:4STHEaNw4mPZ2tzheohgXB',),
 ('spotify:artist:2YZyLoL8N0Wb9xBt1NhZWg',),
 ('spotify:artist:3MdG05syQeRYPPcClLaUGl',),
 ('spotify:artist:3wcj11K77LjEY1PkEazffa',),
 ('spotify:artist:2ZvrvbQNrHKwjT7qfGFFUW',),
 ('spotify:artist:2ycnb8Er79LoH2AsR5ldjh',),
 ('spotify:artist:163tK9Wjr9P9DmM0AVK7lm',),
 ('spotify:artist:2ye2Wgw4gimLv2eAKyk1NB',),
 ('spotify:artist:0C0XlULifJtAgn6ZNCW2eu',),
 ('spotify:artist:12Chz98pHFMPJEknJQMWvI',),
 ('spotify:artist:2qk9voo8llSGYcZ6xrBzKx',),
 ('spotify:artist:3YQKmKGau1PzlVlkL1iodx',),
 ('spotify:artist:53XhwfbYqKCa1cC15pYq2q',),
 ('spotify

In [77]:
len(artist_uri)

143

In [78]:
for artista in artist_uri:
    print(artista[0])

spotify:artist:60d24wfXkVzDSfLS6hyCjZ
spotify:artist:0SfsnGyD8FpIN4U4WCkBZ5
spotify:artist:73jBynjsVtofjRpdpRAJGk
spotify:artist:64KEffDW9EtZ1y2vBYgq8T
spotify:artist:5Ho1vKl1Uz8bJlk4vbmvmf
spotify:artist:1btv9qmIpbp7q1ixCYNdHu
spotify:artist:5sm0jQ1mq0dusiLtDJ2b4R
spotify:artist:4j5KBTO4tk7up54ZirNGvK
spotify:artist:6qqNVTkY8uBg9cP3Jd7DAH
spotify:artist:4STHEaNw4mPZ2tzheohgXB
spotify:artist:2YZyLoL8N0Wb9xBt1NhZWg
spotify:artist:3MdG05syQeRYPPcClLaUGl
spotify:artist:3wcj11K77LjEY1PkEazffa
spotify:artist:2ZvrvbQNrHKwjT7qfGFFUW
spotify:artist:2ycnb8Er79LoH2AsR5ldjh
spotify:artist:163tK9Wjr9P9DmM0AVK7lm
spotify:artist:2ye2Wgw4gimLv2eAKyk1NB
spotify:artist:0C0XlULifJtAgn6ZNCW2eu
spotify:artist:12Chz98pHFMPJEknJQMWvI
spotify:artist:2qk9voo8llSGYcZ6xrBzKx
spotify:artist:3YQKmKGau1PzlVlkL1iodx
spotify:artist:53XhwfbYqKCa1cC15pYq2q
spotify:artist:6GbCJZrI318Ybm8mY36Of5
spotify:artist:6zvul52xwTWzilBZl6BUbT
spotify:artist:6RZUqkomCmb8zCRqc9eznB
spotify:artist:1Cs0zKBU1kc0i8ypK3B9ai
spotify:arti

In [79]:
dados_artista = []
for artista in artist_uri:
    search_result = spotify.artist(artista[0])
    dados_artista.append(
        (
            search_result['uri'],
            search_result['name'],
            search_result['popularity'],
            search_result['followers']['total']
        )
    )
dados_artista[0:5]

[('spotify:artist:60d24wfXkVzDSfLS6hyCjZ', 'Martin Garrix', 76, 15618020),
 ('spotify:artist:0SfsnGyD8FpIN4U4WCkBZ5', 'Armin van Buuren', 73, 4179784),
 ('spotify:artist:73jBynjsVtofjRpdpRAJGk',
  'Dimitri Vegas & Like Mike',
  70,
  3256496),
 ('spotify:artist:64KEffDW9EtZ1y2vBYgq8T', 'Marshmello', 82, 34151481),
 ('spotify:artist:5Ho1vKl1Uz8bJlk4vbmvmf', 'Amelie Lens', 48, 454443)]

In [80]:
dados_artista[0]

('spotify:artist:60d24wfXkVzDSfLS6hyCjZ', 'Martin Garrix', 76, 15618020)

In [81]:
tb_artista = pd.DataFrame(dados_artista, 
                          columns = ['artist_uri', 'artist_name', 'popularity', 'followers'])
tb_artista.head()

Unnamed: 0,artist_uri,artist_name,popularity,followers
0,spotify:artist:60d24wfXkVzDSfLS6hyCjZ,Martin Garrix,76,15618020
1,spotify:artist:0SfsnGyD8FpIN4U4WCkBZ5,Armin van Buuren,73,4179784
2,spotify:artist:73jBynjsVtofjRpdpRAJGk,Dimitri Vegas & Like Mike,70,3256496
3,spotify:artist:64KEffDW9EtZ1y2vBYgq8T,Marshmello,82,34151481
4,spotify:artist:5Ho1vKl1Uz8bJlk4vbmvmf,Amelie Lens,48,454443


In [82]:
tb_artista.to_sql('artist', engine, index = False, if_exists = 'replace')

143

### `track` table

Now, with each artist's URI, we can use the `.artist_top_tracks()` and `.audio_features()` methods to extract the information for each track (both general information, such as the name, and the *audio features* themselves):

In [83]:
top_10_tracks = spotify.artist_top_tracks('spotify:artist:0C0XlULifJtAgn6ZNCW2eu')
top_10_tracks

{'tracks': [{'album': {'album_type': 'album',
    'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/0C0XlULifJtAgn6ZNCW2eu'},
      'href': 'https://api.spotify.com/v1/artists/0C0XlULifJtAgn6ZNCW2eu',
      'id': '0C0XlULifJtAgn6ZNCW2eu',
      'name': 'The Killers',
      'type': 'artist',
      'uri': 'spotify:artist:0C0XlULifJtAgn6ZNCW2eu'}],
    'external_urls': {'spotify': 'https://open.spotify.com/album/4piJq7R3gjUOxnYs6lDCTg'},
    'href': 'https://api.spotify.com/v1/albums/4piJq7R3gjUOxnYs6lDCTg',
    'id': '4piJq7R3gjUOxnYs6lDCTg',
    'images': [{'height': 640,
      'url': 'https://i.scdn.co/image/ab67616d0000b273ccdddd46119a4ff53eaf1f5d',
      'width': 640},
     {'height': 300,
      'url': 'https://i.scdn.co/image/ab67616d00001e02ccdddd46119a4ff53eaf1f5d',
      'width': 300},
     {'height': 64,
      'url': 'https://i.scdn.co/image/ab67616d00004851ccdddd46119a4ff53eaf1f5d',
      'width': 64}],
    'name': 'Hot Fuss',
    'release_date': '2004'

In [84]:
top_10_tracks['tracks']

[{'album': {'album_type': 'album',
   'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/0C0XlULifJtAgn6ZNCW2eu'},
     'href': 'https://api.spotify.com/v1/artists/0C0XlULifJtAgn6ZNCW2eu',
     'id': '0C0XlULifJtAgn6ZNCW2eu',
     'name': 'The Killers',
     'type': 'artist',
     'uri': 'spotify:artist:0C0XlULifJtAgn6ZNCW2eu'}],
   'external_urls': {'spotify': 'https://open.spotify.com/album/4piJq7R3gjUOxnYs6lDCTg'},
   'href': 'https://api.spotify.com/v1/albums/4piJq7R3gjUOxnYs6lDCTg',
   'id': '4piJq7R3gjUOxnYs6lDCTg',
   'images': [{'height': 640,
     'url': 'https://i.scdn.co/image/ab67616d0000b273ccdddd46119a4ff53eaf1f5d',
     'width': 640},
    {'height': 300,
     'url': 'https://i.scdn.co/image/ab67616d00001e02ccdddd46119a4ff53eaf1f5d',
     'width': 300},
    {'height': 64,
     'url': 'https://i.scdn.co/image/ab67616d00004851ccdddd46119a4ff53eaf1f5d',
     'width': 64}],
   'name': 'Hot Fuss',
   'release_date': '2004',
   'release_date_precision': 

In [85]:
[track['uri'] for track in top_10_tracks['tracks']]

['spotify:track:003vvx7Niy0yvhvHt4a68B',
 'spotify:track:6PwjJ58I4t7Mae9xfZ9l9v',
 'spotify:track:70wYA8oYHoMzhRRkARoMhU',
 'spotify:track:1sTsuZTdANkiFd7T34H3nb',
 'spotify:track:5vollujufHY0jMZxx77VWr',
 'spotify:track:3Qw0WuniULBdYjXe2jsqCy',
 'spotify:track:7cX4PJz1old9fyFI8RlfgW',
 'spotify:track:3KANrKOFYyAxfjQJHkgBdb',
 'spotify:track:2aZ2Co4NeQRsqWcU930zHT',
 'spotify:track:5aWhs651KYM26HYM16kRdk']

In [86]:
top_10_tracks['tracks'][0].keys()

dict_keys(['album', 'artists', 'disc_number', 'duration_ms', 'explicit', 'external_ids', 'external_urls', 'href', 'id', 'is_local', 'is_playable', 'name', 'popularity', 'preview_url', 'track_number', 'type', 'uri'])

In [92]:
top_10_tracks['tracks'][0]['name']

'Mr. Brightside'

In [91]:
spotify.audio_features(top_10_tracks['tracks'][0]['uri'])

[{'danceability': 0.352,
  'energy': 0.911,
  'key': 1,
  'loudness': -5.23,
  'mode': 1,
  'speechiness': 0.0747,
  'acousticness': 0.00121,
  'instrumentalness': 0,
  'liveness': 0.0995,
  'valence': 0.236,
  'tempo': 148.033,
  'type': 'audio_features',
  'id': '003vvx7Niy0yvhvHt4a68B',
  'uri': 'spotify:track:003vvx7Niy0yvhvHt4a68B',
  'track_href': 'https://api.spotify.com/v1/tracks/003vvx7Niy0yvhvHt4a68B',
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/003vvx7Niy0yvhvHt4a68B',
  'duration_ms': 222973,
  'time_signature': 4}]

In [93]:
track_data = []

for artist in artist_uri:
    top_tracks = spotify.artist_top_tracks(artist[0])
    for track in top_tracks['tracks']:
        track_au = spotify.audio_features(track['uri'])[0]
        track_au['name'] = track['name']
        track_au['popularity'] = track['popularity']
        track_au['explicit'] = track['explicit']
        track_au['artist_uri'] = artist[0]
        track_data.append(track_au)
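
The loop above makes one `audio_features` request per track. As far as the Spotify documentation describes it, the endpoint accepts up to 100 track IDs per request and can return `None` for a track without audio features, which would break the `track_au['name'] = ...` assignment. The sketch below shows one way to batch the requests and skip such entries. `fetch_audio_features` and `chunked` are hypothetical helpers, and `FakeClient` stands in for `spotipy.Spotify` so the example runs offline.

```python
def chunked(items: list, size: int) -> list:
    """Split a list into consecutive pieces of at most `size` elements."""
    return [items[start:start + size] for start in range(0, len(items), size)]


def fetch_audio_features(client, track_uris: list, batch_size: int = 100) -> list:
    """Fetch audio features in batches, skipping tracks that come back as None."""
    features = []
    for batch in chunked(track_uris, batch_size):
        for item in client.audio_features(batch):
            if item is not None:
                features.append(item)
    return features


class FakeClient:
    """Stand-in for `spotipy.Spotify` so the sketch runs offline."""

    def __init__(self):
        self.calls = 0

    def audio_features(self, tracks):
        self.calls += 1
        # Pretend that the track named "missing" has no audio features.
        return [None if uri == "missing" else {"uri": uri, "tempo": 120.0} for uri in tracks]


client = FakeClient()
result = fetch_audio_features(client, ["a", "missing", "b", "c", "d"], batch_size=2)
print(client.calls, [item["uri"] for item in result])
```

Because each returned dictionary carries its own `uri`, the extra fields (`name`, `popularity`, `explicit`, `artist_uri`) can be attached afterwards by matching on that key.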

In [105]:
track

{'album': {'album_type': 'compilation',
  'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/0LyfQWJT6nXafLPZqxe9Of'},
    'href': 'https://api.spotify.com/v1/artists/0LyfQWJT6nXafLPZqxe9Of',
    'id': '0LyfQWJT6nXafLPZqxe9Of',
    'name': 'Various Artists',
    'type': 'artist',
    'uri': 'spotify:artist:0LyfQWJT6nXafLPZqxe9Of'}],
  'external_urls': {'spotify': 'https://open.spotify.com/album/6b439Wfa3J7VMZCTNoLgN8'},
  'href': 'https://api.spotify.com/v1/albums/6b439Wfa3J7VMZCTNoLgN8',
  'id': '6b439Wfa3J7VMZCTNoLgN8',
  'images': [{'height': 640,
    'url': 'https://i.scdn.co/image/ab67616d0000b273db8fbd9b4acb3a967c3408e7',
    'width': 640},
   {'height': 300,
    'url': 'https://i.scdn.co/image/ab67616d00001e02db8fbd9b4acb3a967c3408e7',
    'width': 300},
   {'height': 64,
    'url': 'https://i.scdn.co/image/ab67616d00004851db8fbd9b4acb3a967c3408e7',
    'width': 64}],
  'name': 'EDM Vegas 2017',
  'release_date': '2017-06-16',
  'release_date_precision': 

In [103]:
spotify.audio_features(track['uri'])

[{'danceability': 0.732,
  'energy': 0.941,
  'key': 11,
  'loudness': -6.817,
  'mode': 0,
  'speechiness': 0.127,
  'acousticness': 0.00328,
  'instrumentalness': 0.88,
  'liveness': 0.0624,
  'valence': 0.259,
  'tempo': 124.006,
  'type': 'audio_features',
  'id': '2mrYurqSVRigZnrLZk42j4',
  'uri': 'spotify:track:2mrYurqSVRigZnrLZk42j4',
  'track_href': 'https://api.spotify.com/v1/tracks/2mrYurqSVRigZnrLZk42j4',
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/2mrYurqSVRigZnrLZk42j4',
  'duration_ms': 452283,
  'time_signature': 4}]

In [107]:
abc = spotify.audio_features(track['uri'])[0]

In [108]:
abc['name'] = track['name']

In [109]:
abc

{'danceability': 0.732,
 'energy': 0.941,
 'key': 11,
 'loudness': -6.817,
 'mode': 0,
 'speechiness': 0.127,
 'acousticness': 0.00328,
 'instrumentalness': 0.88,
 'liveness': 0.0624,
 'valence': 0.259,
 'tempo': 124.006,
 'type': 'audio_features',
 'id': '2mrYurqSVRigZnrLZk42j4',
 'uri': 'spotify:track:2mrYurqSVRigZnrLZk42j4',
 'track_href': 'https://api.spotify.com/v1/tracks/2mrYurqSVRigZnrLZk42j4',
 'analysis_url': 'https://api.spotify.com/v1/audio-analysis/2mrYurqSVRigZnrLZk42j4',
 'duration_ms': 452283,
 'time_signature': 4,
 'name': "Ohh Baby - David Tort's Dub Tech Mix"}

In [98]:
tb_tracks = pd.DataFrame(track_data)
tb_tracks.head()

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,...,id,uri,track_href,analysis_url,duration_ms,time_signature,name,popularity,explicit,artist_uri
0,0.584,0.54,1,-7.786,0,0.0576,0.0895,0.0,0.261,0.195,...,3ebXMykcMXOcLeJ9xZ17XH,spotify:track:3ebXMykcMXOcLeJ9xZ17XH,https://api.spotify.com/v1/tracks/3ebXMykcMXOc...,https://api.spotify.com/v1/audio-analysis/3ebX...,220883,4,Scared to Be Lonely,77,False,spotify:artist:60d24wfXkVzDSfLS6hyCjZ
1,0.501,0.519,4,-5.88,0,0.0409,0.109,0.0,0.454,0.168,...,23L5CiUhw2jV1OIMwthR3S,spotify:track:23L5CiUhw2jV1OIMwthR3S,https://api.spotify.com/v1/tracks/23L5CiUhw2jV...,https://api.spotify.com/v1/audio-analysis/23L5...,195707,4,In the Name of Love,77,False,spotify:artist:60d24wfXkVzDSfLS6hyCjZ
2,0.661,0.723,5,-6.976,0,0.0566,0.179,1.2e-05,0.14,0.316,...,7Feaw9WAEREY0DUOSXJLOM,spotify:track:7Feaw9WAEREY0DUOSXJLOM,https://api.spotify.com/v1/tracks/7Feaw9WAEREY...,https://api.spotify.com/v1/audio-analysis/7Fea...,163805,4,Summer Days (feat. Macklemore & Patrick Stump ...,75,True,spotify:artist:60d24wfXkVzDSfLS6hyCjZ
3,0.414,0.486,6,-6.431,0,0.0311,0.0129,0.0,0.111,0.368,...,4ut5G4rgB1ClpMTMfjoIuy,spotify:track:4ut5G4rgB1ClpMTMfjoIuy,https://api.spotify.com/v1/tracks/4ut5G4rgB1Cl...,https://api.spotify.com/v1/audio-analysis/4ut5...,230762,4,High On Life (feat. Bonn),71,False,spotify:artist:60d24wfXkVzDSfLS6hyCjZ
4,0.693,0.725,5,-6.318,0,0.0439,0.0628,0.00785,0.157,0.379,...,0lqgo6rIBS0nVsvppZC3Ay,spotify:track:0lqgo6rIBS0nVsvppZC3Ay,https://api.spotify.com/v1/tracks/0lqgo6rIBS0n...,https://api.spotify.com/v1/audio-analysis/0lqg...,190560,4,Loop,70,True,spotify:artist:60d24wfXkVzDSfLS6hyCjZ


In [99]:
tb_tracks.to_sql('track', engine, index = False, if_exists = 'replace')

1412

## Using our database

In [110]:
query_festival = '''
    SELECT
        h.festival,
        AVG(t.valence) AS feliz,
        AVG(t.energy) AS energia,
        AVG(t.danceability) AS dancavel,
        AVG(a.popularity) AS pop,
        AVG(a.followers) AS seguidores
    FROM 
        headliner h JOIN
        artist a ON (h.artist_uri = a.artist_uri) JOIN
        track t ON (a.artist_uri = t.artist_uri)
    GROUP BY
        h.festival 
'''
tb_festival_au = pd.read_sql(query_festival, engine)
tb_festival_au.head()

Unnamed: 0,festival,feliz,energia,dancavel,pop,seguidores
0,Creamfields,0.355613,0.77905,0.6731,68.625,6666338.0
1,Tomorrowland,0.336969,0.805462,0.643012,64.875,7398469.0
2,AMF,0.413505,0.815963,0.634975,73.5,5354958.0
3,SAGA Festival,0.415411,0.770862,0.691025,67.375,7061979.0
4,Time Warp,0.264539,0.736357,0.708914,44.2857,313081.1
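
One detail of the query above is worth checking: because `artist` is joined to `track` before aggregating, each artist row is repeated once per track, so `AVG(a.popularity)` and `AVG(a.followers)` are weighted by how many tracks each artist has. The sketch below reproduces the effect on a tiny in-memory SQLite database; the table contents are made up for illustration and are not the festival data.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript('''
    CREATE TABLE headliner (festival TEXT, artist_uri TEXT);
    CREATE TABLE artist (artist_uri TEXT, popularity INTEGER);
    CREATE TABLE track (artist_uri TEXT, valence REAL);

    INSERT INTO headliner VALUES ('Fest A', 'a1'), ('Fest A', 'a2');
    INSERT INTO artist VALUES ('a1', 80), ('a2', 40);
    -- a1 has three tracks, a2 has only one
    INSERT INTO track VALUES ('a1', 0.2), ('a1', 0.4), ('a1', 0.6), ('a2', 0.8);
''')

# Same join shape as the festival query: artists repeated once per track
joined_avg = conn.execute('''
    SELECT AVG(a.popularity)
    FROM headliner h
    JOIN artist a ON h.artist_uri = a.artist_uri
    JOIN track t ON a.artist_uri = t.artist_uri
    GROUP BY h.festival
''').fetchone()[0]

# Artist-level average: one row per artist
artist_avg = conn.execute('''
    SELECT AVG(a.popularity)
    FROM headliner h
    JOIN artist a ON h.artist_uri = a.artist_uri
    GROUP BY h.festival
''').fetchone()[0]

print(joined_avg, artist_avg)  # 70.0 60.0
```

When every artist contributes the same number of tracks, the two averages coincide, so the difference only matters when track counts vary across artists.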


In [119]:
tb_festival_au.sort_values('seguidores', ascending = False).head()

Unnamed: 0,festival,feliz,energia,dancavel,pop,seguidores
13,Sziget Festival,0.596743,0.714929,0.621529,81.5714,21757210.0
5,Glastonbury,0.513408,0.581076,0.632138,73.625,13710960.0
11,Mad Cool Festival,0.435,0.764189,0.532622,74.5556,12388460.0
6,Wireless Festival,0.511045,0.643625,0.677313,80.25,12266700.0
20,Rolling Loud Portugal,0.422218,0.608063,0.73645,81.625,10768070.0


# Back at 21:28

In [2]:
a = '13,47%'

In [6]:
# Drop '%' and thousands dots, use '.' as the decimal mark, then scale to a fraction
float(a.replace('%', '').replace('.', '').replace(',', '.')) / 100

0.13470000000000001
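
The same conversion can be wrapped in a small function so it can be reused, for example over a column of strings. This is a minimal sketch that assumes the input uses `.` as thousands separator and `,` as decimal mark; the name `parse_percent_br` is hypothetical.

```python
def parse_percent_br(text):
    """Convert a percentage such as '13,47%' into a fraction such as 0.1347."""
    cleaned = (
        text.strip()
        .replace('%', '')   # drop the percent sign
        .replace('.', '')   # drop thousands separators
        .replace(',', '.')  # decimal comma -> decimal point
    )
    return float(cleaned) / 100


print(parse_percent_br('13,47%'))
print(parse_percent_br('1.234,5%'))
```

The printed values are floating-point approximations, so compare them with a tolerance rather than with `==`.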

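In [None]:
import re
# Same cleanup with re.sub: drop '%' and thousands dots, then swap the decimal comma
float(re.sub(r'[%.]', '', a).replace(',', '.')) / 100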

In [2]:
import pandas as pd

In [5]:
import numpy as np

In [3]:
abc = pd.DataFrame({'a' : [1, 1, 1, 2], 'b' : [1,2,3,1]})

In [6]:
np.unique(abc[['a', 'b']].values)

array([1, 2, 3])
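
Note that `np.unique` flattens the two columns before deduplicating, which is why the result above is a single list of values instead of unique `(a, b)` pairs. The sketch below contrasts the flattened call with `axis=0` (unique rows) and `return_counts=True`, using the same values as the DataFrame above; the variable names are only illustrative.

```python
import numpy as np

# Same values as abc[['a', 'b']].values
values = np.array([[1, 1], [1, 2], [1, 3], [2, 1]])

# Default: the array is flattened, so we get the distinct scalars
flat_unique = np.unique(values)

# axis=0: each row is treated as one item, so we get the distinct (a, b) pairs
row_unique = np.unique(values, axis=0)

# return_counts=True also reports how often each distinct scalar appears
distinct, counts = np.unique(values, return_counts=True)

print(flat_unique)       # [1 2 3]
print(row_unique.shape)  # (4, 2)
print(counts)            # [5 2 1]
```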

In [2]:
lista_anos = [2019, 2020]
lista_meses = range(1, 13)

In [6]:
[f'{ano}-{mes}' for mes in lista_meses for ano in lista_anos]

['2019-1',
 '2020-1',
 '2019-2',
 '2020-2',
 '2019-3',
 '2020-3',
 '2019-4',
 '2020-4',
 '2019-5',
 '2020-5',
 '2019-6',
 '2020-6',
 '2019-7',
 '2020-7',
 '2019-8',
 '2020-8',
 '2019-9',
 '2020-9',
 '2019-10',
 '2020-10',
 '2019-11',
 '2020-11',
 '2019-12',
 '2020-12']
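
In the comprehension above the years alternate because the first `for` clause (`mes`) is the outer loop and the second (`ano`) is the inner one. A minimal sketch of the chronological variant: put `ano` first and zero-pad the month with the `:02d` format specifier. The name `periodos` is only illustrative.

```python
lista_anos = [2019, 2020]
lista_meses = range(1, 13)

# The leftmost `for` is the outer loop, so all months of 2019 come before 2020
periodos = [f'{ano}-{mes:02d}' for ano in lista_anos for mes in lista_meses]

print(periodos[:3])   # ['2019-01', '2019-02', '2019-03']
print(periodos[-1])   # 2020-12
```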

In [5]:
import time

for ano in lista_anos:
    for mes in lista_meses:
        # Zero-pad the month so every period reads YYYY-MM
        mes_str = f'{mes:02d}'

        print(f'{ano}-{mes_str}')
        # Wait 10 seconds before moving on to the next period
        time.sleep(10)

2019-01
2019-02
2019-03
2019-04
2019-05
2019-06
2019-07
2019-08
2019-09
2019-10
2019-11
2019-12
2020-01
2020-02
2020-03
2020-04
2020-05
2020-06
2020-07
2020-08
2020-09
2020-10
2020-11
2020-12
