Based on [this notebook](https://gist.github.com/Kautenja/71f139eee58099b77e91a0d775e42b47#file-visualizing-spotify-related-artists-with-gephi-ipynb) by @Kautenja on GitHub Gist and [this notebook](https://github.com/nazareno/redes-do-spotify/blob/main/gera-rede-para-gephi.ipynb) by @nazareno on GitHub.

# Spotify API

To use the Spotify API we will rely on spotipy, a lightweight _wrapper_ available for _Python_:

In [1]:
!pip install spotipy



In [2]:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
# Credentials of the application created in the Spotify Developer dashboard:
client_id = ''
client_secret = ''
# Create a credentials manager and a Spotify API client object:
client_credentials_manager = SpotifyClientCredentials(client_id=client_id, client_secret=client_secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)
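
The cell above leaves `client_id` and `client_secret` as empty strings to be filled in by hand. One way to keep the secrets out of the notebook is to read them from environment variables. The sketch below assumes the variables are named `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` (the names spotipy itself looks for); the helper `load_spotify_credentials` is hypothetical and not part of the original notebook.

```python
import os

def load_spotify_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    Hypothetical helper: the variable names are an assumption, chosen to
    match the ones spotipy reads by default.
    """
    try:
        return env['SPOTIPY_CLIENT_ID'], env['SPOTIPY_CLIENT_SECRET']
    except KeyError as missing:
        # Fail early with a readable message instead of an opaque auth error.
        raise RuntimeError(f'Missing environment variable: {missing}') from None

# Usage (the variables must be set before the notebook kernel starts):
# client_id, client_secret = load_spotify_credentials()
```

The `env` parameter exists only so the function can be exercised with a plain dictionary instead of the real environment.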

## Root node

The root node will be the artist [Vintage Culture](https://open.spotify.com/artist/28uJnu5EsrGml2tBd7y8ts?si=ALFe0k18RWeKGfmDAJDqoQ).

In [3]:
# Spotify URI for the artist Vintage Culture:
seed_artist = 'spotify:artist:28uJnu5EsrGml2tBd7y8ts'
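
The seed above is a Spotify URI, while the link in the text is an `open.spotify.com` URL; both carry the same artist ID. As far as I know, spotipy accepts any of the three forms, so the following is only a sketch for cases where the bare ID is needed (for example as a dictionary key). The function name `artist_id_from_ref` is hypothetical.

```python
from urllib.parse import urlparse

def artist_id_from_ref(ref: str) -> str:
    """Extract the bare artist ID from a Spotify URI, URL, or plain ID."""
    if ref.startswith('spotify:artist:'):
        # URI form: spotify:artist:<id>
        return ref.split(':')[2]
    if ref.startswith(('http://', 'https://')):
        # URL form: https://open.spotify.com/artist/<id>?si=...
        parts = [p for p in urlparse(ref).path.split('/') if p]
        if len(parts) >= 2 and parts[-2] == 'artist':
            return parts[-1]
        raise ValueError(f'Not an artist URL: {ref}')
    # Anything else is assumed to already be a bare ID.
    return ref
```

For instance, `artist_id_from_ref(seed_artist)` yields `'28uJnu5EsrGml2tBd7y8ts'`, the same ID that appears in the artist link above.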

# Building the network

To build the network, we use a modified, depth-limited DFS that produces a dictionary mapping each artist's name to the list of their related artists.

In [4]:
def build_related_network(artist_id, depth: int=3) -> dict:
    """
    Return a dictionary of artist names to lists of their related artists.
    
    Args:
        artist_id: the id of the artist to start the graph from
        depth: the depth into the related artist network 
    
    Returns: a dictionary of strings (artist name) to lists (related artists)
    """
    graph = dict()
    _build_related_network(artist_id, depth, graph)
    return graph

def _build_related_network(artist_id, depth, graph):
    """
    Recursively collect related artists and store the results in the graph.
    
    Args:
        artist_id: the artist to get the related artists of
        depth: the current depth in the graph
        graph: the dictionary to put the related artist results into
    """
    if depth == 0:
        return
    name = sp.artist(artist_id)['name']
    if name in graph:
        print("revisit for " + name)
        return
    print("fetching for " + name)
    like_artist = sp.artist_related_artists(artist_id)
    graph[name] = [related['name'] for related in like_artist['artists']]
    # recurse into each related artist with one less level of depth
    for related in like_artist['artists']:
        _build_related_network(related['id'], depth - 1, graph)
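
To see how the depth limit and the revisit check interact without calling the API, here is a minimal offline sketch of the same traversal. The mapping `toy_related` and the function `build_toy_network` are made up for illustration; they stand in for `sp.artist_related_artists` and `build_related_network`.

```python
# Hypothetical stand-in for the API: each name maps to its related names.
toy_related = {
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['D'],
    'D': ['E'],
    'E': [],
}

def build_toy_network(start, depth):
    """Depth-limited DFS that skips names already present in the graph."""
    graph = {}

    def visit(name, remaining):
        # Stop when the depth budget is spent or the node was already expanded.
        if remaining == 0 or name in graph:
            return
        graph[name] = list(toy_related[name])
        for related in toy_related[name]:
            visit(related, remaining - 1)

    visit(start, depth)
    return graph

shallow = build_toy_network('A', 1)   # only 'A' is expanded
deeper = build_toy_network('A', 3)    # 'A', 'B', 'C' and 'D' are expanded
```

With `depth=3`, `'E'` shows up only inside the related list of `'D'` and never as a key, because the depth budget is exhausted before it is reached. The real network behaves the same way: artists on the outer frontier appear as edge targets but have no outgoing edges of their own.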

In [5]:
# Checking the simplest case (depth 1, seed artist only):
build_related_network(seed_artist, 1)

fetching for Vintage Culture


{'Vintage Culture': ['Cat Dealers',
  'Chapeleiro',
  'Dubdogz',
  'Bhaskar',
  'JØRD',
  'KVSH',
  'Jetlag Music',
  'Malifoo',
  'DUX',
  'Sevenn',
  'Evokings',
  'Breno Miranda',
  'Zeeba',
  'Vitor Kley',
  'Groove Delight',
  'Chemical Surf',
  'GHOSTT',
  'VINNE',
  'Bruno Be',
  'Atitude 67']}

In [6]:
like_seed = build_related_network(seed_artist, 5)

fetching for Vintage Culture
fetching for Cat Dealers
fetching for Jetlag Music
fetching for DUX
revisit for Jetlag Music
fetching for GHOSTT
fetching for Malifoo
fetching for Breno Miranda
fetching for Zeeba
fetching for Evokings
fetching for JØRD
fetching for Selva
fetching for Joy Corporation
fetching for Sevenn
fetching for KVSH
fetching for Dazzo
fetching for Dubdogz
fetching for Elekfantz
fetching for VINNE
fetching for Pontifexx
fetching for Bhaskar
revisit for Cat Dealers
fetching for Bruno Be
fetching for Zerb
revisit for Evokings
revisit for Malifoo
revisit for GHOSTT
revisit for JØRD
revisit for Breno Miranda
revisit for Zeeba
revisit for Selva
revisit for KVSH
revisit for Sevenn
revisit for Dazzo
revisit for Joy Corporation
revisit for Cat Dealers
revisit for Dubdogz
revisit for Bhaskar
revisit for VINNE
revisit for Elekfantz
revisit for Pontifexx
revisit for Bruno Be
fetching for Groove Delight
revisit for JØRD
revisit for GHOSTT
revisit for Evokings
revisit for Malifoo
[output truncated]

# Formatting the network for Gephi

In [7]:
import pandas as pd
import numpy as np

## nodes.csv

In [8]:
def build_nodes(graph: dict) -> pd.DataFrame:
    """
    Return a dataframe of nodes for the given graph.
    
    Args:
        graph: the graph to generate a unique table of nodes from
        
    Returns: a dataframe with nodes and unique ids
    """
    _nodes = []

    # iterate over all the artists in the list
    for artist, related_list in graph.items():
        _nodes.append(artist)
        _nodes.extend(related_list)

    # keep only unique nodes
    _nodes = np.unique(_nodes)
    # make a dataframe to generate ids
    _nodes = pd.DataFrame(_nodes, columns=['label'])
    # use the index column as the id
    _nodes['id'] = _nodes.index
    return _nodes

In [9]:
network_nodes = build_nodes(like_seed)
network_nodes.head()

|   | label | id |
|---|-------|----|
| 0 | 1Kilo | 0 |
| 1 | 3030 | 1 |
| 2 | 4i20 | 2 |
| 3 | 5 a Seco | 3 |
| 4 | 7th Sun | 4 |


## edges.csv

In [10]:
def build_edges(graph: dict, nodes: pd.DataFrame) -> pd.DataFrame:
    """
    Return a dataframe of edges based on the graph and table of node ids.
    
    Args:
        graph: the graph to find edges in
        nodes: the table of nodes with unique node ids
        
    Returns: a table of edges (source id, target id) using the unique node ids
    """
    _edges = []

    for artist, related_list in graph.items():
        artist_node = nodes['id'][nodes['label'] == artist].values[0]
        for related in related_list:
            related_node = nodes['id'][nodes['label'] == related].values[0]
            _edges.append((artist_node, related_node))

    return pd.DataFrame(_edges, columns=['Source','Target'])
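
`build_edges` finds each id by scanning the whole `label` column once per edge. The pandas-free sketch below shows the same label-to-id mapping done with a dictionary lookup. The names `toy_graph`, `toy_node_ids` and `toy_edges` are hypothetical and only mirror `build_nodes` and `build_edges` on a tiny example.

```python
def toy_node_ids(graph):
    """Map every label (keys and related names) to an id in sorted order."""
    labels = set(graph)
    for related_list in graph.values():
        labels.update(related_list)
    return {label: node_id for node_id, label in enumerate(sorted(labels))}

def toy_edges(graph, node_ids):
    """Return (source id, target id) pairs, one per related-artist link."""
    return [
        (node_ids[artist], node_ids[related])
        for artist, related_list in graph.items()
        for related in related_list
    ]

toy_graph = {'B': ['A', 'C'], 'A': ['C']}
node_ids = toy_node_ids(toy_graph)      # {'A': 0, 'B': 1, 'C': 2}
edges = toy_edges(toy_graph, node_ids)  # [(1, 0), (1, 2), (0, 2)]
```

With the DataFrame from `build_nodes`, an equivalent lookup table could be built once with `dict(zip(nodes['label'], nodes['id']))` and reused for every edge.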

In [11]:
network_edges = build_edges(like_seed, network_nodes)
network_edges.head()

|   | Source | Target |
|---|--------|--------|
| 0 | 383 | 64 |
| 1 | 383 | 65 |
| 2 | 383 | 106 |
| 3 | 383 | 37 |
| 4 | 383 | 182 |


## Exporting the CSVs

In [14]:
network_nodes.to_csv('data/vintage_culture_nodes.csv', index = False)

In [15]:
network_edges.to_csv('data/vintage_culture_edges.csv', index = False) 
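
Both exports write into a `data/` directory, and `to_csv` does not create missing directories. The following is a small sketch of a guard that could run before the export; the helper `ensure_parent_dir` is hypothetical and not part of the original notebook.

```python
from pathlib import Path

def ensure_parent_dir(path):
    """Create the parent directory of `path` if needed and return it as a Path."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    return target

# Usage with the exports above (to_csv accepts path objects):
# network_nodes.to_csv(ensure_parent_dir('data/vintage_culture_nodes.csv'), index=False)
# network_edges.to_csv(ensure_parent_dir('data/vintage_culture_edges.csv'), index=False)
```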