<a href="https://colab.research.google.com/github/nazareno/redes-do-spotify/blob/main/gera-rede-para-gephi.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Based on [this notebook](https://gist.github.com/Kautenja/71f139eee58099b77e91a0d775e42b47#file-visualizing-spotify-related-artists-with-gephi-ipynb) by @Kautenja, published as a gist.

# Spotify API

Instead of interacting with the Spotify RESTful API directly, I use spotipy, a lightweight Python wrapper.

In [None]:
!pip install spotipy

Collecting spotipy
  Downloading https://files.pythonhosted.org/packages/7a/cd/e7d9a35216ea5bfb9234785f3d8fa7c96d0e33999c2cb72394128f6b4cce/spotipy-2.16.1-py3-none-any.whl
Installing collected packages: spotipy
Successfully installed spotipy-2.16.1


In [None]:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
# spotify credentials for the application:
# "Related Artist Network Visualizer"
client_id = 'PUT YOUR CLIENT ID HERE'
client_secret = 'PUT YOUR CLIENT SECRET HERE'
# create a credential manager and api layer
client_credentials_manager = SpotifyClientCredentials(client_id=client_id, client_secret=client_secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)
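
Hard-coding the client ID and secret in the notebook makes them easy to leak when the notebook is shared. One alternative, sketched here, is to read them from environment variables. The helper name `load_spotify_credentials` is hypothetical and not part of spotipy; the variable names `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` follow the convention described in spotipy's documentation, which is worth checking against the installed version.

```python
import os


def load_spotify_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    Hypothetical helper for illustration; it is not part of spotipy.
    """
    env = os.environ if env is None else env
    required = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET")
    # report every missing variable at once instead of failing on the first
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env["SPOTIPY_CLIENT_ID"], env["SPOTIPY_CLIENT_SECRET"]
```

The returned pair can then be passed as `client_id` and `client_secret` to `SpotifyClientCredentials`, exactly as in the cell above.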

## Root Node

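I'll start with a root node for one of my favorite artists. The text this notebook is based on names [Bassnectar](https://www.bassnectar.net) as the seed, but the outputs recorded below show that the ID used in this run resolves to Dominguinhos.

In [None]:
# the Spotify URI of the seed artist (resolves to Dominguinhos in the recorded output)
seed_artist = 'spotify:artist:1mCHLu4gizrN9PwHxKrJv4'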

# Building A Network

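To build the network, I'll use a modified form of depth-limited DFS that produces a dictionary mapping each artist's name to the list of names of their related artists. Artists that have already been expanded are skipped rather than fetched again.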
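
To make the traversal concrete without calling the API, here is a minimal sketch of the same depth-limited DFS over an in-memory adjacency dictionary. The `TOY_RELATED` data and the function name `depth_limited_related` are invented for illustration and stand in for the Spotify responses.

```python
# Invented toy data standing in for Spotify's "related artists" responses.
TOY_RELATED = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}


def depth_limited_related(start, depth, related=TOY_RELATED, graph=None):
    """Expand `start` and its related nodes up to `depth` levels deep."""
    if graph is None:
        graph = {}
    # stop when the depth budget is spent or the node was already expanded
    if depth == 0 or start in graph:
        return graph
    graph[start] = list(related.get(start, []))
    for neighbour in graph[start]:
        depth_limited_related(neighbour, depth - 1, related, graph)
    return graph


print(depth_limited_related("A", 1))  # {'A': ['B', 'C']}
print(depth_limited_related("A", 2))  # {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['D']}
```

With depth 1 only the seed is expanded. Nodes such as `"D"` can still appear in the value lists without being keys, because they were reached after the depth budget ran out.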

In [None]:
def related_network(artist_id, depth: int=3) -> dict:
    """
    Return a dictionary of artist names to lists of their related artists.
    
    Args:
        artist_id: the id of the artist to start the graph from
        depth: the depth into the related artist network 
    
    Returns: a dictionary of strings (artist name) to lists (related artists)
    """
    graph = dict()
    _related_network(artist_id, depth, graph)
    return graph

def _related_network(artist_id, depth, graph):
    """
    Recursively collect related artists and store the results in the graph.
    
    Args:
        artist_id: the artist to get the related artists of
        depth: the current depth in the graph
        graph: the dictionary to put the related artist results into
    """
    if depth == 0:
        return
    name = sp.artist(artist_id)['name']
    # skip artists that were already expanded (the graph is keyed by name)
    if name in graph:
        print("revisit for " + name)
        return
    print("fetching for " + name)
    like_artist = sp.artist_related_artists(artist_id)
    graph[name] = [related['name'] for related in like_artist['artists']]
    # recurse into each related artist with one less level of depth
    for related in like_artist['artists']:
        _related_network(related['id'], depth - 1, graph)

In [None]:
# sanity check: with depth 1 only the seed artist is expanded
related_network(seed_artist, 1)

fetching for Dominguinhos


{'Dominguinhos': ['Luiz Gonzaga',
  'Joao Do Vale',
  'Trio Nordestino',
  'Jackson Do Pandeiro',
  'Moraes Moreira',
  'Xangai',
  'Geraldo Azevedo',
  'João Bosco',
  'Trio Virgulino',
  'Luiz Melodia',
  'Gonzaguinha',
  'Elba Ramalho',
  'Chico César',
  'Sivuca',
  'Trio Forrozão',
  'Nelson Cavaquinho',
  'Elza Soares',
  'Dorival Caymmi',
  'Clara Nunes',
  'Hamilton De Holanda']}

In [None]:
like_seed = related_network(seed_artist, 5)

fetching for Dominguinhos
fetching for Luiz Gonzaga
revisit for Dominguinhos
fetching for Joao Do Vale
fetching for Xangai
revisit for Joao Do Vale
fetching for Elomar
fetching for Elomar, Geraldo Azevedo, Vital Farias e Xangai
fetching for Quinteto Violado
fetching for Elomar, Geraldo Azevedo, Vital Farias, Xangai
revisit for Dominguinhos
fetching for Geraldo Azevedo
fetching for Walter Franco
fetching for Moraes Moreira
fetching for Ednardo
fetching for Antônio Nóbrega
fetching for Jackson Do Pandeiro
revisit for Luiz Gonzaga
fetching for Sivuca
fetching for Jorge Mautner
fetching for Mestre Ambrósio
fetching for Itamar Assumpção
fetching for Jards Macalé
fetching for A Cor Do Som
fetching for Sa & Guarabyra
fetching for Nelson Cavaquinho
fetching for Nelson Sargento
fetching for Elton Medeiros
fetching for Jamelão
fetching for Noel Rosa
fetching for VELHA GUARDA DA PORTELA
fetching for Zé Keti
fetching for Candeia
fetching for Moreira Da Silva
fetching for Os Originais Do Samba
[output truncated]

# Formatting the Network for Gephi

In [None]:
import pandas as pd
import numpy as np

## nodes.csv

In [None]:
def nodes(graph: dict) -> pd.DataFrame:
    """
    Return a dataframe of nodes for the given graph.
    
    Args:
        graph: the graph to generate a unique table of nodes from
        
    Returns: a dataframe with nodes and unique ids
    """
    _nodes = []

    # collect every artist and every related artist
    for artist, related_list in graph.items():
        _nodes.append(artist)
        _nodes.extend(related_list)

    # keep only unique nodes (np.unique also sorts them)
    _nodes = np.unique(_nodes)
    # make a dataframe to generate ids
    _nodes = pd.DataFrame(_nodes, columns=['label'])
    # use the index column as the id
    _nodes['id'] = _nodes.index
    return _nodes

In [None]:
network_nodes = nodes(like_seed)
network_nodes.head()

           label  id
0         14 Bis   0
1        A Bolha   1
2   A Cor Do Som   2
3         Abdias   3
4  Abel Ferreira   4
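
The ID assignment in `nodes` amounts to collecting every name that appears as a key or inside a related list, de-duplicating, sorting (`np.unique` returns sorted values), and numbering from zero. A minimal standard-library sketch of that step follows; the toy graph and the helper name `node_ids` are invented for illustration.

```python
def node_ids(graph):
    """Map every artist name in the graph to a zero-based integer id."""
    names = set(graph)
    for related_list in graph.values():
        names.update(related_list)
    # sort before numbering so ids are stable across runs
    return {name: index for index, name in enumerate(sorted(names))}


toy_graph = {"B": ["A", "C"], "A": ["C"]}  # invented data
print(node_ids(toy_graph))  # {'A': 0, 'B': 1, 'C': 2}
```

Note that `"C"` gets an ID even though it was never expanded: Gephi needs a node row for every name that appears on either end of an edge.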


## edges.csv

In [None]:
def edges(graph: dict, nodes: pd.DataFrame) -> pd.DataFrame:
    """
    Return a dataframe of edges based on the graph and table of node ids.
    
    Args:
        graph: the graph to find edges in
        nodes: the table of nodes with unique node ids
        
    Returns: a table of edges, each a source id paired with a target id
    """
    # map each label to its id once instead of scanning the table per lookup
    ids = dict(zip(nodes['label'], nodes['id']))
    _edges = [
        (ids[artist], ids[related])
        for artist, related_list in graph.items()
        for related in related_list
    ]

    return pd.DataFrame(_edges, columns=['Source', 'Target'])

In [None]:
network_edges = edges(like_seed, network_nodes)
network_edges.head()

   Source  Target
0     202     374
1     202     316
2     202     626
3     202     305
4     202     435
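
Each edge is a (source ID, target ID) pair obtained by looking up both endpoint names in the node table. The sketch below shows that translation with a plain dictionary lookup; `edge_list` and the toy data are hypothetical and not part of the notebook.

```python
def edge_list(graph, ids):
    """Return (source_id, target_id) pairs for every artist -> related link."""
    return [
        (ids[artist], ids[related])
        for artist, related_list in graph.items()
        for related in related_list
    ]


toy_graph = {"B": ["A", "C"], "A": ["C"]}  # invented data
toy_ids = {"A": 0, "B": 1, "C": 2}         # ids as a node table would assign them
print(edge_list(toy_graph, toy_ids))  # [(1, 0), (1, 2), (0, 2)]
```

The pairs are directed: `B -> A` is present here, but `A -> B` would only appear if `A` also listed `B` as related.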


## Save CSVs 

In [None]:
from google.colab import files
network_nodes.to_csv('nodes.csv', index = False) 
files.download('nodes.csv')


In [None]:
network_edges.to_csv('edges.csv', index = False) 
files.download('edges.csv')
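
`google.colab.files` is only available inside Colab. When running elsewhere, writing the files to the working directory is enough. A minimal sketch with the standard `csv` module, assuming rows shaped like the node and edge tables above; the helper name `write_csv`, the file names, and the choice of sample rows are hypothetical.

```python
import csv


def write_csv(path, header, rows):
    """Write a header line followed by the given rows to a CSV file."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


# sample rows copied from the head() outputs shown earlier
write_csv("nodes_sample.csv", ["label", "id"], [("14 Bis", 0), ("A Bolha", 1)])
write_csv("edges_sample.csv", ["Source", "Target"], [(202, 374), (202, 316)])
```

The column names match the ones the notebook produces (`label`/`id` for nodes, `Source`/`Target` for edges).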
