## Getting song analysis from the `analysis_url` feature

In notebook 01.1 we extracted each song's features. One of them (`analysis_url`) gives us the API endpoint from which to retrieve an extended analysis of a particular song.

In [32]:
import ast
import os

from typing import List
from os import listdir
from dotenv import load_dotenv

import requests
import spotipy
import spotipy.util as util

import pandas as pd

Loading the environment variables from `.env`:

In [3]:
load_dotenv()

True
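
`load_dotenv()` returns `True` when it finds a `.env` file, but that alone does not confirm that every variable the notebook needs is set. A minimal sketch of such a check, assuming the five variable names read later in `get_token` (the helper `missing_env_vars` is hypothetical and not part of python-dotenv):

```python
import os
from typing import Iterable, List, Mapping, Optional

# Names read later by get_token(); adjust if your .env uses different keys.
REQUIRED_VARS = ('username', 'client_id', 'client_secret', 'redirect_uri', 'scope')


def missing_env_vars(names: Iterable[str] = REQUIRED_VARS,
                     environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Demonstration with an explicit mapping instead of the real environment.
example_env = {'username': 'me', 'client_id': 'abc', 'scope': ''}
print(missing_env_vars(environ=example_env))
# ['client_secret', 'redirect_uri', 'scope']
```

By default `load_dotenv()` does not override variables that already exist in the environment (that requires `override=True`), so a generic name such as `username` may pick up a value set by the operating system rather than the one in `.env`.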

Asking permission to use my data and getting the token:

In [40]:
def get_token():
    username = os.getenv('username')
    client_id = os.getenv('client_id')
    client_secret = os.getenv('client_secret')
    redirect_uri = os.getenv('redirect_uri')
    scope = os.getenv('scope')

    token = util.prompt_for_user_token(username=username, scope=scope, client_id=client_id,
                                    client_secret=client_secret, redirect_uri=redirect_uri)

    return token

Loading my streaming data:

In [5]:
def get_streamings(path: str) -> List[dict]:
    import json  # local import: the export is JSON, so parse it as JSON

    file = path + '/my_spotify_data/StreamingHistory0.json'

    with open(file, 'r', encoding='UTF-8') as f:
        all_streamings = json.load(f)

    return all_streamings
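
The function above reads only `StreamingHistory0.json`. Larger exports may be split across several numbered files, so the following is a minimal sketch of a variant that loads all of them. It assumes the files follow the `StreamingHistory*.json` naming pattern inside `my_spotify_data`; the name `get_all_streamings` is hypothetical.

```python
import json
from pathlib import Path
from typing import List


def get_all_streamings(path: str) -> List[dict]:
    """Load every StreamingHistory*.json file under <path>/my_spotify_data."""
    folder = Path(path) / 'my_spotify_data'
    all_streamings: List[dict] = []
    # sorted() keeps the files in a stable order (plain string sort).
    for file in sorted(folder.glob('StreamingHistory*.json')):
        with open(file, 'r', encoding='UTF-8') as f:
            all_streamings.extend(json.load(f))
    return all_streamings
```

Called as `get_all_streamings('../data')`, it returns one flat list of records, and an empty list if no matching file exists.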

In [6]:
streamings = get_streamings('../data')

In [12]:
name = streamings[123]['artistName']
name

'The Killers'

Getting the __genre__ from `artistName` (the __artist id__ lookup is left commented out in the function). We select the first item in the genres list, because Spotify explains that the first one is the most representative.

In [27]:
from typing import Optional


def get_artist_data(artist_name: str, token: str) -> Optional[str]:
    """Return the first genre Spotify lists for the artist, or None if unavailable."""
    # setting the header and parameters for requesting the API
    headers = {
        'Accept': 'application/json',
        'Content-Type': 'application/json',
        'Authorization': 'Bearer ' + token
    }

    params = [
        ('q', artist_name),
        ('type', 'artist')
    ]

    try:
        r = requests.get('https://api.spotify.com/v1/search', headers=headers,
                         params=params, timeout=5)

        data = r.json()
        # selecting the item that contains the info needed
        first_result = data['artists']['items'][0]
        # retrieving the data needed
        # artist_id = first_result['id']
        genre = first_result['genres'][0]

        return genre

    except (requests.RequestException, KeyError, IndexError, ValueError):
        # request failed, no artist matched, or the artist has no genres
        return None


In [37]:
token = get_token()
genre = get_artist_data(artist_name=name, token=token)
genre

'alternative rock'
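
The lookup inside `get_artist_data` can come back empty in two separate ways: the search may match no artist, or the first artist may have an empty `genres` list. Here is a small sketch that isolates this step, assuming the response shape used above (`artists`, then `items`, then `genres`). The helper name `first_genre` and the sample payloads are hypothetical.

```python
from typing import Optional


def first_genre(payload: dict) -> Optional[str]:
    """Return the first genre of the first artist in a search payload, or None."""
    items = payload.get('artists', {}).get('items', [])
    if not items:
        return None  # the search matched no artist
    genres = items[0].get('genres', [])
    return genres[0] if genres else None  # artist found, but no genre listed


# Hand-written payloads that mimic the shape used above (not real API output).
with_genre = {'artists': {'items': [{'id': 'x', 'genres': ['alternative rock', 'rock']}]}}
no_genre = {'artists': {'items': [{'id': 'y', 'genres': []}]}}
no_match = {'artists': {'items': []}}

print(first_genre(with_genre))  # alternative rock
print(first_genre(no_genre))    # None
print(first_genre(no_match))    # None
```

Keeping the parsing separate from the HTTP call makes it possible to check these cases without a token or network access.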

Requesting the genre for every artist in the dataset:

In [29]:
unique_artists = list({streaming['artistName'] for streaming in streamings})

In [38]:
a = get_artist_data('Zuni', token)
a

'rain'

In [41]:
artist_data = []
token = get_token()

for artist in unique_artists:
    genre = get_artist_data(artist, token)

    artist_data.append((artist, genre))

df_artists = pd.DataFrame(artist_data, columns=['artist', 'genre'])
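
The loop above sends one request per artist (more than 1,600 here), so a long run may hit rate limits. The sketch below is one way to space the calls out, not a definitive implementation: the lookup is passed in as a callable and a short pause follows each request. The name `collect_genres` and the default `pause` value are assumptions.

```python
import time
from typing import Callable, Iterable, List, Optional, Tuple


def collect_genres(artists: Iterable[str],
                   fetch: Callable[[str], Optional[str]],
                   pause: float = 0.1) -> List[Tuple[str, Optional[str]]]:
    """Call fetch(artist) for each artist, sleeping `pause` seconds between calls."""
    results = []
    for artist in artists:
        results.append((artist, fetch(artist)))
        time.sleep(pause)  # assumed delay; tune it to the limits you observe
    return results


# Stand-in for get_artist_data so the sketch runs without network access.
fake_genres = {'Spin Doctors': 'alternative rock', 'FISHER': 'australian house'}
rows = collect_genres(['Spin Doctors', 'Rusty Eye', 'FISHER'], fake_genres.get, pause=0)
print(rows)
```

With the notebook's own function this could be called as `collect_genres(unique_artists, lambda a: get_artist_data(a, token))`, and the result passed to `pd.DataFrame(..., columns=['artist', 'genre'])` as before.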

In [42]:
df_genres = pd.DataFrame(artist_data, columns=['artist', 'genre'])
df_genres

| | artist | genre |
|---|---|---|
| 0 | Zuni | rain |
| 1 | Rusty Eye | |
| 2 | Olivier Lupin | |
| 3 | Spin Doctors | alternative rock |
| 4 | Sonic Brainwaves | binaural |
| ... | ... | ... |
| 1645 | Florence + The Machine | art pop |
| 1646 | FISHER | australian house |
| 1647 | Pablo Alborán | latin |
| 1648 | Shakira | colombian pop |


Saving the genre data to a CSV file:

In [44]:
df_genres.to_csv('../data/artist_genres.csv', index=False)