## Spotify data tutorial

This tutorial covers how to get:
* The Spotify album ID for a given album name and artist name
* The track names for a given Spotify album ID
* The total play count for a given album, computed by summing the play counts of all its tracks

The code snippets below serve as a starting point for the data-fetching scripts.

### Getting the Spotify album ID and track names by searching for an album with the Spotify Web API

Run `pip install spotipy` before running the code below.

In [4]:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

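Load the credentials. The file is expected to hold the client ID and the client secret on one line, separated by `, `.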

In [7]:
with open('/home/sean-cx1/Documents/cert/spotify.csv') as fhandle:
    secrets = fhandle.read().strip()
    
cred = secrets.split(', ')
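Splitting on the exact string `', '` breaks if the file uses a bare comma or has extra spaces. A minimal sketch of a more forgiving parser follows; `parse_credentials` is a hypothetical helper, not part of spotipy, and the values shown are placeholders.

```python
def parse_credentials(text):
    """Split 'client_id, client_secret' into a (client_id, client_secret) tuple."""
    # Split on the comma alone and strip whitespace from each part.
    parts = [part.strip() for part in text.strip().split(',')]
    if len(parts) != 2 or not all(parts):
        raise ValueError('expected exactly two comma-separated values')
    return parts[0], parts[1]


# Placeholder values, not real credentials.
client_id, client_secret = parse_credentials('my-client-id, my-client-secret\n')
print(client_id, client_secret)
```

The two returned values can be passed as `client_id` and `client_secret` in place of `cred[0]` and `cred[1]`.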

In [10]:
client_credentials_manager = SpotifyClientCredentials(client_id=cred[0],client_secret=cred[1])
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)

The demo below searches for an album and parses the first 10 results. For each result we look at the album `name`, the `artists`, and the `uri` (the identifier needed to fetch album info).

In [12]:
album_search_results = sp.search(q='Mezzanine',type='album',limit=10)

In [91]:
for result in album_search_results['albums']['items']:
    album_name = result['name']
    album_uri = result['uri']
    all_artists = ' '.join([people['name'] for people in result['artists']])
    album_artist = all_artists.strip()
    
    print(f'Album:{album_name}, Artist:{album_artist}, URI:{album_uri}')

Album:Mezzanine, Artist:Massive Attack, URI:spotify:album:49MNmJhZQewjt06rpwp6QR
Album:Mezzanine (Deluxe), Artist:Massive Attack, URI:spotify:album:0NDZWNHJ5ySx3YeFLbsdMe
Album:Mezzanine, Artist:ØDYSSEE, URI:spotify:album:6VBPczhNlGRrsypcHtblQi
Album:Mezzanine, Artist:Gareth Emery, URI:spotify:album:0mQOqsMLZvjypKtTizC5ap
Album:Mezzanine - The Remixes, Artist:Massive Attack, URI:spotify:album:0pyQQJWDE3113nVXk0Xmrr
Album:Mezzanine, Artist:Mezzanine, URI:spotify:album:4N7QbPq5hg7Ht7dgiEoBLM
Album:Calling Card / Mezzanine, Artist:The Galleria Jessy Lanza, URI:spotify:album:2jb5dXKiGmZ9yQJNEnkMOm
Album:Thought of Man, Artist:Secret Mezzanine, URI:spotify:album:3ATVdJf18otF7wstxmrE5C
Album:Ash to Ash, Artist:Secret Mezzanine, URI:spotify:album:3HIs2aqrKHpiZumNpefmm7
Album:Mezzanine, Artist:Adelyn Rose, URI:spotify:album:4Q8SJNc5nwVv3rpDwqa8vV
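The search returns several albums with the same name, so a script needs a way to pick the one by the right artist. The following is a minimal sketch of that selection step. `find_album_uri` is a hypothetical helper, and `sample_items` is a trimmed-down stand-in for `album_search_results['albums']['items']` that keeps only the fields used here.

```python
def find_album_uri(items, album_name, artist_name):
    """Return the URI of the first item matching album and artist, else None."""
    wanted_album = album_name.casefold()
    wanted_artist = artist_name.casefold()
    for item in items:
        artist_names = [artist['name'].casefold() for artist in item['artists']]
        if item['name'].casefold() == wanted_album and wanted_artist in artist_names:
            return item['uri']
    return None


# Trimmed-down items shaped like the search results above.
sample_items = [
    {'name': 'Mezzanine', 'uri': 'spotify:album:6VBPczhNlGRrsypcHtblQi',
     'artists': [{'name': 'ØDYSSEE'}]},
    {'name': 'Mezzanine', 'uri': 'spotify:album:49MNmJhZQewjt06rpwp6QR',
     'artists': [{'name': 'Massive Attack'}]},
]
print(find_album_uri(sample_items, 'Mezzanine', 'Massive Attack'))
```

The match is an exact, case-insensitive comparison, so variants such as "Mezzanine (Deluxe)" are deliberately skipped.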


### Getting information about an album from its URI

This demo uses the URI for the album Mezzanine by Massive Attack.

In [31]:
example_uri = 'spotify:album:49MNmJhZQewjt06rpwp6QR'
# The album ID is the third colon-separated part of the URI
album_id = example_uri.split(':')[2]
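Indexing into the split result assumes the string really is an album URI. A small sketch that checks the shape first is shown here; `album_id_from_uri` is a hypothetical helper name.

```python
def album_id_from_uri(uri):
    """Return the ID part of a 'spotify:album:<id>' URI."""
    parts = uri.split(':')
    if len(parts) != 3 or parts[0] != 'spotify' or parts[1] != 'album' or not parts[2]:
        raise ValueError(f'not a Spotify album URI: {uri!r}')
    return parts[2]


print(album_id_from_uri('spotify:album:49MNmJhZQewjt06rpwp6QR'))
```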

In [88]:
tracks = sp.album_tracks(album_id)

for track in tracks['items']:
    print(track['name'])

Angel
Risingson
Teardrop
Inertia Creeps
Exchange
Dissolved Girl
Man Next Door
Black Milk
Mezzanine
Group Four
(Exchange)
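`album_tracks` returns a single page of results, which is enough for this album but may not be for very long ones. A sketch of collecting names across pages with spotipy's `next()` method follows. `all_album_track_names` is a hypothetical helper that takes an already-authenticated client such as `sp`.

```python
def all_album_track_names(sp, album_id):
    """Collect track names from every page of an album's track listing."""
    names = []
    page = sp.album_tracks(album_id)
    while page:
        names.extend(track['name'] for track in page['items'])
        # Ask for the following page only when the current one links to it.
        page = sp.next(page) if page.get('next') else None
    return names
```

Calling `all_album_track_names(sp, album_id)` with the client and album ID from the cells above should give the same names as the loop over `tracks['items']`.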


#### Fields available for each track:

In [89]:
tracks['items'][0].keys()

dict_keys(['artists', 'available_markets', 'disc_number', 'duration_ms', 'explicit', 'external_urls', 'href', 'id', 'is_local', 'name', 'preview_url', 'track_number', 'type', 'uri'])
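As a small illustration of using these fields, the sketch below formats a track as a numbered line with its duration. `describe_track` is a hypothetical helper, and the `duration_ms` value in `sample_track` is made up for the example rather than taken from the API.

```python
def describe_track(track):
    """Format a track dict as ' N. Name (m:ss)'."""
    minutes, seconds = divmod(track['duration_ms'] // 1000, 60)
    return f"{track['track_number']:>2}. {track['name']} ({minutes}:{seconds:02d})"


# Illustrative values only; duration_ms is not real API output.
sample_track = {'track_number': 3, 'name': 'Teardrop', 'duration_ms': 330000}
print(describe_track(sample_track))
```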

### Getting the total play count of an album for a given Spotify album ID

Play counts are fetched using https://github.com/evilarceus/Spotify-PlayCount, which was found via https://github.com/spotify/web-api/issues/70.

In [67]:
import requests
import json

In [81]:
def count_api_url(album_id):
    return 'https://t4ils.dev:4433/api/beta/albumPlayCount?albumid='+album_id
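String concatenation works for plain album IDs. As an alternative sketch, the query string can be built with the standard library's `urlencode`, which escapes any unexpected characters. `build_count_api_url` and `BASE_URL` are hypothetical names, and the endpoint is the same one used above.

```python
from urllib.parse import urlencode

BASE_URL = 'https://t4ils.dev:4433/api/beta/albumPlayCount'


def build_count_api_url(album_id):
    """Build the play-count URL with a properly encoded query string."""
    return f"{BASE_URL}?{urlencode({'albumid': album_id})}"


print(build_count_api_url('49MNmJhZQewjt06rpwp6QR'))
```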

In [64]:
play_count_json = requests.get(count_api_url(album_id)).text

The API returns a JSON blob, which we need to parse.

In [76]:
play_count = json.loads(play_count_json)
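Because this is a third-party service, the response may not always have the expected shape. The sketch below parses the text and checks for the `data` list before returning it. `parse_play_counts` is a hypothetical helper, and `sample_text` is a shortened stand-in for a real response.

```python
import json


def parse_play_counts(response_text):
    """Parse the response body and return its list of per-track entries."""
    payload = json.loads(response_text)
    tracks = payload.get('data') if isinstance(payload, dict) else None
    if not isinstance(tracks, list):
        raise ValueError("response has no 'data' list")
    return tracks


# Shortened stand-in for the real response body.
sample_text = '{"data": [{"name": "Angel", "playcount": 43601502}]}'
print(parse_play_counts(sample_text)[0]['playcount'])
```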

In [80]:
for track in play_count['data']:
    print(track)

{'name': 'Angel', 'playcount': 43601502, 'disc': 1, 'number': 1, 'uri': 'spotify:track:7uv632EkfwYhXoqf8rhYrg'}
{'name': 'Risingson', 'playcount': 8870572, 'disc': 1, 'number': 2, 'uri': 'spotify:track:6ggJ6MceyHGWtUg1KLp3M1'}
{'name': 'Teardrop', 'playcount': 119866059, 'disc': 1, 'number': 3, 'uri': 'spotify:track:67Hna13dNDkZvBpTXRIaOJ'}
{'name': 'Inertia Creeps', 'playcount': 13217617, 'disc': 1, 'number': 4, 'uri': 'spotify:track:3N2UhXZI4Gf64Ku3cCjz2g'}
{'name': 'Exchange', 'playcount': 5569836, 'disc': 1, 'number': 5, 'uri': 'spotify:track:2HuMQkNVpFIsur2cRWWQmX'}
{'name': 'Dissolved Girl', 'playcount': 9339362, 'disc': 1, 'number': 6, 'uri': 'spotify:track:0oeEqyEAavgPfFxDYvjAP6'}
{'name': 'Man Next Door', 'playcount': 6404737, 'disc': 1, 'number': 7, 'uri': 'spotify:track:2Tz5THgkMOQeaW6DlqAlIa'}
{'name': 'Black Milk', 'playcount': 13248873, 'disc': 1, 'number': 8, 'uri': 'spotify:track:1Rezzt36ybaT2ZbDZpv83D'}
{'name': 'Mezzanine', 'playcount': 4494553, 'disc': 1, 'number': 9

To get the total play count for this album, we sum the play counts of all its tracks.

In [84]:
album_agg_count = 0  
for track in play_count['data']:
    album_agg_count += track['playcount']

In [85]:
print(album_agg_count)

230800379
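The same aggregation can be written more compactly with the built-in `sum`. This sketch uses a hypothetical `total_play_count` helper and only the first three tracks from the output above, so its result is a partial total rather than the album total.

```python
def total_play_count(tracks):
    """Sum the 'playcount' field over a list of track entries."""
    return sum(track['playcount'] for track in tracks)


# First three tracks from the play-count output above.
sample_tracks = [
    {'name': 'Angel', 'playcount': 43601502},
    {'name': 'Risingson', 'playcount': 8870572},
    {'name': 'Teardrop', 'playcount': 119866059},
]
print(total_play_count(sample_tracks))  # 172338133
```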
