# Spotify Scrape - Dakota Jacobs

## Background

This script uses the 'spotipy' library, a Python wrapper for the Spotify Web API, to retrieve user data and artist information.

To set this up for yourself, you'll need a Spotify developer account (https://developer.spotify.com/web-api/). This link walks you through the steps needed to obtain an authorization token, which is required to read user data.

Note: the extent of access to user information is limited by the scope this script requests. Here it is limited to reading a user's top artists and tracks. To modify someone's public (or private) playlists, you must request a different scope, which requires more robust authentication. Here (https://developer.spotify.com/web-api/using-scopes/) is more information on scopes and here (http://spotipy.readthedocs.io/en/latest/#) is some good background on spotipy and the authorization process.
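
As a rough illustration of how scopes are passed: Spotify expects several scopes as one space-separated string. The sketch below builds such a string. `build_scope` is a hypothetical helper (it is not part of spotipy), and the two playlist scopes are only an assumed example of what a playlist-editing script might request.

```python
# Hypothetical helper (not part of spotipy): join scope names into the
# single space-separated string used when requesting authorization.
def build_scope(*scopes):
    # Drop duplicates while preserving the order in which scopes were given.
    seen = []
    for name in scopes:
        if name not in seen:
            seen.append(name)
    return " ".join(seen)

# Scope used in this script: read access to top artists and tracks.
read_only_scope = build_scope("user-top-read")

# Assumed example: also requesting the playlist-modification scopes.
playlist_scope = build_scope(
    "user-top-read", "playlist-modify-public", "playlist-modify-private"
)
print(playlist_scope)
```

The resulting string can be passed wherever this script passes `scope`.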

Protip: When going through the authentication process and initializing your own server (localhost: XXXX), be sure to set your default browser to Internet Explorer; in my experience the token did not come through with Google Chrome. Learn from my mistakes.

Here is helpful information about song audio features: https://developer.spotify.com/web-api/get-audio-features/

## Goals

1) Access Spotify user data including playlist information and top tracks.

2) Print useful track information and artist discography.

3) Compare a list of songs for Spotify-defined track features such as 'danceability, acousticness, and liveness.'

4) Build playlist based on meeting or exceeding a feature (Work in progress).

    Logic:  playlist = []
    
            if song.danceability >= 0.8:
                playlist.append(song)
            else:
                print("This song: ", song, "isn't groovy enough")
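
The logic above can be sketched as a small runnable function. This is only a sketch under stated assumptions: the track names and danceability values are made-up placeholders, and `build_playlist` is a hypothetical helper that works on plain dictionaries rather than on live API results.

```python
# Placeholder data: hypothetical track names with illustrative danceability
# scores, standing in for feature dictionaries returned by the API.
songs = [
    {"name": "Track A", "danceability": 0.85},
    {"name": "Track B", "danceability": 0.46},
    {"name": "Track C", "danceability": 0.80},
]

def build_playlist(songs, threshold=0.8):
    """Return the songs whose danceability meets or exceeds the threshold."""
    playlist = []
    for song in songs:
        if song["danceability"] >= threshold:
            playlist.append(song)
        else:
            print("This song:", song["name"], "isn't groovy enough")
    return playlist

playlist = build_playlist(songs)
print([song["name"] for song in playlist])
```

Making the threshold a parameter keeps the same function usable for other features later, for example by passing in a different key and cutoff.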
                
                

## Initialize

The following cell sets up the authorization information necessary to access user and artist data.

In [15]:
import sys
import spotipy
import spotipy.util as util
import simplejson as json

###----  Begin with authentication  ----###
##       cid, secret and redirect_uri are all given when spotify developer access is granted

print ("Starting authentication...")

print ("Accessing client_id...")
cid = '067ace6972b14c0fa85119bbe41a12e5'

print ("Accessing client_secret...")
secret = 'a45f3118973a42b3996323ea4b2a207c'

print ("Accessing redirect_uri...")
redirect_uri = 'http://localhost:8080/callback'

print ("Defining scope...")
scope = 'user-top-read'

print ("Identifying username...")
username = 'dakotajacobs16'
print ("Found username: ", username)
token = util.prompt_for_user_token(username, scope, cid, secret, redirect_uri)    # prompts for authorization if needed and returns an access token

Starting authentication...
Accessing client_id...
Accessing client_secret...
Accessing redirect_uri...
Defining scope...
Identifying username...
Found username:  dakotajacobs16
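
The cell above hard-codes the client ID and secret. One common alternative is to read them from environment variables; spotipy's documentation describes the names `SPOTIPY_CLIENT_ID`, `SPOTIPY_CLIENT_SECRET`, and `SPOTIPY_REDIRECT_URI`. The sketch below assumes those variables are set. `load_credentials` is a hypothetical helper, and the values in `example_env` are placeholders, not real credentials.

```python
import os

# Environment variable names described in spotipy's documentation.
REQUIRED_VARS = (
    "SPOTIPY_CLIENT_ID",
    "SPOTIPY_CLIENT_SECRET",
    "SPOTIPY_REDIRECT_URI",
)

def load_credentials(env=None):
    """Hypothetical helper: read credentials from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {
        "cid": env["SPOTIPY_CLIENT_ID"],
        "secret": env["SPOTIPY_CLIENT_SECRET"],
        "redirect_uri": env["SPOTIPY_REDIRECT_URI"],
    }

# Demonstration with placeholder values rather than real credentials.
example_env = {
    "SPOTIPY_CLIENT_ID": "your-client-id",
    "SPOTIPY_CLIENT_SECRET": "your-client-secret",
    "SPOTIPY_REDIRECT_URI": "http://localhost:8080/callback",
}
creds = load_credentials(example_env)
print(creds["redirect_uri"])
```

The returned values could then be passed to `util.prompt_for_user_token` in place of the literal strings.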


## Get user playlist information

This cell accesses publicly available user information and prints it to the screen.

Note: the authentication information (cid, secret, redirect_uri, scope, username) must be reinitialized.

In [16]:
cid = '067ace6972b14c0fa85119bbe41a12e5'
secret = 'a45f3118973a42b3996323ea4b2a207c'
redirect_uri = 'http://localhost:8080/callback'
scope = 'user-top-read'
username = 'dakotajacobs16'
token = util.prompt_for_user_token(username, scope, cid, secret, redirect_uri)      # These parameters are defined above

print ("\n\n\n" "#### Beginning to access current user playlist information ####")
response = input("Would you like to know current users playlists?")
if response == 'Yes':
    sp = spotipy.Spotify(auth=token)                                                # Creates an API client authenticated with the token
    sp.trace = False
    results = sp.current_user_playlists(limit=50)                                   # current_user_playlists is a method provided by the spotipy package
    print ("\n\n", "Here are the playlists for", username)
    for i, item in enumerate(results['items']):                                     # prints and numbers user playlists
        print ("%d %s" %(i, item['name']))
        
    response1 = input("How about another users playlist?")                          # Same as above except different user defined by variable 'user'
    if response1 == 'Yes':
        sp = spotipy.Spotify(auth=token)
        sp.trace = False
        user = 'saadi12k'
        results = sp.user_playlists(user)
        print ("\n\n", "Here are the playlists for", user)
        for i, item in enumerate(results['items']):
            print ("%d %s" %(i, item['name']))
            
    else:
        print ("Exiting loop")
        
else:
    print ("Exiting")
    sys.exit()                                                                     # sys.exit is in place for debugging purposes to quickly get out of the example
    



        
        
        
 




#### Beginning to access current user playlist information ####
Would you like to know current users playlists?Yes


 Here are the playlists for dakotajacobs16
0 Trap Rap
1 Your Summer Rewind
2 Deep Feels
3 All the Hits All the Time
4 Liked from Radio
5 All the Angst
6 That Other Rock
7 Simple and Quiet
8 It's Not a Phase, Mom!!!
9 \m/ ^_^ \m/
How about another users playlist?Yes


 Here are the playlists for saadi12k
0 Drake – Signs
1 Music Final
2 Rock This
3 Gold School
4 Signed XOXO
5 G.O.O.D. Music
6 UMD MUSC205 Listening List
7 2016
8 Worst Behavior (Remix) [feat. Drizzy Drayke] – Kendick Lamar
9 lupe
10 Let Nas Down
11 soft but tough 
12 organ donor 
13 Drake – What A Time To Be Alive
14 KiD-CuDi and Friends (Chill Shit)
15 Kid Cudi
16 Main
17 Einstein Study Music Academy – Studying Music: Music to Make You Smarter
18 How Bout Now - Drake
19 Drake – So Far Gone
20 I LOVE MAKONNEN – I LOVE MAKONNEN
21 Majid Jordan – A Place Like This
22 Drake – If You're Reading This It's Too 
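
Playlist listings come back one page at a time, so a user with many playlists needs more than one request. The sketch below shows one way to gather every page. It is an assumption-laden illustration: `collect_all_items` and `fake_fetch` are hypothetical names, and the fake data stands in for real API responses so the example runs without a token.

```python
def collect_all_items(fetch_page, page_size=50):
    """Gather items from every page of a paginated listing.

    fetch_page is any callable taking (limit, offset) and returning a dict
    with an 'items' list, the same shape as the playlist results used above.
    """
    items = []
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        batch = page["items"]
        items.extend(batch)
        if len(batch) < page_size:   # a short page means nothing is left
            break
        offset += page_size
    return items

# Stand-in for the API: 120 hypothetical playlists served in pages.
fake_playlists = [{"name": "Playlist %d" % n} for n in range(120)]

def fake_fetch(limit, offset):
    return {"items": fake_playlists[offset:offset + limit]}

all_items = collect_all_items(fake_fetch)
print(len(all_items))
```

With spotipy, `fetch_page` could plausibly be `lambda limit, offset: sp.user_playlists(user, limit=limit, offset=offset)`, assuming the installed version accepts those keyword arguments.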

## Get album information

In [17]:
cid = '067ace6972b14c0fa85119bbe41a12e5'
secret = 'a45f3118973a42b3996323ea4b2a207c'
redirect_uri = 'http://localhost:8080/callback'
scope = 'user-top-read'
username = 'dakotajacobs16'
token = util.prompt_for_user_token(username, scope, cid, secret, redirect_uri)      # These parameters are defined above


##    Let's look at some album information
print ("\n\n", "#### Starting to look at specific artist from user ####")
print ("The artist_id for La Dispute is: 7lQKE6HaKQcCsgLRMhsh5W")                   # Used this ID for the sake of presentation
response = input("Which artist do you want to look into?")
if token:
    sp = spotipy.Spotify(auth=token)
    sp.trace = False
    results1 = sp.artist_albums(artist_id = response)
    for i, item in enumerate(results1['items']):
        # print (json.dumps(results1, indent = 4))
        print("%d %s" %(i, item['name']))
        print("%d %s" %(i ,item['id']))
        print("%d %s" %(i, item['uri']))
        # The print statements above print specific values from results1. Uncommenting the first print statement shows ...
        # every available value. The ones chosen were picked for the sake of a presentation. Additional values can ...
        # be printed by using their keys in the JSON response, in the same way 'items' is accessed here

    print("\n\n", "#### Now lets get the album tracks from a given album ####")
    response = input("Would you like to continue?")
    if response == 'Yes':
        album = input("which album from above do you want to know tracklistings for?")
        sp = spotipy.Spotify(auth=token)
        sp.trace = False
        results = sp.album_tracks(album_id = album)
        for i, item in enumerate(results['items']):
            print("%d %s" %(i, item['name']))




 #### Starting to look at specific artist from user ####
The artist_id for La Dispute is: 7lQKE6HaKQcCsgLRMhsh5W
Which artist do you want to look into?7lQKE6HaKQcCsgLRMhsh5W
0 Tiny Dots
0 0z8MdJWMUAggph0Dzmkk2y
0 spotify:album:0z8MdJWMUAggph0Dzmkk2y
1 Rooms of the House
1 5m0NdXjQm038wphE4VSTPU
1 spotify:album:5m0NdXjQm038wphE4VSTPU
2 Wildlife
2 1zAMnOQUqSq3xCMgeBS6i2
2 spotify:album:1zAMnOQUqSq3xCMgeBS6i2
3 Somewhere at the Bottom of the River Between Vega and Altair
3 0MT2nypErOpCsHXuEpwGB2
3 spotify:album:0MT2nypErOpCsHXuEpwGB2
4 Vancouver
4 19NRrJFEhkCC1tbFPDMkzA
4 spotify:album:19NRrJFEhkCC1tbFPDMkzA
5 Thirteen
5 06w08xKTcQ3wCu59gOKXle
5 spotify:album:06w08xKTcQ3wCu59gOKXle
6 Never Come Undone
6 0WugySd4t8715VlblgGjvs
6 spotify:album:0WugySd4t8715VlblgGjvs
7 Here, Hear, Vol. 3
7 7GYOUMKMIo7dxrK8a25cKq
7 spotify:album:7GYOUMKMIo7dxrK8a25cKq
8 No Sleep till Christmas 3
8 1dt9d5uDqWnVJBqeOZrlGI
8 spotify:album:1dt9d5uDqWnVJBqeOZrlGI
9 Whatever Nevermind: A Tribute to Nirvana's Neve
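
Each album above is printed with both an id and a URI, and the URI is just the id with a `spotify:album:` prefix. A minimal sketch of splitting a URI back into its parts follows. `parse_spotify_uri` is a hypothetical helper, and it only handles the three-part form shown in the output above.

```python
def parse_spotify_uri(uri):
    """Split a URI like 'spotify:album:<id>' into (resource_type, resource_id)."""
    parts = uri.split(":")
    if len(parts) != 3 or parts[0] != "spotify":
        raise ValueError("Not a recognized Spotify URI: %r" % uri)
    return parts[1], parts[2]

# URI taken from the album listing above.
kind, album_id = parse_spotify_uri("spotify:album:0z8MdJWMUAggph0Dzmkk2y")
print(kind, album_id)
```

The extracted id is the value the album-tracks prompt in the cell above expects.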

## Get user top tracks

In [18]:
cid = '067ace6972b14c0fa85119bbe41a12e5'
secret = 'a45f3118973a42b3996323ea4b2a207c'
redirect_uri = 'http://localhost:8080/callback'
scope = 'user-top-read'
username = 'dakotajacobs16'
token = util.prompt_for_user_token(username, scope, cid, secret, redirect_uri)      # These parameters are defined above

print("\n\n", "#### Lets look at current users top tracks ####")
response = input("Would you like to continue?")
if response == 'Yes':
    sp = spotipy.Spotify(auth=token)
    sp.trace = False
    results = sp.current_user_top_tracks(limit=20, offset=0, time_range='long_term')
    for i, item in enumerate(results['items']):
        # print (json.dumps(results, indent = 4))
        print (i, item['name'])                         # name of the song
        print (i, item['external_urls'])                # external URLs for the song
        print (i, item['id'])                           # id of song for the next cell of the script
        print (i, item['preview_url'])                  # link to a 30-second preview of the song



 #### Lets look at current users top tracks ####
Would you like to continue?Yes
0 Woman (reading)
0 {'spotify': 'https://open.spotify.com/track/5nIQXDduIdBlTjX1ML24hO'}
0 5nIQXDduIdBlTjX1ML24hO
0 https://p.scdn.co/mp3-preview/b395f36fe71a9db4cd5c5a1a05dc4e93ac608be9
1 Drifting
1 {'spotify': 'https://open.spotify.com/track/58VZtWOA75FCms5fd8H7Zy'}
1 58VZtWOA75FCms5fd8H7Zy
1 https://p.scdn.co/mp3-preview/5d336f24eb930fd779522d36b427bc7db83b9472
2 Reign of Darkness
2 {'spotify': 'https://open.spotify.com/track/1ZzC86R3PE8fEbZpLXxenI'}
2 1ZzC86R3PE8fEbZpLXxenI
2 https://p.scdn.co/mp3-preview/a1bec36ecc6640a4095e6fc549a573952a4eec4f
3 Sweet Talk
3 {'spotify': 'https://open.spotify.com/track/757fXABDTbalNzihYS3mUc'}
3 757fXABDTbalNzihYS3mUc
3 https://p.scdn.co/mp3-preview/3d0eb96dc8aba6538fbcccb90eef7a4c32b5b649
4 That Wrecking Ball
4 {'spotify': 'https://open.spotify.com/track/7e2aPfrwX7tCcC4wMDnsU4'}
4 7e2aPfrwX7tCcC4wMDnsU4
4 https://p.scdn.co/mp3-preview/601e4b6c251acef5218feb16750956f
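
If the track information is needed later (for example, to feed ids into the comparison cell), it may be easier to collect it into a list than to print it. The sketch below is one possible approach. `summarize_tracks` is a hypothetical helper, the sample items copy values from the output above, and the second item omits `preview_url` on purpose to show the assumed fallback for a missing preview.

```python
def summarize_tracks(items):
    """Reduce raw track objects to the four fields printed above."""
    summary = []
    for item in items:
        summary.append({
            "name": item["name"],
            "url": item["external_urls"]["spotify"],
            "id": item["id"],
            # Assumption: a preview may be missing, so fall back to None.
            "preview_url": item.get("preview_url"),
        })
    return summary

sample_items = [
    {
        "name": "Woman (reading)",
        "external_urls": {"spotify": "https://open.spotify.com/track/5nIQXDduIdBlTjX1ML24hO"},
        "id": "5nIQXDduIdBlTjX1ML24hO",
        "preview_url": "https://p.scdn.co/mp3-preview/b395f36fe71a9db4cd5c5a1a05dc4e93ac608be9",
    },
    {
        "name": "Drifting",
        "external_urls": {"spotify": "https://open.spotify.com/track/58VZtWOA75FCms5fd8H7Zy"},
        "id": "58VZtWOA75FCms5fd8H7Zy",
        # preview_url deliberately left out to exercise the fallback
    },
]

tracks = summarize_tracks(sample_items)
track_ids = [track["id"] for track in tracks]
print(track_ids)
```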

## Song Comparison

In [19]:
cid = '067ace6972b14c0fa85119bbe41a12e5'
secret = 'a45f3118973a42b3996323ea4b2a207c'
redirect_uri = 'http://localhost:8080/callback'
scope = 'user-top-read'
username = 'dakotajacobs16'
token = util.prompt_for_user_token(username, scope, cid, secret, redirect_uri)      # These parameters are defined above


print ("\n\n", "#### Lets compare two artists ####")       
response = input("Would you like to continue?")
if response == 'Yes':
    sp = spotipy.Spotify(auth=token)
    sp.trace = False
    song1 = input("Pick the one song from above")                    # Use the track ids from the previous cell as user input
    song2 = input("Pick a different song from above")
    results = sp.audio_features(tracks = [song1, song2])
    print (results[0])
    print (results[1])                                               # Raw output is hard to read, so we also write it out as formatted JSON
    with open('Comparison.txt', 'w') as outfile:
        json.dump(results, outfile, sort_keys=True, indent=4)        # Writes the formatted features to Comparison.txt



 #### Lets compare two artists ####
Would you like to continue?Yes
Pick the one song from above5nIQXDduIdBlTjX1ML24hO
Pick a different song from above1ZzC86R3PE8fEbZpLXxenI
{'danceability': 0.463, 'energy': 0.633, 'key': 9, 'loudness': -8.361, 'mode': 0, 'speechiness': 0.0296, 'acousticness': 0.0977, 'instrumentalness': 0.0147, 'liveness': 0.114, 'valence': 0.59, 'tempo': 115.445, 'type': 'audio_features', 'id': '5nIQXDduIdBlTjX1ML24hO', 'uri': 'spotify:track:5nIQXDduIdBlTjX1ML24hO', 'track_href': 'https://api.spotify.com/v1/tracks/5nIQXDduIdBlTjX1ML24hO', 'analysis_url': 'https://api.spotify.com/v1/audio-analysis/5nIQXDduIdBlTjX1ML24hO', 'duration_ms': 210893, 'time_signature': 4}
{'danceability': 0.392, 'energy': 0.966, 'key': 9, 'loudness': -3.863, 'mode': 1, 'speechiness': 0.172, 'acousticness': 0.00132, 'instrumentalness': 0.0473, 'liveness': 0.239, 'valence': 0.0764, 'tempo': 139.83, 'type': 'audio_features', 'id': '1ZzC86R3PE8fEbZpLXxenI', 'uri': 'spotify:track:1ZzC86R3PE8fEbZ
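
The raw dictionaries above are hard to compare by eye. A minimal sketch of a side-by-side comparison follows, using the 0-to-1 feature values copied from the output above. `compare_features` is a hypothetical helper, and restricting the comparison to these seven features is an assumption made for readability.

```python
# Features scored on a 0-to-1 scale; values copied from the output above.
FEATURES = ("danceability", "energy", "speechiness", "acousticness",
            "instrumentalness", "liveness", "valence")

song_a = {"danceability": 0.463, "energy": 0.633, "speechiness": 0.0296,
          "acousticness": 0.0977, "instrumentalness": 0.0147,
          "liveness": 0.114, "valence": 0.59}
song_b = {"danceability": 0.392, "energy": 0.966, "speechiness": 0.172,
          "acousticness": 0.00132, "instrumentalness": 0.0473,
          "liveness": 0.239, "valence": 0.0764}

def compare_features(a, b, features=FEATURES):
    """Return {feature: b - a} for each listed feature, rounded to 4 places."""
    return {name: round(b[name] - a[name], 4) for name in features}

diff = compare_features(song_a, song_b)
for name in FEATURES:
    # A positive number means the second song scores higher on that feature.
    print("%-17s %+.4f" % (name, diff[name]))
```

Sorting the differences by absolute size would show which features separate the two songs the most.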