# Transferring Playlists from Google to Spotify

This is a mini project for migrating saved music from Google Play Music over to Spotify, salvaging playlists, playlist descriptions and songs.

### Set up

1. Create an app on the Spotify Developers site to get client_id and client_secret
2. Create a `config.py` file with credentials:
    * client_id = 
    * client_secret = 
    * user = (your user ID)
3. Whitelist `http://localhost:8080` as a redirect URI under app settings on the Spotify Developers site.
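
As a rough sketch of step 2, a `config.py` could look like the following. The placeholder strings are stand-ins for your own credentials, and reading them from environment variables is an optional assumption, not something this project requires. `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` are the variable names Spotipy itself looks for; `SPOTIFY_USER_ID` is a made-up name for illustration.

```python
# config.py: a sketch with placeholder values (replace with your own)
import os

# Use environment variables when they are set, otherwise fall back to placeholders.
client_id = os.environ.get("SPOTIPY_CLIENT_ID", "your-client-id")
client_secret = os.environ.get("SPOTIPY_CLIENT_SECRET", "your-client-secret")

# SPOTIFY_USER_ID is a hypothetical variable name chosen for this example.
user = os.environ.get("SPOTIFY_USER_ID", "your-spotify-user-id")
```

Keeping this file out of version control avoids publishing the client secret by accident.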

In [None]:
import numpy as np
import pandas as pd
import os
from glob import glob

# for spotify API
import config
import spotipy
import spotipy.util as util
from spotipy.oauth2 import SpotifyClientCredentials

# create an app on the Spotify Developers site to get client_id and client_secret

sp = spotipy.Spotify(client_credentials_manager=SpotifyClientCredentials(config.client_id, 
                                                                         config.client_secret))

In [None]:
# this is to grant permission for your "app" to create playlists

scope = 'playlist-modify-private playlist-modify-public' # Spotify scopes are space-separated
redirect_uri = "http://localhost:8080"
token = util.prompt_for_user_token(config.user,
                           scope,
                           client_id=config.client_id,
                           client_secret=config.client_secret,
                           redirect_uri=redirect_uri)

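# when you run this, your browser will launch and have you authorize the app!

# re-create the client with the user token: the client-credentials client above
# can search, but it cannot create or modify playlists
sp = spotipy.Spotify(auth=token)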

# Song Data from Google Play Music

### Google Takeout

On takeout.google.com, Google lets us download data they have stored on us across their many apps. From Takeout, we can download our saved songs and playlists. 

### File Structure

The downloaded data that we want is in the `Playlists` folder. Within this, each folder is a playlist containing a `Metadata.csv` that stores the playlist description, as well as a `Tracks` folder that has each track as its own csv.

**For this I dragged the `Playlists` folder into this directory.**

We use the `glob` library to sift through these files.

In [None]:
# Getting the path to each playlist folder
playlists = glob("./Playlists/*")

In [None]:
# Getting the path to each track under each playlist
master = {}
for p in playlists:
    master[p] = glob(p+"/Tracks/*")
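
The same lookup can be sketched with `pathlib`, which keys the result by the folder name instead of the full path and so avoids stripping the `./Playlists/` prefix later. The function name `collect_tracks` is made up for this example, and it assumes the folder layout described above (`Playlists/<playlist>/Tracks/*.csv`).

```python
from pathlib import Path

def collect_tracks(root="Playlists"):
    """Map each playlist folder name to a sorted list of its track CSV paths."""
    tracks_by_playlist = {}
    for playlist_dir in sorted(Path(root).iterdir()):
        if not playlist_dir.is_dir():
            continue  # skip stray files sitting next to the playlist folders
        track_files = sorted((playlist_dir / "Tracks").glob("*.csv"))
        tracks_by_playlist[playlist_dir.name] = [str(p) for p in track_files]
    return tracks_by_playlist
```

Calling `collect_tracks("./Playlists")` would then return something shaped like `master`, only with playlist names as keys.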

#### Constructing a dataframe to store all our songs

In [None]:
# Building one dataframe per playlist and collecting them in a list
playlist_dfs = []

for playlist, track_paths in master.items(): # master: playlist path -> paths to its tracks
    if not track_paths:
        continue # pd.concat raises on an empty list, so skip playlists without tracks
    pdf = pd.concat([pd.read_csv(t) for t in track_paths]) # since each track was a csv, concatenating!
    pdf['Playlist'] = playlist
    # adding the playlist name as a column so we can eventually put each df together
    
    playlist_dfs.append(pdf)

full = pd.concat(playlist_dfs, ignore_index=True)

### Quick data cleaning

In [None]:
full = full[full.Removed != 'Yes'] # some songs were removed from the library
full = full.drop(['Playlist Index', 'Removed'], axis=1) # drop unnecessary columns


In [None]:
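# cleaning up some of the strings:
# decode the HTML entities, then drop anything in parentheses, e.g. "(Remastered)"

full.Title = full['Title'].str.replace("&#39;", "'", regex=False)
full.Title = full['Title'].str.replace("&amp;", "&", regex=False)
full.Title = full['Title'].str.replace(r'\([^)]*\)', "", regex=True).str.strip()

full.Artist = full['Artist'].str.replace("&#39;", "'", regex=False)
full.Artist = full['Artist'].str.replace("&amp;", "&", regex=False)
full.Artist = full['Artist'].str.replace(r'\([^)]*\)', "", regex=True).str.strip()

# literal replacement, so the "." in the path is not treated as a regex wildcard
full.Playlist = full.Playlist.str.replace("./Playlists/", '', regex=False)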
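
The cleaning steps can also be sketched as one plain-Python helper, which is easier to test on single strings. This is an alternative under assumptions, not what the cells above do: it uses the standard library's `html.unescape`, which decodes every HTML entity and not only `&#39;` and `&amp;`, and the name `clean_text` is made up for this example.

```python
import html
import re

def clean_text(value):
    """Decode HTML entities, drop parenthesised parts, and tidy whitespace."""
    text = html.unescape(value)            # "&#39;" -> "'", "&amp;" -> "&"
    text = re.sub(r"\([^)]*\)", "", text)  # remove "(Remastered)", "(feat. ...)", etc.
    return re.sub(r"\s+", " ", text).strip()
```

Assuming the column has no missing values, it could be applied with `full['Title'].map(clean_text)`.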

# Spotify Lookup

Now I'm using each song's title and artist as the search terms to find its Spotify ID via the Spotify API wrapper, `Spotipy`.

In [None]:
# defining a function I'll then use to .apply() over the dataframe

def get_spotify_uri(row):
    artist = row['Artist']
    track = row['Title']
    
    query = 'artist: {} track: {}'.format(artist, track)
    items = sp.search(q=query, limit=1)['tracks']['items'] # getting the top search result
    
    # some items have no results!
    if items != []:
        return items[0]['uri']
    else:
        return None

In [None]:
full['spotify_id'] = full.apply(get_spotify_uri, axis=1)
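
The lookup makes one API request per row. If the same song sits in several playlists (an assumption about the library, though a likely one), a small cache avoids repeating those requests. The sketch below wraps any `lookup(artist, title)` function; the name `make_cached_lookup` is made up for this example.

```python
def make_cached_lookup(lookup):
    """Wrap `lookup(artist, title)` so each distinct pair is only looked up once."""
    cache = {}

    def cached(artist, title):
        key = (artist, title)
        if key not in cache:
            cache[key] = lookup(artist, title)  # only runs on the first request
        return cache[key]

    return cached
```

With the function defined above, one way to use it would be `cached = make_cached_lookup(lambda a, t: get_spotify_uri({'Artist': a, 'Title': t}))`, followed by `[cached(a, t) for a, t in zip(full['Artist'], full['Title'])]`.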

### Missing Songs?

I'm keeping track of which songs couldn't be found via the Spotify API, either to tune the search in the future or to add them manually.

    268/2731 songs missing
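
One possible way to tune the search, sketched here and not run against this library: try a strict query first and fall back to looser ones. Spotify's search documentation writes field filters without a space after the colon (`artist:Miles Davis`), so the strict query below uses that form. The function `find_uri` and its `search` parameter are made up for this example; `search` is any callable that takes a query string and returns a list of track items.

```python
def find_uri(artist, title, search):
    """Try progressively looser queries and return the first track URI found."""
    candidates = [
        "artist:{} track:{}".format(artist, title),  # strictest: both field filters
        "{} {}".format(title, artist),               # plain keywords
        title,                                       # title only, last resort
    ]
    for query in candidates:
        items = search(query)
        if items:
            return items[0]["uri"]
    return None  # nothing matched any of the queries
```

With Spotipy, `search` could be `lambda q: sp.search(q=q, limit=1)['tracks']['items']`. Looser queries find more songs but can also return the wrong one, so the fallback matches are worth a manual check.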

In [None]:
# how many songs are missing??
full.spotify_id.isna().sum()

In [None]:
# export the missing ones to a csv
full[full['spotify_id'].isna()].to_csv('missing.csv', index=False)

In [None]:
# dropping the songs that have no Spotify ID
full = full.dropna(subset=['spotify_id'])

In [None]:
# exporting the full songs csv, just to have
full.to_csv('songs.csv', index=False)

# Creating Playlists

In [None]:
full = pd.read_csv('songs.csv')

### Getting Playlist Metadata

In [None]:
playlists = glob("./Playlists/*")
pre1 = {}
for p in playlists:
    pre1[p] = glob(p+"/*.csv")[0]
    
descriptions = {}
for k, v in pre1.items():
    descriptions[k.replace('./Playlists/', '')] = pd.read_csv(v)['Description'].iloc[0]
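
pandas reads an empty CSV cell as `NaN`, so a playlist without a description would end up with a float where the API expects a string. Assuming some playlists do have an empty description, a small helper like the one below could normalise the value first. The name `safe_description` is made up for this example.

```python
import math

def safe_description(value):
    """Return a plain string, turning a missing (None or NaN) description into ''."""
    if value is None:
        return ""
    if isinstance(value, float) and math.isnan(value):
        return ""
    return str(value)
```

It could be applied when building the dictionary, for example by wrapping the `.iloc[0]` value in `safe_description(...)`.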


### Creating and populating playlists!!!

In [None]:
def populate_playlists(playlist):
    
    # the information for populating the playlists
    tracks = full[full.Playlist == playlist].spotify_id.to_list()
    playlist_name = playlist
    playlist_desc = descriptions[playlist]
    
    # creating an empty playlist on spotify 
    new_playlist = sp.user_playlist_create(config.user, # user ID
                                           playlist_name, 
                                           public=False, # are your playlists public/private?
                                           collaborative=False, 
                                           description=playlist_desc)
    # adding tracks! 
    # the API limits to adding 100 songs at a time, so add them in batches of 100
    # (playlist_add_items is the call documented at the end of this notebook)
    for i in range(0, len(tracks), 100):
        sp.playlist_add_items(new_playlist['id'], tracks[i:i+100])
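
The 100-track batching can also be factored into a small reusable helper. This is a minimal sketch; the name `chunked` is made up for this example and the default of 100 mirrors the limit mentioned in the comments above.

```python
def chunked(items, size=100):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Inside `populate_playlists`, the adding step would then loop over `chunked(tracks)` and make one API call per batch.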


In [None]:
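# run the function!
# iterate over the cleaned playlist names (the keys of `descriptions`), since
# `playlists` still holds the raw "./Playlists/..." paths
for p in descriptions:
    populate_playlists(p)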

## Some Documentation from Spotipy

`user_playlist_create(user, name, public=True, collaborative=False, description='')`

**Creates a playlist for a user**

Parameters:
* user - the id of the user
* name - the name of the playlist
* public - is the created playlist public
* collaborative - is the created playlist collaborative
* description - the description of the playlist



`playlist_add_items(playlist_id, items, position=None)`

**Adds tracks/episodes to a playlist**

Parameters:
* playlist_id - the id of the playlist
* items - a list of track/episode URIs, URLs or IDs
* position - the position to add the tracks