#### Kristen Shen
#### Homework 05 - Part Three: Spotify API (Music again!)
#### 11/11/2024

In this notebook you'll be using [Spotipy](https://github.com/spotipy-dev/spotipy), a Python package, to talk to the Spotify API. This means you won't have to manually create API URLs, you'll just need to figure out how to make Spotipy do it for you! The full Spotipy documentation is available at [https://spotipy.readthedocs.io/](https://spotipy.readthedocs.io/)

# To access *public* Spotify data

You'll want to go to the [Spotify for Developers Dashboard](https://developer.spotify.com/dashboard) and create a new app. This will give you a `client_id` and `client_secret`! It's like a super-advanced version of an API key. When you're setting up your app it will probably also ask you for other things like a redirect URL - just put whatever you want in there, it doesn't matter. If it asks what you want access to, you can pick the Web API (but I don't think it matters).

> The code below won't work since it's *my* secret keys. I've deleted them so that this notebook is nice and safe for me!

In [14]:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(
    client_id='XXXXXXXXXXXXXXXXX',
    client_secret='XXXXXXXXXXXXXXXXX',
))

When you want data from Spotify, you can't just go to `/artists/Pixies` in order to get work by Pixies! You have to find a special code for the artist (or song, or album, or whatever). It's called the `uri`.

> You can find more details on searching [on the Spotipy documentation](https://spotipy.readthedocs.io/en/2.22.1/#spotipy.client.Spotify.search) or the [Spotify Web API documentation](https://developer.spotify.com/documentation/web-api/reference/search). Remember that Spotipy is a Python wrapper for the Spotify API, so you don't need to work with any URLs!

To find the `uri`, you first need to do a search. Below we use `sp.search` to search for a particular artist.

In [2]:
# Search for the artist Pixies
results = sp.search(q='artist:Pixies', type='artist')

The `results` it gives us is awful and long and terrible. Instead of making you dig through it, I already poked through it and found the top artist result from our search.

In [None]:
results['artists']['items'][0]

There we go! The `uri` looks to be `spotify:artist:6zvul52xwTWzilBZl6BUbT`.
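The same code shows up in a few shapes: as a URI (`spotify:artist:...`), inside an `open.spotify.com` link, or as a bare ID. As far as I can tell Spotipy accepts any of the three for most calls, but treat that as an assumption and check the docs for the call you're using. Here is a minimal sketch of how the three forms relate, using a hypothetical helper called `extract_spotify_id` (it is not part of Spotipy):

```python
from urllib.parse import urlparse


def extract_spotify_id(value):
    """Return the bare ID from a Spotify URI, an open.spotify.com URL, or a bare ID.

    Hypothetical helper for illustration; it is not part of Spotipy.
    """
    if value.startswith("spotify:"):
        # URI form: spotify:<type>:<id>
        return value.split(":")[-1]
    if value.startswith(("http://", "https://")):
        # URL form: https://open.spotify.com/<type>/<id>?si=...
        # urlparse keeps the query string (?si=...) out of .path
        return urlparse(value).path.rstrip("/").split("/")[-1]
    # Otherwise assume it is already a bare ID
    return value


print(extract_spotify_id("spotify:artist:6zvul52xwTWzilBZl6BUbT"))
print(extract_spotify_id("https://open.spotify.com/playlist/7r2XnAUl6moWkcwOaWgihD?si=505c8f22f4314a33"))
```

The `?si=...` part of a share link is just tracking information, so the sketch drops it.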

Now the sad part: the Spotipy documentation is...... not great. The Spotify Web API docs look good, *but* we're using the Python wrapper, not the raw Spotify API! Luckily Spotipy has a great [list of examples](https://github.com/spotipy-dev/spotipy/tree/master/examples), including one for [an artist's top tracks](https://github.com/spotipy-dev/spotipy/blob/master/examples/simple_artist_top_tracks.py).

```python
from spotipy.oauth2 import SpotifyClientCredentials
import spotipy

lz_uri = 'spotify:artist:36QJpDe2go2KgaRleHCDTp'

client_credentials_manager = SpotifyClientCredentials()
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)

results = sp.artist_top_tracks(lz_uri)

for track in results['tracks'][:10]:
    print('track    : ' + track['name'])
    print('audio    : ' + track['preview_url'])
    print('cover art: ' + track['album']['images'][0]['url'])
```
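One caveat with the example above: the Spotify Web API reference lists `preview_url` as nullable, and `'audio    : ' + None` raises a `TypeError`. Here is a minimal sketch of a safer way to print it. The sample data and the `describe` helper are made up for illustration; only the `name` and `preview_url` keys come from the example.

```python
# Hypothetical sample shaped like sp.artist_top_tracks(...)['tracks'];
# the names and URLs are made up for illustration.
sample_tracks = [
    {"name": "Song A", "preview_url": "https://example.com/a.mp3"},
    {"name": "Song B", "preview_url": None},  # no preview available
]


def describe(track):
    """Build a printable line without assuming preview_url is a string."""
    preview = track.get("preview_url") or "(no preview)"
    return f"{track['name']}: {preview}"


for track in sample_tracks:
    print(describe(track))
```

Using an f-string instead of `+` also avoids the error, although it would print the word `None`.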

Since we already have the credentials and blah blah blah set up, all we need to do is adapt the `sp.artist_top_tracks(lz_uri)` line and everything below it.

In [None]:
results = sp.artist_top_tracks('spotify:artist:6zvul52xwTWzilBZl6BUbT')

for track in results['tracks'][:10]:
    print(track['name'])

And that's about it! You use magic codes and it lets you get up-to-date information.

# Your mission

I recently came across a Spotify playlist called [Fall in a 90s Suburb](https://open.spotify.com/playlist/7r2XnAUl6moWkcwOaWgihD?si=505c8f22f4314a33) while researching the band [SEAGULL SCREAMING KISS HER KISS HER](https://open.spotify.com/artist/1WSO9nf7wTj5DZBsncauGr?si=S0xpngxHR1mLF720lMZwxg). The playlist was pretty good, but since SSKHKH only has like 1,500 listeners each month I was curious about the most/least popular songs on the playlist.

## My questions

1. What are the ten most popular songs on the playlist?
2. What percentage of them have a popularity of zero? Print them out, sorted by the band name.
3. Is popularity relative to the artist, the album, all songs on Spotify, or something else?

### My suggested approach

I suggest approaching this through the following steps:

1. Get the playlist and print out its **name and description**.
2. Print out **the name and popularity of each song**
3. Print out **the name, popularity, and artists** of each track on the playlist. Or, if you'd like a shortcut, just pick the first artist.
4. Instead of printing, use these to **create a new dictionary** each time you look at a track. Print out this dictionary. You should be printing out 476 dictionaries!
5. Printing isn't helpful! Instead, after you create the dictionary **append it** to a list called `all_tracks`
6. When you're done, `all_tracks` should have 476 items in it
7. Sort the list by `popularity`, take the **top ten**
8. Filter the list by `popularity`, selecting only the ones with a popularity of `0`

### Tips

**Spotipy documentation:** https://spotipy.readthedocs.io/

**Spotify Web API documentation:** https://developer.spotify.com/documentation/web-api/

- Do this in many, many cells, not all in one!
- You definitely want to [look at the Spotipy examples](https://github.com/spotipy-dev/spotipy/tree/master/examples) to find some good code to base your answer off of. There are a handful that talk about playlists – it might be helpful to read and compare a few of them!
- Getting the playlist name/description is **a different endpoint** than getting the actual songs on the playlist.
- Are you printing out the **same number of tracks as are in the actual playlist?** Take note and be careful! It should be ~476.
- If you're getting the id of playlist songs but not seeing song names, look for `fields='items.track.id,total'` in your code. It's only pulling the track's id! Change it to `items.track,total` and it will return [more information about each track](https://developer.spotify.com/documentation/web-api/reference/get-playlists-tracks)
- `all_tracks = []` should be the first line in your cell. That makes sure it always resets to being empty before you start adding tracks to it.
- Be sure the first and last items in `all_tracks` are different – maybe you're accidentally adding the same item each time!
- Normally we sort lists of numbers, which is easy. Sorting a list of dictionaries can be done easily with `key=`. Look it up!
- Pick the most popular 10 songs using list slicing.
- Filtering is best done with a list comprehension.
- You can sort by things that aren't numbers!
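The sorting and filtering tips above can be sketched on a tiny list. The data here is invented, and so are the key names (`name`, `artist`, `popularity`), so adapt them to whatever your own dictionaries use:

```python
# Hypothetical sample data; the real list would come from the playlist.
sample = [
    {"name": "Song A", "artist": "Beta Band", "popularity": 12},
    {"name": "Song B", "artist": "Alpha Band", "popularity": 40},
    {"name": "Song C", "artist": "Gamma Band", "popularity": 0},
]

# Sort by a number, biggest first
by_popularity = sorted(sample, key=lambda t: t["popularity"], reverse=True)

# Sort by a string (alphabetical)
by_artist = sorted(sample, key=lambda t: t["artist"])

# Keep only the items that pass a test
zeros = [t for t in sample if t["popularity"] == 0]

print([t["name"] for t in by_popularity])
print([t["artist"] for t in by_artist])
print([t["name"] for t in zeros])
```

`sorted` returns a new list and leaves the original alone, which is handy when you want several different orderings of the same data.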

In [5]:
# setting up
pip install spotipy --upgrade

Collecting spotipy
  Downloading spotipy-2.24.0-py3-none-any.whl.metadata (4.9 kB)
Collecting redis>=3.5.3 (from spotipy)
  Downloading redis-5.2.0-py3-none-any.whl.metadata (9.1 kB)
Downloading spotipy-2.24.0-py3-none-any.whl (30 kB)
Downloading redis-5.2.0-py3-none-any.whl (261 kB)
Installing collected packages: redis, spotipy
Successfully installed redis-5.2.0 spotipy-2.24.0

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.2[0m[39;49m -> [0m[32;49m24.3.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


In [6]:
pip install --upgrade pip

Collecting pip
  Using cached pip-24.3.1-py3-none-any.whl.metadata (3.7 kB)
Using cached pip-24.3.1-py3-none-any.whl (1.8 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 24.2
    Uninstalling pip-24.2:
      Successfully uninstalled pip-24.2
Successfully installed pip-24.3.1
Note: you may need to restart the kernel to use updated packages.


#### Suggestion 1. Get the playlist and print out its name and description.

In [3]:
pip install python-dotenv

Note: you may need to restart the kernel to use updated packages.


In [10]:
# Load my credentials from a .env file so they are not included in this notebook
from dotenv import load_dotenv
import os

load_dotenv()

client_id = os.getenv("client_id")
client_secret= os.getenv("client_secret")
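`os.getenv` returns `None` when a variable is missing, and that would only surface later as an authentication error. A minimal sketch of failing early instead, using a hypothetical helper `require_env` and a dummy variable name so it runs without a real `.env` file:

```python
import os


def require_env(name):
    """Return the environment variable `name`, or raise if it is unset or empty.

    Hypothetical helper for illustration.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value


# Demo with a dummy value so the sketch runs without a real .env file
os.environ["DEMO_CLIENT_ID"] = "dummy-id-for-demo"
print(require_env("DEMO_CLIENT_ID"))
```

In this notebook the calls would be `require_env("client_id")` and `require_env("client_secret")`, matching the keys read above.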

In [15]:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# Pass the loaded values directly; wrapping them in f-strings would turn a missing value into the string "None"
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(
    client_id=client_id,
    client_secret=client_secret,
))

In [4]:
# get the playlist id
playlist_id = "7r2XnAUl6moWkcwOaWgihD"

# Get playlist details
playlist = sp.playlist(playlist_id)

# check the keys in the dictionary playlist
playlist.keys()

dict_keys(['collaborative', 'description', 'external_urls', 'followers', 'href', 'id', 'images', 'name', 'owner', 'primary_color', 'public', 'snapshot_id', 'tracks', 'type', 'uri'])

In [5]:
# Print name and description
print(f"Playlist Name: {playlist['name']}")
print("-------")
print(f"Description: {playlist['description']}")

Playlist Name: Fall in a 90s Suburb 🍂 
-------
Description: fuzzy guitars from the 80s, 90s &amp; early 00s for feeling angsty as the seasons change.  put on a sweater and listen to some indie rock, shoegaze, and noisy twee.
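The `&amp;` in the description looks like an HTML-escaped `&`. Assuming the description comes back HTML-escaped, the standard library's `html.unescape` turns it back into plain text. A small sketch on a shortened copy of the description:

```python
import html

# Shortened copy of the description printed above
raw_description = "fuzzy guitars from the 80s, 90s &amp; early 00s"

clean_description = html.unescape(raw_description)
print(clean_description)
```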


#### Suggestion 2. Print out the name and popularity of each song

In [6]:
# Get playlist tracks
playlist_tracks = sp.playlist_tracks(playlist_id)
# playlist_tracks
# check for keys in this dictionary
# playlist_tracks.keys()
# check the first item of this list
# playlist_tracks['items'][0]

In [7]:
# check the keys in this dictionary
playlist_tracks['items'][0].keys()

dict_keys(['added_at', 'added_by', 'is_local', 'primary_color', 'track', 'video_thumbnail'])

In [8]:
# check the keys in this dictionary
playlist_tracks['items'][0]['track'].keys()

dict_keys(['preview_url', 'available_markets', 'explicit', 'type', 'episode', 'track', 'album', 'artists', 'disc_number', 'track_number', 'duration_ms', 'external_ids', 'external_urls', 'href', 'id', 'name', 'popularity', 'uri', 'is_local'])

In [55]:
# confirm I could get the info I want
# playlist_tracks['items'][0]['track']['name']

In [56]:
# playlist_tracks['items'][0]['track']['popularity']

In [9]:
# get song name and popularity
for song_info in playlist_tracks['items']:
    song_name = song_info['track']['name']
    song_popularity = song_info['track']['popularity']
    print(f"Track Name: {song_name}")
    print("-----")
    print(f"Popularity: {song_popularity}")

Track Name: Waiting For October
-----
Popularity: 28
Track Name: Scott Pilgrim
-----
Popularity: 44
Track Name: Ginger
-----
Popularity: 24
Track Name: Frontwards
-----
Popularity: 0
Track Name: First Revival
-----
Popularity: 0
Track Name: I Can See It (But I Can't Feel It)
-----
Popularity: 27
Track Name: Skyscraper
-----
Popularity: 12
Track Name: Jar Of Cardinals
-----
Popularity: 21
Track Name: Get Back
-----
Popularity: 34
Track Name: Tripoli
-----
Popularity: 0
Track Name: Everything Flows
-----
Popularity: 0
Track Name: (When You Wake) You're Still in a Dream
-----
Popularity: 27
Track Name: Barnaby, Hardly Working
-----
Popularity: 0
Track Name: Nail Clinic
-----
Popularity: 0
Track Name: Number One Blind
-----
Popularity: 35
Track Name: Green Grow The Rushes
-----
Popularity: 24
Track Name: Don't Look Back
-----
Popularity: 21
Track Name: Sweetness and Light
-----
Popularity: 0
Track Name: Marzipan
-----
Popularity: 0
Track Name: The Backyard
-----
Popularity: 29
Track Name: 

#### Suggestion 3. Print out the name, popularity, and artists of each track on the playlist. Or, if you'd like a shortcut, just pick the first artist.

In [10]:
# Since I already got the name and popularity in Suggestion 2, I will just get the artists' names here.
for song_info in playlist_tracks['items']:
    for artist_info in song_info['track']['artists']:
        song_artist = artist_info['name']
        print(f"Artist: {song_artist}")

Artist: Polaris
Artist: Plumtree
Artist: Lilys
Artist: Pavement
Artist: The Amps
Artist: my bloody valentine
Artist: The Boo Radleys
Artist: Guided By Voices
Artist: Veruca Salt
Artist: Pinback
Artist: Teenage Fanclub
Artist: my bloody valentine
Artist: Yo La Tengo
Artist: Pavement
Artist: Veruca Salt
Artist: R.E.M.
Artist: Teenage Fanclub
Artist: Lush
Artist: Velocity Girl
Artist: Miracle Legion
Artist: Pavement
Artist: Ride
Artist: Brittle Stars
Artist: Ride
Artist: The Boo Radleys
Artist: Chapterhouse
Artist: Stereolab
Artist: The Ropers
Artist: Guided By Voices
Artist: Yo La Tengo
Artist: Tiger Trap
Artist: Beulah
Artist: Ride
Artist: Yo La Tengo
Artist: The Magnetic Fields
Artist: The Lemonheads
Artist: The Smashing Pumpkins
Artist: Stereolab
Artist: The Replacements
Artist: The Bats
Artist: Yo La Tengo
Artist: Black Tambourine
Artist: Elliott Smith
Artist: R.E.M.
Artist: Guided By Voices
Artist: Teenage Fanclub
Artist: Ride
Artist: The Breeders
Artist: Yo La Tengo
Artist: Noise A

In [12]:
# Get the total number of tracks
total_tracks = sp.playlist(playlist_id)['tracks']['total']
total_tracks

476

#### Suggestion 4. Instead of printing, use these to create a new dictionary each time you look at a track.
###### Print out this dictionary. You should be printing out 476 dictionaries! (I included Suggestion 5 in the same cell below.)

In [20]:
# Prepare for Suggestion 5: start with an empty list
all_tracks = []

# Each request here returns at most 100 tracks (limit=100), so step the offset by 100 to page through all of them
for offset in range(0, total_tracks, 100):
    playlist_tracks = sp.playlist_tracks(playlist_id, offset=offset, limit=100)
    for song_info in playlist_tracks['items']:
        # Create a dictionary for the track
        track_info = {
            'track name': song_info['track']['name'],
            'popularity': song_info['track']['popularity'],
            'artists': ', '.join([artist['name'] for artist in song_info['track']['artists']])
        }

        all_tracks.append(track_info)
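To double-check the paging logic without calling the API, here is a minimal sketch that swaps in a fake 476-track playlist. `fetch_page` is a hypothetical stand-in for `sp.playlist_tracks`, and the `None` check is an assumption about entries whose track data is unavailable:

```python
# Stand-in for sp.playlist_tracks(...): returns one "page" of a fake
# 476-track playlist. Hypothetical; it only mimics the 'items' key.
TOTAL = 476
FAKE_PLAYLIST = [
    {"track": {"name": f"Song {i}", "popularity": i % 50}} for i in range(TOTAL)
]


def fetch_page(offset, limit=100):
    """Return a dictionary shaped like one page of playlist results."""
    return {"items": FAKE_PLAYLIST[offset:offset + limit], "total": TOTAL}


offsets = list(range(0, TOTAL, 100))
print(offsets)  # the last page (offset 400) holds the remaining 76 tracks

collected = []
for offset in offsets:
    page = fetch_page(offset)
    for item in page["items"]:
        track = item["track"]
        if track is None:
            # Assumption: skip entries with no track data
            continue
        collected.append({"track name": track["name"], "popularity": track["popularity"]})

print(len(collected))
```

Spotipy also has `sp.next(results)`, which follows the `next` link of a paged result and could replace the manual offsets.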

#### Suggestion 6. When you're done, all_tracks should have 476 items in it

In [21]:
# check if I have 476 items in all_tracks
len(all_tracks)

476

#### Suggestion 7: Sort the list by popularity, take the top ten

##### I searched Stack Overflow for how to sort a list of dictionaries
##### https://stackoverflow.com/questions/8966538/syntax-behind-sortedkey-lambda

In [22]:
sorted_tracks = sorted(all_tracks, key=lambda x: x['popularity'], reverse=True)
# take top ten
top_ten_tracks = sorted_tracks[:10]
print(top_ten_tracks)

[{'track name': '1979 - Remastered 2012', 'popularity': 77, 'artists': 'The Smashing Pumpkins'}, {'track name': 'Today - 2011 Remaster', 'popularity': 69, 'artists': 'The Smashing Pumpkins'}, {'track name': 'Halah', 'popularity': 67, 'artists': 'Mazzy Star'}, {'track name': 'Cherry-coloured Funk', 'popularity': 66, 'artists': 'Cocteau Twins'}, {'track name': 'Coffee & TV', 'popularity': 61, 'artists': 'Blur'}, {'track name': 'Drown', 'popularity': 60, 'artists': 'The Smashing Pumpkins'}, {'track name': 'When You Sleep', 'popularity': 59, 'artists': 'my bloody valentine'}, {'track name': 'She Bangs the Drums - Remastered 2009', 'popularity': 59, 'artists': 'The Stone Roses'}, {'track name': 'Lorelei', 'popularity': 54, 'artists': 'Cocteau Twins'}, {'track name': 'Blue Flower', 'popularity': 53, 'artists': 'Mazzy Star'}]
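Two tracks in this output tie at a popularity of 59. Python's `sorted` is stable, so tied tracks keep their playlist order. If a predictable order for ties matters, one option is a tuple key; this is a sketch of an alternative, not what the cell above does:

```python
# Two tracks from the top-ten output above that share a popularity of 59,
# plus one more, deliberately listed out of order
sample = [
    {"track name": "When You Sleep", "popularity": 59},
    {"track name": "Lorelei", "popularity": 54},
    {"track name": "She Bangs the Drums - Remastered 2009", "popularity": 59},
]

# Negating popularity sorts high-to-low; the name breaks ties alphabetically
ranked = sorted(sample, key=lambda t: (-t["popularity"], t["track name"]))

for t in ranked:
    print(t["popularity"], t["track name"])
```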


### Question 1: What are the ten most popular songs on the playlist?

In [23]:
print("The ten most popular songs on the playlist are:")

for each_song in top_ten_tracks:
    song_name = each_song['track name']
    song_popularity = each_song['popularity']
    print(f"Song name: {song_name}, popularity: {song_popularity}")
    print("-----")

The ten most popular songs on the playlist are:
Song name: 1979 - Remastered 2012, popularity: 77
-----
Song name: Today - 2011 Remaster, popularity: 69
-----
Song name: Halah, popularity: 67
-----
Song name: Cherry-coloured Funk, popularity: 66
-----
Song name: Coffee & TV, popularity: 61
-----
Song name: Drown, popularity: 60
-----
Song name: When You Sleep, popularity: 59
-----
Song name: She Bangs the Drums - Remastered 2009, popularity: 59
-----
Song name: Lorelei, popularity: 54
-----
Song name: Blue Flower, popularity: 53
-----


#### Suggestion 8: Filter the list by popularity, selecting only the ones with a popularity of 0

In [49]:
track_pop_0 = [track for track in all_tracks if track['popularity'] == 0]
# track_pop_0

### Question 2: (I broke it into two parts)
##### question 2.1 - What percentage of them have a popularity of zero?

In [30]:
num_pop_0 = len(track_pop_0)
num_total = len(all_tracks)

In [38]:
percent = num_pop_0 / num_total * 100
percent_rounded = round(percent, 2)
f"{percent_rounded} percent of them have a popularity of zero."

'38.45 percent of them have a popularity of zero.'
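An f-string format spec can do the multiplication, rounding, and percent sign in one step. A small sketch with made-up counts (3 out of 8) to show the formatting:

```python
# Hypothetical counts, just to show the formatting
num_zero = 3
num_total = 8

share = num_zero / num_total

# ".2%" multiplies by 100, keeps two decimals, and appends a percent sign
message = f"{share:.2%} of the tracks have a popularity of zero."
print(message)
```

Unlike `round`, the format spec always keeps two decimal places, so a value like 37.5 is shown as `37.50%`.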

##### question 2.2 - Print them out, sorted by the band name.

##### I want to group the track names by band name. To do this, I need to create a separate dictionary to hold this information.
##### Instead of creating an empty list, I searched the Python documentation (linked below) and learned to import `defaultdict`,
##### which lets us access dictionary keys that do not exist yet by giving them a default value.
##### https://docs.python.org/3/library/collections.html#collections.defaultdict

In [60]:
from collections import defaultdict
tracks_groupBy_band = defaultdict(list)

In [64]:
# add track names in accordance to the band name in the dictionary tracks_groupBy_band
for track in track_pop_0:
    tracks_groupBy_band[track['artists']].append(track['track name'])

In [68]:
# sorted by band in alphabetical order
sorted_track_byBand = sorted(tracks_groupBy_band.items())
# check this list to make sure I added the right items
# sorted_track_byBand
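One thing to watch: Python's default string sort puts every uppercase letter before any lowercase one, and some band names on this playlist (such as `my bloody valentine`) are stored in lowercase. A minimal sketch of a case-insensitive sort using `str.casefold`, with placeholder track titles:

```python
from collections import defaultdict

# Hypothetical zero-popularity tracks; titles are placeholders
sample_tracks = [
    {"track name": "Title 1", "artists": "my bloody valentine"},
    {"track name": "Title 2", "artists": "Yo La Tengo"},
    {"track name": "Title 3", "artists": "Beat Happening"},
    {"track name": "Title 4", "artists": "Yo La Tengo"},
]

grouped = defaultdict(list)
for track in sample_tracks:
    grouped[track["artists"]].append(track["track name"])

# Default string sort: every uppercase letter sorts before any lowercase one
default_order = [band for band, _ in sorted(grouped.items())]

# casefold() compares names case-insensitively
casefold_order = [
    band for band, _ in sorted(grouped.items(), key=lambda pair: pair[0].casefold())
]

print(default_order)
print(casefold_order)
```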

In [72]:
# display results
print("Tracks in following bands have popularity of zero:")

for band, tracks in sorted_track_byBand: 
    print(f"Band name: {band}")
    for track_name in tracks:
        print(f"----- Track Name: {track_name}")  


Tracks in following bands have popularity of zero:
Band name: A.R. Kane
----- Track Name: When You're Sad
Band name: Alison's Halo
----- Track Name: Dozen
Band name: Archers Of Loaf
----- Track Name: Plumb Line
----- Track Name: Web in Front
Band name: Bailter Space
----- Track Name: Splat
Band name: Beat Happening
----- Track Name: Noise
----- Track Name: Godsend
Band name: Beatnik Filmstars
----- Track Name: Try Some to See
Band name: Belle and Sebastian
----- Track Name: Get Me Away From Here, I'm Dying
----- Track Name: Sleep The Clock Around
----- Track Name: The Boy With The Arab Strap
----- Track Name: The Stars of Track and Field
----- Track Name: If You're Feeling Sinister
Band name: Black Tambourine
----- Track Name: Lazy Heart
----- Track Name: Throw Aggi Off the Bridge
----- Track Name: For Ex-Lovers Only
Band name: Built To Spill
----- Track Name: Fling
----- Track Name: Big Dipper
Band name: Camera Obscura
----- Track Name: I Don't Do Crowds
Band name: Cocteau Twins
-----

### Question 3: Is popularity relative to the artist, the album, all songs on Spotify, or something else?

##### Based on the previous questions, I found that `popularity` is a key in each track's dictionary (I examined this in cell 8 by checking the keys of `playlist_tracks['items'][0]['track']`). Since it sits on the individual track and does not appear in the other dictionaries I inspected, I think popularity is directly tied to each track rather than to artists or albums.

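These websites (linked below) also support my conclusion, saying that each track has its own popularity index. Other factors may indirectly affect a song's index. The Spotify Web API reference describes track popularity as a value from 0 to 100 calculated by an algorithm, based mostly on the track's total number of plays and how recent those plays are.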

https://community.spotify.com/t5/Spotify-for-Developers/question-about-the-popularity-score-of-an-artist/td-p/5795193

https://www.loudlab.org/blog/spotify-popularity-leverage-algorithm/#:~:text=Each%20track%20has%20its%20own,down%20a%20song's%20popularity%20index.