---
title: "Favorite Artist"
author: "Jake Starkey"
date: "2024-03-06"
categories: [ code, data-analysis]
image: "download.webp"
execute:
  warning: false
  message: false
editor:
  markdown:
    wrap: sentence
---

## Favorite Artist

The purpose of this blog post is to give readers insight into my favorite artists within the Spotify DataFrame.

Below is the spotify DataFrame, which is read from the file spotify_all.csv and contains Spotify users' playlist information (Source: Spotify Million Playlist Dataset Challenge).

In [1]:
import pandas as pd
spotify = pd.read_csv('https://bcdanl.github.io/data/spotify_all.csv')
spotify

|        | pid    | playlist_name | pos | artist_name       | track_name                                 | duration_ms | album_name                                     |
|--------|--------|---------------|-----|-------------------|--------------------------------------------|-------------|------------------------------------------------|
| 0      | 0      | Throwbacks    | 0   | Missy Elliott     | Lose Control (feat. Ciara & Fat Man Scoop) | 226863      | The Cookbook                                   |
| 1      | 0      | Throwbacks    | 1   | Britney Spears    | Toxic                                      | 198800      | In The Zone                                    |
| 2      | 0      | Throwbacks    | 2   | Beyoncé           | Crazy In Love                              | 235933      | Dangerously In Love (Alben für die Ewigkeit)   |
| 3      | 0      | Throwbacks    | 3   | Justin Timberlake | Rock Your Body                             | 267266      | Justified                                      |
| 4      | 0      | Throwbacks    | 4   | Shaggy            | It Wasn't Me                               | 227600      | Hot Shot                                       |
| ...    | ...    | ...           | ... | ...               | ...                                        | ...         | ...                                            |
| 198000 | 999998 | ✝️            | 6   | Chris Tomlin      | Waterfall                                  | 209573      | Love Ran Red                                   |
| 198001 | 999998 | ✝️            | 7   | Chris Tomlin      | The Roar                                   | 220106      | Love Ran Red                                   |
| 198002 | 999998 | ✝️            | 8   | Crowder           | Lift Your Head Weary Sinner (Chains)       | 224666      | Neon Steeple                                   |
| 198003 | 999998 | ✝️            | 9   | Chris Tomlin      | We Fall Down                               | 280960      | How Great Is Our God: The Essential Collection |


## Variable Description

-   pid: playlist ID; unique ID for playlist
-   playlist_name: name of the playlist
-   pos: position of the track within a playlist (starting from 0)
-   artist_name: name of the track's primary artist
-   track_name: name of the track
-   duration_ms: duration of the track in milliseconds
-   album_name: name of the track's album

## Occurrences

In [2]:
artist_count = spotify['artist_name'].value_counts()
artist_count

Drake                2715
Kanye West           1065
Kendrick Lamar       1035
Rihanna               915
The Weeknd            913
                     ... 
Luna City Express       1
Ninetoes                1
Rhemi                   1
Jamie 3:26              1
Caleb and Kelsey        1
Name: artist_name, Length: 18866, dtype: int64

-   The above code counts the occurrences of each artist. As you can see, Drake appears the most in playlists, followed by Kanye West, who is my favorite artist.
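
To make the counting step easier to follow, here is a minimal sketch on a tiny made-up DataFrame. The name `toy` and its rows are my own stand-ins, not the real dataset. It shows the raw counts and, using the `normalize=True` option, each artist's share of all rows.

```python
import pandas as pd

# Hypothetical toy data standing in for the full spotify DataFrame
toy = pd.DataFrame({
    "artist_name": ["Drake", "Drake", "Kanye West", "Drake", "Rihanna", "Kanye West"],
})

# Raw counts, sorted from most to least frequent (the default behavior)
counts = toy["artist_name"].value_counts()

# Share of all rows per artist instead of raw counts
shares = toy["artist_name"].value_counts(normalize=True)

print(counts)
print(shares.round(2))
```

On the real data, the same `normalize=True` call would report what fraction of all playlist entries each artist accounts for.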


## Favorite Artists DataFrame

In [3]:
favorite_artists = spotify[spotify['artist_name'].isin(['Kanye West', '21 Savage','Drake'])]
favorite_artists

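|        | pid    | playlist_name | pos | artist_name | track_name           | duration_ms | album_name                           |
|--------|--------|---------------|-----|-------------|----------------------|-------------|--------------------------------------|
| 522    | 10     | abby          | 8   | Drake       | Portland             | 236614      | More Life                            |
| 544    | 10     | abby          | 30  | Drake       | Preach               | 236973      | If You're Reading This It's Too Late |
| 570    | 10     | abby          | 56  | Drake       | Headlines            | 235986      | Take Care                            |
| 638    | 11     | VIBE          | 52  | Drake       | Houstatlantavegas    | 290426      | So Far Gone                          |
| 639    | 11     | VIBE          | 53  | Drake       | Runnin Away For Good | 292666      | The Drake LP                         |
| ...    | ...    | ...           | ... | ...         | ...                  | ...         | ...                                  |
| 197564 | 999990 | Drake         | 0   | Drake       | One Dance            | 173986      | Views                                |
| 197565 | 999990 | Drake         | 1   | Drake       | Fake Love            | 210937      | More Life                            |
| 197566 | 999990 | Drake         | 2   | Drake       | Pop Style            | 212946      | Views                                |
| 197567 | 999990 | Drake         | 3   | Drake       | Hotline Bling        | 267066      | Views                                |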



-   The above code filters the DataFrame to show only the songs by my three favorite artists: Drake, Kanye West, and 21 Savage
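
The filter works by building a True/False mask with `isin` and keeping only the rows where the mask is True. Below is a small sketch of that idea on a handful of rows borrowed from the tables in this post. The names `toy`, `favorites`, `mask`, and `missing` are my own. The last step is an optional sanity check I find useful: it reports any artist name that matched no rows, which usually points to a spelling mismatch.

```python
import pandas as pd

# A few artist/track pairs taken from the tables in this post (toy subset)
toy = pd.DataFrame({
    "artist_name": ["Drake", "Britney Spears", "Kanye West", "21 Savage", "Beyoncé"],
    "track_name": ["Headlines", "Toxic", "Runaway", "7 Min Freestyle", "Crazy In Love"],
})

favorites = ["Kanye West", "21 Savage", "Drake"]

# Boolean Series: True where artist_name is one of the favorites
mask = toy["artist_name"].isin(favorites)

# Keep only the rows where the mask is True
favorite_rows = toy[mask]

# Names in the list that matched nothing (empty set means every name was found)
missing = set(favorites) - set(favorite_rows["artist_name"])

print(favorite_rows)
print(missing)
```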

## Favorite Artists' Occurrences

In [4]:
favorite_count = favorite_artists['artist_name'].value_counts()
favorite_count

Drake         2715
Kanye West    1065
21 Savage      344
Name: artist_name, dtype: int64

-   This code shows the number of occurrences that my three favorite artists have in the data set
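
To put these counts in perspective, the sketch below converts them into percentages of the three-artist total. The counts are copied from the output above rather than recomputed from the CSV, and the name `favorite_share` is my own.

```python
import pandas as pd

# Counts copied from the value_counts output shown above
favorite_count = pd.Series(
    {"Drake": 2715, "Kanye West": 1065, "21 Savage": 344},
    name="artist_name",
)

# Each artist's percentage of the combined total for the three favorites
favorite_share = (favorite_count / favorite_count.sum() * 100).round(1)

print(favorite_share)
```

By this measure Drake accounts for roughly two thirds of the rows that belong to my three favorite artists.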

## Favorite Artists' Track Duration

In [5]:
sorted_fav_artists = favorite_artists.sort_values(by = 'duration_ms', ascending = False)
no_duplicates = sorted_fav_artists.drop_duplicates(subset=['artist_name', 'track_name'])
longest_tracks = no_duplicates[['artist_name', 'track_name', 'duration_ms']].head(10)
longest_tracks

|        | artist_name | track_name                                | duration_ms |
|--------|-------------|-------------------------------------------|-------------|
| 67145  | Kanye West  | Last Call                                 | 760973      |
| 119752 | Kanye West  | Runaway                                   | 547733      |
| 123616 | Kanye West  | Blame Game                                | 469866      |
| 123594 | Drake       | Cameras / Good Ones Go Interlude - Medley | 434960      |
| 47541  | Drake       | Pound Cake / Paris Morton Music 2         | 433800      |
| 138846 | 21 Savage   | 7 Min Freestyle                           | 431586      |
| 65122  | Drake       | Shut It Down                              | 419306      |
| 162482 | Kanye West  | So Appalled                               | 397666      |
| 131325 | Drake       | Uptown                                    | 381240      |
| 55630  | Kanye West  | Monster                                   | 378893      |


-   The above code sorts the DataFrame containing only tracks from my favorite artists by track duration, longest first
-   It then drops duplicates so that each song appears only once
-   Finally, it shows the 10 longest tracks by my three favorite artists
-   Kanye West has the three longest songs among my favorite artists.

## Favorite Artists' Average Position

In [6]:
avg_pos = favorite_artists.groupby('artist_name')['pos'].mean()
avg_pos

artist_name
21 Savage     57.776163
Drake         57.143278
Kanye West    51.292958
Name: pos, dtype: float64

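-   The above code computes the average position of tracks within a playlist for each of my three favorite artists within the Spotify DataFrame. Because pos starts from 0, a lower average means the tracks tend to sit earlier in playlists: Kanye West's tracks average about 51, compared with about 57 for Drake and 21 Savage.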
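
A mean can be pulled around by a few very long playlists, so it can help to look at more than one statistic per artist. The sketch below is one way to do that with `agg` on made-up positions. The name `toy` and its values are hypothetical and only illustrate the shape of the result.

```python
import pandas as pd

# Hypothetical positions, not taken from the real dataset
toy = pd.DataFrame({
    "artist_name": ["Drake", "Drake", "Kanye West", "Kanye West", "21 Savage"],
    "pos": [0, 10, 4, 8, 30],
})

# Mean, median, and row count of pos for each artist in one table
summary = toy.groupby("artist_name")["pos"].agg(["mean", "median", "count"])

print(summary)
```

The `count` column is a useful companion to the averages, since an average based on a few hundred rows is less stable than one based on a few thousand.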
