---
title: "Favorite Artists"
author: "Owen Ellick"
date: "2024-03-08"
categories: [Data Analysis]
image: "drake.png"
execute:
  warning: false
  message: false

toc: true
---

![](drake.png)

# Favorite Artist(s) 🎤

The purpose of this blog post is to give readers insight into my favorite artists within the Spotify DataFrame.

The code below reads the file spotify_all.csv, which contains Spotify users' playlist information, into the spotify DataFrame (Source: Spotify Million Playlist Dataset Challenge).

In [None]:
import pandas as pd
spotify = pd.read_csv('https://bcdanl.github.io/data/spotify_all.csv')
spotify

|  | pid | playlist_name | pos | artist_name | track_name | duration_ms | album_name |
|---|---|---|---|---|---|---|---|
| 0 | 0 | Throwbacks | 0 | Missy Elliott | Lose Control (feat. Ciara & Fat Man Scoop) | 226863 | The Cookbook |
| 1 | 0 | Throwbacks | 1 | Britney Spears | Toxic | 198800 | In The Zone |
| 2 | 0 | Throwbacks | 2 | Beyoncé | Crazy In Love | 235933 | Dangerously In Love (Alben für die Ewigkeit) |
| 3 | 0 | Throwbacks | 3 | Justin Timberlake | Rock Your Body | 267266 | Justified |
| 4 | 0 | Throwbacks | 4 | Shaggy | It Wasn't Me | 227600 | Hot Shot |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 198000 | 999998 | ✝️ | 6 | Chris Tomlin | Waterfall | 209573 | Love Ran Red |
| 198001 | 999998 | ✝️ | 7 | Chris Tomlin | The Roar | 220106 | Love Ran Red |
| 198002 | 999998 | ✝️ | 8 | Crowder | Lift Your Head Weary Sinner (Chains) | 224666 | Neon Steeple |
| 198003 | 999998 | ✝️ | 9 | Chris Tomlin | We Fall Down | 280960 | How Great Is Our God: The Essential Collection |


## Variable Description 👨‍🎤

- pid: playlist ID; unique ID for playlist
- playlist_name: name of the playlist
- pos: position of the track within a playlist (starting from 0)
- artist_name: name of the track's primary artist
- track_name: name of the track
- duration_ms: duration of the track in milliseconds
- album_name: name of the track's album
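
Since duration_ms is stored in milliseconds, it can be easier to read as minutes. Below is a minimal sketch of that conversion, using three rows copied from the table above. The column name duration_min is chosen here for illustration only and is not part of the dataset.

```python
import pandas as pd

# Three rows copied from the spotify table shown above
sample = pd.DataFrame({
    "artist_name": ["Britney Spears", "Beyoncé", "Justin Timberlake"],
    "track_name": ["Toxic", "Crazy In Love", "Rock Your Body"],
    "duration_ms": [198800, 235933, 267266],
})

# 1 minute = 60,000 milliseconds; duration_min is a hypothetical helper column
sample["duration_min"] = (sample["duration_ms"] / 60_000).round(2)

print(sample[["track_name", "duration_ms", "duration_min"]])
```

The same one-line division could be applied to the full spotify DataFrame if a minutes column is more convenient to read.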

## Occurrences 🎙️

In [None]:
artist_count = spotify['artist_name'].value_counts()
artist_count

Drake                2715
Kanye West           1065
Kendrick Lamar       1035
Rihanna               915
The Weeknd            913
                     ... 
Luna City Express       1
Ninetoes                1
Rhemi                   1
Jamie 3:26              1
Caleb and Kelsey        1
Name: artist_name, Length: 18866, dtype: int64

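- The above code counts the occurrences of each artist, where each row is one track's appearance in a playlist
    - As you can see, Drake appears the most, with 2,715 rows, more than twice as many as Kanye West (1,065)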
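
To make the counting step concrete, here is a small sketch of how value_counts behaves. The toy Series is made up for illustration and is not taken from spotify_all.csv. Passing normalize=True returns each artist's share of the rows instead of the raw count.

```python
import pandas as pd

# Made-up toy data for illustration only (not taken from spotify_all.csv)
toy_artists = pd.Series(
    ["Drake", "Drake", "Rihanna", "Halsey", "Drake", "Rihanna"],
    name="artist_name",
)

# Raw counts: how many rows each artist has, sorted from most to least
counts = toy_artists.value_counts()

# Shares: the same counts divided by the total number of rows
shares = toy_artists.value_counts(normalize=True)

print(counts)
print(shares)
```

Shares can be handy when comparing artists across datasets of different sizes, since raw counts grow with the number of playlists.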

## Favorite Artists DataFrame 🎹

In [None]:
favorite_artists = spotify[spotify['artist_name'].isin(['Drake', 'Rihanna','Halsey'])]
favorite_artists

|  | pid | playlist_name | pos | artist_name | track_name | duration_ms | album_name |
|---|---|---|---|---|---|---|---|
| 320 | 5 | Wedding | 22 | Rihanna | We Found Love | 215226 | Talk That Talk |
| 409 | 7 | 2017 | 15 | Halsey | Eyes Closed | 202438 | hopeless fountain kingdom |
| 522 | 10 | abby | 8 | Drake | Portland | 236614 | More Life |
| 544 | 10 | abby | 30 | Drake | Preach | 236973 | If You're Reading This It's Too Late |
| 570 | 10 | abby | 56 | Drake | Headlines | 235986 | Take Care |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 197565 | 999990 | Drake | 1 | Drake | Fake Love | 210937 | More Life |
| 197566 | 999990 | Drake | 2 | Drake | Pop Style | 212946 | Views |
| 197567 | 999990 | Drake | 3 | Drake | Hotline Bling | 267066 | Views |
| 197568 | 999990 | Drake | 4 | Drake | Legend | 241853 | If You're Reading This It's Too Late |


- The above code filters the DataFrame to show only the songs by my three favorite artists: Drake, Rihanna, and Halsey

## Favorite Artists Occurrences 🎶

In [None]:
favorite_count = favorite_artists['artist_name'].value_counts()
favorite_count

Drake      2715
Rihanna     915
Halsey      309
Name: artist_name, dtype: int64

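- This code shows how many times each of my three favorite artists appears in the data
    - Drake has almost three times as many rows as Rihanna (2,715 vs. 915) and almost nine times as many as Halsey (309)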

## Favorite Artists' Track Duration 💿

In [None]:
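# Sort by track length, longest first
sorted_fav_artists = favorite_artists.sort_values(by='duration_ms', ascending=False)
# Keep one row per (artist, track) pair so repeated playlist entries count once
no_duplicates = sorted_fav_artists.drop_duplicates(subset=['artist_name', 'track_name'])
# Show the 10 longest remaining tracks
longest_tracks = no_duplicates[['artist_name', 'track_name', 'duration_ms']].head(10)
longest_tracks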

|  | artist_name | track_name | duration_ms |
|---|---|---|---|
| 47554 | Drake | Cameras / Good Ones Go Interlude - Medley | 434960 |
| 157169 | Drake | Pound Cake / Paris Morton Music 2 | 433800 |
| 65122 | Drake | Shut It Down | 419306 |
| 145989 | Rihanna | Same Ol’ Mistakes | 397093 |
| 110780 | Rihanna | Where Have You Been - Hardwell Club Mix | 394653 |
| 122260 | Drake | Uptown | 381240 |
| 38545 | Drake | Since Way Back | 368035 |
| 110850 | Drake | Tuscan Leather | 366400 |
| 30644 | Rihanna | Cold Case Love | 364520 |
| 29273 | Drake | Forever | 357706 |


- The above code sorts the DataFrame containing only tracks from my favorite artists by track length, longest first
- It then drops duplicates so each song appears only once
- Finally, it keeps the 10 longest tracks by my three favorite artists
- Drake has 7 of the 10 longest songs
- Rihanna has the other 3
- Halsey has none, which suggests her longest songs are shorter than Drake's and Rihanna's, although a top-10 list alone says nothing about her typical track length
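
A top-10 list only shows the extremes, so one way to check the point about Halsey would be to summarize track length per artist. The sketch below uses five rows copied from the tables in this post, which is far too few to draw conclusions from. It only illustrates the approach, and the names sample_tracks and summary are hypothetical.

```python
import pandas as pd

# Five rows copied from the tables in this post (illustration only)
sample_tracks = pd.DataFrame({
    "artist_name": ["Rihanna", "Halsey", "Drake", "Rihanna", "Drake"],
    "track_name": ["We Found Love", "Eyes Closed", "Portland",
                   "Cold Case Love", "Forever"],
    "duration_ms": [215226, 202438, 236614, 364520, 357706],
})

# Count each song once, then summarize track length per artist
summary = (
    sample_tracks
    .drop_duplicates(subset=["artist_name", "track_name"])
    .groupby("artist_name")["duration_ms"]
    .agg(["count", "median", "max"])
)

print(summary)
```

Running the same groupby on favorite_artists instead of sample_tracks would give the median and maximum track length for each artist across the full data.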

## Favorite Artists' Average Position 🧑‍🎤

In [None]:
# favorite_artists already contains only the Drake, Rihanna, and Halsey rows,
# so it can be grouped directly to get the mean playlist position per artist
avg_pos = favorite_artists.groupby('artist_name')['pos'].mean()
avg_pos

artist_name
Drake      57.143278
Halsey     65.582524
Rihanna    47.174863
Name: pos, dtype: float64

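- The above code computes the average position of tracks within a playlist for each of my three favorite artists within the Spotify DataFrame
    - On average, Rihanna's tracks sit earliest in playlists (about 47), followed by Drake's (about 57) and Halsey's (about 66)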

I hope you learned a little bit about my favorite artists within the Spotify DataFrame.