# Spotify Data Project Testing
### Questions
1. What percentage of tracks are labelled as "explicit"?
    - Does a song being explicit affect its streams?
2. What factors influence a song's likelihood of being included in popular playlists?
    - For example, the genre, tempo, or artist popularity of songs that are commonly featured in Spotify's popular playlists.
3. How does the length of a song influence its streaming numbers?
4. How does the time of year that a song was released affect its Spotify streams?
5. Find more questions that aren't necessarily about streams.
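
As a rough illustration of how question 1 might be approached, the sketch below computes the share of explicit tracks and compares median streams between the two groups. The frame `toy`, its values, and the snake_case column names `explicit_track` and `spotify_streams` are assumptions made for the example, not the real data.

```python
import pandas as pd

# Toy stand-in for the real dataset; values are invented for illustration only.
toy = pd.DataFrame({
    "explicit_track": [0, 1, 0, 1, 0],
    "spotify_streams": [100, 400, 200, 600, 300],
})

# Share of tracks flagged explicit, as a percentage (mean of a 0/1 column).
explicit_pct = toy["explicit_track"].mean() * 100

# Median streams for non-explicit (0) vs. explicit (1) tracks.
median_streams = toy.groupby("explicit_track")["spotify_streams"].median()

print(f"{explicit_pct:.1f}% of tracks are explicit")
print(median_streams)
```

The median is used here rather than the mean because stream counts tend to be heavily skewed by a few very large hits; that choice is a suggestion, not something the dataset dictates.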

In [6]:
# Importing Libraries
import ast
import pandas as pd
import seaborn as sns
from datasets import load_dataset
import matplotlib.pyplot as plt

# Loading the dataset
# Raw string so the backslashes in the Windows path are not treated as escape sequences
df = pd.read_csv(r'B:\R\R Directories\Portfolio\Datasets\Spotify_Data\Spotify_Data.csv')

# Cleaning the data
df['release_date'] = pd.to_datetime(df['Release Date']) # Convert 'Release Date' to datetime, change column name
df = df.drop('Release Date', axis=1) # Drop the old 'Release Date' column for tidiness
df['explicit_track'] = df['Explicit Track'].astype('category') # Store 'Explicit Track' as a categorical column under a new name

# Remove thousands separators and convert to int.
# Non-string values (e.g. NaN floats) have no .replace method and are returned as None.
# TODO: work out which values fail to convert.
def safe_to_int(value):
    try:
        return int(value.replace(',', ''))
    except AttributeError:
        return None

# Names of the columns to convert (a plain list avoids building a temporary DataFrame)
columns_to_convert = ['Spotify Streams', 'YouTube Views', 'Spotify Playlist Count',
                      'YouTube Likes', 'TikTok Posts', 'TikTok Likes', 'TikTok Views',
                      'Soundcloud Streams']
for col in columns_to_convert:
    df[col] = df[col].apply(safe_to_int)
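
One possible way to find out which values fail to convert is to use `pd.to_numeric` with `errors="coerce"` and then look at the entries that were present in the raw data but came back as missing. This is a minimal sketch on an invented Series named `raw`; it is an alternative to `safe_to_int`, not a description of how the conversion above behaves.

```python
import pandas as pd

# Toy column standing in for something like 'Spotify Streams'; values are invented.
raw = pd.Series(["1,234", "56", None, "n/a", "7,890,123"], dtype="object")

# Strip thousands separators, then coerce anything unparseable to NaN.
cleaned = raw.str.replace(",", "", regex=False)
numeric = pd.to_numeric(cleaned, errors="coerce")

# Values that were present in the raw data but could not be parsed as numbers.
failed = raw[numeric.isna() & raw.notna()]
print(failed)

# Nullable integer dtype keeps whole numbers while still allowing missing values.
as_int = numeric.astype("Int64")
print(as_int.dtype)
```

Inspecting `failed` per column would show exactly which raw strings need special handling before the columns can be treated as integers.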

## Inspecting the dataframe shows the following:
1. The DataFrame has 4600 rows and 29 columns.
2. `Release Date` was stored as a string, so it was converted to datetime with pandas.
3. The column names are awkward to work with; I am considering renaming the ones I plan to use.
4. Many numeric columns are stored as strings with comma separators, so I wrote a function to convert the selected columns to integers.
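
If the columns do get renamed, one option is to convert every name to snake_case in a single pass. The sketch below uses a hypothetical helper, `to_snake_case`, on an empty toy frame that borrows three of the dataset's column names; the naming scheme itself is an assumption, chosen to match the `release_date` and `explicit_track` columns created earlier.

```python
import pandas as pd

# Empty toy frame carrying a few of the original-style column names.
toy = pd.DataFrame(columns=["Spotify Streams", "YouTube Views", "Explicit Track"])

def to_snake_case(name: str) -> str:
    """Lowercase a column name and join its whitespace-separated words with underscores."""
    return "_".join(name.strip().lower().split())

# rename accepts a callable that is applied to each column label.
toy = toy.rename(columns=to_snake_case)
print(list(toy.columns))
```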

In [7]:
df.dtypes

Track                                 object
Album Name                            object
Artist                                object
ISRC                                  object
All Time Rank                         object
Track Score                          float64
Spotify Streams                       object
Spotify Playlist Count                object
Spotify Playlist Reach                object
Spotify Popularity                   float64
YouTube Views                         object
YouTube Likes                         object
TikTok Posts                          object
TikTok Likes                          object
TikTok Views                          object
YouTube Playlist Reach                object
Apple Music Playlist Count           float64
AirPlay Spins                         object
SiriusXM Spins                        object
Deezer Playlist Count                float64
Deezer Playlist Reach                 object
Amazon Playlist Count                float64
Pandora St

In [8]:
df

Unnamed: 0,Track,Album Name,Artist,ISRC,All Time Rank,Track Score,Spotify Streams,Spotify Playlist Count,Spotify Playlist Reach,Spotify Popularity,...,Deezer Playlist Count,Deezer Playlist Reach,Amazon Playlist Count,Pandora Streams,Pandora Track Stations,Soundcloud Streams,Shazam Counts,TIDAL Popularity,Explicit Track,release_date
0,MILLION DOLLAR BABY,Million Dollar Baby - Single,Tommy Richman,QM24S2402528,1,725.4,390470936,30716,196631588,92.0,...,62.0,17598718,114.0,18004655,22931,4818457,2669262,,0,2024-04-26
1,Not Like Us,Not Like Us,Kendrick Lamar,USUG12400910,2,545.9,323703884,28113,174597137,92.0,...,67.0,10422430,111.0,7780028,28444,6623075,1118279,,1,2024-05-04
2,i like the way you kiss me,I like the way you kiss me,Artemas,QZJ842400387,3,538.4,601309283,54331,211607669,92.0,...,136.0,36321847,172.0,5022621,5639,7208651,5285340,,0,2024-03-19
3,Flowers,Flowers - Single,Miley Cyrus,USSM12209777,4,444.9,2031280633,269802,136569078,85.0,...,264.0,24684248,210.0,190260277,203384,,11822942,,0,2023-01-12
4,Houdini,Houdini,Eminem,USUG12403398,5,423.3,107034922,7223,151469874,88.0,...,82.0,17660624,105.0,4493884,7006,207179,457017,,1,2024-05-31
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4595,For the Last Time,For the Last Time,$uicideboy$,QM8DG1703420,4585,19.4,305049963,65770,5103054,71.0,...,2.0,14217,,20104066,13184,50633006,656337,,1,2017-09-05
4596,Dil Meri Na Sune,"Dil Meri Na Sune (From ""Genius"")",Atif Aslam,INT101800122,4575,19.4,52282360,4602,1449767,56.0,...,1.0,927,,,,,193590,,0,2018-07-27
4597,Grace (feat. 42 Dugg),My Turn,Lil Baby,USUG12000043,4571,19.4,189972685,72066,6704802,65.0,...,1.0,74,6.0,84426740,28999,,1135998,,1,2020-02-28
4598,Nashe Si Chadh Gayi,November Top 10 Songs,Arijit Singh,INY091600067,4591,19.4,145467020,14037,7387064,66.0,...,,,7.0,6817840,,,448292,,0,2016-11-08
