## Gathering Data from Spotify Charts and Using the Spotify API

Using data collected from Spotify Charts and the Spotify API, we can explore the attributes of popular songs streamed on Spotify. For example, what proportion of top 10 songs are classified as explicit, and how has that proportion changed over the course of a year?

In [1]:
# Import libraries
from spotipy.oauth2 import SpotifyClientCredentials
import spotipy
import pandas as pd
from dotenv import load_dotenv
import os


In [2]:
# Load the Spotify API credentials from a .env file (or the shell environment)
load_dotenv()

# Access the variables
SPOTIPY_CLIENT_ID = os.getenv('SPOTIPY_CLIENT_ID')
SPOTIPY_CLIENT_SECRET = os.getenv('SPOTIPY_CLIENT_SECRET')
# Not needed by the client credentials flow used below
SPOTIPY_REDIRECT_URI = os.getenv('SPOTIPY_REDIRECT_URI')

# Initialize the Spotify API client using the client credentials flow
sp = spotipy.Spotify(
    auth_manager=SpotifyClientCredentials(
        client_id=SPOTIPY_CLIENT_ID,
        client_secret=SPOTIPY_CLIENT_SECRET,
    )
)
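
With a client in hand, one way to answer the "explicit" question is to look up each charted track through the API. The sketch below is an assumption about how that lookup could be done, not part of the original notebook: `fetch_explicit_flags` is a hypothetical helper name, and it relies on spotipy's `tracks()` method, which accepts up to 50 track URIs per call and returns a dict whose `'tracks'` list holds one entry per requested track, each with an `'explicit'` field.

```python
def fetch_explicit_flags(sp, uris, batch_size=50):
    """Return a {uri: explicit flag} dict for the given track URIs.

    `sp` is any object exposing spotipy's `tracks()` method.
    (Hypothetical helper, not part of the original notebook.)
    """
    flags = {}
    # The tracks endpoint accepts at most 50 IDs per request, so work in batches
    for start in range(0, len(uris), batch_size):
        batch = uris[start:start + batch_size]
        response = sp.tracks(batch)
        for uri, track in zip(batch, response['tracks']):
            # An entry is None when Spotify cannot resolve the track
            flags[uri] = track['explicit'] if track else None
    return flags
```

Once the combined chart dataframe exists, the result could be attached with something like `charts['explicit'] = charts['uri'].map(fetch_explicit_flags(sp, charts['uri'].unique().tolist()))`. Looking up unique URIs only keeps the number of API calls small, since the same song appears in many weeks.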

In [3]:
# Define the directory containing the CSV files
csv_directory = 'data2'

# Get a sorted list of all CSV files in the directory
# Sorting gives a stable order (chronological if the files share a prefix and end in a date)
csv_files = sorted(file for file in os.listdir(csv_directory) if file.endswith('.csv'))

# Initialize an empty list to store the dataframes
dataframes = []

# Loop through the list of CSV files and read each one
for file in csv_files:
    file_path = os.path.join(csv_directory, file)
    
    # Get the chart week from the file name, which is assumed to end in the date,
    # e.g. a name of the form 'regional-us-weekly-2024-05-16.csv'
    week_date = '-'.join(file.split('-')[3:]).split('.')[0]
    
    # Read the chart and tag every row with its week
    df = pd.read_csv(file_path)
    df['week'] = week_date
    dataframes.append(df)

# Concatenate all dataframes into one
charts = pd.concat(dataframes, ignore_index=True)

# Drop the 'source' column, then display the combined dataframe
charts = charts.drop(columns=['source'])
charts

Unnamed: 0,rank,uri,artist_names,track_name,peak_rank,previous_rank,weeks_on_chart,streams,week
0,1,spotify:track:6AI3ezQ4o3HUoP6Dhudph3,Kendrick Lamar,Not Like Us,1,1,2,38690532,2024-05-16
1,2,spotify:track:7fzHQizxTqy8wTXwlrgPQQ,Tommy Richman,MILLION DOLLAR BABY,2,2,3,38088066,2024-05-16
2,3,spotify:track:7221xIgOnuakPdLqT0F3nP,"Post Malone, Morgan Wallen",I Had Some Help (Feat. Morgan Wallen),3,-1,1,35069809,2024-05-16
3,4,spotify:track:2FQrifJ1N335Ljm3TjTVVf,Shaboozey,A Bar Song (Tipsy),3,5,5,20609475,2024-05-16
4,5,spotify:track:2qSkIjg1o9h3YT9RAgYN75,Sabrina Carpenter,Espresso,1,4,5,18329189,2024-05-16
...,...,...,...,...,...,...,...,...,...
995,196,spotify:track:58ge6dfP91o9oXMzq3XkIS,Arctic Monkeys,505,18,192,178,2575446,2024-06-13
996,197,spotify:track:2ZWlPOoWh0626oTaHrnl2a,Frank Ocean,Ivy,169,183,9,2574631,2024-06-13
997,198,spotify:track:53IRnAWx13PYmoVYtemUBS,Chappell Roan,Femininomenon,198,-1,1,2572362,2024-06-13
998,199,spotify:track:4obHzpwGrjoTuZh2DItEMZ,Morgan Wallen,7 Summers,3,-1,68,2571426,2024-06-13
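
The `rank` and `week` columns are enough to track how the top 10 changes over time once an explicit flag is available. The sketch below assumes a boolean `explicit` column has been added to `charts` from the Spotify API; that column and the function name `explicit_share_top10` are assumptions for illustration and do not appear in the notebook above.

```python
import pandas as pd


def explicit_share_top10(charts):
    """Share of top-10 tracks flagged explicit, per chart week.

    Assumes `charts` has 'rank', 'week' and a boolean 'explicit' column
    (the 'explicit' column is hypothetical, not part of the data above).
    """
    top10 = charts[charts['rank'] <= 10]
    # Ignore tracks whose explicit flag could not be looked up
    top10 = top10.dropna(subset=['explicit'])
    # The mean of a boolean column is the fraction of True values
    return top10['explicit'].astype(bool).groupby(top10['week']).mean()
```

Calling `explicit_share_top10(charts)` returns a Series indexed by week, which can be plotted directly to see how the proportion moves across the year.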
