GitHub - kennethcandersen/spotify-user-preferences-during-covid-analysis: Analysis demonstrates that Spotify could adapt it's song recommendation algorithms to take into account how different countries respond to crisis. Analysis conducted using Jupyter Notebook, Pandas, API data retrieval, CSV data imports, and Matplotlib.

Spotify trends & COVID-19: Correlations between music listening trends and hardship levels from COVID

EXECUTIVE SUMMARY

Moderate and strong correlations in some countries between Covid infection rates and music valence ("happiness") preferences indicate that Spotify could adapt it's song recommendation algorithms to take into account how different countries respond to crisis. Some countries saw average valence preferences increase (users prefer happier music) and other countries decrease. Spotify can use this data to improve the user experience, opimize exclusive contracts with certain artists that match the ‘mood’ of each country, and tehreby increase premium subscription revenue.

REPOSITORY NAVIGATION

starter-codes contains the initial base codes for our entire project, which tested Spotify's API and the plotting techniques.
data-exploration contains the individual data handling, CSV files, inputs and outputs. Each country or region explored has it's own subfolder.
main-data-analysis contains the integrated final codes that created the analysis plots, which pulled data from the exploration data on each country and region.
documents contains the presentation and written report.

OBJECTIVE

Explore how COVID-19 affected music streaming user preferences to answer:

Are users listening to happy music to cheer up?
Are users listening to sad music to cope?
Do results vary by country?

BUSINESS IMPACT

Spotify can use this data to improve their algorithms and offer more personalized suggestions to users in different regions, that could potentially provide a better experience to the user, thereby retaining more premium users and increasing revenue.

DATA

Spotifycharts.com for historical data on top 200 songs by country and region.
Spotify API for metrics on individual tracks.
Our World data for Covid data.

DEPENDENCIES

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import requests
import json
import re, glob
import os, sys
from scipy import stats
import seaborn as sn
import spotipy

DATA EXPLORATION

Convert input CSVs into DataFrame COVID data in csv files from https://ourworldindata.org [5] and Spotify Top Charts at spotifycharts.com. They were read and transformed into DataFrames using Pandas.
Obtain track features from API Using the Spotify API and the track names from the Spotify DataFrame, we got track information regarding Valence. This is a metric used by Spotify which characterize whether a song is relatively sad or happy.
Delete faulty/repeated data Some tracks were not found, and some information was not available in certain countries. These values in the DataFrame were dropped.
Plot key indexes over time Using the dates in which the data was obtained, we merged the DataFrames and plotted valence.
Obtain correlations between variables Correlations between Covid severity and listerner music preferences as reflected in the Spotify Top 200 Charts were obtained by using scatter plots and linear regression
analysis.

DATA TRANSFORMATION

From Spotify we obtained CSV files for the weekly top 200 charts for each country and the global one. From Our World in Data COVID statistics divided by date and country were used. Also, the Spotify API was used to obtain track information and audio features for each track.

From this data, key metric to study was Valence, defined from sounded studies as a measure which indicates whether people associate happy or sad sounds. This is important because it is how Spotify describes the musical positiveness conveyed by a track. The number of COVID cases and deaths were also analysed in order to compare how valence behaved over time.

Import CSV files obtained from the data sources, and turned them into Pandas DataFrames. From the DataFrame with the Spotify data the Spotify API was used to obtain the track ID associated with each song, and then the audio features of the song, which included song valence.

CLEAN-UP DATA

Dropped songs that are not in certain countries, songs that are no longer in Spotify and songs with missing values.

PANDAS ANALYSIS

Relevant mean values and weighted averages to simplify the data to be grouped by month and year were retrieved. Matplotlib helped to plot valence through time, comparing the monthly change in 2019 and 2020 also scatter plots and linear regression analysis to obtain the correlation between variables was used.

RESULTS

Valence preference varies between countries. Hispanic and Latin countries have higher mean valence. Mean valence between 2019 and 2020 was significantly different in some countries. México decreased in 2020. Spain increased in 2020. Globally, there was no significant change in mean valence.

Valence vs. Cases per Million Global tendency shows a strong positive correlation. Germany and the US match the trend. México, Spain and India show from a moderate to a strong negative correlation.

Valence vs. Deaths per Million Global tendency shows a strong positive correlation. Germany and the US match the trend. México, Brazil, Italy and India show from a moderate to a strong negative correlation.

CONCLUSIONS

The majority of the countries analyzed presented a correlation between Covid severity and changes in listening preferencies.

Therefore, Spotify can use the data obtained to offer more personalized suggestions to users depending on the country, and optimize exclusive contracts with certain artists that match the ‘mood’ of each country.

=======

Name		Name	Last commit message	Last commit date
Latest commit History 103 Commits
data-exploration		data-exploration
documents		documents
main-data-analysis		main-data-analysis
starter-codes		starter-codes
.gitignore		.gitignore
README.md		README.md
Spotify_COVID_analysis.png		Spotify_COVID_analysis.png
index.html		index.html

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Spotify trends & COVID-19: Correlations between music listening trends and hardship levels from COVID

About

Releases

Packages

Contributors 4

Languages

kennethcandersen/spotify-user-preferences-during-covid-analysis

Folders and files

Latest commit

History

Repository files navigation

Spotify trends & COVID-19: Correlations between music listening trends and hardship levels from COVID

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Languages

Packages