This project can pull stats related to your Spotify streaming history. Given a csv file containing your Spotify history, we can determine your information such as favorite song/artist using different data structures.

apeera/Sort-Spotify-Data

Sort-Spotify-Data

Problem: What problem are we trying to solve? / Motivation: Why is this a problem?

The members of our group listen to a vast amount of music on Spotify, but unless you use a paid third-party service, there is no easy way to look in depth at your listening history year-round. Spotify does offer “Spotify Wrapped”, which displays your top artists, songs, and other intriguing data for the year. However, it is only available for a short time at the end of the year, and it does not include data from November or December. Spotify does give users the option to request their listening history, which arrives after about a month as a series of JSON files in a zip file. This JSON data is difficult to use directly, so we decided to compare efficient ways to store and recall it in order to see our top songs and artists.

Features implemented

When the user runs the code, they are presented with seven options: Create Unordered Map, Create Ordered Map, Output Unordered Map, Output Ordered Map, Search Unordered Map, Search Ordered Map, and End Program. When you select one of the first two options, you are asked whether you want the map keyed by artists or by songs. The program then creates a map or unordered_map that stores the artist/song as the key and the number of streams of that artist/song as the value. When you select one of the output options, you are given four more options: Display Top Song Titles, Display Top Artist Names, Display All Song Titles, and Display All Artist Names. When you select one of the Display Top options, you are prompted for the number of songs/artists you wish to display (n), and the program displays your top n songs/artists by total number of streams. When you select one of the search options, you are given two more options, Search by Artist or Search by Song, and the program outputs the number of streams of the corresponding artist/song. Most of these functions are self-evident and can be seen in the Video.
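The create and search options described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Stream` alias and the function names `countStreams` and `searchStreams` are hypothetical, and the sketch assumes the history has already been loaded into memory as (song, artist) pairs. It requires C++17.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// One row of listening history: (song title, artist name).
// This pair layout is an assumption made for the sketch.
using Stream = std::pair<std::string, std::string>;

// Build a stream-count table keyed by artist or by song.
// MapType can be std::map<std::string, int> or
// std::unordered_map<std::string, int>.
template <typename MapType>
MapType countStreams(const std::vector<Stream>& history, bool byArtist) {
    MapType counts;
    for (const auto& [song, artist] : history) {
        // operator[] value-initializes a missing count to 0 before the increment.
        ++counts[byArtist ? artist : song];
    }
    return counts;
}

// Return the number of streams for a key, or 0 if the key never appears.
template <typename MapType>
int searchStreams(const MapType& counts, const std::string& key) {
    auto it = counts.find(key);
    return it == counts.end() ? 0 : it->second;
}
```

Because both container types expose `operator[]` and `find`, the same template covers the ordered and the unordered variant, for example `countStreams<std::unordered_map<std::string, int>>(history, true)` for an artist table.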

Description of data

Our data came in the form of a series of .json files, each containing various data points for a series of songs. There are 21 data points per song: ts, username, platform, ms_played, conn_country, ip_addr_decrypted, user_agent_decrypted, master_metadata_track_name, master_metadata_album_artist_name, master_metadata_album_album_name, spotify_track_uri, episode_name, episode_show_name, spotify_episode_uri, reason_start, reason_end, shuffle, skipped, offline, offline_timestamp, and incognito_mode. Of these, there are really only two that we care about: master_metadata_track_name and master_metadata_album_artist_name, which contain the song’s name and the artist’s name respectively. We converted the .json files to tab-separated .csv files using an online converter, as CSV files are far easier to work with in C++ than JSON files, which would likely require a separate, external library. Between the three of us, we listened to 117,815 songs, each with its own set of data points.
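Reading one tab-separated row and pulling out the two fields of interest might look like the sketch below. The column positions are an assumption: they suppose the converter kept the fields in the order listed above, which would put master_metadata_track_name at index 7 and master_metadata_album_artist_name at index 8 (counting from 0). The names `splitTabs`, `extractSongAndArtist`, `kTrackColumn`, and `kArtistColumn` are hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Assumed 0-based column positions; adjust them to match the real file.
constexpr std::size_t kTrackColumn = 7;
constexpr std::size_t kArtistColumn = 8;

// Split one tab-separated line into its fields.
// Empty fields between two tabs are kept; a trailing empty field is dropped.
std::vector<std::string> splitTabs(const std::string& line) {
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, '\t')) {
        fields.push_back(field);
    }
    return fields;
}

// Copy the song and artist out of a row. Returns false when the row is
// too short or either value is empty, so the caller can skip it.
bool extractSongAndArtist(const std::string& line,
                          std::string& song, std::string& artist) {
    const std::vector<std::string> fields = splitTabs(line);
    if (fields.size() <= kArtistColumn) return false;
    if (fields[kTrackColumn].empty() || fields[kArtistColumn].empty()) return false;
    song = fields[kTrackColumn];
    artist = fields[kArtistColumn];
    return true;
}
```

A loader would call `std::getline` on an `std::ifstream` for each row and pass the line to `extractSongAndArtist`, skipping the header row and any row for which it returns false.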

Tools/Languages/APIs/Libraries used

In our project, we used Replit as an IDE to share code in real time, as well as our own IDEs, Visual Studio and CLion, for developing new portions of the code. We used C++ as our programming language, along with the following headers from the C++ standard library: algorithm, fstream, iostream, sstream, string, unordered_map, map, vector, and chrono.
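The chrono header in the list above is commonly used to time sections of code. How the project actually used it is not stated here, so the following is only a sketch of one plausible approach: a hypothetical helper `timeMicroseconds` that measures how long a callable takes, plus a hypothetical `fillCounts` routine to time against either map type.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Run `work` once and return the elapsed wall-clock time in microseconds.
// steady_clock is used because it never jumps backwards.
template <typename Func>
long long timeMicroseconds(Func&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Count how often each key occurs. Works with std::map and std::unordered_map.
template <typename MapType>
void fillCounts(MapType& counts, const std::vector<std::string>& keys) {
    for (const auto& key : keys) {
        ++counts[key];
    }
}
```

For example, `timeMicroseconds([&] { fillCounts(orderedCounts, keys); })` and the same call with an unordered map give two numbers that can be compared. A single run on a small input is noisy, so repeated runs on the full data set would give a more trustworthy comparison.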

Algorithms implemented / Additional Data Structures/Algorithms used

The primary data structures we used were maps (std::map), which are typically backed by red-black trees, and unordered maps (std::unordered_map), which are backed by hash tables. We also used a vector of pairs to get our top-played songs and artists. The primary comparison in our project was between the unordered map and the ordered map. In our tests of creating and printing the maps, the unordered_map was faster, but the ordered map has the advantage of printing in alphabetical order by key when all results are printed. We also used the built-in C++ sort function (std::sort), which is commonly implemented as Introsort, a hybrid that combines Quicksort, Heapsort, and Insertion Sort.
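The vector-of-pairs step for finding the top-played entries can be sketched as below. The function name `topN` is hypothetical, and breaking ties alphabetically is a choice made for this sketch, not something the project is known to do.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Copy the (name, stream count) entries of a map into a vector, sort them
// by count in descending order, and keep at most the first n.
// Works with std::map<std::string, int> and std::unordered_map<std::string, int>.
template <typename MapType>
std::vector<std::pair<std::string, int>> topN(const MapType& counts, std::size_t n) {
    std::vector<std::pair<std::string, int>> entries(counts.begin(), counts.end());
    std::sort(entries.begin(), entries.end(),
              [](const auto& a, const auto& b) {
                  if (a.second != b.second) return a.second > b.second;
                  return a.first < b.first;  // tie-break: alphabetical by name
              });
    if (entries.size() > n) {
        entries.resize(n);
    }
    return entries;
}
```

The copy is needed because neither map type can be reordered by value: std::map always iterates in key order, and std::unordered_map has no defined order. Sorting all k distinct entries costs O(k log k); when n is much smaller than k, std::partial_sort would be an alternative.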
