analysis of complete spotify streaming dataset (endsong_*.json)

pldubouilh/spotify-gdpr-dump-analysis

spotify-gdpr-dump-analysis

Local analysis of the complete Spotify streaming dataset (endsong_*.json). Made in 3 hours alongside ChatGPT, fixing bugs as they appeared.

Ask for your GDPR streaming data dump here. It takes a couple of days to arrive.

That's a whole lot of data 👀

# deps
$ python -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt

# get geodb for local ip lookup
$ curl -L -o city.mmdb https://git.io/GeoLite2-City.mmdb

# create sqlite3 database from json dump. datafolder should contain all your endsong_*.json files
$ python makedb.py datafolder/

# run analysis !
$ python map-ips-city.py

$ python top-cities.py
df                     city         country  count
20                   Berlin         Germany   2629
...
$ python top-songs-per-country.py
DE                                                 La femme d'argent                   Air
DE  Piano Concerto No. 3 in D Minor, Op. 30: I. Allegro ma non tanto   Sergei Rachmaninoff
DE                                La mer, L. 109: II. Jeux de vagues        Claude Debussy
DE                                                   Samba da Bencao        Bebel Gilberto
DE                                      Merry Christmas Mr. Lawrence      Ryuichi Sakamoto
DE                                                        WEIGHT OFF            KAYTRANADA
...
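
For a rough idea of what the database and per-country steps above involve, here is a minimal sketch. It is not the code from makedb.py or top-songs-per-country.py: the table name `plays`, the function names, and the JSON field names (`conn_country`, `master_metadata_track_name`, and so on) are assumptions about the dump format, so check them against your own endsong_*.json files.

```python
import json
import sqlite3
from itertools import groupby, islice
from pathlib import Path

# Assumed field names from the endsong_*.json records; adjust to your dump.
FIELDS = (
    "ts",
    "conn_country",
    "ip_addr_decrypted",
    "master_metadata_track_name",
    "master_metadata_album_artist_name",
    "ms_played",
)


def load_records(folder):
    """Yield every play record from the endsong_*.json files in `folder`."""
    for path in sorted(Path(folder).glob("endsong_*.json")):
        with open(path, encoding="utf-8") as fh:
            # Each file is assumed to hold a single JSON array of records.
            yield from json.load(fh)


def build_db(records, db_path=":memory:"):
    """Create a hypothetical `plays` table and insert the selected fields."""
    conn = sqlite3.connect(db_path)
    columns = ", ".join(FIELDS)
    placeholders = ", ".join("?" for _ in FIELDS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS plays ({columns})")
    conn.executemany(
        f"INSERT INTO plays ({columns}) VALUES ({placeholders})",
        ([rec.get(name) for name in FIELDS] for rec in records),
    )
    conn.commit()
    return conn


def top_songs_per_country(conn, limit=3):
    """Return (country, track, artist, plays) rows, most played first."""
    rows = conn.execute(
        """
        SELECT conn_country,
               master_metadata_track_name,
               master_metadata_album_artist_name,
               COUNT(*) AS plays
        FROM plays
        WHERE master_metadata_track_name IS NOT NULL
        GROUP BY conn_country,
                 master_metadata_track_name,
                 master_metadata_album_artist_name
        ORDER BY conn_country, plays DESC
        """
    ).fetchall()
    top = []
    # Rows arrive sorted by country, so groupby sees one run per country.
    for _, group in groupby(rows, key=lambda row: row[0]):
        top.extend(islice(group, limit))
    return top
```

Under these assumptions, `build_db(load_records("datafolder/"), "plays.db")` would write the table to a file, and `top_songs_per_country(conn)` would return the three most played tracks for each country code. Records without a track name (podcast episodes, for instance) are skipped by the query.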
