🔧 a set of scripts to fetch, clean and group data from last.fm and other sources

music-stats/scripts

music-stats scripts

This repository should eventually become an API gateway between different front-ends and various data providers. For now, it is a set of scripts.

Tech stack

dev deps: typescript, jest.

deps: node, ramda, axios, chalk (pinned to v4.1.2 instead of v5.0.0 because v5 is ESM-only, see this guide to migrate).

deps to consider for the server-side application: koa.

APIs, datasets

In use

To consider

On Spotify, the Personalization API is currently the only section that covers the user's listening habits (other endpoints are available, but they describe something else), and it is restricted to the top 50 artists/tracks. The only metric it provides is "popularity", which is an abstract (i.e. calculated) affinity level rather than something measurable such as a playcount. Geo data (e.g. country) is not available either.

Setup

Environment variables

Create a .env file and fill its values according to .env.template:

  • LASTFM_API_KEY (see last.fm docs)
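A script that depends on this variable can fail early with a readable message when it is missing. The helper below is a minimal sketch of that idea. The name `requireVariable` is hypothetical and is not taken from this repository, and the sketch assumes the variables from .env have already been loaded into an object such as `process.env`.

```typescript
// Hypothetical helper (not part of this repository): returns the value of
// a required environment variable or throws a descriptive error.
function requireVariable(
  env: Record<string, string | undefined>,
  name: string,
): string {
  const value = env[name];

  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing environment variable ${name}, see .env.template`);
  }

  return value;
}
```

In a Node script it could be called as `requireVariable(process.env, 'LASTFM_API_KEY')` once the .env file has been loaded.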

Commands

$ npm ci               # install deps
$ npm run lint         # lint scripts
$ npm test             # run unit tests
$ npm run build        # compile TypeScript
$ npm run build:watch  # compile with watch

Terminology

In this repository, locations are called areas, not countries (as they are in music-stats/map). Even though country codes are used in the resulting merged-artists dataset, the name "area" was chosen because MusicBrainz uses it in its schema, where an area can represent a country, a region or a city.

Scripts

Artist-area map

Fetch top artists for a given last.fm user

$ npm run script:artist-area-map:1-fetch-artists [50] [--] [--no-color] [--no-cache]
#                                                 ^
#                                                 number of artists, default is set in the config

Input

Username is set in src/config.ts.

Output

Filename: output/artist-area-map/1-lastfm-user-library.json.

Content:

[ { name: 'Dream Theater',
    playcount: 769,
    mbid: '28503ab7-8bf2-4666-a7bd-2644bfc7cb1d' }, // MusicBrainz ID
  { name: 'Queen',
    playcount: 757,
    mbid: '420ca290-76c5-41af-999e-564d7c71f1a7' },
  ...
  { name: 'Обійми Дощу',
    playcount: 222,
    mbid: 'fdafffec-3f14-442b-9700-1b52b89351ed' },
  { name: 'Lake of Tears',
    playcount: 214,
    mbid: '62cfcc64-a7d2-4ec2-ab4b-2a6b62e53940' } ]

Fetch areas for a given set of artists

$ npm run script:artist-area-map:2-fetch-artists-areas [10] [--] [--no-color] [--no-cache]
#                                                       ^
#                                                       number of artists, default is set in the config

Input

The output of npm run script:artist-area-map:1-fetch-artists.

Output

Filename: output/artist-area-map/2-musicbrainz-artists-areas.json.

Content:

[ { artist: 'Dream Theater', area: 'New York' }, // New York will be mapped to United States, individual cities aren't supported
  { artist: 'Queen', area: 'Japan' }, // Japan? "mbid" received from last.fm must be wrong, area will be switched to United Kingdom
  ...
  { artist: 'Обійми Дощу', area: 'Ukraine' },
  { artist: 'Lake of Tears', area: 'Sweden' } ]

Merge results of two scripts above

$ npm run script:artist-area-map:3-merge-artists [--] [--no-color]

Input

Expects both input files (.json) produced by the two scripts above to be located at output/artist-area-map/. The script merges them, applies three stages of corrections (see data/corrections/), sorts the result alphabetically by artist name and replaces area names with ISO 3166-1 alpha-2 country codes.

Each [artist, playcount, countryCode] entry is placed on a separate line to make diffs easier to digest.

Output

Filename: output/artist-area-map/3-merged-artists.json.

Content:

[ [ 'Dream Theater', 769, 'US' ],
  [ 'Lake of Tears', 214, 'SE' ],
  [ 'Queen', 757, 'GB' ],
  [ 'Обійми Дощу', 222, 'UA' ],
  ... ]
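The merge step can be sketched roughly as follows. This is an illustration under assumptions, not the script's actual code: the type names, the `areaCorrections` and `countryCodes` maps and the `mergeArtists` function are hypothetical, the three correction stages from data/corrections/ are collapsed into a single per-artist override, and the `'en'` collation used for sorting is a guess.

```typescript
// Shapes inferred from the sample outputs of steps 1 and 2.
interface LastfmArtist { name: string; playcount: number; mbid: string }
interface ArtistArea { artist: string; area: string }
type MergedArtist = [string, number, string]; // [artist, playcount, countryCode]

// Hypothetical correction: overrides an area that came from a wrong "mbid".
const areaCorrections = new Map<string, string>([['Queen', 'United Kingdom']]);

// Hypothetical lookup from area names to ISO 3166-1 alpha-2 codes.
const countryCodes = new Map<string, string>([
  ['New York', 'US'],
  ['United Kingdom', 'GB'],
  ['Ukraine', 'UA'],
  ['Sweden', 'SE'],
]);

function mergeArtists(artists: LastfmArtist[], areas: ArtistArea[]): MergedArtist[] {
  const areaByArtist = new Map<string, string>();
  for (const {artist, area} of areas) {
    areaByArtist.set(artist, area);
  }

  const merged: MergedArtist[] = [];
  for (const {name, playcount} of artists) {
    // A correction, when present, wins over the fetched area.
    const area = areaCorrections.get(name) ?? areaByArtist.get(name);
    const code = area === undefined ? undefined : countryCodes.get(area);
    if (code === undefined) {
      throw new Error(`No country code found for ${name}`);
    }
    merged.push([name, playcount, code]);
  }

  // Alphabetical order by artist name (collation locale is an assumption).
  return merged.sort((a, b) => a[0].localeCompare(b[0], 'en'));
}
```

Throwing on an unknown area is a design choice of the sketch: it makes gaps in the lookup visible instead of silently dropping artists.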

Prepare country borders GeoJSON

  • Filters out countries not mentioned in the merged artists dataset.
  • Trims unused properties from the "GeoJSON Regions" dataset.

$ npm run script:artist-area-map:4-trim-world-map [--] [--no-color]

Input

  • The output of npm run script:artist-area-map:3-merge-artists.
  • A GeoJSON file with country borders, located at input/world.geo.json.

Output

Filename: output/artist-area-map/4-world.geo.json.
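The filtering and trimming described above could look roughly like this. It is a sketch under assumptions: the function name `trimWorldMap` is hypothetical, and the property names `iso_a2` and `name` are guesses about the input dataset, so they are exposed as parameters.

```typescript
// Minimal GeoJSON shapes, just enough for this sketch.
interface Feature {
  type: 'Feature';
  properties: Record<string, unknown>;
  geometry: unknown;
}
interface FeatureCollection { type: 'FeatureCollection'; features: Feature[] }
type MergedArtist = [string, number, string]; // [artist, playcount, countryCode]

function trimWorldMap(
  world: FeatureCollection,
  artists: MergedArtist[],
  codeProperty = 'iso_a2', // assumed name of the country code property
  keptProperties: string[] = [codeProperty, 'name'], // assumed properties to keep
): FeatureCollection {
  const usedCodes = new Set(artists.map((entry) => entry[2]));

  const features = world.features
    // Keep only countries mentioned in the merged artists dataset.
    .filter((feature) => usedCodes.has(String(feature.properties[codeProperty])))
    // Drop every property that is not explicitly kept.
    .map((feature) => {
      const properties: Record<string, unknown> = {};
      for (const key of keptProperties) {
        if (key in feature.properties) {
          properties[key] = feature.properties[key];
        }
      }
      return {...feature, properties};
    });

  return {type: 'FeatureCollection', features};
}
```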

Scrobble timeline

Fetch all scrobbles

$ npm run script:scrobble-timeline:1-fetch-scrobbles [2019-02-25] [2019-03-10] [--] [--no-color] [--no-cache]
#                                                     ^            ^
#                                                     |            date to (YYYY-MM-DD), defaults to today
#                                                     |
#                                                     date from (YYYY-MM-DD), defaults to yesterday
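
The date defaults noted in the comments above can be resolved along these lines. This is only a sketch: `resolveDateRange` and `formatDate` are hypothetical names, and the use of UTC dates is an assumption, since the time zone the script works in is not stated here.

```typescript
// Formats a date as YYYY-MM-DD (in UTC, which is an assumption of this sketch).
function formatDate(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Hypothetical helper: picks "date from" and "date to" from CLI arguments,
// falling back to yesterday and today respectively.
function resolveDateRange(
  args: string[],
  now: Date = new Date(),
): {from: string; to: string} {
  // Flags such as --no-cache are ignored, only YYYY-MM-DD arguments count.
  const dates = args.filter((arg) => /^\d{4}-\d{2}-\d{2}$/.test(arg));
  const yesterday = new Date(now.getTime() - 24 * 60 * 60 * 1000);
  const [from = formatDate(yesterday), to = formatDate(now)] = dates;

  return {from, to};
}
```

With no date arguments the range covers yesterday through today, and with a single date only the end of the range falls back to today.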

Input

None.

Output

Filename: output/scrobble-timeline/1-scrobbles/2019-02-25--2019-03-10.json.

Content:

[ [ '2019-03-10 18:13',       // date
    'Handle This',            // track name
    2,                        // track playcount
    'All Killer, No Filler',  // album name
    28,                       // album playcount
    'Sum 41',                 // artist name
    33 ],                     // artist playcount
  ... ]

Merge all fetched scrobbles together

$ npm run script:scrobble-timeline:2-merge-scrobbles [--] [--no-color]

Input

Expects input files (.json) to be located at output/scrobble-timeline/1-scrobbles/.

Output

Filename: output/scrobble-timeline/2-merged-scrobbles.json.

Content: the same entry format as the output of the fetching command, but with all entries combined into a single chronological collection and with playcount values aggregated.
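
One possible reading of this step is sketched below. It rests on assumptions that are not confirmed by this README: that "chronological" means oldest first, and that "aggregated" means recomputing track, album and artist playcounts as running totals over the merged timeline. The `Scrobble` type and `mergeScrobbles` function are hypothetical names.

```typescript
// [date, track, trackPlaycount, album, albumPlaycount, artist, artistPlaycount]
type Scrobble = [string, string, number, string, number, string, number];

function mergeScrobbles(batches: Scrobble[][]): Scrobble[] {
  const counts = new Map<string, number>();

  // Increments and returns the running count stored under the given key.
  const increment = (key: string): number => {
    const next = (counts.get(key) ?? 0) + 1;
    counts.set(key, next);
    return next;
  };

  const all: Scrobble[] = [];
  for (const batch of batches) {
    all.push(...batch);
  }

  // 'YYYY-MM-DD HH:mm' strings sort chronologically as plain strings.
  all.sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));

  // Original playcounts are discarded and replaced with running totals.
  return all.map(([date, track, , album, , artist]): Scrobble => [
    date,
    track,
    increment(JSON.stringify(['track', artist, track])),
    album,
    increment(JSON.stringify(['album', artist, album])),
    artist,
    increment(JSON.stringify(['artist', artist])),
  ]);
}
```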
