Soon we'll need to write a content resolver. "Soon" was probably over 15 years ago, so this is a bit overdue. :)
To install the package:
```
python -m venv .virtualenv
source .virtualenv/bin/activate
pip install -r requirements.txt
```
Then prepare the index and scan a music collection; MP3 and FLAC files are supported:
```
./resolve.py create test_index
./resolve.py scan test_index <path to mp3/flac files>
```
Then make a JSPF playlist on ListenBrainz:
https://listenbrainz.org/user/{your username}/playlists/
Then download the JSPF file (make sure the playlist is public):
```
curl "https://api.listenbrainz.org/1/playlist/<playlist MBID>" > test.jspf
```
Finally, resolve the playlist to local files:
```
./resolve.py playlist <index_dir> <input JSPF file> <output m3u file>
```
Then open the M3U playlist in a local player.
The Whoosh library that was being used for fuzzy indexing appears to be buggy and unmaintained. After much searching, I found another approach, described here:
https://towardsdatascience.com/fuzzy-matching-at-scale-84f2bfd0c536
The term frequency / inverse document frequency (TF-IDF) approach works well and is very fast. However, the libraries involved lack the ability to serialize these indexes to disk, which is annoying, but that can be worked around if we decide to use this approach.
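The core of that approach can be sketched as follows. This is a minimal illustration, not the resolver's actual code: the track strings are made up, and scikit-learn's brute-force `NearestNeighbors` stands in for nmslib, which the linked article uses for large collections.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Candidate strings, e.g. "artist title" built from the scanned metadata.
tracks = [
    "Portishead Glory Box",
    "Massive Attack Teardrop",
    "Radiohead Karma Police",
]

# Character trigrams make the matching robust to typos and small
# differences in punctuation or spacing.
vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 3))
matrix = vectorizer.fit_transform(tracks)

# Cosine distance over the sparse TF-IDF vectors; nmslib's HNSW index
# would play this role at scale.
index = NearestNeighbors(n_neighbors=1, metric="cosine").fit(matrix)

def resolve(query):
    """Return (best match, cosine distance) for a free-text query."""
    dist, idx = index.kneighbors(vectorizer.transform([query]))
    return tracks[idx[0][0]], dist[0][0]

print(resolve("portished glory box"))  # tolerates the misspelling
```

The distance gives a natural confidence score, so a cutoff can be applied to reject playlist entries that have no plausible local match.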
How things work now:
- Scan files and save data into a SQLite database.
- When resolving a playlist or a recording, the metadata is loaded from SQLite and the indexes are built.
- The playlist entries are then resolved against those in-memory indexes.
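The scan/resolve split above can be sketched with a toy schema. The single `recording` table and its columns are assumptions for illustration, not the actual database layout:

```python
import sqlite3

def create_db(path):
    """Create (or open) the metadata database used by the scan step."""
    conn = sqlite3.connect(path)
    conn.execute("""CREATE TABLE IF NOT EXISTS recording (
                        artist TEXT, title TEXT, file_path TEXT)""")
    return conn

def save_scanned(conn, rows):
    """Scan step: persist (artist, title, file_path) tuples read from tags."""
    conn.executemany("INSERT INTO recording VALUES (?, ?, ?)", rows)
    conn.commit()

def load_for_indexing(conn):
    """Resolve step: load everything back to build the in-memory index."""
    return conn.execute(
        "SELECT artist, title, file_path FROM recording").fetchall()
```

The point is that only the metadata lives on disk; the fuzzy index itself is rebuilt from these rows each time.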
So far this isn't a problem, and it may never be: even with metadata for 500,000 recordings loaded into memory, an index can be built in a few seconds, if that. If this only has to be done once at service startup, it should be acceptable.
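If rebuilding at startup ever does become too slow, one possible workaround for the missing index serialization is to persist the fitted vectorizer and the sparse TF-IDF matrix, and rebuild only the cheap nearest-neighbour structure on load. A sketch, with illustrative file names:

```python
import os
import tempfile

import joblib
from scipy.sparse import load_npz, save_npz
from sklearn.feature_extraction.text import TfidfVectorizer

texts = ["Portishead Glory Box", "Massive Attack Teardrop"]
vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 3))
matrix = vectorizer.fit_transform(texts)

# Save: the fitted vectorizer pickles cleanly via joblib, and the TF-IDF
# matrix is a regular scipy CSR matrix with its own on-disk format.
out_dir = tempfile.mkdtemp()
joblib.dump(vectorizer, os.path.join(out_dir, "vectorizer.joblib"))
save_npz(os.path.join(out_dir, "matrix.npz"), matrix)

# Load at service startup, skipping the full fit over the collection.
vectorizer = joblib.load(os.path.join(out_dir, "vectorizer.joblib"))
matrix = load_npz(os.path.join(out_dir, "matrix.npz"))
```

This avoids re-reading half a million rows from SQLite just to recompute vectors that haven't changed.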
Open question: Do we want to continue with this approach? And are scikit-learn and nmslib acceptable to include as dependencies?