Recommendation Engine For Myself (though it can be for anyone if the model is trained on different data)
This is a content-based music recommendation system tailored to my specific tastes. At the heart of the system is an LSTM auto-encoder that learns a compressed representation of a song after the song has been converted into a spectrogram. Pairwise distance in this representation space serves as the similarity metric: a lower distance between two songs indicates greater similarity. This can be used to rank selected songs against a set of indexed songs.
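The ranking idea described above can be sketched in a few lines. This is only an illustration and not the code used in this repo: it assumes Euclidean distance (the text only says "pairwise distance"), and the function name `rank_by_distance` and the 3-dimensional vectors are hypothetical stand-ins for real embeddings.

```python
import math


def rank_by_distance(query, indexed):
    """Return (name, distance) pairs sorted from most to least similar.

    `query` is one embedding vector and `indexed` maps song names to
    embedding vectors. Euclidean distance is an assumption here.
    """
    distances = [(name, math.dist(query, vec)) for name, vec in indexed.items()]
    # A lower distance means greater similarity, so sort ascending.
    return sorted(distances, key=lambda pair: pair[1])


# Hypothetical embeddings, purely for illustration.
indexed = {
    "artist1.album1.song1": [0.0, 0.0, 0.0],
    "artist1.album1.song2": [3.0, 4.0, 0.0],
    "artist2.album3.song4": [1.0, 0.0, 0.0],
}
ranking = rank_by_distance([0.9, 0.0, 0.0], indexed)
```

Here `ranking` lists the indexed songs closest to the query first, which is the ordering a recommender would present.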
The rest of this README is a work in progress.
TensorFlow >= 2.0 is required. Installation instructions can be found on the official website: https://www.tensorflow.org/install
After installing TensorFlow, clone this repo: git clone https://github.com/eddddddy/ReFORM.git
The auto-encoder is trained on mel-spectrogram data. The data file can be generated by running python generate.py. The script expects the DATA_PATH directory to be organized as follows:
DATA_PATH\
    artist1\
        album1\
            song1.wav
            song2.wav
            ...
        album2\
            song3.wav
            ...
        ...
    artist2\
        album3\
            song4.wav
            ...
        ...
    ...
and the mapping.yaml file to be in the following format:
artist1:
    album1:
        - name: song1
        - name: song2
        ...
    album2:
        - name: song3
        ...
    ...
artist2:
    album3:
        - name: song4
        ...
    ...
...
The data file will be placed in the spectrogram_data directory, which will be created if it does not already exist.
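Since the directory layout and mapping.yaml mirror each other, one way to cross-check them is to derive the nested mapping from the directory tree. The sketch below is not part of this repo: `build_mapping` is a hypothetical helper, and it assumes that each song name in mapping.yaml equals the .wav file name without its extension.

```python
from pathlib import Path


def build_mapping(data_path):
    """Walk DATA_PATH/artist/album/*.wav and return a nested mapping.

    The result has the shape {artist: {album: [{"name": song}, ...]}}.
    Assumption: song names are the .wav file stems.
    """
    mapping = {}
    for artist_dir in sorted(Path(data_path).iterdir()):
        if not artist_dir.is_dir():
            continue
        albums = {}
        for album_dir in sorted(artist_dir.iterdir()):
            if not album_dir.is_dir():
                continue
            # Only .wav files count as songs; other files are ignored.
            songs = [{"name": wav.stem} for wav in sorted(album_dir.glob("*.wav"))]
            if songs:
                albums[album_dir.name] = songs
        if albums:
            mapping[artist_dir.name] = albums
    return mapping
```

The returned dictionary can be compared against the parsed mapping.yaml, or serialized with a YAML library as a starting point for that file.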
To train the model, run python net.py. Model weights will be checkpointed into the checkpoints directory.
The embed.py file contains functionality for encoding spectrogram data into the representation space learned by the auto-encoder. It also contains utilities for working with these representations. For example, use the following code to visualize the representation space:
>>> import embed
>>> e = embed.Embedding('checkpoints/weights.hdf5')
>>> e.calculate(name=['artist1.album1.song1', 'artist2.album2.song2', 'artist3.album3.song3'])
>>> e.plot(boundary='convex')
Example output: [image not included]
To get the similarity between two songs, use:
>>> e.similarity('artist1.album1.song1', 'artist2.album2.song2')
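The song names passed above appear to follow an artist.album.song pattern. Assuming that convention holds (it is inferred from the examples, not stated by the repo), a mapping in the mapping.yaml format could be flattened into such names with a small hypothetical helper like `dotted_names`:

```python
def dotted_names(mapping):
    """Flatten {artist: {album: [{"name": song}, ...]}} into dotted names.

    Assumption: names take the form 'artist.album.song'.
    """
    names = []
    for artist, albums in mapping.items():
        for album, songs in albums.items():
            for song in songs:
                names.append(f"{artist}.{album}.{song['name']}")
    return names


# Example mapping mirroring the mapping.yaml format shown earlier.
example = {
    "artist1": {
        "album1": [{"name": "song1"}, {"name": "song2"}],
        "album2": [{"name": "song3"}],
    },
    "artist2": {"album3": [{"name": "song4"}]},
}
all_names = dotted_names(example)
```

A list built this way could then be passed as the name argument when calculating embeddings, provided the naming assumption is correct.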
Currently not accepting any pull requests, but feel free to fork a copy of this repo.