# Music Recommendation Service

## Using Recommendations

This notebook assumes you're familiar with Jupyter and have run the `import.ipynb` notebook successfully. In the next steps we will walk through some basic retrieval and recommendation scenarios.

## Import the needed modules & do some setup

Note: This code prints status messages about the model download, GPU support, and CPU hardware feature support. **These messages are highlighted in red and can be ignored.**

In [None]:
# import the needed modules
from IPython.display import Audio as player
from datasets import load_dataset, load_from_disk, Audio
from panns_inference import AudioTagging
from qdrant_client import QdrantClient
from qdrant_client.http import models
from os.path import join
from glob import glob
import pandas as pd
import numpy as np
import librosa
import os
MUSIC_COLLECTION_DB = "my_collection"

# get the hostname from OS ENV
QDRANT_HOST = os.environ.get('QDRANT_HOST')
# connect to Qdrant vector DB
client = QdrantClient(host=QDRANT_HOST, port=6333)
print("Attempting to connect to %s" % QDRANT_HOST)
# check if our collection already exists
collections = client.get_collections()
# compare against every collection name, not just the first one returned
music_collection_exists = any(
    c.name == MUSIC_COLLECTION_DB for c in collections.collections
)
if music_collection_exists:
    print("%s exists..." % MUSIC_COLLECTION_DB)
else:
    print("%s doesn't exist..." % MUSIC_COLLECTION_DB)

In [None]:
# acquire the data we need for the next steps
music_data = load_from_disk("./data/complete_music_data_set.arrow")
metadata = pd.read_json("./data/metatdata_complete_music_data_set.json")
payload = metadata.to_dict(orient="records")

# a helper function to print song details
def print_song(song, recommendation=True, show_embedding=False):
    score = "NA"
    if hasattr(song, "score"): score = song.score
    if recommendation:
        print("idx:%s  -- %s by %s with score %s" % (song.payload['index'], song.payload['name'], song.payload['artist'], score))
    else:
        print("idx:%s  -- %s by %s" % (song.payload['index'], song.payload['name'], song.payload['artist']))
    if show_embedding:
        print(song.vector)
        print("-" *30)

# return an audio player for a song
def play_song(song):
    index_key = song.payload['index']
    #input_song = librosa.core.load(song.payload['urls'], sr=44100, mono=True)
    return player(music_data[index_key]['audio']['array'], rate=music_data[index_key]['audio']['sampling_rate'])

# retrieve an embedding based on a song index (returns None if the index is not found)
def get_embedding(index_key):
    res = client.retrieve(
        collection_name=MUSIC_COLLECTION_DB,
        ids=[index_key],
        with_vectors=True # we can turn this on and off depending on our needs
    )
    if len(res) > 0: return res[0].vector
    else: return None

In [None]:
metadata['genre'].unique()

## Let's do a basic lookup against our vector DB first

**TASK:**
- Find a song you'd like to get recommendations for by changing the IDs in the `ids=[45,66,4566]` array.
- Use any index under 11,000 and keep the array to 5 entries or fewer.

In [None]:
# this is how we can retrieve some songs from the DB using just IDs
lookup = client.retrieve(
    collection_name=MUSIC_COLLECTION_DB,
    ids=[45,66,4566],
    with_vectors=True # we can turn this on and off depending on our needs
)

# print out the songs we just fetched
for song in lookup:
    print_song(song, recommendation=False, show_embedding=False)
    display(play_song(song))
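
Before running the lookup, it can help to check that the IDs follow the limits given in the task. The helper below is a hypothetical sketch: `validate_ids`, `MAX_INDEX`, and `MAX_IDS` are names introduced here for illustration, the bounds are taken from the task text rather than read from the collection, and it assumes indexes start at 0.

```python
# Hypothetical helper; the limits come from the task description,
# not from the Qdrant collection itself.
MAX_INDEX = 11_000  # indexes are expected to be below this value
MAX_IDS = 5         # keep the lookup small so only a few players are shown


def validate_ids(ids):
    """Return the ids unchanged if they respect the task limits, else raise ValueError."""
    if not 0 < len(ids) <= MAX_IDS:
        raise ValueError("pass between 1 and %d ids, got %d" % (MAX_IDS, len(ids)))
    for idx in ids:
        # assumption: valid indexes are integers in the range 0..MAX_INDEX-1
        if not isinstance(idx, int) or not 0 <= idx < MAX_INDEX:
            raise ValueError("index %r is outside 0..%d" % (idx, MAX_INDEX - 1))
    return ids


print(validate_ids([45, 66, 4566]))
```

The validated list could then be passed as `ids=validate_ids([...])` in the `client.retrieve` call.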

## Let's retrieve our first recommendation

**TASK:**
- Plug your song's index (idx/id) into the `get_embedding(4566)` call below
- Display and listen to the recommendations

In [None]:
# let's do a similarity search; the score shows how similar each result is to the input song
search = client.search(
    collection_name=MUSIC_COLLECTION_DB,
    query_vector=get_embedding(4566),
    limit=5
)

# note: the first song in our search is the input song itself with score 1.0 (perfect match)
print("Input song was:")
print_song(search[0])
display(play_song(search[0]))
        
# now let's load and play our recommendation
print("Recommended songs are:")
for song in search[1:]:
    print_song(song)
    display(play_song(song))
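
The score explains why the input song comes back first. The sketch below assumes the collection was created with cosine distance; the metric is chosen when the collection is created, which this notebook does not show, so treat that as an assumption. `cosine_similarity` is a hypothetical helper, not part of `qdrant_client`, and the three-element vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# toy 3-dimensional "embeddings" (made up; real embeddings are much longer)
song = [0.2, 0.7, 0.1]
similar = [0.25, 0.65, 0.05]
different = [0.9, 0.0, 0.4]

print(cosine_similarity(song, song))       # a vector compared with itself: 1.0 up to float rounding
print(cosine_similarity(song, similar))    # close to the input, so a high score
print(cosine_similarity(song, different))  # points elsewhere, so a lower score
```

Under this assumption, a score near 1.0 means the two embeddings point in almost the same direction, which is why the input song matches itself perfectly.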

## Let's retrieve recommendations based on multiple songs

**TASK:**
- Plug two song IDs into the `positive=[855, 566]` array below to get recommendations based on multiple songs
- Listen to the songs

In [None]:
# here is another way of getting recommendations
# instead of passing in a song's embedding, we can just specify the indexes of the songs we want recommendations for
recommendation = client.recommend(
    collection_name=MUSIC_COLLECTION_DB,
    positive=[855, 566],
    limit=3
)

print("Recommended songs:")
for song in recommendation:
    print_song(song)
    display(play_song(song))
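
One way to picture a multi-song recommendation is as a search with a single combined query vector. The sketch below averages the embeddings of the positive songs element by element. Qdrant's documentation describes an average-vector strategy along these lines, but the exact behaviour depends on the client and server version, so treat this as an illustration rather than what `client.recommend` does internally. `average_vector` and the toy vectors are hypothetical.

```python
def average_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    count = len(vectors)
    # zip(*vectors) groups the first elements together, then the second, and so on
    return [sum(values) / count for values in zip(*vectors)]


# toy embeddings standing in for two positive songs (made-up values)
song_a = [1.0, 0.0, 0.5]
song_b = [0.0, 1.0, 0.5]

query = average_vector([song_a, song_b])
print(query)  # [0.5, 0.5, 0.5]
```

The combined vector sits between the two inputs, so the results tend to resemble both songs rather than either one alone.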

## Let's add a limiting filter

This time we will find the closest songs in the `electronic---disco` subgenre to whatever input song we choose. **Notice the impact on the scores.**

**TASK:**
- Input your song by editing `positive=[57]`
- See which disco songs are closest to your input

In [None]:
# let's do a recommendation that's limited to a particular subgenre
subgenre_songs = models.Filter(
    must=[models.FieldCondition(key="subgenres", match=models.MatchAny(any=['electronic---disco']))]
)

subgenre_recommendations = client.recommend(
    collection_name=MUSIC_COLLECTION_DB,
    query_filter=subgenre_songs,
    positive=[57],
    #positive=[marc_anthony_valio_la_pena['idx'], 178, 122, 459],
    #negative=[385],
    limit=5
)

# print and play songs
for song in subgenre_recommendations:
    print_song(song)
    display(play_song(song))
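
The change in scores follows from how a filter narrows the candidate set: only songs whose payload matches are ranked, so the best overall match may be excluded. The sketch below mimics this in plain Python on made-up data. `filtered_top_k`, the toy points, the `rock---indie` label, the assumption that `subgenres` holds a list, and the use of a dot product on unit-length vectors are all illustrative; Qdrant applies its filter during the index search rather than by scanning a list.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))


def filtered_top_k(points, query, allowed_subgenres, k):
    """Rank only the points whose 'subgenres' payload contains any allowed value."""
    allowed = set(allowed_subgenres)
    candidates = [p for p in points if allowed & set(p["payload"]["subgenres"])]
    scored = [(dot(p["vector"], query), p["payload"]["name"]) for p in candidates]
    return sorted(scored, reverse=True)[:k]


# made-up points shaped like Qdrant records: a vector plus a payload
points = [
    {"vector": [1.0, 0.0], "payload": {"name": "rock song", "subgenres": ["rock---indie"]}},
    {"vector": [0.6, 0.8], "payload": {"name": "disco song", "subgenres": ["electronic---disco"]}},
    {"vector": [0.0, 1.0], "payload": {"name": "other disco", "subgenres": ["electronic---disco"]}},
]
query = [1.0, 0.0]

all_genres = ["rock---indie", "electronic---disco"]
unfiltered = filtered_top_k(points, query, all_genres, 1)
filtered = filtered_top_k(points, query, ["electronic---disco"], 1)
print(unfiltered)  # [(1.0, 'rock song')]
print(filtered)    # [(0.6, 'disco song')]
```

In this toy data the closest song overall is not a disco song, so the filtered top score is lower than the unfiltered one.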