# Song Recommendation Engine
Explore three strategies for constructing an emotion-responsive recommendation engine, and learn from their successes and failures.

---

## Introduction

**The goal is simple**: We ask how the user feels, and we want to retrieve Disney songs that go "well" with that input. For example, if the user is sad, a song like Reflection from Mulan would probably be appropriate.

You won't get good results if you directly compare a user's stated feelings (like "Today I am great") with song lyrics. That's because lyric embeddings capture everything in the lyrics, not just the mood, so they are much broader than a short statement of feeling. Instead, we want to encode both the user input and the lyrics into a similar representation and then run the search. We need three main things: data, a way to encode it, and a way to match it with user input.

> Note: You will need to set `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` in a `.env` file. You can get both from [Spotify for Developers](https://developer.spotify.com/dashboard/applications) by creating an app.

## Workflow

Building a song recommendation engine using LangChain involves data collection, encoding, and matching. We scrape Disney song lyrics and gather their Spotify URLs. Using Activeloop Deep Lake Vector Database in LangChain, we convert the lyrics into embedded data with relevant metadata.

For matching songs to user input, we convert both song lyrics and user inputs into a list of emotions with the help of the OpenAI model. These emotions are embedded and stored in Deep Lake. A similarity search is then conducted in the vector database based on these emotions to provide song recommendations.

We filter out low-scoring matches, then sample from the remaining candidates without replacement, which adds variation and ensures the same song isn't recommended twice.

## Setup

In [1]:
import openai
import os
from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())
openai.api_type = os.environ.get("OPENAI_API_TYPE")
openai.api_base = os.environ.get("OPENAI_API_BASE")
openai.api_version = os.environ.get("OPENAI_API_VERSION")
openai.api_key = os.environ.get("OPENAI_API_KEY")

## Building the system

### 1. Getting the data

To get our songs, we scraped https://www.disneyclips.com/lyrics/, a website that collects lyrics for Disney songs.

In [2]:
# This cell avoids the error: RuntimeError: This event loop is already running

import nest_asyncio

nest_asyncio.apply()

In [3]:
from scripts import songs_lyrics_scrapper

Then, we used the [Spotify Python API (Spotipy)](https://spotipy.readthedocs.io/en/2.22.1/) to get the embed URL for each song in the ["Disney Hits" playlist](https://open.spotify.com/playlist/37i9dQZF1DX8C9xQcOrE6T). We removed all the scraped songs that were not in this playlist, which left us with 85 songs.
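The `keep_only_lyrics_on_spotify` script is not reproduced in this notebook. As a rough sketch of the filtering idea only, the cell below assumes the scraped data maps each movie to a list of songs with a `name` field, and that the playlist has already been fetched into a dict of track name to track ID. Both shapes and the helper name `keep_songs_in_playlist` are hypothetical. The embed URL prefix mirrors the `embed_url` values visible in the output at the end of this notebook.

```python
from typing import Dict, List


def keep_songs_in_playlist(
    scraped: Dict[str, List[dict]], playlist_tracks: Dict[str, str]
) -> Dict[str, List[dict]]:
    """Keep only scraped songs whose name appears in the playlist (hypothetical helper)."""
    # Compare names case-insensitively so "reflection" matches "Reflection".
    by_name = {name.lower(): track_id for name, track_id in playlist_tracks.items()}
    kept: Dict[str, List[dict]] = {}
    for movie, songs in scraped.items():
        matched = []
        for song in songs:
            track_id = by_name.get(song["name"].lower())
            if track_id is None:
                continue  # not in the playlist, so drop it
            embed_url = f"https://open.spotify.com/embed/track/{track_id}"
            matched.append({**song, "embed_url": embed_url})
        if matched:
            kept[movie] = matched
    return kept


# Placeholder data: the track ID and the second movie are made up for illustration.
scraped = {
    "Mulan": [{"name": "Reflection", "text": "..."}],
    "Example Movie": [{"name": "Not On Spotify", "text": "..."}],
}
playlist_tracks = {"reflection": "TRACK_ID_PLACEHOLDER"}
print(keep_songs_in_playlist(scraped, playlist_tracks))
```

Exact name matching is the simplest option; real track titles often carry suffixes such as a soundtrack version, so a production script would likely need fuzzier matching.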

In [None]:
from scripts import keep_only_lyrics_on_spotify

### 2. Data Encoding

Creating the dataset is pretty straightforward. Given the JSON file from the previous step, we embed the `text` field and attach the remaining keys/values as metadata.

In [6]:
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.chat_models import AzureChatOpenAI
from langchain.vectorstores import DeepLake
import json


def create_db(dataset_path: str, json_filepath: str) -> DeepLake:
    with open(json_filepath, "r") as f:
        data = json.load(f)

    texts = []
    metadatas = []

    for movie, lyrics in data.items():
        for lyric in lyrics:
            texts.append(lyric["text"])
            metadatas.append(
                {
                    "movie": movie,
                    "name": lyric["name"],
                    "embed_url": lyric["embed_url"],
                }
            )

    embeddings = HuggingFaceEmbeddings()

    db = DeepLake.from_texts(
        texts, embeddings, metadatas=metadatas, dataset_path=dataset_path
    )

    return db
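For reference, the JSON layout that `create_db` reads can be inferred from its loops: a mapping from movie to a list of songs, each with `name`, `text`, and `embed_url` keys. The sketch below writes a toy file in that shape and applies the same flattening without the embedding step. All values are placeholders, not real data.

```python
import json
import os
import tempfile

# Toy data in the shape create_db iterates over. Values are placeholders.
data = {
    "Mulan": [
        {
            "name": "Reflection",
            "text": "placeholder text to embed",
            "embed_url": "https://open.spotify.com/embed/track/PLACEHOLDER",
        }
    ]
}

path = os.path.join(tempfile.mkdtemp(), "songs.json")
with open(path, "w") as f:
    json.dump(data, f)

with open(path, "r") as f:
    loaded = json.load(f)

# Same flattening as in create_db: one text and one metadata dict per song.
texts = [song["text"] for songs in loaded.values() for song in songs]
metadatas = [
    {"movie": movie, "name": song["name"], "embed_url": song["embed_url"]}
    for movie, songs in loaded.items()
    for song in songs
]
print(texts)
print(metadatas)
```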

To load it, we can simply do the following:

In [7]:
def load_db(dataset_path: str, *args, **kwargs) -> DeepLake:
    db = DeepLake(dataset_path, *args, **kwargs)
    return db

### 3. Approaches to Matching Moods to Songs

#### 3.1 What Didn't Work

- **Similarity Search of Direct Embeddings**: This approach was straightforward. We create embeddings for the lyrics and the user input and do a similarity search. Unfortunately, the suggestions were terrible, because we want to match the user's emotions to the song's theme rather than to the literal words of the lyrics.
- **Using ChatGPT as a Retrieval System**: We also tried passing all the lyrics to ChatGPT and asking it to return songs matching the user input. We first had to create a one-sentence summary of each lyric to fit the token limit. This worked reasonably well but was overkill.

#### 3.2 What Did Work: Similarity Search of Emotions Embeddings

Finally, we arrived at an approach that is inexpensive to run and gives good results. We convert each lyric to a list of 8 emotions using ChatGPT:

In [None]:
from scripts import create_emotions_summary
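The `create_emotions_summary` script is not shown here. One step such a pipeline may need is turning a free-form model reply into a clean, comma-separated list before embedding it. The helper below is a hypothetical sketch of that normalization, not the script's actual logic. It uses a reply in the style of the prompt examples later in this notebook.

```python
import re
from typing import List


def parse_emotions(reply: str) -> List[str]:
    """Split a reply such as 'Joy, Hope, and Awe' into lowercase emotions."""
    # Strip surrounding whitespace and quotes the model may add.
    cleaned = reply.strip().strip('"').strip()
    # Split on commas and on the standalone word "and".
    parts = re.split(r",|\band\b", cleaned)
    seen, emotions = set(), []
    for part in parts:
        emotion = part.strip().lower()
        # Skip empty fragments and duplicates while preserving order.
        if emotion and emotion not in seen:
            seen.add(emotion)
            emotions.append(emotion)
    return emotions


print(parse_emotions('"Exhaustion, Discomfort, and Fatigue"'))
# ['exhaustion', 'discomfort', 'fatigue']
```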

We then embedded the emotion list for each song with `HuggingFaceEmbeddings` (the embedding model used in `create_db`) and stored it in Deep Lake:

In [14]:
my_activeloop_org_id = os.environ.get("ACTIVELOOP_ORG_ID")
my_activeloop_dataset_name = "song-recommendation"

dataset_path = f"hub://{my_activeloop_org_id}/{my_activeloop_dataset_name}"

db = create_db(
    dataset_path=dataset_path,
    json_filepath="../../temp/emotions_with_spotify_url.json",
)

Your Deep Lake dataset has been successfully created!
The dataset is private so make sure you are logged in!


Dataset(path='hub://iamrk04/song-recommendation', tensors=['embedding', 'id', 'metadata', 'text'])

  tensor      htype      shape     dtype  compression
  -------    -------    -------   -------  ------- 
 embedding  embedding  (79, 768)  float32   None   
    id        text      (79, 1)     str     None   
 metadata     json      (79, 1)     str     None   
   text       text      (79, 1)     str     None   


 

Once the dataset exists, use the code below to load it instead of running the previous cell again:

In [27]:
my_activeloop_org_id = os.environ.get("ACTIVELOOP_ORG_ID")
my_activeloop_dataset_name = "song-recommendation"

dataset_path = f"hub://{my_activeloop_org_id}/{my_activeloop_dataset_name}"

db = load_db(
    dataset_path=dataset_path,
    embedding_function=HuggingFaceEmbeddings(),
    read_only=True,
)

Deep Lake Dataset in hub://iamrk04/song-recommendation already exists, loading from the storage


Then, we need to convert the user input to a list of emotions. We used ChatGPT again with a custom prompt:

In [108]:
from langchain.prompts import PromptTemplate
from langchain.chat_models import AzureChatOpenAI
from langchain.chains import LLMChain


def convert_input_to_emotions(user_input: str) -> str:
    """
    Convert user input to emotions using LLM.

    :param user_input: User input
    :return: Emotions
    """
    prompt = PromptTemplate(
        input_variables=["user_input"],
        template="""\
We have a simple song retrieval system. It accepts 8 emotions. You are tasked to suggest \
between 1 and 4 emotions to match the user's feelings. Suggest more emotions for longer \
sentences and just one or two for small ones, trying to condense the main theme of the input.

Examples:

```
Input: "I had a great day!" 
"Joy"
Input: "I am very tired today and I am not feeling well"
"Exhaustion, Discomfort, and Fatigue"
Input: "I am in Love"
"Love"
```

Please, suggest emotions for input = ```{user_input}```, reply ONLY with a list of emotions/feelings/vibes\
""",
    )
    model = AzureChatOpenAI(deployment_name="gpt4", temperature=0.7)
    chain = LLMChain(llm=model, prompt=prompt)
    return chain.run(user_input=user_input)

### 4. Post processing

In [56]:
import numpy as np
from typing import Tuple, List
from langchain.schema import Document


# filter out the low-scoring ones.
def filter_scores(
    matches: List[Tuple[Document, float]], threshold: float = 0.8
) -> List[Tuple[Document, float]]:
    """
    Filter scores by threshold.

    :param matches: List of tuples (doc, score)
    :param threshold: Threshold to use for filtering
    :return: List of filtered tuples (doc, score)
    """
    return [(doc, score) for (doc, score) in matches if score > threshold]


# To add variation (that is, to avoid always recommending the top match), we sample
# from the list of candidate matches. To do so, we first ensure the scores sum to one
# by dividing each score by their sum.
def normalize_scores_by_sum(
    matches: List[Tuple[Document, float]]
) -> List[Tuple[Document, float]]:
    """
    Normalize scores by sum.

    :param matches: List of tuples (doc, score)
    :return: List of normalized tuples (doc, score)
    """
    scores = [score for _, score in matches]
    total = sum(scores)
    return [(doc, (score / total)) for doc, score in matches]


def weighted_random_sample(
    items: np.ndarray, weights: np.ndarray, n: int
) -> np.ndarray:
    """
    Weighted sampling without replacement: like np.random.choice, but an item
    can be picked at most once.

    Args:
        items (np.ndarray): Candidates to sample from.
        weights (np.ndarray): Sampling probabilities, one per item, summing to one.
        n (int): Maximum number of items to return.

    Returns:
        np.ndarray: Up to n distinct items.
    """
    indices = np.arange(len(items))
    out_indices = []

    # Never draw more items than there are candidates.
    for _ in range(min(n, len(items))):
        chosen_index = np.random.choice(indices, p=weights)
        out_indices.append(chosen_index)

        # Remove the chosen item, then renormalize the remaining weights.
        mask = indices != chosen_index
        indices = indices[mask]
        weights = weights[mask]

        if weights.sum() != 0:
            weights = weights / weights.sum()

    return items[out_indices]
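To see the three post-processing steps end to end, here is a small sketch on toy data. Plain `(name, score)` tuples stand in for the `(Document, score)` pairs, and the scores are invented. For brevity, it uses NumPy's `Generator.choice` with `replace=False` in place of the manual loop in `weighted_random_sample`. That is a substitution for illustration, not the function defined above.

```python
import numpy as np

# Toy (song name, similarity score) pairs; the scores are made up.
matches = [("A", 0.9), ("B", 0.6), ("C", 0.3), ("D", 0.75)]
threshold = 0.5

# 1. Drop low-scoring matches.
kept = [(name, score) for name, score in matches if score > threshold]

# 2. Normalize the remaining scores so they sum to one.
total = sum(score for _, score in kept)
names = np.array([name for name, _ in kept])
probs = np.array([score / total for _, score in kept])

# 3. Sample without replacement, weighted by normalized score.
rng = np.random.default_rng(seed=0)
n = min(2, len(names))  # never ask for more songs than candidates
picked = rng.choice(names, size=n, replace=False, p=probs)
print(picked.tolist())
```

Higher-scoring songs are more likely to be picked, but not guaranteed, which is what gives repeated queries some variety.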

### 5. Putting it all together

In [116]:
from typing import List


def get_recommendation(
    user_input: str,
    retrieve_songs: int = 20,
    match_score: float = 0.5,
    out_songs: int = 3,
) -> List[str]:
    """
    Get song recommendations based on user input.

    :param user_input: User input.
    :param retrieve_songs: max number of songs to retrieve from db.
    :param match_score: Minimum match score to filter matching songs.
    :param out_songs: Number of songs to return.
    """
    # Get emotions from a user's input
    emotions = convert_input_to_emotions(user_input)
    print(f"Detected emotions: {emotions}")

    # Find the k most similar songs
    matches = db.similarity_search_with_score(
        query=emotions, distance_metric="cos", k=retrieve_songs
    )
    print(f"Matches: {matches}")

    # post-process the results
    try:
        norm_filtered_matches = normalize_scores_by_sum(
            filter_scores(matches, match_score)
        )
        docs, scores = zip(*norm_filtered_matches)
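        chosen_docs = weighted_random_sample(
            np.array(docs), np.array(scores), n=out_songs
        ).tolist()
        return [doc.metadata["name"] for doc in chosen_docs]
    except ValueError:
        # Raised, for example, when no match passes the threshold,
        # so there is nothing to unpack or sample.
        return []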
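The `match_score` threshold is easier to reason about with the metric in mind. Assuming the score returned with `distance_metric="cos"` is cosine similarity (which is consistent with keeping matches whose score is above the threshold), the quantity can be sketched in NumPy as follows. The vectors are toy values, not real embeddings.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


query = np.array([1.0, 0.0])
same_direction = np.array([2.0, 0.0])
orthogonal = np.array([0.0, 3.0])

print(cosine_similarity(query, same_direction))  # 1.0
print(cosine_similarity(query, orthogonal))      # 0.0
```

Because the measure ignores vector length, only the direction of the emotion embedding matters when comparing the query to each song.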

In [119]:
user_input = "I am happy and excited"
response = get_recommendation(user_input)
response

Detected emotions: Happiness, Excitement
Matches: [(Document(page_content='Excitement, joy, amusement, wonder, friendship, enthusiasm, surprise, gratitude', metadata={'movie': 'Aladdin', 'name': 'Friend Like Me', 'embed_url': 'https://open.spotify.com/embed/track/5f2TWu6R2YYCJtLQ0fP78H?utm_source=generator'}), 0.7973227500915527), (Document(page_content='excitement, admiration, triumph, joy, pride, confidence, awe, inspiration', metadata={'movie': 'Hercules', 'name': 'Zero to Hero', 'embed_url': 'https://open.spotify.com/embed/track/4zDfgax6Ihb0UWdour1ZEs?utm_source=generator'}), 0.7807521820068359), (Document(page_content='excitement, anticipation, loneliness, elation, hope, anxiety, vulnerability, joy', metadata={'movie': 'Frozen', 'name': 'For the First Time in Forever', 'embed_url': 'https://open.spotify.com/embed/track/70b5Sq3ePOu3Gqg0hjlOtR?utm_source=generator'}), 0.7761297225952148), (Document(page_content='excitement, anticipation, happiness, playfulness, curiosity, contentmen

['Arabian Nights', 'Friend Like Me', "We Don't Talk About Bruno"]