<a href="https://colab.research.google.com/github/tjessica13/TS_SBERT/blob/main/TaylorSwift_SBERT.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Taylor Swift Lyric Retrieval with SBERT
This notebook experiments with semantic search over Taylor Swift lyrics using SBERT (Sentence-BERT), a transformer-based sentence embedding model.

In [16]:
# dataset is from https://www.kaggle.com/datasets/ishikajohari/taylor-swift-all-lyrics-30-albums/data

!git clone https://github.com/tjessica13/TS_SBERT

fatal: destination path 'TS_SBERT' already exists and is not an empty directory.


# Step 1: Preprocessing

In [127]:
songs = []

song_titles = []

In [126]:
import re

def preprocess(content):
  # replace newlines with spaces
  lyrics_n = re.sub(r'\n', ' ', content)
  # remove single and double quote characters
  lyrics_quotes = re.sub('[\'"]', '', lyrics_n)
  # remove "<number>Embed" markers (scraping residue)
  lyrics_embed = re.sub(r'[0-9]+Embed', '', lyrics_quotes)
  # remove the song structure parts (ex. [Chorus])
  lyrics = re.sub(r'\[(.*?)\]', '', lyrics_embed)
  return lyrics


In [128]:
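import os

# read in the documents for the collection; each song is appended to the
# module-level lists `songs` and `song_titles`, so reset those lists before re-running
def read_docs(directory, album):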
  for filename in os.listdir(directory):
    filepath = os.path.join(directory, filename)
    print(album)
    print(filename)
    if os.path.isfile(filepath):
      with open(filepath, 'r', encoding='utf-8') as file:
        next(file) # skip the first line of the file
        content = file.read()
        print("Processing file: " + filepath)
        lyrics = preprocess(content)
        song_titles.append(album + "." + filename)
        songs.append(lyrics)
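
Because `read_docs` appends to shared lists, calling it twice for the same album stores every song twice. One way to avoid that is a variant that returns its results instead. The following is only a sketch: `read_album` is a hypothetical name, the demo folder and file are made up, and the `preprocess` step is left out to keep the example self-contained.

```python
import os
import tempfile

def read_album(directory, album):
    """Return (titles, lyrics) for one album folder without touching globals."""
    titles, lyrics = [], []
    for filename in sorted(os.listdir(directory)):  # sorted for a stable order
        filepath = os.path.join(directory, filename)
        if not os.path.isfile(filepath):
            continue
        with open(filepath, 'r', encoding='utf-8') as file:
            next(file, None)  # skip the first line; tolerate empty files
            content = file.read()
        titles.append(album + "." + filename)
        lyrics.append(content)
    return titles, lyrics

# Tiny demo on a throwaway folder (file name and text are invented)
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "SongA.txt"), "w", encoding="utf-8") as f:
        f.write("header line\nfirst lyric line\n")
    demo_titles, demo_lyrics = read_album(tmp, "DemoAlbum")

print(demo_titles)
print(demo_lyrics)
```

With this shape, the caller decides how to combine albums, for example by extending `song_titles` and `songs` once per album.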

In [93]:
# a list (rather than a set) keeps the album order, and so the corpus ids, stable between runs
albums = [
    "1989_TaylorsVersion_",
    "Evermore",
    "Fearless_TaylorsVersion_",
    "Folklore",
    "Lover",
    "Midnights_TheTillDawnEdition_",
    "Red_TaylorsVersion_",
    "Reputation",
    "SpeakNow_TaylorsVersion_",
    "THETORTUREDPOETSDEPARTMENT_THEANTHOLOGY",
    "TaylorSwift"
]

In [129]:
read_docs("/content/TS_SBERT/THETORTUREDPOETSDEPARTMENT_THEANTHOLOGY", "TTPD")

TTPD
HowDidItEnd_.txt
Processing file: /content/TS_SBERT/THETORTUREDPOETSDEPARTMENT_THEANTHOLOGY/HowDidItEnd_.txt
TTPD
TheAlchemy.txt
Processing file: /content/TS_SBERT/THETORTUREDPOETSDEPARTMENT_THEANTHOLOGY/TheAlchemy.txt
TTPD
Peter.txt
Processing file: /content/TS_SBERT/THETORTUREDPOETSDEPARTMENT_THEANTHOLOGY/Peter.txt
TTPD
DownBad.txt
Processing file: /content/TS_SBERT/THETORTUREDPOETSDEPARTMENT_THEANTHOLOGY/DownBad.txt
TTPD
ClaraBow.txt
Processing file: /content/TS_SBERT/THETORTUREDPOETSDEPARTMENT_THEANTHOLOGY/ClaraBow.txt
TTPD
imgonnagetyouback.txt
Processing file: /content/TS_SBERT/THETORTUREDPOETSDEPARTMENT_THEANTHOLOGY/imgonnagetyouback.txt
TTPD
TheProphecy.txt
Processing file: /content/TS_SBERT/THETORTUREDPOETSDEPARTMENT_THEANTHOLOGY/TheProphecy.txt
TTPD
Florida___.txt
Processing file: /content/TS_SBERT/THETORTUREDPOETSDEPARTMENT_THEANTHOLOGY/Florida___.txt
TTPD
SoHighSchool.txt
Processing file: /content/TS_SBERT/THETORTUREDPOETSDEPARTMENT_THEANTHOLOGY/SoHighSchool.txt
TTPD
B

In [130]:
songs

['(Uh-oh, uh-oh)   We hereby conduct this post-mortem He was a hot house flower to my outdoorsman Our maladies were such we could not cure them And so a touch that was my birthright became foreign   Come one, come all, its happenin again The empathetic hunger descends Well tell no one except all of our friends We must know How did it end? (Uh-oh, uh-oh)   We were blind to unforeseen circumstances We learned thе right steps to different dancеs And fell victim to interlopers glances Lost the game of chance, what are the chances? Soon, theyll go home to their husbands Smug cause they know they can trust him Then feverishly calling their cousins See Taylor Swift LiveGet tickets as low as $60You might also like Guess who we ran into at the shops? Walking in circles like she was lost Didnt you hear? They called it all off One gasp and then How did it end?   Say it once again with feeling How the death rattle breathing Silenced as the soul was leaving The deflation of our dreaming Leaving me 
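
The output above still contains page residue from the lyrics source ("See Taylor Swift LiveGet tickets as low as $60You might also like"), which ends up inside the text that gets embedded. Below is a minimal sketch of an extra cleaning step. It assumes the residue always has the form seen above (with any dollar amount), and `strip_page_residue` is a hypothetical helper name. Note that a genuine lyric containing "You might also like" would be removed too.

```python
import re

def strip_page_residue(lyrics):
    # Assumed pattern: the ticket ad seen in the output above, with any dollar amount
    lyrics = re.sub(r'See Taylor Swift LiveGet tickets as low as \$\d+', ' ', lyrics)
    # Assumed pattern: the recommendation widget text
    lyrics = re.sub(r'You might also like', ' ', lyrics)
    # Collapse the runs of whitespace left behind
    return re.sub(r'\s+', ' ', lyrics).strip()

sample = ("Then feverishly calling their cousins See Taylor Swift LiveGet tickets "
          "as low as $60You might also like Guess who we ran into at the shops?")
cleaned = strip_page_residue(sample)
print(cleaned)
# Then feverishly calling their cousins Guess who we ran into at the shops?
```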

In [None]:
for album in albums:
  read_docs("/content/TS_SBERT/" + album, album)

In [95]:
print(len(songs))
print(len(song_titles))

240
240


# Step 2: SBERT

In [74]:
!pip install -U sentence-transformers



In [96]:
import json
from sentence_transformers import SentenceTransformer, CrossEncoder, util
import gzip
import os
import torch

In [97]:
#We use the Bi-Encoder to encode all passages, so that we can use it with semantic search
bi_encoder = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1')
bi_encoder.max_seq_length = 256     #Truncate long passages to 256 tokens
top_k = 32                          #Number of passages we want to retrieve with the bi-encoder
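
With `max_seq_length = 256`, anything past the first 256 tokens of a song is dropped before encoding, so later verses may not influence the embedding. One possible workaround, not used in this notebook, is to split each song into overlapping windows and encode the windows separately. The sketch below counts whitespace-separated words as a rough stand-in for tokens; `split_into_windows`, `window_size`, and `stride` are hypothetical names and values.

```python
def split_into_windows(text, window_size=200, stride=150):
    """Split text into overlapping word windows (words approximate tokens)."""
    words = text.split()
    if len(words) <= window_size:
        return [" ".join(words)]
    windows = []
    for start in range(0, len(words), stride):
        windows.append(" ".join(words[start:start + window_size]))
        if start + window_size >= len(words):
            break  # the last window already reaches the end of the text
    return windows

# Toy example: ten "words", windows of 4 with a stride of 3
toy_text = " ".join(str(i) for i in range(10))
toy_windows = split_into_windows(toy_text, window_size=4, stride=3)
print(toy_windows)
# ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

If windows were encoded instead of whole songs, each window would also need to remember which song it came from so that hits can be mapped back to titles.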

In [98]:
# We encode all passages into our vector space. This takes about 5 minutes (depends on your GPU speed)
corpus_embeddings = bi_encoder.encode(songs, convert_to_tensor=True, show_progress_bar=True)

Batches:   0%|          | 0/8 [00:00<?, ?it/s]

In [108]:
import torch
def search(query):
  print("Looking for a song that matches:", query)

  ##### Semantic Search #####
  # Encode the query using the bi-encoder and find potentially relevant passages
  question_embedding = bi_encoder.encode(query, convert_to_tensor=True)
  hits = util.semantic_search(question_embedding, corpus_embeddings, top_k=top_k)
  hits = hits[0]  # Get the hits for the first query

  # Print up to 10 of the best bi-encoder hits (the header printed below still says "Top-3")
  print("\n-------------------------\n")
  print("Top-3 Bi-Encoder Retrieval hits")
  hits = sorted(hits, key=lambda x: x['score'], reverse=True)
  for hit in hits[0:10]:
      print("\t{:.3f}\t{}".format(hit['score'], song_titles[hit['corpus_id']]))
      print(songs[hit['corpus_id']])
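
By default, `util.semantic_search` scores each corpus embedding against the query with cosine similarity and returns the highest-scoring entries. The sketch below reproduces that ranking in plain Python on made-up 2-D vectors, purely to illustrate what the scores in the output mean. `cosine` and `top_k_hits` are hypothetical helpers, not part of the library.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_hits(query_vec, corpus_vecs, k):
    """Return the k best matches as dicts shaped like semantic_search hits."""
    scored = [{'corpus_id': i, 'score': cosine(query_vec, vec)}
              for i, vec in enumerate(corpus_vecs)]
    return sorted(scored, key=lambda hit: hit['score'], reverse=True)[:k]

# Toy "embeddings": the query points the same way as corpus entry 0
toy_corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query = [1.0, 0.0]
toy_hits = top_k_hits(toy_query, toy_corpus, k=2)
for hit in toy_hits:
    print("{:.3f}\t{}".format(hit['score'], hit['corpus_id']))
# 1.000	0
# 0.707	2
```

A score near 1 means the query and the song embedding point in almost the same direction, while scores near 0 indicate little semantic overlap.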


This experiment tests SBERT's capabilities in semantic search and sentiment analysis. Additionally, we test how SBERT handles song lyrics, whose metaphors and symbolism may pose challenges for search.

Since the bi-encoder used here (multi-qa-MiniLM-L6-cos-v1) is tuned for question-answering-style semantic search, this experiment tests whether the retrieved songs match the keywords and meaning of the query, either by expressing that meaning directly or through synonyms.

In [123]:
search("A sad breakup")

Looking for a song that matches: A sad breakup

-------------------------

Top-3 Bi-Encoder Retrieval hits
	0.560	Red_TaylorsVersion_.StayStayStay_TaylorsVersion_.txt
Im pretty sure we almost broke up last night I threw my phone across the room at you I was expecting some dramatic turn away But you stayed This morning, I said we should talk about it Cause I read you should never leave a fight unresolved Thats when you came in wearing a football helmet And said, "Okay, lets talk"   And I said, "Stay, stay, stay" Ive been loving you for quite some time, time, time You think that its funny when Im mad, mad, mad But I think that its best if we both stay  Before you, Id only dated self-indulgent takers Who took all of their problems out on me But you carry my groceries and now Im always laughing And I love you because you have given me   No choice but to stay, stay, stay Ive been loving you for quite some time, time, time You think that its funny when Im mad, mad, mad But I think that its b

In [112]:
search("snow and winter")

Looking for a song that matches: snow and winter

-------------------------

Top-3 Bi-Encoder Retrieval hits
	0.298	Fearless_TaylorsVersion_.TheBestDay_TaylorsVersion_.txt
Im five years old, its getting cold, Ive got my big coat on I hear your laugh and look up smiling at you, I run and run Past the pumpkin patch and the tractor rides Look now, the sky is gold I hug your legs and fall asleep on the way home   I dont know why all the trees change in the fall But I know youre not scared of anything at all Dont know if Snow Whites house is near or far away But I know I had the best day with you today  Im thirteen now and dont know how my friends could be so mean I come home crying and you hold me tight and grab the keys And we drive and drive until we found a town far enough away And we talk and window shop til Ive forgotten all their names   I dont know who Im gonna talk to now at school But I know Im laughing on the car ride home with you Dont know how long its gonna take to feel okay B

In [122]:
search("dreaming and fantasy")

Looking for a song that matches: dreaming and fantasy

-------------------------

Top-3 Bi-Encoder Retrieval hits
	0.613	Folklore.epiphany.txt
Keep your helmet, keep your life, son Just a flesh wound, heres your rifle Crawling up the beaches now "Sir, I think hes bleeding out" And some things you just cant speak about   With you I serve, with you I fall down, down Watch you breathe in, watch you breathing out, out   Something med school did not cover Someones daughter, someones mother Holds your hand through plastic now "Doc, I think shes crashing out" And some things you just cant speak about   Only twenty minutes to sleep But you dream of some epiphany Just one single glimpse of relief To make some sense of what youve seen   With you I serve, with you I fall down, down (Down) Watch you breathe in, watch you breathing out, out With you I serve (With you I serve), with you I fall down (Down), down (Down) Watch you breathe in (Watch you breathe in), watch you breathing out (Out), out (O

In [121]:
search("angry and revenge")

Looking for a song that matches: angry and revenge

-------------------------

Top-3 Bi-Encoder Retrieval hits
	0.504	SpeakNow_TaylorsVersion_.BetterThanRevenge_TaylorsVersion_.txt
Now go stand in the corner and think about what you did (Haha) Haha, time for a little revenge   The story starts when it was hot and it was summer, and I had it all, I had him right there where I wanted him She came along, got him alone, and lets hear the applause She took him faster than you can say "Sabotage" I never saw it coming, wouldnt have suspected it I underestimated just who I was dealing with (Oh) She had to know the pain was beating on me like a drum She underestimated just who she was stealin from   Shes not a saint and shes not what you think Shes an actress, woah He was a moth to the flame She was holding the matches, woah Soon, shes gonna find stealing other peoples toys On the playground wont make you many friends She should keep in mind, she should keep in mind There is nothing I do better

In [124]:
search("grief and death")

Looking for a song that matches: grief and death

-------------------------

Top-3 Bi-Encoder Retrieval hits
	0.439	THETORTUREDPOETSDEPARTMENT_THEANTHOLOGY.HowDidItEnd_.txt
(Uh-oh, uh-oh)   We hereby conduct this post-mortem He was a hot house flower to my outdoorsman Our maladies were such we could not cure them And so a touch that was my birthright became foreign   Come one, come all, its happenin again The empathetic hunger descends Well tell no one except all of our friends We must know How did it end? (Uh-oh, uh-oh)   We were blind to unforeseen circumstances We learned thе right steps to different dancеs And fell victim to interlopers glances Lost the game of chance, what are the chances? Soon, theyll go home to their husbands Smug cause they know they can trust him Then feverishly calling their cousins See Taylor Swift LiveGet tickets as low as $60You might also like "Guess who we ran into at the shops? Walking in circles like she was lost Didnt you hear? They called it all off"

In [125]:
search("happy and celebrations")

Looking for a song that matches: happy and celebrations

-------------------------

Top-3 Bi-Encoder Retrieval hits
	0.430	Evermore.tolerateit.txt
I sit and watch you reading with your head low I wake and watch you breathing with your eyes closed I sit and watch you I notice everything you do or dont do Youre so much older and wiser, and I   I wait by the door like Im just a kid Use my best colors for your portrait Lay the table with the fancy shit And watch you tolerate it If its all in my head, tell me now Tell me Ive got it wrong somehow I know my love should be celebrated But you tolerate it   I greet you with a battle heros welcome I take your indiscretions all in good fun I sit and listеn, I polish plates until they gleam and glistеn Youre so much older and wiser and I   I wait by the door like Im just a kid Use my best colors for your portrait Lay the table with the fancy shit And watch you tolerate it If its all in my head, tell me now Tell me Ive got it wrong somehow I know my

# References

* https://www.kaggle.com/datasets/ishikajohari/taylor-swift-all-lyrics-30-albums
* https://www.kaggle.com/code/ishikajohari/taylor-swifts-lyrical-word-clouds-all-11-albums
* https://stackoverflow.com/questions/4796764/read-file-from-line-2-or-skip-header-row
* https://stackoverflow.com/questions/3677255/regex-for-anything-between
* https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/retrieve_rerank/retrieve_rerank_simple_wikipedia.ipynb
* https://sbert.net/examples/applications/computing-embeddings/README.html


