# Create Taylor Swift Embeddings Data

Use this colab to analyze trends in Taylor Swift's song lyrics using [Phoenix OSS](https://github.com/Arize-ai/phoenix). Download the Kaggle dataset [here](https://www.kaggle.com/datasets/PromptCloudHQ/taylor-swift-song-lyrics-from-all-the-albums?select=taylor_swift_lyrics.csv).

In [1]:
!pip install arize-phoenix



In [1]:
import pandas as pd
import phoenix as px

In [6]:
df = pd.read_csv("summaries.csv", encoding="ISO-8859-1", delimiter="\t")
df

Unnamed: 0,videoID,title,summary
0,652b44ad43e8c47e4eb48235,Israel-Hamas war: Israeli soldiers gear up to ...,An intense missile conflict unfolds in the ski...
1,652b444c43e8c47e4eb48231,Israel vs Hamas: Five Videos that Define the W...,"In this video, there are five defining moments..."
2,652b444943e8c47e4eb48230,Jewish schools CLOSE over children safety conc...,Some North London Jewish schools have closed d...
3,652b443543e8c47e4eb4822f,Israel Air Force Rains Fire On Hamas Naval HQ ...,This video shows a city being struck by a disa...
4,652b442943e8c47e4eb4822e,Fake News Over Israel-Hamas War | Vantage with...,The video explores the prevalence of digital m...
...,...,...,...
72,652b408543e8c47e4eb481e0,Israel-Hamas War: Watch | Drone Video Captures...,The video provides an in-depth analysis of the...
73,652b407e43e8c47e4eb481df,Israel-Palestine war: Israel says it killed Ha...,The video starts with footage of a bomb explos...
74,652b407a43e8c47e4eb481de,Bodycam Videos Show Intense Shootouts Between ...,A tense situation unfolds as a driver captures...
75,652b407543e8c47e4eb481dc,Israel-Hamas war: CCTV catches two women caugh...,A CCTV footage captures various instances of a...
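
The `Unnamed: 0` column in the output above usually means the file was saved together with its index. A minimal sketch, assuming the first column of the file is that saved index, reads it back with `index_col=0` and checks that the expected columns are present. The `load_summaries` helper and the inline sample data are hypothetical stand-ins, not part of the original notebook.

```python
import io

import pandas as pd

# Hypothetical stand-in for summaries.csv: tab-delimited, first column is the saved index.
SAMPLE_TSV = (
    "\tvideoID\ttitle\tsummary\n"
    "0\tid-a\tFirst title\tFirst summary\n"
    "1\tid-b\tSecond title\tSecond summary\n"
)


def load_summaries(source):
    """Read the tab-delimited summaries file, using the first column as the index."""
    frame = pd.read_csv(source, encoding="ISO-8859-1", delimiter="\t", index_col=0)
    # Fail early if a column used later in the notebook is missing.
    missing = {"videoID", "title", "summary"} - set(frame.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return frame


# Bytes buffer so the encoding argument is exercised the same way as for a file on disk.
sample_df = load_summaries(io.BytesIO(SAMPLE_TSV.encode("ISO-8859-1")))
print(sample_df.columns.tolist())
```

Passing a path such as `"summaries.csv"` instead of the in-memory buffer should behave the same way, provided the file has the layout assumed here.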


In [10]:
!pip install "arize[AutoEmbeddings]"


In [7]:
from arize.pandas.embeddings import EmbeddingGenerator, UseCases

df = df.reset_index(drop=True)
df

Unnamed: 0,videoID,title,summary
0,652b44ad43e8c47e4eb48235,Israel-Hamas war: Israeli soldiers gear up to ...,An intense missile conflict unfolds in the ski...
1,652b444c43e8c47e4eb48231,Israel vs Hamas: Five Videos that Define the W...,"In this video, there are five defining moments..."
2,652b444943e8c47e4eb48230,Jewish schools CLOSE over children safety conc...,Some North London Jewish schools have closed d...
3,652b443543e8c47e4eb4822f,Israel Air Force Rains Fire On Hamas Naval HQ ...,This video shows a city being struck by a disa...
4,652b442943e8c47e4eb4822e,Fake News Over Israel-Hamas War | Vantage with...,The video explores the prevalence of digital m...
...,...,...,...
72,652b408543e8c47e4eb481e0,Israel-Hamas War: Watch | Drone Video Captures...,The video provides an in-depth analysis of the...
73,652b407e43e8c47e4eb481df,Israel-Palestine war: Israel says it killed Ha...,The video starts with footage of a bomb explos...
74,652b407a43e8c47e4eb481de,Bodycam Videos Show Intense Shootouts Between ...,A tense situation unfolds as a driver captures...
75,652b407543e8c47e4eb481dc,Israel-Hamas war: CCTV catches two women caugh...,A CCTV footage captures various instances of a...


In [8]:
generator = EmbeddingGenerator.from_use_case(
    use_case=UseCases.NLP.SUMMARIZATION,
    model_name="distilbert-base-uncased",
    tokenizer_max_length=512,
    batch_size=100,
)

df["summary_vector"] = generator.generate_embeddings(text_col=df["summary"])

arize.utils.logging | INFO | Downloading pre-trained model 'distilbert-base-uncased'
arize.utils.logging | INFO | Downloading tokenizer for 'distilbert-base-uncased'
arize.utils.logging | INFO | Generating embedding vectors





In [9]:
df

Unnamed: 0,videoID,title,summary,summary_vector
0,652b44ad43e8c47e4eb48235,Israel-Hamas war: Israeli soldiers gear up to ...,An intense missile conflict unfolds in the ski...,"[-0.08562961220741272, -0.0525701567530632, -0..."
1,652b444c43e8c47e4eb48231,Israel vs Hamas: Five Videos that Define the W...,"In this video, there are five defining moments...","[-0.05311104655265808, -0.37769395112991333, -..."
2,652b444943e8c47e4eb48230,Jewish schools CLOSE over children safety conc...,Some North London Jewish schools have closed d...,"[0.08371689915657043, -0.3879491090774536, -0...."
3,652b443543e8c47e4eb4822f,Israel Air Force Rains Fire On Hamas Naval HQ ...,This video shows a city being struck by a disa...,"[0.10667204856872559, -0.297080397605896, -0.2..."
4,652b442943e8c47e4eb4822e,Fake News Over Israel-Hamas War | Vantage with...,The video explores the prevalence of digital m...,"[0.07996666431427002, -0.2996918261051178, -0...."
...,...,...,...,...
72,652b408543e8c47e4eb481e0,Israel-Hamas War: Watch | Drone Video Captures...,The video provides an in-depth analysis of the...,"[-0.03756340965628624, -0.24474535882472992, -..."
73,652b407e43e8c47e4eb481df,Israel-Palestine war: Israel says it killed Ha...,The video starts with footage of a bomb explos...,"[0.07485247403383255, -0.2852363586425781, -0...."
74,652b407a43e8c47e4eb481de,Bodycam Videos Show Intense Shootouts Between ...,A tense situation unfolds as a driver captures...,"[-0.039329931139945984, -0.32355111837387085, ..."
75,652b407543e8c47e4eb481dc,Israel-Hamas war: CCTV catches two women caugh...,A CCTV footage captures various instances of a...,"[0.041810646653175354, -0.35534563660621643, -..."
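
Before handing the vectors to Phoenix, it can help to confirm that every row holds a one-dimensional vector of the same length with no NaN or infinite values. The sketch below is one possible check, not something the notebook or the Arize library performs; `check_embedding_column` and the toy dataframe are hypothetical.

```python
import numpy as np
import pandas as pd


def check_embedding_column(frame, column):
    """Return the shared vector length, raising if vectors are ragged or non-finite."""
    vectors = [np.asarray(v, dtype=float) for v in frame[column]]
    shapes = {v.shape for v in vectors}
    if len(shapes) != 1:
        raise ValueError(f"inconsistent vector shapes: {sorted(shapes)}")
    (shape,) = shapes
    if len(shape) != 1:
        raise ValueError(f"expected 1-D vectors, got shape {shape}")
    if not all(np.isfinite(v).all() for v in vectors):
        raise ValueError("found NaN or infinite values")
    return shape[0]


# Toy data standing in for the real dataframe with its summary_vector column.
toy_df = pd.DataFrame(
    {
        "summary": ["first summary", "second summary"],
        "summary_vector": [np.array([0.1, -0.2, 0.3]), np.array([0.0, 0.5, -0.1])],
    }
)
dim = check_embedding_column(toy_df, "summary_vector")
print(dim)
```

On the real data the call would be `check_embedding_column(df, "summary_vector")`.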


In [10]:
schema = px.Schema(
    embedding_feature_column_names={
        "taylors_embedding": px.EmbeddingColumnNames(
            vector_column_name="summary_vector", raw_data_column_name="summary"
        )
    },
    feature_column_names=["videoID", "title"],
)
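
The schema refers to dataframe columns by name, so a misspelled name only surfaces once the dataset is built. As a rough sketch, the referenced names can be compared against the dataframe first. `missing_schema_columns` is a hypothetical helper written in plain pandas, not part of the Phoenix API.

```python
import pandas as pd


def missing_schema_columns(frame, vector_column, raw_data_column, feature_columns):
    """List the columns the schema refers to that are absent from the dataframe."""
    referenced = [vector_column, raw_data_column, *feature_columns]
    return [name for name in referenced if name not in frame.columns]


# Toy frame with the same column names the schema above expects.
toy_df = pd.DataFrame(
    {
        "videoID": ["id-a"],
        "title": ["First title"],
        "summary": ["First summary"],
        "summary_vector": [[0.1, 0.2]],
    }
)
missing = missing_schema_columns(
    toy_df, "summary_vector", "summary", ["videoID", "title"]
)
print(missing)
```

An empty list means every column named in the schema exists in the dataframe.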

In [11]:
px.launch_app(px.Dataset(df, schema))

🌍 To view the Phoenix app in your browser, visit http://127.0.0.1:6060/
📺 To view the Phoenix app in a notebook, run `px.active_session().view()`
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix


<phoenix.session.session.ThreadSession at 0x17043ab10>

In [12]:
px.active_session().view()

📺 Opening a view to the Phoenix app. The app is running at http://127.0.0.1:6060/
