### Downloading and running an LLM

The first step is to load the model onto the GPU for faster inference. We load the model and the tokenizer separately, and keep them separate, so that each can be explored on its own.

In [1]:
import warnings
warnings.filterwarnings("ignore")


from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "microsoft/Phi-3-mini-4k-instruct"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=False,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)

In [2]:
from transformers import pipeline

# Create a pipeline
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,
    max_new_tokens=500,
    do_sample=False
)

Device set to use cuda
The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


In [3]:
# The prompt (user input / query)
messages = [
    {"role": "user", "content": "Create a funny joke about chickens."}
]

# Generate output
output = generator(messages)
output[0]['generated_text']


' Why did the chicken join the band? Because it had the drumsticks!'

In [4]:
prompt = "Write an email apologizing to Sarah for the tragic gardening mishap. Explain how it happened.<|assistant|>"

# Tokenize the input prompt
input_ids = tokenizer(prompt, return_tensors="pt", add_special_tokens=True).input_ids.to("cuda")

In [5]:
input_ids

tensor([[14350,   385,  4876, 27746,  5281,   304, 19235,   363,   278, 25305,
           293, 16423,   292,   286,   728,   481, 29889, 12027,  7420,   920,
           372,  9559, 29889, 32001]], device='cuda:0')

In [6]:
# Decode each token id individually to see how the prompt was split
for token_id in input_ids[0]:
    print(tokenizer.decode(token_id))

Write
an
email
apolog
izing
to
Sarah
for
the
trag
ic
garden
ing
m
ish
ap
.
Exp
lain
how
it
happened
.
<|assistant|>


In [7]:
# Generate up to 20 new tokens that continue the prompt
output = model.generate(
    inputs=input_ids,
    max_new_tokens=20,
)

output

The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


tensor([[14350,   385,  4876, 27746,  5281,   304, 19235,   363,   278, 25305,
           293, 16423,   292,   286,   728,   481, 29889, 12027,  7420,   920,
           372,  9559, 29889, 32001,  3323,   622, 29901,   317,  3742,   406,
          6225, 11763,   363,   278, 19906,   292,   341,   728,   481,    13,
            13,    13, 29928,   799]], device='cuda:0')

In [8]:
tokenizer.decode(output[0])

'Write an email apologizing to Sarah for the tragic gardening mishap. Explain how it happened.<|assistant|> Subject: Sincere Apologies for the Gardening Mishap\n\n\nDear'
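
Because `generate` returns the prompt tokens followed by the newly generated ones, the continuation can be isolated by slicing off the prompt. The sketch below illustrates this with plain Python lists holding the token ids printed above, so it runs without the model; the names `prompt_ids`, `output_ids`, and `new_ids` are illustrative and not part of the notebook.

```python
# Token ids copied from the notebook output above.
prompt_ids = [14350, 385, 4876, 27746, 5281, 304, 19235, 363, 278, 25305,
              293, 16423, 292, 286, 728, 481, 29889, 12027, 7420, 920,
              372, 9559, 29889, 32001]
output_ids = prompt_ids + [3323, 622, 29901, 317, 3742, 406, 6225, 11763, 363, 278,
                           19906, 292, 341, 728, 481, 13, 13, 13, 29928, 799]

# The output echoes the prompt first, so everything past len(prompt_ids) is new.
new_ids = output_ids[len(prompt_ids):]

print(len(prompt_ids), len(output_ids), len(new_ids))
```

With the real tensors, the equivalent slice would be `tokenizer.decode(output[0][input_ids.shape[1]:])`, which decodes only the generated continuation.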

### Text Embeddings (For Sentences and Whole Documents)

In [9]:
from sentence_transformers import SentenceTransformer

# Load model
model = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')

# Convert the text to a single embedding vector
embed = model.encode("The Lord of the Rings")

In [10]:
embed.shape

(768,)

### Understanding Embeddings

In [11]:
import gensim.downloader as api

model = api.load("glove-wiki-gigaword-50")



In [12]:
model.most_similar("king", topn=10)

[('prince', 0.8236179351806641),
 ('queen', 0.7839043140411377),
 ('ii', 0.7746230363845825),
 ('emperor', 0.7736247777938843),
 ('son', 0.766719400882721),
 ('uncle', 0.7627150416374207),
 ('kingdom', 0.7542160749435425),
 ('throne', 0.7539913654327393),
 ('brother', 0.7492411136627197),
 ('ruler', 0.7434253692626953)]
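
The scores returned by `most_similar` are cosine similarities between word vectors. As a rough illustration of that measure, here is a minimal pure-Python sketch. The three 3-dimensional vectors are made-up toy values, not actual GloVe embeddings, and `cosine_similarity` is a helper written for this example only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors (hypothetical values chosen for illustration only).
king = [0.9, 0.8, 0.1]
queen = [0.8, 0.9, 0.2]
banana = [0.1, 0.0, 0.9]

print(cosine_similarity(king, queen))   # close to 1: similar direction
print(cosine_similarity(king, banana))  # much lower: different direction
```

Cosine similarity depends only on the direction of the vectors, not their length, which is why it is a common choice for comparing embeddings.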

### Recommending songs using embeddings

In [13]:
import pandas as pd
from urllib import request

# Get the playlist dataset file
data = request.urlopen('https://storage.googleapis.com/maps-premium/dataset/yes_complete/train.txt')

# Skip the first two lines, which contain only metadata
lines = data.read().decode("utf-8").split("\n")[2:]

# Remove playlists with only one song
playlist = [s.rstrip().split() for s in lines if len(s.split()) > 1]

# Load song metadata
songs_file = request.urlopen('https://storage.googleapis.com/maps-premium/dataset/yes_complete/song_hash.txt')
songs_file = songs_file.read().decode("utf-8").split('\n')
songs = [s.rstrip().split('\t') for s in songs_file]
songs_df = pd.DataFrame(data=songs, columns = ['id', 'title', 'artist'])
songs_df = songs_df.set_index('id')

In [14]:
songs_df.head()

id,title,artist
0,Gucci Time (w\/ Swizz Beatz),Gucci Mane
1,Aston Martin Music (w\/ Drake & Chrisette Mich...,Rick Ross
2,Get Back Up (w\/ Chris Brown),T.I.
3,Hot Toddy (w\/ Jay-Z & Ester Dean),Usher
4,Whip My Hair,Willow


In [15]:
songs_df.shape

(75263, 2)
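
To make the playlist parsing step concrete, the following sketch applies the same split-and-filter logic to a small in-memory string. The sample text is invented for illustration: it only mimics the layout the loading code assumes (two header lines, then one playlist of space-separated song ids per line) and is not taken from the real dataset.

```python
# Hypothetical sample mimicking the assumed layout of train.txt.
sample = "header line 1\nheader line 2\n0 1 2 \n3 \n4 5 \n"

# Same logic as the loading cell: drop the two header lines ...
lines = sample.split("\n")[2:]

# ... then keep only playlists that contain more than one song id.
playlists = [s.rstrip().split() for s in lines if len(s.split()) > 1]

print(playlists)
```

The single-song playlist and the trailing empty line are both filtered out, since neither gives the model any co-occurrence information to learn from.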

In [16]:
from gensim.models import Word2Vec

model = Word2Vec(
    playlist, vector_size=32, window=20, negative=50, min_count=1, workers=4
)
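
Word2Vec learns from which items appear near each other, so with playlists as "sentences", songs that co-occur within `window` positions tend to end up with similar vectors. The sketch below is a simplified pure-Python picture of which (song, neighbor) pairs a fixed window yields for one playlist. It is not how gensim is implemented internally, and the helper `context_pairs` and the toy playlist are hypothetical.

```python
def context_pairs(sequence, window):
    """Yield (center, neighbor) pairs for items at most `window` positions apart."""
    for i, center in enumerate(sequence):
        start = max(0, i - window)
        end = min(len(sequence), i + window + 1)
        for j in range(start, end):
            if j != i:
                yield center, sequence[j]

# Toy playlist of song ids (made up for illustration).
toy_playlist = ["10", "11", "12", "13"]

pairs = list(context_pairs(toy_playlist, window=1))
print(pairs)
```

In this simplified picture, a window as large as the `window=20` used above pairs every song with every other song in any playlist of 21 songs or fewer, so the playlist as a whole acts as the context.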

In [17]:
songs_df.iloc[2172]

title     Fade To Black
artist        Metallica
Name: 2172 , dtype: object

In [18]:
song_id = 2172

# Ask the model for songs similar to song #2172
similar_songs = model.wv.most_similar(positive=str(song_id))
similar_songs

[('2849', 0.9979673027992249),
 ('5586', 0.996990442276001),
 ('1922', 0.9963552951812744),
 ('20010', 0.9954861402511597),
 ('2063', 0.9950186014175415),
 ('6626', 0.9949555993080139),
 ('6641', 0.994841992855072),
 ('5634', 0.9947130084037781),
 ('1954', 0.9945917129516602),
 ('10105', 0.9944607615470886)]

In [34]:
# Map each similar song's title to its artist
similar_dict = {}
for similar_id, _ in similar_songs:
    row = songs_df.iloc[int(similar_id)]
    similar_dict[row['title']] = row['artist']

pd.DataFrame(data=list(similar_dict.items()), columns=['Song', 'Artist'])

,Song,Artist
0,Run To The Hills,Iron Maiden
1,The Last In Line,Dio
2,One,Metallica
3,Let It Die,Ozzy Osbourne
4,Cemetery Gates,Pantera
5,Blackout,Scorpions
6,Shout At The Devil,Motley Crue
7,Mr. Brownstone,Guns N' Roses
8,The Number Of The Beast,Iron Maiden
9,Three Lock Box,Sammy Hagar


In [42]:
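def get_similar(song_id, topn=5):
    """Return the titles and artists of the songs most similar to song_id."""
    similar_songs = model.wv.most_similar(positive=str(song_id), topn=topn)
    # Build rows from a list rather than a dict keyed by title,
    # so two songs that share a title are both kept
    rows = [songs_df.iloc[int(similar_id)] for similar_id, _ in similar_songs]
    return pd.DataFrame(
        data=[(row['title'], row['artist']) for row in rows],
        columns=['Song', 'Artist'],
    )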

In [43]:
get_similar(5586)

,Song,Artist
0,Cemetery Gates,Pantera
1,Symphony Of Destruction,Megadeth
2,Mr. Brownstone,Guns N' Roses
3,Bad Company,Five Finger Death Punch
4,Fade To Black,Metallica


In [44]:
get_similar(1922)

,Song,Artist
0,Shout At The Devil,Motley Crue
1,Blackout,Scorpions
2,Lies Of The Beautiful People,Sixx A.M.
3,Symphony Of Destruction,Megadeth
4,The Last In Line,Dio
