### Embedding layers in PyTorch
- An embedding layer replaces each word index with a dense vector.
- It acts as a dimension expansion: every integer index becomes a vector of `embedding_dim` numbers.

### Simple embedding layer
- __num_embeddings__ - The size of the vocabulary, i.e. how many distinct categories you are encoding.
- __embedding_dim__ - How many numbers are in the vector returned for each index.


In [15]:
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.preprocessing import OneHotEncoder
from torch.nn.utils.rnn import pad_sequence

In [1]:
embedding_layer = nn.Embedding(num_embeddings=10, embedding_dim=4)
optimizer = torch.optim.Adam(embedding_layer.parameters(), lr=0.001)
loss_fn = nn.MSELoss()


In [2]:
print(embedding_layer)

Embedding(10, 4)


In [3]:
input_tensor = torch.tensor([[1,2]], dtype=torch.long)
pred = embedding_layer(input_tensor)

print(input_tensor.shape)
print(pred)
print(pred.shape)

torch.Size([1, 2])
tensor([[[-0.6789,  1.1378,  0.9993,  1.0034],
         [-0.9585,  1.9641,  0.8160,  1.4132]]], grad_fn=<EmbeddingBackward0>)
torch.Size([1, 2, 4])


In [4]:
embedding_layer.weight.data

tensor([[ 1.6318,  0.4854, -0.4328, -1.6026],
        [-0.6789,  1.1378,  0.9993,  1.0034],
        [-0.9585,  1.9641,  0.8160,  1.4132],
        [-0.1024,  1.0981,  0.1542,  1.0665],
        [ 1.0539,  1.2050, -1.1417, -0.2653],
        [ 0.7194,  0.4700, -0.2606, -0.7559],
        [-0.3073, -0.4720,  0.2014, -0.7850],
        [-1.0288,  0.5153, -0.7029, -0.8905],
        [-0.4648,  0.4477, -2.1117,  0.4215],
        [ 0.0648, -0.3432,  2.3571,  0.4730]])

- The values above are random initial weights. Note that the lookup for indices 1 and 2 returned rows 1 and 2 of this matrix. The following sections show how to train these weights into something meaningful.
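The lookup behaviour can be checked directly. The sketch below is a minimal illustration, not part of the original notebook: the seed and the variable names (`emb`, `rows`, `via_matmul`) are assumptions chosen for the example. It shows that calling the layer gives the same result as indexing the weight matrix, and the same result as multiplying one-hot vectors by that matrix.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # seed chosen only to make the sketch repeatable
emb = nn.Embedding(num_embeddings=10, embedding_dim=4)

idx = torch.tensor([[1, 2]], dtype=torch.long)
looked_up = emb(idx)                                # shape (1, 2, 4)

# The same rows, taken directly from the weight matrix.
rows = emb.weight[idx]                              # shape (1, 2, 4)

# The same result again, as one-hot vectors times the weight matrix.
one_hot = F.one_hot(idx, num_classes=10).float()    # shape (1, 2, 10)
via_matmul = one_hot @ emb.weight                   # shape (1, 2, 4)

print(torch.equal(looked_up, rows))
print(torch.allclose(looked_up, via_matmul))
```

Both checks should print `True`, which is why an embedding layer is often described as a one-hot encoding followed by a linear layer without bias.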

### Transferring an embedding

In [13]:
# embedding lookup matrix
embedding_lookup = torch.eye(3)

embedding_layer = nn.Embedding(num_embeddings=3, embedding_dim=3)

embedding_layer.weight.data = embedding_lookup

In [14]:
input_tensor = torch.tensor([[0,1]], dtype=torch.long)
pred = embedding_layer(input_tensor)
print(pred)

tensor([[[1., 0., 0.],
         [0., 1., 0.]]], grad_fn=<EmbeddingBackward0>)


- The output contains two rows of the one-hot lookup table. They are the correct one-hot encodings of the indices 0 and 1 when there are 3 possible values.
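As an alternative to assigning `weight.data`, PyTorch also provides `nn.Embedding.from_pretrained`. The sketch below assumes the transferred weights should stay fixed, so it passes `freeze=True`; the names `lookup` and `frozen` are illustrative only.

```python
import torch
import torch.nn as nn

lookup = torch.eye(3)  # one-hot table, as in the cell above

# freeze=True keeps the transferred weights fixed during training.
frozen = nn.Embedding.from_pretrained(lookup, freeze=True)

out = frozen(torch.tensor([[0, 1]], dtype=torch.long))
print(out)
print(frozen.weight.requires_grad)
```

With `freeze=False` the same table would serve as a starting point that training can still adjust.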

### Training an embedding

In [16]:
reviews = [
    'Never coming back!',
    'Horrible service',
    'Rude waitress',
    'Cold food.',
    'Horrible food!',
    'Awesome',
    'Awesome service!',
    'Rocks!',
    'poor work',
    'Couldn\'t have done better']

# Define labels (1=negative, 0=positive)
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

In [18]:
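VOCAB_SIZE = 50
# Hash each word into one of VOCAB_SIZE buckets. Python salts str hashes per process
# (unless PYTHONHASHSEED is set), so the indices below change between runs.
encoded_reviews = [torch.tensor([hash(word) % VOCAB_SIZE for word in review.split()]) for review in reviews]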

print(encoded_reviews)

[tensor([ 2, 18,  8]), tensor([44, 19]), tensor([44, 34]), tensor([15, 37]), tensor([44, 27]), tensor([11]), tensor([11, 12]), tensor([35]), tensor([4, 7]), tensor([32,  6, 41, 26])]


In [21]:
print(hash('Never'))
print(hash('Never') % 50)

8091158445156410952
2
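Because the built-in `hash()` gives different values for strings in different Python processes, the indices above are not reproducible across runs. Also, `hash(word) % VOCAB_SIZE` can return 0, which is the value used for padding later on. The sketch below is one possible workaround, not the notebook's method: `stable_index` is a hypothetical helper that uses `hashlib.md5` for a deterministic hash and maps words to `1..VOCAB_SIZE-1` so that 0 stays free for padding.

```python
import hashlib

VOCAB_SIZE = 50

def stable_index(word: str, vocab_size: int = VOCAB_SIZE) -> int:
    """Map a word to an index in 1..vocab_size-1; 0 is left free for padding."""
    digest = hashlib.md5(word.encode("utf-8")).hexdigest()
    return int(digest, 16) % (vocab_size - 1) + 1

encoded = [stable_index(w) for w in "Never coming back!".split()]
print(encoded)
```

Different words can still collide in the same bucket, exactly as with the original hashing trick; only the run-to-run variation is removed.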


### Padding
- The reviews have different lengths, so pad them to 4 words and truncate any words beyond the fourth.

In [19]:
MAX_LENGTH = 4
padded_reviews = pad_sequence(encoded_reviews, batch_first=True, padding_value=0).narrow(1,0,MAX_LENGTH)
print(padded_reviews)

tensor([[ 2, 18,  8,  0],
        [44, 19,  0,  0],
        [44, 34,  0,  0],
        [15, 37,  0,  0],
        [44, 27,  0,  0],
        [11,  0,  0,  0],
        [11, 12,  0,  0],
        [35,  0,  0,  0],
        [ 4,  7,  0,  0],
        [32,  6, 41, 26]])
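`pad_sequence` pads only up to the longest review in the batch, so `.narrow(1, 0, MAX_LENGTH)` works here because the longest review has exactly 4 words; it would raise an error if every review were shorter. The sketch below is a hedged variant that handles both cases. `pad_or_truncate` is a hypothetical helper name, not something from the notebook or from PyTorch.

```python
import torch
import torch.nn.functional as F
from torch.nn.utils.rnn import pad_sequence

def pad_or_truncate(seqs, max_length, padding_value=0):
    """Return a (batch, max_length) tensor, padding short rows and cutting long ones."""
    padded = pad_sequence(seqs, batch_first=True, padding_value=padding_value)
    padded = padded[:, :max_length]              # truncate anything past max_length
    missing = max_length - padded.shape[1]
    if missing > 0:                              # batch was shorter than max_length
        padded = F.pad(padded, (0, missing), value=padding_value)
    return padded

short_batch = [torch.tensor([2, 18, 8]), torch.tensor([44, 19])]
long_batch = [torch.tensor([1, 2, 3, 4, 5, 6]), torch.tensor([7])]

print(pad_or_truncate(short_batch, 4))
print(pad_or_truncate(long_batch, 4))
```

The first call pads a batch whose longest row has 3 items up to 4 columns; the second cuts a 6-item row down to 4.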


### Model

In [26]:
model = nn.Sequential(
    nn.Embedding(VOCAB_SIZE, 8),
    nn.Flatten(),
    nn.Linear(8 * MAX_LENGTH, 1),
    nn.Sigmoid()
)

criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters())

epochs = 100
for epoch in range(epochs):
    optimizer.zero_grad()
    outputs = model(padded_reviews.long())
    loss = criterion(outputs.squeeze(), torch.tensor(labels, dtype=torch.float))
    loss.backward()
    optimizer.step()

In [27]:
# Evaluation
with torch.no_grad():
    outputs = model(padded_reviews.long())
    predictions = (outputs > 0.5).float().squeeze()
    accuracy = (predictions == torch.tensor(labels)).float().mean().item()
    loss_value = criterion(outputs.squeeze(), torch.tensor(labels, dtype=torch.float)).item()

print(f'Accuracy: {accuracy}')
print(f'Log-loss: {loss_value}')

Accuracy: 1.0
Log-loss: 0.3765341341495514
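In the model above, the padding value 0 is embedded like any other word, so the padding vector is trained too. One option, shown here as a sketch under stated assumptions rather than as the notebook's approach, is to pass `padding_idx=0` to `nn.Embedding`, which keeps row 0 at zero and excludes it from gradient updates. The input tensor is copied from the padded output printed earlier; the seed and the `losses` list are additions for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable
VOCAB_SIZE, MAX_LENGTH = 50, 4

# The padded indices and labels printed earlier in this notebook.
padded_reviews = torch.tensor([
    [2, 18, 8, 0], [44, 19, 0, 0], [44, 34, 0, 0], [15, 37, 0, 0],
    [44, 27, 0, 0], [11, 0, 0, 0], [11, 12, 0, 0], [35, 0, 0, 0],
    [4, 7, 0, 0], [32, 6, 41, 26]])
labels = torch.tensor([1, 1, 1, 1, 1, 0, 0, 0, 0, 0], dtype=torch.float)

model = nn.Sequential(
    nn.Embedding(VOCAB_SIZE, 8, padding_idx=0),  # row 0 starts at zero and gets no gradient
    nn.Flatten(),
    nn.Linear(8 * MAX_LENGTH, 1),
    nn.Sigmoid(),
)
criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters())

losses = []
for epoch in range(100):
    optimizer.zero_grad()
    loss = criterion(model(padded_reviews).squeeze(1), labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(losses[0], losses[-1])   # first and last training loss
print(model[0].weight[0])      # the padding row
```

Recording the loss per epoch also makes it easy to confirm that training is actually reducing the loss, which the single final number above does not show.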


## Prompt engineering

#### Important things:
- When asking an LLM a question, keep in mind framing, clarity, and precision.
- Clarity
- Specificity
- Iteration