<a href="https://colab.research.google.com/github/kartoone/cosc470/blob/main/examples/llm/tf-keras-embeddings.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup

In [1]:
import keras
from keras import layers
from keras import ops

We create a small Sequential model incrementally via the `add()` method: a 5-word one-hot input, a 2-unit linear layer that holds the embedding, and a 5-way softmax output.

In [2]:
model = keras.Sequential()
model.add(keras.Input(shape=(5,)))  # 5 words in our vocab
model.add(layers.Dense(2, activation="linear")) # two numbers to represent each word
model.add(layers.Dense(5, activation="softmax")) # 5 words in our vocab
model.summary()

Note that the `Input` object is not displayed as part of `model.layers`, since
it isn't a layer. Next, we compile the model with categorical cross-entropy loss and the Adam optimizer:

In [3]:
model.compile(loss="categorical_crossentropy", optimizer="adam")

In [4]:
import numpy as np

weight_embedding_model = keras.Model(
    inputs=model.inputs,
    outputs=[layer.output for layer in model.layers],
)

# Call feature extractor on test input.
x = np.array([[0., 0., 0., 1., 0.]])
print(x)
features = weight_embedding_model(x)
print(features)



[[0. 0. 0. 1. 0.]]


Expected: ['keras_tensor']
Received: inputs=Tensor(shape=(1, 5))


[<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[ 0.81739306, -0.40693468]], dtype=float32)>, <tf.Tensor: shape=(1, 5), dtype=float32, numpy=
array([[0.23195596, 0.07277341, 0.2695857 , 0.16057797, 0.26510698]],
      dtype=float32)>]
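The first tensor in that output is the 2-number embedding for the word at index 3. Because the input is one-hot and the first `Dense` layer is linear, its output is just one row of the layer's weight matrix (plus the bias). Here is a minimal NumPy sketch of that idea. The `kernel` and `bias` values are made up for illustration and are not the notebook's trained weights.

```python
import numpy as np

# Hypothetical weights for illustration only; the notebook's real values differ.
kernel = np.array([
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
    [0.7, 0.8],
    [0.9, 1.0],
])  # shape (5, 2): one row per vocab word
bias = np.zeros(2)  # assumed zero here to keep the example simple

one_hot = np.array([[0., 0., 0., 1., 0.]])  # same test input as above
embedding = one_hot @ kernel + bias         # what a linear Dense layer computes

# The one-hot input picks out row 3 of the kernel.
print(embedding)
print(kernel[3])
```

So "looking up" a word's embedding is the same as multiplying its one-hot vector by the weight matrix.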


Now we need to train. Each training example pairs an input word with the word we expect to come next. Here a "word" means its one-hot encoding over our vocab. From the video, here is the vocab:

In [5]:
vocab = ["Troll 2","gymkata","is","great","<EOS>"]
vocabonehot = [[1,0,0,0,0],[0,1,0,0,0],[0,0,1,0,0],[0,0,0,1,0],[0,0,0,0,1]]

# based on just two sentences: "Troll 2 is great" and "Gymkata is great", here is our training dataset
training_inputs = np.array([vocabonehot[0],vocabonehot[1],vocabonehot[2],vocabonehot[2],vocabonehot[2],vocabonehot[3]])
training_outputs = np.array([vocabonehot[2],vocabonehot[2],vocabonehot[3],vocabonehot[3],vocabonehot[3],vocabonehot[4]])
model.fit(training_inputs, training_outputs, batch_size=2, epochs=1000);
features = weight_embedding_model(np.array([vocabonehot[0], vocabonehot[1], vocabonehot[2], vocabonehot[3], vocabonehot[4]]))
print(features)


Epoch 1/1000
3/3 ━━━━━━━━━━━━━━━━━━━━ 2s 13ms/step - loss: 1.6295
Epoch 2/1000
3/3 ━━━━━━━━━━━━━━━━━━━━ 1s 11ms/step - loss: 1.6210
Epoch 3/1000
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - loss: 1.4634
Epoch 4/1000
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - loss: 1.6042
Epoch 5/1000
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - loss: 1.5671
Epoch 6/1000
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - loss: 1.5878
Epoch 7/1000
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - loss: 1.4840
Epoch 8/1000
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - loss: 1.4356
Epoch 9/1000
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - loss: 1.4397
Epoch 10/1000
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - loss: 1.539

Expected: ['keras_tensor']
Received: inputs=Tensor(shape=(5, 5))
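
The training arrays above were written out by hand. As a rough sketch, the same kind of (word, next word) pairs could be generated from token-index sequences with `np.eye`. The `sentences` list and the `pairs` helper are assumptions for illustration, not part of the original notebook. This version lists every adjacent pair exactly once per sentence, so the pair counts differ slightly from the hand-written arrays.

```python
import numpy as np

vocab = ["Troll 2", "gymkata", "is", "great", "<EOS>"]
one_hot = np.eye(len(vocab))  # row i is the one-hot vector for vocab[i]

# Token-index sequences for the two sentences, each ending in <EOS>.
sentences = [
    [0, 2, 3, 4],  # Troll 2 is great <EOS>
    [1, 2, 3, 4],  # gymkata is great <EOS>
]

# Pair every token with the token that follows it.
pairs = [(s[i], s[i + 1]) for s in sentences for i in range(len(s) - 1)]

inputs = one_hot[[a for a, _ in pairs]]   # one-hot rows for the current words
targets = one_hot[[b for _, b in pairs]]  # one-hot rows for the next words
print(pairs)
print(inputs.shape, targets.shape)
```

Arrays built this way have the same layout as `training_inputs` and `training_outputs`, so they could be passed to `model.fit` in the same manner.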


## Positional encoding

We can add the positional information manually (first code cell), or use a KerasHub layer that takes care of all the details (second code cell).


In [24]:
import numpy as np
seq_length = 10 # maximum sequence length, i.e., we have a tiny vocabulary and even tinier sentences
num_hiddens = 2 # we are using two weights in our token embedding
sinx = np.sin(np.arange(1, seq_length))
cosx = np.cos(np.arange(1, seq_length))
#print(sinx)
#print(cosx)

ourembeddings = features[0]
#print(ourembeddings)

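# example sentence1: Troll 2 is great ... equivalent to [0 2 3]
# example sentence2: Gymkata is great ... equivalent to [1 2 3]
# example sentence3: great is Troll 2 ... equivalent to [3 2 0]
# example sentence4: great is Gymkata ... equivalent to [3 2 1]
# example sentence5: great is great ... equivalent to [3 2 3]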
sentence1 = np.array([ourembeddings[0], ourembeddings[2], ourembeddings[3]])
sentence2 = np.array([ourembeddings[1], ourembeddings[2], ourembeddings[3]])
sentence3 = np.array([ourembeddings[3], ourembeddings[2], ourembeddings[0]])
sentence4 = np.array([ourembeddings[3], ourembeddings[2], ourembeddings[1]])
sentence5 = np.array([ourembeddings[3], ourembeddings[2], ourembeddings[3]])
#print(sentence)

positionsvals = np.array([[sinx[0], cosx[0]], [sinx[1], cosx[1]], [sinx[2], cosx[2]]])
#print(positionsvals)

sentencewithpositions1 = sentence1 + positionsvals
sentencewithpositions2 = sentence2 + positionsvals
sentencewithpositions3 = sentence3 + positionsvals
sentencewithpositions4 = sentence4 + positionsvals
sentencewithpositions5 = sentence5 + positionsvals
print(sentencewithpositions1)
print(sentencewithpositions2)
print(sentencewithpositions3)
print(sentencewithpositions4)
print(sentencewithpositions5)

[[ 2.07737115 -1.84229347]
 [-0.00891302  1.64994041]
 [ 3.49662071 -0.85411076]]
[[ 1.11712593 -1.34437439]
 [-0.00891302  1.64994041]
 [ 3.49662071 -0.85411076]]
[[ 4.19697168  0.67618404]
 [-0.00891302  1.64994041]
 [ 1.37702017 -3.37258827]]
[[ 4.19697168  0.67618404]
 [-0.00891302  1.64994041]
 [ 0.41677495 -2.87466919]]
[[ 4.19697168  0.67618404]
 [-0.00891302  1.64994041]
 [ 3.49662071 -0.85411076]]


In [None]:
import keras_hub

layer = keras_hub.layers.PositionEmbedding(sequence_length=10)
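
The manual cell above adds `sin(pos)` and `cos(pos)` directly, which works because our embedding has only two numbers. For wider embeddings, the usual sinusoidal scheme from the original Transformer paper gives each pair of dimensions its own frequency. Here is a minimal NumPy sketch of that scheme. The function name `sinusoidal_positions` is made up for this example, and positions start at 0 here, whereas the manual cell starts at 1. Also note that, as far as we can tell from the KerasHub docs, `PositionEmbedding` learns its position vectors during training rather than computing fixed sinusoids.

```python
import numpy as np

def sinusoidal_positions(seq_length, d_model):
    """Return a (seq_length, d_model) array of sine/cosine position values."""
    positions = np.arange(seq_length)[:, np.newaxis]  # shape (seq_length, 1)
    dims = np.arange(d_model)[np.newaxis, :]          # shape (1, d_model)
    # Each pair of dimensions (2i, 2i+1) shares one frequency.
    angles = positions / np.power(10000.0, (2 * (dims // 2)) / d_model)
    encoding = np.zeros((seq_length, d_model))
    encoding[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions use sine
    encoding[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions use cosine
    return encoding

pe = sinusoidal_positions(seq_length=10, d_model=2)
print(pe[:3])
```

With `d_model=2` there is only one frequency, so each row reduces to `[sin(pos), cos(pos)]`, the same pair of values the manual cell builds by hand.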
