<a href="https://colab.research.google.com/github/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_11_05_embedding.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Amended feeding for DKT: Applications of Deep Neural Networks

**Module 1: Embedding layer for Deep Knowledge Tracing**
* Researcher: [Jose Naranjo](https://www.linkedin.com/in/jolunavi1/)

# Google CoLab Instructions

The following code ensures that Google CoLab is running the correct version of TensorFlow.

In [1]:
try:
    %tensorflow_version 2.x
    COLAB = True
    print("Note: using Google CoLab")
except:
    print("Note: not using Google CoLab")
    COLAB = False

Note: not using Google CoLab


# What are Embedding Layers in Keras

[Embedding Layers](https://keras.io/layers/embeddings/) are a handy feature of Keras that allows the program to automatically insert additional information into the data flow of your neural network. Word2Vec can expand words to any dimension vector, for example 300 dimensions. An embedding layer would automatically allow you to insert these 300-dimension vectors in the place of word indexes.  

Programmers often use embedding layers with Natural Language Processing (NLP); however, you can use these layers whenever you wish to replace an index value with a longer vector. In some ways, you can think of an embedding layer as dimension expansion or reduction. The hope is that these additional or reduced dimensions provide more information to the model and lead to a better score.

## Simple Embedding Layer Example

* **input_dim** = How large is the vocabulary?  How many categories are you encoding? This parameter is the number of items in your "lookup table."
* **output_dim** = How many numbers are in the vector you wish to return? This is the number of dimensions of the embedding.
* **input_length** = How many items are in the input feature vector that you need to transform?

Now we create a neural network with a vocabulary size of 10, which maps each value from 0 to 9 to a vector of 4 numbers. This neural network does nothing more than pass the embedding on to the output, but it does let us see what the embedding is doing. Each feature vector coming in will have two such values.

In [1]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding
import numpy as np

model = Sequential()
embedding_layer = Embedding(input_dim=10, output_dim=4, input_length=2)
model.add(embedding_layer)
model.compile('adam', 'mse')




Let's take a look at the structure of this neural network to see what is happening inside it.

In [2]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 2, 4)              40        
                                                                 
Total params: 40
Trainable params: 40
Non-trainable params: 0
_________________________________________________________________


For this neural network, which is just an embedding layer, the input is a vector of size 2. These two inputs are integer numbers from 0 to 9 (corresponding to the requested input_dim quantity of 10 values). Looking at the summary above, we see that the embedding layer has 40 parameters. This value comes from the embedded lookup table that contains four amounts (output_dim) for each of the 10 (input_dim) possible integer values for the two inputs. The output is 2 (input_length) length 4 (output_dim) vectors, resulting in a total output size of 8, which corresponds to the Output Shape given in the summary above.
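The figures in the summary can be checked by hand. The following sketch only repeats the arithmetic described above; the variable names are illustrative and nothing here calls Keras.

```python
# Values taken from the Embedding(...) call above.
input_dim = 10     # rows in the lookup table
output_dim = 4     # length of each embedded vector
input_length = 2   # integers per input row

# One trainable number per cell of the lookup table.
embedding_params = input_dim * output_dim

# Each of the input_length integers becomes an output_dim-length vector.
output_shape = (input_length, output_dim)
output_size = input_length * output_dim

print(embedding_params)  # 40
print(output_shape)      # (2, 4)
print(output_size)       # 8
```

Note that input_length does not affect the parameter count: both input positions share the same lookup table.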

Now, let us query the neural network with a single row. The row contains two integer values, as was specified when we created the neural network.

In [11]:
input_data = np.array([
    [1, 2]
])

pred = model.predict(input_data)

print(input_data.shape)
print(pred)


(1, 2)
[[[ 0.020257   -0.00039854  0.00420161  0.04803732]
  [-0.03823198 -0.02339444 -0.04591949  0.02298042]]]


Here we see the two length-4 vectors that Keras looked up for the two input integers. Recall that Python arrays are zero-based. Keras replaced the value of 1 with the second row of the 10 x 4 lookup matrix. Similarly, Keras replaced the value of 2 with the third row of the lookup matrix. The following code displays the lookup matrix in its entirety. The embedding layer performs no mathematical operations other than inserting the correct row from the lookup table.

In [12]:
embedding_layer.get_weights()

[array([[ 0.00237119,  0.03304969, -0.04085678,  0.03544824],
        [ 0.020257  , -0.00039854,  0.00420161,  0.04803732],
        [-0.03823198, -0.02339444, -0.04591949,  0.02298042],
        [-0.00975728, -0.02111838, -0.0238485 , -0.01319782],
        [ 0.0291988 ,  0.03084047, -0.00662228, -0.00288951],
        [-0.03544483,  0.04003379, -0.04505651,  0.00737227],
        [-0.02297102,  0.02879001,  0.02823186,  0.0113917 ],
        [-0.04229281, -0.01038597,  0.02475606, -0.00812998],
        [-0.02630647, -0.00332817,  0.03093815, -0.03831846],
        [ 0.01070474,  0.02937539, -0.02465727,  0.02722931]],
       dtype=float32)]

The values above are random parameters that Keras generated as starting points.  Generally, we will transfer an embedding or train these random values into something useful.  The following section demonstrates how to embed a hand-coded embedding. 

## Transferring An Embedding

Now, we see how to hard-code an embedding lookup that performs a simple one-hot encoding.  One-hot encoding would transform the input integer values of 0, 1, and 2 to the vectors $[1,0,0]$, $[0,1,0]$, and $[0,0,1]$ respectively. The following code replaces the random lookup values in the embedding layer with this one-hot coding-inspired lookup table.

In [13]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding
import numpy as np

embedding_lookup = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1]
])

model = Sequential()
embedding_layer = Embedding(input_dim=3, output_dim=3, input_length=2)
model.add(embedding_layer)
model.compile('adam', 'mse')

embedding_layer.set_weights([embedding_lookup])


We have the following parameters for the Embedding layer:
    
* input_dim=3 - There are three different integer categorical values allowed.
* output_dim=3 - Three columns represent a categorical value with three possible values per one-hot encoding.
* input_length=2 - The input vector has two of these categorical values.

We query the neural network with two categorical values to see the lookup performed.

In [None]:
input_data = np.array([
    [0, 1]
])

pred = model.predict(input_data)

print(input_data.shape)
print(pred)


(1, 2)
[[[1. 0. 0.]
  [0. 1. 0.]]]


The given output shows that the program returned two rows from the one-hot encoding table. This encoding is a correct one-hot encoding for the values 0 and 1, where there are up to 3 unique values possible.
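Because the layer only selects rows, the same result can be reproduced without Keras. The sketch below indexes the lookup table directly with NumPy; it is meant as an illustration of the lookup, not as a description of how Keras implements the layer internally.

```python
import numpy as np

# The same hand-coded table that was passed to set_weights above.
embedding_lookup = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1]
], dtype=np.float32)

input_data = np.array([
    [0, 1]
])

# Integer-array indexing picks one table row per input integer.
looked_up = embedding_lookup[input_data]

print(looked_up.shape)  # (1, 2, 3)
print(looked_up)
```

The result has one row of the table for each integer in the input, which is the shape (batch, input_length, output_dim) that the embedding layer reports.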

The following section demonstrates how to train this embedding lookup table.

## Training an Embedding

First, we make use of the following imports.

In [26]:
from numpy import array
from tensorflow.keras.preprocessing.text import one_hot
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Embedding, Dense

We create a neural network that classifies restaurant reviews according to positive or negative.  This neural network can accept strings as input, such as given here.  This code also includes positive or negative labels for each review.

In [27]:
# Define 10 restaurant reviews.
reviews = [
    'Never coming back!',
    'Horrible service',
    'Rude waitress',
    'Cold food.',
    'Horrible food!',
    'Awesome',
    'Awesome service!',
    'Rocks!',
    'poor work',
    'Couldn\'t have done better']

# Define labels (1=negative, 0=positive)
labels = array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])


In [28]:
labels

array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])


In [76]:
#####jolunavi
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [129]:
#####jolunavi
df = pd.read_csv('preguntas_estudent_1261.csv')#,index_col='Pregunta')
df.index.freq='MS'

In [130]:
preguntas_pre = df['Question'].to_numpy()
respuestas = df['Answer'].to_numpy()

In [143]:
d_codigos = {}  # Maps each original question ID to a compact ID, so we can find the question in the original data set
preguntas = []  # The questions the student answered, re-numbered with the compact IDs
new_id = 1
for item in preguntas_pre:
    if item not in d_codigos:
        d_codigos[item] = new_id  # assign the next unused compact ID
        new_id += 1
    preguntas.append(d_codigos[item])
    

In [152]:
len(preguntas), len(respuestas)

(1261, 1261)

In [159]:
diff_questions = len(set(preguntas))
print(" The total number of different questions is:", diff_questions)


 The total number of different questions is: 71
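The re-numbering step can be tried on a small list. This is a minimal sketch with made-up question IDs; `raw_ids`, `code_of`, `compact`, and `raw_of` are hypothetical names that play the roles of `preguntas_pre`, `d_codigos`, and `preguntas`.

```python
# Hypothetical raw question IDs, standing in for df['Question'].
raw_ids = [22, 22, 20, 21, 20, 33]

code_of = {}   # raw ID -> compact ID
compact = []   # the sequence re-numbered with compact IDs
for raw in raw_ids:
    if raw not in code_of:
        # Compact IDs start at 1 and follow the order of first appearance.
        code_of[raw] = len(code_of) + 1
    compact.append(code_of[raw])

# Inverse map, to get back from a compact ID to the raw ID.
raw_of = {new: raw for raw, new in code_of.items()}

print(compact)    # [1, 1, 2, 3, 2, 4]
print(raw_of[3])  # 21
```

Since the compact IDs start at 1, the largest ID equals the number of distinct questions. An embedding layer indexed by these IDs therefore needs an input_dim of at least that number plus one.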


In [162]:
#####jolunavi

skills = pd.read_csv('data/assist2009_updated/assist2009_updated_skill_mapping.txt', delimiter = "\t", names=["ID", "Skill"] )#,index_col='Pregunta')
skills

Unnamed: 0,ID,Skill
0,31,Circumference
1,65,Scientific Notation
2,38,Rounding
3,19,Multiplication Fractions
4,106,Finding Slope From Situation
...,...,...
105,18,Addition and Subtraction Positive Decimals
106,78,Rate
107,54,Interior Angles Triangle
108,70,Equation Solving More Than Two Steps


In [163]:
sk_dic = skills.sort_values(by=["ID"])
sk_dic
sk_dictionary = sk_dic.set_index('ID')['Skill'].to_dict()

In [166]:
sk_dictionary

{1: 'Area Trapezoid',
 2: 'Area Irregular Figure',
 3: 'Probability of Two Distinct Events',
 4: 'Table',
 5: 'Median',
 6: 'Stem and Leaf Plot',
 7: 'Mode',
 8: 'Mean',
 9: 'Range',
 10: 'Venn Diagram',
 11: 'Histogram as Table or Graph',
 12: 'Circle Graph',
 13: 'Equivalent Fractions',
 14: 'Proportion',
 15: 'Fraction Of',
 16: 'Probability of a Single Event',
 17: 'Scatter Plot',
 18: 'Addition and Subtraction Positive Decimals',
 19: 'Multiplication Fractions',
 20: 'Addition and Subtraction Integers',
 21: 'Multiplication and Division Integers',
 22: 'Addition Whole Numbers',
 23: 'Absolute Value',
 24: 'Addition and Subtraction Fractions',
 25: 'Subtraction Whole Numbers',
 26: 'Equation Solving Two or Fewer Steps',
 27: 'Order of Operations +,-,/,* () positive reals',
 28: 'Calculations with Similar Figures',
 29: 'Counting Methods',
 30: 'Ordering Fractions',
 31: 'Circumference ',
 32: 'Box and Whisker',
 33: 'Ordering Integers',
 34: 'Conversion of Fraction Decimals Percent

In [168]:
d_skills = {}
for item  in d_codigos.keys():
    d_skills[d_codigos[item]] = sk_dictionary[item]
d_skills

{1: 'Addition Whole Numbers',
 2: 'Addition and Subtraction Integers',
 3: 'Multiplication and Division Integers',
 4: 'Addition and Subtraction Positive Decimals',
 5: 'Ordering Integers',
 6: 'Ordering Positive Decimals',
 7: 'Ordering Fractions',
 8: 'Least Common Multiple',
 9: 'Addition and Subtraction Fractions',
 10: 'Multiplication Fractions',
 11: 'Division Fractions',
 12: 'Order of Operations +,-,/,* () positive reals',
 13: 'Order of Operations All',
 14: 'Exponents',
 15: 'Conversion of Fraction Decimals Percents',
 16: 'Proportion',
 17: 'Equation Solving Two or Fewer Steps',
 18: 'Solving for a variable',
 19: 'Pattern Finding ',
 20: 'Subtraction Whole Numbers',
 21: 'Write Linear Equation from Situation',
 22: 'Area Circle',
 23: 'Area Trapezoid',
 24: 'Unit Conversion Within a System',
 25: 'Perimeter of a Polygon',
 26: 'Finding Percents',
 27: 'Calculations with Similar Figures',
 28: 'Circumference ',
 29: 'Percent Of',
 30: 'Unit Rate',
 31: 'Equation Solving More

In [44]:
#####jolunavi
train = preguntas[:1020]
test = preguntas[1020:]
train_y = respuestas[20:1020]

In [45]:
len(train_y)

1000

In [46]:
VOCAB_SIZE = len(set(preguntas))
VOCAB_SIZE

67

In [47]:
from keras.preprocessing.sequence import TimeseriesGenerator

In [48]:
# Define the generator; consider the previous 20 answers (the number of timesteps)
n_input = 20
n_features = 1 # in this particular case is 2; question and answer
generator = TimeseriesGenerator(train, train, length=n_input, batch_size=1000)

In [49]:
h = 0
X,y = generator[h]
yy = train_y[h]
print(f'Given the Array: \n{X.flatten()}')
print(f'Predict this y: \n {y}')
print(f'Predict this y: \n {yy}')

Given the Array: 
[22 22 22 ... 29  3 29]
Predict this y: 
 [20 20 21 21 20 20 20 20 20 20 21 21 18 33 33 37 37 37 30 30 37 30 69 69
 69 69 69 69 69 69 69 69 69 24 24 69 69 69 69 69 69 24 24 24 24 19 19 19
 19 19 19 19 19 60 59 59 59 59 27 19 34 34 34 34 34 34 34 41 41 41 35 21
 14 14 14 14 14 14 36 36 36 36 26 26 26 26 26 26 26 26 26 26 26 26 70 70
 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70
 70 70 70 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26
 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 26 70 70 70 70 70
 70 70 70 70 70 70 70 70 70 70 26 26 14 14 14 14 14 14 14 26 26 26 26 26
 26 26 26 26 26 70 70 70 70 70 70 70 70 70 70 70 70 70 70 70 26 26 26 26
 26 26 26 26 26 26 58 58 26 42 42 43 26 43 43 43 43 43 66 66 66 66 66 66
 66 66 66 66 66 66 66 66 68 68 68 68 68 68  1  1  1  1  1  1  1  1  1  1
  1  1  1  2  2  1 82 31 80 18 80 18 80 18 80 18 80 18 57 28 28 28 28  1
  1  1  1  1  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2 81 

In [50]:

A = X.astype(int)
A = A.astype(str)
A

array([['22', '22', '22', ..., '20', '20', '20'],
       ['22', '22', '22', ..., '20', '20', '20'],
       ['22', '22', '22', ..., '20', '20', '20'],
       ...,
       ['50', '50', '50', ..., '29', '3', '29'],
       ['50', '50', '29', ..., '3', '29', '3'],
       ['50', '29', '3', ..., '29', '3', '29']], dtype='<U21')

In [51]:
# number of samples
print('Samples: %d' % len(generator))

Samples: 1
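With `length=n_input`, the generator pairs each window of `n_input` consecutive values with the value that follows it. The sketch below reproduces that windowing with NumPy on a short made-up sequence; `sliding_windows` is a hypothetical helper, not part of Keras.

```python
import numpy as np

def sliding_windows(series, length):
    """Pair every window of `length` consecutive items with the item that follows it."""
    series = np.asarray(series)
    X = np.array([series[i:i + length] for i in range(len(series) - length)])
    y = series[length:]
    return X, y

seq = [5, 5, 7, 8, 8, 9]   # hypothetical question IDs
X, y = sliding_windows(seq, length=3)

print(X)  # rows: [5 5 7], [5 7 8], [7 8 8]
print(y)  # [8 8 9]
```

A sequence of 1020 items with a window of 20 gives 1000 windows, which matches the length of `train_y = respuestas[20:1020]`. The value printed by `len(generator)` counts batches rather than windows, so a batch size of 1000 yields 1.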


Notice that the second-to-last label in the restaurant-review labels is incorrect: 'poor work' is labeled 0 (positive).  Errors such as this are not out of the ordinary, as most training data contains some noise.

We define a vocabulary size of 50 words.  Though we do not have 50 words, it is okay to use a value larger than needed.  The Keras `one_hot` function assigns indexes by hashing each word, so the mapping from word to index is not guaranteed to be unique: if the vocabulary size is too small, different words can collide and share an index.  For input, we one-hot encode the strings.  We use the TensorFlow one-hot encoding method here rather than Scikit-Learn. Scikit-learn would expand these strings to the 0's and 1's as we would typically see for dummy variables.  TensorFlow translates all words to index values and replaces each word with that index.
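The idea of turning words into index values by hashing can be sketched in a few lines. This is only an illustration under stated assumptions: `hashed_indexes` is a hypothetical helper, it uses MD5 so the result is repeatable, and it splits on whitespace without filtering punctuation, so its indexes will not match those returned by Keras.

```python
import hashlib

def hashed_indexes(text, n):
    """Map each word to an index in the range 1..n-1 by hashing it."""
    words = text.lower().split()
    return [
        int(hashlib.md5(w.encode("utf-8")).hexdigest(), 16) % (n - 1) + 1
        for w in words
    ]

idx = hashed_indexes("Horrible service horrible food", 50)
print(idx)
```

The same word always receives the same index, and two different words may receive the same index when `n` is small. Index 0 is never produced here, which leaves it free for padding.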

In [52]:
a = [review.lower() for review in reviews]

In [53]:
A = X.astype(int)
reviews = A.astype(str)
reviews


array([['22', '22', '22', ..., '20', '20', '20'],
       ['22', '22', '22', ..., '20', '20', '20'],
       ['22', '22', '22', ..., '20', '20', '20'],
       ...,
       ['50', '50', '50', ..., '29', '3', '29'],
       ['50', '50', '29', ..., '3', '29', '3'],
       ['50', '29', '3', ..., '29', '3', '29']], dtype='<U21')

In [54]:
# Each row of `reviews` is an array of ID strings; join it into one string, because one_hot expects text.
encoded_reviews = [one_hot(' '.join(d), VOCAB_SIZE) for d in reviews]
print(f"Encoded reviews: {encoded_reviews}")

The program one-hot encodes these sequences to indexes; however, their lengths can differ.  We pad the sequences to 20 entries and truncate any sequence longer than 20 entries.

In [55]:
X.shape

(1000, 20)

In [65]:
#MAX_LENGTH = 20

padded_reviews = pad_sequences(X, maxlen=20, padding='post', value=-1)
print(padded_reviews)


[[22 22 22 ... 20 20 20]
 [22 22 22 ... 20 20 20]
 [22 22 22 ... 20 20 20]
 ...
 [50 50 50 ... 29  3 29]
 [50 50 29 ...  3 29  3]
 [50 29  3 ... 29  3 29]]


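As specified by the **padding=post** setting, each sequence shorter than 20 entries is padded at the end with the **value** argument (here -1 rather than the default 0). Every row of `X` already has 20 entries, so no padding is added in this case.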
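Post-padding can be illustrated in plain Python. `pad_post` below is a hypothetical helper, not the Keras function; it assumes that over-long sequences are cut from the front, which is our understanding of the default `truncating='pre'` behavior of `pad_sequences`.

```python
def pad_post(sequences, maxlen, value=0):
    """Pad each sequence at the end up to maxlen; keep the last maxlen items if it is longer."""
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]   # assumption: truncate from the front
        padded.append(seq + [value] * (maxlen - len(seq)))
    return padded

print(pad_post([[3, 7], [1, 2, 3, 4, 5]], maxlen=4))
# [[3, 7, 0, 0], [2, 3, 4, 5]]
```

A sequence that already has exactly `maxlen` items is returned unchanged, which is the situation for every row of `X` here.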

Next, we create a neural network to learn to classify these reviews. 

In [57]:
model = Sequential()
embedding_layer = Embedding(VOCAB_SIZE, 3, input_length=20)
model.add(embedding_layer)
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['acc'])

print(model.summary())




Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 20, 3)             201       
                                                                 
 flatten (Flatten)           (None, 60)                0         
                                                                 
 dense (Dense)               (None, 1)                 61        
                                                                 
Total params: 262
Trainable params: 262
Non-trainable params: 0
_________________________________________________________________
None


This network accepts 20 integer inputs that specify the indexes of a padded sequence. The first embedding layer converts these 20 indexes into 20 vectors of length 3. These vectors come from the lookup table that contains 67 (VOCAB_SIZE) rows of vectors of length 3. This encoding is evident by the 201 (3 times 67) parameters in the embedding layer. The output size from the embedding layer is 60 (20 entries expressed as 3-number embedded vectors). A single output neuron is connected to the embedding layer by 61 weights (60 from the embedding layer and a single bias). Because this is a binary classification network, we use the sigmoid activation function and binary_crossentropy.
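The parameter counts reported in the model summary above can be checked by hand. The sketch below only repeats that arithmetic with illustrative variable names; it does not build the model.

```python
vocab_size = 67       # rows in the lookup table (VOCAB_SIZE in the summary run)
embedding_dim = 3     # length of each embedded vector
input_length = 20     # integers per input row

embedding_params = vocab_size * embedding_dim      # one number per table cell
flatten_size = input_length * embedding_dim        # Flatten has no parameters
dense_params = flatten_size + 1                    # one weight per input plus a bias
total_params = embedding_params + dense_params

print(embedding_params)  # 201
print(flatten_size)      # 60
print(dense_params)      # 61
print(total_params)      # 262
```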

The program now trains the neural network. The embedding lookup and the 61 dense weights are updated to produce a better score.

In [60]:
padded_reviews[0,:]

array([22, 22, 22, 22, 22, 22, 22, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20,
       20, 20, 20], dtype=int32)

In [62]:
train_y.shape

(1000,)

In [92]:
# fit the model
model.fit(padded_reviews, train_y, epochs=100, verbose=1)


Epoch 1/100
Epoch 2/100
...
Epoch 78

<keras.callbacks.History at 0x7f3feefc13a0>

We can see the learned embeddings.  Think of each word's vector (here, each question ID's vector) as a location in a 3-dimensional space, where entries associated with one label end up close to each other.  Similarly, training places entries associated with the other label close to each other.  In addition to training these embeddings, the 61 weights between the embedding layer and the output neuron learn to transform the embeddings into an actual prediction.  You can see these embeddings here.

In [93]:
print(embedding_layer.get_weights()[0].shape)
print(embedding_layer.get_weights())


(200, 3)
[array([[-4.18242067e-03, -3.07187680e-02, -3.80664468e-02],
       [ 5.44511415e-02, -1.17614917e-01, -2.35930711e-01],
       [-8.91648084e-02,  2.84456074e-01,  1.87652512e-03],
       [ 7.13545457e-02, -3.95735055e-02,  1.01152949e-01],
       [ 2.21871272e-01,  4.23231423e-01, -6.13450706e-01],
       [ 2.89976627e-01, -7.67723247e-02,  3.30943912e-02],
       [-9.60785300e-02,  3.87722492e-01, -1.38818249e-01],
       [ 3.56975108e-01, -7.66888559e-01,  1.24915504e+00],
       [-2.04668920e-02, -3.58094901e-01,  3.43151353e-02],
       [ 1.73030689e-01,  9.91169438e-02, -2.47010123e-02],
       [-6.22761552e-04, -3.15936208e-01,  1.76047370e-01],
       [-3.56514692e-01,  2.39314631e-01,  1.78394198e-01],
       [-6.17252290e-01,  8.88295889e-01, -1.04584348e+00],
       [-4.29872751e-01,  9.83202517e-01, -9.43970799e-01],
       [-3.19303758e-02,  4.88917828e-02, -3.90429720e-02],
       [-6.40143871e-01, -1.39374837e-01, -1.76609182e+00],
       [-9.73369852e-02,  2.88

We can now evaluate this neural network's accuracy, including the embeddings and the learned dense layer.  

In [95]:
loss, accuracy = model.evaluate(padded_reviews, train_y, verbose=1)
print(f'Accuracy: {accuracy}')


Accuracy: 0.8309999704360962


The accuracy is about 0.83. Because the program evaluates the network on the same data used for training, this figure may be optimistic. For a more complex data set, it would be good to use early stopping with a validation set to avoid overfitting.
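The rule behind early stopping is simple: stop once the validation loss has failed to improve for a set number of epochs. The sketch below shows that rule in plain Python on made-up loss values; `early_stopping_epoch` and `patience` are illustrative names, and this is not the Keras callback itself.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch (1-based) at which training would stop."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss      # new best loss; reset the counter
            waited = 0
        else:
            waited += 1      # no improvement this epoch
            if waited >= patience:
                return epoch
    return len(val_losses)   # never triggered; ran all epochs

# Hypothetical validation losses, one per epoch.
losses = [0.70, 0.55, 0.48, 0.50, 0.49, 0.47]
stop = early_stopping_epoch(losses, patience=2)
print(stop)  # 5
```

In this made-up run the best loss occurs at epoch 3, and training stops at epoch 5 after two epochs without improvement.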

In [96]:
print(f'Log-loss: {loss}')


Log-loss: 0.3672056198120117


However, the loss is not perfect. The log loss of about 0.37 shows that the program did not achieve absolute confidence in its predictions. This lack of confidence is likely due in part to noise in the data set: items that appear with both labels (for example, the same question answered both correctly and incorrectly) prevent absolute certainty.
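Accuracy and log loss measure different things, so one can be perfect while the other is not. The following sketch computes both by hand for four made-up predictions; the labels and probabilities are hypothetical and unrelated to the model above.

```python
import math

def log_loss(y_true, y_prob):
    """Mean binary cross-entropy over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

y_true = [1, 0, 1, 0]
y_prob = [0.8, 0.3, 0.6, 0.1]   # predicted probability of class 1

# A prediction counts as correct when it falls on the right side of 0.5.
correct = [(p >= 0.5) == bool(y) for y, p in zip(y_true, y_prob)]
accuracy = sum(correct) / len(correct)
loss = log_loss(y_true, y_prob)

print(accuracy)  # 1.0
print(loss)      # roughly 0.30
```

Every prediction is on the correct side of 0.5, so the accuracy is 1.0, yet the loss stays above zero because none of the probabilities is exactly 0 or 1.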

In [133]:
from keras.models import Model
model2= Model(inputs=model.input, outputs=model.layers[-3].output)

In [137]:
model2.summary()

Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_1_input (InputLay  [(None, 20)]             0         
 er)                                                             
                                                                 
 embedding_1 (Embedding)     (None, 20, 3)             600       
                                                                 
Total params: 600
Trainable params: 600
Non-trainable params: 0
_________________________________________________________________


In [138]:
output = model2.predict(padded_reviews)
output



array([[[-0.12713255, -0.6571806 ,  0.17732362],
        [-0.12713255, -0.6571806 ,  0.17732362],
        [-0.12713255, -0.6571806 ,  0.17732362],
        ...,
        [ 0.1278804 ,  0.05947694,  0.19509049],
        [ 0.1278804 ,  0.05947694,  0.19509049],
        [ 0.1278804 ,  0.05947694,  0.19509049]],

       [[-0.12713255, -0.6571806 ,  0.17732362],
        [-0.12713255, -0.6571806 ,  0.17732362],
        [-0.12713255, -0.6571806 ,  0.17732362],
        ...,
        [ 0.1278804 ,  0.05947694,  0.19509049],
        [ 0.1278804 ,  0.05947694,  0.19509049],
        [ 0.1278804 ,  0.05947694,  0.19509049]],

       [[-0.12713255, -0.6571806 ,  0.17732362],
        [-0.12713255, -0.6571806 ,  0.17732362],
        [-0.12713255, -0.6571806 ,  0.17732362],
        ...,
        [ 0.1278804 ,  0.05947694,  0.19509049],
        [ 0.1278804 ,  0.05947694,  0.19509049],
        [ 0.1278804 ,  0.05947694,  0.19509049]],

       ...,

       [[-0.12113111, -0.17244482,  0.16385284],
        [-0

In [139]:
output.shape

(1000, 20, 3)

In [151]:
output[3,:,:]

array([[-0.12713255, -0.6571806 ,  0.17732362],
       [-0.12713255, -0.6571806 ,  0.17732362],
       [-0.12713255, -0.6571806 ,  0.17732362],
       [-0.12713255, -0.6571806 ,  0.17732362],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [ 0.1278804 ,  0.05947694,  0.19509049],
       [-0.24345031,  0.13167939, -0.39278865]], dtype=float32)

In [142]:
padded_reviews.shape

(1000, 20)

In [150]:
padded_reviews[3,:]

array([22, 22, 22, 22, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20,
       20, 20, 21], dtype=int32)

In [152]:
key = padded_reviews[3,19] 
value = output[3,19,:]

In [145]:
dictionary = {key:value}
dictionary

{22: array([-0.12713255, -0.6571806 ,  0.17732362], dtype=float32)}

In [153]:
dictionary[key] = value
dictionary

{22: array([-0.12713255, -0.6571806 ,  0.17732362], dtype=float32),
 20: array([0.1278804 , 0.05947694, 0.19509049], dtype=float32),
 21: array([-0.24345031,  0.13167939, -0.39278865], dtype=float32)}

In [None]:
# 22	Addition Whole Numbers
# 20	Addition and Subtraction Integers
# 21	Multiplication and Division Integers

In [158]:
A = np.array(dictionary[22])
B = np.array(dictionary[20])
C = np.array(dictionary[21])

In [159]:
from numpy.linalg import norm
# compute cosine similarity
cosine = np.dot(A,B)/(norm(A)*norm(B))
print("Cosine Similarity:", cosine)

Cosine Similarity: -0.12448312


In [160]:
# compute cosine similarity
cosine = np.dot(A,C)/(norm(A)*norm(C))
print("Cosine Similarity:", cosine)

Cosine Similarity: -0.37639174


In [161]:
# compute cosine similarity
cosine = np.dot(B,C)/(norm(B)*norm(C))
print("Cosine Similarity:", cosine)

Cosine Similarity: -0.86389613
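The three comparisons above repeat the same formula, so they can be wrapped in a small helper. This is a minimal sketch; `cosine_similarity` is our own function name, and the vectors are copied from the dictionary output shown earlier.

```python
import numpy as np
from numpy.linalg import norm

def cosine_similarity(u, v):
    """Cosine of the angle between u and v: 1 = same direction, -1 = opposite."""
    return float(np.dot(u, v) / (norm(u) * norm(v)))

# Embedding vectors copied from the dictionary above.
A = np.array([-0.12713255, -0.6571806, 0.17732362])   # ID 22
B = np.array([0.1278804, 0.05947694, 0.19509049])     # ID 20
C = np.array([-0.24345031, 0.13167939, -0.39278865])  # ID 21

for name, (u, v) in {"A-B": (A, B), "A-C": (A, C), "B-C": (B, C)}.items():
    print(name, round(cosine_similarity(u, v), 4))
```

All three values are negative, which means that none of these three embedding vectors points in a similar direction to another; the pair B and C is the closest to pointing in opposite directions.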


The following code loads the data with the `DATA_LOADER` class from the `data_loader` module:


In [2]:
from data_loader import *

In [8]:
# Load the data; default values and paths are already defined
n_questions = 110
seq_len = 200
data = DATA_LOADER(n_questions, seq_len, ',')
train_data_path = "/home/jolunavi/laboratorio/embedding/data/assist2009_updated/assist2009_updated_train1.csv"
valid_data_path = "/home/jolunavi/laboratorio/embedding/data/assist2009_updated/assist2009_updated_valid1.csv"


train_q_data, train_qa_data = data.load_data(train_data_path)
print('Train data loaded')
valid_q_data, valid_qa_data = data.load_data(valid_data_path)
print('Valid data loaded')
print('Shape of train data : %s, valid data : %s' % (train_q_data.shape, valid_q_data.shape))
print('Start training')


Excercies tag
, Answers
Number of split : 1
Question tag : 24, Answer : 0, QA : 24
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 0, QA : 24
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 0, QA : 24
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 1, QA : 134
Question tag : 24, Answer : 1, QA : 134
Excercies tag
, Answers
Number of split : 4
Question tag : 37, Answer : 1, QA : 147
Question tag : 26, Answer : 0, QA : 26
Question tag : 26, Answer : 1, QA : 136
Question tag : 26, Answer : 0, QA : 26
Question tag : 26, Answer : 1, QA : 136
Question tag : 26, Answer : 1, QA : 1



Question tag : 69, Answer : 1, QA : 179
Question tag : 69, Answer : 0, QA : 69
Question tag : 69, Answer : 1, QA : 179
Question tag : 69, Answer : 1, QA : 179
Question tag : 69, Answer : 1, QA : 179
Question tag : 28, Answer : 1, QA : 138
Question tag : 29, Answer : 0, QA : 29
Question tag : 3, Answer : 0, QA : 3
Question tag : 22, Answer : 1, QA : 132
Question tag : 27, Answer : 1, QA : 137
Question tag : 68, Answer : 1, QA : 178
Question tag : 68, Answer : 1, QA : 178
Question tag : 68, Answer : 0, QA : 68
Question tag : 68, Answer : 0, QA : 68
Question tag : 68, Answer : 0, QA : 68
Question tag : 68, Answer : 1, QA : 178
Question tag : 68, Answer : 0, QA : 68
Question tag : 68, Answer : 1, QA : 178
Question tag : 6, Answer : 1, QA : 116
Question tag : 7, Answer : 1, QA : 117
Question tag : 6, Answer : 1, QA : 116
Question tag : 7, Answer : 1, QA : 117
Excercies tag
, Answers
Number of split : 1
Question tag : 65, Answer : 0, QA : 65
Question tag : 65, Answer : 0, QA : 65
Question ta

NameError: name 'dkvmn' is not defined

In [9]:
print('Shape of train data : %s, valid data : %s' % (train_q_data.shape, valid_q_data.shape))

Shape of train data : (2697, 200), valid data : (740, 200)


In [24]:
train_qa_data[0,:]



array([ 24., 134.,  24., 134., 134., 134., 134., 134., 134., 134., 134.,
        24., 134., 134., 134., 134., 134.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   

In [14]:
train_qa_data

array([[ 24., 134.,  24., ...,   0.,   0.,   0.],
       [147.,  26., 136., ...,  37.,  30., 147.],
       [140.,  37.,  30., ..., 151.,  41.,  41.],
       ...,
       [ 65., 175., 175., ...,   0.,   0.,   0.],
       [ 65., 175., 175., ...,   0.,   0.,   0.],
       [175., 175., 175., ...,   0.,   0.,   0.]])
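The loader output above (for example, question tag 24 with answer 1 giving QA 134, with `n_questions = 110`) suggests that each interaction is stored as the question tag plus the answer times the number of questions, with 0 used as padding. The sketch below assumes that reading; `encode_qa` and `decode_qa` are hypothetical helpers and are not taken from `data_loader`.

```python
N_QUESTIONS = 110

def encode_qa(question, answer, n_questions=N_QUESTIONS):
    """Combine a question tag (1..n_questions) and an answer (0 or 1) into one integer."""
    return question + answer * n_questions

def decode_qa(qa, n_questions=N_QUESTIONS):
    """Recover (question, answer) from a non-zero QA value."""
    answer = (qa - 1) // n_questions
    question = qa - answer * n_questions
    return question, answer

print(encode_qa(24, 1))  # 134
print(encode_qa(24, 0))  # 24
print(decode_qa(147))    # (37, 1)
```

Under this encoding, values from 1 to 110 are incorrect answers and values from 111 to 220 are correct answers, so an embedding over QA values would need 221 rows including the padding index.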