# Weight Extraction

Keras makes it easy to extract the weights and biases from every layer of a trained model. The values are read directly from the trained model with the get_weights() method.

This is demonstrated below using the same example as the previous notebooks: load pickled files of the MNIST dataset and create a three-layer MLP with the following architecture:

### Model Architecture

| Layer | No. Inputs | No. Outputs (Nodes) | No. Biases | Total Parameters = (Inputs × Outputs) + Biases |
|:-----:|:----------:|:-------------------:|:----------:|:----------------------------------------------:|
|   1   |    784     |         50          |     50     |                     39,250                     |
|   2   |     50     |         50          |     50     |                      2,550                     |
|   3   |     50     |         10          |     10     |                       510                      |


The total number of tunable parameters is 42,310.
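
The totals in the table follow from one rule per dense layer: (inputs × outputs) weights plus one bias per output node. A minimal sketch of that arithmetic in plain Python is shown here; the names `layer_sizes` and `dense_param_count` are illustrative and do not appear elsewhere in this notebook.

```python
# Layer widths taken from the architecture table: 784 inputs -> 50 -> 50 -> 10 outputs.
layer_sizes = [784, 50, 50, 10]


def dense_param_count(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output node."""
    return n_in * n_out + n_out


# Pair each layer width with the next one to get (inputs, outputs) per layer.
per_layer = [dense_param_count(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)               # [39250, 2550, 510]
print('{0:,}'.format(total))   # 42,310
```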

This notebook demonstrates how to extract all these parameters if desired.

In [1]:
import pickle
import os
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense
from keras.optimizers import SGD

path = '/Users/Nick/Documents/Enrion/Datasets/'

X_train = pickle.load( open(os.path.join(path,"mnist_X_train.pkl"), "rb" ) )
X_test = pickle.load( open(os.path.join(path,"mnist_X_test.pkl"), "rb" ) )
y_train = pickle.load( open(os.path.join(path,"mnist_y_train.pkl"), "rb" ) )
y_test = pickle.load( open(os.path.join(path,"mnist_y_test.pkl"), "rb" ) )

Using Theano backend.


In [None]:
np.random.seed(0)

model = Sequential()

model.add(Dense(input_dim=X_train.shape[1], 
                output_dim=50, 
                init='normal', 
                activation='tanh'))

model.add(Dense(input_dim=50, 
                output_dim=50, 
                init='normal', 
                activation='tanh'))

model.add(Dense(input_dim=50, 
                output_dim=y_train.shape[1], 
                init='normal', 
                activation='softmax'))

sgd = SGD(lr=0.001, decay=1e-7, momentum=.9, nesterov=False)

model.compile(optimizer = sgd, loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(X_train, y_train, nb_epoch=50, batch_size=300, validation_split=0.1, verbose=1)

Extract the weight and bias values with the get_weights() method. For each layer it returns a list of length two: the first element is a 2-D NumPy array holding the weights, with one row per input and one column per node, and the second is a 1-D array holding the biases.

E.g., model.layers[0].get_weights()[0] yields an array of 784 rows, each containing 50 values. These are the connections from the 784 input values to the 50 nodes in the first hidden layer of the network.
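
To make the layout concrete without a trained model, the sketch below uses zero-filled NumPy arrays as stand-ins for what model.layers[0].get_weights() returns. The variable names (`layer0_weights`, `from_first_input`, `into_first_node`) are hypothetical, and the values are placeholders; only the shapes mirror the first layer of this network.

```python
import numpy as np

# Stand-in for model.layers[0].get_weights(): zero-filled arrays with the
# same shapes as the first Dense layer (784 inputs, 50 nodes). Real values
# would come from the trained model.
layer0_weights = [np.zeros((784, 50)), np.zeros(50)]

W, b = layer0_weights          # weight matrix, bias vector
print(W.shape)                 # (784, 50)
print(b.shape)                 # (50,)

from_first_input = W[0]        # 50 weights leaving the first input value
into_first_node = W[:, 0]      # 784 weights feeding the first hidden node
print(from_first_input.shape, into_first_node.shape)  # (50,) (784,)
```

Indexing a row gives the outgoing connections of one input, while indexing a column gives the incoming connections of one node.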

The cell below checks that the number of parameters returned for each layer matches the expected architecture in the [table above](#Model-Architecture).

In [31]:
layers = {0:'Input',1:'Hidden',2:'Output'}

for i in range(3):
    print("Number of Input Values for {} layer is: {}".format(layers[i],len(model.layers[i].get_weights()[0])))
    print("Number of Biases for {} layer is: {}".format(layers[i],len(model.layers[i].get_weights()[1])))

Number of Input Values for Input layer is: 784
Number of Biases for Input layer is: 50
Number of Input Values for Hidden layer is: 50
Number of Biases for Hidden layer is: 50
Number of Input Values for Output layer is: 50
Number of Biases for Output layer is: 10


Extract and store the weights and biases of each layer:

In [29]:
# get_weights() returns [weights, biases] for each Dense layer, so unpack both at once.
input_layer_W, input_layer_B = model.layers[0].get_weights()
hidden_layer_W, hidden_layer_B = model.layers[1].get_weights()
output_layer_W, output_layer_B = model.layers[2].get_weights()

In [37]:
print(input_layer_W[0])

[ 0.08820262  0.02000786  0.0489369   0.11204466  0.0933779  -0.0488639
  0.04750442 -0.00756786 -0.00516094  0.02052993  0.00720218  0.07271367
  0.03805188  0.00608375  0.02219316  0.01668372  0.07470395 -0.01025791
  0.01565338 -0.04270479 -0.12764949  0.03268093  0.04322181 -0.03710825
  0.11348773 -0.07271829  0.00228793 -0.00935919  0.07663896  0.07346794
  0.00774737  0.01890813 -0.04438929 -0.09903982 -0.01739561  0.00781745
  0.06151453  0.06011899 -0.01936634 -0.01511514 -0.05242765 -0.0710009
 -0.08531351  0.09753877 -0.02548261 -0.02190371 -0.06263977  0.03887452
 -0.08069489 -0.01063701]


These are the 50 weights connecting the first input value to the nodes of the first hidden layer.

In [44]:
print(output_layer_B)

[-0.18164791 -0.06975333  0.16924177 -0.0286592  -0.07822143  0.21399206
 -0.04337408  0.09134258 -0.0532693  -0.01965077]


These are the 10 bias values of the output layer.

In [47]:
input_param_count = len(input_layer_W) * len(input_layer_W[0]) + len(input_layer_B)
hidden_param_count = len(hidden_layer_W) * len(hidden_layer_W[0]) + len(hidden_layer_B)
output_param_count = len(output_layer_W) * len(output_layer_W[0]) + len(output_layer_B)
total_param_count = input_param_count + hidden_param_count + output_param_count

print('Total parameters of the model: {0:,}'.format(total_param_count))

Total parameters of the model: 42,310
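
One way to see what the extracted arrays represent is to run a forward pass by hand. The following is a rough sketch under stated assumptions, not part of the original notebook: `softmax`, `forward`, and `params` are hypothetical names, and the weights are random stand-ins with the same shapes as the three layers above. It applies tanh to the first two layers and softmax to the last, as in the model definition.

```python
import numpy as np


def softmax(z):
    # Subtract the row-wise max for numerical stability.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def forward(X, params):
    """params is a list of (W, b) pairs, one per Dense layer."""
    a = X
    for W, b in params[:-1]:
        a = np.tanh(a.dot(W) + b)       # hidden layers use tanh
    W, b = params[-1]
    return softmax(a.dot(W) + b)        # output layer uses softmax


# Random stand-ins with the notebook's shapes (assumption: not trained values).
# In the notebook these would be (input_layer_W, input_layer_B), and so on.
rng = np.random.RandomState(0)
params = [(rng.normal(scale=0.05, size=(784, 50)), np.zeros(50)),
          (rng.normal(scale=0.05, size=(50, 50)), np.zeros(50)),
          (rng.normal(scale=0.05, size=(50, 10)), np.zeros(10))]

X = rng.rand(5, 784)                    # five fake flattened images
probs = forward(X, params)
print(probs.shape)                      # (5, 10)
```

With the real extracted weights in place of the stand-ins, the result should be close to model.predict(X), up to floating-point precision.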
