<a href="https://colab.research.google.com/github/AaryanJethva7/NMIMS-DL/blob/master/Keras_Initializations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Usage of initializers

Initializers define how the initial weights of Keras layers are set, most often randomly.

The keyword arguments used for passing initializers to layers depend on the layer. Usually they are simply kernel_initializer and bias_initializer.

Let's test the various weight initializers on our baseline model, which uses the Dense defaults (glorot_uniform for the kernel and zeros for the bias).

# Baseline model for classification on CIFAR10 dataset

In [0]:
#import packages
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
model = Sequential()
model.add(Dense(1024, input_shape=(3072, )))
model.add(Activation('relu'))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# Model with random uniform weights and zero biases

In [0]:
#import packages
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer='random_uniform', bias_initializer='zeros'))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer='random_uniform', bias_initializer='zeros'))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer='random_uniform', bias_initializer='zeros'))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer='random_uniform', bias_initializer='zeros'))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# Model with weights as ones

In [0]:
#import packages
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer='ones', bias_initializer='ones'))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer='ones', bias_initializer='ones'))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer='ones', bias_initializer='ones'))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer='ones', bias_initializer='ones'))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# Model initialized with constant values

Initializer that generates tensors initialized to a constant value.

Arguments

1. value: float; the value of the generator tensors.
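
One consequence of constant initialization is worth seeing concretely: when every weight in a layer holds the same value, every unit in that layer computes the same output. The following is a minimal pure-Python sketch of a single dense layer forward pass, not Keras code. The helper name dense_forward, the toy input, and the constant 0.5 are illustrative assumptions.

```python
def dense_forward(inputs, weights, biases):
    """Compute relu(inputs @ weights + biases) for one sample using plain lists."""
    outputs = []
    for j in range(len(biases)):
        z = sum(x * weights[i][j] for i, x in enumerate(inputs)) + biases[j]
        outputs.append(max(0.0, z))
    return outputs

value = 0.5  # hypothetical constant, playing the role of Constant(value=...)
inputs = [0.2, 0.7, 0.1]
n_units = 4

# Every kernel entry and every bias gets the same constant.
weights = [[value] * n_units for _ in inputs]
biases = [value] * n_units

activations = dense_forward(inputs, weights, biases)
all_equal = len(set(activations)) == 1
print(activations, all_equal)
```

Because all units start identical, they also receive identical updates, so the layer behaves like a single unit. With value=0 the activations are all zero as well.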

In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.Constant(value=0)
model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# RandomNormal

keras.initializers.RandomNormal(mean=0.0, stddev=0.05, seed=None)

Initializer that generates tensors with a normal distribution.

Arguments

1. mean: a python scalar or a scalar tensor. Mean of the random values to generate.
2. stddev: a python scalar or a scalar tensor. Standard deviation of the random values to generate.
3. seed: A Python integer. Used to seed the random generator.
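
To get a feel for what mean and stddev control, the sketch below draws values with Python's standard random module rather than Keras. It is only an illustration of the distribution; the sample count and the fixed seed are assumptions chosen for repeatability.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable
mean, stddev = 0.0, 0.05  # same values as the signature above

# Draw stand-in "weights" from a normal distribution.
samples = [rng.gauss(mean, stddev) for _ in range(10000)]

sample_mean = statistics.mean(samples)
sample_std = statistics.pstdev(samples)
print(round(sample_mean, 4), round(sample_std, 4))
```

The empirical mean and standard deviation should land close to the requested 0.0 and 0.05.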

In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.RandomNormal(mean=0.0, stddev=0.05, seed=None)
model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# RandomUniform

keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None)

Initializer that generates tensors with a uniform distribution.

Arguments

1. minval: A python scalar or a scalar tensor. Lower bound of the range of random values to generate.
2. maxval: A python scalar or a scalar tensor. Upper bound of the range of random values to generate. Defaults to 0.05 in the signature above.
3. seed: A Python integer. Used to seed the random generator.
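
A small sketch of what minval and maxval imply, again using the standard random module instead of Keras. The sample count and seed are assumptions; the formula for the standard deviation of a uniform distribution, (maxval - minval) / sqrt(12), is standard.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable
minval, maxval = -0.05, 0.05  # same values as the signature above

samples = [rng.uniform(minval, maxval) for _ in range(10000)]

# Every draw stays inside the requested range.
in_range = all(minval <= s <= maxval for s in samples)

# Standard deviation of a uniform distribution on [minval, maxval].
expected_std = (maxval - minval) / 12 ** 0.5
sample_std = statistics.pstdev(samples)
print(in_range, round(expected_std, 4), round(sample_std, 4))
```

Note that a range of [-0.05, 0.05] gives a standard deviation of roughly 0.029, so these weights are less spread out than RandomNormal with stddev=0.05.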

In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None)
model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# TruncatedNormal

keras.initializers.TruncatedNormal(mean=0.0, stddev=0.05, seed=None)

Initializer that generates a truncated normal distribution.

These values are similar to values from a RandomNormal except that values more than two standard deviations from the mean are discarded and redrawn. This is the recommended initializer for neural network weights and filters.
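
The "discarded and redrawn" rule can be sketched as simple rejection sampling. This is a pure-Python illustration of the idea, not the library's implementation; the function name truncated_normal is hypothetical.

```python
import random

def truncated_normal(rng, mean=0.0, stddev=0.05):
    """Draw one value, redrawing until it lies within two stddevs of the mean."""
    while True:
        x = rng.gauss(mean, stddev)
        if abs(x - mean) <= 2 * stddev:
            return x

rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [truncated_normal(rng) for _ in range(10000)]

# No sample can be further than 2 * stddev = 0.1 from the mean.
max_abs = max(abs(s) for s in samples)
print(max_abs <= 0.1)
```

Cutting off the tails avoids the occasional very large initial weight that an untruncated normal can produce.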

Arguments

1. mean: a python scalar or a scalar tensor. Mean of the random values to generate.
2. stddev: a python scalar or a scalar tensor. Standard deviation of the random values to generate.
3. seed: A Python integer. Used to seed the random generator.

In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.TruncatedNormal(mean=0.0, stddev=0.05, seed=None)
model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# VarianceScaling

keras.initializers.VarianceScaling(scale=1.0, mode='fan_in', distribution='normal', seed=None)

Initializer capable of adapting its scale to the shape of weights.

With distribution="normal", samples are drawn from a truncated normal distribution centered on zero, with stddev = sqrt(scale / n) where n is:

1. number of input units in the weight tensor, if mode = "fan_in"
2. number of output units, if mode = "fan_out"
3. average of the numbers of input and output units, if mode = "fan_avg"

With distribution="uniform", samples are drawn from a uniform distribution within [-limit, limit], with limit = sqrt(3 * scale / n).
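
The two formulas above can be evaluated directly. The helper below, variance_scaling_spread (a hypothetical name), only computes the stddev or limit as stated in this section; some library versions apply a small extra correction for the truncation, which this sketch ignores.

```python
import math

def variance_scaling_spread(fan_in, fan_out, scale=1.0,
                            mode="fan_in", distribution="normal"):
    """Return stddev (normal) or limit (uniform) from the formulas above."""
    if mode == "fan_in":
        n = fan_in
    elif mode == "fan_out":
        n = fan_out
    elif mode == "fan_avg":
        n = (fan_in + fan_out) / 2
    else:
        raise ValueError(f"invalid mode: {mode!r}")
    if distribution == "normal":
        return math.sqrt(scale / n)
    if distribution == "uniform":
        return math.sqrt(3 * scale / n)
    raise ValueError(f"invalid distribution: {distribution!r}")

# First Dense layer of the model in this notebook: 3072 inputs, 1024 units.
stddev_fan_in = variance_scaling_spread(3072, 1024)
limit_fan_avg = variance_scaling_spread(3072, 1024, mode="fan_avg",
                                        distribution="uniform")
print(stddev_fan_in, limit_fan_avg)
```

With scale=1.0, mode="fan_avg" and distribution="uniform", the limit equals sqrt(6 / (fan_in + fan_out)), which is the glorot_uniform formula described later in this notebook.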

Arguments

1. scale: Scaling factor (positive float).
2. mode: One of "fan_in", "fan_out", "fan_avg".
3. distribution: Random distribution to use. One of "normal", "uniform".
4. seed: A Python integer. Used to seed the random generator.

Raises

ValueError: In case of an invalid value for the "scale", "mode" or "distribution" arguments.

In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.VarianceScaling(scale=1.0, mode='fan_in', distribution='normal', seed=None)
model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# Orthogonal

keras.initializers.Orthogonal(gain=1.0, seed=None)

Initializer that generates a random orthogonal matrix.

Arguments

1. gain: Multiplicative factor to apply to the orthogonal matrix.
2. seed: A Python integer. Used to seed the random generator.
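
To illustrate what "orthogonal" and gain mean here, the sketch below builds a small matrix with orthonormal rows using Gram-Schmidt on random normal vectors and then scales it. Gram-Schmidt is swapped in for readability and is not necessarily how the library computes its matrix; random_orthogonal is a hypothetical helper.

```python
import random

def random_orthogonal(n, gain=1.0, seed=0):
    """Build an n x n matrix with orthonormal rows (Gram-Schmidt), scaled by gain."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for u in rows:  # remove the components along the earlier rows
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = sum(a * a for a in v) ** 0.5
        rows.append([a / norm for a in v])
    return [[gain * a for a in row] for row in rows]

W = random_orthogonal(4, gain=2.0)

# W times its transpose should be gain**2 times the identity matrix.
gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in W] for r1 in W]
is_scaled_identity = all(
    abs(gram[i][j] - (4.0 if i == j else 0.0)) < 1e-9
    for i in range(4) for j in range(4)
)
print(is_scaled_identity)
```

An orthogonal matrix preserves the length of the vectors it multiplies, and gain rescales that length by a constant factor.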

In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.Orthogonal(gain=2.0, seed=None)
model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# Identity

keras.initializers.Identity(gain=1.0)

Initializer that generates the identity matrix.

Only use for 2D matrices. If the desired matrix is not square, it gets padded with zeros for the additional rows/columns.

Arguments

1. gain: Multiplicative factor to apply to the identity matrix.
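
The zero padding for non-square shapes is easy to picture with a tiny example. The helper identity_init below is a hypothetical pure-Python stand-in that mirrors the description above, not the library code.

```python
def identity_init(rows, cols, gain=1.0):
    """Return a rows x cols matrix with gain on the main diagonal, zeros elsewhere."""
    return [[gain if i == j else 0.0 for j in range(cols)] for i in range(rows)]

kernel = identity_init(2, 4, gain=1.0)
for row in kernel:
    print(row)
# [1.0, 0.0, 0.0, 0.0]
# [0.0, 1.0, 0.0, 0.0]
```

For a Dense layer with more units than inputs, as in this toy 2 x 4 case, the extra columns are all zeros, so the corresponding units start with zero weights.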


In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.Identity(gain=1.0)

model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# lecun_uniform

keras.initializers.lecun_uniform(seed=None)

LeCun uniform initializer.

It draws samples from a uniform distribution within [-limit, limit] where limit is sqrt(3 / fan_in) where fan_in is the number of input units in the weight tensor.
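
As a worked example, the limit can be computed for each Dense kernel of the CIFAR-10 model used in this notebook. The layer labels dense_1 to dense_4 are hypothetical names introduced only for this sketch.

```python
import math

# fan_in of each Dense kernel in this notebook's CIFAR-10 model
fan_ins = {"dense_1": 3072, "dense_2": 1024, "dense_3": 512, "dense_4": 512}

# limit = sqrt(3 / fan_in), as stated above
limits = {name: math.sqrt(3 / fan_in) for name, fan_in in fan_ins.items()}

for name, limit in limits.items():
    print(f"{name}: weights drawn from [-{limit:.5f}, {limit:.5f}]")
```

For the first layer, 3 / 3072 = 1 / 1024, so the limit is exactly 1 / 32 = 0.03125. Layers with fewer inputs get a wider range.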

Arguments

1. seed: A Python integer. Used to seed the random generator.

Returns

1. An initializer.

In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.lecun_uniform(seed=None)

model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# glorot_normal

keras.initializers.glorot_normal(seed=None)

Glorot normal initializer, also called Xavier normal initializer.

It draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / (fan_in + fan_out)) where fan_in is the number of input units in the weight tensor and fan_out is the number of output units in the weight tensor.
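
The formula above depends on both sides of a layer, so each kernel in the notebook's model gets a different stddev. A minimal sketch, with glorot_normal_stddev as a hypothetical helper name:

```python
import math

def glorot_normal_stddev(fan_in, fan_out):
    """stddev = sqrt(2 / (fan_in + fan_out)), as stated above."""
    return math.sqrt(2 / (fan_in + fan_out))

# (fan_in, fan_out) of the four Dense kernels in the CIFAR-10 model
layer_shapes = [(3072, 1024), (1024, 512), (512, 512), (512, 10)]
stddevs = [glorot_normal_stddev(i, o) for i, o in layer_shapes]

for shape, s in zip(layer_shapes, stddevs):
    print(shape, round(s, 5))
```

The small output layer (512 to 10) ends up with the largest stddev because its fan_in + fan_out is the smallest.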

Arguments

1. seed: A Python integer. Used to seed the random generator.

Returns

1. An initializer.

In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.glorot_normal(seed=None)

model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# glorot_uniform

keras.initializers.glorot_uniform(seed=None)

Glorot uniform initializer, also called Xavier uniform initializer.

It draws samples from a uniform distribution within [-limit, limit] where limit is sqrt(6 / (fan_in + fan_out)) where fan_in is the number of input units in the weight tensor and fan_out is the number of output units in the weight tensor.
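
The factor 6 in the limit is chosen so that the uniform version has the same variance as the glorot_normal formula, 2 / (fan_in + fan_out). The check below uses the standard fact that a uniform distribution on [-limit, limit] has variance limit**2 / 3; the layer size is taken from the first Dense layer of this notebook.

```python
import math

fan_in, fan_out = 3072, 1024

limit = math.sqrt(6 / (fan_in + fan_out))
uniform_variance = limit ** 2 / 3          # variance of U(-limit, limit)
normal_variance = 2 / (fan_in + fan_out)   # stddev**2 from the glorot_normal formula

print(uniform_variance, normal_variance)
same_variance = math.isclose(uniform_variance, normal_variance)
```

So glorot_uniform and glorot_normal differ in the shape of the distribution, not in how spread out the weights are.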

Arguments

1. seed: A Python integer. Used to seed the random generator.

Returns

1. An initializer.

In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.glorot_uniform(seed=None)


model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# he_normal

keras.initializers.he_normal(seed=None)

He normal initializer.

It draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / fan_in) where fan_in is the number of input units in the weight tensor.
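
Compared with the lecun_normal formula sqrt(1 / fan_in) described later in this notebook, the only difference is the factor 2 under the square root. A quick calculation for the first Dense layer (fan_in = 3072) makes the relation explicit:

```python
import math

fan_in = 3072  # inputs to the first Dense layer in this notebook

he_stddev = math.sqrt(2 / fan_in)
lecun_stddev = math.sqrt(1 / fan_in)
ratio = he_stddev / lecun_stddev

print(round(he_stddev, 5), round(lecun_stddev, 5), round(ratio, 5))
```

The ratio is sqrt(2) regardless of fan_in. The larger spread is usually motivated by ReLU activations, which zero out roughly half of their inputs.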

Arguments

1. seed: A Python integer. Used to seed the random generator.

Returns

1. An initializer.

In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.he_normal(seed=None)




model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# lecun_normal

keras.initializers.lecun_normal(seed=None)

LeCun normal initializer.

It draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(1 / fan_in) where fan_in is the number of input units in the weight tensor.
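
One way to see why the stddev shrinks with fan_in: if the inputs have unit variance and the weights have variance 1 / fan_in, the weighted sum feeding a unit keeps a variance of about 1. The simulation below is a rough sketch under simplifying assumptions: it uses an untruncated normal from the standard random module, independent unit-variance inputs, and an arbitrary fan_in of 256.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable
fan_in = 256
stddev = math.sqrt(1 / fan_in)  # lecun_normal formula from above

def pre_activation():
    """Weighted sum of fan_in unit-variance inputs with freshly drawn weights."""
    return sum(rng.gauss(0.0, 1.0) * rng.gauss(0.0, stddev)
               for _ in range(fan_in))

zs = [pre_activation() for _ in range(2000)]
z_var = statistics.pvariance(zs)
print(round(z_var, 3))
```

The measured variance should be close to 1, which means the signal neither blows up nor vanishes as it passes through the layer at initialization.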

Arguments

1. seed: A Python integer. Used to seed the random generator.

Returns

1. An initializer.

In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.lecun_normal(seed=None)


model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

# he_uniform

keras.initializers.he_uniform(seed=None)

He uniform variance scaling initializer.

It draws samples from a uniform distribution within [-limit, limit] where limit is sqrt(6 / fan_in) where fan_in is the number of input units in the weight tensor.
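
A short empirical check of this formula: sampling uniformly within [-limit, limit] with limit = sqrt(6 / fan_in) should give weights whose variance is about 2 / fan_in, the same variance he_normal targets. The sketch uses the standard random module, not Keras, and the fan_in of 512 and the sample count are assumptions.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable
fan_in = 512

limit = math.sqrt(6 / fan_in)
weights = [rng.uniform(-limit, limit) for _ in range(20000)]

w_var = statistics.pvariance(weights)
target_var = 2 / fan_in  # limit**2 / 3
print(round(w_var, 6), round(target_var, 6))
```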

Arguments

1. seed: A Python integer. Used to seed the random generator.

Returns

1. An initializer.

In [0]:
#import packages
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 10)
Y_test =  to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.he_uniform(seed=None)

model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
 

# training
history = model.fit(X_train, Y_train,
                        batch_size=128,
                        epochs=5,
                        verbose=1,
                        validation_data=(X_test, Y_test))

## **CIFAR 100**

In [41]:
#import packages
import keras
from keras.datasets import cifar100
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = cifar100.load_data()

#change shape from image to vector
X_train = X_train.reshape(50000, 32 * 32 * 3)
X_test = X_test.reshape(10000, 32 * 32 * 3)

#preprocess
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255.0
X_test /= 255.0

#change labels from numeric to one hot encoded
Y_train = to_categorical(y_train, 100)
Y_test =  to_categorical(y_test, 100)

#Model building
initializer = keras.initializers.lecun_normal(seed=None)


model = Sequential()
model.add(Dense(1024, input_shape=(3072,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(100, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# training
history = model.fit(X_train, Y_train,
                    batch_size=128,
                    epochs=10,
                    verbose=1,
                    validation_data=(X_test, Y_test))



Train on 50000 samples, validate on 10000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


## **MNIST**

In [38]:
import keras
from keras import models
from keras.layers import Dense, Dropout
from keras.utils import to_categorical
from keras.datasets import mnist
from keras.utils.vis_utils import model_to_dot
from IPython.display import SVG

NUM_ROWS = 28
NUM_COLS = 28
NUM_CLASSES = 10
BATCH_SIZE = 128
EPOCHS = 10

# Load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()



# Reshape data
X_train = X_train.reshape((X_train.shape[0], NUM_ROWS * NUM_COLS))
X_train = X_train.astype('float32') / 255
X_test = X_test.reshape((X_test.shape[0], NUM_ROWS * NUM_COLS))
X_test = X_test.astype('float32') / 255

# Categorically encode labels
y_train = to_categorical(y_train, NUM_CLASSES)
y_test = to_categorical(y_test, NUM_CLASSES)

#Model building
initializer = keras.initializers.lecun_normal(seed=None)

# Build neural network
model = models.Sequential()
model.add(Dense(512, activation='relu', input_shape=(NUM_ROWS * NUM_COLS,),
                kernel_initializer=initializer, bias_initializer=initializer))
model.add(Dense(256, activation='relu',
                kernel_initializer=initializer, bias_initializer=initializer))
model.add(Dense(NUM_CLASSES, activation='softmax',
                kernel_initializer=initializer, bias_initializer=initializer))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# training
history = model.fit(X_train, y_train,
                    batch_size=BATCH_SIZE,
                    epochs=EPOCHS,
                    verbose=1,
                    validation_data=(X_test, y_test))



Train on 60000 samples, validate on 10000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


## **Fashion MNIST**

In [33]:
#import packages
import keras
from keras.datasets import fashion_mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.utils import to_categorical

#import dataset
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

# Reshape data
X_train = X_train.reshape((X_train.shape[0], 28 * 28))
X_train = X_train.astype('float32') / 255
X_test = X_test.reshape((X_test.shape[0], 28 * 28))
X_test = X_test.astype('float32') / 255

# Categorically encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

#Model building
initializer = keras.initializers.lecun_normal(seed=None)


model = Sequential()
model.add(Dense(1024, input_shape=(784,), kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(512, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('relu'))
model.add(Dense(10, kernel_initializer=initializer, bias_initializer=initializer))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# training (labels were one-hot encoded into y_train / y_test above)
history = model.fit(X_train, y_train,
                    batch_size=128,
                    epochs=10,
                    verbose=1,
                    validation_data=(X_test, y_test))



Train on 60000 samples, validate on 10000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


## **IRIS**

In [42]:
import keras
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

iris_data = load_iris()
	
x = iris_data.data
y_ = iris_data.target.reshape(-1, 1) # Convert data to a single column

# One Hot encode the class labels
encoder = OneHotEncoder(sparse=False)
y = encoder.fit_transform(y_)


# Split the data for training and testing
train_x, test_x, train_y, test_y = train_test_split(x, y, test_size=0.01)

#Model building
initializer = keras.initializers.lecun_normal(seed=None)

model = Sequential()

model.add(Dense(10, input_shape=(4,), activation='relu', kernel_initializer=initializer, bias_initializer=initializer))
model.add(Dense(10, activation='relu', kernel_initializer=initializer, bias_initializer=initializer))
model.add(Dense(3, activation='softmax', kernel_initializer=initializer, bias_initializer=initializer))

# Compile
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# training
history = model.fit(train_x, train_y,
                    batch_size=5,
                    epochs=10,
                    verbose=2,
                    validation_data=(test_x, test_y))





Train on 148 samples, validate on 2 samples
Epoch 1/10
 - 2s - loss: 0.8106 - acc: 0.5811 - val_loss: 0.6756 - val_acc: 1.0000
Epoch 2/10
 - 0s - loss: 0.6697 - acc: 0.9257 - val_loss: 0.6259 - val_acc: 1.0000
Epoch 3/10
 - 0s - loss: 0.5370 - acc: 0.9392 - val_loss: 0.4951 - val_acc: 1.0000
Epoch 4/10
 - 0s - loss: 0.4640 - acc: 0.9595 - val_loss: 0.3911 - val_acc: 1.0000
Epoch 5/10
 - 0s - loss: 0.4159 - acc: 0.9459 - val_loss: 0.3455 - val_acc: 1.0000
Epoch 6/10
 - 0s - loss: 0.3775 - acc: 0.9662 - val_loss: 0.3167 - val_acc: 1.0000
Epoch 7/10
 - 0s - loss: 0.3476 - acc: 0.9595 - val_loss: 0.2643 - val_acc: 1.0000
Epoch 8/10
 - 0s - loss: 0.3217 - acc: 0.9797 - val_loss: 0.2517 - val_acc: 1.0000
Epoch 9/10
 - 0s - loss: 0.3046 - acc: 0.9662 - val_loss: 0.2097 - val_acc: 1.0000
Epoch 10/10
 - 0s - loss: 0.2820 - acc: 0.9662 - val_loss: 0.2147 - val_acc: 1.0000


Results on CIFAR-10:

1. Baseline model: Accuracy 0.4702, Validation Accuracy 0.4714
2. Model with initializer as random uniform and zero: Accuracy 0.4758, Validation Accuracy 0.4762
3. Model with weights as ones: Accuracy 0.1, Validation Accuracy 0.1
4. Model initialized with constant values: Accuracy 0.0960, Validation Accuracy 0.1
5. Model with Random Normal: Accuracy 0.4774, Validation Accuracy 0.4710
6. Model with Random Uniform: Accuracy 0.4757, Validation Accuracy 0.4695
7. Model with Truncated Normal: Accuracy 0.4763, Validation Accuracy 0.4718
8. Model with Variance Scaling: Accuracy 0.4789, Validation Accuracy 0.4750
9. Model with Orthogonal: Accuracy 0.1, Validation Accuracy 0.1
10. Model with Identity: Accuracy 0.3407, Validation Accuracy 0.3416
11. Model with lecun_uniform: Accuracy 0.4795, Validation Accuracy 0.4748
12. Model with glorot_normal: Accuracy 0.4796, Validation Accuracy 0.4781
13. Model with glorot_uniform: Accuracy 0.4799, Validation Accuracy 0.4785
14. Model with he_normal: Accuracy 0.4767, Validation Accuracy 0.4667
15. Model with lecun_normal: Accuracy 0.4829, Validation Accuracy 0.4844
16. Model with he_uniform: Accuracy 0.4792, Validation Accuracy 0.4841
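
The models initialized with ones or with a constant stay at about 0.1 accuracy, which is chance level for 10 classes. One likely contributor is symmetry: when every weight in a layer starts at the same value, every unit in that layer computes the same output, so the units cannot learn different features. The NumPy sketch below illustrates this on made-up data. The array sizes and variable names are assumptions chosen for the illustration, not values from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((4, 8))  # 4 hypothetical samples with 8 features each

# Constant initialization: every weight is identical
w_const = np.ones((8, 5))
z_const = x @ w_const  # pre-activations of 5 hidden units

# Random initialization: every unit gets its own weights
w_rand = rng.normal(0.0, 0.1, size=(8, 5))
z_rand = x @ w_rand

# Compare every unit (column) with the first unit
const_units_identical = np.allclose(z_const, z_const[:, :1])
rand_units_identical = np.allclose(z_rand, z_rand[:, :1])

print("constant init, all units identical:", const_units_identical)
print("random init, all units identical:", rand_units_identical)
```

With the constant weights all five units produce the same value for each sample, while the randomly initialized units differ from one another.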

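Conclusion:
The model with lecun_normal has the highest accuracy (0.4829) and validation accuracy (0.4844) of the 16 configurations, so it is the best-performing initializer in this comparison and is the one used for the other datasets. Its margin over he_uniform (validation accuracy 0.4841) is small, and the runs shown use seed=None, so the ranking among the top initializers may change from run to run.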
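
To make the comparison easy to check, the sketch below ranks the validation accuracies from the results list above and reports the gap between the best and second-best configuration. The numbers are copied from that list. The dictionary keys are shortened labels chosen for this illustration.

```python
# Validation accuracies copied from the results list above
val_acc = {
    "baseline": 0.4714,
    "random uniform and zero": 0.4762,
    "ones": 0.1,
    "constant": 0.1,
    "random_normal": 0.4710,
    "random_uniform": 0.4695,
    "truncated_normal": 0.4718,
    "variance_scaling": 0.4750,
    "orthogonal": 0.1,
    "identity": 0.3416,
    "lecun_uniform": 0.4748,
    "glorot_normal": 0.4781,
    "glorot_uniform": 0.4785,
    "he_normal": 0.4667,
    "lecun_normal": 0.4844,
    "he_uniform": 0.4841,
}

# Sort from highest to lowest validation accuracy
ranked = sorted(val_acc.items(), key=lambda item: item[1], reverse=True)
best, runner_up = ranked[0], ranked[1]
margin = best[1] - runner_up[1]

for name, acc in ranked:
    print(f"{name:25s} {acc:.4f}")
print(f"best: {best[0]}, runner-up: {runner_up[0]}, margin: {margin:.4f}")
```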

Results with lecun_normal on the other datasets (10 epochs each):

CIFAR 100: Accuracy 0.2910, Validation Accuracy 0.2470

MNIST: Accuracy 0.9960, Validation Accuracy 0.9794

Fashion MNIST: Accuracy 0.9188, Validation Accuracy 0.8874

IRIS: Accuracy 0.9662, Validation Accuracy 1.0 (validation set of only 2 samples)
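
The IRIS validation accuracy should be read with care, because the split in that cell uses test_size=0.01. The short sketch below works through the arithmetic. It assumes the 150-sample Iris dataset and that the test count is rounded up, which is consistent with the "Train on 148 samples, validate on 2 samples" line in the training log. The variable names are chosen for this illustration.

```python
import math

n_samples = 150   # size of the Iris dataset
test_size = 0.01  # fraction passed to train_test_split in the IRIS cell

# Number of held-out samples, rounded up (matches the training log)
n_val = math.ceil(test_size * n_samples)
n_train = n_samples - n_val

# With so few validation samples, accuracy can only take a few values
possible_val_acc = [k / n_val for k in range(n_val + 1)]

print("train samples:", n_train)
print("validation samples:", n_val)
print("possible validation accuracies:", possible_val_acc)
```

With only two validation samples the reported value can only be 0.0, 0.5, or 1.0, so a larger test_size would give a more informative estimate.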


# Thank you for completing this notebook