# MNIST Wide Network

These experiments were run on Ubuntu 19.10 with an NVIDIA GeForce GTX 1080 Ti.

## PyTorch

MNIST wide network training with PyTorch.

Execute these commands in an Ubuntu terminal (this corresponds to configG in DeeperThought or WideOpenThoughts):

- `pip3 install torch torchvision`
- `python3 mnist2.py`
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 200, 8, 1)  # 28x28 -> 21x21, 200 channels
        self.fc1 = nn.Linear(1800, 130)       # 200 * 3 * 3 = 1800 inputs
        self.fc2 = nn.Linear(130, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.max_pool2d(x, 7, 7)             # 21x21 -> 3x3
        x = x.view(-1, 1800)
        x = self.fc1(x)
        x = torch.sigmoid(x)
        # pass self.training so dropout is disabled in eval mode
        x = F.dropout(x, 0.5, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
```
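The 1800 inputs of `fc1` come straight from VALID-convolution arithmetic; a framework-free sketch of that bookkeeping:

```python
def conv_out(size, kernel, stride=1):
    """Output side length of a VALID (unpadded) convolution or pooling."""
    return (size - kernel) // stride + 1

side = conv_out(28, 8)       # 8x8 conv over 28x28 MNIST -> 21
side = conv_out(side, 7, 7)  # 7x7 max-pool, stride 7 -> 3
print(200 * side * side)     # 200 channels * 3 * 3 = 1800
```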

Testing accuracy progress:

*(plot: wide network test accuracy)*

Top test accuracy: 99.42 %

## Keras + TensorFlow

Or, if you prefer Keras + TensorFlow:

- `pip3 install tensorflow`
- `pip3 install tensorflow-gpu`
- `python3 mnist5.py`
```python
model = Sequential()
model.add(Conv2D(200, kernel_size=(8, 8), input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(7, 7)))
model.add(Flatten())
model.add(Dense(130))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
```
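As a sanity check on the layer sizes, the parameter counts work out as follows (plain arithmetic, no framework needed):

```python
# Conv2D: 8x8 kernel, 1 input channel, 200 filters, plus biases
conv = 8 * 8 * 1 * 200 + 200       # 13000
# Dense(130) after Flatten: 200 * 3 * 3 = 1800 input features
fc1 = 1800 * 130 + 130             # 234130
# Dense(10) output layer
fc2 = 130 * 10 + 10                # 1310
print(conv + fc1 + fc2)            # 248440 trainable parameters
```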

```
Epoch 272/10000  val_loss: 0.0335 - val_accuracy: 0.9939
```

Top test accuracy: 99.39 %

## TensorFlow

Or, if you prefer pure TensorFlow code:

- `pip3 install tensorflow`
- `pip3 install tensorflow-gpu`
- `python3 mnist6.py`
```python
X_image = tf.reshape(X, [-1, 28, 28, 1])
Wconv_1 = tf.get_variable("WConv_1", shape=[8, 8, 1, 200], initializer=initializer)
bconv_1 = tf.get_variable("bConv_1", shape=[200], initializer=initializer)

# VALID 8x8 convolution: 28x28 -> 21x21; 7x7/stride-7 pool: 21x21 -> 3x3
h_conv1 = tf.nn.conv2d(X_image, Wconv_1, strides=[1, 1, 1, 1], padding='VALID') + bconv_1
h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 7, 7, 1], strides=[1, 7, 7, 1], padding='VALID')

h_pool2_flat = tf.reshape(h_pool1, [-1, 1800])  # 200 * 3 * 3
W_1 = tf.get_variable("W_1", shape=[1800, 130], initializer=initializer)
b_1 = tf.get_variable("b_1", shape=[130], initializer=initializer)

hidden = tf.matmul(h_pool2_flat, W_1) + b_1
hidden_sig = tf.nn.sigmoid(hidden)
hidden_drop = tf.nn.dropout(hidden_sig, keep_prob)

W_2 = tf.get_variable("W_2", shape=[130, 10], initializer=initializer)
b_2 = tf.get_variable("b_2", shape=[10], initializer=initializer)
output = tf.matmul(hidden_drop, W_2) + b_2
```
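The non-overlapping 7x7/stride-7 max-pool can be emulated with a reshape, which makes the 1800-wide flatten easy to verify (NumPy sketch, one channel of `h_conv1` for a single image):

```python
import numpy as np

fmap = np.random.rand(21, 21)  # a 28x28 input after the VALID 8x8 convolution

# 7x7 max-pool with stride 7: three non-overlapping windows per side
pooled = fmap.reshape(3, 7, 3, 7).max(axis=(1, 3))
print(pooled.shape)        # (3, 3)
print(200 * pooled.size)   # 1800, the width of h_pool2_flat
```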

*(plot: wide network TF test accuracy)*

Top test accuracy: 99.41 %

Or, if you want to try automatic augmentation (configH in DeeperThought):

- `pip3 install tensorflow`
- `pip3 install tensorflow-gpu`
- `python3 mnist7.py`
```python
x1 = tf.reshape(X, [-1, 1, 28, 28])

# five 28x28 augmentation matrices, initialized close to the identity
am = numpy.random.rand(1, 5, 28, 28) * 0.1
for i in range(5):
    for j in range(28):
        am[0, i, j, j] = 1.0
am = numpy.float32(am)
aug_mat = tf.get_variable("aug_mat", initializer=am, dtype=tf.float32)
x2 = tf.matmul(x1, aug_mat)  # broadcasts to (batch, 5, 28, 28)

X_image = tf.reshape(x2, [-1, 28, 28, 1])  # the 5 copies fold into the batch dim
Wconv_1 = tf.get_variable("WConv_1", shape=[8, 8, 1, 200], initializer=initializer)
bconv_1 = tf.get_variable("bConv_1", shape=[200], initializer=initializer)

h_conv1 = tf.nn.conv2d(X_image, Wconv_1, strides=[1, 1, 1, 1], padding='VALID') + bconv_1
h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 7, 7, 1], strides=[1, 7, 7, 1], padding='VALID')

h_pool2_flat = tf.reshape(h_pool1, [-1, 1800 * 5])  # ...and back into the features here
W_1 = tf.get_variable("W_1", shape=[1800 * 5, 130], initializer=initializer)
b_1 = tf.get_variable("b_1", shape=[130], initializer=initializer)

hidden = tf.matmul(h_pool2_flat, W_1) + b_1
hidden_sig = tf.nn.sigmoid(hidden)
hidden_drop = tf.nn.dropout(hidden_sig, keep_prob)

W_2 = tf.get_variable("W_2", shape=[130, 10], initializer=initializer)
b_2 = tf.get_variable("b_2", shape=[10], initializer=initializer)
output = tf.matmul(hidden_drop, W_2) + b_2
```
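The augmentation relies on batched `matmul` broadcasting: a `(batch, 1, 28, 28)` input times the `(1, 5, 28, 28)` matrix bank yields five learned row-mixing transforms of every image. A NumPy sketch with the noise term left out, so each matrix is exactly the identity:

```python
import numpy as np

x1 = np.random.rand(4, 1, 28, 28).astype(np.float32)  # a batch of 4 images

am = np.zeros((1, 5, 28, 28), dtype=np.float32)
for i in range(5):
    for j in range(28):
        am[0, i, j, j] = 1.0  # identity; the training code adds small noise

x2 = np.matmul(x1, am)  # broadcasts to (4, 5, 28, 28)
print(x2.shape)
# with pure identity matrices, all five copies equal the original image
print(np.allclose(x2[:, 3], x1[:, 0]))  # True
```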

*(plot: wide network with automatic augmentation, test accuracy)*

Top test accuracy: 99.44 %

The gain in DeeperThought is higher, though (99.41 % to 99.48 %, versus only 99.41 % to 99.44 % in TensorFlow). This might be due to the different training procedure and the relative step size used in DeeperThought.

Or if you want really wide networks...

Or, if you want to try two paths, with the second path using an FFT:

- `pip3 install tensorflow`
- `pip3 install tensorflow-gpu`
- `python3 mnist6c.py`
```python
# second path: 2-D FFT magnitude of the input image
X_image2 = tf.dtypes.cast(X_image, tf.complex64)
im_fft = tf.signal.fft2d(X_image2)
im_fft2 = tf.math.abs(im_fft)
im_fft3 = tf.dtypes.cast(im_fft2, tf.float32)  # was cast(X_image, ...), which discarded the FFT

Wconv_0 = tf.get_variable("WConv_0", shape=[4, 4, 1, 1], initializer=initializer)
im_fft4 = tf.nn.conv2d(im_fft3, Wconv_0, strides=[1, 1, 1, 1], padding='VALID')  # 28x28 -> 25x25

# first path: the same wide convolution as before
Wconv_1 = tf.get_variable("WConv_1", shape=[8, 8, 1, 200], initializer=initializer)
bconv_1 = tf.get_variable("bConv_1", shape=[200], initializer=initializer)

h_conv1 = tf.nn.conv2d(X_image, Wconv_1, strides=[1, 1, 1, 1], padding='VALID') + bconv_1
h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 7, 7, 1], strides=[1, 7, 7, 1], padding='VALID')

h_pool2_flat = tf.reshape(h_pool1, [-1, 1800])
im_fft4 = tf.reshape(im_fft4, [-1, 25 * 25])

# concatenate both paths before the fully connected layers
xx = tf.concat([im_fft4, h_pool2_flat], 1)
xx2 = tf.reshape(xx, [-1, 1800 + 25 * 25])

W_1 = tf.get_variable("W_1", shape=[1800 + 25 * 25, 130], initializer=initializer)
b_1 = tf.get_variable("b_1", shape=[130], initializer=initializer)

hidden = tf.matmul(xx2, W_1) + b_1
hidden_sig = tf.nn.sigmoid(hidden)
hidden_drop = tf.nn.dropout(hidden_sig, keep_prob)

W_2 = tf.get_variable("W_2", shape=[130, 10], initializer=initializer)
b_2 = tf.get_variable("b_2", shape=[10], initializer=initializer)
output = tf.matmul(hidden_drop, W_2) + b_2
```
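A NumPy sketch of the second path's preprocessing for a single 28x28 image (the graph above does the same per batch element, with the 4x4 convolution learned):

```python
import numpy as np

img = np.random.rand(28, 28).astype(np.float32)

# FFT path: 2-D FFT, keeping only the magnitude spectrum
mag = np.abs(np.fft.fft2(img))
print(mag.shape)          # (28, 28)

# a VALID 4x4 convolution then shrinks each side to 28 - 4 + 1 = 25,
# giving the 25 * 25 = 625 extra features concatenated with the 1800
print((28 - 4 + 1) ** 2)  # 625
```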

*(plot: wide network with two paths, test accuracy)*

Top test accuracy: 99.47 %
