
Benchmark: 2 conv avg pool + 1 fc #47

Closed
rfratila opened this issue Aug 31, 2017 · 7 comments
rfratila commented Aug 31, 2017

No preprocessing. See source code for exact network config.

Fashion-MNIST test accuracy: 97.39 %
Digit-MNIST test accuracy: 99.13 %

Source code: https://github.com/rfratila/Vulcan/blob/master/train_mnist_conv.py

Built with Lasagne and Theano
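
For orientation, a minimal Lasagne sketch of the "2 conv + avg pool + 1 fc" layout in the title might look like the following. The filter counts (16 and 32), the 5x5 kernels, and the 512-unit dense layer are taken from the Keras reconstruction later in this thread, not verified against train_mnist_conv.py:

from lasagne.layers import (InputLayer, Conv2DLayer, Pool2DLayer,
                            DenseLayer, DropoutLayer)
from lasagne.nonlinearities import rectify, softmax

net = InputLayer(shape=(None, 1, 28, 28))
net = Conv2DLayer(net, num_filters=16, filter_size=(5, 5),
                  nonlinearity=rectify)   # pad=0 ('valid') by default
net = Pool2DLayer(net, pool_size=(2, 2),
                  mode='average_exc_pad')  # average excludes padding, per the note below
net = Conv2DLayer(net, num_filters=32, filter_size=(5, 5),
                  nonlinearity=rectify)
net = Pool2DLayer(net, pool_size=(2, 2), mode='average_exc_pad')
net = DenseLayer(net, num_units=512, nonlinearity=rectify)
net = DropoutLayer(net, p=0.3)
net = DenseLayer(net, num_units=10, nonlinearity=softmax)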

kashif (Collaborator) commented Sep 2, 2017

Thanks @rfratila, I will confirm this and get back to you.

kashif (Collaborator) commented Sep 3, 2017

Is the network something like this:

from keras.models import Sequential
from keras.layers import Conv2D, AveragePooling2D, Dense, Dropout, Flatten

# input_shape (e.g. (28, 28, 1)) and num_classes (10) defined elsewhere
model = Sequential()
model.add(Conv2D(16, kernel_size=(5, 5),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(32, (5, 5), activation='relu'))
model.add(AveragePooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(num_classes, activation='softmax'))

because over 200 epochs I can only manage a test accuracy of 0.9156 on fashion-mnist.
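
For context, the training setup behind that number might look something like the sketch below; the optimizer, batch size, and data loading are assumptions, not taken from the thread:

from keras.utils import to_categorical

# x_train, y_train, x_test, y_test assumed already loaded (e.g. via this
# repo's own reader), reshaped to (n, 28, 28, 1) and scaled to [0, 1]
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(x_train, to_categorical(y_train, num_classes),
          batch_size=128, epochs=200)
score = model.evaluate(x_test, to_categorical(y_test, num_classes))
print('Test accuracy:', score[1])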

rfratila (Author) commented Sep 3, 2017

There should be another average pool between the two conv2d layers. Also, I think Keras defaults to no padding, which is what I use for both conv2d layers, but I'm not sure whether the Keras AveragePooling layer includes the extra padding in its average (I exclude it). Finally, I trained with cuDNN on a GPU but ran the tests on my computer, which only has CPUs, and I'm not sure whether Theano gives identical results on GPU and CPU.
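
To make the padding distinction concrete, here is a small NumPy sketch (not from the thread) of the two average-pooling conventions. On an odd-sized feature map the border windows are partial, and the two modes disagree exactly there:

import numpy as np

x = np.arange(25, dtype=float).reshape(5, 5)  # odd-sized feature map

def avg_pool_2x2(x, include_pad):
    # 2x2 average pooling, stride 2; border windows may be smaller than 2x2
    out = np.zeros(((x.shape[0] + 1) // 2, (x.shape[1] + 1) // 2))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            win = x[2*i:2*i+2, 2*j:2*j+2]
            denom = 4 if include_pad else win.size  # count padded zeros or not
            out[i, j] = win.sum() / denom
    return out

print(avg_pool_2x2(x, include_pad=True)[-1, -1])   # 6.0  -> 24 / 4, zero-padding counted
print(avg_pool_2x2(x, include_pad=False)[-1, -1])  # 24.0 -> 24 / 1, padding excluded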

kashif (Collaborator) commented Sep 3, 2017

model = Sequential()
model.add(Conv2D(16, kernel_size=(5, 5),
                 activation='relu',
                 input_shape=input_shape))
model.add(AveragePooling2D(pool_size=(2,2)))
model.add(Conv2D(32, (5, 5), activation='relu'))
model.add(AveragePooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(num_classes, activation='softmax'))

With this model I get, after 200 epochs, a test accuracy of 0.9144 on fashion-mnist.

I think your 97.39% test accuracy is a bit fishy.

rfratila (Author) commented Sep 3, 2017

Interesting. I have the trained model saved; when I run tests on it on the CPU I get ~97%, and when I run the exact same thing on the GPU I get ~92% (in both cases on the test set). Any idea why that may be?

kashif (Collaborator) commented Sep 4, 2017

Hard to say why... I would remove one layer at a time and compare the CPU and GPU versions to see whether a particular layer is responsible. Start with the Dropout layer, then perhaps the average-pooling layers, etc.
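
One way to do that layer-by-layer comparison (a sketch, assuming the Keras model above and a loaded x_test; run it once per device and diff the saved arrays):

import numpy as np
from keras.models import Model

# Run twice - e.g. THEANO_FLAGS=device=cpu and THEANO_FLAGS=device=cuda -
# then compare the saved files with np.allclose
for i, layer in enumerate(model.layers):
    probe = Model(inputs=model.input, outputs=layer.output)
    np.save('layer_%02d.npy' % i, probe.predict(x_test[:64]))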

After that, I would try a simpler SGD optimiser to see whether the CPU and GPU results become similar...

Also have a look at your .theanorc file to check whether Theano is defaulting to, say, float64 on the CPU... good luck!
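
For that last check, a minimal .theanorc that pins the dtype (so the CPU run cannot silently fall back to float64) looks like this:

[global]
floatX = float32
device = cpu

You can confirm what is actually in effect with python -c "import theano; print(theano.config.floatX)".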

hanxiao (Collaborator) commented Sep 4, 2017

Closing, as this is not a valid benchmark.
