Benchmark: 2 conv avg pool + 1 fc #47
Comments
Thanks @rfratila, I will confirm this and get back to you.
Is the network something like this?

```python
model = Sequential()
model.add(Conv2D(16, kernel_size=(5, 5),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(32, (5, 5), activation='relu'))
model.add(AveragePooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(num_classes, activation='softmax'))
```

Over 200 epochs I can only manage a test accuracy of 0.9156 on fashion-mnist.
There should be another average-pooling layer between the two Conv2D layers. Also, I think Keras defaults to no padding ('valid'), which is what I use for both Conv2D layers, but I'm not sure whether the Keras AveragePooling layer includes the extra padding in its calculation (I exclude it). Finally, I trained with cuDNN but ran the tests on my computer, which only has CPUs, and I'm not sure whether there is a discrepancy between GPU and CPU results for Theano.
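To see how 'valid' (no-padding) layers propagate through the stack, here is a quick sanity check of the spatial output sizes in plain Python. This is a sketch under stated assumptions: 28x28 Fashion-MNIST inputs, 5x5 kernels with stride 1, and non-overlapping 2x2 pools, matching the model above.

```python
def conv_out(n, k):
    # 'valid' convolution: no padding, stride 1
    return n - k + 1

def pool_out(n, p):
    # non-overlapping pooling (stride == pool size);
    # floor division drops any leftover border pixels
    return n // p

n = 28               # Fashion-MNIST image side
n = conv_out(n, 5)   # Conv2D(16, 5x5, valid)   -> 24
n = pool_out(n, 2)   # AveragePooling2D(2x2)    -> 12
n = conv_out(n, 5)   # Conv2D(32, 5x5, valid)   -> 8
n = pool_out(n, 2)   # AveragePooling2D(2x2)    -> 4
print(n)             # final feature maps are 4x4
```

Note that every pooling step here divides evenly, so the question of whether padded border cells are included in the average never actually arises for these input sizes.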
```python
model = Sequential()
model.add(Conv2D(16, kernel_size=(5, 5),
                 activation='relu',
                 input_shape=input_shape))
model.add(AveragePooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (5, 5), activation='relu'))
model.add(AveragePooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(num_classes, activation='softmax'))
```

With this model I get a test accuracy of 0.9144 on fashion-mnist after 200 epochs. I think your 97.39% test accuracy is a bit fishy.
Interesting. I have the trained model saved, and when I run tests on it on the CPU I get ~97%, but when I run the exact same thing on the GPU I get ~92% (in both cases on the test set). Any ideas as to why this may be?
Hard to say. I would remove one layer at a time and compare the CPU and GPU versions to see if a particular layer is responsible: start with the Dropout layer, then perhaps the average-pooling layers. After that I would try a simpler SGD optimiser to see whether the CPU and GPU results become similar. Also have a look at your .theanorc file to check that Theano is not defaulting to, say, float64 on the CPU. Good luck!
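To rule out a precision mismatch between the two runs, a .theanorc along these lines pins Theano to float32 on both devices. This is a sketch; the flag names come from Theano's configuration system, and the values are illustrative:

```ini
[global]
floatX = float32   # avoid silently falling back to float64 on the CPU
device = cpu       # switch to the GPU device when benchmarking there

[nvcc]
fastmath = False   # keep GPU arithmetic closer to CPU results
```

If the accuracy gap persists with floatX pinned and fastmath off, the precision hypothesis can be discarded and the layer-by-layer comparison above becomes the next step.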
Closing as this is not a valid benchmark.
No preprocessing. See source code for exact network config.
Fashion-MNIST test accuracy: 97.39 %
Digit-MNIST test accuracy: 99.13 %
Source code: https://github.com/rfratila/Vulcan/blob/master/train_mnist_conv.py
Built with Lasagne and Theano