Accuracy drops during extended training. #8
Comments
Thanks a lot for the interest in our work!
Hi, thanks for the quick reply. I'm attaching the file.
I closed the issue by mistake. Here's the code. You might need to change the 'data_directory' variable.
This looks like a learning rate problem! I tried running your code with a learning rate of 0.05 and had no problems.
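The diagnosis above can be illustrated with a minimal sketch (not the repo's code): plain gradient descent on a 1-D quadratic loss shows how a learning rate that is too large makes the updates overshoot and diverge, producing the kind of sudden collapse described in the issue, while a smaller rate such as the suggested 0.05 converges. The function and step counts here are illustrative assumptions.

```python
# Gradient descent on f(w) = w**2, whose only minimum is w = 0.
# A stable step size must satisfy lr < 1 for this loss; lr = 1.5
# overshoots and |w| grows without bound, while lr = 0.05 converges.

def train(lr, steps=100, w=1.0):
    for _ in range(steps):
        grad = 2 * w          # derivative of w**2
        w = w - lr * grad     # standard gradient-descent update
    return w

print(abs(train(0.05)))  # small lr: w shrinks toward the optimum at 0
print(abs(train(1.5)))   # large lr: updates overshoot and |w| blows up
```

The same intuition carries over to a real network: once the effective step size is too large for the loss surface, the weights are thrown far from any minimum and accuracy falls to chance.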
Oh okay. I was using the same parameters for a comparable real-valued network and that worked fine, which may be why I missed this. Thank you. On a side note, have you used your methods to try to construct a quaternion model that outperforms a real-valued counterpart on a classification task, say like the one in Gaudet's paper?
We did! In fact, that's what we are working on right now; as soon as we find a configuration with a noticeable improvement over real-valued NNs, we will update the repo.
Hi,
I've built the following quaternion CNN using the methods provided.
When training the model for an extended duration on the MNIST dataset, the accuracy suddenly drops to nearly 10%, which is what we would expect from an untrained ten-class model, and does not recover. An image of the accuracy values as training progresses is attached.
The same issue persists when using the methods in Parcollet's original repo. I would appreciate some insight into why this might be happening. If you need additional info, I can provide the code to reproduce this issue.
Thanks,
Sahel