
Training without reconstruction loss #34

Open
JeanElsner opened this issue Jun 5, 2018 · 3 comments

Comments

@JeanElsner
The fully connected layers added on top of the capsule network contain ~1.6M parameters, whereas the capsules themselves only have roughly 60k trainable parameters in the small configuration. Since matrix capsules are supposed to generalize better with fewer parameters than traditional architectures, this approach seems counterintuitive to me.
However, removing the reconstruction loss and training with spread loss alone doesn't appear to converge (on smallNORB). Were you able to train your network with spread loss only (as suggested by the paper)?
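
For reference, the spread loss from the paper is L = sum over i != t of max(0, m - (a_t - a_i))^2, where a_t is the target-class activation and the margin m is increased during training. A minimal NumPy sketch of what I mean (the function name and shapes are my own, not from this repo):

```python
import numpy as np

def spread_loss(activations, target, margin):
    """Spread loss from 'Matrix Capsules with EM Routing'.

    activations: (batch, num_classes) class-capsule activations
    target:      (batch,) integer class labels
    margin:      scalar m, annealed upward during training
    """
    batch = activations.shape[0]
    # a_t: activation of the target class, shape (batch, 1) for broadcasting
    a_t = activations[np.arange(batch), target][:, None]
    # penalize wrong classes whose activation comes within `margin` of a_t
    per_class = np.square(np.maximum(0.0, margin - (a_t - activations)))
    per_class[np.arange(batch), target] = 0.0  # exclude the i == t term
    return per_class.sum(axis=1).mean()
```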

@ashleygritzman

When I trained with spread loss only using 2 routing iterations, the training did converge, but the model only seemed to start learning after 6k iterations.

No matter what I have tried so far, I can't seem to get the model to work with 3 or more routing iterations.
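
One thing that may matter for the slow start is the margin annealing: the paper linearly increases the spread-loss margin from 0.2 to 0.9 during training. I used a simple linear ramp like the one below (the 50k step count is my own choice; as far as I can tell the paper doesn't specify the schedule length):

```python
def margin_schedule(step, anneal_steps=50_000, m_min=0.2, m_max=0.9):
    """Linearly anneal the spread-loss margin from m_min to m_max."""
    frac = min(step / anneal_steps, 1.0)
    return m_min + frac * (m_max - m_min)
```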

@JeanElsner
Author

Interesting. What were your results like? The paper reports a 2.2% test error rate on the smallNORB dataset, even with just two routing iterations (albeit with 32 capsule types per layer).

@ashleygritzman

I got more like 8-9% error. I haven't heard of anyone who has successfully reproduced the error rate reported in the paper yet.
