
Routing algorithm #8

Closed
bshao001 opened this issue Oct 30, 2017 · 5 comments

Comments

@bshao001

To the owner and all other visitors:

I do not mean to be offensive, but I have decided to share my understanding of this routing algorithm, since I have not yet seen a correct implementation of it.

A correct implementation of the routing algorithm should be treated something like a dynamic RNN in TensorFlow. In other words, if you implement it statically with 3 routing iterations, the two capsule layers are effectively unrolled into 6 such layers: the primary layer performs line 4 (of Procedure 1 in the paper) and passes its output to the digits layer, the digits layer performs lines 5, 6, and 7 and updates b_ij, and then control loops back to the primary layer again. A dynamic implementation would need tf.while_loop.
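
For concreteness, here is a minimal TensorFlow 1.x sketch (matching the era of this thread) of what a tf.while_loop version of Procedure 1 could look like. The shape I assume for `u_hat` and the `squash` helper are my own illustrative choices, not code from this repository:

```python
import tensorflow as tf  # TF 1.x style APIs (keep_dims, softmax dim=)

def squash(s):
    """Squashing non-linearity, Eq. 1 of the paper."""
    sq_norm = tf.reduce_sum(tf.square(s), axis=-1, keep_dims=True)
    return sq_norm / (1. + sq_norm) * s / tf.sqrt(sq_norm + 1e-9)

def dynamic_routing(u_hat, num_iters=3):
    # u_hat: [batch, in_caps, out_caps, out_dim], computed ONCE before routing.
    b0 = tf.zeros_like(u_hat[..., 0])   # logits b_ij, zeroed each forward pass
    v0 = tf.zeros_like(u_hat[:, :1])    # dummy tensor with the output's shape

    def body(i, b, v):
        c = tf.nn.softmax(b, dim=2)               # line 4: c_ij = softmax_j(b_ij)
        s = tf.reduce_sum(c[..., None] * u_hat,   # line 5: s_j = sum_i c_ij * u_hat
                          axis=1, keep_dims=True)
        v = squash(s)                             # line 6: v_j = squash(s_j)
        b = b + tf.reduce_sum(u_hat * v, -1)      # line 7: b_ij += u_hat . v_j
        return i + 1, b, v

    _, _, v = tf.while_loop(lambda i, b, v: i < num_iters,
                            body, [tf.constant(0), b0, v0])
    return tf.squeeze(v, axis=1)                  # [batch, out_caps, out_dim]
```

The point of the while_loop form is that all routing iterations share a single graph body instead of being unrolled into separate copies of the two capsule layers.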

What confuses me, and stops me from implementing it myself, is that I am not sure how the weights and biases associated with the conv units are updated. I assume that, besides the weights and biases associated with the capsules, each individual conv unit inside still carries its own parameters. Maybe I missed this when reading the paper.

Feel free to correct me if you believe I am wrong. Thanks.

@laubonghaudoi

Agreed. I still have difficulty understanding how b_ij and W_ij are updated; there is little information about this in the paper. Hinton has said in his talks that he wants to give up back-propagation, but what I have seen so far in this implementation still uses Adam to update the weights by back-propagating gradients through the two capsule layers.

@naturomics
Owner

@bshao001 You're right, I misunderstood the routing algorithm. Thank you for pointing it out. The interesting thing is that the loss still goes down steadily (though not as effectively as the results in the paper). I'm fixing it.

naturomics added a commit that referenced this issue Oct 31, 2017
@13331151

13331151 commented Nov 1, 2017

I don't think the routing process can be unfolded into 6 layers, because û_{j|i} is only computed once per forward pass.
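
Right: in the paper, the prediction vectors û_{j|i} = W_ij u_i are produced once, before the routing loop starts, and routing only re-weights them. A one-line sketch under assumed, illustrative shapes (these names and shapes are not from this repo):

```python
# Hypothetical shapes for illustration:
#   u: [batch, in_caps, in_dim]               primary capsule outputs
#   W: [in_caps, out_caps, out_dim, in_dim]   trainable transformation matrices
u_hat = tf.einsum('bid,ijkd->bijk', u, W)    # [batch, in_caps, out_caps, out_dim]
# u_hat is built once here; the routing iterations never recompute it.
```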

@InnerPeace-Wu

InnerPeace-Wu commented Nov 1, 2017

Here are some of my thoughts on the log prior (the bias b_ij). @bshao001 I think the static way will still work fine; the key is that we should not update the bias by gradient descent during routing. Since we only use dynamic routing to compute the digit caps in this case, the digit caps can still be updated correctly with the static unrolling. Here is my implementation of CapsNet for reference.
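
To make that concrete, here is a statically-unrolled sketch under the same assumed shapes as the while_loop version above. The key point is that b starts from zero on every forward pass and is a plain tensor, not a tf.Variable, so back-propagation updates W_ij but never b_ij:

```python
def routing_static(u_hat, num_iters=3):
    # Same math as the tf.while_loop sketch, unrolled as a Python loop.
    # b is re-created at zero each forward pass (NOT a trainable variable),
    # so gradients only flow into the weights that produced u_hat.
    b = tf.zeros_like(u_hat[..., 0])
    for _ in range(num_iters):
        c = tf.nn.softmax(b, dim=2)
        s = tf.reduce_sum(c[..., None] * u_hat, axis=1, keep_dims=True)
        v = squash(s)                        # squash() as defined earlier
        b = b + tf.reduce_sum(u_hat * v, -1)
    return tf.squeeze(v, axis=1)
```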

@bshao001
Author

bshao001 commented Feb 2, 2018

@naturomics
Unfortunately, my prediction came true: the original authors' code does indeed use tf.while_loop to implement the core _update_routing method.
