
Routing #28

Closed
egorkrash opened this issue Dec 1, 2017 · 7 comments

egorkrash commented Dec 1, 2017

As I understand it, you reset the coupling coefficients after each training sample (batch). Don't you think it would be better to keep their previous state and update them from there?

XifengGuo (Owner) commented

@egorkrash First, the paper resets the coupling coefficients after each sample. Second, I don't think the values should be kept. Each sample should have its own coefficients, so keeping them all would take too many resources. And at test time, the samples have no coefficients in memory, so their coefficients have to be initialized to uniform. For consistency, we shouldn't keep the coefficients of training samples during training either.


egorkrash commented Dec 1, 2017

@XifengGuo Sorry if I seem offensive. I'm just trying to fix my misunderstanding, and it would be great if you could help me do that.

Could you please point me to where the paper says the coupling coefficients should be reset after each training sample? And in what way do you mean that keeping them would cost too many resources? Also, should the routing algorithm run during testing?


XifengGuo commented Dec 1, 2017

@egorkrash
[image: the routing algorithm from the paper]
As shown in the routing algorithm, it considers only one sample and resets b_ij to 0 at every call. The routing algorithm is called in every forward pass. So whenever a training sample comes, the b_ij for this sample will evolve from 0 over some iterations. Even when the same sample comes in the next epoch, when the routing algorithm is called, its b_ij will again start from 0. To sum up, the coupling coefficients are reset at each training step.
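For illustration, here is a minimal NumPy sketch of one routing call in the spirit of the algorithm described above (this is not the repo's Keras implementation; the shapes — 1152 input capsules, 10 output capsules, 16-D vectors — and the helper names are assumptions for the example). The key point is that `b` is created inside the function, so the logits start from 0 on every forward pass:

```python
import numpy as np

def squash(s, eps=1e-8):
    # Squashing non-linearity: shrinks each vector's length into [0, 1).
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def routing(u_hat, n_iters=3):
    # u_hat: prediction vectors, shape (num_in, num_out, dim) for ONE sample.
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # logits reset to 0 on EVERY call
    for _ in range(n_iters):
        # c_ij = softmax of b_i over the output capsules j
        c = np.exp(b - b.max(axis=1, keepdims=True))
        c /= c.sum(axis=1, keepdims=True)
        s = (c[..., None] * u_hat).sum(axis=0)        # (num_out, dim)
        v = squash(s)
        b = b + (u_hat * v[None, :, :]).sum(axis=-1)  # agreement update
    return v

rng = np.random.default_rng(0)
v = routing(rng.normal(scale=0.05, size=(1152, 10, 16)))
print(v.shape)  # (10, 16)
```

Because `b` is a local variable, nothing persists between calls, which matches the "reset at each training step" behavior described above.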

Each sample requires a b_ij of size 1152*10 = 11,520. With batch training at batch size 100, a batch holds 11,520*100 = 1,152,000 elements. Since we don't need to save the values of b_ij after routing, we can reuse the same 1,152,000 elements for every batch (they are reset to 0 at each training step anyway). However, if we wanted to keep track of b_ij for every sample in the training set, we would have to store 60,000 * 11,520 = 6.912e8 elements. At 4 bytes per float, that is about 2.76 GB just for the b_ijs.
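The arithmetic can be reproduced as a quick sanity check (the figures — MNIST's 60,000 training samples, 1152 primary capsules, 10 digit capsules, 4-byte floats — are taken from the discussion above):

```python
# Back-of-envelope memory cost of keeping one b_ij per training sample.
per_sample = 1152 * 10                 # 11,520 routing logits per sample
per_batch = per_sample * 100           # 1,152,000 for a batch of 100
whole_train_set = per_sample * 60_000  # one persistent b_ij per sample

print(per_sample)                      # 11520
print(per_batch)                       # 1152000
print(whole_train_set)                 # 691200000
print(whole_train_set * 4 / 1e9)       # ~2.76 GB at 4 bytes per element
```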

The routing algorithm certainly also runs in the testing phase. For each test sample, b_ij starts from 0, the same as in training.

You can get more perspectives on routing in #1.

egorkrash (Author) commented

@XifengGuo Thanks a lot for the explanation!
As for memory, I didn't suggest keeping b_ij in memory for each sample. I thought that updating them during training without zeroing them after each batch might work.
One thing still seems strange to me: why is it better to have a shared b_ij for all samples in a batch?

XifengGuo (Owner) commented

@egorkrash b_ij is not shared across the samples in a batch; each sample in the batch has its own b_ij.
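Concretely, in a batched implementation this just means the routing logits carry a leading batch dimension (shapes here are assumptions matching the numbers earlier in the thread, not the repo's exact tensor layout):

```python
import numpy as np

# Batch of 100 samples, 1152 primary capsules, 10 digit capsules.
batch, num_in, num_out = 100, 1152, 10
b = np.zeros((batch, num_in, num_out), dtype=np.float32)

# b[k] is the routing-logit matrix for sample k alone; nothing is
# shared across the batch dimension.
print(b.shape)   # (100, 1152, 10)
print(b.nbytes)  # 4608000 bytes at 4 bytes per float32 element
```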


ghost commented Jan 18, 2018

Thank you very much for your clear explanation!


jlevy44 commented Sep 13, 2019

Interesting discussion. Many of the tutorials I've seen for CapsNets do not keep a b_ij per sample, which is rather unfortunate.
