I was wondering what the benefit is of feeding the proposed region of the input image through the encoder again up to the 3rd layer (where x/y = 24/24). These exact features were already computed in the first pass, and you could crop the respective region out of the feature map using the normalized proposal coordinates, then go straight to the 2x2x2 max pooling layer.
Or am I getting something wrong?
Thanks for your answer!
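The cropping suggested above can be sketched in a few lines of numpy: map the normalized proposal coordinates to feature-map indices, slice the region out, and apply 2x2x2 max pooling. `crop_feature_region` and `max_pool_2x2x2` are hypothetical helper names for illustration, not functions from this repository.

```python
import numpy as np

def crop_feature_region(feat, norm_start, norm_end):
    """Crop a sub-volume out of a 3-D feature map (C, D, H, W) given
    proposal start/end coordinates normalized to [0, 1] per axis."""
    _, d, h, w = feat.shape
    # Map normalized coordinates to rounded feature-map indices.
    z0, y0, x0 = (int(round(s * n)) for s, n in zip(norm_start, (d, h, w)))
    z1, y1, x1 = (int(round(e * n)) for e, n in zip(norm_end, (d, h, w)))
    return feat[:, z0:z1, y0:y1, x0:x1]

def max_pool_2x2x2(feat):
    """2x2x2 max pooling with stride 2 on a (C, D, H, W) volume."""
    c, d, h, w = feat.shape
    # Trim odd trailing voxels so the volume tiles into 2x2x2 blocks.
    feat = feat[:, : d // 2 * 2, : h // 2 * 2, : w // 2 * 2]
    feat = feat.reshape(c, d // 2, 2, h // 2, 2, w // 2, 2)
    return feat.max(axis=(2, 4, 6))
```

Conceptually this is the RoI-pooling idea from Fast/Faster R-CNN applied to a 3-D feature map.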
Theoretically it is absolutely feasible, but consider that the features change from epoch to epoch (since the weights of the conv layers are updated), so you would have to extract the features again and again during training. The method you mention is therefore time-consuming.
Instead, I store the coordinates during the first pass and extract features from a small patch (96x96x96) around them, so the computation during feature extraction is very low.
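The store-coordinates-then-crop-a-patch step described above can be sketched as follows, assuming zero padding where the patch crosses the volume border; `extract_patch` is an illustrative helper, not the repository's actual code.

```python
import numpy as np

def extract_patch(volume, center, size=96, pad_value=0):
    """Extract a size^3 patch centered on `center` (z, y, x) from a
    3-D volume, filling out-of-bounds voxels with `pad_value`."""
    half = size // 2
    starts = [c - half for c in center]
    patch = np.full((size,) * 3, pad_value, dtype=volume.dtype)
    src, dst = [], []
    for s, dim in zip(starts, volume.shape):
        s0, s1 = max(s, 0), min(s + size, dim)  # clamp to volume bounds
        src.append(slice(s0, s1))               # region read from the volume
        dst.append(slice(s0 - s, s1 - s))       # where it lands in the patch
    patch[tuple(dst)] = volume[tuple(src)]
    return patch
```

The fixed patch size means the second forward pass always runs on a small, constant-shape input, which is where the computational saving comes from.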
Liao Fangzhou
School of Medicine
Tsinghua University
Beijing 100084
China
Right, so this is due to the alternating training procedure you use. What if you trained the two losses simultaneously? You could then take the features and proposal coordinates from a single pass, similar to the "approximate joint training" described in the Faster R-CNN paper. Does this make the results worse in your case?
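The joint-training idea can be illustrated with a toy sketch: two losses computed from one shared forward pass, updated with the gradient of their sum in a single step. Everything here (the scalar "encoder" weight, the targets, the learning rate) is purely illustrative and not the training code of this repository.

```python
import numpy as np

# Toy model: one shared "encoder" parameter w feeds two heads whose
# losses are trained jointly by summing gradients in a single update,
# instead of alternating between the two objectives.
x = np.linspace(-1.0, 1.0, 8)      # one batch of inputs
t_det, t_cls = 2.0 * x, 3.0 * x    # targets of the two heads
w, lr = 0.5, 0.05                  # shared weight, learning rate

for _ in range(200):
    feat = w * x                                   # single shared forward pass
    grad_det = np.mean(2 * (feat - t_det) * x)     # d(detector loss)/dw
    grad_cls = np.mean(2 * (feat - t_cls) * x)     # d(classifier loss)/dw
    w -= lr * (grad_det + grad_cls)                # one joint update
# w converges to 2.5, the minimizer of the summed losses
```

Alternating training would instead apply `grad_det` and `grad_cls` in separate phases, which is why the features differ between the pass that produced the proposals and the pass that classifies them.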