I was wondering what the benefit is of feeding the proposed region of the input image through the encoder again up to the 3rd layer (where x/y = 24/24). These exact features were already computed in the first pass, and you could crop the respective region out of the feature map using the normalized proposal coordinates, then go straight to the 2x2x2 max pooling layer.
Or am I getting something wrong?
Thanks for your answer!
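The cropping suggested above can be sketched in a few lines of numpy: map the normalized proposal coordinates to feature-map indices, slice the region out, and apply 2x2x2 max pooling. `crop_feature_region` and `max_pool_2x2x2` are hypothetical helper names for illustration, not functions from this repository.

```python
import numpy as np

def crop_feature_region(feat, norm_start, norm_end):
    """Crop a sub-volume out of a 3-D feature map (C, D, H, W) given
    proposal start/end coordinates normalized to [0, 1] per axis."""
    _, d, h, w = feat.shape
    # Map normalized coordinates to rounded feature-map indices.
    z0, y0, x0 = (int(round(s * n)) for s, n in zip(norm_start, (d, h, w)))
    z1, y1, x1 = (int(round(e * n)) for e, n in zip(norm_end, (d, h, w)))
    return feat[:, z0:z1, y0:y1, x0:x1]

def max_pool_2x2x2(feat):
    """2x2x2 max pooling with stride 2 on a (C, D, H, W) volume."""
    c, d, h, w = feat.shape
    # Trim odd trailing voxels so the volume tiles into 2x2x2 blocks.
    feat = feat[:, : d // 2 * 2, : h // 2 * 2, : w // 2 * 2]
    feat = feat.reshape(c, d // 2, 2, h // 2, 2, w // 2, 2)
    return feat.max(axis=(2, 4, 6))
```

Conceptually this is the RoI-pooling idea from Fast/Faster R-CNN applied to a 3-D feature map.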
Theoretically it is absolutely feasible, but consider that the features change from epoch to epoch (since the weights of the conv layers are updated), so you would have to extract the features again and again during training. The method you mention is therefore time-consuming.
Instead, I store the coordinates during the first pass and extract features from a small patch (96x96x96) around them, so the computation during feature extraction is very low.
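The store-coordinates-then-crop-a-patch step described above can be sketched as follows, assuming zero padding where the patch crosses the volume border; `extract_patch` is an illustrative helper, not the repository's actual code.

```python
import numpy as np

def extract_patch(volume, center, size=96, pad_value=0):
    """Extract a size^3 patch centered on `center` (z, y, x) from a
    3-D volume, filling out-of-bounds voxels with `pad_value`."""
    half = size // 2
    starts = [c - half for c in center]
    patch = np.full((size,) * 3, pad_value, dtype=volume.dtype)
    src, dst = [], []
    for s, dim in zip(starts, volume.shape):
        s0, s1 = max(s, 0), min(s + size, dim)  # clamp to volume bounds
        src.append(slice(s0, s1))               # region read from the volume
        dst.append(slice(s0 - s, s1 - s))       # where it lands in the patch
    patch[tuple(dst)] = volume[tuple(src)]
    return patch
```

The fixed patch size means the second forward pass always runs on a small, constant-shape input, which is where the computational saving comes from.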
Liao Fangzhou
School of Medicine
Tsinghua University
Beijing 100084
China
Right, so this is due to the alternating training procedure you use. What if you trained the two losses simultaneously? You could then take the features and proposal coordinates from a single pass, similar to the "approximate joint training" described in the Faster R-CNN paper. Does this make the results worse in your case?
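The joint-training idea can be illustrated with a toy sketch: two losses computed from one shared forward pass, updated with the gradient of their sum in a single step. Everything here (the scalar "encoder" weight, the targets, the learning rate) is purely illustrative and not the training code of this repository.

```python
import numpy as np

# Toy model: one shared "encoder" parameter w feeds two heads whose
# losses are trained jointly by summing gradients in a single update,
# instead of alternating between the two objectives.
x = np.linspace(-1.0, 1.0, 8)      # one batch of inputs
t_det, t_cls = 2.0 * x, 3.0 * x    # targets of the two heads
w, lr = 0.5, 0.05                  # shared weight, learning rate

for _ in range(200):
    feat = w * x                                   # single shared forward pass
    grad_det = np.mean(2 * (feat - t_det) * x)     # d(detector loss)/dw
    grad_cls = np.mean(2 * (feat - t_cls) * x)     # d(classifier loss)/dw
    w -= lr * (grad_det + grad_cls)                # one joint update
# w converges to 2.5, the minimizer of the summed losses
```

Alternating training would instead apply `grad_det` and `grad_cls` in separate phases, which is why the features differ between the pass that produced the proposals and the pass that classifies them.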