I read the original paper and this implementation, and it's awesome!
However, I have a couple of questions about the details of the SSTD net, and I'm really looking forward to seeing your reply :)
(1) In the deconvolution part, I see that you use groups=64 to upsample. Generally speaking, groups=1 might be more reasonable, so I guess this is to save computation? Or is there another reason?
(2) The original paper uses deconv3_3 and conv1_1 to build the attention map, but I see that you're using deconv16_16 and two conv3_3 layers instead. Does that mean this implementation works better than the one in the original paper?
It's very nice code and I'd really appreciate your comments!
Thanks
(1) Yes. You can also try groups=1 to see the difference. I am not very familiar with this part; I found an interesting paper, ShuffleNet, that discusses grouped convolutions.
(2) I would say this part is not so important. I do not observe much difference in performance.
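For what it's worth, the parameter-count gap between the two settings is easy to check. Below is a minimal PyTorch sketch, assuming a 64-channel feature map upsampled 2x with a 4x4, stride-2 deconvolution (these sizes are illustrative, not taken from the repo's actual layer configuration):

```python
import torch
import torch.nn as nn

C = 64  # hypothetical channel count, matching the groups=64 setting discussed above

# groups=C: each channel is upsampled independently (depthwise-style deconvolution),
# so each filter only sees one input channel.
deconv_grouped = nn.ConvTranspose2d(C, C, kernel_size=4, stride=2,
                                    padding=1, groups=C, bias=False)

# groups=1: every output channel mixes all C input channels.
deconv_full = nn.ConvTranspose2d(C, C, kernel_size=4, stride=2,
                                 padding=1, groups=1, bias=False)

params_grouped = sum(p.numel() for p in deconv_grouped.parameters())
params_full = sum(p.numel() for p in deconv_full.parameters())

print(params_grouped)  # 64 * 1 * 4 * 4  = 1024
print(params_full)     # 64 * 64 * 4 * 4 = 65536

# Both variants produce the same output shape; only the channel mixing differs.
x = torch.randn(1, C, 8, 8)
assert deconv_grouped(x).shape == (1, C, 16, 16)
assert deconv_full(x).shape == (1, C, 16, 16)
```

So groups=64 uses 64x fewer weights (and proportionally fewer multiply-adds) for the same upsampling, which supports the "saving computation" guess; with groups=1 the layer can additionally learn cross-channel mixing during upsampling.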