
Model details different from original paper #28

Closed
weishuanglong opened this issue May 30, 2018 · 1 comment

Comments

@weishuanglong

Hi there,

I read the original paper and this implementation, and it's awesome!

However, I have a couple of questions about the details of the SSTD net, and I'm really looking forward to your reply :)

(1) In the deconvolution part, I see that you use groups=64 to upsample. Generally speaking, groups=1 might be more reasonable, so I guess groups=64 is used to reduce computational cost? Or are there other reasons?

(2) The original paper uses deconv3_3 and conv1_1 to build the attention map, but this implementation uses deconv16_16 and two conv3_3 layers. Does that mean this implementation works better than the one in the original paper?

It's very nice code and I'd really appreciate your comments!

Thanks

@BestSonny
Owner

@weishuanglong Thank you for the questions.

(1) Yes. You can also try groups=1 to see the difference. I am not very familiar with this part; ShuffleNet is one interesting paper I found that discusses grouped convolutions.
(2) I would say this part is not that important. I do not observe much difference in performance.
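For a sense of the savings from grouped deconvolution, here is a minimal parameter-count sketch. It assumes the usual transposed-convolution weight layout (in_channels, out_channels // groups, kH, kW), as in PyTorch's ConvTranspose2d; the 64-channel, 4x4-kernel layer below is hypothetical and may not match the exact sizes in this repo:

```python
def deconv_params(in_ch, out_ch, k, groups):
    """Weight count for a 2D transposed convolution.

    Weight tensor shape is (in_ch, out_ch // groups, k, k),
    plus out_ch bias terms.
    """
    assert in_ch % groups == 0 and out_ch % groups == 0
    return in_ch * (out_ch // groups) * k * k + out_ch

# Hypothetical 64-channel upsampling layer with a 4x4 kernel.
dense   = deconv_params(64, 64, 4, groups=1)   # dense: every input channel feeds every output
grouped = deconv_params(64, 64, 4, groups=64)  # depthwise: one filter per channel
print(dense, grouped)  # 65600 1088
```

With groups=64 each output channel only sees its own input channel, which is roughly 60x fewer weights here; a depthwise deconvolution like this is also a common way to implement fixed (e.g. bilinear) upsampling cheaply.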
