
Model details different from original paper #28

Closed
weishuanglong opened this issue May 30, 2018 · 1 comment

Comments

@weishuanglong

Hi there,

I read the original paper and this implementation, and it's awesome!

However, I have a couple of questions about the details of the SSTD net, and I'm really looking forward to your reply :)

(1) In the deconvolution part, I see that you use groups=64 to upsample. Generally speaking, groups=1 might be more reasonable, so I guess groups=64 is used to reduce computational cost? Or are there other reasons?

(2) The original paper uses deconv3_3 and conv1_1 to build the attention map, but this implementation uses deconv16_16 and two conv3_3 layers. Does that mean this implementation works better than the one in the original paper?

It's very nice code and I'd really appreciate your comments!

Thanks

@BestSonny
Owner

@weishuanglong Thank you for the questions.

(1) Yes. You can also try groups=1 to see the difference. I am not very familiar with this part; ShuffleNet is one interesting paper I found that discusses grouped convolutions.
(2) I would say this part is not that important. I do not observe much difference in performance.
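For a sense of the savings from grouped deconvolution, here is a minimal parameter-count sketch. It assumes the usual transposed-convolution weight layout (in_channels, out_channels // groups, kH, kW), as in PyTorch's ConvTranspose2d; the 64-channel, 4x4-kernel layer below is hypothetical and may not match the exact sizes in this repo:

```python
def deconv_params(in_ch, out_ch, k, groups):
    """Weight count for a 2D transposed convolution.

    Weight tensor shape is (in_ch, out_ch // groups, k, k),
    plus out_ch bias terms.
    """
    assert in_ch % groups == 0 and out_ch % groups == 0
    return in_ch * (out_ch // groups) * k * k + out_ch

# Hypothetical 64-channel upsampling layer with a 4x4 kernel.
dense   = deconv_params(64, 64, 4, groups=1)   # dense: every input channel feeds every output
grouped = deconv_params(64, 64, 4, groups=64)  # depthwise: one filter per channel
print(dense, grouped)  # 65600 1088
```

With groups=64 each output channel only sees its own input channel, which is roughly 60x fewer weights here; a depthwise deconvolution like this is also a common way to implement fixed (e.g. bilinear) upsampling cheaply.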
