How to increase model capacity for training on a larger dataset? #53

daniyalDE · 2020-08-11T12:32:21Z

First of all, thanks for the amazing work on U-2-Net. I am now trying to train the model from scratch on my own dataset of 60k images, which is larger than your dataset. I would like to know how I can increase the model capacity so that it can be trained on such a dataset.

I have considered replacing the standard rebnconv blocks with residual blocks, as suggested in another issue. What other options could I try? I understand that I need to make the architecture deeper. Does this mean that I should build RSU-8 or RSU-9 blocks by adding more convolution layers?

xuebinqin · 2020-08-12T03:31:21Z

Thanks for your interest. You can try the following ideas: (1) increase the number of filters in each layer or add more layers to the basic bn_relu_conv module, (2) remove some of the dense supervision, (3) try to build RSU-8 or RSU-9, (4) input resolution also matters, etc.
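
To get a rough feel for what idea (1) costs, the sketch below counts the trainable parameters of one conv + batch-norm + ReLU unit as the channel width grows. It assumes a 3x3 convolution with bias followed by batch normalization with a learnable scale and shift; the helper name `conv_bn_params` and the channel widths are illustrative and not taken from the repository.

```python
def conv_bn_params(in_ch, out_ch, kernel_size=3):
    """Trainable parameters of a conv (with bias) followed by batch norm."""
    conv_weights = in_ch * out_ch * kernel_size * kernel_size
    conv_bias = out_ch
    bn_scale_shift = 2 * out_ch  # learnable scale and shift of batch norm
    return conv_weights + conv_bias + bn_scale_shift


# Hypothetical widths: compare a 64-channel unit with wider variants.
for width in (64, 96, 128):
    print(width, conv_bn_params(width, width))
```

Under these assumptions, doubling the width roughly quadruples the parameters of each unit, so memory and compute grow much faster than the filter count itself.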

shgidi · 2020-08-12T05:41:41Z

@daniyalDE Hi Daniel, I'm interested in similar tasks as well.
Why do you assume that the original model doesn't have the capacity for such a task? How do you determine that the model was "maxed out" on the 10K dataset it was trained on?

daniyalDE · 2020-08-12T10:07:05Z

@NathanUA thanks for the feedback. One last thing regarding (4), the input resolution: from what I understand, the training dataloader always rescales the input images to 320x320. So if I want to train with higher-resolution images, should I change the rescale size to a higher value?

daniyalDE · 2020-08-12T10:11:41Z

@shgidi I have tried training on my dataset, and the loss/accuracy stalls after a while. This might mean that the current model is not complex enough to learn the features of my data, which is very different from the datasets the model was originally trained on.

xuebinqin · 2020-08-13T02:28:59Z

Yes. At the same time, you may also need to modify the random_crop size correspondingly.
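
A minimal sketch of keeping the crop size in proportion when the rescale size changes. The 320 comes from the discussion above; the base crop of 288 is an assumption about the training script's default and should be checked against your copy of u2net_train.py. The helper name `crop_for_rescale` is hypothetical.

```python
def crop_for_rescale(rescale_size, base_rescale=320, base_crop=288):
    """Scale the random-crop size by the same factor as the rescale size.

    Integer arithmetic is used so the result does not depend on
    floating-point rounding.
    """
    if rescale_size <= 0:
        raise ValueError("rescale_size must be positive")
    return rescale_size * base_crop // base_rescale


# Candidate rescale sizes and the crop sizes that keep the same ratio.
for size in (320, 480, 640):
    print(size, crop_for_rescale(size))
```

Larger inputs also increase memory use per image, so the batch size may need to shrink accordingly.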

EricLe-dev · 2020-09-04T12:31:11Z

Can you please tell me how to disable the side outputs? I tried disabling them by commenting them out, but it did not work. Thank you so much.

xuebinqin · 2020-09-08T03:12:01Z

The simplest way to disable them is to comment out lines 32-37 in u2net_train.py and change line 39 to: loss = loss0.
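
The same change can be expressed as a switch instead of commented-out lines. This is a minimal sketch, not the repository's code: it assumes the per-output losses are collected in a list with the fused output's loss (loss0) first, and the function name `fuse_losses` and its flag are hypothetical. Plain floats stand in for loss tensors.

```python
def fuse_losses(losses, use_side_outputs=True):
    """Combine per-output losses; losses[0] is the fused output's loss (loss0)."""
    if not losses:
        raise ValueError("expected at least one loss")
    if not use_side_outputs:
        return losses[0]  # same effect as `loss = loss0`
    total = losses[0]
    for side_loss in losses[1:]:
        total = total + side_loss
    return total


example = [0.5, 0.25, 0.25]  # stand-ins for loss0, loss1, loss2
print(fuse_losses(example))
print(fuse_losses(example, use_side_outputs=False))
```

With the flag off, the network still computes its side outputs; they simply no longer receive their own supervision.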

EricLe-dev · 2020-09-08T03:52:32Z

Thank you so much for your reply. I have a very quick question, since I am a big fan of your previous work, BASNet. Does this share any similarity with lines 47-53 in basnet_train.py (https://github.com/NathanUA/BASNet/blob/6355e62eeb20fa7a033092e33b2d4d87e879b0cc/basnet_train.py#L47)?

I would also like this kind of behavior with BASNet.
Your quick response is appreciated.

xuebinqin · 2020-09-08T05:04:34Z

Thanks for your interest. It is a bit different. You may have to keep both loss0 and loss1, because loss0 is computed on the refined version of the prediction that loss1 supervises.
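
For a case like this, where only some outputs should keep their supervision, per-output weights are one way to express it. This is a minimal sketch under the assumption that the losses are available as a list ordered loss0, loss1, and so on; the function name `weighted_loss` and the weights are hypothetical, and floats stand in for loss tensors.

```python
def weighted_loss(losses, weights):
    """Weighted sum of per-output losses; a weight of 0 drops that output."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    kept = [w * l for l, w in zip(losses, weights) if w != 0]
    if not kept:
        raise ValueError("at least one weight must be non-zero")
    total = kept[0]
    for term in kept[1:]:
        total = total + term
    return total


# Keep loss0 and loss1, drop the remaining side losses.
print(weighted_loss([0.5, 0.25, 1.0, 1.0], [1, 1, 0, 0]))
```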

daniyalDE closed this as completed Aug 13, 2020
