Do you have a trained model (checkpoint) for this? #62

LiuTingWed · 2019-10-14T13:34:58Z

Hi! This is amazing work!
I would like to use this project for some research, but my PC is too slow.
Could you provide a trained model?
Thanks.

zhizhangxian · 2019-10-19T02:30:17Z

Do you mean search or retrain?

LiuTingWed · 2019-10-19T08:11:19Z

Yes, I would really appreciate it if you could provide one.

zhizhangxian · 2019-10-19T12:12:27Z

But the search result is not very good right now. If you want, maybe I can share it through a baiduyun drive.

LiuTingWed · 2019-10-24T12:06:52Z

Never mind, thanks :-)

LiuTingWed · 2019-10-24T12:26:02Z

I am very interested in this paper and in your PyTorch implementation of it.
But judging from the figure of implementation results that you provide, the curve almost matches the paper's result.
Why do you say the search result is not very good?

zhizhangxian · 2019-10-25T03:30:23Z

The mIoU is lower than the paper reports.
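For readers unfamiliar with the metric being compared here, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed from predicted and ground-truth label maps. It is an illustration only and not the evaluation code used in this project: the function name `mean_iou`, the flattened-list input, and the choice to skip classes absent from both maps are all assumptions.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes present in pred or target.

    pred and target are flat sequences of integer class labels of
    equal length (a hypothetical input format for this sketch).
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both maps agree on class c.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either map says class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: ignore classes absent from both maps
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: 6 pixels, 3 classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

Real evaluation scripts usually accumulate a confusion matrix over the whole validation set before dividing, so per-image averages like this one can differ from the reported dataset-level mIoU.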

LiuTingWed · 2019-10-25T13:19:34Z

Oh, that's strange.
Have you figured out what the issue is?
Did the search end up with the same cell architecture as in the paper?

zhizhangxian · 2019-10-25T15:33:21Z

No, we have not solved this issue yet.
The architecture found by the search is obviously different from the one the paper reports. When I run DARTS, I don't get the same result as the paper either...

LiuTingWed · 2019-10-27T09:22:38Z

What a pity!
If you don't mind, please tell me the reason when you figure it out.
By the way, what GPU do you use for training, a P100?

zhizhangxian · 2019-10-27T09:23:44Z

For search: 1 × V100
For retrain: 8 × 2080 Ti

zhizhangxian · 2019-11-05T03:10:18Z

Hey, we have now got a better search result: 0.34 mIoU.

baiduyun drive: https://pan.baidu.com/s/1ASRyzK_0m9CvhfN3yHZZ5Q
passwd: px6y

LiuTingWed · 2019-11-05T11:19:51Z

thanks :-)

Sunshine-Ye · 2020-01-11T12:43:18Z

> in search V100 *1
> in retrain 2080ti * 8

Hi! Thanks for doing such amazing work!
If it is convenient, could you tell me how long the retrain takes with 8 × 2080 Ti?
And what performance can be achieved with the derived model from the paper?
@zhizhangxian

zhizhangxian · 2020-01-11T14:11:11Z

As far as I remember, I trained autodeeplab-M for about 1M iterations without SDP and got 79.8 mIoU without MS.

zhizhangxian · 2020-01-11T14:11:54Z

Total training time was about 20 days.
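As a rough sanity check on these figures, the sketch below converts "about 20 days for about 1M iterations" into an average time per iteration and projects the duration of a shorter run at the same speed. Both inputs are approximate numbers from the comments above, the helper names are hypothetical, and the 500,000-iteration budget is an arbitrary illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_iteration(total_days, total_iters):
    """Average wall-clock seconds spent on one training iteration."""
    return total_days * SECONDS_PER_DAY / total_iters


def projected_days(iters, sec_per_iter):
    """Days needed for `iters` iterations at a constant per-iteration cost."""
    return iters * sec_per_iter / SECONDS_PER_DAY


# Approximate figures quoted in the thread: 20 days, 1M iterations.
sec_per_iter = seconds_per_iteration(total_days=20, total_iters=1_000_000)
print(f"{sec_per_iter:.2f} s per iteration")

# Hypothetical shorter budget, assuming the same throughput.
print(f"{projected_days(500_000, sec_per_iter):.1f} days for 500k iterations")
```

The projection assumes constant throughput on the same hardware, so it is only a first-order estimate.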

Sunshine-Ye · 2020-01-11T15:39:23Z

Thanks for your quick reply!
1. Does SDP mean the Scheduled Drop Path method? Does MS mean using multi-scale in training? Was the 79.8 mIoU evaluated with multi-scale or not?
2. Which training command do you use: python train.py, or CUDA_VISIBLE_DEVICES=0,1,2,···,n python -m torch.distributed.launch --nproc_per_node=n train_distributed.py?
3. 20 days for the retrain is a bit too long. Is it caused by the training code or by the model itself? Can you suggest a direction for optimization?
4. I have tried to retrain, but the default retrain args and train_distributed.py are not set properly. Although I have tuned the code, I don't know whether the parameters are set correctly. Can you give me some advice?
I sincerely look forward to your reply.

zhizhangxian · 2020-01-11T15:48:22Z

Yes, not with MS as far as I remember. I adapted the retrain code from chenxi's DeepLab v3 reproduction, and he did not use MS.
For the retrain, you should use CUDA_VISIBLE_DEVICES=0,1,2,···,n python -m torch.distributed.launch --nproc_per_node=n train_distributed.py to run distributed training.
It takes that long simply because there are so many iterations (more than 1M in the paper, versus only 30K in DeepLab v3+). You cannot get a good result by training for only several thousand iterations without ImageNet pretraining. The retrain configurations are the same as DeepLab v3/v3+.
I retrained it with other code (adapted from chenxi's code). I should have fixed the bugs in our project, but I have been busy with other things these days. Maybe I can pay some attention to it after March...
Thanks, good luck!
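The distributed launch command quoted above can be assembled for a given GPU count with a small helper like the one below. This is a sketch only: `build_launch_command` is a hypothetical name, and numbering the devices 0 through n-1 for n processes is an assumption about how the `0,1,2,···,n` placeholder is meant to be filled in. The script name and the `torch.distributed.launch` module come from the command in the thread.

```python
import shlex


def build_launch_command(num_gpus, script="train_distributed.py"):
    """Return (env, argv) for one process per GPU on a single node."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # Assumption: use device ids 0 .. num_gpus-1.
    device_ids = ",".join(str(i) for i in range(num_gpus))
    env = {"CUDA_VISIBLE_DEVICES": device_ids}
    argv = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        script,
    ]
    return env, argv


# Example: the 8-GPU retrain setup mentioned earlier in the thread.
env, argv = build_launch_command(8)
command = " ".join(shlex.quote(a) for a in argv)
print(f"CUDA_VISIBLE_DEVICES={env['CUDA_VISIBLE_DEVICES']} {command}")
```

The helper only builds the command. To run it, one could pass `argv` to `subprocess.run` with `env` merged into a copy of `os.environ`.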

Sunshine-Ye · 2020-01-13T01:45:47Z

Thank you very much!


HankKung assigned zhizhangxian Oct 18, 2019

zhizhangxian closed this as completed Nov 8, 2019
