
Discussion: How to save PyTorch model weights in each sub-experiment #13

@kdg1993 (Collaborator) commented Dec 8, 2022

What

  • Discuss how to save PyTorch model weights for each sub-experiment in the Hydra multirun + no Ray Tune setting
  • We probably also need to discuss what to log, the log file names, and the logging structure

Why

  • There are 4 different cases ([Hydra multirun on/off] × [Ray Tune on/off]) when we run experiments with our custom code set
  • I found that under the Hydra multirun + no Ray Tune setting, only one best model is saved (not sure about other people's environments, but it happens for me), as shown below

./logs
└── 2022-12-08_02-12-25
    ├── best_saved.pth
    ├── epochs=2,loss=BCE,mode=default,model=mobilenetv3_small_050,num_samples=2,optimizer=adam
    │   └── main_Tuner.log
    ├── epochs=2,loss=BCE,mode=default,model=tinynet_e,num_samples=2,optimizer=adam
    │   └── main_Tuner.log
    ├── epochs=2,loss=Multi_Soft_Margin,mode=default,model=mobilenetv3_small_050,num_samples=2,optimizer=adam
    │   └── main_Tuner.log
    ├── epochs=2,loss=Multi_Soft_Margin,mode=default,model=tinynet_e,num_samples=2,optimizer=adam
    │   └── main_Tuner.log
    └── multirun.yaml
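For context, here is a minimal sketch of the suspected failure mode (illustrative only, not the repo's main_Tuner.py; the model is a stand-in): if the checkpoint path is derived from hydra.run.dir, which evaluates to the same path for every job in a multirun sweep, all four jobs write to the same file and the last one wins.

```python
# Illustrative sketch only -- not the repo's actual training code.
import os

import hydra
import torch
from hydra.core.hydra_config import HydraConfig
from torch import nn


@hydra.main(config_path=None, config_name=None, version_base=None)
def main(cfg):
    model = nn.Linear(4, 2)  # stand-in for the tuned model

    # hydra.run.dir evaluates to the same path (e.g. logs/2022-12-08_02-12-25)
    # for every job in the sweep, so each job overwrites the previous
    # best_saved.pth instead of writing into its own subdirectory.
    run_dir = HydraConfig.get().run.dir
    torch.save(model.state_dict(), os.path.join(run_dir, "best_saved.pth"))


if __name__ == "__main__":
    main()
```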

How

@kdg1993 added the "Discussion" label Dec 8, 2022
@chrstnkgn (Collaborator)

I thought I had fixed this in the recent commit, but it seems I haven't.
Can you show me the source code that was used to save the weights?

@kdg1993 (Collaborator, Author) commented Dec 8, 2022

Thanks for your kind and fast response @chrstnkgn 👍

I got that result on the most recent commit so far (bd7043a), committed 2022-12-08 10:15, without any changes to the code set,

by running this command in the terminal:

python main_Tuner.py --multirun loss=BCE,Multi_Soft_Margin model=tinynet_e,mobilenetv3_small_050 epochs=2 num_samples=2 optimizer=adam mode=default

I'll run it again to double-check the result. Thanks again

@chrstnkgn (Collaborator)

I've quickly checked the code, and yes, the path for saving the weights is currently set under the Hydra run_dir. The problem is that the saving path for a multirun is one level deeper than that of a single run.

For now, I can think of two options to solve this problem:

  1. Make the run_dir for a single run one level deeper so that it matches the run_dir for a multirun, or
  2. Set the path for best_saved.pth to the directory under each multirun run_dir (in this case, the single-run directory does not change; see the sketch below)
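Option (1) would mainly be a config change (nesting hydra.run.dir one level deeper so single runs and multirun jobs share the same layout). For option (2), here is a minimal sketch, assuming Hydra >= 1.2, where hydra.runtime.output_dir resolves to the run dir for a single run and to the per-job subdirectory for a multirun; save_best is a hypothetical helper, not the repo's API:

```python
# Sketch of option (2): save into the per-job output directory so each
# multirun job keeps its own checkpoint. Assumes Hydra >= 1.2; save_best is
# a hypothetical helper, not the repo's API.
import os

import torch
from hydra.core.hydra_config import HydraConfig


def save_best(model):
    # Resolves to hydra.run.dir for a single run, and to
    # <sweep.dir>/<override_dirname> for each job of a multirun.
    out_dir = HydraConfig.get().runtime.output_dir
    torch.save(model.state_dict(), os.path.join(out_dir, "best_saved.pth"))
```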

@kdg1993 (Collaborator, Author) commented Dec 8, 2022

Both options look like good solutions!
In my opinion, although keeping the single-run directory unchanged is very attractive, solution (1) might be slightly better than (2).

As far as I know, if I turn Ray Tune mode on, Ray automatically makes a directory for each trial and also moves the working directory into that trial's directory. That matches nicely with torch.save writing to the current working directory.

However, if we choose solution (2), I'm afraid it might be more complicated to make it match the Ray logging system.

I'm still not sure I fully understand your suggestion, so please let me know if I've misunderstood 😃
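For reference, a minimal sketch of why option (1) composes well with Ray Tune (hypothetical trainable; this assumes Ray Tune's legacy tune.run function API and its historical default of running each trial inside its own trial directory):

```python
# Sketch only: each Ray Tune trial runs with its working directory set to
# its own trial dir (Ray Tune's historical default), so a relative
# torch.save lands in a per-trial location without extra path handling.
import torch
from ray import tune
from torch import nn


def trainable(config):
    model = nn.Linear(4, 2)  # stand-in for the real model
    # cwd here is the trial's own directory, so every trial gets its own file
    torch.save(model.state_dict(), "best_saved.pth")
    tune.report(loss=0.0)  # placeholder metric


tune.run(trainable, num_samples=2)
```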

@chrstnkgn (Collaborator)

I think you fully understood what I meant! I also agree with your opinion that this makes the working directory look more straightforward.

@kdg1993 (Collaborator, Author) commented Dec 8, 2022

Any other opinions on this issue and solutions? @juppak @jieonh @seoulsky-field 🙏
