
How to get same result on multiple runs ? #21 (Closed)

Dongwoo-Im opened this issue Sep 25, 2022 · 5 comments

Comments


Dongwoo-Im commented Sep 25, 2022

Hi, thanks for sharing your work and contribution!

I tried to reproduce the same training loss on my custom dataset across multiple runs, but it didn't work.

So I wonder whether HAT can return exactly the same training loss on repeated runs.

Any help would be much appreciated, thanks.

My environment

  • windows 10
  • python : 3.7.13
  • pytorch : 1.12.1+cu113
  • torchvision : 0.13.1+cu113
  • cuda : 11.3
  • cudnn : 8.4.1
  • basicsr : both 1.3.4.9 and 1.4.2 (latest version)

Methods I tried

  • use_hflip = False
  • use_rot = False
  • use_shuffle = False
  • num_worker_per_gpu = 0
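
For reference, the PyTorch-level seeding and cuDNN settings that are usually needed for run-to-run reproducibility look roughly like this. This is just a sketch, not HAT's own code; the seed value is arbitrary, and I believe BasicSR also exposes a manual_seed option in the yaml:

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 0) -> None:
    """Seed the RNGs that a typical PyTorch training run touches."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Prefer deterministic cuDNN kernels (can slow training down).
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


seed_everything(0)
```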
Dongwoo-Im (Author) commented Sep 25, 2022

When I set print_freq to 1, you can see that there is some negligible difference in the loss between runs:

[screenshot: training loss values]

Dongwoo-Im changed the title from "HAT can return the exact same result about training loss ?" to "HAT can return the exactly same result about training loss ?" on Sep 25, 2022
chxy95 (Member) commented Sep 26, 2022

I do not understand your question, but you can refer to the original training log of HAT for SRx4 on DF2K:
train_416_train_Final_Model_SRx4_scratch_DF2K_CR3W16P64_500k_B4G8_20220301_002049.log

Dongwoo-Im (Author) commented Sep 26, 2022

@chxy95 Thanks for the quick response.
When I run the same yaml file twice on my custom dataset, the loss is not the same. Maybe this is caused by some non-deterministic algorithm in HAT. Anyway, I want to reproduce the same result from the same yaml file, but I do not know how to do it.

Dongwoo-Im changed the title from "HAT can return the exactly same result about training loss ?" to "How to get same result on multiple runs ?" on Sep 26, 2022
chxy95 (Member) commented Sep 27, 2022

@Dongwoo-Im I'm not sure of the reason for this either. The only random factor in HAT appears to be drop_path_rate, and it should have no effect on performance. You can set it to 0 and observe the training process. If that does not work, there may be some random factors that the current version of BasicSR does not take into account.
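
For context, here is a minimal sketch of how stochastic depth (drop path) typically works in Swin-style blocks; it is only active during training and becomes the identity when the rate is 0, which is why setting drop_path_rate to 0 removes this source of randomness. This is the common formulation, not necessarily HAT's exact code:

```python
import torch


def drop_path(x: torch.Tensor, drop_prob: float, training: bool) -> torch.Tensor:
    """Stochastic depth: randomly skip the residual branch per sample during
    training; acts as the identity when drop_prob is 0 or in eval mode."""
    if drop_prob == 0.0 or not training:
        return x
    keep_prob = 1.0 - drop_prob
    # One random keep/drop decision per sample, broadcast over remaining dims.
    mask = x.new_empty((x.shape[0],) + (1,) * (x.ndim - 1)).bernoulli_(keep_prob)
    return x * mask / keep_prob
```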

Dongwoo-Im (Author) commented

@chxy95 Thanks for your advice. I haven't solved it yet, but I'll re-open this issue when I get some insights or find a solution.
