uw initialization #18

Closed
antct opened this issue Oct 22, 2022 · 2 comments

antct commented Oct 22, 2022

Hi, I noticed that the value -0.5 is used to initialize the parameter on line 19 of uw.py.
My question is why this value is not 0, given that the variable loss_scale is equivalent to log σ in the original paper.

```python
self.loss_scale = nn.Parameter(torch.tensor([-0.5]*self.task_num, device=self.device))
```
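
For context, here is a minimal sketch of how such a parameter typically enters the uncertainty-weighting objective of Kendall et al. The class name is hypothetical and the formula assumes loss_scale plays the role of s_i = log(σ_i²), as in mtan-style implementations; the exact code in uw.py may differ:

```python
import torch
import torch.nn as nn

# Illustrative module (hypothetical name), assuming loss_scale parameterizes
# s_i = log(sigma_i^2), so the combined objective is
#   L_total = sum_i exp(-s_i)/2 * L_i + s_i/2.
class UncertaintyWeighting(nn.Module):
    def __init__(self, task_num, init_value=-0.5):
        super().__init__()
        # init_value=-0.5 gives sigma_i^2 = exp(-0.5) < 1, so each task
        # starts with weight exp(0.5)/2 ≈ 0.82 rather than the 0.5 that
        # zero initialization would give.
        self.loss_scale = nn.Parameter(torch.full((task_num,), init_value))

    def forward(self, losses):
        # losses: 1-D tensor of per-task losses, shape (task_num,)
        return (losses * torch.exp(-self.loss_scale) / 2 + self.loss_scale / 2).sum()
```
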
Baijiong-Lin (Collaborator) commented

The initialization here follows mtan. I think you can also use zero initialization.
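
Under the same sketch above, zero initialization corresponds to σ_i² = 1, so every task starts with weight 1/2 and a zero regularization term; a hypothetical usage example:

```python
# Zero initialization of the illustrative module above: exp(0)/2 = 0.5
# per-task weight and a zero log-variance penalty at the start.
weighting = UncertaintyWeighting(task_num=3, init_value=0.0)
losses = torch.tensor([0.7, 1.2, 0.3])  # dummy per-task losses
total = weighting(losses)
print(total.item())  # ≈ 1.1, i.e. losses.sum() / 2 at initialization
```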

antct (Author) commented Oct 22, 2022

Thanks for your reply~

antct closed this as completed Oct 22, 2022