Gradient update in GradVac #36

Closed
MartinPR307 opened this issue Apr 17, 2023 · 4 comments

Comments

@MartinPR307

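Hello! The GradVac paper notes that gradient similarities in different layers of the network eventually converge to different values, so it sets different target values for different tasks and for different layers.
The paper describes it as follows:
"To incorporate these three factors, we exploit an exponential moving average (EMA) variable for tasks i, j and parameter group k (e.g. the k-th layer) as:"
[image: the EMA update equation from the paper]
However, your GradVac implementation still sets different target values only across tasks, not across layers. Is this reasonable?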
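
To make the question concrete, here is a minimal sketch of a per-group EMA target for one task pair. It assumes the usual EMA form, new = (1 - beta) * old + beta * current; the exact equation was in the image above and is not reproduced here. The names `cosine`, `update_ema_targets`, and `beta` are hypothetical and are not taken from the repository.

```python
import numpy as np

def cosine(a, b, eps=1e-12):
    """Cosine similarity between two flat gradient vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def update_ema_targets(ema, grads_i, grads_j, beta=0.01):
    """Update one EMA similarity target per parameter group k.

    ema:      array of shape (K,), current targets for the task pair (i, j)
    grads_i:  list of K flat gradient arrays for task i, one per group
    grads_j:  list of K flat gradient arrays for task j, one per group
    beta:     EMA rate (assumed hyperparameter name)
    """
    sims = np.array([cosine(gi, gj) for gi, gj in zip(grads_i, grads_j)])
    # Assumed standard EMA form: keep (1 - beta) of the old value.
    new_ema = (1.0 - beta) * ema + beta * sims
    return new_ema, sims

# Two parameter groups: aligned gradients in group 0, opposed in group 1.
grads_i = [np.array([1.0, 0.0]), np.array([1.0, 1.0])]
grads_j = [np.array([1.0, 0.0]), np.array([-1.0, -1.0])]
ema, sims = update_ema_targets(np.zeros(2), grads_i, grads_j, beta=0.5)
```

With a single group (K = 1) this reduces to one target per task pair, which is the behaviour the question describes.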

@Baijiong-Lin
Collaborator

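The parameter group k here is a tunable hyperparameter. As in the paper's experiments, k can be whole_model or all_layer. We implemented the whole_model version; we will later revise our implementation to expose k as a hyperparameter.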

@MartinPR307
Author

Thanks for the reply! Looking forward to the fine-grained version!

Baijiong-Lin added a commit that referenced this issue Jun 19, 2023
@Baijiong-Lin
Collaborator

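@MartinPR307 The GradVac implementation has been updated. With --GradVac_group_type you can choose whole_model, all_layer, or all_matrix, which correspond to the following descriptions in the paper:
[image: corresponding excerpt from the paper]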
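
For readers without the image, a rough sketch of how the three group types could partition a model's parameters. This is an assumption about their meaning (one group for the whole model, one per layer, one per parameter tensor), not the repository's actual code; `group_sizes` and the parameter names are hypothetical.

```python
def group_sizes(named_sizes, group_type):
    """Return the number of gradient entries in each parameter group.

    named_sizes: list of (parameter_name, num_elements) in model order
    group_type:  "whole_model", "all_layer", or "all_matrix"
    """
    if group_type == "whole_model":
        # A single group covering every parameter.
        return [sum(size for _, size in named_sizes)]
    if group_type == "all_matrix":
        # One group per parameter tensor (weights and biases separately).
        return [size for _, size in named_sizes]
    if group_type == "all_layer":
        # Assumption: parameters sharing a name prefix belong to one layer.
        per_layer = {}
        for name, size in named_sizes:
            layer = name.rsplit(".", 1)[0]
            per_layer[layer] = per_layer.get(layer, 0) + size
        return list(per_layer.values())
    raise ValueError(f"unknown group_type: {group_type}")

params = [("fc1.weight", 6), ("fc1.bias", 2), ("fc2.weight", 4)]
whole = group_sizes(params, "whole_model")   # [12]
layers = group_sizes(params, "all_layer")    # [8, 4]
matrices = group_sizes(params, "all_matrix") # [6, 2, 4]
```

Each returned size marks one slice of the flattened gradient; a separate similarity target would then be tracked per slice and per task pair.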

@Baijiong-Lin
Collaborator

Closed as no further updates.
