Log the gradient norm of loss component and model parameters#289

Merged
wiederm merged 7 commits into main from dev-log-gradient-norm
Oct 19, 2024
Conversation

@wiederm
Member

@wiederm wiederm commented Oct 17, 2024

Pull Request Summary

To better understand multitask training behavior, this PR adds the ability to log gradient norms for the model parameters and for the individual loss components. As @andrrizzi correctly speculated, computing the per-loss-component gradient norms requires two additional backward passes and therefore adds computational cost; for that reason, this feature is not enabled by default.

This is added to the WandB report:
[Screenshot: WandB report panels showing the logged gradient norms]
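The mechanism described above can be sketched as follows. This is a minimal illustration, not the PR's actual implementation: it assumes a model trained on a sum of loss components and uses `torch.autograd.grad` with `retain_graph=True` to take one extra backward pass per component, which is where the added cost comes from. The component names (`mse`, `l1`) and the `grad_norm/` metric prefix are hypothetical.

```python
import torch

# Toy model and batch standing in for the real multitask setup.
model = torch.nn.Linear(4, 2)
x = torch.randn(8, 4)
target = torch.randn(8, 2)

pred = model(x)
loss_components = {
    "mse": torch.nn.functional.mse_loss(pred, target),
    "l1": torch.nn.functional.l1_loss(pred, target),
}

# One extra backward pass per loss component to get its gradient norm.
grad_norms = {}
for name, component in loss_components.items():
    # retain_graph=True keeps the graph alive for the next backward pass.
    grads = torch.autograd.grad(
        component, model.parameters(), retain_graph=True, allow_unused=True
    )
    grad_norms[f"grad_norm/{name}"] = torch.norm(
        torch.stack([g.norm() for g in grads if g is not None])
    ).item()

# The usual backward pass on the total loss; the overall parameter
# gradient norm can then be read from p.grad.
total_loss = sum(loss_components.values())
total_loss.backward()
param_grad_norm = torch.norm(
    torch.stack([p.grad.norm() for p in model.parameters() if p.grad is not None])
).item()
```

In a Lightning/WandB setup, `grad_norms` and `param_grad_norm` would then be passed to the logger (e.g. via `self.log_dict`) so they appear as panels in the WandB report.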

Pull Request Checklist

  • Issue(s) raised/addressed and linked
  • Includes appropriate unit test(s)
  • Appropriate docstring(s) added/updated
  • Appropriate .rst doc file(s) added/updated
  • PR is ready for review

@wiederm wiederm added the enhancement New feature or request label Oct 17, 2024
@wiederm wiederm self-assigned this Oct 17, 2024
@codecov-commenter

codecov-commenter commented Oct 17, 2024

Codecov Report

Attention: Patch coverage is 31.25000% with 22 lines in your changes missing coverage. Please review.

Project coverage is 84.65%. Comparing base (f56dc12) to head (501b4c3).
Report is 8 commits behind head on main.

Additional details and impacted files

@wiederm wiederm requested a review from MarshallYan October 17, 2024 20:02
@wiederm wiederm merged commit 59edb03 into main Oct 19, 2024
@wiederm wiederm deleted the dev-log-gradient-norm branch October 19, 2024 21:37

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants