Fix FSDP ci test issue with "TypeError: Object of type Tensor is not … #137

Open
wants to merge 1 commit into
base: habana-main

Conversation

hsubramony

…JSON serializable"

Transformers 4.38 logs grad_norm in log_history, but FSDP has no global grad-norm function. When the logged value is a non-scalar tensor, the .item() call fails. The current solution is to skip logging grad_norm in log_history when FSDP is used.
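
The failure and the workaround described above can be illustrated with a minimal sketch. This is not the patch in this PR: `Tensor` is a hypothetical stand-in for `torch.Tensor` so the snippet runs without PyTorch, and `build_log_entry`, `try_dump`, and the `is_fsdp` flag are illustrative names, not Trainer APIs.

```python
import json


class Tensor:
    """Hypothetical stand-in for torch.Tensor so the sketch runs without PyTorch."""

    def __init__(self, values):
        self.values = list(values)

    def item(self):
        # Only a single-element tensor can be converted to a Python number.
        if len(self.values) != 1:
            raise RuntimeError("stand-in: tensor has more than one element")
        return self.values[0]


def build_log_entry(loss, grad_norm=None, is_fsdp=False):
    """Build one log_history-style dict; leave out grad_norm when FSDP is on."""
    entry = {"loss": loss}
    if grad_norm is not None and not is_fsdp:
        # Convert a scalar tensor to a plain float before it reaches json.dumps.
        if isinstance(grad_norm, Tensor):
            grad_norm = grad_norm.item()
        entry["grad_norm"] = grad_norm
    return entry


def try_dump(entry):
    """Return the JSON string, or the TypeError message if serialization fails."""
    try:
        return json.dumps(entry)
    except TypeError as err:
        return str(err)


# Failure mode: a raw tensor stored in the log entry cannot be serialized.
broken = try_dump({"loss": 0.5, "grad_norm": Tensor([1.0, 2.0])})

# Workaround: with FSDP enabled, grad_norm is omitted from the entry.
fixed = try_dump(build_log_entry(0.5, Tensor([1.0, 2.0]), is_fsdp=True))

print(broken)
print(fixed)
```

The first call reproduces the TypeError from the PR title because `json.dumps` has no encoder for the tensor object; the second succeeds because the entry holds only plain Python values.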

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@libinta libinta requested a review from vivekgoe March 29, 2024 00:35
@vivekgoe

@libinta why do we need this on OH-fork? These changes should come to OH-fork when we rebase OH-fork onto the next OH release, so we do not need to cherry-pick them.

@libinta
Collaborator

libinta commented Mar 30, 2024 via email
