FSDP with named_parameters for disabling weight_decay on some parameters. #17142
Unanswered
fmocking asked this question in DDP / multi-GPU / multi-node
Replies: 0 comments
Hi, FSDP requires using self.trainer.model.parameters(); however, disabling weight decay on some parameters (norms and biases) via self.trainer.model.named_parameters() still triggers the error. Here is the code that I'm trying. Unfortunately, no luck so far. Any ideas?
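The original code block does not appear in the capture, so as a point of comparison, here is a minimal sketch of the usual pattern for disabling weight decay on norms and biases: split named_parameters() into decay and no-decay groups and pass the groups to the optimizer. This uses plain PyTorch with a hypothetical param_groups helper (not Lightning's or FSDP's API); the 1-D tensor heuristic (biases and norm scales are 1-D, weight matrices are 2-D) is an assumption, and in a Lightning configure_optimizers the model would be self.trainer.model.

```python
import torch
from torch import nn


def param_groups(model: nn.Module, weight_decay: float = 0.01):
    """Hypothetical helper: split parameters into decay / no-decay groups.

    Heuristic (assumption): tensors with fewer than 2 dimensions are
    biases or norm scales, so they get weight_decay=0.0.
    """
    decay, no_decay = [], []
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        (decay if p.ndim >= 2 else no_decay).append(p)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]


# Toy model: two Linear layers (2-D weights) plus a LayerNorm (1-D params).
model = nn.Sequential(nn.Linear(8, 8), nn.LayerNorm(8), nn.Linear(8, 2))
groups = param_groups(model)
optimizer = torch.optim.AdamW(groups)
```

With this toy model, the two Linear weight matrices land in the decay group, while the two Linear biases and the LayerNorm weight and bias land in the no-decay group. Whether FSDP accepts such per-name grouping through self.trainer.model.named_parameters() is exactly the open question here.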