PyTorch's embedding module has a `scale_grad_by_freq` option that scales gradients by the inverse of the frequency of the words in the mini-batch. I am wondering whether this is also doable in torchrec. The motivation is that in RecSys or CTR prediction, the embedding table can be very sparse, so the effective learning rate for those rarely seen entries should be larger.
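For reference, here is a small sketch of the behavior in plain PyTorch: with `scale_grad_by_freq=True`, each row's accumulated gradient is divided by the number of times that index appeared in the mini-batch, so frequent and rare indices receive gradients of the same magnitude.

```python
import torch
import torch.nn as nn

# With scale_grad_by_freq=True, the gradient for each embedding row is
# divided by how many times that index occurred in the mini-batch.
emb = nn.Embedding(10, 4, scale_grad_by_freq=True)

# Index 0 appears three times, index 1 once.
idx = torch.tensor([0, 0, 0, 1])
emb(idx).sum().backward()

# Without scaling, row 0's gradient would be three times row 1's
# ([3,3,3,3] vs [1,1,1,1]); with scaling, both rows get ones.
print(emb.weight.grad[0])
print(emb.weight.grad[1])
```

The same normalization is what I'd like to apply to torchrec's sharded embedding tables, where updates are driven by sparse lookups.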