Removing the modification of loss value due to rounding off to 4 digits #38032
Comments
I think we could accept a PR for that! Ideally we'd keep the old format for losses > 1e-4 and then switch to scientific notation with 4 (decimal) digits of the significand below that? Please wait for @SunMarc to confirm, though!
Thanks for the quick reply, @Rocketknight1. Yes, exactly: as soon as the loss drops below 1e-4, we can switch to scientific notation with 4 significant digits in the mantissa and introduce an exponent. Looking forward to hearing the thoughts from @SunMarc as well.
Happy to have this feature! Would you like to give it a try, @harish6696? This rounding was mainly introduced to have cleaner logs (#9491). Let's also check that these can be correctly interpreted by third-party libraries like wandb.
@Rocketknight1 and @SunMarc: yes, I would definitely like to try it out. To be honest, I have never made a pull request to such a big repository, but I am willing to try. Let me know if there is anything special that I should be careful about; any hints/suggestions are also welcome.
Since this is a small change, just check that it logs the loss as you expect on wandb, for example!
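One quick way to confirm that very small, unrounded loss values survive the trip through wandb is an offline sanity run. This is only a sketch; the project name and values are made up, and wandb simply stores whatever float it is given:

```python
# Sketch of an offline wandb sanity check (project name is hypothetical).
# wandb stores the raw float passed to log(), so tiny losses keep their
# precision regardless of how the console output is formatted.
import wandb

run = wandb.init(project="loss-precision-check", mode="offline")
for step, loss in enumerate([1e-3, 5e-5, 3.2e-7]):
    run.log({"train/loss": loss}, step=step)
run.finish()
```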
System Info
transformers version: 4.50.2
python version: 3.13.1
Who can help?
@zach-huggingface @SunMarc
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
Inside the Trainer class, why is the loss rounded to 4 decimal places? I have applications where I want to watch the loss drop below 1e-4, but such values all get rounded to 0. Please let the user configure this rounding, or display the loss in scientific notation like 1.xxxe-y.
This rounding is applied inside _maybe_log_save_evaluate():
logs["loss"] = round(tr_loss_scalar / (self.state.global_step - self._globalstep_last_logged), 4)
Expected behavior
It would be great if this hardcoded rounding of the "loss" were removed. The best outcome would be to drop the round call and log the loss value in scientific notation, which is much clearer.
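A minimal sketch of the behaviour proposed in the comments above: keep the current 4-decimal rounding for losses at or above 1e-4 and switch to scientific notation with 4 significant digits below that. The helper name and threshold handling are illustrative, not the actual Trainer change:

```python
def format_loss(value: float, threshold: float = 1e-4):
    """Round normally above the threshold, fall back to scientific notation below it."""
    if abs(value) >= threshold:
        return round(value, 4)
    # 4 significant digits in the mantissa, e.g. 4.700e-05
    return f"{value:.3e}"

print(format_loss(0.123456))  # 0.1235
print(format_loss(4.7e-05))   # 4.700e-05
```

Note that if the formatted string were written straight into logs["loss"], integrations such as wandb would receive a string instead of a float, so in practice the raw float would likely still be logged and only the console display formatted this way.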