Add multi-process logger utility for status monitoring #254
Thanks Jon this will help with debugging timeouts massively! Left a few tiny nitpicks 🙏
README.md
Outdated
```
export TRLX_LOG_LEVEL=WARNING
```

> 💡 Tip: To reduce the amount of logging output, you might find it helpful to change log levels of third-party libraries. For the `transformers` library, try setting `transformers.logging.set_verbosity_error()`.
That's really handy, they've added some annoying logs in 4.25.1 (especially if you set prompts to `[<eos>]`). Can this method be used for trlx as well?
Yup! I should clarify this - I meant to suggest calling `transformers.logging.set_verbosity_error()` from a user's trlx script, e.g. at the top of `examples/ilql_sentiments.py`.
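For readers following along, here is a minimal sketch of the same idea using only the standard library. It assumes the library logs under a logger named after its package (`"transformers"` here); `quiet_library` is a hypothetical helper, not part of trlx or transformers. With transformers installed, the call named in this thread, `transformers.logging.set_verbosity_error()`, is the more direct route.

```python
import logging


def quiet_library(name: str, level: int = logging.ERROR) -> logging.Logger:
    """Raise the level of a third-party library's top-level logger.

    Hypothetical helper: assumes the library logs under a logger whose
    name matches its package name.
    """
    lib_logger = logging.getLogger(name)
    lib_logger.setLevel(level)
    return lib_logger


# Call this near the top of a training script, before any model code runs.
quiet_library("transformers")
```

Child loggers such as `transformers.modeling_utils` inherit the level from the package logger unless they set their own, so one call should cover the whole package.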
(Let me know if it's any clearer 😄 )
Yes, I understood. I will also clarify that I meant to ask whether you can currently do `trlx.logging.set_verbosity_error()`.
No, not currently. Maybe we should include logging methods in a similar way to transformers. What do you think?
I mean since we're already half-way there we might as well indulge, unless it's harder than it seems of course 🙂
😂 will do!
Update: I've added Hugging Face's logging API (used across their projects |
Nice, it looks really pretty, especially when training with larger models!
(Also, I've observed that deepspeed's logger does everything similarly and can be filtered with `deepspeed.utils.logger.setLevel(logging.ERROR)`.)
- I would not use all this code for logging.
- Move the rank configuration that is scattered around the code, and that accesses global variables, into the logging configuration. As a rule of thumb, env variables should be accessed only once, at the start of the process, at configuration time.
- Decide which env variable to use: `TRLX_LOG_LEVEL` or `TRLX_VERBOSITY`.
- Printing tables in the logs is a nightmare for devops because parsing the output programmatically becomes very hard.
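A rough sketch of the "read the env variable once, at configuration time" suggestion above. The names `resolve_log_level` and `configure_logging` are hypothetical, and the variable name `TRLX_LOG_LEVEL` is just one of the two candidates discussed in this thread; this is not the code in the PR.

```python
import logging
import os


def resolve_log_level(env=os.environ, var="TRLX_LOG_LEVEL", default=logging.INFO):
    """Map a level name from the environment to a numeric logging level."""
    name = env.get(var, "").strip().upper()
    if not name:
        return default
    # getLevelName returns an int for known names and a "Level X" string otherwise.
    level = logging.getLevelName(name)
    return level if isinstance(level, int) else default


def configure_logging(env=os.environ) -> logging.Logger:
    """Configure the package logger once, at process start."""
    logger = logging.getLogger("trlx")
    logger.setLevel(resolve_log_level(env))
    if not logger.handlers:
        handler = logging.StreamHandler()
        # One record per line keeps the output easy to parse programmatically.
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Passing the environment in as an argument keeps the lookup in one place and makes the level resolution easy to unit test without touching the real process environment.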

This will suppress `INFO` level messages, but still print `WARNING`, `ERROR`, and `CRITICAL` level messages.

You can also control logging verbosity by setting the `TRLX_VERBOSITY` environment variable to one of the standard logging [level names](https://docs.python.org/3/library/logging.html#logging-levels):
Is it `TRLX_VERBOSITY` or `TRLX_LOG_LEVEL`? I kind of prefer `TRLX_LOG_LEVEL`.
@Mistobaan Yes, I agree with your points but the idea was to keep the API consistent with Hugging Face's logging API which is familiar to many. See links from #254 (comment) If you have strong opinions about not doing this I can revert to the commit prior to introducing this (3034595). Re tables: They already exist in the current |
This PR introduces a basic `logging.Logger` utility for status monitoring from the console with multi-process support. By default, the logger is set to the `INFO` level, but verbosity can be controlled by setting the `TRLX_LOG_LEVEL` environment variable to one of the standard logging level names. Check the `README.md` update for more.

- Replaces prints with logger calls to allow for more control over when and how trlx messages are displayed to users. (`print_rank_0` is kept in utils as a debug tool.)
- Adds logger info to indicate evaluation stages such as reward and metric function processing during training (Verbosity during Rollout + Eval stages in trlx training #239).
- Re-introduces @PhungVanDuy's progress bar for evaluating batches of samples (Fix progress bar in `BaseRLTrainer.evaluate` for multi-processing #178).
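
To illustrate what "multi-process support" can mean for a logger, here is a minimal sketch that only lets selected ranks emit records. It is not the implementation in this PR: `RankFilter` and `get_rank_logger` are hypothetical names, and reading the process rank from a `RANK` environment variable is an assumption about the launcher.

```python
import logging
import os


class RankFilter(logging.Filter):
    """Let records through only on the allowed process ranks."""

    def __init__(self, rank: int, allowed=(0,)):
        super().__init__()
        self.rank = rank
        self.allowed = frozenset(allowed)

    def filter(self, record: logging.LogRecord) -> bool:
        return self.rank in self.allowed


def get_rank_logger(name: str, rank=None, allowed=(0,)) -> logging.Logger:
    """Return a logger that stays silent on ranks outside `allowed`."""
    if rank is None:
        # Assumption: the launcher exports the process rank as RANK.
        rank = int(os.environ.get("RANK", "0"))
    logger = logging.getLogger(name)
    # Avoid stacking duplicate filters when called more than once.
    if not any(isinstance(f, RankFilter) for f in logger.filters):
        logger.addFilter(RankFilter(rank, allowed))
    return logger
```

With this in place, a call like `get_rank_logger("trlx").info("Starting evaluation")` would print once from rank 0 instead of once per process.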