Closed
Labels
enhancement, oncall: distributed, triaged
Description
🐛 Describe the bug
Some components of PyTorch currently log to Python's root logger, e.g.:
$ ag -s 'logging.info'
distributed/optim/zero_redundancy_optimizer.py
1387: logging.info(
1392: logging.info(
1506: logging.info(
distributed/elastic/timer/local_timer.py
118: logging.info(f"Process with pid={worker_id} does not exist. Skipping")
distributed/elastic/timer/api.py
191: logging.info(
196: logging.info(f"Successfully reaped worker=[{worker_id}]")
208: logging.info(
216: logging.info("Starting watchdog thread...")
220: logging.info(f"Stopping {type(self).__name__}")
223: logging.info("Stopping watchdog thread...")
227: logging.info("No watchdog thread running, doing nothing")
239: logging.info(f"Timer client configured to: {type(_timer_client).__name__}")
distributed/distributed_c10d.py
3232: logging.info("Rank {} is assigned to subgroup {}".format(rank, ranks))
I believe this isn't good practice: because these messages go through the root logger, users cannot easily control PyTorch's log output or separate it from their own logs.
The issue is most frequent under torch.distributed.
This code should log to logging.getLogger(__name__) instead. The same applies to the warning and error calls.
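To illustrate the difference, here is a minimal sketch of what a named logger makes possible. It is not PyTorch code: the logger names "mylib.distributed.timer" and "user_app", the function start_watchdog, and the ListHandler class are all hypothetical stand-ins. It only assumes the standard library behavior that a named logger inherits its effective level from its ancestors and propagates records to the root handlers.

```python
import logging

# Hypothetical library module; inside a real module this would be
# logging.getLogger(__name__).
lib_logger = logging.getLogger("mylib.distributed.timer")


def start_watchdog():
    # Goes through the named logger instead of calling logging.info(...),
    # which would always use the root logger.
    lib_logger.info("Starting watchdog thread...")


class ListHandler(logging.Handler):
    """Collects records in memory so the effect is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def demo():
    root = logging.getLogger()
    lib_parent = logging.getLogger("mylib")  # parent of every "mylib.*" logger
    handler = ListHandler()
    old_root_level = root.level
    root.addHandler(handler)
    root.setLevel(logging.INFO)
    try:
        lib_parent.setLevel(logging.NOTSET)
        start_watchdog()  # reaches the root handler via propagation

        # The user silences only the library's INFO messages.
        lib_parent.setLevel(logging.WARNING)
        start_watchdog()  # now filtered out

        # The user's own INFO logging is unaffected.
        logging.getLogger("user_app").info("user message")
    finally:
        # Restore global logging state touched by this sketch.
        lib_parent.setLevel(logging.NOTSET)
        root.removeHandler(handler)
        root.setLevel(old_root_level)
    return [(r.name, r.getMessage()) for r in handler.records]


if __name__ == "__main__":
    for name, message in demo():
        print(name, message)
```

With a direct logging.info call, the second start_watchdog() message could only be suppressed by raising the root logger's level, which would also hide the user's own INFO messages.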
Versions
latest master
cc @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @SciPioneer @H-Huang