
[Bug] Duplicate logging #1012

Closed
Vozf opened this issue Sep 24, 2020 · 12 comments
Labels
bug Something isn't working help wanted Community help is wanted

Comments


Vozf commented Sep 24, 2020

🐛 Bug

Description

I am using PyTorch Lightning with Hydra, and the output from pytorch-lightning is duplicated by Hydra's logging.
Here is an example:

CometLogger will be initialized in online mode
[2020-09-24 16:04:43,389][lightning][INFO] - CometLogger will be initialized in online mode
GPU available: True, used: True
[2020-09-24 16:04:43,441][lightning][INFO] - GPU available: True, used: True
TPU available: False, using: 0 TPU cores
[2020-09-24 16:04:43,442][lightning][INFO] - TPU available: False, using: 0 TPU cores
CUDA_VISIBLE_DEVICES: [2]
[2020-09-24 16:04:43,442][lightning][INFO] - CUDA_VISIBLE_DEVICES: [2]
Vozf added the bug label Sep 24, 2020
Collaborator

omry commented Sep 24, 2020

Thanks for reporting.
You are missing a minimal repro.
Please reopen if you can provide it.

As a side tip:
You can disable Hydra's logging configuration by overriding the config group hydra/job_logging to disabled.

omry closed this as completed Sep 24, 2020

Huizerd commented Dec 24, 2020

Could this be reopened? I'm having the same issue, please see this MWE:

config.yaml:

defaults:
  - hydra/job_logging: disabled

hydra:
  output_subdir: Null
  run:
    dir: .

main.py:

import hydra
from omegaconf import OmegaConf

import pytorch_lightning as pl
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F


class Model(pl.LightningModule):
    def __init__(self, in_size, hid_size, out_size):
        super().__init__()

        self.fc1 = nn.Linear(in_size, hid_size)
        self.fc2 = nn.Linear(hid_size, out_size)

    def forward(self, x):
        return self.fc2(self.fc1(x))

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.cross_entropy(logits, y)
        return loss
    
    def configure_optimizers(self):
        return optim.Adam(self.parameters(), lr=1e-3)


@hydra.main(config_name="config")
def main(cfg):
    print(OmegaConf.to_yaml(cfg))

    model = Model(5, 10, 2)
    trainer = pl.Trainer()


if __name__ == "__main__":
    main()

Having hydra/job_logging set to disabled (which is desired in my case) leads to the following output (disregard the warning):

{}

/*/issues/venv/lib/python3.8/site-packages/pytorch_lightning/utilities/distributed.py:49: UserWarning: GPU available but not used. Set the --gpus flag when calling the script.
  warnings.warn(*args, **kwargs)

Commenting out the hydra/job_logging override leads to the following output:

{}

GPU available: True, used: False
[2020-12-24 10:30:34,753][lightning][INFO] - GPU available: True, used: False
TPU available: None, using: 0 TPU cores
[2020-12-24 10:30:34,753][lightning][INFO] - TPU available: None, using: 0 TPU cores
/home/huis/Documents/issues/venv/lib/python3.8/site-packages/pytorch_lightning/utilities/distributed.py:49: UserWarning: GPU available but not used. Set the --gpus flag when calling the script.
  warnings.warn(*args, **kwargs)

As you can see, with hydra/job_logging: disabled the output from PyTorch Lightning is also blocked in some way, whereas with Hydra logging enabled, it is printed in duplicate.

In my case, I would want Hydra logging to be disabled, while still being able to see the output from PyTorch Lightning.

pip freeze:

absl-py==0.11.0
antlr4-python3-runtime==4.8
cachetools==4.2.0
certifi==2020.12.5
chardet==4.0.0
fsspec==0.8.5
future==0.18.2
google-auth==1.24.0
google-auth-oauthlib==0.4.2
grpcio==1.34.0
hydra-core==1.0.4
idna==2.10
importlib-resources==4.0.0
Markdown==3.3.3
numpy==1.19.4
oauthlib==3.1.0
omegaconf==2.0.5
pkg-resources==0.0.0
protobuf==3.14.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pytorch-lightning==1.1.2
PyYAML==5.3.1
requests==2.25.1
requests-oauthlib==1.3.0
rsa==4.6
six==1.15.0
tensorboard==2.4.0
tensorboard-plugin-wit==1.7.0
torch==1.7.1
tqdm==4.54.1
typing-extensions==3.7.4.3
urllib3==1.26.2
Werkzeug==1.0.1

Collaborator

omry commented Dec 24, 2020

Thanks for the repro. We will take a look.

About disabled as a logging config:
The intention there is to disable logging output, not to disable the configuration of the logging.

There is a pull request that added support for not configuring the logging in Hydra 1.1 (which is not yet released).
Take a look at that PR; you should be able to apply the config there to your own application using Hydra 1.0.

Generally speaking, Hydra configures the logging by default.
You can use Hydra to configure all of the application logging. The problem is that many other libraries also configure the logging, and sometimes the two configurations clash.
My guess is that both Hydra and Lightning are adding a console appender, which is why you are seeing the log outputs twice.
I further guess that you can fix this by doing one of the following:

  1. Have Hydra not configure the logging at all (see PR above).
  2. Have Hydra configure file logging, but not add a console appender.
  3. Somehow tell Lightning to not configure the logging and trust Hydra to do it.
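
Option 2 could look something like the following sketch: a user-supplied hydra/job_logging config that keeps a file handler but omits the console one. The file name, formatter, and handler names are illustrative, following the standard logging.config.dictConfig schema that Hydra's logging configs use; treat it as a starting point, not Hydra's exact default config.

```yaml
# conf/hydra/job_logging/file_only.yaml (sketch, names illustrative)
version: 1
formatters:
  simple:
    format: '[%(asctime)s][%(name)s][%(levelname)s] - %(message)s'
handlers:
  file:
    class: logging.FileHandler
    formatter: simple
    filename: ${hydra.job.name}.log
root:
  level: INFO
  handlers: [file]  # no console handler, so only Lightning prints to the console
disable_existing_loggers: false
```

Selecting it would then be a matter of adding `- hydra/job_logging: file_only` to the defaults list.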

I am interested in community help to identify the root cause, if you want to do some digging it will help.

omry reopened this Dec 24, 2020
omry added the help wanted label Dec 24, 2020

Huizerd commented Dec 24, 2020

Thanks for the very quick response and the directions, I will try them out and get back to you with the root cause as soon as I have found it!


Huizerd commented Dec 28, 2020

Ok, multiple things to go on:

  • Configuring the configs as in "Implemented disabling logging configuration & added config file" #1130 (hydra/job_logging=none and hydra/hydra_logging=none) works perfectly; only the Lightning output is printed.
  • It seems to be related to disable_existing_loggers in hydra/job_logging. For the default.yaml config, this is set to false, which leads to duplicate output; however, setting it to true (as in disabled.yaml) leads to no output, seemingly because Hydra in that case doesn't receive anything from Lightning.

Collaborator

omry commented Dec 28, 2020

I don't want to disable existing loggers by default because there are many scenarios where you would want to preserve the existing loggers.

Did you find what in Lightning is configuring the logging?
Another idea is to explicitly configure the lightning loggers in your Hydra logging config.
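
That second idea could be sketched as an explicit loggers entry in a custom job_logging config. The schema is the standard logging.config.dictConfig format; the console handler name is an assumption about the rest of the config.

```yaml
# Sketch: route the lightning logger through a single handler explicitly.
loggers:
  lightning:
    level: INFO
    handlers: [console]  # 'console' must exist in the handlers section
    propagate: false     # don't also bubble records up to the root handlers
```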


JackUrb commented Dec 31, 2020

Running into a related issue with Mephisto as well. In my case, Hydra doesn't simply duplicate logs; it raises the root logger to the INFO level globally, enabling INFO output from every logger:

root.setLevel(logging.INFO)

This leads me to get all kinds of spam from other packages... Is this actually desired behavior? If so, I'll have to have Mephisto add:

 {"hydra/job_logging": "disabled"},
 {"hydra/hydra_logging": "disabled"},

to all defaults lists, but I'm not sure if I'm losing important logging info by doing this. Am I missing something?
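
The level-raising effect described here can be sketched with the standard logging module alone, independent of Hydra (the logger name is illustrative):

```python
import logging

# Sketch of the effect described above: a third-party logger that never
# sets its own level inherits the root logger's level. Raising root to
# INFO therefore lets INFO-level records from every such logger through.
noisy = logging.getLogger("some.chatty.dependency")  # illustrative name

# Before: root defaults to WARNING, so the dependency's INFO is filtered.
print(noisy.getEffectiveLevel() == logging.WARNING)  # True

# Stand-in for a framework calling root.setLevel(logging.INFO):
logging.getLogger().setLevel(logging.INFO)

# After: the dependency now inherits INFO and its messages start flowing.
print(noisy.getEffectiveLevel() == logging.INFO)  # True
```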

Collaborator

omry commented Dec 31, 2020

@JackUrb, since the problem you are describing is different, we should discuss it in its own issue.


Huizerd commented Jan 11, 2021

@omry it seems to be related to this issue with PyTorch Lightning, and not anything Hydra-related, so I will continue the search there.

Collaborator

omry commented Jan 11, 2021

@Huizerd, thanks for the update!
Closing this, feel free to comment here if you have any follow-up questions/comments.


hughplay commented Mar 25, 2021

I provide my observation here for anyone who encounters similar problems.

Short answer

Make sure only the root logger has a StreamHandler. For example, if you use pytorch-lightning 1.2.5, you can do:

  1. In your main function, run:
pl._logger.handlers = []
pl._logger.propagate = True
  2. Or simply move your import pytorch_lightning as pl into your main function.

Long answer and explanation

  1. Basically, this is because there are two loggers that both have a StreamHandler.
  2. In the newest version of PyTorch Lightning, they even check whether the root logger already has a handler:
# pytorch_lightning.__init__.py, version: 1.2.5
_root_logger = logging.getLogger()
_logger = logging.getLogger(__name__)
_logger.setLevel(logging.INFO)

# if root logger has handlers, propagate messages up and let root logger process them
if not _root_logger.hasHandlers():
    _logger.addHandler(logging.StreamHandler())
    _logger.propagate = False
  3. However, in most cases, we use PL and Hydra like this:
import hydra
import pytorch_lightning
from omegaconf import DictConfig

@hydra.main(config_path='conf', config_name='config')
def main(cfg: DictConfig) -> None:
    ...

if __name__ == '__main__':
    main()
  4. The problem is that the check in step 2 happens too early, before Hydra constructs its logging: _root_logger still has no handlers at import time, and Hydra only configures the root logger once hydra.main runs.
  5. Therefore, we can use either of the two ways introduced in the short answer to solve this.
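
To make the mechanics concrete, here is a minimal sketch using only the standard logging module (no PL or Hydra; the logger name and handler class are illustrative): a child logger with its own handler that still propagates to a root logger with a handler emits every record twice, and the fix from the short answer collapses that back to one.

```python
import logging

emitted = []  # messages seen by any handler, for counting duplicates

class CountingHandler(logging.Handler):
    """Stand-in for a StreamHandler that records what it would print."""
    def emit(self, record):
        emitted.append(record.getMessage())

# Stand-in for pytorch_lightning's module logger, which (in older
# versions) installed its own handler while keeping propagate=True.
child = logging.getLogger("pl_demo")
child.setLevel(logging.INFO)
child.addHandler(CountingHandler())

# Stand-in for Hydra configuring the root logger afterwards.
logging.getLogger().addHandler(CountingHandler())

child.info("GPU available: True")
print(len(emitted))  # 2 -- child's handler fires, then root's via propagation

# The fix from the short answer: drop the child's own handler and rely
# on propagation to the (root-configured) handler alone.
emitted.clear()
child.handlers = []
child.propagate = True

child.info("GPU available: True")
print(len(emitted))  # 1 -- only the root handler fires
```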

Collaborator

omry commented Mar 25, 2021

Thanks for the comment @hughplay.
I think it's not great that importing PL has a side effect of manipulating the logging subsystem.
Consider reaching out to the PL team to allow some way to opt out of this behavior (or for them to make it opt-in).

One thing you can try is to use hydra.hydra_logging.disable_existing_loggers=true.
I have yet to see a minimal repro using Hydra and PL, so if you can come up with one it will help (Please file it in a different issue to avoid overloading this one).
