Add support for WandbLogger #1176

Merged: 26 commits merged into catalyst-team:master on Apr 30, 2021

Conversation

@AyushExel (Contributor) commented Apr 14, 2021

Before submitting (checklist)

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contribution guide?
  • Did you check the code style? catalyst-make-codestyle && catalyst-check-codestyle (pip install -U catalyst-codestyle).
  • Did you make sure to update the docs? We use Google format for all the methods and classes.
  • Did you check the docs with make check-docs?
  • Did you write any new necessary tests?
  • Did you check that your code passes the unit tests pytest . ?
  • Did you add your new functionality to the docs?
  • Did you update the CHANGELOG?
  • Did you run colab minimal CI/CD with latest and minimal requirements?

Description

This PR adds support for WandbLogger, which enables logging metrics and media to the W&B dashboard.
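
A minimal sketch of how the new logger could be plugged in, assuming the constructor accepts wandb.init-style arguments such as project and name (these argument names are assumptions, not necessarily the merged API):

from catalyst import dl

# Hypothetical usage sketch; the WandbLogger arguments below are assumptions.
loggers = {
    "console": dl.ConsoleLogger(),
    "wandb": dl.WandbLogger(project="catalyst-demo", name="wandb-logger-pr"),
}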

Related Issue

Type of Change

  • Examples / docs / tutorials / contributors update
  • Bug fix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves an existing feature)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Additional Details:

  • The minimal tests Colab seemed to be stuck while running the tests, but the tests passed on my machine. I'll update this thread with the test results.
  • I've made this a draft PR because I still want to confirm the hyperparameter-logging behavior. When running the test_finetune2.py train method, no hyperparameters are being logged.

Test logs

Code style
--> catalyst-make-codestyle && catalyst-check-codestyle

python3.8/site-packages/isort/settings.py:619: UserWarning: Failed to pull configuration information from /home/saksham/Desktop/catalyst/setup.cfg
  warn(f"Failed to pull configuration information from {potential_config_file}")
Skipped 55 files
python3.8/site-packages/isort/main.py:1000: UserWarning: W0501: The following deprecated CLI flags were used and ignored: --apply!
  warn(
python3.8/site-packages/isort/main.py:1004: UserWarning: W0500: Please see the 5.0.0 Upgrade guide: https://pycqa.github.io/isort/docs/upgrade_guides/5.0.0/
  warn(
All done! ✨ 🍰 ✨
350 files left unchanged.
python3.8/site-packages/isort/settings.py:619: UserWarning: Failed to pull configuration information from /home/catalyst/setup.cfg
  warn(f"Failed to pull configuration information from {potential_config_file}")
Skipped 55 files
All done! ✨ 🍰 ✨
350 files would be left unchanged.
Failed to pull configuration information from home/catalyst/setup.cfg
0

Docs check
--> rm -rf ./builds; REMOVE_BUILDS=0 make check-docs

reading sources... [100%] tutorials/ddp                                                                                                                                                              
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] tutorials/ddp                                                                                                                                                               
generating indices... genindex py-modindex done
highlighting module code... [100%] torch.utils.data.sampler                                                                                                                                          
writing additional pages... search
/home/saksham/anaconda3/envs/catalyst_dev/lib/python3.8/site-packages/catalyst_sphinx_theme/search.html:21: RemovedInSphinx30Warning: To modify script_files in the theme is deprecated. Please insert a <script> tag directly in your theme instead.
  {% trans %}Please activate JavaScript to enable the search
done
copying static files... ... done
copying extra files... done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded.

The HTML pages are in builds.
#### CODE: 0 ####

Tests
--> pytest .

337 passed, 134 skipped, 2 xfailed, 93 warnings in 439.09s (0:07:19)

@Scitator Let me know if I missed any steps here

FAQ

Please review the FAQ before submitting an issue:

@Scitator (Member) left a comment

The logger looks great! After a few noted hotfixes, let's update the branch to the current master, run the CI tests, and merge it 👍

Review threads:
  • catalyst/loggers/wandb.py (resolved)
  • catalyst/loggers/wandb.py (outdated, resolved)
  • catalyst/loggers/wandb.py (outdated, resolved)
  • catalyst/tests/test_finetune2.py (outdated, resolved)
@pep8speaks commented Apr 22, 2021

Hello @AyushExel! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-04-30 14:36:16 UTC

@AyushExel (Contributor, Author)

@Scitator I'm not sure why tests are failing. I don't think the changes should cause it. Here's the log of pytest on my machine from this branch:

336 passed, 132 skipped, 2 xfailed, 92 warnings in 450.74s (0:07:30) 

@AyushExel AyushExel marked this pull request as ready for review April 22, 2021 13:12
ditwoo
ditwoo previously approved these changes Apr 22, 2021
Co-authored-by: Sergey Kolesnikov <scitator@gmail.com>
@mergify mergify bot dismissed ditwoo’s stale review April 22, 2021 18:27

Pull request has been modified.

@AyushExel (Contributor, Author) commented Apr 26, 2021

@Scitator something I just noticed. As discussed earlier, I'm sticking with global_epoch_step for every metric, since wandb supports only one global step for all metrics and that step has to be strictly increasing. But even with the current settings I'm getting these warnings (testing with test/test_finetune2.py's train_experiment("cuda:0")):
[Screenshot from 2021-04-26 13-10-57: W&B warnings about the logged step value]
Now, I remember this worked fine the last time I marked this PR for review (example output). It's strange, since only the global epoch step is being used and it should definitely be increasing unless there's a typo. Can you spot anything out of place? Or maybe a recent commit introduced this?
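
For reference, a minimal sketch of the constraint being discussed, using a hypothetical standalone W&B run (the project name and values are made up): wandb.log expects the step to never decrease, and passing a smaller step triggers a warning like the one in the screenshot.

import wandb

run = wandb.init(project="catalyst-demo")  # hypothetical project name

wandb.log({"loss": 0.7}, step=5)
wandb.log({"loss": 0.6}, step=6)  # fine: the step keeps increasing
wandb.log({"loss": 0.5}, step=2)  # step went backwards -> W&B warns (and may drop or re-step the data)

run.finish()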

@Scitator (Member)

@AyushExel could you please try using _suffix rather than /suffix for the logging? Maybe different names for the batch-based and epoch-based metrics would work for Wandb.

@AyushExel (Contributor, Author) commented Apr 27, 2021

@Scitator sorry, it was a false alarm. I didn't rebuild the package after changing sample_step to global_step. The logger works fine, and I've updated the test so that it doesn't log hyperparameters at initialization; it only logs them through log_hparams now. Should be good to go.
EDIT: Example run

if scope == "batch":
    metrics = {k: float(v) for k, v in metrics.items()}
    self._log_metrics(
        metrics=metrics, step=global_epoch_step, loader_key=loader_key, suffix="/batch"
    )
@Scitator (Member) commented Apr 27, 2021

Suggested change:
- metrics=metrics, step=global_epoch_step, loader_key=loader_key, suffix="/batch"
+ metrics=metrics, step=global_sample_step, loader_key=loader_key, suffix="/batch"

@Scitator (Member)

As far as these are per-batch/per-sample statistics, we have to use the per-sample counter.

@AyushExel (Contributor, Author)

Using global_sample_step will cause this problem -> #1176 (comment)

@Scitator (Member) commented Apr 27, 2021

@AyushExel could you please try using _suffix rather than /suffix for the logging? maybe different names for the batch-based and epoch-based metrics would work for Wandb

You cannot write per-batch metrics under the epoch counter - it's simply incorrect behaviour.
Could you please try using different names for per-batch and per-epoch metrics?

@AyushExel (Contributor, Author)

Yeah, the step counter behavior is pretty strict, and it comes up a lot. But I'm pretty sure that's how it works right now. I tried changing the names as you suggested, but that didn't work. See this explanation of the behaviour:

As long as you keep passing the same value for step, W&B will collect the keys and values from each call in one unified dictionary. As soon you call wandb.log() with a different value for step than the previous one, W&B will write all the collected keys and values to the history, and start collection over again. Note that this means you should only use this with consecutive values for step: 0, 1, 2, .... This feature doesn't let you write to absolutely any history step that you'd like, only the "current" one and the "next" one.
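
A small sketch of the collection behaviour quoted above, again with a hypothetical standalone run (the project name and values are made up):

import wandb

run = wandb.init(project="catalyst-demo")  # hypothetical project name

# Calls that share the same step are collected into a single history row.
wandb.log({"loss": 0.9}, step=0)
wandb.log({"accuracy": 0.1}, step=0)  # merged into the step=0 row together with "loss"

# As soon as the step changes, the collected row is written out and collection starts over.
wandb.log({"loss": 0.8}, step=1)

run.finish()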

@Scitator (Member)

Then I think you could use different names for batch-based and epoch-based metrics: just rename them in a way that Wandb would understand :)

@AyushExel (Contributor, Author)

Yes, this might make more sense. So instead of a suffix, I think we can use a prefix. If you log anything in wandb as {'prefix/key': value}, it creates its own section in the dashboard called prefix, and all metrics belonging to the same prefix share that section.
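
A sketch of the section behaviour described above, assuming a hypothetical run: keys that share the "train/" prefix end up grouped in one dashboard section named "train".

import wandb

run = wandb.init(project="catalyst-demo")  # hypothetical project name

# Both keys appear under a single "train" section in the W&B dashboard.
wandb.log({"train/accuracy": 0.91, "train/loss": 0.27})

run.finish()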

@Scitator (Member)

It's up to you; you could create {metric_name}_{prefix}/{something}, so it would be accuracy_epoch and accuracy_batch.

@AyushExel (Contributor, Author)

Okay, I updated the names of the metrics as you suggested. Now each metric has its own section like {metric_name}_{prefix}. I think it's all good now.
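
A hedged sketch of what the resulting keys could look like under that scheme (the exact key layout in the merged logger may differ): per-batch and per-epoch values get distinct names, so they no longer collide even though they share one global step counter.

import wandb

run = wandb.init(project="catalyst-demo")  # hypothetical project name

# Distinct names keep the batch-level and epoch-level curves separate in the dashboard.
wandb.log({"loss_batch/train": 0.42, "loss_epoch/train": 0.39}, step=3)

run.finish()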

@ditwoo (Contributor) commented Apr 27, 2021

Hi, @AyushExel

Could you please check that metrics, parameters, etc. will be logged as expected when the experiment has multiple stages?

@AyushExel (Contributor, Author)

Hi, @AyushExel

Could you please check that metrics, parameters, etc. will be logged as expected when the experiment has multiple stages?

@ditwoo any specific example in the tests that I should run the logger with?

@ditwoo (Contributor) commented Apr 27, 2021

Hi, @AyushExel
Could you please check that metrics, parameters, etc. will be logged as expected when the experiment has multiple stages?

@ditwoo any specific example in tests that I should run the logger with ?

I think you can adjust an example from the README:

import os
from torch import nn, optim
from torch.utils.data import DataLoader
from catalyst import dl, utils
from catalyst.contrib.datasets import MNIST
from catalyst.data.transforms import ToTensor


class CustomRunner(dl.IRunner):
    def __init__(self, logdir, device):
        super().__init__()
        self._logdir = logdir
        self._device = device

    def get_engine(self):
        return dl.DeviceEngine(self._device)

    def get_loggers(self):
        return {
            "console": dl.ConsoleLogger(),
            "csv": dl.CSVLogger(logdir=self._logdir),
            # TODO: finish this part
            "wandb": dl.WandbLogger(...),
        }

    @property
    def stages(self):
        return ["train_freezed", "train_unfreezed"]

    def get_stage_len(self, stage: str) -> int:
        return 3

    def get_loaders(self, stage: str):
        loaders = {
            "train": DataLoader(
                MNIST(os.getcwd(), train=True, download=True, transform=ToTensor()), batch_size=32
            ),
            "valid": DataLoader(
                MNIST(os.getcwd(), train=False, download=True, transform=ToTensor()), batch_size=32
            ),
        }
        return loaders

    def get_model(self, stage: str):
        model = (
            self.model
            if self.model is not None
            else nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
        )
        if stage == "train_freezed":
            # freeze layer
            utils.set_requires_grad(model[1], False)
        else:
            utils.set_requires_grad(model, True)
        return model

    def get_criterion(self, stage: str):
        return nn.CrossEntropyLoss()

    def get_optimizer(self, stage: str, model):
        if stage == "train_freezed":
            return optim.Adam(model.parameters(), lr=1e-3)
        else:
            return optim.SGD(model.parameters(), lr=1e-1)

    def get_scheduler(self, stage: str, optimizer):
        return None

    def get_callbacks(self, stage: str):
        return {
            "criterion": dl.CriterionCallback(
                metric_key="loss", input_key="logits", target_key="targets"
            ),
            "optimizer": dl.OptimizerCallback(metric_key="loss"),
            "checkpoint": dl.CheckpointCallback(
                self._logdir, loader_key="valid", metric_key="loss", minimize=True, save_n_best=3
            ),
        }

    def handle_batch(self, batch):
        x, y = batch
        logits = self.model(x)

        self.batch = {
            "features": x,
            "targets": y,
            "logits": logits,
        }

runner = CustomRunner("./logs", "cpu")
runner.run()

Or adjust this test file.
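
As a guess at the "TODO: finish this part" entry in the snippet above, the get_loggers method might be filled in roughly as follows; the WandbLogger constructor arguments (project, name) are assumptions about its API rather than the merged signature:

from catalyst import dl

class CustomRunnerWithWandb(dl.IRunner):  # hypothetical; mirrors CustomRunner above
    def get_loggers(self):
        return {
            "console": dl.ConsoleLogger(),
            "csv": dl.CSVLogger(logdir="./logs"),
            "wandb": dl.WandbLogger(project="catalyst-demo", name="multistage-mnist"),
        }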

@AyushExel (Contributor, Author)

@ditwoo Yes, the above-mentioned examples work. Here's a sample dashboard. Something I've been noticing in these examples is that they pass empty dicts in the log_hparams function call. Is that an intended use case, or something that's still in the works?

@ditwoo (Contributor) commented Apr 28, 2021

log_hparams

As far as I know, hparams are used when you are doing some architecture search (using optuna or similar).
@Scitator could you please comment if it's expected behavior for hparams?

@Scitator (Member)

@AyushExel (Contributor, Author)

Okay, I think it's all good then. Runner.hparams for these experiments is empty, so it makes sense that no values are logged.
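
A hedged sketch of how a non-empty set of hyperparameters could be exposed, assuming the hparams property mentioned above is what log_hparams reads from (the property name follows the Runner.hparams reference; the values are made up):

from catalyst import dl

class TunedRunner(dl.IRunner):  # hypothetical runner for illustration only
    @property
    def hparams(self):
        # A non-empty dict here is what log_hparams would forward to the W&B run.
        return {"lr": 1e-3, "batch_size": 32, "optimizer": "Adam"}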

@Scitator (Member)

@AyushExel could you please update the changelog (link in the PR description)?

@Scitator (Member)

btw, you also need to add the Logger to the docs

@mergify bot commented Apr 30, 2021

This pull request is now in conflicts. @AyushExel, could you fix it? 🙏

@Scitator Scitator merged commit 6cfa27f into catalyst-team:master Apr 30, 2021