Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ClearML Logging #6014

Merged
merged 11 commits into from
Feb 28, 2023
Merged

Add ClearML Logging #6014

merged 11 commits into from
Feb 28, 2023

Conversation

ArtyomZemlyak
Copy link
Contributor

@ArtyomZemlyak ArtyomZemlyak commented Feb 14, 2023

What does this PR do ?

Adding ClearML integration for logging of training process:

  • Enable PyTorch auto logging (cfg parameter)
  • TensorBoard auto logging (on default)
  • Logging .nemo checkpoint files

Collection: [Note which collection this PR will affect]
utils

Changelog

  • Add new loggers module into nemo.utils
  • Add new clearml_logger.py into nemo.utils.loggers
  • Add new ClearMLParams and ClearMLLogger classes into clearml_logger.py
  • Move dllogger.py to nemo.utils.loggers
  • Integrate ClearMLLogger to PL loggers

Usage

In model config.yml file:

exp_manager:
  # You may use this section to create a ClearML logger
  create_clearml_logger: true
  clearml_logger_kwargs:
    task: "Test Integration ClearML" # Task name for this experiment 
    project: "Speech/ASR/RUS_OS/Train_NeMo" # Project name for this experiment
    tags: ["Test1", "2"] # Additional tags (default always include "NeMo" tag)
    model_name: "Some name" # Model name in ClearML (default from exp_manager:name)
    log_model: true # If true - logging model .nemo file ClearML
    log_cfg: true # If true - logging model config to model ClearML instance
    log_metrics: true # If true - logging lightning metrics to model ClearML instance
    connect_pytorch: false # If true - enable automatic pytorch logging (all saved checkpoints)

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

Artem Zemliak added 2 commits February 14, 2023 13:29
Signed-off-by: Artem Zemliak <azemlyak@smart-consulting.ru>
Signed-off-by: Artem Zemliak <azemlyak@smart-consulting.ru>
@ArtyomZemlyak
Copy link
Contributor Author

Hi!
@ericharper, @titu1994, @blisc, or @okuchaiev can you check?
I hope this addition can be good for a lot of users :)

Copy link
Collaborator

@titu1994 titu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for the PR ! While the PR does follow the general guidance on optional dependencies (thank you for that), it adds significant custom functions that do not particularly belong in exp manager.

A proposal would be to build a Pytorch Lighting Logger that handles the internals for you and then that object is imported and instantiated inside exp manager.

@SeanNaren please take a look

@ArtyomZemlyak
Copy link
Contributor Author

Thanks for explonaition! I first thought for implementing Pytorch Lighting Logger for ClearML. But I whanted to save .nemo models and misunderstood, I thought it can be done only outside lightning, only in exp_manager. Now I realized, what we can easely save .nemo models from checkpoint and internal config from lightning module (as it saves in exp_manager).

I will try to build a Pytorch Lighting Logger for ClearML with same funcs.

@SeanNaren
Copy link
Collaborator

Thanks for this PR @ArtyomZemlyak!

Agree with the above suggestion, if we can fit this into a ClearMLLogger class rather than touching the internals of the exp_manager that would be preferred. This might require refactor and if it does, let us know!

I thought this could go directly into Lightning but they changed their policy: Lightning-AI/pytorch-lightning#14137 (comment) unfortunately. Not sure why, considering these are important third party libraries (I think HF Trainer even integrates directly).

SeanNaren and others added 3 commits February 19, 2023 11:14
Signed-off-by: Артём Земляк <azemlyak@smart-consulting.ru>
Signed-off-by: Артём Земляк <azemlyak@smart-consulting.ru>
@ArtyomZemlyak
Copy link
Contributor Author

@titu1994 Tried implement ClearMLLogger for .nemo files.

@SeanNaren
Copy link
Collaborator

Looks fantastic! Just a small comment :)

Артём Земляк added 2 commits February 20, 2023 19:45
Signed-off-by: Артём Земляк <azemlyak@smart-consulting.ru>
Signed-off-by: Артём Земляк <azemlyak@smart-consulting.ru>
Артём Земляк and others added 3 commits February 21, 2023 22:01
Signed-off-by: Артём Земляк <azemlyak@smart-consulting.ru>
Signed-off-by: Артём Земляк <azemlyak@smart-consulting.ru>
Copy link
Collaborator

@SeanNaren SeanNaren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice addition!

@SeanNaren SeanNaren merged commit 89b073e into NVIDIA:main Feb 28, 2023
titu1994 pushed a commit to titu1994/NeMo that referenced this pull request Mar 24, 2023
* Add ClearML Logging

Signed-off-by: Artem Zemliak <azemlyak@smart-consulting.ru>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
hsiehjackson pushed a commit to hsiehjackson/NeMo that referenced this pull request Jun 2, 2023
* Add ClearML Logging

Signed-off-by: Artem Zemliak <azemlyak@smart-consulting.ru>
Co-authored-by: Sean Naren <snarenthiran@nvidia.com>
Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants