[feat] Model version control using W&B Artifacts #1137

ayulockin · 2021-10-29T07:20:06Z

🚀 I have extended the WandbLogger with the ability to log the current.pt checkpoint as a W&B Artifact. Note that this PR builds on PR #1129.

What is W&B Artifacts?

W&B Artifacts was designed to make it effortless to version your datasets and models, regardless of whether you want to store your files with us or whether you already have a bucket you want us to track. Once you've tracked your dataset or model files, W&B will automatically log each and every modification, giving you a complete and auditable history of changes to your files.

With this PR, W&B Artifacts can be used to save and organize machine learning models throughout a project's lifecycle. More details are in the documentation: https://docs.wandb.ai/guides/artifacts/model-versioning

Modification

This PR adds a log_model_checkpoint method to the WandbLogger class in the utils/logger.py file. This method is called in the utils/checkpoint.py file.
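As a rough illustration of what such a method might look like (a sketch under stated assumptions, not the code in this PR), the function below wraps a checkpoint file in a W&B artifact of type `model` and logs it to the active run. The `run_<id>_model` naming mirrors the artifact name in the W&B URL linked from this PR; the helper `checkpoint_artifact_name` and the standalone function signature are hypothetical.

```python
import os


def checkpoint_artifact_name(run_id: str) -> str:
    """Build an artifact name such as 'run_ey9xextf_model' (naming is an assumption)."""
    return f"run_{run_id}_model"


def log_model_checkpoint(model_path: str) -> None:
    """Upload the checkpoint at model_path as a W&B artifact of type 'model'."""
    import wandb  # imported lazily so the helper above works without wandb installed

    if wandb.run is None:
        raise RuntimeError("wandb.init() must be called before logging a checkpoint")
    if not os.path.isfile(model_path):
        raise FileNotFoundError(model_path)

    # Logging under the same artifact name on every call creates a new version each time.
    artifact = wandb.Artifact(checkpoint_artifact_name(wandb.run.id), type="model")
    artifact.add_file(model_path, name="current.pt")
    wandb.run.log_artifact(artifact)
```

Because the artifact name stays fixed for a run, each call made at a checkpoint interval would show up as a new version of the same artifact rather than as a separate artifact.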

Usage

To use this, set training.wandb.enabled=true and training.wandb.log_checkpoint=true in config/defaults.yaml.
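To make the interaction of the two flags concrete, here is a minimal sketch that assumes checkpoints are uploaded only when both are true. The plain-dict config, the function name `should_log_checkpoint`, the `False` defaults, and the `checkpoint_interval` value are illustrative assumptions, not taken from the PR.

```python
def should_log_checkpoint(config: dict) -> bool:
    """Return True only if W&B logging and checkpoint logging are both enabled."""
    wandb_cfg = config.get("training", {}).get("wandb", {})
    # Both flags are assumed to default to False when absent.
    return bool(wandb_cfg.get("enabled", False)) and bool(
        wandb_cfg.get("log_checkpoint", False)
    )


# Hypothetical nested view of the relevant part of the config.
config = {
    "training": {
        "checkpoint_interval": 1000,  # illustrative value
        "wandb": {"enabled": True, "log_checkpoint": True},
    }
}

if should_log_checkpoint(config):
    print("current.pt will be logged as a W&B artifact at each checkpoint")
```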

Result

The screenshot shows the current.pt checkpoints saved at intervals defined by training.checkpoint_interval. You can view the logged artifacts page at https://wandb.ai/ayut/mmf/artifacts/model/run_ey9xextf_model/0dc64164acbdc300fd01/api.

Superpowers

With this small addition, one can track different versions of the model, download a checkpoint of interest using the snippet shown in the artifact's API tab, and share checkpoints with teammates.
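For reference, downloading a logged checkpoint through the W&B public API could look roughly like the sketch below. The helper names and the `run_<id>_model` naming are assumptions, and the entity, project, and run id in the usage comment are example values only; `latest` is the alias W&B assigns to the newest version of an artifact.

```python
def artifact_path(entity: str, project: str, run_id: str, alias: str = "latest") -> str:
    """Build a fully qualified artifact reference: entity/project/name:alias."""
    return f"{entity}/{project}/run_{run_id}_model:{alias}"


def download_checkpoint(entity: str, project: str, run_id: str, alias: str = "latest") -> str:
    """Download the artifact's files and return the local directory path."""
    import wandb  # imported lazily; requires `pip install wandb` and a logged-in user

    api = wandb.Api()
    artifact = api.artifact(artifact_path(entity, project, run_id, alias), type="model")
    return artifact.download()


# Example usage (performs a network call, so it is left commented out):
# local_dir = download_checkpoint("ayut", "mmf", "ey9xextf")
```

A specific version can be requested by passing a version alias such as `v3` instead of `latest`.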

Requests

This is a draft PR as there are a few more things that can be improved here.

* Is there a better way to access the path to the current.pt checkpoint? Or rather, is the modification made to utils/checkpoint.py an acceptable approach?
* When logging a file as a W&B artifact, we can also attach metadata to it. In this case, we could add the current iteration, training metrics, etc. I would love suggestions about which data points to log as metadata alongside the checkpoints.
* How can I determine whether a checkpoint is the best one? If it is, I can add best as an alias for that checkpoint's artifact.
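One possible way to approach the last two questions, sketched under assumptions rather than taken from the PR: build a metadata dict from the current iteration and metrics, and decide on the `best` alias by comparing a monitored validation metric against the best value seen so far. The function names, the metric key, and the comparison rule are all hypothetical.

```python
from typing import Dict, List, Optional


def checkpoint_metadata(iteration: int, metrics: Dict[str, float]) -> Dict[str, float]:
    """Combine the iteration count and metric values into one metadata dict."""
    return {"iteration": iteration, **metrics}


def checkpoint_aliases(
    current: float, best_so_far: Optional[float], higher_is_better: bool = True
) -> List[str]:
    """Return the aliases for a checkpoint, adding 'best' if the metric improved."""
    aliases = ["latest"]
    if best_so_far is None:
        is_best = True  # the first checkpoint is trivially the best so far
    elif higher_is_better:
        is_best = current > best_so_far
    else:
        is_best = current < best_so_far
    if is_best:
        aliases.append("best")
    return aliases
```

The results could then be passed as `metadata=` when constructing `wandb.Artifact(...)` and as `aliases=` when calling `log_artifact(...)`.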

ayulockin · 2021-11-08T08:33:05Z

Hey @ebsmothers, thought of tagging you here for visibility since you looked over my first PR.

apsdehal · 2021-11-09T18:37:14Z

@ayulockin Thanks for the PR! Give us a few days to review this. We will get back to you soon.

ebsmothers

Thanks for the PR, and for your patience on the review. The changes look good. Can you rebase to factor out the changes from PR#1129? Alternatively we can just close the other PR and use this one instead, whichever you prefer.

ayulockin · 2021-11-12T21:17:47Z

Hey @ebsmothers, I rebased to factor in the changes. This PR now contains all the changes from PR#1129. Please take a look and let me know. If you want, you can close PR#1129.

facebook-github-bot · 2021-11-12T21:46:08Z

@ebsmothers has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2021-11-23T15:05:07Z

@ayulockin has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2021-11-23T15:06:00Z

@ebsmothers has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Summary:
🚀 I have extended the `WandbLogger` with the ability to log the `current.pt` checkpoint as W&B Artifacts. Note that this PR is based on top of this [PR](#1129).

### What is W&B Artifacts?

> W&B Artifacts was designed to make it effortless to version your datasets and models, regardless of whether you want to store your files with us or whether you already have a bucket you want us to track. Once you've tracked your dataset or model files, W&B will automatically log each and every modification, giving you a complete and auditable history of changes to your files.

Through this PR, W&B Artifacts can help save and organize machine learning models throughout a project's lifecycle. More details in the documentation [here](https://docs.wandb.ai/guides/artifacts/model-versioning).

### Modification

This PR adds a `log_model_checkpoint` method to the `WandbLogger` class in the `utils/logger.py` file. This method is called in the `utils/checkpoint.py` file.

### Usage

To use this, in the `config/defaults.yaml` do, `training.wandb.enabled=true` and `training.wandb.log_checkpoint=true`.

### Result

The screenshot shows the `current.pt` checkpoints saved at intervals defined by `training.checkpoint_interval`. You can check out the logged artifacts page [here](https://wandb.ai/ayut/mmf/artifacts/model/run_ey9xextf_model/0dc64164acbdc300fd01/api).

![image](https://user-images.githubusercontent.com/31141479/139390462-d5c8445e-5c20-4fdd-85d0-51ef64846bf0.png)

### Superpowers

With this small addition, now one can easily track different versions of the model, download a checkpoint of interest by using the API in the API tab, easily share the checkpoints with teammates, etc.

### Requests

This is a draft PR as there are a few more things that can be improved here.

* Is there a better way to access the path to the `current.pt` checkpoint? Rather is the modification made to `utils/checkpoint.py` an acceptable way of approaching this?
* While logging a file as W&B artifacts we can also provide metadata associated with that file. In this case, we can add current iteration, training metrics, etc. as the metadata. Would love to get suggestions about the different data points that I should log as metadata alongside the checkpoints.
* How to determine if a checkpoint is the best one? If a checkpoint is best I can add `best` as an alias for that checkpoint's artifact.

Pull Request resolved: #1137

Test Plan: Imported from GitHub, without a `Test Plan:` line.

**Static Docs Preview: mmf**
|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D32402090/V6/mmf/)|
|**Modified Pages**|
|[docs/notes/logger](https://our.intern.facebook.com/intern/staticdocs/eph/D32402090/V6/mmf/docs/notes/logger/)|

Reviewed By: apsdehal

Differential Revision: D32402090

Pulled By: ebsmothers

fbshipit-source-id: 94b881ec55c4197301331d571bc926521e2feecc

ayulockin and others added 8 commits October 20, 2021 19:14

[refactor] Extend WandbLogger to log config variables, entity and kwargs (#1) ability to log config file, initialize wandb with kwargs and pass entity argument for teams account.

78b1428

Merge branch 'facebookresearch:main' into main

f879c53

Merge branch 'facebookresearch:main' into main

1a74a25

cleaned passing of kwargs, added wandb_logger to write validation metrics, log lr

a0decd2

update docs

a97cf2a

init kwargs (#3)

5cc98b8

wandb checkpointing

5d4ee99

wandb default false

aec0bd7

facebook-github-bot added the CLA Signed label Oct 29, 2021

ebsmothers added 2 commits November 1, 2021 16:35

replace usage of open_dict

5dbbbd8

Add config back to WandB init

23addf0

ayulockin marked this pull request as ready for review November 8, 2021 08:29

ayulockin and others added 3 commits November 10, 2021 02:34

minor change to correct the error it was throwing (#4)

2571772

update checkpointing

8f4a55e

remove extra condition to check

2c5e240

ebsmothers requested changes Nov 11, 2021

ayulockin added 5 commits November 13, 2021 02:25

wandb checkpointing

443f8c2

wandb default false

dd9db1d

update checkpointing

2d0e08e

remove extra condition to check

0059302

Merge branch 'wandb-ckpt' of https://github.com/ayulockin/mmf into wandb-ckpt

124cedc

ayulockin mentioned this pull request Nov 22, 2021

[feat] Integrating W&B Tables for Prediction Visualization #1154

Open

3 tasks

minor doc fix

224daff

facebook-github-bot closed this Nov 23, 2021

ayulockin mentioned this pull request Nov 24, 2021

[refactor] Extend WandbLogger to log config variables, entity and kwargs #1129

Closed
