[feat] Model version control using W&B Artifacts #1137
…rgs (#1) ability to log config file, initialize wandb with kwargs and pass entity argument for teams account.
Hey @ebsmothers, thought of tagging you here for visibility since you looked over my first PR.
@ayulockin Thanks for the PR! Give us a few days to review this. We will get back to you soon.
Thanks for the PR, and for your patience on the review. The changes look good. Can you rebase to factor out the changes from PR#1129? Alternatively we can just close the other PR and use this one instead, whichever you prefer.
Hey @ebsmothers, I rebased to factor in the changes. This PR now contains all the changes from PR#1129. Please take a look and let me know. If you want, you can close PR#1129.
@ebsmothers has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@ayulockin has updated the pull request. You must reimport the pull request before landing.
@ebsmothers has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary:
🚀 I have extended the `WandbLogger` with the ability to log the `current.pt` checkpoint as W&B Artifacts. Note that this PR is based on top of this [PR](#1129).

### What is W&B Artifacts?

> W&B Artifacts was designed to make it effortless to version your datasets and models, regardless of whether you want to store your files with us or whether you already have a bucket you want us to track. Once you've tracked your dataset or model files, W&B will automatically log each and every modification, giving you a complete and auditable history of changes to your files.

Through this PR, W&B Artifacts can help save and organize machine learning models throughout a project's lifecycle. More details in the documentation [here](https://docs.wandb.ai/guides/artifacts/model-versioning).

### Modification

This PR adds a `log_model_checkpoint` method to the `WandbLogger` class in the `utils/logger.py` file. This method is called in the `utils/checkpoint.py` file.

### Usage

To use this, in the `config/defaults.yaml` do, `training.wandb.enabled=true` and `training.wandb.log_checkpoint=true`.

### Result

The screenshot shows the `current.pt` checkpoints saved at intervals defined by `training.checkpoint_interval`. You can check out the logged artifacts page [here](https://wandb.ai/ayut/mmf/artifacts/model/run_ey9xextf_model/0dc64164acbdc300fd01/api).

![image](https://user-images.githubusercontent.com/31141479/139390462-d5c8445e-5c20-4fdd-85d0-51ef64846bf0.png)

### Superpowers

With this small addition, now one can easily track different versions of the model, download a checkpoint of interest by using the API in the API tab, easily share the checkpoints with teammates, etc.

### Requests

This is a draft PR as there are a few more things that can be improved here.

* Is there a better way to access the path to the `current.pt` checkpoint? Rather is the modification made to `utils/checkpoint.py` an acceptable way of approaching this?
* While logging a file as W&B artifacts we can also provide metadata associated with that file. In this case, we can add current iteration, training metrics, etc. as the metadata. Would love to get suggestions about the different data points that I should log as metadata alongside the checkpoints.
* How to determine if a checkpoint is the best one? If a checkpoint is best I can add `best` as an alias for that checkpoint's artifact.

Pull Request resolved: #1137

Test Plan: Imported from GitHub, without a `Test Plan:` line.

**Static Docs Preview: mmf**
|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D32402090/V6/mmf/)|
|**Modified Pages**|
|[docs/notes/logger](https://our.intern.facebook.com/intern/staticdocs/eph/D32402090/V6/mmf/docs/notes/logger/)|

Reviewed By: apsdehal

Differential Revision: D32402090

Pulled By: ebsmothers

fbshipit-source-id: 94b881ec55c4197301331d571bc926521e2feecc
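🚀 I have extended the `WandbLogger` with the ability to log the `current.pt` checkpoint as W&B Artifacts. Note that this PR is based on top of this [PR](#1129).

### What is W&B Artifacts?

> W&B Artifacts was designed to make it effortless to version your datasets and models, regardless of whether you want to store your files with us or whether you already have a bucket you want us to track. Once you've tracked your dataset or model files, W&B will automatically log each and every modification, giving you a complete and auditable history of changes to your files.

Through this PR, W&B Artifacts can help save and organize machine learning models throughout a project's lifecycle. More details in the documentation [here](https://docs.wandb.ai/guides/artifacts/model-versioning).

### Modification

This PR adds a `log_model_checkpoint` method to the `WandbLogger` class in `utils/logger.py`. The method is called from `utils/checkpoint.py`.

### Usage

To use this, set `training.wandb.enabled=true` and `training.wandb.log_checkpoint=true` in `config/defaults.yaml`.

### Result

The screenshot shows the `current.pt` checkpoints saved at intervals defined by `training.checkpoint_interval`. You can check out the logged artifacts page [here](https://wandb.ai/ayut/mmf/artifacts/model/run_ey9xextf_model/0dc64164acbdc300fd01/api).

![image](https://user-images.githubusercontent.com/31141479/139390462-d5c8445e-5c20-4fdd-85d0-51ef64846bf0.png)

### Superpowers

With this small addition, one can easily track different versions of the model, download a checkpoint of interest using the snippet in the artifact's API tab, share checkpoints with teammates, etc.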
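
For readers who have not used the API tab, downloading a logged checkpoint might look roughly like the sketch below. It relies on the public `wandb.Api().artifact(...)` and `Artifact.download(...)` calls. The helper names `artifact_ref` and `download_checkpoint` are hypothetical, and the `run_<id>_model` naming is an assumption inferred from the artifact URL linked above, not something this PR guarantees.

```python
def artifact_ref(entity, project, run_id, alias="latest"):
    """Build an artifact reference of the form entity/project/name:alias.

    Assumption: artifacts are named `run_<run_id>_model`, as the linked
    artifact page suggests.
    """
    return f"{entity}/{project}/run_{run_id}_model:{alias}"


def download_checkpoint(ref, root=None):
    """Download the artifact's files and return the local directory path."""
    # Imported lazily so artifact_ref() can be used without wandb installed.
    import wandb

    api = wandb.Api()  # needs a configured W&B API key
    artifact = api.artifact(ref, type="model")
    return artifact.download(root=root)


# Reference for the run shown in the screenshot (no network access needed).
ref = artifact_ref("ayut", "mmf", "ey9xextf")
# local_dir = download_checkpoint(ref)  # requires login and network access
```

Swapping `latest` for a version alias such as `v3` would pin a specific checkpoint instead of the most recent one.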
### Requests

This is a draft PR, as there are a few more things that can be improved here.

* Is there a better way to access the path to the `current.pt` checkpoint? Or rather, is the modification made to `utils/checkpoint.py` an acceptable way of approaching this?
* While logging a file as a W&B Artifact, we can also provide metadata associated with that file. In this case, we can add the current iteration, training metrics, etc. as metadata. I would love suggestions on which data points to log as metadata alongside the checkpoints.
* How should we determine whether a checkpoint is the best one? If a checkpoint is the best, I can add `best` as an alias for that checkpoint's artifact.
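
One possible answer to the last two questions, sketched under stated assumptions: attach the iteration and metrics as artifact metadata, and add the `best` alias only when a monitored metric improves. This is not the code in this PR. `checkpoint_aliases` and `log_checkpoint_sketch` are hypothetical names, and the choice of monitored metric and the `minimize` flag are assumptions. Only `wandb.Artifact`, `Artifact.add_file`, and `Run.log_artifact` are real W&B calls.

```python
import math


def checkpoint_aliases(iteration, metric_value, best_so_far, minimize=True):
    """Return (aliases, new_best) for a checkpoint.

    `best` is added only when the monitored metric improves on `best_so_far`.
    """
    aliases = ["latest", f"iter_{iteration}"]
    if minimize:
        improved = metric_value < best_so_far
    else:
        improved = metric_value > best_so_far
    if improved:
        aliases.append("best")
        best_so_far = metric_value
    return aliases, best_so_far


def log_checkpoint_sketch(run, checkpoint_path, iteration, metrics, aliases):
    """Log one checkpoint file as a model artifact with metadata and aliases."""
    # Imported lazily so checkpoint_aliases() works without wandb installed.
    import wandb

    artifact = wandb.Artifact(
        name=f"run_{run.id}_model",  # assumed naming scheme
        type="model",
        metadata={"iteration": iteration, **metrics},
    )
    artifact.add_file(checkpoint_path, name="current.pt")
    run.log_artifact(artifact, aliases=aliases)


# Two checkpoints with validation loss 0.8, then 0.9: only the first is `best`.
first, best = checkpoint_aliases(1000, 0.8, math.inf)
second, best = checkpoint_aliases(2000, 0.9, best)
```

Because W&B moves an alias to the newest artifact version that carries it, `best` would always point at the most recent improving checkpoint.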