Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds dvclive tracker #2139

Merged
merged 6 commits into from
Nov 17, 2023
Merged

Conversation

dberenbaum
Copy link
Contributor

@dberenbaum dberenbaum commented Nov 9, 2023

What does this PR do?

Adds a DVCLive tracker. DVCLive is a logger for DVC, which version data, models, metrics, etc. and keeps them connected to your Git repo. They can then be visualized in several different interfaces.

A PR was recently merged to add a logger to transformers, and this PR adds support for accelerate.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@muellerzr From what I can tell, it looks like you would be the relevant reviewer. Please feel free to redirect me otherwise.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@muellerzr muellerzr self-requested a review November 10, 2023 01:08
Copy link
Collaborator

@muellerzr muellerzr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! Very beautiful PR, thanks for this! One last little thing, can you add this to the test_trackers section of the setup.py so the tests can be ran?

@dberenbaum
Copy link
Contributor Author

Excellent! Very beautiful PR, thanks for this! One last little thing, can you add this to the test_trackers section of the setup.py so the tests can be ran?

Thanks for catching that! Added it in the latest commit.

@dberenbaum
Copy link
Contributor Author

Will take a look at the test failures

Copy link
Member

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very clean implementation, thanks for that. I have only some small comments about usage and setting the step, the rest looks good.

A `Tracker` class that supports `dvclive`. Should be initialized at the start of your script.

Args:
run_name (`str`):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's type should mention that it's optional. Also, I don't quite understand what I should do with kwargs here. I think an example would be useful to have.

kwargs:
Additional key word arguments passed along to `dvclive.Live.log_metric()`.
"""
if step:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if step == 0. Should it not still be set? I.e., would it be safer to check if step is not None?

@dberenbaum
Copy link
Contributor Author

@BenjaminBossan Thanks for the catching those issues and the quick review! All should be resolved now.

Copy link
Member

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for addressing my comments, LGTM.

Copy link
Collaborator

@muellerzr muellerzr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all your hard work and iterating with us over this!

@muellerzr muellerzr merged commit 99877f5 into huggingface:main Nov 17, 2023
23 checks passed


@require_dvclive
@mock.patch("dvclive.live.get_dvc_repo", return_value=None)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dberenbaum FYI we're seeing failures on the transformers side in the example scripts because of this. Is there a better way to ignore the git repo than patching for cases where we're running a full training script? (Like via an env variable or something). Otherwise I'll have to disable dvc for all of the example script tests in transformers which wouldn't be ideal. https://github.com/huggingface/accelerate/actions/runs/7023901505/job/19111477258

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to take a closer look when I have a minute, but you can use DVCLIVE_TEST = true to disable the git-related functionality. For example:

@mock.patch.dict(os.environ, {"WANDB_MODE": "offline", "DVCLIVE_TEST": "true"})

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I missed that one, fantastic! That should do the job just fine I think. I’ll get to that tommorow and cc you. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is #2196 related?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s not, we saw dependency errors in an isolated env related to the release

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Added a comment there since it should be fixed now. Let me know if it would be helpful to submit a PR for either or if it's quicker to take care of it yourself.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dberenbaum a PR would be most welcome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants