Skip to content

Conversation

@pacifikus
Copy link
Contributor

Closes #83

@daavoo daavoo changed the title Dev Huggingface integration Aug 13, 2021
@daavoo daavoo self-assigned this Aug 13, 2021
Copy link
Contributor

@daavoo daavoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @pacifikus ! It looks great.

Could you:

super().__init__()
self.model_file = model_file

def on_evaluate(
Copy link
Contributor

@daavoo daavoo Aug 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why the Callbacks that are part of HuggingFace use on_log instead of on_evaluate:

https://huggingface.co/transformers/_modules/transformers/integrations.html#CometCallback

This callback does look simpler than those other implementations, but I'm just curious about that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your comment!

PR in dvc.org: iterative/dvc.org#2718

Yes, there is a reason why the Callbacks that are part of HuggingFace use on_log instead of on_evaluate. As you can see in https://huggingface.co/transformers/main_classes/callback.html#trainercallback, metrics are only available in the on_evaluate event. Thus, we can get the loss values both at the train and eval from on_log, but eval metrics only from on_evaluate.

The on_evaluate event has its disadvantage: we cannot get the training loss during this event, only the evaluation loss.

Copy link
Contributor

@daavoo daavoo Aug 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. Would it make sense to extend DvcLiveCallback to use both on_log (for train and eval losses) and on_evaluate (for eval metrics) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it's a good idea. I did it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the huggingface docs might be a little misleading. I just debug the test locally and it seems that the same values available as metrics inside on_evaluate are previously being passed as logs to on_log.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right, it's strange, but metrics logs are also written in on_log event. Thank you for your comment!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, now for different operating systems, different len(first(logs.values())) values in the test are obtained

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that is strange, I will debug locally the macOS test case which is the one failing

Copy link
Contributor

@daavoo daavoo Aug 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still don't understand what the bug was. If I have to guess, it might be related with first behaving differently in MacOS. Anyway I made the test more explicit and it now passes 🎉

@daavoo daavoo self-requested a review August 14, 2021 11:32
@codecov-commenter
Copy link

codecov-commenter commented Aug 14, 2021

Codecov Report

Merging #137 (c236003) into master (01c0513) will increase coverage by 0.43%.
The diff coverage is 100.00%.

❗ Current head c236003 differs from pull request most recent head 9b48440. Consider uploading reports for the commit 9b48440 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #137      +/-   ##
==========================================
+ Coverage   90.00%   90.43%   +0.43%     
==========================================
  Files          12       13       +1     
  Lines         310      324      +14     
==========================================
+ Hits          279      293      +14     
  Misses         31       31              
Impacted Files Coverage Δ
dvclive/huggingface.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 01c0513...9b48440. Read the comment docs.

@daavoo daavoo merged commit 7023203 into iterative:master Aug 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

integrations: HuggingFace transformers

3 participants