Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

separate model card git push from the rest #13514

Merged
merged 5 commits into from
Sep 14, 2021
Merged

separate model card git push from the rest #13514

merged 5 commits into from
Sep 14, 2021

Conversation

elishowk
Copy link

@elishowk elishowk commented Sep 10, 2021

What does this PR do?

  • After model card metadata contents validation was deployed to the Hub, we need to ensure transformer's trainer git push are not blocked because of an invalid README.mld yaml.

  • as discussed with @julien-c @Pierrci @sgugger and @LysandreJik the first step to match Hub's model card validation system is to avoid failing a whole git push after training, for the only reason that README.md metadata is not valid.

  • therefore, I tried in this PR to git push the training result independently from the modelcard update, so that the modelcard update failing does not fail the rest, keeping only logging for README.Md push failures.

  • Relates to display git push warnings huggingface_hub#326

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

@elishowk elishowk added model card Related to pretrained model cards work in progress trainer labels Sep 10, 2021
@elishowk elishowk self-assigned this Sep 10, 2021
Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

src/transformers/trainer.py Outdated Show resolved Hide resolved
self.create_model_card(model_name=model_name, **kwargs)
try:
self.repo.push_to_hub(commit_message="update model card README.md")
except Exception as exc:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether it's best to catch the error and log it, or just let it be raised naturally. I don't have a strong opinion on this so let's see what others think!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes me neither IDK, it's an ergonomy issue because at this stage, the training has been pushed, so the question is what should we do with a faulty Readme ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And without forgetting fixing model card metadata generation like with this issue #13528

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LysandreJik, any thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this happens mid-training, I would advocate for a very visible warning that the model card is incorrect, rather than erroring out. However, if it's possible to generate the model card before the training/evaluation starts (I understand some values will be invalid such as evaluation results) and identify a potential failure there, then we could error out.

Definitely fine to just log the error though.

PS: Shouldn't the exception caught be a bit more specific than Exception?

Copy link
Author

@elishowk elishowk Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the warnings on git push : git push warnings are now logged as a logger.warning cf. https://github.com/huggingface/huggingface_hub/pull/326/files
As explained the docs https://huggingface.co/transformers/main_classes/trainer.html?highlight=trainer#logging, the only possibility (AFAIK) where a user could not see the warning is if her/his script using transformers sets the logLevel to error only. Elsewhere, the user gets the same warning as huggingface-cli users. Am i wrong ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think logging at the error level is fine. Can you remove the PR from draft and mrge it?

Copy link
Author

@elishowk elishowk Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, please note that I mixed (may be I'm wring) a commit to update to the docs, because huggingface/model_card is gonna be deprecated soon I think

If not its place here, I can remove it

Copy link
Author

@elishowk elishowk Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I resolved the conflict by adding blocking=blocking. Is it okay for you ?

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
@julien-c
Copy link
Member

For tracking purposes, do we have another issue for datasets: - null @elishowk? (which will still fail with this, AFAICT)

@elishowk
Copy link
Author

For tracking purposes, do we have another issue for datasets: - null @elishowk? (which will still fail with this, AFAICT)

Yep : #13528

@elishowk elishowk marked this pull request as ready for review September 14, 2021 15:40
@elishowk elishowk merged commit 054b601 into huggingface:master Sep 14, 2021
@elishowk elishowk deleted the fix-trainer-modelcard branch September 14, 2021 16:07
@elishowk
Copy link
Author

🥳 first PR on transformers, thanks for your help you all !

Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 13, 2022
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
model card Related to pretrained model cards trainer work in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants