feat: Add ability to upload PT/TF models to Huggingface Hub #881

Merged
merged 44 commits into from
Apr 13, 2022

@felixdittrich92 felixdittrich92 commented Apr 1, 2022

This PR is a follow-up to @fg-mindee's object detection from_hub work and adds the following functionality:

  • you can add --push-to-hub as a flag to your training script
  • this asks you to log in to the HF Hub (if you are not already logged in)
  • creates a repo on the Hub for your model (same name as the experiment)
  • converts the model to the required format and saves the model configuration
  • adds a README with some information to the repo
  • and finally pushes to the Hub 😃
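
A minimal sketch of how such a flag could be wired into a training script with argparse. Only the flag name --push-to-hub comes from the description above; the --name argument, the helper names, and the step labels are assumptions made for illustration and are not taken from the PR's code.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for a training script; only --push-to-hub is from the PR."""
    parser = argparse.ArgumentParser(description="Training script (illustrative)")
    # Assumed argument: the experiment name, reused as the Hub repo name.
    parser.add_argument("--name", type=str, default="my-experiment")
    parser.add_argument(
        "--push-to-hub",
        dest="push_to_hub",
        action="store_true",
        help="push the trained model to the Hugging Face Hub",
    )
    return parser


def planned_steps(args: argparse.Namespace) -> List[str]:
    """Return the ordered steps the script would run (labels are illustrative)."""
    steps = ["train"]
    if args.push_to_hub:
        # Mirrors the bullet list above: login, repo, config, README, push.
        steps += ["login", f"create_repo:{args.name}", "save_config", "add_readme", "push"]
    return steps
```

Without the flag, the script would only train; with it, the upload steps run after training in the order listed above.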

Any feedback is very welcome 🤗
@osanseviero Do you have any suggestions as to whether we have forgotten anything or could do better? 😄

@fg-mindee @charlesmindee
I am not sure if this is the correct place for the file. Maybe add it in doctr.utils.hub so we can add some tests?

Example Uploads:
https://huggingface.co/Felix92/Test-recognition-PyTorch
https://huggingface.co/Felix92/Test-recognition-TensorFlow

Note: this PR includes a cfg fix for TF ResNet50 and FasterRCNN, linked to #883 (sorry for the mistake 😅)

codecov bot commented Apr 1, 2022

Codecov Report

Merging #881 (7c0a2be) into main (1b4b687) will decrease coverage by 0.22%.
The diff coverage is 73.41%.

@@            Coverage Diff             @@
##             main     #881      +/-   ##
==========================================
- Coverage   94.92%   94.69%   -0.23%     
==========================================
  Files         133      135       +2     
  Lines        5279     5355      +76     
==========================================
+ Hits         5011     5071      +60     
- Misses        268      284      +16     
Flag        Coverage Δ
unittests   94.69% <73.41%> (-0.23%) ⬇️


Impacted Files                                      Coverage Δ
doctr/models/factory/hub.py                         66.12% <66.12%> (ø)
doctr/models/__init__.py                            100.00% <100.00%> (ø)
doctr/models/classification/resnet/tensorflow.py    100.00% <100.00%> (ø)
doctr/models/detection/zoo.py                       96.29% <100.00%> (+0.29%) ⬆️
doctr/models/factory/__init__.py                    100.00% <100.00%> (ø)
doctr/models/obj_detection/faster_rcnn/pytorch.py   100.00% <100.00%> (ø)
doctr/models/recognition/zoo.py                     100.00% <100.00%> (ø)
doctr/datasets/vocabs.py                            100.00% <0.00%> (ø)
doctr/transforms/modules/base.py                    94.59% <0.00%> (ø)
doctr/models/recognition/sar/pytorch.py             100.00% <0.00%> (+0.79%) ⬆️
... and 4 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@felixdittrich92
Contributor Author

Depends on: #883

Collaborator

@charlesmindee charlesmindee left a comment

Thanks for this amazing feature! As for the location, I think it might be better to include it in the package, in a models//factory folder or in a models/factory folder as @fg-mindee did for the from_hub() method (since it is a symmetrical operation and can be decoupled from training).
What do you think?

references/hub.py Outdated Show resolved Hide resolved
@felixdittrich92
Contributor Author

@charlesmindee OK, I have moved it to models/factory. It works for both TF and PT, so we do not need to split it :)

But ... any idea how we can test this? 😅 I am really not sure how to do it without a test account 😃

Collaborator

@frgfm frgfm left a comment

Thanks for the PR Felix 🙏

Regarding the general idea of this feature, it needs to be complementary to "from_hub". For this, I would suggest that we try to find a way to build the config structure by default inside the function (perhaps by passing the content as args or kwargs and creating the dictionary in the function).

Either way, pushing a model from a shell should only require a minimal snippet.

So I'd argue that we need to ensure that:

  • we create models with a comprehensive config attribute
  • we use that attribute to create the config and push to hub
  • this config is then used with from_hub to recreate the model
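
The round trip described in these three points can be sketched as follows. This is a toy illustration, not docTR's implementation: ToyModel, export_config, and rebuild_from_config are hypothetical names, and the JSON layout (cfg entries plus "arch" and "task" keys) is an assumption.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ToyModel:
    """Stand-in for a real model: it only carries an architecture name and a cfg dict."""

    arch: str
    cfg: Dict[str, Any] = field(default_factory=dict)


def export_config(model: ToyModel, task: str) -> str:
    """Serialize everything needed to rebuild the model (assumed layout)."""
    payload = dict(model.cfg)
    payload.update({"arch": model.arch, "task": task})
    return json.dumps(payload, indent=2, sort_keys=True)


def rebuild_from_config(raw: str) -> ToyModel:
    """Inverse of export_config: recreate the model from the stored config."""
    payload = json.loads(raw)
    arch = payload.pop("arch")
    payload.pop("task")  # a real loader would use the task to pick a model registry
    return ToyModel(arch=arch, cfg=payload)
```

If the config attribute is comprehensive, exporting and then rebuilding yields an equivalent model, which is the property a from_hub counterpart relies on.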

About testing, let's narrow down what we want to test:

  • the HF API is well tested, so we need to focus on our own lines of code
  • the process here is to create a dictionary, dump the parameters, create a git repo with them, and push it
  • with pytest's temporary directories, I'd argue we can test everything up to the creation of the repo
  • the question now is how easily we can split those steps

What do you think? 🙂
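
One way the steps could be split so that everything before the push is testable offline, as a rough sketch. The function names, the config.json file name, and the injected upload callable are assumptions for illustration; the standard-library tempfile module stands in here for pytest's temporary directories.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict, List


def write_repo_files(repo_dir: Path, config: Dict[str, Any], readme: str) -> None:
    """Local step: dump the config and README into the future repo folder."""
    repo_dir.mkdir(parents=True, exist_ok=True)
    (repo_dir / "config.json").write_text(json.dumps(config, indent=2), encoding="utf-8")
    (repo_dir / "README.md").write_text(readme, encoding="utf-8")


def push_local_repo(
    repo_dir: Path,
    config: Dict[str, Any],
    readme: str,
    upload: Callable[[Path], None],
) -> None:
    """Run the local steps, then hand the folder to an injected upload function."""
    write_repo_files(repo_dir, config, readme)
    upload(repo_dir)  # in production this would call the Hub client


def check_local_steps() -> List[Path]:
    """Exercise everything except the network call, using a fake uploader."""
    uploaded: List[Path] = []
    with tempfile.TemporaryDirectory() as tmp:
        repo_dir = Path(tmp) / "my-model"
        push_local_repo(repo_dir, {"arch": "toy", "task": "recognition"}, "# my-model\n", uploaded.append)
        stored = json.loads((repo_dir / "config.json").read_text(encoding="utf-8"))
        assert stored["task"] == "recognition"
        assert (repo_dir / "README.md").exists()
    return uploaded
```

Because the upload is passed in as a callable, a test can replace it with a list append and never needs a test account.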

doctr/models/factory/hub.py Outdated Show resolved Hide resolved
@felixdittrich92
Contributor Author

felixdittrich92 commented Apr 5, 2022

@frgfm
Nice that you are back 🤗 👍

For the config part:

  • I currently use model.cfg (#883, "[classification] Fix cfgs", fixes the last of them). It is updated during training; otherwise it is the default_cfg. I think this is enough to rebuild the model. In addition, the model arch is stored, and now also the task (recognition/detection/classification/object_detection).

  • I have added some tests; I think they will cover most of our side. wdyt?

The currently failing tests depend on #883 being merged, plus 2 TF tests about get_layer, which I will check later.

@felixdittrich92
Contributor Author

felixdittrich92 commented Apr 5, 2022

@frgfm @charlesmindee
wdyt ? 🤗

@felixdittrich92 felixdittrich92 changed the title from "[WIP][references] feat: Add ability to upload PT/TF models to Huggingface Hub" to "feat: Add ability to upload PT/TF models to Huggingface Hub" Apr 5, 2022
@fharper
Contributor

fharper commented Apr 8, 2022

@charlesmindee can you follow up on the discussion FG started for this PR please?

@felixdittrich92
Contributor Author

@charlesmindee The failing test has nothing to do with this PR. Please trigger it again if we are fine with the other stuff :)

Collaborator

@charlesmindee charlesmindee left a comment

Thanks for the tests & the edits!

@charlesmindee charlesmindee merged commit f78b22d into mindee:main Apr 13, 2022
@fharper
Contributor

fharper commented Apr 13, 2022

Thanks for the PR @felixdittrich92 🎉

@frgfm
Collaborator

frgfm commented Apr 27, 2022

Missing "type: new feature", "module: models", "ext: references" and "ext: test" as PR labels I think @charlesmindee :)

@frgfm frgfm added the labels "topic: documentation" (Improvements or additions to documentation), "module: models" (Related to doctr.models), "ext: references" (Related to references folder), and "type: new feature" (New feature) May 2, 2022
@felixdittrich92 felixdittrich92 added this to the 0.6.0 milestone Jun 28, 2022
@felixdittrich92 felixdittrich92 mentioned this pull request Jun 29, 2022
85 tasks
Labels
  • ext: references (Related to references folder)
  • module: models (Related to doctr.models)
  • topic: documentation (Improvements or additions to documentation)
  • type: new feature (New feature)
4 participants