-
Notifications
You must be signed in to change notification settings - Fork 25.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for private models from huggingface.co #9141
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, though the documentation needs tweaking IMO.
Specify token to use as Bearer authorization for remote files. If true, will get token from | ||
~/.huggingface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specify token to use as Bearer authorization for remote files. If true, will get token from | |
~/.huggingface. | |
The token to use as bearer authorization for the remote files. If :obj:`True`, will use the token | |
generated when running :obj:`transformers-cli login` (stored in :obj:`~/.huggingface`). |
Think it's better to tell the user what to do to get that token in the right place than where the token is stored (unless I misunderstood where that token is coming from).
This might also warrant a note like:
.. note::
Passing a token is required when you want to use a private model.
if the default is to not use the token when a user is logged in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense @sgugger. I'll wait for other potential comments and batch them tomorrow.
Re. the note, where would I add it? e.g. https://huggingface.co/transformers/philosophy.html#main-concepts ?
Thoughts on your side on the "should make it enabled by default" question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking in the doc string of from_pretrained
, but it can also go there. The more the merrier since people don't read all the doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for "should make it enabled by default", I have no strong feeling either way. It makes sense to me to have some added code where something won't work out of the box for all users. It also makes sense to want logged-in users to be able to use it automatically. I think we can start with your approach and see users thoughts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⚠️ For now, I decided against adding token by default to all calls if user is logged in. Let's discuss though!
I think that for a more seamless experience it makes sense to add the token automatically for logged-in users, but otherwise LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd vote for use_auth_token
to be just Optional[str]
and to automatically allow access to privet model by default if user is authorized. I don't see a reason why someone would not want to have access to his/her model if authorized.
IMO, this is both cleaner in terms of design (don't like Union[str, bool]
) and a nicer UX.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
On the default approach, I am also in favor of taking the token by default too.
The only drawback I can see is that now code depends on the token file, which might lead to some confusion for people running code in a different place. We probably should make sure the error they receive is something different than 404 does not exist but 403 forbidden so users can figure out what is the problem.
What do you think ?
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
@patrickvonplaten I don't follow here. In this PR we want to have a way to either pass a token directly, or to opt in to use the one that's store in |
I might have misunderstood a bit what constrains there are on the functionality. I thought, the following logic is possible and makes sense here:
Not sure if there is something I am completely overlooking here in the logic though, e.g. if we cannot know before hand whether the model is private or not |
Ok never mind - as discussed offline this would require more features to add which is out-of-scope for this PR -> so LGTM! |
Also cc'ing @borisdayma as this PR adds a |
Co-Authored-By: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, very nice functionality. I like that it doesn't log anything more than the user agent in the default case.
LGTM
* minor wording tweaks * Create private model repo + exist_ok flag * file_utils: `use_auth_token` * Update src/transformers/file_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Propagate doc from @sgugger Co-Authored-By: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Add a
use_auth_token
flag (or string) to allfrom_pretrained
entry points, to specify token to use as Bearer authorization for remote files.~/.huggingface/token
(will throw if no token there)You can test this with:
We'll add unit tests down the line but need to think about which environment those tests are going to hit.