Refactor git credential handling in login workflow #1138

Merged
Wauplin merged 17 commits into main from 1051-git-credential-refactoring on Nov 4, 2022

Conversation

Wauplin
Contributor

@Wauplin Wauplin commented Oct 28, 2022

Fix #1051. Read entire discussion there to get more context.
Also deprecate HfApi.set_access_token which was ambiguous (fix #661).

Here is the new workflow:

1.a. In notebook_login or interpreter_login, a checkbox/prompt asks the user whether the token must be set as a git credential. The default value is True, as I expect most users don't care about it and it would avoid topics like "hey, my git clone doesn't work after login, what should I do?". Since the default value is explicitly displayed to the user, it doesn't seem like a problem to me.
notebook_login

>>> from huggingface_hub import login; login()

    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    To login, `huggingface_hub` now requires a token generated from https://huggingface.co/settings/tokens .
    
Token: 
Add token as git credential? (Y/n)

1.b. In login, an extra boolean argument add_to_git_credential allows the user to set the token in git credential in the non-blocking login process. Default value is False as we don't want to overwrite a value without the user knowing. A message is still printed to the user to warn about this possibility.

from huggingface_hub import login

login(token="hf_***", add_to_git_credential=True)

2.a. If add_to_git_credential=False (either programmatically or from user input), git credentials stay untouched.

2.b. If add_to_git_credential=True, we check the list of git credential helpers configured on the machine (cache, store, keychain,...)

2.b.i If no helper is configured, we display a RED warning "hey you should configure a git credential helper". Login is successful (no error raised) but git credential is not set. Special case: in a Google Colab, we set "store" as the default credential helper (if not previously set) and we do not print the warning.

Token is valid.
Cannot authenticate through git-credential as no helper is defined on your machine.
You might have to re-authenticate when pushing to the Hugging Face Hub.
Run the following command in your terminal in case you want to set the 'store' credential helper as default.

git config --global credential.helper store

Read https://git-scm.com/book/en/v2/Git-Tools-Credential-Storage for more details.
Token has not been saved to git credential helper.
Your token has been saved to /home/wauplin/.huggingface/token
Login successful

2.b.ii If at least 1 helper is configured, the credentials are set in all helpers using git credential approve. A message is printed to the user to confirm it.

Token is valid.
Your token has been saved in your configured git credential helpers (store).
Your token has been saved to /home/wauplin/.huggingface/token
Login successful
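
For illustration, step 2.b could be sketched roughly as follows. This is a minimal sketch and not the code merged in this PR: the function names are hypothetical, the `hf_user` username and `huggingface.co` host are assumptions, and only the standard `git config --list` and `git credential approve` commands are relied on.

```python
import subprocess
from typing import List


def parse_credential_helpers(config_output: str) -> List[str]:
    """Extract helper names from the text printed by `git config --list`."""
    helpers: List[str] = []
    for line in config_output.splitlines():
        key, sep, value = line.partition("=")
        if not sep or key.strip().lower() != "credential.helper":
            continue
        # Skip empty values and keep only the helper name,
        # dropping options such as `--timeout 3600`.
        parts = value.split()
        if parts and parts[0] not in helpers:
            helpers.append(parts[0])
    return helpers


def build_credential_input(
    token: str,
    host: str = "huggingface.co",  # assumed host
    username: str = "hf_user",  # assumed placeholder username
) -> str:
    """Format key=value lines for `git credential`, ended by a blank line."""
    return f"protocol=https\nhost={host}\nusername={username}\npassword={token}\n\n"


def approve_git_credential(token: str) -> List[str]:
    """Store the token through git if a helper exists; return the helpers found."""
    listing = subprocess.run(
        ["git", "config", "--list"], capture_output=True, text=True, check=True
    )
    helpers = parse_credential_helpers(listing.stdout)
    if helpers:
        subprocess.run(
            ["git", "credential", "approve"],
            input=build_credential_input(token),
            text=True,
            check=True,
        )
    return helpers
```

An empty return value corresponds to case 2.b.i (warn and leave git credentials unset); a non-empty one corresponds to case 2.b.ii.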

cc @julien-c @LysandreJik I think this workflow will be as easy as possible for users while giving them flexibility on whether to overwrite credentials or not. WDYT?

TODO:

  • adapt tests
  • adapt documentation

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Oct 28, 2022

The documentation is not available anymore as the PR was closed or merged.

@julien-c
Member

This sounds good to me! Now let's see if it passes the caudine forks of both @LysandreJik and @sgugger 🤞

@julien-c
Member

And also tagging @osanseviero for potential DX guidance

@Wauplin
Contributor Author

Wauplin commented Oct 28, 2022

This sounds good to me! Now let's see if it passes the caudine forks of both LysandreJik and sgugger 🤞

Youhou :D

I also forgot to write about the logout method. The previous implementation deleted the token from the git credential store. I kept that behavior and extended it to delete the access token from every configured git credential helper. I assume a user who calls the logout function really wants to be logged out. I can optionally add a boolean arg remove_from_git_credential which defaults to False.
To be honest, I really don't know about the usage of this logout method anyway. It's quite new.
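
As a rough illustration of the logout side, removing a stored token can go through the standard `git credential reject` command, which asks every configured helper to erase the matching entry. The sketch below is an assumption about how this could look, not the merged implementation; the function names and the `huggingface.co` host are hypothetical.

```python
import subprocess


def build_reject_input(host: str = "huggingface.co") -> str:
    """Describe which credential to erase; a blank line ends the description."""
    return f"protocol=https\nhost={host}\n\n"


def reject_git_credential(host: str = "huggingface.co") -> None:
    """Ask git to erase the credential for `host` from the configured helpers."""
    subprocess.run(
        ["git", "credential", "reject"],
        input=build_reject_input(host),
        text=True,
        check=True,
    )
```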

@sgugger
Contributor

sgugger commented Oct 28, 2022

This is mostly @LysandreJik 's territory and I know too little to be of real help here. The most important thing is to keep Repository working as is in the Trainer (otherwise all the push_to_hub API will fail) but from the conversations I read, it looks like it's taken care of :-)

@Wauplin Wauplin marked this pull request as ready for review November 2, 2022 17:35
Member

@LysandreJik LysandreJik left a comment


Thanks, this looks good to me. I think the implementation is sound.



logger = logging.get_logger(__name__)


-def login(token: Optional[str] = None) -> None:
+def login(token: Optional[str] = None, add_to_git_credential: bool = False) -> None:
Member

I would have it default to True here as well :)

Member

👎 👎 👎

Contributor Author

I genuinely think this default value will not affect a lot of people. In the end, the value is only used when the token is hard-coded (e.g. login(token="hf_***")), which is a minor case. Anyone who logs in with the notebook widget or the terminal prompt will get asked whether git credentials must be set, regardless of the add_to_git_credential value.

For the record, when set to False a warning message tells the user "hey, be aware that git credentials are not updated". I am also fine with True so that we don't have users complaining that the login doesn't work. But once again, we are not talking about the generic use case here.
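
To make the distinction concrete, here is a minimal sketch of the decision described above. It is illustrative only: `parse_yes_no` and `resolve_git_credential_choice` are hypothetical names, and the prompt text is copied from the PR description.

```python
from typing import Callable, Optional


def parse_yes_no(answer: str, default: bool = True) -> bool:
    """Interpret a "(Y/n)" style answer; an empty answer selects the default."""
    cleaned = answer.strip().lower()
    if cleaned == "":
        return default
    if cleaned in ("y", "yes"):
        return True
    if cleaned in ("n", "no"):
        return False
    raise ValueError(f"Expected 'y' or 'n', got {answer!r}")


def resolve_git_credential_choice(
    token: Optional[str],
    add_to_git_credential: bool = False,
    ask: Callable[[str], str] = input,
) -> bool:
    """Decide whether the token should also be stored as a git credential."""
    if token is not None:
        # Hard-coded token: nothing is prompted, so the keyword argument decides.
        return add_to_git_credential
    # Interactive login: the user is asked regardless of the keyword argument.
    return parse_yes_no(ask("Add token as git credential? (Y/n) "))
```

Under these assumptions, the keyword default only matters on the first branch, which is the hard-coded token case discussed in this thread.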

Member

Thanks for the extra context! As a disclaimer, I don't feel very strongly so feel free to go with what you think is best.

Thanks for this great refactor!

Contributor Author

I'll set back the default value to False as @julien-c seems to feel more strongly about it (even though I'm not a fan of the ":-1: :-1: :-1:" feedback).

I made the update in #1152. It's not a related PR but I'm too lazy to open a PR for that 🙄.

Member

sorry about the confusing feedback! I think it's ok to keep it that way, no big deal at all

it was mostly a (bad) joke targeted at @LysandreJik, I apologize, it was more confusing than necessary. No strong opinion about the actual value here, I think True is fine.

Contributor Author

Ahah ok, let's keep it to False and users will be warned anyway.

src/huggingface_hub/utils/_git_credential.py (outdated, resolved)
src/huggingface_hub/utils/_subprocess.py (resolved)
@Wauplin
Contributor Author

Wauplin commented Nov 4, 2022

Thanks for the review @LysandreJik ! I have made the requested changes and added a test.
Will merge the PR when checks are green ✔️ :)

@Wauplin Wauplin merged commit e646412 into main Nov 4, 2022
@Wauplin Wauplin deleted the 1051-git-credential-refactoring branch November 4, 2022 10:14
Wauplin added a commit that referenced this pull request Nov 7, 2022
* Fix list_models bool parameter

* remove cardData deprecation

* quality

* expect deprecation is code

* switch back add_to_git_credential default value see #1138
@Wauplin Wauplin mentioned this pull request Mar 9, 2023
5 participants