Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement additional CLI repo functionality #1671

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

martinbrose
Copy link
Contributor

References #1656

Key features

  • Adds huggingface-cli repo list: List repos on huggingface.co, with additional filter for author and string match
  • Adds huggingface-cli repo delete: Delete a repo on huggingface.co
  • Adds huggingface-cli repo toggle: Toggle a repo on huggingface.co private or public
  • Moves huggingface-cli repo create from user.py to the repo focussed repo.py within the src/huggingface_hub/commands folder

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@martinbrose
Copy link
Contributor Author

martinbrose commented Sep 16, 2023

Thinking of this... maybe huggingface-cli repo toggle should be huggingface-cli repo edit [<repository>] --visibility public / private like in GitHub gh repo edit?

@codecov
Copy link

codecov bot commented Sep 19, 2023

Codecov Report

Attention: 111 lines in your changes are missing coverage. Please review.

Comparison is base (89cc691) 81.52% compared to head (e1d8311) 81.21%.

❗ Current head e1d8311 differs from pull request most recent head f951094. Consider uploading reports for the commit f951094 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1671      +/-   ##
==========================================
- Coverage   81.52%   81.21%   -0.32%     
==========================================
  Files          62       63       +1     
  Lines        7198     7228      +30     
==========================================
+ Hits         5868     5870       +2     
- Misses       1330     1358      +28     
Files Coverage Δ
src/huggingface_hub/commands/user.py 0.00% <ø> (ø)
src/huggingface_hub/commands/huggingface_cli.py 0.00% <0.00%> (ø)
src/huggingface_hub/commands/repo.py 31.01% <31.01%> (ø)

... and 14 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@Wauplin Wauplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @martinbrose thanks for taking this issue! I've made a high-level review focused on how we should build and think the CLI in my opinion.

  • first, that was a good idea to take create_repo from the UsersCommand and create a new dedicated one for create/list/delete
  • now that we reboot the CLI, I think it's time to get rid of most of the legacy code and structure. In particular:
    • no mention to local git whatsoever
    • trying to clean the output so that it can be piped to another command
    • parse the args and statically type them in __init__, when possible. No need to do it when it's not possible.
    • keep the logic as lightweight as possible. Ideally most of the logic must be in the methods we call (create_repo, delete_repo, list_models,...). The only remaining logic to implement is how to parse and forward the values to those methods.
  • I also made some suggestion on the argument names (repo_id, --repo-type, --private, --token)... Let's try to be as consistent as possible with the other CLIs we are building (especially upload/download).

In general, please let me know if you have any questions or want to challenge the comments below. If you think some things can be improved in my suggestion, please let me know! Rebooting this CLI is a great opportunity to (re-)design it and do what we feel is best for the users, knowing the current state of huggingface_hub. The previous iteration dates back ~2y ago so the expectation/usage were quite different! 🤗

@@ -36,6 +37,7 @@ def main():
LfsCommands.register_subcommand(commands_parser)
ScanCacheCommand.register_subcommand(commands_parser)
DeleteCacheCommand.register_subcommand(commands_parser)
RepoCommands.register_subcommand(commands_parser)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move this line up between UserCommand and UploadCommand? Looks more important than the lfs and cache commands IMO

default="model",
help='Optional: type: set to "dataset" or "space" if creating a dataset or space, default is model.',
)
repo_create_parser.add_argument("--organization", type=str, help="Optional: organization namespace.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Organization is a bit of a legacy argument. Let's remove it as it is now possible to directly set a repo_id.

Comment on lines +59 to +63
repo_create_parser.add_argument(
"name",
type=str,
help="Name for your repo. Will be namespaced under your username to build the repo id.",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would renamed it repo_id for consistency with the create_repo method.

Comment on lines +64 to +69
repo_create_parser.add_argument(
"--type",
choices=["model", "dataset", "space"],
default="model",
help='Optional: type: set to "dataset" or "space" if creating a dataset or space, default is model.',
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rename it --repo-type for consistency as well (same for other parsers)

repo_create_parser.add_argument(
"--space_sdk",
type=str,
help='Optional: Hugging Face Spaces SDK type. Required when --type is set to "space".',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
help='Optional: Hugging Face Spaces SDK type. Required when --type is set to "space".',
help='Choice of SDK to use if --repo-type="space"',

No need to say it's optional (-- args are basically all optional). Same for the other descriptions :)

Comment on lines +214 to +218
print("\nYour repo now lives at:")
print(f" {ANSI.bold(url)}")
print("\nYou can clone it locally with the command below, and commit/push as usual.")
print(f"\n git clone {url}")
print("")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
print("\nYour repo now lives at:")
print(f" {ANSI.bold(url)}")
print("\nYou can clone it locally with the command below, and commit/push as usual.")
print(f"\n git clone {url}")
print("")
print("\nYour repo now lives at:")
print(f" {ANSI.bold(url)}")
print("\nYou can clone it locally with the command below, and commit/push as usual.")
print(f"\n git clone {url}")
print("")

That's very legacy as well. Who wants to clone models nowadays? 😄

Comment on lines +223 to +225
self.type = self.args.type
self.author = self.args.author
self.search = self.args.search
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be moved to __init__ and typed.

Comment on lines +237 to +240
print("\nYour repos:")
for repo in repos:
print(f" {ANSI.bold(repo.id)}")
print("")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
print("\nYour repos:")
for repo in repos:
print(f" {ANSI.bold(repo.id)}")
print("")
for repo in repos:
print(repo.id)

I would go straight to the point => only print the ids. The advantage of not being verbose is that the output can be piped to another command directly.

For example:

huggingface-cli repo list --author=Wauplin | wc -l

=> counts the number of models on the Hub for author "Wauplin".

This is much harder to do when there's more colors/formatting.

Comment on lines +276 to +280
class RepoToggleCommand(BaseRepoCommand):
def run(self):
self.private = self.args.private
user = self._api.whoami(self.token)["name"]
namespace = self.args.organization if self.args.organization is not None else user
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking of this... maybe huggingface-cli repo toggle should be huggingface-cli repo edit [] --visibility public / private like in GitHub gh repo edit?

That's a good question. I would personally be in favor of

huggingface-cli repo set-public Wauplin/my-cool-model
huggingface-cli repo set-private Wauplin/my-cool-dataset --repo-type="dataset"

This way set-public and set-private are explicit commands. Both commands can share the same logic I think. Looking at gh CLI is a good idea to take inspiration from. In this special case I still think we'll have less possible settings to play so being explicit is fine.

user = self._api.whoami(self.token)["name"]
namespace = self.args.organization if self.args.organization is not None else user

repo_id = f"{namespace}/{self.args.name}"
Copy link
Contributor

@Wauplin Wauplin Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing a whoami to get the namespace (if not provided) by the user is really useful here as it cannot be implicitly used (in opposition to create_repo/delete_repo)! So good to do it this way :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants