
Add "use huggingface" option like a WANDB #2283

Closed
forestsource opened this issue Apr 14, 2024 · 6 comments

Comments

@forestsource
Contributor

We need to address the issue that was discussed in kohya-ss/sd-scripts.

When logging to wandb is enabled, the entire command line is exposed. Therefore, it is recommended to write wandb API key and HuggingFace token in the configuration file (.toml). Thanks to bghira for raising the issue.
A warning is displayed at the start of training if such information is included in the command line.
Also, if there is an absolute path, the path may be exposed, so it is recommended to specify a relative path or write it in the configuration file. In such cases, an INFO log is displayed.
See kohya-ss/sd-scripts#1123 and PR kohya-ss/sd-scripts#1240 for details.
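
To illustrate the kind of check the note above describes, here is a minimal sketch. It is not the sd-scripts implementation: the function name `audit_command_line` is hypothetical, and the list of sensitive flags is an assumption based on the token and API key mentioned in this thread.

```python
import os

# Flags treated as secrets in this sketch. The list is an assumption drawn
# from this thread, not the authoritative set checked by sd-scripts.
SENSITIVE_FLAGS = ("--huggingface_token", "--wandb_api_key")


def audit_command_line(argv):
    """Return (warnings, infos) describing what a logged command line would leak."""
    warnings, infos = [], []
    for arg in argv:
        # Split "--flag=value" into its parts; a plain "--flag" leaves the value empty.
        flag, _, inline_value = arg.partition("=")
        if flag in SENSITIVE_FLAGS:
            warnings.append(f"{flag} is on the command line; move it to the .toml config file")
            continue
        # For "--flag value" the value arrives as its own argv entry.
        value = inline_value if arg.startswith("--") else arg
        if value and os.path.isabs(value):
            infos.append(f"absolute path on the command line: {value}")
    return warnings, infos
```

Calling `audit_command_line(sys.argv[1:])` before training starts would give one list to log at WARNING level and one to log at INFO level.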

We need seven options.

  • use Huggingface (bool)
  • huggingface_repo_id (str)
  • huggingface_path_in_repo (str)
  • huggingface_repo_type ('model' or 'dataset')
  • huggingface_repo_visibility ('public' or 'private')
  • huggingface_token (str)
  • async_upload (bool)
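
As a rough sketch, the seven options could be grouped into one typed settings object. The class name, the `use_huggingface` field name, the defaults, and the rule that a repo id is required when the toggle is on are all assumptions for illustration, not part of the GUI or sd-scripts.

```python
from dataclasses import dataclass


@dataclass
class HuggingFaceOptions:
    # Field names mirror the list above; "use_huggingface" and every default
    # value are hypothetical.
    use_huggingface: bool = False
    huggingface_repo_id: str = ""
    huggingface_path_in_repo: str = ""
    huggingface_repo_type: str = "model"
    huggingface_repo_visibility: str = "private"
    huggingface_token: str = ""
    async_upload: bool = False

    def __post_init__(self):
        # Reject values outside the choices given in the list.
        if self.huggingface_repo_type not in ("model", "dataset"):
            raise ValueError("huggingface_repo_type must be 'model' or 'dataset'")
        if self.huggingface_repo_visibility not in ("public", "private"):
            raise ValueError("huggingface_repo_visibility must be 'public' or 'private'")
        # Assumed rule: an upload target is needed once the feature is enabled.
        if self.use_huggingface and not self.huggingface_repo_id:
            raise ValueError("huggingface_repo_id is required when use_huggingface is set")
```
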
@bmaltais
Owner

Where do you need those options? In the GUI? How are they linked to the wandb issue? I could possibly add an option to provide a toml config file to sd-scripts, so you could specify the options there.

@bmaltais
Owner

From what I can tell, the Hugging Face related options are the following:

  --huggingface_repo_id HUGGINGFACE_REPO_ID
                        huggingface repo name to upload
  --huggingface_repo_type HUGGINGFACE_REPO_TYPE
                        huggingface repo type to upload
  --huggingface_path_in_repo HUGGINGFACE_PATH_IN_REPO
                        huggingface model path to upload files
  --huggingface_token HUGGINGFACE_TOKEN
                        huggingface token
  --huggingface_repo_visibility HUGGINGFACE_REPO_VISIBILITY
                        huggingface repository visibility ('public' for public, 'private' or None for private)
  --save_state_to_huggingface
                        save state to huggingface
  --resume_from_huggingface
                        resume from huggingface (ex: --resume {repo_id}/{path_in_repo}:{revision}:{repo_type})
  --async_upload        upload to huggingface asynchronously

I could add support for those in the GUI...

@forestsource
Contributor Author

How are they linked to the wandb issue?

Currently, the only way to use these options is to enter them under "Additional parameters", which passes them as command-line arguments.
Because of the wandb issue described above, the Hugging Face token is then exposed in the wandb log.

Where do you need those options? In the GUI?

Adding these fields to the GUI risks cluttering it.
It would be great if advanced settings could be loaded from an additional toml file, similar to the "Dataset config file".

@bmaltais
Owner

I see, so adding the GUI elements would require passing them to sd-scripts on the command line, which would make them visible, so that is not the best approach. Perhaps creating a toml file from them and passing it to sd-scripts as a config file parameter would be best. I will investigate this avenue.

@bmaltais
Owner

OK, great news. I have added a HuggingFace section to all the training tabs. I also switched most of the sd-scripts parameters to the sd-scripts configuration toml file. This ensures those parameters are no longer exposed via the CLI.

You can test it in the dev branch.
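
For anyone curious about the general idea, here is a minimal sketch of moving options off the command line and into a toml file. It is not the code in the dev branch: `to_toml` and `build_command` are hypothetical helpers, the serializer only handles flat string and boolean values, and `--config_file` is assumed to be the flag the training script reads the toml path from.

```python
import json


def to_toml(options):
    """Serialize a flat dict of str/bool values to TOML (no tables, arrays, or numbers)."""
    lines = []
    for key, value in options.items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        elif isinstance(value, str):
            # JSON string escaping also yields a valid TOML basic string for ordinary text.
            rendered = json.dumps(value, ensure_ascii=False)
        else:
            raise TypeError(f"unsupported type for {key!r}: {type(value).__name__}")
        lines.append(f"{key} = {rendered}")
    return "\n".join(lines) + "\n"


def build_command(script, config_path, options):
    """Write the options to config_path and return a command that names only the file."""
    with open(config_path, "w", encoding="utf-8") as handle:
        handle.write(to_toml(options))
    # Assumption: the training script accepts the toml path via --config_file.
    return ["python", script, "--config_file", config_path]
```

With this shape, a logged command line shows only the script name and the config path, so the token value never appears in it. A relative `config_path` also avoids exposing an absolute path.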

@forestsource
Contributor Author

Thank you for the wonderful fix.
I will test it as soon as I finish my tasks.
