
MultipleInvalid: extra keys not allowed @ data['datasets'][0]['subsets'][1]['is_reg'] #2647

Open
voplica-git opened this issue Jul 16, 2024 · 0 comments


voplica-git commented Jul 16, 2024

For some reason I cannot use the is_reg parameter for DreamBooth-style training.
I'm using the latest commit from the dev branch.
My dataset config is as follows:

[general]
shuffle_caption = false
caption_extension = ".txt"
keep_tokens = 1

# This is a DreamBooth-style dataset
[[datasets]]
resolution = [1024, 1280]
batch_size = 1
enable_bucket = true
bucket_no_upscale = true

  [[datasets.subsets]]
  image_dir = "/path/to/images/"
  conditioning_data_dir = "/path/to/masks/"
  num_repeats = 63

  [[datasets.subsets]]
  is_reg = true
  image_dir = "/path/to/reg_images/"
  conditioning_data_dir = "/path/to/reg_masks/"
  cache_info = true
  num_repeats = 1

When I hit "Start training" I get the following error:

                    WARNING  clip_skip will be unexpected /                    sdxl_train_util.py:352
                             SDXL学習ではclip_skipは動作しません                                     
2024-07-16 23:41:45 INFO     prepare tokenizers                                sdxl_train_util.py:138
2024-07-16 23:41:46 INFO     update token length: 75                           sdxl_train_util.py:163
                    INFO     Load dataset config from                               sdxl_train.py:133
                             /srv/shared/mirandakerrProject/SOURCE/KOHYA/combined_d                  
                             ataset_dreambooth.toml                                                  
                    WARNING  ignore following options because config file is found: sdxl_train.py:137
                             train_data_dir, in_json /                                               
                             設定ファイルが利用されるため以下のオプションは無視され                  
                             ます: train_data_dir, in_json                                           
                    ERROR    Invalid user config /                                 config_util.py:373
                             ユーザ設定の形式が正しくないようです                                    
Traceback (most recent call last):
  File "/srv/shared/AI/LoraTraining/kohya_ss/sd-scripts/sdxl_train.py", line 948, in <module>
    train(args)
  File "/srv/shared/AI/LoraTraining/kohya_ss/sd-scripts/sdxl_train.py", line 169, in train
    blueprint = blueprint_generator.generate(user_config, args, tokenizer=[tokenizer1, tokenizer2])
  File "/srv/shared/AI/LoraTraining/kohya_ss/sd-scripts/library/config_util.py", line 407, in generate
    sanitized_user_config = self.sanitizer.sanitize_user_config(user_config)
  File "/srv/shared/AI/LoraTraining/kohya_ss/sd-scripts/library/config_util.py", line 370, in sanitize_user_config
    return self.user_config_validator(user_config)
  File "/srv/shared/AI/LoraTraining/kohya_ss/venv/lib/python3.10/site-packages/voluptuous/schema_builder.py", line 272, in __call__
    return self._compiled([], data)
  File "/srv/shared/AI/LoraTraining/kohya_ss/venv/lib/python3.10/site-packages/voluptuous/schema_builder.py", line 595, in validate_dict
    return base_validate(path, iteritems(data), out)
  File "/srv/shared/AI/LoraTraining/kohya_ss/venv/lib/python3.10/site-packages/voluptuous/schema_builder.py", line 433, in validate_mapping
    raise er.MultipleInvalid(errors)
voluptuous.error.MultipleInvalid: extra keys not allowed @ data['datasets'][0]['subsets'][1]['is_reg']
E0716 23:41:50.868000 130445689476160 torch/distributed/elastic/multiprocessing/api.py:826] failed (exitcode: 1) local_rank: 0 (pid: 200042) of binary: /srv/shared/AI/LoraTraining/kohya_ss/venv/bin/python
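
A possible reading of this error, offered as an assumption rather than something confirmed in this thread: voluptuous reports "extra keys not allowed" when a key is absent from the schema it validates against, and the validator in config_util.py may be choosing a subset schema without is_reg because the subsets also set conditioning_data_dir. The sketch below does not call sd-scripts or voluptuous. It only walks a config dictionary shaped like the TOML above and reports subsets that combine the two kinds of keys. The key sets and the function name `find_conflicting_subsets` are hypothetical.

```python
# Hypothetical key sets for illustration only; the real schemas are defined
# in sd-scripts' library/config_util.py and may differ from this guess.
DREAMBOOTH_ONLY_KEYS = {"is_reg", "class_tokens"}
CONDITIONING_KEY = "conditioning_data_dir"


def find_conflicting_subsets(config):
    """Return (dataset_idx, subset_idx, key) for every DreamBooth-only key
    found in a subset that also sets conditioning_data_dir."""
    conflicts = []
    for d_idx, dataset in enumerate(config.get("datasets", [])):
        for s_idx, subset in enumerate(dataset.get("subsets", [])):
            if CONDITIONING_KEY not in subset:
                continue
            for key in sorted(DREAMBOOTH_ONLY_KEYS & subset.keys()):
                conflicts.append((d_idx, s_idx, key))
    return conflicts


# The dataset config from this issue, written as a plain dict.
config = {
    "general": {
        "shuffle_caption": False,
        "caption_extension": ".txt",
        "keep_tokens": 1,
    },
    "datasets": [
        {
            "resolution": [1024, 1280],
            "batch_size": 1,
            "enable_bucket": True,
            "bucket_no_upscale": True,
            "subsets": [
                {
                    "image_dir": "/path/to/images/",
                    "conditioning_data_dir": "/path/to/masks/",
                    "num_repeats": 63,
                },
                {
                    "is_reg": True,
                    "image_dir": "/path/to/reg_images/",
                    "conditioning_data_dir": "/path/to/reg_masks/",
                    "cache_info": True,
                    "num_repeats": 1,
                },
            ],
        }
    ],
}

conflicts = find_conflicting_subsets(config)
for d_idx, s_idx, key in conflicts:
    # Print the location in the same style as the voluptuous error path.
    print(f"data['datasets'][{d_idx}]['subsets'][{s_idx}]['{key}']")
```

For this config the sketch flags the same location that the traceback names, which is consistent with the assumption but does not prove it. Checking the schema definitions in config_util.py on the commit in use would settle it.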

However, if I remove the is_reg option and hit "Start training", I get a different error:

                    INFO     11395 train images with repeating.                    train_util.py:1678
                    INFO     0 reg images.                                         train_util.py:1681
                    WARNING  no regularization images /                            train_util.py:1686
                             正則化画像が見つかりませんでした                                        
Traceback (most recent call last):
  File "/srv/shared/AI/LoraTraining/kohya_ss/sd-scripts/sdxl_train.py", line 948, in <module>
    train(args)
  File "/srv/shared/AI/LoraTraining/kohya_ss/sd-scripts/sdxl_train.py", line 170, in train
    train_dataset_group = config_util.generate_dataset_group_by_blueprint(blueprint.dataset_group)
  File "/srv/shared/AI/LoraTraining/kohya_ss/sd-scripts/library/config_util.py", line 487, in generate_dataset_group_by_blueprint
    dataset = dataset_klass(subsets=subsets, **asdict(dataset_blueprint.params))
  File "/srv/shared/AI/LoraTraining/kohya_ss/sd-scripts/library/train_util.py", line 2038, in __init__
    len(missing_imgs) == 0
AssertionError: missing conditioning data for 5662 images / 制御用画像が見つかりませんでした: ['s2_000000001', 's2_000000002', 's2_000000003', 's2_000000004', 's2_000000005', 's2_000000006', 's2_000000007', 's2_000000008', 's2_000000009', 's2_000000010', 's2_000000011', 's2_000000012', 's2_000000013', 's2_000000014', 's2_000000015', 's2_000000016', 's2_000000017', 's2_000000018', 
...
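
To narrow down which files the assertion above is complaining about, a quick check outside the trainer can help. The sketch below assumes that an image and its conditioning file are paired by file name without the extension, which is a guess based on the extension-less names in the error message. The function name `missing_conditioning` and the extension list are hypothetical and not part of sd-scripts.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to match the actual dataset.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def image_stems(directory):
    """Collect file names without extension for all images in a directory."""
    return {
        p.stem
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    }


def missing_conditioning(image_dir, conditioning_dir):
    """Return sorted stems that exist in image_dir but not in conditioning_dir."""
    return sorted(image_stems(image_dir) - image_stems(conditioning_dir))


# Self-contained demo on temporary directories with made-up file names.
with tempfile.TemporaryDirectory() as root:
    images = Path(root, "images")
    masks = Path(root, "masks")
    images.mkdir()
    masks.mkdir()
    for name in ("s2_000000001.png", "s2_000000002.png"):
        (images / name).touch()
    (masks / "s2_000000001.png").touch()
    demo_missing = missing_conditioning(images, masks)

print(demo_missing)  # ['s2_000000002']
```

Running `missing_conditioning` on each image_dir and conditioning_data_dir pair from the config would show which subset accounts for the 5662 names in the assertion.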

I can't figure out why the is_reg parameter is not supported. Any help is really appreciated!
