Training and Inference with DreamArtist++

DreamArtist performs one-shot prompt tuning, training from only a single image. DreamArtist++ adds LoRA modules to improve training stability and the controllability of the model. With a single image, it can train a pair of LoRA models that generalize well and remain highly controllable.

Note:

Although both the positive and the negative word are trained, using only the positive word in the generation phase yields better results. The negative branch keeps its LoRA but drops its word:

positive: positive word + positive lora
negative: negative lora

DreamArtist++

In DreamArtist++, the positive and negative branches are trained jointly, each with its own LoRA and word embedding. The configuration file must therefore define hyperparameters for both the positive and the negative part.

lora_unet:
  - lr: 1e-4
    rank: 3
    branch: p # positive branch
    layers:
      - 're:.*\.attn.?$'
      #- 're:.*\.ff\.net\.0$' # Fits the training image more closely, but may reduce generalization and controllability
  - lr: 2e-5 # Low negative unet lr prevents image collapse
    rank: 3
    branch: n # negative branch
    layers:
      - 're:.*\.attn.?$'
      #- 're:.*\.ff\.net\.0$'

lora_text_encoder:
  - lr: 1e-5
    rank: 1
    branch: p
    layers:
      - 're:.*self_attn$'
      - 're:.*mlp$'
  - lr: 1e-5
    rank: 1
    branch: n
    layers:
      - 're:.*self_attn$'
      - 're:.*mlp$'
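
The `layers` entries above read as regular expressions over module names. The following is a minimal sketch of how such patterns could pick out layers, assuming that the `re:` prefix marks a regex and that it is matched against the full dotted module name. The helper `select_layers` and the module names are hypothetical, and this is not hcpdiff's actual implementation.

```python
import re

def select_layers(patterns, module_names):
    """Return the module names matched by any of the given patterns.

    Assumption: entries prefixed with 're:' are regexes, all other
    entries are treated as exact module names.
    """
    selected = []
    for name in module_names:
        for pattern in patterns:
            if pattern.startswith("re:"):
                if re.match(pattern[3:], name):
                    selected.append(name)
                    break
            elif pattern == name:
                selected.append(name)
                break
    return selected

# Hypothetical module names, loosely modelled on a diffusers-style UNet.
names = [
    "down_blocks.0.attentions.0.transformer_blocks.0.attn1",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q",
    "down_blocks.0.attentions.0.transformer_blocks.0.ff.net.0",
]

# The default pattern keeps whole attention blocks and skips their sub-layers.
unet_layers = select_layers([r"re:.*\.attn.?$"], names)
for name in unet_layers:
    print(name)
```

Under these assumptions, the trailing `.?$` lets the pattern match names ending in `attn`, `attn1`, or `attn2`, while a sub-layer such as `attn1.to_q` is not selected on its own. Enabling the commented-out `ff` pattern would additionally select the `ff.net.0` module.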

The two LoRA branches share the same base model but require different trigger words. The trigger words for each branch are defined in the data file (which needs to be created in advance).

data:
  text_transforms:
    transforms:
      - _target_: hcpdiff.utils.caption_tools.TemplateFill
        word_names:
          pt1: [my-cat, my-cat-neg] # Trigger words must be given as a positive and negative pair
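
To make the pairing concrete, here is a rough sketch of how a positive and negative entry could be expanded into one caption per branch. It assumes a caption template containing a `{pt1}` placeholder. The function `fill_template` and the template string are hypothetical; they stand in for `TemplateFill` rather than reproduce it.

```python
def fill_template(template, word_names):
    """Expand a template into a (positive, negative) pair of captions.

    word_names maps each placeholder name to a [positive, negative]
    pair of words, mirroring the `word_names` entry in the config.
    """
    positive, negative = template, template
    for placeholder, (pos_word, neg_word) in word_names.items():
        key = "{" + placeholder + "}"
        positive = positive.replace(key, pos_word)
        negative = negative.replace(key, neg_word)
    return positive, negative

# Hypothetical template; the real templates live in the training setup.
pos_caption, neg_caption = fill_template(
    "a photo of {pt1}", {"pt1": ["my-cat", "my-cat-neg"]}
)
print(pos_caption)  # a photo of my-cat
print(neg_caption)  # a photo of my-cat-neg
```

With an empty pair such as `['', '']`, the placeholder simply disappears from both captions in this sketch.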

When no trigger words are present, the model should behave as closely as possible to the original model, in order to reduce style pollution. For this reason, the model can also be trained on some images that it generated itself.

data_class:
  caption_file: dataset/image_captions.json # Training text-image pair
  text_transforms:
    transforms:
      - _target_: hcpdiff.utils.caption_tools.TemplateFill
        word_names:
          pt1: ['', ''] # Due to the use of DreamArtist++, all fill words need to be positive and negative pairs.

The prompts used to generate these images can be randomly selected from a prompt database.

A prompt dataset can be selected from here: prompt dataset
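
Random selection from a prompt database can be sketched as follows, assuming the database is a plain text file with one prompt per line. The file layout and the helper `sample_prompts` are assumptions made for illustration and are not part of hcpdiff.

```python
import random

def sample_prompts(lines, count, seed=None):
    """Pick `count` prompts at random, with replacement, from non-empty lines."""
    prompts = [line.strip() for line in lines if line.strip()]
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return [rng.choice(prompts) for _ in range(count)]

# In practice `lines` would come from open(path, encoding="utf-8").
database = ["a cat on a sofa\n", "\n", "a city at night\n", "a bowl of fruit\n"]
picked = sample_prompts(database, count=5, seed=0)
for prompt in picked:
    print(prompt)
```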