Create a song of any length with Riffusion by outpainting until the desired length has been reached!

gitmylo/sd-webui-riffusion-outpaint

sd-webui-riffusion-outpaint

Table of contents

  1. What does it do?
  2. Requirements
  3. Known bugs
  4. Recommended settings
  5. Script prompts
  6. How does it work?

What does it do?

Requirements

  • The Riffusion model must be loaded. (Using a different model will yield unexpected results.)
  • You can also use a 2 GB or 4 GB Riffusion model, made by pruning the model or changing its precision. This can easily be done with this extension. It greatly lowers RAM and VRAM usage while still giving good results.

Known bugs

  • Does not work with VladMandic's fork, because its img2img method was changed and is missing the seed_enable_extras parameter, which breaks the plugin. I opened a pull request to fix this (since merged). For now, if you want to use VladMandic's fork together with this extension, use this fork.
  • Shows errors in the console when running on the fixed VladMandic fork. These can be ignored; they are also caused by changes made in the VladMandic fork.

Recommended settings

  • Main settings:
    • Height: 512, Width: 512 (This is the optimal size for Riffusion)
    • Sampling steps: >16 recommended for Euler a (This is roughly the minimum at which it generates good results)
  • Script settings:
    • Img2Img masked content: Fill (The image that is being masked is a repeat of the previously generated section)
      • From best to worst (in my testing; results may vary, mostly depending on expand amount):
        • Fill: fill the extra generated space with colors from the previously generated parts (generally the best)
        • Latent nothing: fill the extra generated space with nothing (can cause inconsistent transitions, slightly better than Latent noise)
        • Latent noise: fill the extra generated space with random noise (can cause inconsistent transitions)
        • Original: fill the extra generated space with what was there originally (the repeated image; can cause double sounds on transitions)
    • Length: 2 (this will generate a 10-second clip (as 5 * 2 == 10) when using 512x512 resolution)
    • Denoising strength: 1 (This will use the full denoising strength)
    • Expand amount: 64 (This smooths out transitions somewhat; at 0 you are more likely to hear the transition. It works by spreading the mask into the previously generated parts of the image, so the mask also extends further to the left.)
    • Fast mode (faster generation):
      • Expand amount: 1 (1 means it will inpaint at original width * (keep amount + 1). Note: higher values use more VRAM and are not always better.)
      • Keep amount (memory): 1 (1 means it will keep the same width as the starter image for outpainting. Recommended to be the same as Expand amount)
    • Precision mode (smaller chunks):
      • Expand amount: 0.5 (0.5 means it will inpaint at original width * (keep amount + 0.5). Note: higher values use more VRAM and are not always better.)
      • Keep amount (memory): 0.5 (0.5 means it will keep the last half of the starting image for outpainting. Recommended to be about half of Expand amount)
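
The arithmetic behind the Length, Expand amount, and Keep amount settings above can be sketched as follows. This is only an illustration: the function names are hypothetical and not taken from the extension, and the 5 seconds per 512-pixel-wide image is the figure used in the Length example.

```python
# Hypothetical helpers illustrating the settings arithmetic.
# None of these names come from the extension itself.

SECONDS_PER_IMAGE = 5  # assumption from the Length example: one 512x512 image is about 5 seconds


def clip_seconds(length, seconds_per_image=SECONDS_PER_IMAGE):
    """Approximate clip duration for a given Length setting."""
    return length * seconds_per_image


def inpaint_width(original_width, keep_amount, expand_amount):
    """Width of the image being inpainted: original width * (keep amount + expand amount)."""
    return int(original_width * (keep_amount + expand_amount))


print(clip_seconds(2))               # Length 2 -> 10 seconds
print(inpaint_width(512, 1, 1))      # Fast mode -> 1024 pixels wide
print(inpaint_width(512, 0.5, 0.5))  # Precision mode -> 512 pixels wide
```

Under these assumptions, fast mode inpaints an image twice as wide as the starter image, which is why it needs more VRAM than precision mode.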

Script prompts

Script prompts ({(eval)} and {{exec}}) are explained in the info section of the extension UI.

How does it work?

  1. Generate the initial image.
  2. Expand the image.
  3. Create an outpainting mask on the expanded area (and spread it).
  4. Generate the masked area. This extends the song and also replaces some of the old area to improve the transition.
  5. Cut off the newly generated part, and the updated older part.
  6. Repeat from step 2 until the specified length (count) has been reached.
  7. Combine the generated chunks into one big image.
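
The steps above can be sketched roughly in Python. This is a simplified illustration, not the extension's actual code: the image is modelled as a plain list of columns, `inpaint` is a hypothetical stand-in for the img2img call, all function and parameter names are invented for the example, and the number of iterations (`count - 1`) is an assumption.

```python
# Simplified sketch of the outpainting loop. All names are hypothetical.
# An "image" is a list of columns; each column is just an int here.


def build_mask(kept, new, spread):
    """True marks columns to regenerate: every new column, plus
    `spread` columns of the kept area to smooth the transition."""
    spread = min(spread, kept)
    return [False] * (kept - spread) + [True] * (spread + new)


def outpaint(first, count, keep, expand, spread, inpaint):
    song = list(first)                                   # step 1: initial image
    for _ in range(count - 1):                           # step 6: repeat (assumed count - 1 times)
        context = song[-keep:]                           # part kept as memory
        # step 2: expand, filling the new area with a repeat of the old one
        canvas = context + [context[i % keep] for i in range(expand)]
        mask = build_mask(keep, expand, spread)          # step 3: mask and spread
        result = inpaint(canvas, mask)                   # step 4: generate masked area
        s = min(spread, keep)
        # step 5 and 7: take the updated older part plus the new part
        # and splice them onto the end of the song
        song[len(song) - s:] = result[keep - s:]
    return song


def fake_inpaint(canvas, mask):
    """Dummy generator: writes 0 into every masked column."""
    return [0 if m else c for c, m in zip(canvas, mask)]


song = outpaint([1, 2, 3, 4], count=3, keep=4, expand=4, spread=1,
                inpaint=fake_inpaint)
print(song)  # [1, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

In the example, the last column of the first chunk is overwritten along with the new columns, which mirrors how the spread mask replaces some of the old area at each transition.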
