Diffusion GUI

Diffusion GUI is a simple user interface that wraps the diffusers library, letting you run diffusion models without installing more complex systems such as AUTOMATIC1111. It supports basic image generation, image-to-image remixing, iterative image-to-image remixing (for animation output), and upscaling (if you have a large enough GPU). You can specify a prompt, a negative prompt, the guidance scale, the number of inference steps, and the noise strength for remixes.

The application saves all user inputs as metadata in the output PNG files. This allows you to load a previously generated or processed image and remix it with the same prompts without needing to store or remember them elsewhere, which dramatically simplifies output file and prompt management.

Multiple diffusion models from the HuggingFace repository are supported. Technically you can use any model you like; I've just selected some neat-looking ones after browsing around for a few minutes.

Installation

Firstly, you need to install CUDA 11.8 and cuDNN for CUDA 11.8. Head to Nvidia's website to get these packages.

Once that's done, install the Python dependencies with pip and run the app:

pip install -r requirements.txt
python diffusion_gui.py
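
If you want to confirm that PyTorch can actually see your GPU before launching the app, a quick check like this works (not part of the repo, just a sanity check):

import torch

# Should print True and your GPU's name if CUDA 11.8 and cuDNN are set up correctly.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))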

Functionality

Actions

  • Generate: Creates a brand new image or images (determined by the Output Image Count setting) from your model selection, prompts, and other parameters.
  • Remix: Remixes an image by taking the loaded input image and passing it to the selected model as an image2image input along with the prompts and other parameters.
  • Iterative Remix: Remixes the loaded input image, then loads the remixed image back in for more remixing, and repeats for a given number of iterations (see the sketch after this list). This is cool if you want to make an animation/video of the AI changing the image slightly with each iteration. It works with the Output Image Count input too, so you can have up to 4 diverging loops going at once.
  • Upscale: Upscales the loaded input image to a higher resolution. Supports the Stable Diffusion x4 model only.
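
For reference, the Iterative Remix loop is roughly equivalent to the following diffusers sketch (the model name, prompts, and parameter values here are illustrative assumptions, not the app's actual defaults):

import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Load an image2image pipeline; the model choice is an assumption for illustration.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a watercolor landscape"          # Prompt
negative_prompt = "blurry, low quality"    # Negative Prompt
image = Image.open("input.png").convert("RGB")

for i in range(8):  # Iterative Remix Count
    # Each pass feeds the previous output back in as the new input.
    image = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        image=image,
        strength=0.5,             # Noise Strength
        guidance_scale=7.5,       # Guidance Scale
        num_inference_steps=50,   # Inference Step Count
    ).images[0]
    image.save(f"frame_{i:03d}.png")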

Model Setup

  • Diffusion Model: Dropdown selection box of some popular diffusion models from the HuggingFace repository.
  • Scheduler: Dropdown selection box of some popular diffusion schedulers. This is a subset of what the diffusers library supports, but it covers what most people seem to use most of the time (see the sketch below for how a scheduler is swapped in).
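
Swapping schedulers in diffusers is a one-liner, and this is essentially what the dropdown maps to (a sketch reusing the pipe from the sketch above; the scheduler choice is illustrative):

from diffusers import EulerAncestralDiscreteScheduler

# Swap the scheduler while reusing the pipeline's existing scheduler config.
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)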

Prompts

  • Prompt: What you want.
  • Negative Prompt: What you don't want.

Diffusion Settings

  • Seed: The input seed; integer or hex format both work. The seed auto-increments by 1 for each output image in a batch so the outputs vary (see the sketch after this list).
  • Lock Seed: If checked, locks the seed for every output image in a batch. Implemented for future prompt mixing with the same seed, to get variation based on prompt rather than seed.
  • Image Size: Defines the size of the image for new generations. Remixes use the size of the loaded input image.
  • Guidance Scale: How closely you want the model to stick to your prompt. If the model is generating output you like, lower this value when remixing so the result stays closer to the input image.
  • Inference Step Count: The number of inference steps to perform during Generation or Remixing.
  • Output Image Count: Sets the number of output images for all actions except upscaling. Limited to 1-4 to prevent processing times from shooting off to the moon.
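
A sketch of how these settings plausibly map onto a diffusers text-to-image call, including the per-image seed increment (the model name and values are assumptions for illustration):

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

seed = 0x2A  # Seed: hex or integer both parse to the same int
for i in range(4):  # Output Image Count (1-4)
    # Auto-increment the seed per output image; with Lock Seed, use `seed` as-is.
    generator = torch.Generator("cuda").manual_seed(seed + i)
    image = pipe(
        prompt="a watercolor landscape",
        negative_prompt="blurry, low quality",
        guidance_scale=7.5,       # Guidance Scale
        num_inference_steps=50,   # Inference Step Count
        height=512, width=512,    # Image Size
        generator=generator,
    ).images[0]
    image.save(f"gen_{i}.png")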

Remix Settings

The following settings only apply to remixes and not to new image generation.

  • Noise Strength: Sets the strength of the noise applied to the input image when Remixing. The more noise applied, the more the output will diverge from the input.
  • Iterative Remix Count: The number of iterations that you want to loop for when performing the Iterative Remixing action.

Image Saving

Images are saved in the /output/ directory. Each image is assigned a unique auto-incrementing integer ID based on the images already contained within the output folder at the time the image is generated.

The image PNG metadata contains parameters about how the image was generated, including the prompts and other configuration settings. This can then be loaded to reproduce/tweak an image at a later time.
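
Writing those parameters into PNG metadata is straightforward with Pillow; a minimal sketch of the idea (the key names here are assumptions, not necessarily the keys Diffusion GUI uses):

from PIL import Image
from PIL.PngImagePlugin import PngInfo

image = Image.open("frame_000.png")  # any generated image
meta = PngInfo()
meta.add_text("prompt", "a watercolor landscape")
meta.add_text("negative_prompt", "blurry, low quality")
meta.add_text("guidance_scale", "7.5")
meta.add_text("num_inference_steps", "50")
image.save("output/1.png", pnginfo=meta)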

Image Loading

Previously generated images, or any other input image, can be loaded by the app. If you run the Remix action the loaded image will be used as input to the image2image diffusion pipeline.

Loading an image that was previously generated by Diffusion GUI will automatically load the Diffusion Model, Prompt, Negative Prompt, Guidance Scale, and Inference Step Count that were used to Generate/Remix it. These values are read from the PNG metadata embedded in Diffusion GUI's output images.
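
Reading the values back is the mirror image; Pillow exposes a PNG's text chunks as a plain dict (again, key names are illustrative):

from PIL import Image

img = Image.open("output/1.png")
# PNG tEXt chunks come back as a dict of strings on PNG images.
print(img.text.get("prompt"))
print(img.text.get("guidance_scale"))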

Future Work

The following are ideas I've had on how to improve the app and will be working on:

  • Showing the output images in the GUI before saving them to disk. Most of the time the output isn't good, so it isn't worth saving right away.
  • Add support for Control Nets.
  • Add Instruct Pix2Pix support for easier modification of images without the added randomness of re-diffusing them completely.
  • Support prompt mixing with a fixed seed. This will allow easier generation of alternates based on a known good output and allow easier exploration of how prompts affect the output.

Contribution & Collaboration

Reach out! I would love some help and ideas on how to improve the app.
