v0.6543210 - Diffusers!

@soten355 soten355 released this 06 Aug 05:06
· 20 commits to master since this release
767f8a3

v0.6543210 - The Diffusers Update!

After a couple of months of debugging, and with the release of PyTorch 2.0, I'm excited to offer Diffusers as a render engine for MetalDiffusion! This includes a whole host of new features including more samplers, LoRAs, Token Merging, and so much more.

There's also a major overhaul of the Gradio GUI, setting the groundwork for future changes to the GUI options.

Animation has taken over for video and makes a huge debut with an improved camera movement system.

Additionally, there are some quality of life improvements, including a restructuring of the folders, import via PNG, and some memory efficiency.

Finally, I share my thoughts on the future of TensorFlow for MetalDiffusion.

Diffusers

This is the biggest update to MetalDiffusion. Diffusers is a Python module developed by Hugging Face that uses "pipelines" to create images with diffusion (Stable Diffusion in particular). Until May, Diffusers relied solely on PyTorch 1.x, which wasn't very stable on Intel Macs. However, PyTorch 2.0 has been incredibly stable during my tests, and I can safely say it's faster than TensorFlow for image generation on an Intel Mac.

Incorporating Diffusers was rather easy, thanks to their excellent documentation. As a user, you'll notice almost no difference aside from some new features. To switch between "render engines", simply go to Advanced Settings and select the render engine you want. Gradio will automatically hide or show the options specific to that render engine, including weights that only work with one engine or the other.

LoRAs

LoRAs, an incredibly popular tool, are now available in MetalDiffusion, exclusively in the Diffusers render engine. Place them in the models/LoRA folder.

Make sure, when using them, to select DPM Solver for best results.
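For readers curious how files in that folder might reach a Diffusers pipeline, here is a minimal sketch. `load_lora_weights` is a real Diffusers pipeline method, but the helper names `find_loras` and `load_lora`, and the assumption that LoRAs are stored as .safetensors files, are illustrative only and not MetalDiffusion's actual code.

```python
from pathlib import Path


def find_loras(folder="models/LoRA"):
    """List LoRA weight files in the folder, sorted by name."""
    root = Path(folder)
    if not root.is_dir():
        return []  # missing folder: nothing to offer in the GUI
    # Assumption: LoRAs are distributed as .safetensors files.
    return sorted(p for p in root.iterdir() if p.suffix == ".safetensors")


def load_lora(pipe, lora_path):
    """Attach one LoRA file to an already-created Diffusers pipeline."""
    lora_path = Path(lora_path)
    # load_lora_weights accepts a directory plus the file name inside it.
    pipe.load_lora_weights(str(lora_path.parent), weight_name=lora_path.name)
    return pipe
```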

Samplers

Diffusers allows MetalDiffusion to finally use the most common and popular samplers, especially ones like DPM Solver, Euler, and DDIM. These samplers are for the Diffusers render engine only. Selecting them for the TensorFlow engine will do nothing.
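As a rough illustration of how a sampler name in the GUI could map to a Diffusers scheduler, here is a minimal sketch. The scheduler class names (`DPMSolverMultistepScheduler`, `EulerDiscreteScheduler`, `DDIMScheduler`) are real Diffusers classes, but the `SAMPLERS` table, the function names, and the engine labels are assumptions made for this example, not MetalDiffusion's actual code.

```python
# Hypothetical mapping from GUI sampler labels to Diffusers scheduler class names.
SAMPLERS = {
    "DPM Solver": "DPMSolverMultistepScheduler",
    "Euler": "EulerDiscreteScheduler",
    "DDIM": "DDIMScheduler",
}


def scheduler_class_name(sampler, engine):
    """Return the scheduler class name, or None when the sampler does not apply."""
    if engine != "Diffusers":
        # These samplers have no effect on the TensorFlow engine.
        return None
    return SAMPLERS.get(sampler)


def apply_sampler(pipe, sampler, engine="Diffusers"):
    """Swap the pipeline's scheduler, reusing its existing configuration."""
    name = scheduler_class_name(sampler, engine)
    if name is None:
        return pipe  # unknown sampler or other engine: leave the pipeline as-is
    import diffusers  # imported lazily so the lookup above works without it

    scheduler_cls = getattr(diffusers, name)
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config)
    return pipe
```

Keeping the lookup separate from the pipeline change makes the "does nothing for TensorFlow" behaviour easy to check without loading any weights.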

Token Merging

By default, Token Merging is activated at 50%. This gives a huge speed increase, but you can also deactivate it or change the token merging strength in the Advanced Settings. Setting Token Merging to 0% deactivates it.
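The percentage setting can be thought of as a merge ratio between 0 and 1. The sketch below shows one way that conversion might look. It assumes the third-party `tomesd` package (whose `apply_patch(model, ratio=...)` call is real) as the Token Merging implementation; these notes do not say which implementation MetalDiffusion uses, and the function names are hypothetical.

```python
def token_merge_ratio(percent):
    """Convert a GUI percentage (0-100) to a merge ratio, or None when disabled."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent == 0:
        return None  # 0% deactivates Token Merging
    return percent / 100.0


def maybe_apply_token_merging(pipe, percent=50.0):
    """Patch the pipeline with Token Merging unless the setting is 0%."""
    ratio = token_merge_ratio(percent)
    if ratio is None:
        return pipe
    import tomesd  # assumption: ToMe for Stable Diffusion, installed separately

    tomesd.apply_patch(pipe, ratio=ratio)
    return pipe
```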

CLIP Skip

Some weights work better with skipping certain layers of the CLIP Text Model. You can now select how many layers to skip in the Advanced Settings.
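Conceptually, CLIP Skip means conditioning on an earlier layer's output instead of the final one. The sketch below illustrates only the indexing, using a plain list in place of real tensors. The function name and the convention that 0 means "use the last layer" are assumptions for illustration, and real implementations often apply the text model's final layer norm to the selected output as well.

```python
def select_clip_layer(hidden_states, layers_to_skip=0):
    """Pick which text-encoder layer output to condition on.

    hidden_states is ordered from the first layer to the last. Skipping 0
    layers returns the final output; skipping 1 returns the one before it.
    """
    if not 0 <= layers_to_skip < len(hidden_states):
        raise ValueError("layers_to_skip is out of range")
    return hidden_states[len(hidden_states) - 1 - layers_to_skip]
```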

Animation, the new Video

The video section of MetalDiffusion has been renamed Animation and its code has had a major overhaul, specifically around camera movement. The new features are:

  • When you have an input image, you can preview what will change from frame to frame in the preview tab of the main GUI.
  • XYZ movement and rotation
  • Focal length
  • Zoom and angle are no longer options. They have been replaced by XYZ rotation/movement.
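To give a feel for what XYZ movement, rotation, and focal length mean geometrically, here is a small pinhole-camera sketch. It is a generic textbook model under my own assumptions (rotation order X, then Y, then Z, with the transform applied to scene points), not the camera code MetalDiffusion uses, and all function names are hypothetical.

```python
import math


def rotate_xyz(point, rx=0.0, ry=0.0, rz=0.0):
    """Rotate a 3D point by rx, ry, rz degrees about the X, Y, then Z axes."""
    x, y, z = point
    a, b, c = (math.radians(v) for v in (rx, ry, rz))
    # Rotation about the X axis
    y, z = y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a)
    # Rotation about the Y axis
    x, z = x * math.cos(b) + z * math.sin(b), -x * math.sin(b) + z * math.cos(b)
    # Rotation about the Z axis
    x, y = x * math.cos(c) - y * math.sin(c), x * math.sin(c) + y * math.cos(c)
    return (x, y, z)


def transform_point(point, move=(0.0, 0.0, 0.0), rotate=(0.0, 0.0, 0.0)):
    """Rotate a scene point, then translate it, for one animation step."""
    x, y, z = rotate_xyz(point, *rotate)
    return (x + move[0], y + move[1], z + move[2])


def project(point, focal_length):
    """Project a camera-space point onto the image plane (pinhole model)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)
```

In this model, moving a point closer along Z or raising the focal length both enlarge its projection, which is one way to see how a separate zoom control can be expressed through Z movement and focal length.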

GUI

The GUI has been recoded as a module that dream.py imports. My goal is to move away from Gradio and use PyQt. Ultimately, this would mean I can create a macOS app that can be installed just like any other app and launched by double-clicking, instead of the current method that relies heavily on the terminal.

Redesign

Inspired by Blender's layout, I've redesigned the GUI to favor bigger preview and result images. This means a big result/preview area on the left with all of the settings on the right.

Previews

ControlNet and Animation previews are now bigger and easier to access. They're under the Preview tab at the top.

Quality of Life

  • Import prior MetalDiffusion settings via .png files
  • Convert safetensors weights into either Keras .h5 folders or Diffusers/HuggingFace folders. Really useful!
  • Some memory efficiency regarding freeing up variables
  • Rich console text using the Rich Python module
  • Major restructure of folders, particularly models.
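On the PNG import: one common way to carry settings inside an image is a PNG tEXt chunk, which stores a keyword and a text value. The sketch below reads such chunks using only the standard library. These notes do not say how MetalDiffusion stores its settings, so the approach, the function names, and the "prompt" keyword in the usage example are assumptions for illustration.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text(data):
    """Return the keyword/text pairs stored in a PNG's tEXt chunks."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    fields = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, kind = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC
        if kind == b"tEXt":
            # tEXt layout: keyword, a NUL separator, then Latin-1 text
            keyword, _, text = body.partition(b"\x00")
            fields[keyword.decode("latin-1")] = text.decode("latin-1")
        elif kind == b"IEND":
            break
    return fields


def _chunk(kind, body):
    """Build one PNG chunk (used here only to create a sample image)."""
    crc = zlib.crc32(kind + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + kind + body + struct.pack(">I", crc)


def tiny_png_with_text(keyword, text):
    """Create a 1x1 grayscale PNG carrying a single tEXt chunk."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    pixels = zlib.compress(b"\x00\x00")  # filter byte + one gray pixel
    meta = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr) + _chunk(b"tEXt", meta)
            + _chunk(b"IDAT", pixels) + _chunk(b"IEND", b""))
```

Calling `read_png_text(tiny_png_with_text("prompt", "a lighthouse at dusk"))` round-trips the keyword and value, which is the basic mechanism a settings importer would build on.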

Model Folder Restructure

The models folder, where the weights for Stable Diffusion are stored, has been reorganized to reflect the different render engines. See the ReadMe.md for more information.

Future of TensorFlow

I'm quite proud of the work I did getting TensorFlow to support Textual Inversion and ControlNet, but I've been underwhelmed by its performance on Intel Macs (and its performance overall). There are more speed increases I can implement, particularly Token Merging, but it still pales in comparison to PyTorch.

However, I won't remove the TensorFlow render engine because I do think it has its uses, especially once the TensorFlow team figures out how to use multiple GPUs on Metal.

I hope I can implement LoRAs for the TensorFlow engine by the end of the year, but my main focus will now shift to implementing more popular features and, most importantly, SDXL. Thankfully, Diffusers will help with that!