3.1 Documentation Updates #4318
Merged
23 commits
0be3f4c Updating Nodes documentation (Millu)
7e367d4 Restructured nodes docs (Millu)
cd1347c Merge branch 'invoke-ai:main' into main (Millu)
f666ca8 Comfy to Invoke Overview (Millu)
4bc0b6d Merge branch 'invoke-ai:main' into main (Millu)
07e153c Corrections to Comfy -> Invoke Mappings (Millu)
697710c Adding GA4 to docs (Millu)
cafd7af Hiding CLI status (Millu)
db10a27 Merge branch 'invoke-ai:main' into main (Millu)
003ad31 Node doc updates (Millu)
6373b1e Merge branch 'main' into main (Millu)
f80af87 Merge branch 'main' into main (Millu)
631772a File path updates (Millu)
3dd1591 Merge branch 'invoke-ai:main' into main (Millu)
b019de9 Updates based on lstein's feedback (Millu)
d56a40c Fix broken links (Millu)
ac68b44 Fix broken links (Millu)
ed453a3 Update comfy to invoke nodes list (Millu)
95811a3 Merge branch 'invoke-ai:main' into main (Millu)
446ed8e Updated prompts documenation (Millu)
a881cb6 Merge pull request #1 from Millu/docs/prompting (Millu)
bd5bfe4 Fix formatting (Millu)
49c1833 Merge branch 'main' into main (Millu)
Binary file modified (BIN, -79.7 KB, 84%): docs/assets/prompt-blending/blue-sphere-0.25-red-cube-0.75-hybrid.png
Binary file modified (BIN, +112 KB, 120%): docs/assets/prompt-blending/blue-sphere-0.5-red-cube-0.5-hybrid.png
Binary file modified (BIN, +41.5 KB, 110%): docs/assets/prompt-blending/blue-sphere-0.75-red-cube-0.25-hybrid.png
Binary file modified (BIN, +97.1 KB, 130%): docs/assets/prompt-blending/blue-sphere-red-cube-hybrid.png
@@ -0,0 +1,27 @@
Understanding the diffusion process will help you use InvokeAI more effectively.

Stable Diffusion works with data in two main forms: images and latents.

Image space represents images in pixel form, the form you look at. Latent space represents compressed inputs, and it is in latent space that Stable Diffusion processes images. A VAE (Variational Autoencoder) compresses and encodes inputs into latent space and decodes outputs back into image space.

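To give a feel for how much the VAE compresses, here is a minimal sketch of the size relationship between image space and latent space. It assumes the 8x spatial downscale and 4 latent channels used by common Stable Diffusion VAEs; neither number comes from this document, and `latent_shape` and `compression_ratio` are hypothetical helper names, not InvokeAI functions.

```python
def latent_shape(height: int, width: int, channels: int = 4, factor: int = 8):
    """Return the (channels, height, width) of the latent for a pixel-space image.

    Assumption: the VAE downsamples each side by `factor` and produces
    `channels` latent channels (typical values, not taken from InvokeAI).
    """
    if height % factor or width % factor:
        raise ValueError(f"height and width must be multiples of {factor}")
    return (channels, height // factor, width // factor)


def compression_ratio(height: int, width: int) -> float:
    """How many RGB pixel values map onto one latent value."""
    c, h, w = latent_shape(height, width)
    return (3 * height * width) / (c * h * w)


print(latent_shape(512, 512))       # (4, 64, 64)
print(compression_ratio(512, 512))  # 48.0
```

Under these assumptions a 512x512 RGB image is represented by a 4x64x64 latent, which is why the denoising work is done in latent space rather than on pixels.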
To fully understand the diffusion process, we need to understand a few more terms: U-Net, CLIP, and conditioning.

A U-Net is a model trained on a large number of latent images with known amounts of random noise added. This means that the U-Net can be given a slightly noisy image and will predict the pattern of noise that must be subtracted from the image to recover the original.

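The idea of "known amounts of random noise" can be illustrated with a small sketch. It assumes the variance-preserving mix commonly used in diffusion training (signal scaled by `sqrt(alpha_bar)`, noise by `sqrt(1 - alpha_bar)`); the document does not specify the formula, and the function names are hypothetical. The true noise stands in for a perfect U-Net prediction, which a real model only approximates.

```python
import math
import random


def add_noise(latent, noise, alpha_bar):
    """Mix a clean latent with noise; alpha_bar in (0, 1] is the share of signal kept."""
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal_scale * x + noise_scale * e for x, e in zip(latent, noise)]


def remove_noise(noisy, predicted_noise, alpha_bar):
    """Invert add_noise, given a prediction of the noise that was added."""
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [(y - noise_scale * e) / signal_scale for y, e in zip(noisy, predicted_noise)]


rng = random.Random(0)
clean = [rng.uniform(-1.0, 1.0) for _ in range(8)]  # stand-in for a latent image
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]     # the known noise

noisy = add_noise(clean, noise, alpha_bar=0.9)
# A trained U-Net would be asked to predict `noise` from `noisy`;
# here the true noise is passed in as if the prediction were perfect.
recovered = remove_noise(noisy, noise, alpha_bar=0.9)
max_error = max(abs(a - b) for a, b in zip(clean, recovered))
print(max_error < 1e-9)
```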
CLIP is a model that tokenizes and encodes text into conditioning. This conditioning guides the model during the denoising steps to produce a new image.

The U-Net and CLIP work together at each denoising step of image generation: the U-Net removes noise so that the result resembles images in its training set, while the conditioning from CLIP steers the U-Net towards images that most closely match the prompt.

When you generate an image using text-to-image, multiple steps occur in latent space:
1. Random noise is generated at the chosen height and width. The noise’s characteristics are dictated by the seed. This noise tensor is passed into latent space. We’ll call this noise A.
2. Using a model’s U-Net, a noise predictor examines noise A and the conditioning (the words from your prompt, tokenized and encoded by CLIP). It generates its own noise tensor to predict what the final image might look like in latent space. We’ll call this noise B.
3. Noise B is subtracted from noise A in an attempt to create a latent image consistent with the prompt. This step is repeated for the number of sampler steps chosen.
4. The VAE decodes the final latent image from latent space into image space.

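The text-to-image steps above can be sketched as a toy loop. This is not InvokeAI's implementation: `predict_noise` is a hypothetical stand-in for the U-Net (a real one is a neural network), the conditioning is faked as a list of target values, the 8x latent downscale is an assumption, and the VAE decode in step 4 is omitted.

```python
import random


def make_noise(height, width, seed):
    """Step 1: seeded random noise at latent size (assumes an 8x downscale)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range((height // 8) * (width // 8))]


def predict_noise(latent, conditioning):
    """Step 2 stand-in: treat whatever differs from the conditioning as noise."""
    return [x - c for x, c in zip(latent, conditioning)]


def denoise(latent, conditioning, steps):
    """Step 3: subtract a share of the predicted noise at every sampler step."""
    for step in range(steps):
        noise_b = predict_noise(latent, conditioning)
        fraction = 1.0 / (steps - step)  # the final step removes what is left
        latent = [x - fraction * n for x, n in zip(latent, noise_b)]
    return latent


noise_a = make_noise(64, 64, seed=42)
conditioning = [0.5] * len(noise_a)  # hypothetical encoded prompt
final_latent = denoise(noise_a, conditioning, steps=10)

# The same seed reproduces the same starting noise.
print(make_noise(64, 64, seed=42) == noise_a)
```

The point of the sketch is the control flow: the seed fixes noise A, each step asks the predictor for noise B, and repeated subtraction moves the latent towards something consistent with the conditioning.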
Image-to-image is a similar process, with only step 1 being different:
1. The input image is encoded from image space into latent space by the VAE. Noise is then added to the input latent image. Denoising Strength dictates how many noise steps are added and the amount of noise added at each step. A Denoising Strength of 0 means there are 0 steps and no noise is added, resulting in an unchanged image, while a Denoising Strength of 1 results in the image being completely replaced with noise and a full set of denoising steps being performed. The process is then the same as steps 2-4 in the text-to-image process.

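As a rough illustration of how Denoising Strength relates to step count, the sketch below scales the number of sampler steps by the strength. This is one common convention and an assumption here; InvokeAI's exact rounding and scheduling may differ, and `img2img_steps` is a hypothetical helper.

```python
def img2img_steps(total_steps: int, denoising_strength: float) -> int:
    """Number of denoising steps actually run for a given strength.

    Assumption: the step count scales linearly with strength and is
    rounded down; real implementations may round differently.
    """
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    return min(int(total_steps * denoising_strength), total_steps)


print(img2img_steps(30, 0.0))   # 0  -> image unchanged
print(img2img_steps(30, 0.75))  # 22
print(img2img_steps(30, 1.0))   # 30 -> full set of denoising steps
```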
Furthermore, a model provides the CLIP prompt tokenizer, the VAE, and a U-Net (where noise prediction occurs given a prompt and an initial noise tensor).

A noise scheduler (e.g. DPM++ 2M Karras) schedules the subtraction of noise from the latent image across the sampler steps chosen (step 3 above). Less noise is usually subtracted at higher sampler steps.
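
To make "less noise at higher steps" concrete, here is a sketch of the noise-level schedule that the "Karras" scheduler variants are named after. The formula follows the published Karras schedule; the `sigma_min`, `sigma_max`, and `rho` defaults are illustrative assumptions, not values read from InvokeAI, and `karras_sigmas` is a hypothetical helper.

```python
def karras_sigmas(steps: int, sigma_min: float = 0.03, sigma_max: float = 14.6,
                  rho: float = 7.0):
    """Noise levels from sigma_max down to sigma_min, spaced Karras-style."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    high = sigma_max ** (1.0 / rho)
    low = sigma_min ** (1.0 / rho)
    return [(high + i / (steps - 1) * (low - high)) ** rho for i in range(steps)]


sigmas = karras_sigmas(10)
# Noise removed between consecutive steps.
drops = [a - b for a, b in zip(sigmas, sigmas[1:])]
for step, drop in enumerate(drops, start=1):
    print(f"step {step}: removes {drop:.4f}")
```

With these settings each step removes less noise than the one before it, so early steps make large changes to the latent and later steps refine detail.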