Dynamic prompt generation script for parameter scans #2831

Merged · 14 commits into v2.3 · Mar 10, 2023

Conversation

@lstein (Collaborator) commented Feb 27, 2023

Programmatically generate a large number of images varying by prompt and other image-generation parameters

This is a little standalone script named dynamic_prompts.py that enables the generation of dynamic prompts. Using YAML syntax, you specify a template of prompt phrases and lists of generation parameters, and the script will generate a cross product of prompts and generation settings for you. You can save these prompts to disk for later use, or pipe them to the invokeai CLI to generate the images on the fly.

Typical uses are systematically testing step and CFG values while holding the seed and prompt constant, trying out various artists' styles, and comparing the results of the same prompt across different models.

A typical template will look like this:

model: stable-diffusion-1.5
steps: "30:50:10"
seed: 50
dimensions: 512x512
cfg:
  - 7
  - 12
sampler:
  - k_euler_a
  - k_lms
prompt:
  style:
       - greg rutkowski
       - gustav klimt
  location:
       - the mountains
       - a desert
  object:
       - luxurious dwelling
       - crude tent
  template: a {object} in {location}, in the style of {style}

This will generate 96 different images, one for each combination of the parameter values specified in the template. For example, the prompt axis alone expands into a cross-product list like:

a luxurious dwelling in the mountains, in the style of greg rutkowski
a luxurious dwelling in the mountains, in the style of gustav klimt
a luxurious dwelling in a desert, in the style of greg rutkowski
... etc
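
To make the expansion concrete, here is a minimal sketch of the prompt-axis cross product using itertools.product and str.format; the field names mirror the template above, but the script's actual internals may differ:

    from itertools import product

    # Hypothetical illustration of the prompt-axis expansion.
    fields = {
        "style": ["greg rutkowski", "gustav klimt"],
        "location": ["the mountains", "a desert"],
        "object": ["luxurious dwelling", "crude tent"],
    }
    template = "a {object} in {location}, in the style of {style}"

    keys = list(fields)
    for combo in product(*(fields[k] for k in keys)):
        print(template.format(**dict(zip(keys, combo))))

This prints the eight prompt variants; crossing them with the steps, cfg, and sampler axes yields the full 96 combinations.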

A typical usage would be:

python scripts/dynamic_prompts.py --invoke --outdir=/tmp/scanning my_template.yaml

This will populate /tmp/scanning with each of the requested images, and also generate a log.md file which you can open with an e-book reader to show something like this:

[image: screenshot of the rendered log.md grid of results]

Full instructions can be obtained using the --instructions switch, and an example template can be printed out using --example:

python scripts/dynamic_prompts.py --instructions
python scripts/dynamic_prompts.py --example > my_first_template.yaml

Simple script to generate a file of InvokeAI prompts and settings
that scan across steps and other parameters.

To use, create a file named "template.yaml" (or similar) formatted like this:
>>> cut here <<<
steps: "30:50:1"
seed: 50
cfg:
  - 7
  - 8
  - 12
sampler:
  - ddim
  - k_lms
prompt:
  - a sunny meadow in the mountains
  - a gathering storm in the mountains
>>> cut here <<<

Create sections named "steps", "seed", "cfg", "sampler" and "prompt".
- Each section can have a constant value such as this:
     steps: 50
- Or a range of numeric values in the format:
     steps: "<start>:<stop>:<step>"
- Or a list of values in the format:
     - value1
     - value2
     - value3

Be careful to: 1) put quotation marks around numeric ranges; 2) put a
space between the "-" and the value in a list of values; and 3) use spaces,
not tabs, at the beginnings of indented lines.
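
As a sketch, a hypothetical helper that normalizes any of these three forms into a list might look like the following; treating <stop> as inclusive is an assumption, chosen to match the expansion counts in the examples above:

    def expand_value(value):
        # Accepts a constant (e.g. 50), a list (e.g. [7, 8, 12]),
        # or a quoted range string "<start>:<stop>:<step>".
        if isinstance(value, list):
            return value
        if isinstance(value, str) and value.count(":") == 2:
            start, stop, step = (int(v) for v in value.split(":"))
            return list(range(start, stop + 1, step))  # assumes stop is inclusive
        return [value]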

When you run this script, capture the output into a text file like this:

    python generate_param_scan.py template.yaml > output_prompts.txt

"output_prompts.txt" will now contain an expansion of all the list
values you provided. You can examine it in a text editor such as
Notepad.

Now start the CLI, and feed the expanded prompt file to it using the
"!replay" command:

   !replay output_prompts.txt

Alternatively, you can directly feed the output of this script
by issuing a command like this from the developer's console:

   python generate_param_scan.py template.yaml | invokeai

You can use the web interface to view the resulting images and their
metadata.
@hipsterusername (Member) commented:
Very cool! Is this effectively the “dynamic prompt” feature? I.e., it’s creating every possible permutation of all those settings?

@lstein (Collaborator, Author) commented Feb 28, 2023

Yeah, that's what this is. Hang on a bit and I'll add a syntax for combining prompt fragments like this:

style:
       - greg rutkowski
       - gustav klimt
       - renoir
       - donetello
subject:
       - two friends walking in the park
       - two dogs playing in the dogpark
prompt: {subject} in the style of {style}

Also planned:

- ability to cycle through models and dimensions
- processing automatically through invokeai
- creating an .md file to display the grid of results
@lstein changed the title from "add a simple parameter scanning script to the scripts directory" to "Dynamic prompt generation script for parameter scans" on Feb 28, 2023
@JPPhoto (Contributor) commented Feb 28, 2023

I believe that dynamic prompting with some syntax ((a,b).allOf()/(a,b).oneOf()) is on the compel roadmap, but that's a @damian0815 question.

@lstein (Collaborator, Author) commented Feb 28, 2023

> I believe that dynamic prompting with some syntax ((a,b).allOf()/(a,b).oneOf()) is on the compel roadmap, but that's a @damian0815 question.

That's why this is going into the v2.3 branch as a standalone script that doesn't touch the main code base. I know it will be superseded by @damian0815 's compel work.

@JPPhoto (Contributor) commented Feb 28, 2023

> I believe that dynamic prompting with some syntax ((a,b).allOf()/(a,b).oneOf()) is on the compel roadmap, but that's a @damian0815 question.
>
> That's why this is going into the v2.3 branch as a standalone script that doesn't touch the main code base. I know it will be superseded by @damian0815 's compel work.

Part of it, yes... but compel doesn't handle parameters and I think being able to vary those will be very helpful on main in the CLI or via a script like this.

@lstein (Collaborator, Author) commented Mar 1, 2023

> I believe that dynamic prompting with some syntax ((a,b).allOf()/(a,b).oneOf()) is on the compel roadmap, but that's a @damian0815 question.
>
> That's why this is going into the v2.3 branch as a standalone script that doesn't touch the main code base. I know it will be superseded by @damian0815 's compel work.
>
> Part of it, yes... but compel doesn't handle parameters and I think being able to vary those will be very helpful on main in the CLI or via a script like this.

Nothing is preventing this from going into main as well, but I'm already getting more feature requests; for instance, it doesn't do img2img.

@hipsterusername (Member) left a comment:

My review won't unblock this, but I'll give it the performative 👍 to support this being merged.

@lstein (Collaborator, Author) commented Mar 5, 2023

Update: To allow for faster dynamic prompt generation, the script now renders in parallel across multiple processes and GPUs. By default, it will launch one process per CUDA GPU, but this can be controlled using --processes_per_gpu. For example, this will run 4 invokeai instances on each GPU, assuming there is enough VRAM to support the rendering:

scripts/dynamic_prompts.py example2.yaml --invoke --outdir=~/outputs --processes_per_gpu=4
>> Spawning 8 invokeai processes across 2 CUDA gpus
>> Process 2262144 running on GPU 1
>> Process 2262141 running on GPU 0
>> Process 2262139 running on GPU 0
>> Process 2262145 running on GPU 0
>> Process 2262140 running on GPU 1
>> Process 2262143 running on GPU 0
>> Process 2262142 running on GPU 1
>> Process 2262146 running on GPU 1
[133] /home/lstein/outputs/dp.0001.a sunny meadow in the mountains in the style of greg rutkowski.png
[134] /home/lstein/outputs/dp.0009.a gathering storm in the mountains in the style of greg rutkowski.png
[133] /home/lstein/outputs/dp.0002.a sunny meadow in the mountains in the style of greg rutkowski.png
[135] /home/lstein/outputs/dp.0010.a gathering storm in the mountains in the style of greg rutkowski.png
[134] /home/lstein/outputs/dp.0011.a gathering storm in the mountains in the style of gustav klimt.png
[135] /home/lstein/outputs/dp.0013.a gathering storm in the mountains in the style of renoir.png
[136] /home/lstein/outputs/dp.0012.a gathering storm in the mountains in the style of gustav klimt.png
[133] /home/lstein/outputs/dp.0003.a sunny meadow in the mountains in the style of gustav klimt.png
[133] /home/lstein/outputs/dp.0007.a sunny meadow in the mountains in the style of donetello.png
[133] /home/lstein/outputs/dp.0005.a sunny meadow in the mountains in the style of renoir.png
[137] /home/lstein/outputs/dp.0015.a gathering storm in the mountains in the style of donetello.png
[136] /home/lstein/outputs/dp.0014.a gathering storm in the mountains in the style of renoir.png
[133] /home/lstein/outputs/dp.0008.a sunny meadow in the mountains in the style of donetello.png
[133] /home/lstein/outputs/dp.0006.a sunny meadow in the mountains in the style of renoir.png
[133] /home/lstein/outputs/dp.0004.a sunny meadow in the mountains in the style of gustav klimt.png
[134] /home/lstein/outputs/dp.0016.a gathering storm in the mountains in the style of donetello.png

Note that the images come out in higgledy-piggledy order, but on the filesystem they will sort in the order specified in the prompt template.

I am considering a simple extension to allow this script to run across multiple nodes of a compute cluster, provided that the user has installed invokeai on each of the nodes, but I should get back to working on nodes first.
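
As a rough sketch of the spawning pattern described above: each child can be pinned to one GPU via CUDA_VISIBLE_DEVICES. The chunking of prompts into one file per worker and the exact invokeai invocation (--from_file) are assumptions for illustration, not the script's actual code:

    import os
    import subprocess
    import torch

    def spawn_workers(chunk_files):
        # One prompt-file chunk per worker; a caller would pass
        # gpu_count * processes_per_gpu chunks.
        gpus = max(torch.cuda.device_count(), 1)
        procs = []
        for i, chunk in enumerate(chunk_files):
            # Round-robin the children across CUDA devices.
            env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(i % gpus))
            procs.append(subprocess.Popen(["invokeai", "--from_file", chunk], env=env))
        for p in procs:
            p.wait()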

@ebr (Member) left a comment:

Worked great with up to 3 parallel processes on a 24GB Tesla P40, after a few tries to get it going. I noted the issues inline.
Also, because stdout from the individual threads isn't captured, it wasn't easy to determine their root causes.
But overall it works great in parallel on a single GPU (with the expected performance hit).

(Two inline review comments on scripts/dynamic_prompts.py, since resolved.)
@ebr (Member) commented Mar 7, 2023

> I am considering a simple extension to allow this script to run across multiple nodes of a compute cluster, provided that the user has installed invokeai on each of the nodes, but I should get back to working on nodes first.

@lstein Interestingly enough, the way you've structured the YAML is exactly how I'd write this as an input to an Argo workflow, which could easily parallelize it across hundreds of GPU instances. What's the approach you're thinking of? I did take a stab at it again using nodes over the weekend, but didn't get very far yet.

@lstein requested a review from @mauwii as a code owner on March 7, 2023 at 15:10
@lstein (Collaborator, Author) commented Mar 7, 2023 via email

@ebr (Member) commented Mar 8, 2023

Cool! Sounds similar to the SLURM scheduler, right? I haven't worked with it myself, but I know it's widely used in academia.
I started looking into doing this on Nodes, but no significant progress yet.

Yes, I did test multiprocessing (single GPU, 3 threads). It worked very well with the exception of a couple of stumbling blocks; see my previous comments. The main issues were the prompt-in-filename and the lack of log output. It worked great once I figured out my prompts were not filename-friendly.

Obscured logging output IS indeed a problem for troubleshooting, though. I haven't gotten a CUDA OOM (will try!), but overall it would be helpful to capture and print stdout/stderr from the threads, even at the expense of pretty output, at least for now.

I can approve to unblock this, but please ping me to re-test if you'd like.

@ebr (Member) left a comment:

Worked well with simple prompts (3 processes / 1 GPU).

Filenames should not contain the prompt; this breaks with more complex prompts.
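
One way to address this would be to reduce the prompt to a filesystem-safe slug before building the filename; a hypothetical helper (not what the script currently ships):

    import re

    def slugify(prompt, max_len=64):
        # Keep alphanumerics, collapse everything else to single
        # hyphens, lowercase, and truncate to a sane length.
        slug = re.sub(r"[^A-Za-z0-9]+", "-", prompt).strip("-").lower()
        return slug[:max_len] or "prompt"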

@lstein (Collaborator, Author) commented Mar 9, 2023

> Cool! Sounds similar to the SLURM scheduler, right? I haven't worked with it myself, but I know it's widely used in academia. I started looking into doing this on Nodes, but no significant progress yet.
>
> Yes, I did test multiprocessing (single GPU, 3 threads). It worked very well with the exception of a couple of stumbling blocks; see my previous comments. The main issues were the prompt-in-filename and the lack of log output. It worked great once I figured out my prompts were not filename-friendly.
>
> Obscured logging output IS indeed a problem for troubleshooting, though. I haven't gotten a CUDA OOM (will try!), but overall it would be helpful to capture and print stdout/stderr from the threads, even at the expense of pretty output, at least for now.
>
> I can approve to unblock this, but please ping me to re-test if you'd like.

I think the lack of logging is a blocker, and I'll fix it by redirecting stderr to a file rather than trying to suppress it. The reason I tried to suppress it is that I couldn't stand the sight of 8 TQDMs fighting with each other for control of the console!

@ebr (Member) commented Mar 9, 2023

> [..] I couldn't stand the sight of 8 TQDMs fighting with each other for control of the console!

Agreed! I'm very tempted to solve this with rich.Live or try my hand at npyscreen, but I must resist getting nerd-sniped at the moment. Redirecting stderr to a file would definitely do the trick for now.

@lstein (Collaborator, Author) commented Mar 9, 2023

@ebr Logging of stderr is now implemented correctly (I think). Here is what it looks like:

(2.3) lstein@gpu-nvidia:~/Projects/InvokeAI-2.3$ invokeai-batch --processes_per_gpu=3 --invoke --outdir=/tmp/outputs ./example2.yaml
>> Spawning 6 invokeai processes across 2 CUDA gpus
>> Outputs will be written into /tmp/outputs, and error logs will be written to /tmp/outputs/invokeai-batch-logs
>> Process 5053 running on GPU 1; logging to /tmp/outputs/invokeai-batch-logs/2023-03-09-11:52:38-pid=5053.txt
>> Process 5052 running on GPU 0; logging to /tmp/outputs/invokeai-batch-logs/2023-03-09-11:52:38-pid=5052.txt
>> Process 5051 running on GPU 1; logging to /tmp/outputs/invokeai-batch-logs/2023-03-09-11:52:38-pid=5051.txt
>> Process 5048 running on GPU 0; logging to /tmp/outputs/invokeai-batch-logs/2023-03-09-11:52:38-pid=5048.txt
>> Process 5049 running on GPU 1; logging to /tmp/outputs/invokeai-batch-logs/2023-03-09-11:52:38-pid=5049.txt
>> Process 5050 running on GPU 0; logging to /tmp/outputs/invokeai-batch-logs/2023-03-09-11:52:38-pid=5050.txt

The log files themselves have the TQDM progress bars, but I've separated them by process so that they'll be internally coherent. Runtime errors get logged to the end of the log file as expected.
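
For reference, a sketch of the per-process redirection pattern; the names and the rendering placeholder are illustrative assumptions, not the script's actual code:

    import os
    import sys
    import time
    from multiprocessing import Process

    def worker(task, log_dir):
        # Each worker re-points its own stderr at a timestamped,
        # pid-named file, so TQDM bars from different processes
        # never interleave on the console.
        stamp = time.strftime("%Y-%m-%d-%H:%M:%S")
        path = os.path.join(log_dir, f"{stamp}-pid={os.getpid()}.txt")
        sys.stderr = open(path, "w", buffering=1)  # line-buffered
        print(f"worker {os.getpid()} handling {task}", file=sys.stderr)
        # ... the actual rendering work would go here ...

    def launch(tasks, log_dir):
        os.makedirs(log_dir, exist_ok=True)
        procs = [Process(target=worker, args=(t, log_dir)) for t in tasks]
        for p in procs:
            p.start()
        for p in procs:
            p.join()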

@keturn (Contributor) left a comment:

Is the rebuild of stats.html an intentional part of this PR?

Overall this looks like a super handy script we'll be glad to have, and it's well-documented. 👍

A few thoughts for things we can do later (certainly don't need to be in this iteration):

  • direct HTML output instead of Markdown.
  • adding proper batch support to the backend will be much more efficient than having more than one process per GPU in many cases. (We can vary many things within a parallel batch, but not all of the template fields, so you still may end up using this mode of parallelization if you want to compare models, for example.)

(Inline review comment on ldm/invoke/dynamic_prompts.py, since resolved.)
@lstein merged commit 8323169 into v2.3 on Mar 10, 2023
@lstein deleted the enhance/simple-param-scanner-script branch on March 10, 2023 at 01:18