Dynamic prompt generation script for parameter scans #2831
Conversation
Simple script to generate a file of InvokeAI prompts and settings that scan across steps and other parameters. To use, create a file named "template.yaml" (or similar) formatted like this:

```yaml
>>> cut here <<<
steps: "30:50:1"
seed: 50
cfg:
  - 7
  - 8
  - 12
sampler:
  - ddim
  - k_lms
prompt:
  - a sunny meadow in the mountains
  - a gathering storm in the mountains
>>> cut here <<<
```

Create sections named "steps", "seed", "cfg", "sampler" and "prompt".

- Each section can have a constant value such as this: `steps: 50`
- Or a range of numeric values in the format: `steps: "<start>:<stop>:<step>"`
- Or a list of values in the format:
  - value1
  - value2
  - value3

Be careful to: 1) put quotation marks around numeric ranges; 2) put a space between the "-" and the value in a list of values; and 3) use spaces, not tabs, at the beginnings of indented lines.

When you run this script, capture the output into a text file like this:

```
python generate_param_scan.py template.yaml > output_prompts.txt
```

"output_prompts.txt" will now contain an expansion of all the list values you provided. You can examine it in a text editor such as Notepad.

Now start the CLI and feed the expanded prompt file to it using the "!replay" command:

```
!replay output_prompts.txt
```

Alternatively, you can feed the output of this script directly to the CLI by issuing a command like this from the developer's console:

```
python generate_param_scan.py template.yaml | invokeai
```

You can use the web interface to view the resulting images and their metadata.
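For readers curious how the expansion step works, here is a minimal sketch of the cross product of template values (not the actual script; the inclusive-range assumption and the printed output format are illustrative only, while the template keys and range syntax follow the description above):

```python
import itertools
import yaml  # pyyaml

def expand(value):
    """Turn one template entry into a list of values."""
    if isinstance(value, str) and value.count(":") == 2:
        start, stop, step = (int(x) for x in value.split(":"))
        return list(range(start, stop + 1, step))  # assumes an inclusive range
    if isinstance(value, list):
        return value
    return [value]  # constant value

with open("template.yaml") as f:
    template = yaml.safe_load(f)

keys = list(template.keys())
axes = [expand(template[k]) for k in keys]

# One entry per combination of settings; the real script formats these
# as InvokeAI prompt lines suitable for !replay or piping to the CLI.
for combo in itertools.product(*axes):
    print(dict(zip(keys, combo)))
```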
Very cool! Is this effectively the “dynamic prompt” feature? I.e., it’s creating every possible permutation of all those settings?
Yeah, that's what this is. Hang on a bit and I'll add a syntax for combining prompt fragments like this:
- ability to cycle through models and dimensions
- process automatically through invokeai
- create an .md file to display the grid results
I believe that dynamic prompting with some syntax (
That's why this is going into the
Part of it, yes... but
Nothing preventing this from going into
My review won't unblock this, but I'll give it the performative 👍 to support this being merged.
Update: To allow for faster dynamic prompt generation, the script now renders in parallel across multiple processes and GPUs. By default, it will launch one process per CUDA GPU, but this can be controlled using
Note that the images come out in higgledy-piggledy order, but on the filesystem they will sort in the order specified in the prompt template. I am considering a simple extension to allow this script to run across multiple nodes of a compute cluster, provided that the user has installed invokeai on each of the nodes, but I should get back to working on nodes first.
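As a rough illustration of the parallelization scheme described here (not the actual implementation; the worker function body and the round-robin chunking are assumptions), one process per CUDA device could be launched like this:

```python
import multiprocessing as mp
import torch

def render_worker(device_id: int, prompts: list[str]) -> None:
    # Each worker pins itself to one GPU and renders its share of the prompts.
    torch.cuda.set_device(device_id)
    for prompt in prompts:
        ...  # hand the prompt to the generation pipeline on this device

if __name__ == "__main__":
    prompts = open("output_prompts.txt").read().splitlines()
    n_gpus = max(torch.cuda.device_count(), 1)
    # Round-robin the expanded prompts across one process per GPU.
    chunks = [prompts[i::n_gpus] for i in range(n_gpus)]
    procs = [mp.Process(target=render_worker, args=(i, chunk))
             for i, chunk in enumerate(chunks)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```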
Worked great with up to 3 parallel processes on a 24GB Tesla P40, after a few tries to get it going. I noted the issues inline. Also, because stdout from individual threads isn't captured, it wasn't easy to determine their root causes. But overall it works great in parallel on a single GPU (with the expected performance hit).
@lstein interestingly enough, the way you've structured the YAML is exactly how I'd write this as an input to an Argo workflow, which could easily parallelize it across hundreds of GPU instances. What's the approach you're thinking of? I did take a stab at it again using nodes over the weekend, but didn't get very far yet.
That's an interesting coincidence!

The approach I'm thinking of is the same one used by `gnu-parallel`. Basically, it requires that each worker node has a public-key-based ssh login for the current user and that each worker has the same set of InvokeAI models installed (or they may all use a shared directory). The master node runs an ssh session to each worker which launches InvokeAI, feeds the expanded prompts to the worker via its pseudo-tty standard input, and accepts the resulting image via stdout. It's not at all elegant, but it is actually quite scalable and benefits from the built-in strong authentication and encryption that ssh provides.

This can almost work now, but it requires a modification to the CLI to encapsulate the image and its metadata as JSON-encoded objects rather than writing to local disk.

Did you have a chance to test the multiprocessing on a local machine? I haven't done much in the way of exception handling and am wondering whether the errors are interpretable when there is a crash due to out-of-RAM or out-of-VRAM conditions.
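A bare-bones sketch of that ssh fan-out, assuming passwordless key-based ssh to each worker and `invokeai` on each remote PATH (the hostnames and chunking scheme are illustrative; as noted above, images would still land on each worker's local disk until the CLI can return them as JSON):

```python
import subprocess

workers = ["gpu-node-1", "gpu-node-2"]  # illustrative hostnames
prompts = open("output_prompts.txt").read().splitlines()

# Open one ssh session per worker, each running the InvokeAI CLI,
# and feed each worker its share of the expanded prompts on stdin.
sessions = [
    subprocess.Popen(["ssh", host, "invokeai"],
                     stdin=subprocess.PIPE, text=True)
    for host in workers
]
for i, proc in enumerate(sessions):
    proc.stdin.write("\n".join(prompts[i::len(sessions)]) + "\n")
    proc.stdin.close()
for proc in sessions:
    proc.wait()
```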
Cool! Sounds similar to the SLURM scheduler, right? I haven't worked with it myself, but I know it's widely used in academia.

Yes, I did test multiprocessing (single GPU, 3 threads) - it worked very well with the exception of a couple of stumbling blocks - see my previous comments. The main issues were the prompt-in-filename and the lack of log output. It worked great once I figured out my prompts were not filename-friendly. Obscured logging output IS indeed a problem for troubleshooting, though. I haven't gotten a CUDA OOM yet (will try!), but overall it would be helpful to capture and print stdout/stderr from the threads, even at the expense of pretty output, at least for now.

I can approve to unblock this, but please ping me to re-test if you'd like.
worked well with simple prompts (3 processes / 1 GPU)
filenames should not contain the prompt - this breaks with more complex prompts
I think the lack of logging is a blocker, and I'll fix this by redirecting stderr to a file rather than trying to suppress it. The reason I tried to suppress it is that I couldn't stand the sight of 8 TQDMs fighting with each other for control of the console!
Agreed! I'm very tempted to solve this with
@ebr Logging of stderr is now implemented correctly (I think). Here is what it looks like:
The log files themselves have the TQDM progress bars, but I've separated them by process so that they'll be internally coherent. Runtime errors get logged to the end of the log file as expected.
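A minimal sketch of the kind of per-process log redirection being described (the file naming and worker structure are assumptions, not the script's actual code):

```python
import sys

def run_worker(worker_id: int, prompts: list[str]) -> None:
    # Each worker process redirects its own stderr (tqdm progress bars,
    # tracebacks) to a dedicated log file, so each log stays internally
    # coherent and runtime errors show up at the end of that worker's file.
    sys.stderr = open(f"worker-{worker_id}.log", "w", buffering=1)
    for prompt in prompts:
        ...  # render the image for this prompt
```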
Is the rebuild of stats.html an intentional part of this PR?
Overall this looks like a super handy script we'll be glad to have, and it's well-documented. 👍
A few thoughts for things we can do later (certainly don't need to be in this iteration):
- direct HTML output instead of Markdown.
- adding proper batch support to the backend will be much more efficient than having more than one process per GPU in many cases. (We can vary many things within a parallel batch, but not all of the template fields, so you still may end up using this mode of parallelization if you want to compare models, for example.)
Programmatically generate a large number of images varying by prompt and other image generation parameters
This is a little standalone script named dynamic_prompting.py that enables the generation of dynamic prompts. Using YAML syntax, you specify a template of prompt phrases and lists of generation parameters, and the script will generate a cross product of prompts and generation settings for you. You can save these prompts to disk for later use, or pipe them to the invokeai CLI to generate the images on the fly.

Typical uses are testing step and CFG values systematically while holding the seed and prompt constant, testing out various artists' styles, and comparing the results of the same prompt across different models.
A typical template will look like this:
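As an illustrative sketch of such a template (made-up values; with these, the cross product works out to 8 × 3 × 2 × 2 = 96 combinations, consistent with the count mentioned below, though this is not necessarily the original example):

```yaml
steps:
  - 10
  - 15
  - 20
  - 25
  - 30
  - 35
  - 40
  - 50
seed: 50
cfg:
  - 7
  - 8
  - 12
sampler:
  - ddim
  - k_lms
prompt:
  - a sunny meadow in the mountains
  - a gathering storm in the mountains
```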
This will generate 96 different images, each of which varies by one of the dimensions specified in the template. For example, the prompt axis will generate a cross product list like:
A typical usage would be:
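A plausible invocation is sketched below; the template filename is illustrative and the `--outdir` switch is a hypothetical way of pointing the script at /tmp/scanning (only the `--instructions` and `--example` switches are confirmed in this description):

```
# --outdir here is hypothetical (illustration only); see --instructions for the real switches
python dynamic_prompting.py --outdir=/tmp/scanning template.yaml
```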
This will populate /tmp/scanning with each of the requested images, and also generate a log.md file which you can open with an e-book reader to show something like this:

Full instructions can be obtained using the --instructions switch, and an example template can be printed out using --example:
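For example (redirecting the printed example into a starter template file is just one way to use these confirmed switches):

```
python dynamic_prompting.py --instructions
python dynamic_prompting.py --example > my_template.yaml
```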