
Node Expansion, While Loops, Components, and Lazy Evaluation #931

Closed
wants to merge 28 commits

Conversation

guill
Contributor

@guill guill commented Jul 19, 2023

This PR inverts the execution model -- from recursively calling nodes to
using a topological sort of the nodes. This change allows for
modification of the node graph during execution. This allows for two
major advantages:

  1. The implementation of lazy evaluation in nodes. For example, if a
    "Mix Images" node has a mix factor of exactly 0.0, the second image
    input doesn't even need to be evaluated (and vice versa if the mix
    factor is 1.0).
  2. Dynamic expansion of nodes. This allows for the creation of dynamic
    "node groups". Specifically, custom nodes can return subgraphs that
    replace the original node in the graph. This is an incredibly
    powerful concept. Using this functionality, it was easy to
    implement:
    a. Components (a.k.a. node groups)
    b. Flow control (i.e. while loops) via tail recursion
    c. All-in-one nodes that replicate the WebUI functionality
    d. and more
    All of those were able to be implemented entirely via custom nodes
    without hooking or replacing any core functionality. Within this PR,
    I've included all of these proof-of-concepts within a custom node pack.
    In reality, I would expect some number of them to be merged into the
    core node set (with the rest left to be implemented by custom nodes).

I made very few changes to the front-end, so there are probably some
easy UX wins for someone who is more willing to wade into .js land. The
user experience is a lot better than I expected though -- progress shows
correctly in the UI over the nodes that are being expanded.
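To make point 1 concrete, here's a minimal sketch of what a lazy "Mix Images" node could look like, using the lazy input option and the check_lazy_status hook mentioned elsewhere in this thread. The class name, input names, and exact semantics are assumptions for illustration, not the PR's literal API:

```python
class MixImages:
    """Hypothetical node that blends image1 and image2 by mix_factor."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "mix_factor": ("FLOAT", {"default": 0.5, "min": 0.0, "max": 1.0}),
                # Lazy inputs are only evaluated if the node asks for them.
                "image1": ("IMAGE", {"lazy": True}),
                "image2": ("IMAGE", {"lazy": True}),
            }
        }

    def check_lazy_status(self, mix_factor, image1=None, image2=None):
        # Report which lazy inputs still need evaluating. With a mix
        # factor of exactly 0.0, image2's entire subtree is skipped
        # (and vice versa for 1.0).
        needed = []
        if mix_factor < 1.0 and image1 is None:
            needed.append("image1")
        if mix_factor > 0.0 and image2 is None:
            needed.append("image2")
        return needed
```

With a mix factor of 0.0 the executor would only ever resolve image1, so the whole subtree feeding image2 never runs.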

@guill
Contributor Author

guill commented Jul 19, 2023

Here's an example workflow and screenshot of using While Loop nodes to perform iterative outpaint. Note that this example does not use "For Loops" which were added later (scroll down to see that example).
[Image: IterativeOutpaintExample]

iterative_outpaint.json.txt

@guill
Contributor Author

guill commented Jul 19, 2023

Here's an example workflow and screenshot of lazy evaluation. Nodes can skip evaluating entire subtrees if they're unnecessary.
[Image: LayzEvaluationExample]
lazy_evaluation.json.txt

@guill
Contributor Author

guill commented Jul 19, 2023

Here's an example of an "all-in-one" component. To use it, put the json file in a components folder located directly inside the ComfyUI folder (next to server.py) and restart ComfyUI.

Here's what using it looks like. While it's generating, it displays a preview due to the Sampler node it expands out to. Once it has completed running, it displays the final results due to the SaveImage node contained within (PreviewImage would work the same way):
[Image: AllInOneComponentUsage]

Here's what creating it looks like. There's definitely work that can be done on the front-end to improve this:
[Image: AllInOneComponentCreation]

Here's the component itself. (Note that it has to be opened via the new LoadComponent button. It just contains the parsed graph output because I didn't want to try to interpret a LiteGraph workflow from Python.)

all_in_one.json.txt
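As a rough sketch of how this kind of component expansion could look from a node's perspective (the return shape and field names here are assumptions, not the PR's exact contract): the node hands back a subgraph of ephemeral nodes, with ids prefixed by its own unique id so repeated expansions don't collide, plus a mapping from its outputs onto the subgraph's outputs.

```python
def expand_component(unique_id, ckpt_name, prompt_text):
    """Sketch: replace an all-in-one node with an ephemeral subgraph.

    Links use the [node_id, output_index] form; node ids are prefixed
    with this node's unique_id, much like the add_graph_prefix step in
    execution.py.
    """
    p = "%s." % unique_id
    subgraph = {
        p + "loader": {
            "class_type": "CheckpointLoaderSimple",
            "inputs": {"ckpt_name": ckpt_name},
        },
        p + "encode": {
            "class_type": "CLIPTextEncode",
            "inputs": {
                "text": prompt_text,
                # Link to output 1 (CLIP) of the ephemeral loader node.
                "clip": [p + "loader", 1],
            },
        },
    }
    # The component's first output is rerouted to the encoder's output 0.
    return {"expand": subgraph, "result": ([p + "encode", 0],)}
```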

@WASasquatch
Contributor

This looks pretty nice. I'll have to get a portable going to test the PR tomorrow.

@guill
Contributor Author

guill commented Jul 19, 2023

Does anyone have a suggestion of a node group that makes heavy use of list inputs and/or outputs? That's one place where I think things work, but haven't actually tested much beyond WAS' image-to-seed/noise nodes. My two concerns are:

  1. Cases where one item in a list causes a node to expand but another returns immediately.
  2. There are a number of places where I'm checking isinstance(input, list) and len(input) == 2 to determine whether a particular input is actually a node link. Because people may directly assign values to inputs in ephemeral subgraphs, I'm concerned about misinterpreting an input. I think I may want to either create a wrapper to specify that an input is a link or create a wrapper to specify that an input is an array literal.

@ltdrdata
Collaborator

[Image: SDXL_Upscaled_00013_]
Try this workflow.
The workflow above contains a disconnected switch like this:

[Image: cause]

When an unreachable region is created through disconnection like this, execution on main runs all the executable parts and produces error messages only for the unreachable parts. In the current PR, however, there is an issue where the unreachable parts are executed before the executable parts have completed, resulting in an error.

In #731, when an unreachable part is encountered, the strategy is to gracefully interrupt only the corresponding execution path. It would be advisable to consider such an approach.

@guill
Contributor Author

guill commented Jul 20, 2023

@ltdrdata I'm not sure I understand. I ran your workflow with both that switch disconnected and connected (albeit with the default 1.5 checkpoint since I don't have SDXL yet) and saw the result I expected each time.

With the switch disconnected, the parts of the graph that could run did (properly displaying the outputs) and the parts that couldn't displayed errors about why they couldn't run.
[Image: DisconnectedSwitch]

When the switch was connected, the entire graph ran successfully.
[Image: ConnectedSwitch]

When I disconnected the switch again, I saw the same result as in the first image.

Am I misunderstanding your concern? Are you looking to explicitly make failures silent when they're due to missing required inputs rather than displaying the red outlines?

@ltdrdata
Collaborator

ltdrdata commented Jul 20, 2023

> @ltdrdata I'm not sure I understand. I ran your workflow with both that switch disconnected and connected (albeit with the default 1.5 checkpoint since I don't have SDXL yet) and saw the result I expected each time.
>
> With the switch disconnected, the parts of the graph that could run did (properly displaying the outputs) and the parts that couldn't displayed errors about why they couldn't run. [Image: DisconnectedSwitch]
>
> When the switch was connected, the entire graph ran successfully. [Image: ConnectedSwitch]
>
> When I disconnected the switch again, I saw the same result as in the first image.
>
> Am I misunderstanding your concern? Are you looking to explicitly make failures silent when they're due to missing required inputs rather than displaying the red outlines?

I conducted the experiment with the following steps:

  1. Restart ComfyUI.
  2. Loaded the Workflow.
  3. Modified the model to use only what I had.
  4. Queue prompt.

Symptoms:
Before the execution of the Ultimate SD Upscaler part, a part beyond the switch is executed, resulting in an error and halting the execution.

If it is not reproducible, that implies the execution order is non-deterministic.

@ltdrdata
Collaborator

ltdrdata commented Jul 20, 2023

I've put together a checklist for execution model migration for Workflow Component.

  • For cases other than a loop, implement cache-based minimal execution for the workflow inside the component.
  • Implement functionalities for executionOneOf, executionBlocker, executionSwitch, and executionControlString.
  • Allow execution to halt for the corresponding execution path, instead of raising an error, when a node returns None.

Once I have tested the individual sub-functionalities and confirmed that there are no issues, I can proceed with the migration task.

Previously, nodes that were connected to output nodes only via optional
inputs were never validated, but were still run. This would cause a
Python error and stop execution wherever we happen to be in the
execution list. This bug exists on master, but may be more noticeable in
this branch because execution order is non-deterministic.

Other minor change this commit introduces: `raw_link` can be specified
as an option on an input to receive the raw link (in the standard form
of [node_id, output_index]) rather than a resolved value.
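Based on the commit note above, opting into raw_link might look like the following sketch (the node class and socket names are hypothetical; only the option name comes from the commit message):

```python
class InspectLink:
    """Hypothetical node that receives the raw [node_id, output_index]
    link for an input instead of the resolved value."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                # raw_link: hand the node the link itself, unresolved.
                "any_input": ("*", {"raw_link": True}),
            }
        }

    def run(self, any_input):
        node_id, output_index = any_input  # e.g. ["5", 0]
        return (node_id, output_index)
```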
@guill
Contributor Author

guill commented Jul 20, 2023

Note for future me (or others): Reference code for the nodes @ltdrdata is mentioning is located here:

https://github.com/ltdrdata/ComfyUI-Workflow-Component/blob/27c3be78499337c26e0043e6d93bbf3678beb4e1/workflow_component/custom_nodes.py

Note that For loops still require WAS-ns. They just expand to using the
integer operation node internally.

Also adds some nodes for accumulation operations.
@guill
Contributor Author

guill commented Jul 20, 2023

Since everybody seems to want to implement For loops anyway, I've added some syntactic sugar for it. Note that they still require WAS-ns -- they just expand to the nodes in the example above (which also helped me test that loops work when expanded themselves).

Here's an example of what that looks like. This example also includes the new accumulation nodes:
[Image: ForLoops]

To demo a more complex operation that uses accumulations, the following performs pairwise differences on the above images:
[Image: PairwiseDifference]

Here's the workflow for those:
accumulation_for.json.txt
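For anyone curious how the tail-recursion trick behind these loop nodes works, here is an isolated sketch (all class names and the return shape are hypothetical, not the actual implementation): when the condition still holds, the closing node expands into one more copy of the loop body plus a fresh close node, so iteration becomes repeated graph expansion.

```python
def expand_while_close(unique_id, condition, loop_state):
    """Sketch of a WhileLoopClose-style node's expansion step."""
    if not condition:
        # Loop finished: the accumulated state becomes this node's output.
        return {"result": (loop_state,)}
    p = "%s." % unique_id
    subgraph = {
        p + "body": {
            "class_type": "LoopBody",  # hypothetical loop-body node
            "inputs": {"state": loop_state},
        },
        # Tail recursion: an ephemeral close node re-evaluates the
        # condition after the body runs, expanding again if needed.
        p + "close": {
            "class_type": "WhileLoopClose",
            "inputs": {
                "condition": [p + "body", 0],
                "loop_state": [p + "body", 1],
            },
        },
    }
    return {"expand": subgraph, "result": ([p + "close", 0],)}
```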

@ltdrdata
Collaborator

ltdrdata commented Jul 20, 2023

In my opinion, basic condition nodes need to be built in to support loops. Although the loop functionality itself seems to work well, it would be great to have a more concise way to express it.

I think it would be worth doing some research, perhaps by building a few spaghetti workflows to experiment.

@guill
Contributor Author

guill commented Jul 20, 2023

Yeah, I totally agree. I just wanted to be careful about how many new nodes I introduced, so I figured it made sense to add the minimal number needed to get things functional and then add more based on feedback once people (including myself) were actually using the thing.

@ltdrdata
Collaborator

ltdrdata commented Jul 22, 2023

error.json.txt

This issue is easier to reproduce: the first queue works fine, but the second queue spits out an error message.

[Image: Capture]

@PGadoury

PGadoury commented Jul 23, 2023

I'm trying out the custom components workflow you described; however, I'm getting the following error when I try to run my graph.

!!! Exception during processing !!!
Traceback (most recent call last):
  File "D:\ComfyUI_windows_portable\ComfyUI\execution.py", line 338, in non_recursive_execute
    new_graph, node_outputs = comfy.graph_utils.add_graph_prefix(new_graph, node_outputs, "%s.%d." % (unique_id, i))
  File "D:\ComfyUI_windows_portable\ComfyUI\comfy\graph_utils.py", line 89, in add_graph_prefix
    new_node["inputs"][input_name] = [prefix + input_value[0], input_value[1]]
TypeError: can only concatenate str (not "list") to str

(I'm including the graph that triggers this error)

Euler Single Step.json.txt

Honestly, I'm still a little concerned here. There's nothing stopping a
custom node from having a data type of ["str",int]. I've improved
recognition to at least prevent the detection of other types, but we
may still want a more systemic fix (e.g. wrapping literals within a
class when using them as inputs to nodes in subgraphs).
@guill
Contributor Author

guill commented Jul 23, 2023

Thanks for the reproduction case @PGadoury. This specific error was caused by the concern I mentioned in one of my other comments -- most of the core code differentiates links from literals by assuming that all lists are links. For now I've committed a fix that limits recognition to lists of the form ["str", int] which should at least prevent 99% of false recognition, but there may still be bugs when dealing with custom nodes.
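The committed fix can be sketched as a stricter link predicate (the helper name here is hypothetical):

```python
def is_link(value):
    """Heuristically decide whether an input value is a node link.

    Links take the form [node_id, output_index], i.e. ["str", int].
    Restricting recognition to that exact shape stops list literals
    (e.g. [1.0, 2.0]) assigned directly in ephemeral subgraphs from
    being mistaken for links, though a custom node could still collide
    by using ["str", int] as plain data.
    """
    return (
        isinstance(value, list)
        and len(value) == 2
        and isinstance(value[0], str)
        and isinstance(value[1], int)
    )
```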

Rather than displaying live feedback on the "End For" node, the feedback
will be displayed in the UI on the original node.
@PGadoury

PGadoury commented Jul 23, 2023

Your fix did indeed solve the problem, thanks!

I've been working on an implementation of the PerpNeg/Neutral Prompt latent composition method. It's making heavy use of nested custom components from your branch.

(Attached)

update:

I removed any dependencies I could find. I might have missed some literals from some other extensions, though?
I'm also including a new implementation that uses the SVD to find the principal component of all the provided (weighted) prompts in latent space without the need to base everything around a reference prompt like A1111's implementation.
If this conditional execution/flow control branch gets merged with main, I'll probably end up putting these in my own extension, but since it relies heavily on nodes in here I'll just leave these as a zip file here...

Perp neg workflow:
[Image: ComfyUI_00009_]

Perp neg (SVD) workflow:
[Image: ComfyUI_00015_]
(Note that the negative prompt applies to every provided conditional, so if the weights on a conditional are negative, you'll end up with the negative of a negative for that aspect. If you want multiple negative prompts, just add them into the accumulator as positive prompts with negative weights.)

PerpNeg.zip

@Pixelatedza

Could I chime in with a thought? Perhaps we could mimic what they do with blueprints in Unreal Engine for the control structures? https://docs.unrealengine.com/4.26/en-US/ProgrammingAndScripting/Blueprints/UserGuide/FlowControl/

@comfyanonymous
Owner

I just need to take some time to properly look at it and test it. Knowing that other people are using it without issues is very useful because it means I need to do less testing.

@DaemonAlchemist

Yes, I've been using this fork for my project and it's been working great! I've only come across one issue, which is more of an annoyance than a bug:

It would be nice if the node parameter validation was also lazy. For example, if a checkpoint loader node won't be used with the current workflow settings, I shouldn't be required to select a checkpoint in order for the workflow to validate. It can be a bit annoying to put in several "required" parameters when they really won't be needed.

@guill
Contributor Author

guill commented Nov 25, 2023

I am also happy to continue making changes/fixes in it if someone identifies something that needs to be changed.

It's been discussed a couple times on the matrix channel, but I think it would be appropriate to merge the execution model changes (i.e. everything not in the custom_nodes folder in this PR) even if we don't immediately want to add nodes for loops/etc. in the base ComfyUI until the front-end experience is improved (specifically related to "*" sockets).

Once the execution model is updated, the rest of the example use-cases in this PR can be added piecemeal as @comfyanonymous feels the UX for each one reaches an appropriate level.

In the meantime, power-users would continue being able to use those nodes as custom nodes without working off of a totally separate fork.

@BuildBackBuehler

> I am also happy to continue making changes/fixes in it if someone identifies something that needs to be changed.
>
> It's been discussed a couple times on the matrix channel, but I think it would be appropriate to merge the execution model changes (i.e. everything not in the custom_nodes folder in this PR) even if we don't immediately want to add nodes for loops/etc. in the base ComfyUI until the front-end experience is improved (specifically related to "*" sockets).
>
> Once the execution model is updated, the rest of the example use-cases in this PR can be added piecemeal as @comfyanonymous feels the UX for each one reaches an appropriate level.
>
> In the meantime, power-users would continue being able to use those nodes as custom nodes without working off of a totally separate fork.

This is one power-user who could desperately benefit from this.

I loved the enhancements so I had to give them a try. My merge worked at first, but I decided I wanted to update ComfyUI and ended up breaking everything :/. I just had to reset and try to do it all over again. It was a bit of a pain to get things working in the first place, since I was juggling a third set of files (I had to fiddle with RG3's custom node to get it up and running).

Guill, any good resources I could read regarding how to do this sorta thing? And/or is there something I'm missing as far as these files reconciling with the ComfyUI updates?

It just seems to me that I have no way of knowing whether, when I patch in #931's version of the files, unrelated tweaks/enhancements made to the master branch will (or won't) stick around. I'd like to keep up with the master branch!

@Simn

Simn commented Dec 14, 2023

I really like the inverted control from an architectural point of view. It's conceptually good and will open quite a few possibilities.

In order to make progress, I concur that the execution model changes should be factored out so that they can be tested and reviewed without all the "noise" from the other additions. That also makes it easier to test things and not be distracted by shiny objects. I know it's boring to have a PR where, ideally, nobody notices any change, but for the sake of stability this is a good thing.

@narukaze132

I second the idea of pulling in the execution model changes now. The current version of ComfyUI has drifted a lot from what this branch is based on, and as someone who has tried merging the two, I can say it's very hard; it'll only get worse with time. Pulling in the execution model changes should at least keep the version drift from becoming a problem. (Plus I know at least one custom node developer has cited this pull request as solving a lot of problems for his nodes.)

That being said, I've noticed a minor UX problem: the Save/Load Component buttons take up a lot of screen real estate, and frankly, I don't think they need to be there. I'm pretty sure you can just use the API format to save components instead, since the API format is designed to make it easy to parse the actual workflow structure. You'd need to expose the Save (API Format) button outside of developer mode, but I don't think that's a huge issue, and you'd remove the need for a separate Load Component button entirely, since the default Load button can load in API-formatted workflows just fine.

@Chaoses-Ib
Contributor

I've found another route that can partially implement these features: ComfyScript: A Python front end for ComfyUI. The basic idea is that one can generate workflows dynamically in Python. This way, control flows that don't rely on intermediate results can be implemented naturally. For control flows that do rely on intermediate results, it's still feasible to mimic them by expanding the workflow dynamically and running it multiple times. This is not user-friendly for most users, as it requires some knowledge of Python, but it could be a usable option for power users.

@ltdrdata
Collaborator

ltdrdata commented Jan 2, 2024

> I second the idea of pulling in the execution model changes now. The current version of ComfyUI has drifted a lot from what this branch is based on, and as someone who has tried merging the two, I can say it's very hard; it'll only get worse with time. Pulling in the execution model changes should at least keep the version drift from becoming a problem. (Plus I know at least one custom node developer has cited this pull request as solving a lot of problems for his nodes.)
>
> That being said, I've noticed a minor UX problem: the Save/Load Component buttons take up a lot of screen real estate, and frankly, I don't think they need to be there. I'm pretty sure you can just use the API format to save components instead, since the API format is designed to make it easy to parse the actual workflow structure. You'd need to expose the Save (API Format) button outside of developer mode, but I don't think that's a huge issue, and you'd remove the need for a separate Load Component button entirely, since the default Load button can load in API-formatted workflows just fine.

You can ignore the Save/Load Component buttons. They're just a PoC and will be removed when this is merged.

@comfyanonymous
Owner

There seems to be a lot of interest in this so if you split out the execution part I will merge it assuming it doesn't break anything.

@longgui0318

Very much looking forward to this PR being merged into master. @guill, can we continue discussing plans for how to merge it in?

@guill
Contributor Author

guill commented Jan 25, 2024

Awesome! Going through a major launch at work-work this week, but this weekend I should have time to merge main back into this branch and get the core pieces separated out into a separate branch 👍

@ltdrdata
Collaborator

> Awesome! Going through a major launch at work-work this week, but this weekend I should have time to merge main back into this branch and get the core pieces separated out into a separate branch 👍

Finally!!!

@Trung0246

Trung0246 commented Jan 26, 2024

[Image: workflow_loop_demo]

Here's my implementation, which works right now by abusing internal machinery, for anyone who cannot wait. Once this gets merged, I'll probably change the code to support it :)

@Trung0246

Trung0246 commented Jan 27, 2024

Also @guill, I think the current mechanism of introducing a new lazy key for INPUT_TYPES is not flexible enough for dynamic input insertion at runtime (litegraph on the js side supports adding extra inputs/outputs). I'm not sure how to support an unknown input count: the example nodes use several constants just to expand inputs, which means users have to manually increase those constants by modifying the source code, and that's not ideal.

Maybe one solution is to pass the current node instance id as an optional argument to INPUT_TYPES, or to introduce something like a get_inputs callback that is consulted in TopologicalSort.add_node alongside the lazy check, similar to how check_lazy_status was added. One way to implement this would be to call get_inputs in TopologicalSort.get_input_info.

@guill
Contributor Author

guill commented Jan 27, 2024

@Trung0246 Yeah, there are quite a few issues with dynamic node typing (both with and without this PR). At least last time I discussed it with people in Element, there was a desire to take a more holistic look at both dynamically changing the number of inputs and dynamically changing the types of inputs.

Let me know if I'm missing something, but I don't think this PR brings us any further away from supporting those two cases, so it should be ok to make that as a separate change after the execution model is updated.

@Trung0246

Trung0246 commented Jan 28, 2024

@guill sounds good. But after reviewing the code, I'm not sure whether the executing event will be triggered for the dynamic prompt dict returned by nodes such as WhileLoopClose, and therefore whether onExecuted will be called on the js side for things such as logging and displaying text. I did see the node id being reused, so maybe that covers it.

@guill
Contributor Author

guill commented Jan 29, 2024

Closing this PR as I have separated out the core changes to ComfyUI to a new PR: #2666

The custom nodes from this PR have been relocated to https://github.com/BadCafeCode/execution-inversion-demo-comfyui

Note that you will still have to manually enable "*" socket types to be able to use some of the nodes from that custom node pack.

If you have been using this branch locally, I would appreciate your assistance in testing that new PR. While it passes all unit tests, I haven't had a chance to experiment with SDXL or video myself.
