Skip to content

graph.py refactoring and If node optimization#9030

Merged
lstein merged 24 commits intoinvoke-ai:mainfrom
JPPhoto:if-node-optimization
Apr 10, 2026
Merged

graph.py refactoring and If node optimization#9030
lstein merged 24 commits intoinvoke-ai:mainfrom
JPPhoto:if-node-optimization

Conversation

@JPPhoto
Copy link
Copy Markdown
Collaborator

@JPPhoto JPPhoto commented Apr 8, 2026

Summary

Feature and refactor.

This PR changes IfInvocation execution from eager branch evaluation to lazy branch selection. The scheduler now resolves the condition first, executes only the selected branch, and skips branch-exclusive nodes on the unselected side while still executing shared ancestors that are needed elsewhere. This avoids wasted work in graphs with expensive true/false subgraphs.

The PR also adds targeted graph-execution tests for simple, shared-ancestor, and iterated If cases, and refactors graph.py into smaller internal helpers to make the scheduling and validation logic easier to follow. The shared graph design README was updated to match the current runtime behavior.

Related Issues / Discussions

QA Instructions

Run:

  • pytest tests/test_graph_execution_state.py
  • pytest tests/test_node_graph.py
  • pytest tests/test_session_queue.py

For manual review, inspect an If graph where both branches have upstream work. Confirm that only the selected branch executes and that shared ancestors still execute when needed by another live path.

Merge Plan

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

@github-actions github-actions bot added python PRs that change python files services PRs that change app services python-tests PRs that change python tests labels Apr 8, 2026
@JPPhoto JPPhoto added the backend PRs that change backend files label Apr 8, 2026
@JPPhoto JPPhoto requested a review from Pfannkuchensack as a code owner April 8, 2026 14:34
@github-actions github-actions bot added the frontend PRs that change frontend files label Apr 8, 2026
@JPPhoto JPPhoto changed the title If node optimization graph.py refactoring and If node optimization Apr 9, 2026
@JPPhoto JPPhoto added the docs PRs that change docs label Apr 9, 2026
@JPPhoto JPPhoto requested a review from joshistoast April 9, 2026 15:00
@JPPhoto JPPhoto force-pushed the if-node-optimization branch 3 times, most recently from 88c7e7c to 0098df4 Compare April 9, 2026 22:33
@lstein lstein self-assigned this Apr 10, 2026
@lstein lstein added the v6.13.x label Apr 10, 2026
@lstein lstein moved this to 6.13.x Theme: MODELS in Invoke - Community Roadmap Apr 10, 2026
@lstein
Copy link
Copy Markdown
Collaborator

lstein commented Apr 10, 2026

Code Review: if-node-optimization — Lazy IfInvocation Branch Evaluation

Files reviewed: invokeai/app/services/shared/graph.py, tests/test_graph_execution_state.py, invokeai/app/services/shared/README.md, invokeai/frontend/web/src/services/api/schema.ts


Test Results

Both test suites pass cleanly:

tests/test_graph_execution_state.py  — 29 passed, 2 xfailed
tests/test_node_graph.py             — 60 passed

The two xfailed tests are intentional: they document the old eager-evaluation behavior that this PR removes. Ruff reports no lint issues.


Overall Assessment

This is a well-structured refactor. The monolithic GraphExecutionState has been decomposed into four focused internal helpers with clear responsibilities. The lazy If semantics are implemented correctly — I traced the full execution flow and the invariants hold. The README is thorough and accurate.

The issues below are mostly minor, with one bug in a dead code path and a couple of real gaps in test coverage.


Findings

Important

1. mark_exec_node_skipped does not propagate to executed_history

graph.py, line 249–258

When an exec node is skipped, its source node is added to self._state.executed (if all prepared copies are done), but it is never appended to executed_history. Compare with _mark_source_node_complete (line 567–573) which explicitly calls executed_history.append(source_node_id). If any downstream consumer depends on executed_history to discover which source nodes have run, skipped nodes will be invisible.

def mark_exec_node_skipped(self, exec_node_id: str) -> None:
    self._state._remove_from_ready_queues(exec_node_id)
    self._state._set_prepared_exec_state(exec_node_id, "skipped")
    self._state.executed.add(exec_node_id)

    registry = self._state._prepared_registry()
    source_node_id = registry.get_source_node_id(exec_node_id)
    prepared_nodes = registry.get_prepared_ids(source_node_id)
    if all(n in self._state.executed for n in prepared_nodes):
        self._state.executed.add(source_node_id)
        # Missing: self._state.executed_history.append(source_node_id)

2. Two dead wrapper methods on GraphExecutionState

graph.py, lines 1772 and 1781

_get_if_branch_exclusive_sources and _mark_exec_node_skipped are defined on GraphExecutionState but called nowhere (verified with grep -rn). The analogous calls go directly through self._if_scheduler(). These wrappers are dead code that add noise. They should either be removed or — if they are intended as extension points — documented with a comment explaining their purpose.

3. Sentinel string mixed into typed return value

graph.py, line 1572

_get_collector_input_root_type_from_resolved_types is declared as returning Any | None but returns the literal string "multiple" to signal an error. The caller checks if input_root_type == "multiple". This is an anti-pattern: the return type is a lie and any caller that calls this helper independently will silently mishandle the sentinel.

The fix is either to return a tuple[str | None, Any | None] or (simpler) raise a dedicated exception or use Literal["multiple"] | type | None in the annotation.


Suggestions

4. _PreparedExecRegistry.get_iteration_path creates a throwaway object

graph.py, line 117

return self._metadata.get(exec_node_id, _PreparedExecNodeMetadata(source_node_id="")).iteration_path

This allocates and immediately discards a _PreparedExecNodeMetadata dataclass on every cache miss. A cleaner equivalent:

m = self._metadata.get(exec_node_id)
return m.iteration_path if m is not None else None

5. state: str in _PreparedExecNodeMetadata should use Literal

graph.py, line 74

The state field accepts any string but only "pending", "ready", "executed", and "skipped" are ever written. Using Literal["pending", "ready", "executed", "skipped"] would make the contract explicit and allow type checkers to catch typos.

6. _apply_branch_resolution iterates prepared_source_mapping.items() under mutation

graph.py, lines 207–215

The loop iterates self._state.prepared_source_mapping.items() and calls mark_exec_node_skipped in the body. That method adds to self._state.executed (a separate set) and modifies self._state._prepared_exec_metadata (a separate dict), so prepared_source_mapping itself is never structurally changed during the loop. This is safe, but it is non-obvious. A brief comment would help the next reader.

7. get_next_node uses recursion where a loop would be clearer

graph.py, lines 615–636

_ExecutionScheduler.get_next_node recurses into itself after setting _active_class. Since the call depth is bounded (at most 2 levels — once to set class, once to drain), there is no stack risk, but the pattern is unusual for a scheduler. A while True with continue would make the logic linear and easier to read.


Missing Test Coverage

8. No test for an IfInvocation with only one branch wired

A graph where condition=False but false_input has no input edge (so the selected branch is None) is handled correctly by the code — the if-node gets enqueued when the condition arrives — but there is no test exercising this through GraphExecutionState.

9. No test for two independent IfInvocation nodes in one graph

The is_deferred_by_unresolved_if method iterates all IfInvocation nodes and checks each one against the candidate node's exclusive-source membership. A graph with two independent If nodes — where a branch-exclusive node under If_A should not be blocked by the unrelated If_B — is not tested. The code is correct (the membership check prevents false positives), but a test would lock in that behavior.

10. No test for nested IfInvocation (If inside a branch of an outer If)

The deferral logic handles this transitively: inner branch nodes are blocked by the inner if-node being unresolved, and the inner if-node itself is blocked by the outer branch not having been released yet. No test covers this path.


Positive Observations

  • The decomposition into _ExecutionMaterializer, _ExecutionScheduler, _ExecutionRuntime, _IfBranchScheduler, and _PreparedExecRegistry is clean and each class has a single clear responsibility. The lazy-init PrivateAttr pattern for the helper objects keeps serialization unaffected.

  • The fixpoint algorithm in _prune_nonexclusive_branch_nodes is correctly implemented and well-named.

  • Using xfail(strict=True) to document the old eager behavior as tests that must fail is an excellent pattern for documenting a behavioral change.

  • The README update in section 4.7 accurately describes the new semantics and is a helpful addition to the codebase documentation.


Summary

Severity Count Description
Important 3 Missing executed_history for skipped nodes; dead wrapper methods; sentinel return value
Suggestion 4 Minor style/clarity improvements
Missing tests 3 One-sided branch, two independent Ifs, nested Ifs
Test results 89 passed, 2 xfailed (intentional)

🤖 Generated with Claude Code

@JPPhoto JPPhoto force-pushed the if-node-optimization branch from 0098df4 to b31ef6d Compare April 10, 2026 00:41
@JPPhoto
Copy link
Copy Markdown
Collaborator Author

JPPhoto commented Apr 10, 2026

The get_next_node() issue looks like it's been there for a while. The rest of the issues are valid and I'll address them (plus the prior issue).

@JPPhoto
Copy link
Copy Markdown
Collaborator Author

JPPhoto commented Apr 10, 2026

@lstein I believe all of that feedback has been addressed.

Copy link
Copy Markdown
Collaborator

@lstein lstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@lstein lstein merged commit d4104be into invoke-ai:main Apr 10, 2026
13 checks passed
@JPPhoto JPPhoto deleted the if-node-optimization branch April 10, 2026 01:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend PRs that change backend files docs PRs that change docs frontend PRs that change frontend files python PRs that change python files python-tests PRs that change python tests services PRs that change app services v6.13.x

Projects

Status: 6.13.x Theme: MODELS

Development

Successfully merging this pull request may close these issues.

2 participants