Contributor

@Kyle0654 Kyle0654 commented Dec 1, 2022

This PR adds the core of the node-based invocation system first discussed in https://github.com/invoke-ai/InvokeAI/discussions/597 and implements it through a basic CLI and API. This supersedes #1047, which was too far behind to rebase.

Architecture

Invocations

The core of the new system is invocations, found in /ldm/invoke/app/invocations. These represent individual nodes of execution, each with inputs and outputs. Core invocations are already implemented (txt2img, img2img, upscale, face_restore) as well as a debug invocation (show_image). To implement a new invocation, all that is required is to add a new implementation in this folder (there is a markdown document describing the specifics, though it is slightly out-of-date).
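As an illustration of the shape of an invocation (inputs as fields, a single `invoke` call that returns an output object), here is a minimal sketch. It uses standard-library dataclasses so that it runs on its own. The names `BaseInvocation`, `ImageOutput`, `UpscaleInvocation`, `register` and `INVOCATIONS` are assumptions made for this example and are not taken from the code in this PR.

```python
from dataclasses import dataclass, fields
from typing import Dict, Type

# Hypothetical registry, for illustration only; the real package may
# discover invocations in a different way.
INVOCATIONS: Dict[str, Type["BaseInvocation"]] = {}


def register(cls):
    """Record an invocation class under its `type` name."""
    INVOCATIONS[cls.type] = cls
    return cls


@dataclass
class ImageOutput:
    """Output of a node that produces an image, referenced by name."""
    image_name: str


@dataclass
class BaseInvocation:
    """A node of execution. Its dataclass fields are its inputs."""
    id: str
    type = "base"  # class-level tag (no annotation, so not an input field)

    def invoke(self) -> object:
        raise NotImplementedError


@register
@dataclass
class UpscaleInvocation(BaseInvocation):
    """Stand-in for an upscale node (assumed inputs: image_name, level)."""
    type = "upscale"
    image_name: str = ""
    level: int = 2

    def invoke(self) -> ImageOutput:
        # A real node would run an upscaler here; this only derives a new name.
        return ImageOutput(image_name=f"{self.image_name}@x{self.level}")


node = INVOCATIONS["upscale"](id="n1", image_name="cat.png", level=4)
result = node.invoke()
input_names = [f.name for f in fields(node)]
```

Because the inputs are ordinary typed fields, a CLI or an API schema can be generated from them by introspection, which is the property the apps described later rely on.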

Sessions

Invocations and links between them are maintained in a session. These can be queued for invocation (either the next ready node, or all nodes). Some notes:

  • Nodes and links may be added to a session at any time (including after invocation), but what is already in the session may not be modified.
  • Links are always added with a node, and are always links from existing nodes to the new node. These links can be relative "history" links, e.g. -1 to link from a previously executed node, and can link either specific outputs, or can opportunistically link all matching outputs by name and type by using *.
  • There are no iteration/looping constructs. Most needs for this could be solved by either duplicating nodes or cloning sessions. This is open for discussion, but is a difficult problem to solve in a way that doesn't make the code even more complex/confusing (especially regarding node ids and history).
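The linking rules above can be sketched roughly as follows. This is a simplified model and not the session code in this PR: `Node`, `Session` and `add_node` are hypothetical names, and the relative index is resolved against the order in which nodes were added, whereas the real history links refer to previously executed nodes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    id: str
    inputs: Dict[str, type]   # input field name -> expected type
    outputs: Dict[str, type]  # output field name -> produced type


@dataclass
class Session:
    nodes: List[Node] = field(default_factory=list)
    # Each link is (from_node_id, from_field, to_node_id, to_field).
    links: List[Tuple[str, str, str, str]] = field(default_factory=list)

    def add_node(self, node: Node, link_from: Optional[int] = None,
                 link_fields="*") -> None:
        """Append a node, optionally linking it from an earlier node.

        link_from is a relative history index (-1 is the most recent node).
        link_fields is "*" to link every output that matches an input by
        name and type, or a list of (from_field, to_field) pairs.
        """
        if any(n.id == node.id for n in self.nodes):
            # Append-only: existing nodes are never replaced or edited.
            raise ValueError(f"node {node.id!r} already exists")
        if link_from is not None:
            # Resolved before the new node is appended, so -1 means
            # "the node added just before this one".
            source = self.nodes[link_from]
            if link_fields == "*":
                pairs = [(name, name) for name, typ in source.outputs.items()
                         if node.inputs.get(name) is typ]
            else:
                pairs = list(link_fields)
            self.links += [(source.id, f, node.id, t) for f, t in pairs]
        self.nodes.append(node)


session = Session()
session.add_node(Node("a", {}, {"image": bytes, "seed": int}))
session.add_node(
    Node("b", {"image": bytes, "strength": float}, {"image": bytes}),
    link_from=-1,
)
```

Here `*` links only `image`, because `seed` has no input of the same name and type on node `b`.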

Services

These make up the core of the invocation system, found in /ldm/invoke/app/services. One of the key design philosophies here is that most components should be replaceable when possible. For example, if someone wants to use cloud storage for their images, they should be able to replace the image storage service easily.

The services are broken down as follows (several of these are intentionally implemented with an initial simple/naïve approach):

  • Invoker: Responsible for creating and executing sessions and managing services used to do so.
  • Session Manager: Manages session history. An on-disk implementation is provided, which stores sessions as JSON files on disk and caches recently used sessions for quick access.
  • Image Storage: Stores images of multiple types. An on-disk implementation is provided, which stores images on disk and retains recently used images in an in-memory cache.
  • Invocation Queue: Used to queue invocations for execution. An in-memory implementation is provided.
  • Events: An event system, primarily used with socket.io to support future web UI integration.
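To make the "replaceable services" idea concrete, here is a minimal sketch of an image storage interface with two interchangeable implementations: one on disk with a small in-memory cache of recently used images, and one purely in memory. The class and method names (`ImageStorageBase`, `DiskImageStorage`, `MemoryImageStorage`, `save`, `get`) are assumptions for this example and may not match the services in this PR.

```python
import tempfile
from abc import ABC, abstractmethod
from collections import OrderedDict
from pathlib import Path


class ImageStorageBase(ABC):
    """Interface the rest of the system would depend on (hypothetical)."""

    @abstractmethod
    def save(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, name: str) -> bytes: ...


class DiskImageStorage(ImageStorageBase):
    """Stores images as files and keeps the most recently used in memory."""

    def __init__(self, root: Path, cache_size: int = 2):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.cache_size = cache_size
        self._cache: "OrderedDict[str, bytes]" = OrderedDict()

    def _remember(self, name: str, data: bytes) -> None:
        self._cache[name] = data
        self._cache.move_to_end(name)
        while len(self._cache) > self.cache_size:
            self._cache.popitem(last=False)  # evict the least recently used

    def save(self, name: str, data: bytes) -> None:
        (self.root / name).write_bytes(data)
        self._remember(name, data)

    def get(self, name: str) -> bytes:
        if name in self._cache:
            self._cache.move_to_end(name)
            return self._cache[name]
        data = (self.root / name).read_bytes()
        self._remember(name, data)
        return data


class MemoryImageStorage(ImageStorageBase):
    """Drop-in replacement, e.g. for tests; a cloud backend would fit the same slot."""

    def __init__(self):
        self._images = {}

    def save(self, name: str, data: bytes) -> None:
        self._images[name] = data

    def get(self, name: str) -> bytes:
        return self._images[name]


def roundtrip(storage: ImageStorageBase) -> bytes:
    # Callers only see the interface, so either implementation works here.
    storage.save("a.png", b"A")
    return storage.get("a.png")


with tempfile.TemporaryDirectory() as tmp:
    disk_result = roundtrip(DiskImageStorage(Path(tmp)))
memory_result = roundtrip(MemoryImageStorage())
```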

Apps

Apps are available through the /scripts/invoke-new.py script (to be integrated/renamed).

CLI

python scripts/invoke-new.py

Implements a simple CLI. The CLI creates a single session and automatically links each node's inputs to the previous node's outputs. Commands are generated automatically from all invocations, and each command's options are generated from that invocation's inputs. Help is also available for the CLI and for each command, and is very verbose. Additionally, the CLI supports command piping for single-line entry of multiple commands. Example:

> txt2img --prompt "a cat eating sushi" --steps 20 --seed 1234 | upscale | show_image
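For readers wondering how a piped line like the one above breaks down into commands, here is a rough sketch using only the standard library. It is not the parser used by the CLI in this PR; `parse_pipeline` and `parse_command` are hypothetical helpers, options are assumed to be `--name value` pairs, and all values are left as strings.

```python
import shlex
from typing import Dict, List, Tuple

Command = Tuple[str, Dict[str, str]]


def parse_command(tokens: List[str]) -> Command:
    """Turn ['upscale', '--level', '2'] into ('upscale', {'level': '2'})."""
    if not tokens:
        raise ValueError("empty command in pipeline")
    name, rest = tokens[0], tokens[1:]
    options: Dict[str, str] = {}
    i = 0
    while i < len(rest):
        key = rest[i]
        if not key.startswith("--") or i + 1 >= len(rest):
            raise ValueError(f"expected '--option value', got {key!r}")
        options[key[2:]] = rest[i + 1]
        i += 2
    return name, options


def parse_pipeline(line: str) -> List[Command]:
    """Split a line on '|' into commands, honoring quoted arguments.

    Limitation of this sketch: '|' must be surrounded by whitespace, and a
    quoted argument that is exactly "|" would also be read as a separator.
    """
    commands: List[List[str]] = [[]]
    for token in shlex.split(line):
        if token == "|":
            commands.append([])
        else:
            commands[-1].append(token)
    return [parse_command(tokens) for tokens in commands]


pipeline = parse_pipeline(
    'txt2img --prompt "a cat eating sushi" --steps 20 --seed 1234 | upscale | show_image'
)
```

Each parsed command would then become a node in the session, linked from the node before it.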

API

python scripts/invoke-new.py --api --host 0.0.0.0

Implements an API using FastAPI with Socket.io support for signaling. API documentation is available at http://localhost:9090/docs or http://localhost:9090/redoc. This includes an OpenAPI schema for all available invocations, session interaction APIs, and image APIs. Socket.io signals are per-session, and can be subscribed to by session id. These aren't currently auto-documented, though the code for event emission is centralized in /ldm/invoke/app/services/events.py.

A very simple test HTML page and script are available at http://localhost:9090/static/test.html. This demonstrates creating a session from a graph, invoking it, and receiving signals from Socket.io.

What's left?

  • There are a number of features not currently covered by invocations. I kept the set of invocations small during core development in order to simplify refactoring as I went. Now that the invocation code has stabilized, I'd love some help filling those out!
  • There's no image metadata generated. It would be fairly straightforward (and would make good sense) to serialize either a session and node reference into an image, or the entire node into the image. There are a lot of questions to answer around source images, linked images, etc. though. This history is all stored in the session as well, and with complex sessions, the metadata in an image may lose its value. This needs some further discussion.
  • We need a list of features (both current and future) that would be difficult to implement without looping constructs so we can have a good conversation around it. I'm really hoping we can avoid needing looping/iteration in the graph execution, since it'll necessitate separating an execution of a graph into its own concept/system, and will further complicate the system.
  • The API likely needs further filling out to support the UI. I think using the new API for the current UI is possible, and potentially interesting, since it could work like the new/demo CLI in a "single operation at a time" workflow. I don't know how compatible that will be with our UI goals, but it would be nice to support only a single API.
  • Deeper separation of systems. I intentionally tried to not touch Generate or other systems too much, but a lot could be gained by breaking those apart. Even breaking apart Args into two pieces (command line arguments and the parser for the current CLI) would make it easier to maintain. This is probably in the future though.

Member

@ebr ebr left a comment

All looks good and nothing looks broken from the package perspective. Items below can be fixed now or later:

  • environments-and-requirements can be nuked - no longer in use
  • pyproject.toml is looking good except pytest and pytest-cov are already set in the optional test block, so can be removed from the main dependencies list
  • .pytest.ini and .coveragerc - these should be defined in pyproject.toml, but that's not critical and can be fixed anytime
  • static/dream_web - this may have crept in from a past commit - pretty sure it's obsolete
  • scripts/invoke_new.py - we could define a proper entrypoint into ldm/invoke/app/cli_app for this, but not a blocker either

@blessedcoolant
Collaborator

Manually had to install python-multipart even though I had fastapi installed. Not sure if it was some bug on my end or if python-multipart needs to be a part of dependencies.

@Kyle0654
Contributor Author

  • environments-and-requirements can be nuked - no longer in use

Done

  • pyproject.toml is looking good except pytest and pytest-cov are already set in the optional test block, so can be removed from the main dependencies list

Fixed

  • .pytest.ini and .coveragerc - these should be defined in pyproject.toml, but that's not critical and can be fixed anytime

Not sure where/how to add these or I would. If you'd like to tell me where, I can fix it up now. Also, I'm pretty sure Python 3.10 is now required as a minimum due to some Pydantic type-hinting.

  • static/dream_web - this may have crept in from a past commit - pretty sure it's obsolete

There's a test.html in there meant to illustrate new API usage for frontend devs. We can delete it once we've rebuilt the UI.

  • scripts/invoke_new.py - we could define a proper entrypoint into ldm/invoke/app/cli_app for this, but not a blocker either

I figured we'd want to rearrange this and/or replace the old one with this (though this has existed since invoke.py was around, so I don't actually know how to do that).

@Kyle0654
Contributor Author

Manually had to install python-multipart even though I had fastapi installed. Not sure if it was some bug on my end or if python-multipart needs to be a part of dependencies.

Hrm... I appear to have it installed in my environment. Doesn't hurt to add it to the list I guess.

author Kyle Schouviller <kyle0654@hotmail.com> 1669872800 -0800
committer Kyle Schouviller <kyle0654@hotmail.com> 1676240900 -0800

Adding base node architecture

Fix type annotation errors

Runs and generates, but breaks in saving session

Fix default model value setting. Fix deprecation warning.

Fixed node api

Adding markdown docs

Simplifying Generate construction in apps

[nodes] A few minor changes (#2510)

* Pin api-related requirements

* Remove confusing extra CORS origins list

* Adds response models for HTTP 200

[nodes] Adding graph_execution_state to soon replace session. Adding tests with pytest.

Minor typing fixes

[nodes] Fix some small output query hookups

[node] Fixing some additional typing issues

[nodes] Move and expand graph code. Add base item storage and sqlite implementation.

Update startup to match new code

[nodes] Add callbacks to item storage

[nodes] Adding an InvocationContext object to use for invocations to provide easier extensibility

[nodes] New execution model that handles iteration

[nodes] Fixing the CLI

[nodes] Adding a note to the CLI

[nodes] Split processing thread into separate service

[node] Add error message on node processing failure

Removing old files and duplicated packages

Adding python-multipart
@blessedcoolant blessedcoolant merged commit c22d529 into main Feb 25, 2023

4 participants