
Conversation

@psychedelicious
Contributor

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update
  • Community Node Submission

Have you discussed this change with the InvokeAI team?

  • Yes
  • No, because:

Have you updated all relevant documentation?

  • Yes
  • No, n/a

Description

When a queue item is popped for processing, we need to retrieve its session from the DB. Pydantic parses and validates the stored graph at this stage.

It's possible for a graph to have been made invalid during the graph preparation stage (e.g. an ancestor node executes, and its output is not valid for its successor node's input field).

When this occurs, the session in the DB will fail validation, but we don't have a chance to find out until it is retrieved and parsed by pydantic.
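As a minimal illustration of why the failure surfaces only at retrieval time (the model here is a hypothetical stand-in, not InvokeAI's actual graph classes):

```python
# Hypothetical stand-in model, not InvokeAI's actual graph classes.
# The invalid value sits quietly in the DB as JSON; pydantic only raises
# when the stored JSON is parsed back into the model on retrieval.
from pydantic import BaseModel, ValidationError

class NoiseNode(BaseModel):
    width: int  # the real field carries additional constraints

stored_json = '{"width": "not a number"}'  # what an invalid stored session might hold

try:
    NoiseNode.parse_raw(stored_json)  # pydantic v1-style parse, as in this PR's era
except ValidationError as e:
    print(f"retrieval failed: {type(e).__name__}")
```

Writing the bad value succeeds; only the read back into a model raises.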

This logic was previously not wrapped in any exception handling.

Just after retrieving a session, we retrieve the specific invocation to execute from the session. This could also fail for some reason, though it should be impossible for it to be a pydantic validation error (that would have been caught during session validation). There was no exception handling here either.

When either of these steps fails, the processor gets soft-locked because the processor's cleanup logic is never run. (I didn't dig deeper into exactly what cleanup is not happening, because the fix is to just handle the exceptions.)

This PR adds exception handling around both session retrieval and invocation retrieval, with an event for each failure mode: `session_retrieval_error` and `invocation_retrieval_error`.

These events are caught and displayed in the UI as toasts, along with the type of the Python exception (e.g. `Validation Error`). The events are also logged to the browser console.
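The shape of the fix can be sketched like this (a hypothetical, simplified version — the function and class names are illustrative, not InvokeAI's exact API):

```python
class EventBus:
    """Stand-in for the event service; records emitted events in order."""
    def __init__(self):
        self.events = []

    def emit(self, name, **payload):
        self.events.append((name, payload))


def process_queue_item(get_session, get_invocation, events):
    """One iteration of the processor loop, with the added exception handling."""
    try:
        session = get_session()
    except Exception as e:
        # e.g. a pydantic ValidationError while parsing the stored session
        events.emit("session_retrieval_error",
                    error_type=type(e).__name__, error=str(e))
        return None

    try:
        return get_invocation(session)
    except Exception as e:
        events.emit("invocation_retrieval_error",
                    error_type=type(e).__name__, error=str(e))
        return None


# Usage: a session whose stored graph fails validation on retrieval.
bus = EventBus()

def failing_get_session():
    raise ValueError("width must be a multiple of 8")

result = process_queue_item(failing_get_session, lambda s: s, bus)
print(bus.events[0][0])  # session_retrieval_error
```

The key point is that both failure modes return control to the loop after emitting an event, rather than letting the exception escape and skip the cleanup logic.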

Related Tickets & Documents

Closes #3860, #3412

QA Instructions, Screenshots, Recordings

Create a valid graph that will become invalid during execution. Here's an example:
(screenshot of an example graph)

This is valid before execution, but the width field of the Noise node will end up with an invalid value (0). Previously, this would soft-lock the app and you'd have to restart it.

Now, with this graph, you will get an error toast, and the app will not get locked up.

Added/updated tests?

  • Yes (ish)
  • No

@Kyle0654 @brandonrising
It seems that because the processor runs in its own thread, pytest cannot catch exceptions raised in the processor.

I added a test that does work, insofar as it does recreate the issue. But, because the exception occurs in a separate thread, the test doesn't see it. The result is that the test passes even without the fix.

So when running the test, we see the exception:

Exception in thread invoker_processor:
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/bat/Documents/Code/InvokeAI/invokeai/app/services/processor.py", line 50, in __process
    self.__invoker.services.graph_execution_manager.get(
  File "/home/bat/Documents/Code/InvokeAI/invokeai/app/services/sqlite.py", line 79, in get
    return self._parse_item(result[0])
  File "/home/bat/Documents/Code/InvokeAI/invokeai/app/services/sqlite.py", line 52, in _parse_item
    return parse_raw_as(item_type, item)
  File "pydantic/tools.py", line 82, in pydantic.tools.parse_raw_as
  File "pydantic/tools.py", line 38, in pydantic.tools.parse_obj_as
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__

But pytest doesn't actually see it as an exception. Not sure how to fix this, it's a bit beyond me.
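One possible approach (an assumption on my part, not something this PR implements): since Python 3.8, `threading.excepthook` can be replaced to capture exceptions escaping worker threads, which a test could then assert on:

```python
# Sketch: capture an exception raised in a worker thread so a test can see it.
# This relies only on the stdlib threading.excepthook hook (Python 3.8+).
import threading

def run_and_capture(target):
    """Run `target` in a thread and return the exception it raised, if any."""
    captured = []
    original_hook = threading.excepthook

    def hook(args):
        # args.exc_value is the exception instance that escaped the thread
        captured.append(args.exc_value)

    threading.excepthook = hook
    try:
        t = threading.Thread(target=target, name="invoker_processor")
        t.start()
        t.join()
    finally:
        threading.excepthook = original_hook  # always restore the default
    return captured[0] if captured else None


def boom():
    raise RuntimeError("validation failed in processor thread")

exc = run_and_capture(boom)
print(type(exc).__name__)  # RuntimeError
```

A test could then `assert isinstance(exc, SomeExpectedError)` instead of passing silently while the traceback goes to stderr.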

[optional] Are there any post deployment tasks we need to perform?

nope don't think so

@Kyle0654 Kyle0654 left a comment

LGTM.

Only thing I'd have concern about is exposing full stack traces to the client. Probably fine for now, but you may want to make that optional via configuration.

@psychedelicious
Contributor Author

I've just removed the test because the issue is lack of exception handling at a high level in the processor loop, as opposed to any deeper internal logic. Don't think we really need to test this.

@blessedcoolant blessedcoolant enabled auto-merge July 24, 2023 08:12
@blessedcoolant blessedcoolant merged commit d42c394 into main Jul 24, 2023
@blessedcoolant blessedcoolant deleted the feat/fix-soft-locks branch July 24, 2023 08:17


Development

Successfully merging this pull request may close these issues.

[bug]: soft lock on crash related to number being not a multiple of 8
