Handle multi-leaf state in cron workflows

## The Problem

When a cron job runs, it takes the state from the previous run as its input.

We recently changed the cron behaviour in Lightning: instead of using the state of the first job, we now give users the option to pick which step's state to use, defaulting to the final state.

So the workflow runs, returns `{}`, and then `{}` is passed as the input to the next run.

Final state works a bit differently for workflows with multiple leaf nodes. Instead of returning a single state, it returns an object with the state for each leaf step, like: `{ 'step-a': { x: 1 }, 'step-b': { x: 2 } }`. This is also true for an empty workflow.

It's a bit hard to wrap your head around, but I think what happens is this:

first run input: `{}`
first run output: `{ a: {}, b: {} }`
second run input: `{ a: {}, b: {} }`
second run output:
```
{
  a: { a: {}, b: {} },
  b: { a: {}, b: {} }
}
```
This happens because each step just returns its input, and we return one state object per leaf.
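
To make the growth concrete, here is a minimal sketch that simulates the loop described above. It is not runtime code: `passThrough`, `runWorkflow` and `sizeOf` are hypothetical names, and the model assumes two leaves that each return their input unchanged.

```typescript
type State = Record<string, unknown>;

// Hypothetical stand-in for a no-code step: it returns its input unchanged.
const passThrough = (input: State): State => input;

// Hypothetical model of a multi-leaf run: the final state is keyed by leaf id.
function runWorkflow(input: State, leafIds: string[]): State {
  const finalState: State = {};
  for (const id of leafIds) {
    finalState[id] = passThrough(input);
  }
  return finalState;
}

// Serialized length is a rough proxy for the size of the dataclip.
const sizeOf = (state: State): number => JSON.stringify(state).length;

let state: State = {};
const sizes: number[] = [];
for (let run = 1; run <= 5; run++) {
  // Each cron run receives the previous run's final state as its input.
  state = runWorkflow(state, ['a', 'b']);
  sizes.push(sizeOf(state));
}

console.log(sizes); // [15, 41, 93, 197, 405]
```

With two leaves the serialized state slightly more than doubles on every run, so the growth is exponential in the number of cron runs.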

That leads to this sort of thing in production:

<img width="675" height="513" alt="Image" src="https://github.com/user-attachments/assets/77bf1917-af94-4e27-b946-3d74752d0ae4" />

Eventually, the state object gets huge, and it costs a ton of memory and processing to ship the dataclip between the worker and the app, resulting in performance degradation and even server crashes.

An aggravating factor is that AI-generated cron workflows with no code and multiple leaves (which is common!) will run indefinitely on the platform, slowly building up bigger and bigger state objects until things start blowing up.

## The temporary fix

In #1371 I added a rough fix which identifies empty state returned by a leaf node, and removes it from the final state object. So an empty workflow returns `{}` as its final state. This should neutralise the problem.
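
For illustration only, the idea of that fix could be sketched like this. This is not the code from #1371: `isEmptyState` and `pruneEmptyLeaves` are hypothetical names, and treating both `{}` and `{ data: {} }` as "empty" is an assumption.

```typescript
type State = Record<string, unknown>;

// Assumption: a leaf state is "empty" if it is {} or just { data: {} }.
function isEmptyState(state: State): boolean {
  const keys = Object.keys(state);
  if (keys.length === 0) return true;
  if (keys.length === 1 && keys[0] === 'data') {
    const data = state['data'];
    return (
      typeof data === 'object' &&
      data !== null &&
      !Array.isArray(data) &&
      Object.keys(data).length === 0
    );
  }
  return false;
}

// Hypothetical helper: drop empty leaf states from the keyed final state.
function pruneEmptyLeaves(
  finalState: Record<string, State>
): Record<string, State> {
  const result: Record<string, State> = {};
  for (const leafId of Object.keys(finalState)) {
    const leafState = finalState[leafId];
    if (!isEmptyState(leafState)) result[leafId] = leafState;
  }
  return result;
}

console.log(pruneEmptyLeaves({ a: {}, b: { data: {} } })); // {}
console.log(pruneEmptyLeaves({ a: { x: 1 }, b: {} })); // { a: { x: 1 } }
```

A sketch like this only covers the empty case: leaves that return real content still produce an object keyed by leaf.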

But it's not a great fix really.

## Solutions

1. The runtime will always return, at a minimum, `{ data: {} }` from a step, because the default state object includes a `data` key. This was a decision made more or less by default very early on in the new runtime. I think we should drop it. Steps would then naturally return an empty state object, which makes it a bit easier to return no state for a leaf.
2. We need to think holistically about final state for leaf nodes and how this affects cron workflows. What I think we really need is a thing called a state reconciler: this takes multiple state objects and merges them together (the simplest being a basic spread, probably a deep spread actually). You then attach a reconciler to the workflow and that gives you a single final state, which sort of removes this whole problem.
3. Can we do something to detect state recursion generally? This feels like a problem that might affect users, even with single-exit workflows, who happen to be building state objects poorly.
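
As a rough sketch of points 2 and 3, a deep-merge reconciler and a simple nesting-depth check might look something like the following. Nothing here exists in the runtime today: `deepMergeReconciler`, `nestingDepth` and the "later leaves win" conflict rule are all assumptions made for illustration.

```typescript
type State = Record<string, unknown>;

const isPlainObject = (value: unknown): value is State =>
  typeof value === 'object' && value !== null && !Array.isArray(value);

// Hypothetical reconciler: deep-merges leaf states in order.
// Assumption: on conflicting non-object values, later leaves win.
function deepMergeReconciler(leafStates: State[]): State {
  const merged: State = {};
  for (const leaf of leafStates) {
    for (const key of Object.keys(leaf)) {
      const current = merged[key];
      const incoming = leaf[key];
      merged[key] =
        isPlainObject(current) && isPlainObject(incoming)
          ? deepMergeReconciler([current, incoming])
          : incoming;
    }
  }
  return merged;
}

// Hypothetical guard for point 3: measure how deeply objects are nested,
// so that runaway growth between runs could be flagged.
function nestingDepth(value: unknown): number {
  if (!isPlainObject(value)) return 0;
  let deepest = 0;
  for (const key of Object.keys(value)) {
    deepest = Math.max(deepest, nestingDepth(value[key]));
  }
  return 1 + deepest;
}

// Two pass-through leaves, reconciled after every run: state stays flat.
let cronState: State = {};
for (let run = 1; run <= 5; run++) {
  cronState = deepMergeReconciler([cronState, cronState]);
}
console.log(cronState, nestingDepth(cronState)); // {} 1

const reconciled = deepMergeReconciler([
  { data: { x: 1 } },
  { data: { y: 2 }, flag: true },
]);
console.log(reconciled); // { data: { x: 1, y: 2 }, flag: true }
```

Because the reconciler yields a single state object rather than one keyed by leaf, the pass-through loop no longer nests its own output. A depth check like `nestingDepth` could be compared between consecutive runs to warn when state keeps getting deeper.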

Handle multi-leaf state in cron workflows #1375