Workflows are ephemeral by default, meaning that once the `run()` method returns its result, the workflow state is lost. A subsequent call to `run()` on the same workflow instance will start from a fresh state.

If a use case requires persisting the workflow state across multiple runs, and possibly across different processes, a few strategies can make workflows more durable.

In [None]:
!pip install llama-index-workflows

## Storing data in the workflow instance

Workflows are regular Python classes, and data can be stored in class or instance variables, so that subsequent `run()` invocations can access it.

In [None]:
from workflows import Workflow, step
from workflows.events import StartEvent, StopEvent


class MyWorkflow(Workflow):
    def __init__(self, *args, **kwargs):
        self.counter = 0
        super().__init__(*args, **kwargs)

    @step
    def count(self, ev: StartEvent) -> StopEvent:
        self.counter += 1
        return StopEvent(result=f"The step ran {self.counter} times")


w = MyWorkflow()
for _ in range(3):
    print(await w.run())

The step ran 1 times
The step ran 2 times
The step ran 3 times


## Storing data in the context object

Each workflow comes with a special object called `Context` that is responsible for its runtime operations. The context instance is available to any step of a workflow and exposes a `store` property that can be used to store and load state data. Using the state store has two major advantages over class and instance variables:

- It’s async safe and supports concurrent access
- It can be serialized

In [None]:
from workflows import Workflow, step, Context
from workflows.events import StartEvent, StopEvent


class MyWorkflow(Workflow):
    @step
    async def count(self, ctx: Context, ev: StartEvent) -> StopEvent:
        async with ctx.store.edit_state() as state:
            counter = state.get("counter", 1)
            retval = StopEvent(result=f"The step ran {counter} times")
            state["counter"] = counter + 1
        return retval


w = MyWorkflow()
handler = w.run()
print(await handler)

w = MyWorkflow()
handler = w.run(ctx=handler.ctx)
print(await handler)

The step ran 1 times
The step ran 2 times
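
To make the first advantage more concrete, the sketch below uses only `asyncio` from the standard library and no workflow classes at all. It is not how the state store is implemented; it only illustrates the kind of lost update that can happen when several coroutines read and write a plain instance attribute with an `await` in between, and how guarding the read-modify-write with a lock avoids it. The names `UnsafeCounter`, `LockedCounter` and `run_concurrently` are hypothetical and exist only for this illustration.

```python
import asyncio


class UnsafeCounter:
    """Plain attribute, similar to storing data on the workflow instance."""

    def __init__(self) -> None:
        self.value = 0

    async def increment(self) -> None:
        current = self.value      # read
        await asyncio.sleep(0)    # yield control, as an I/O call would
        self.value = current + 1  # write back a possibly stale value


class LockedCounter:
    """Same counter, but the read-modify-write is guarded by a lock."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = asyncio.Lock()

    async def increment(self) -> None:
        async with self._lock:
            current = self.value
            await asyncio.sleep(0)
            self.value = current + 1


async def run_concurrently(counter_cls, n: int = 10) -> int:
    # Build the counter inside the running loop, then increment it n times
    # from n concurrent tasks.
    counter = counter_cls()
    await asyncio.gather(*(counter.increment() for _ in range(n)))
    return counter.value


# In a notebook, `await run_concurrently(...)` can be used instead of asyncio.run().
unsafe_total = asyncio.run(run_concurrently(UnsafeCounter))
locked_total = asyncio.run(run_concurrently(LockedCounter))

print(f"unguarded attribute: {unsafe_total} (updates were lost)")
print(f"lock-guarded update: {locked_total}")
```

The unguarded counter ends up below 10 because every task reads the value before any task writes it back, while the lock-guarded one counts all 10 increments. Grouping the read and the write inside a single `async with` block, as the `edit_state()` example above does, follows the same pattern.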


## Using external resources to checkpoint execution

To avoid any overhead, workflows don’t take snapshots of the current state automatically, so they can’t survive a fatal error on their own. However, any step can use an external database such as Redis to snapshot the current context at critical points in the code.

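For example, given a long-running workflow processing hundreds of documents, we could save the ID of the last successfully processed document in the state store and snapshot the context after each one. The following example uses SQLite and a simple counter to keep the code short: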

In [None]:
import sqlite3
import json
from typing import Annotated

from workflows import Workflow, step, Context
from workflows.events import StartEvent, StopEvent
from workflows.resource import Resource
from workflows.context import JsonSerializer


def get_db() -> sqlite3.Connection:
    return sqlite3.connect("mydb.db")


class MyWorkflow(Workflow):
    @step
    async def count(
        self,
        ctx: Context,
        ev: StartEvent,
        db: Annotated[sqlite3.Connection, Resource(get_db)],
    ) -> StopEvent:
        async with ctx.store.edit_state() as state:
            counter = state.get("counter", 1)
            retval = StopEvent(result=f"The step ran {counter} times")
            state["counter"] = counter + 1

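        # Serialize the whole context (including the state store) and
        # persist it under a fixed key, overwriting the previous snapshot.
        cursor = db.cursor()
        ctx_dict = ctx.to_dict(serializer=JsonSerializer())
        cursor.execute(
            "INSERT OR REPLACE INTO state VALUES (?, ?)",
            ("last_ctx", json.dumps(ctx_dict)),
        )
        db.commit()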

        return retval


# Create a simple key-value table
db = get_db()
db.cursor().execute(
    "CREATE TABLE IF NOT EXISTS state (key TEXT PRIMARY KEY, value TEXT)"
)
db.commit()


w = MyWorkflow()
print(await w.run())

# State is stored in a DB now, we could restart the process here...

w = MyWorkflow()
cursor = db.cursor()
cursor.execute("SELECT value FROM state WHERE key=?", ("last_ctx",))
ctx_json = cursor.fetchone()[0]
restored_ctx = Context.from_dict(w, json.loads(ctx_json), serializer=JsonSerializer())
print(await w.run(ctx=restored_ctx))

The step ran 1 times
The step ran 2 times
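
The document-processing scenario mentioned at the start of this section can be sketched with the standard library alone. The example below is a minimal illustration under stated assumptions, not part of the workflows API: it reuses the same key-value table layout as above, stores the ID of the last processed document after each item, and resumes from that point after a simulated crash. The helpers `open_db`, `save_checkpoint`, `load_checkpoint` and `process_documents`, the `progress` key and the document IDs are all hypothetical.

```python
import json
import sqlite3
from typing import Optional


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and make sure the key-value table exists."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS state (key TEXT PRIMARY KEY, value TEXT)"
    )
    db.commit()
    return db


def save_checkpoint(db: sqlite3.Connection, key: str, payload: dict) -> None:
    """Store a JSON snapshot under `key`, replacing any previous one."""
    db.execute(
        "INSERT OR REPLACE INTO state VALUES (?, ?)", (key, json.dumps(payload))
    )
    db.commit()


def load_checkpoint(db: sqlite3.Connection, key: str) -> Optional[dict]:
    """Return the stored snapshot, or None if nothing was saved yet."""
    row = db.execute("SELECT value FROM state WHERE key=?", (key,)).fetchone()
    return json.loads(row[0]) if row else None


def process_documents(db, doc_ids, fail_on=None) -> list:
    """Process documents in order, skipping those already checkpointed."""
    checkpoint = load_checkpoint(db, "progress")
    start = 0
    if checkpoint is not None:
        start = doc_ids.index(checkpoint["last_doc_id"]) + 1

    processed = []
    for doc_id in doc_ids[start:]:
        if doc_id == fail_on:
            raise RuntimeError(f"simulated crash on {doc_id}")
        processed.append(doc_id)  # stand-in for the real work
        save_checkpoint(db, "progress", {"last_doc_id": doc_id})
    return processed


db = open_db()
docs = ["doc-1", "doc-2", "doc-3", "doc-4"]

try:
    process_documents(db, docs, fail_on="doc-3")  # first run crashes midway
except RuntimeError:
    pass

resumed = process_documents(db, docs)  # second run picks up after doc-2
print(resumed)
```

Because the checkpoint is written only after a document has been handled, the second run restarts from the document that failed rather than from the beginning. Returning `None` from `load_checkpoint` when no row exists also lets the very first run start without special handling.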
