# Checkpointing Workflow Runs

In this notebook, we demonstrate how to checkpoint `Workflow` runs with a `WorkflowCheckpointer` object. We also show how to view all of the checkpoints stored in this object and, finally, how to use a checkpoint as the starting point of a new run.

## Define a Workflow

In [None]:
import os

api_key = os.environ.get("OPENAI_API_KEY")

In [None]:
from llama_index.core.workflow import (
    Workflow,
    step,
    StartEvent,
    StopEvent,
    Event,
    Context,
)
from llama_index.llms.openai import OpenAI


class JokeEvent(Event):
    joke: str


class JokeFlow(Workflow):
    llm = OpenAI(api_key=api_key)

    @step
    async def generate_joke(self, ev: StartEvent) -> JokeEvent:
        topic = ev.topic

        prompt = f"Write your best joke about {topic}."
        response = await self.llm.acomplete(prompt)
        return JokeEvent(joke=str(response))

    @step
    async def critique_joke(self, ev: JokeEvent) -> StopEvent:
        joke = ev.joke

        prompt = f"Give a thorough analysis and critique of the following joke: {joke}"
        response = await self.llm.acomplete(prompt)
        return StopEvent(result=str(response))

## Define a WorkflowCheckpointer Object

In [None]:
from llama_index.core.workflow.checkpointer import WorkflowCheckpointer

In [None]:
# instantiate JokeFlow and wrap it in a checkpointer
workflow = JokeFlow()
wflow_ckptr = WorkflowCheckpointer(workflow=workflow)

## Run the Workflow from the WorkflowCheckpointer

The `WorkflowCheckpointer.run()` method is a wrapper over the `Workflow.run()` method that injects a checkpointer callback in order to create and store checkpoints. A checkpoint is created each time a step completes, and each checkpoint stores the following data:

- `last_completed_step`: The name of the last completed step
- `input_event`: The input event to this last completed step
- `output_event`: The event outputted by this last completed step
- `ctx_state`: A snapshot of the attached `Context`
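
To make the shape of a checkpoint concrete, here is a minimal stand-in written as a plain dataclass. It is not the library's `Checkpoint` class: the name `CheckpointSketch` and the way `id_` is generated are assumptions for illustration, and the field names are taken from the list above and from the printed output later in this notebook.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class CheckpointSketch:
    """Illustrative stand-in for a checkpoint record (hypothetical, not the library class)."""

    last_completed_step: Optional[str]  # name of the step that just finished
    input_event: Any  # event that the step received
    output_event: Any  # event that the step returned
    ctx_state: Dict[str, Any]  # serialized snapshot of the context
    # Assumption: every checkpoint carries a unique string id.
    id_: str = field(default_factory=lambda: str(uuid.uuid4()))


# Plain dicts stand in for the real event objects here.
example = CheckpointSketch(
    last_completed_step="generate_joke",
    input_event={"topic": "chemistry"},
    output_event={"joke": "..."},
    ctx_state={"globals": {}, "is_running": True},
)
print(example.last_completed_step)
```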

In [None]:
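# Note: the output further down still lists two checkpoints for this run, and
# `store_checkpoints` shows up inside the StartEvent data in `broker_log`.
handler = wflow_ckptr.run(
    topic="chemistry",
    store_checkpoints=False,
)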
await handler

'This joke plays on the double meaning of the word "rates," which can refer to both the cost of something and the passage of time. In this case, the joke suggests that chemists prefer nitrates because they are less expensive than day rates, implying that chemists are frugal or cost-conscious.\n\nOverall, the joke is clever and plays on a pun that is likely to be appreciated by those familiar with chemistry and the concept of nitrates. However, the humor may be lost on those who are not well-versed in chemistry or who do not understand the specific reference to nitrates.\n\nOne potential critique of the joke is that it relies heavily on wordplay and may not be universally understood or appreciated. Additionally, some may find the humor to be somewhat niche or esoteric, limiting its appeal to a broader audience.\n\nIn conclusion, while the joke is clever and plays on a pun related to chemistry, its niche appeal and reliance on wordplay may limit its overall effectiveness as a joke.'

We can view all of the checkpoints via the `.checkpoints` attribute, which is a dictionary whose keys are the `run_id` of each run and whose values are the lists of checkpoints stored for that run.

In [None]:
wflow_ckptr.checkpoints

{'51eece8c-2405-49f9-8386-976dd549e1ec': [Checkpoint(id_='4897a201-0450-4e09-9a6b-ac96428a5a2a', last_completed_step='generate_joke', input_event=StartEvent(), output_event=JokeEvent(joke="Why do chemists like nitrates so much?\n\nBecause they're cheaper than day rates!"), ctx_state={'globals': {}, 'streaming_queue': '[]', 'queues': {'_done': '[]', 'critique_joke': '[]', 'generate_joke': '[]'}, 'stepwise': False, 'events_buffer': {}, 'in_progress': {'generate_joke': []}, 'accepted_events': [('generate_joke', 'StartEvent'), ('critique_joke', 'JokeEvent')], 'broker_log': ['{"__is_pydantic": true, "value": {"_data": {"topic": "chemistry", "store_checkpoints": false}}, "qualified_name": "llama_index.core.workflow.events.StartEvent"}'], 'is_running': True}),
  Checkpoint(id_='937100d0-a3ed-4238-bec4-0db3e0d88438', last_completed_step='critique_joke', input_event=JokeEvent(joke="Why do chemists like nitrates so much?\n\nBecause they're cheaper than day rates!"), output_event=StopEvent(result

In [None]:
for run_id, ckpts in wflow_ckptr.checkpoints.items():
    print(f"Run: {run_id} has {len(ckpts)} stored checkpoints")

Run: 51eece8c-2405-49f9-8386-976dd549e1ec has 2 stored checkpoints
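
Because `.checkpoints` is an ordinary dictionary of lists, it can be summarized with a comprehension. The sketch below runs on stand-in data rather than on a live `WorkflowCheckpointer`; the helper name `steps_per_run` and the `"run-a"` key are hypothetical, and the only attribute it relies on is `last_completed_step`.

```python
from types import SimpleNamespace
from typing import Any, Dict, List

# Stand-in for `wflow_ckptr.checkpoints`: run_id -> list of checkpoint-like objects.
checkpoints_by_run = {
    "run-a": [
        SimpleNamespace(last_completed_step="generate_joke"),
        SimpleNamespace(last_completed_step="critique_joke"),
    ],
}


def steps_per_run(checkpoints: Dict[str, List[Any]]) -> Dict[str, List[str]]:
    """Map each run id to the ordered names of its checkpointed steps."""
    return {
        run_id: [ckpt.last_completed_step for ckpt in ckpts]
        for run_id, ckpts in checkpoints.items()
    }


summary = steps_per_run(checkpoints_by_run)
print(summary)  # {'run-a': ['generate_joke', 'critique_joke']}
```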


## Filtering the Checkpoints

The `WorkflowCheckpointer` object also has a `.filter_checkpoints()` method that allows us to filter via:

- The name of the last completed step by specifying the param `last_completed_step`
- The event type of the last completed step's output event by specifying `output_event_type`
- Similarly, the event type of the last completed step's input event by specifying `input_event_type`

When more than one of these filters is specified, they are combined with a logical "AND".
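
The "AND" semantics can be sketched in plain Python. This is only an illustration of the behaviour described above, not the library's implementation: `filter_sketch` and the `*Stub` classes are hypothetical, and matching event types with `isinstance` is an assumption.

```python
from types import SimpleNamespace
from typing import Any, List, Optional


class StartStub:
    """Hypothetical stand-in for StartEvent."""


class JokeStub:
    """Hypothetical stand-in for JokeEvent."""


class StopStub:
    """Hypothetical stand-in for StopEvent."""


def filter_sketch(
    checkpoints: List[Any],
    last_completed_step: Optional[str] = None,
    input_event_type: Optional[type] = None,
    output_event_type: Optional[type] = None,
) -> List[Any]:
    """Return the checkpoints that pass every supplied filter (logical AND)."""
    matches = []
    for ckpt in checkpoints:
        # A filter left as None is skipped; any supplied filter can reject.
        if last_completed_step is not None and ckpt.last_completed_step != last_completed_step:
            continue
        if input_event_type is not None and not isinstance(ckpt.input_event, input_event_type):
            continue
        if output_event_type is not None and not isinstance(ckpt.output_event, output_event_type):
            continue
        matches.append(ckpt)
    return matches


# Two stand-in checkpoints, shaped like one complete run of the joke workflow.
ckpts = [
    SimpleNamespace(last_completed_step="generate_joke", input_event=StartStub(), output_event=JokeStub()),
    SimpleNamespace(last_completed_step="critique_joke", input_event=JokeStub(), output_event=StopStub()),
]

by_step = filter_sketch(ckpts, last_completed_step="generate_joke")
both = filter_sketch(ckpts, last_completed_step="generate_joke", output_event_type=StopStub)
print(len(by_step), len(both))  # 1 0
```

The second call returns nothing because no single checkpoint satisfies both conditions at once.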

Let's test this functionality out, but first we'll make things a bit more interesting by executing a couple more runs of our `Workflow`.

In [None]:
additional_topics = ["biology", "history"]

for topic in additional_topics:
    handler = wflow_ckptr.run(topic=topic)
    await handler

In [None]:
for run_id, ckpts in wflow_ckptr.checkpoints.items():
    print(f"Run: {run_id} has {len(ckpts)} stored checkpoints")

Run: 51eece8c-2405-49f9-8386-976dd549e1ec has 2 stored checkpoints
Run: 753d60fc-182b-4213-bd09-83a68c966045 has 2 stored checkpoints
Run: 7f8bc8ac-d92b-48fc-a38c-a1f869efe7f3 has 2 stored checkpoints


In [None]:
# Filter by the name of the last completed step
checkpoints_right_after_generate_joke_step = wflow_ckptr.filter_checkpoints(
    last_completed_step="generate_joke",
)

# view the matching checkpoints
[ckpt for ckpt in checkpoints_right_after_generate_joke_step]

[Checkpoint(id_='4897a201-0450-4e09-9a6b-ac96428a5a2a', last_completed_step='generate_joke', input_event=StartEvent(), output_event=JokeEvent(joke="Why do chemists like nitrates so much?\n\nBecause they're cheaper than day rates!"), ctx_state={'globals': {}, 'streaming_queue': '[]', 'queues': {'_done': '[]', 'critique_joke': '[]', 'generate_joke': '[]'}, 'stepwise': False, 'events_buffer': {}, 'in_progress': {'generate_joke': []}, 'accepted_events': [('generate_joke', 'StartEvent'), ('critique_joke', 'JokeEvent')], 'broker_log': ['{"__is_pydantic": true, "value": {"_data": {"topic": "chemistry", "store_checkpoints": false}}, "qualified_name": "llama_index.core.workflow.events.StartEvent"}'], 'is_running': True}),
 Checkpoint(id_='31fa2439-dcbf-4541-bb55-35c9e8c906ba', last_completed_step='generate_joke', input_event=StartEvent(), output_event=JokeEvent(joke="Why did the biologist break up with the mathematician? \nBecause they couldn't find a common denominator!"), ctx_state={'globals'

## Re-Run Workflow from a Specific Checkpoint

To run from a chosen `Checkpoint`, we can use the `WorkflowCheckpointer.run_from()` method. Note that doing so starts a new run, and its checkpoints, if enabled, are stored under the newly assigned `run_id`.

In [None]:
# run_from() can also work with a new workflow instance
new_workflow_instance = JokeFlow()
wflow_ckptr.workflow = new_workflow_instance

ckpt = checkpoints_right_after_generate_joke_step[0]

handler = wflow_ckptr.run_from(checkpoint=ckpt)
await handler

'This joke plays on the double meaning of the word "nitrates," which can refer to both a chemical compound and a form of payment for services rendered. The humor lies in the unexpected twist of associating a scientific concept with a financial transaction.\n\nOne strength of this joke is its clever wordplay and the way it combines two unrelated concepts in a humorous way. The punchline is unexpected and plays on the audience\'s knowledge of chemistry and economics.\n\nHowever, some may find this joke to be a bit niche or esoteric, as it requires a basic understanding of chemistry to fully appreciate the humor. Additionally, the joke may not be as universally relatable or accessible to a general audience.\n\nOverall, while this joke may appeal to those with a background in chemistry or a love for puns, it may not resonate with a wider audience. It is a clever play on words, but its niche appeal may limit its effectiveness as a joke for a broader audience.'

In [None]:
for run_id, ckpts in wflow_ckptr.checkpoints.items():
    print(f"Run: {run_id} has {len(ckpts)} stored checkpoints")

Run: 51eece8c-2405-49f9-8386-976dd549e1ec has 2 stored checkpoints
Run: 753d60fc-182b-4213-bd09-83a68c966045 has 2 stored checkpoints
Run: 7f8bc8ac-d92b-48fc-a38c-a1f869efe7f3 has 2 stored checkpoints
Run: 09720e33-8226-49ed-bb2e-9b81d39907ea has 1 stored checkpoints


Since we've executed from the checkpoint that represents the end of the `generate_joke` step, only one additional checkpoint (the one for the completion of the `critique_joke` step) gets stored for this last, partial run.
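
The bookkeeping behind that count can be reproduced with a toy model. The sketch below assumes a strictly linear two-step pipeline and makes no LLM or library calls: `STEPS`, `run_steps`, and the dict-based checkpoints are all hypothetical stand-ins, meant only to show why a run resumed after the first step records a single checkpoint under a fresh run id.

```python
import uuid
from typing import Any, Callable, Dict, List, Tuple


def generate_joke(topic: str) -> str:
    """Toy replacement for the LLM-backed first step."""
    return f"joke about {topic}"


def critique_joke(joke: str) -> str:
    """Toy replacement for the LLM-backed second step."""
    return f"critique of: {joke}"


# Hypothetical linear pipeline mirroring the two workflow steps.
STEPS: List[Tuple[str, Callable[[str], str]]] = [
    ("generate_joke", generate_joke),
    ("critique_joke", critique_joke),
]


def run_steps(store: Dict[str, List[Dict[str, Any]]], start_index: int, value: str) -> Tuple[str, str]:
    """Run STEPS[start_index:] under a new run id, checkpointing after each step."""
    run_id = str(uuid.uuid4())
    store[run_id] = []
    for name, fn in STEPS[start_index:]:
        output = fn(value)
        store[run_id].append({"last_completed_step": name, "input": value, "output": output})
        value = output
    return run_id, value


store: Dict[str, List[Dict[str, Any]]] = {}
full_run_id, _ = run_steps(store, 0, "chemistry")

# Resume from the checkpoint taken right after the first step.
ckpt = store[full_run_id][0]
step_names = [name for name, _ in STEPS]
resume_index = step_names.index(ckpt["last_completed_step"]) + 1
partial_run_id, result = run_steps(store, resume_index, ckpt["output"])

for run_id, ckpts in store.items():
    print(f"Run: {run_id} has {len(ckpts)} stored checkpoints")
```

The full run records two checkpoints, while the resumed run records only the one for `critique_joke`.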