# Checkpointing Workflow Runs

A `Checkpoint` is a snapshot taken during a `Workflow` run that can be inspected and used as the starting point for future `Workflow` runs. Checkpoints are especially helpful when debugging a `Workflow`. For example, if your `Workflow` has many steps and you want to test the later ones (likely after modifying them), you can save a lot of time by running from the appropriate `Checkpoint` and skipping the execution of the earlier steps.

What gets stored in a `Checkpoint`?

- `Checkpoint` objects are centered around the last completed step of the workflow run. Each one contains the name of the last completed step, that step's input event and output event, and a snapshot of the run's `Context`.
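To make that list of contents concrete, here is a minimal sketch of a record with the same shape. `SketchCheckpoint` is a hypothetical stand-in written for illustration, not the library's `Checkpoint` class. The field names mirror the ones visible in the checkpoint output printed later in this notebook, and events are represented as plain dicts purely as an assumption for the sketch.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SketchCheckpoint:
    """Illustrative stand-in only; NOT the library's Checkpoint class."""

    # Name of the step that just finished (None if no step has run yet).
    last_completed_step: Optional[str]
    # Event that the step received, and event that it emitted.
    input_event: Optional[Any]
    output_event: Any
    # Snapshot of the run's Context at that moment.
    ctx_state: Dict[str, Any]
    # Unique identifier for this checkpoint.
    id_: str = field(default_factory=lambda: str(uuid.uuid4()))


# Events are modelled as plain dicts here (an assumption for this sketch).
ckpt = SketchCheckpoint(
    last_completed_step="generate_joke",
    input_event={"type": "StartEvent", "topic": "math"},
    output_event={"type": "JokeEvent", "joke": "..."},
    ctx_state={"globals": {}},
)
print(ckpt.last_completed_step, ckpt.id_)
```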

When do `Checkpoint`s happen?

- When enabled, `Checkpoint` objects are automatically created and stored in the `Workflow.checkpoints` attribute after every completed step of the `Workflow`.

In the rest of this notebook, we demonstrate how to:

1. Enable checkpoints
2. Filter the stored checkpoints
3. Run from a chosen checkpoint

## Define a Workflow

In [None]:
import os

api_key = os.environ.get("OPENAI_API_KEY")

In [None]:
from llama_index.core.workflow import (
    Workflow,
    step,
    StartEvent,
    StopEvent,
    Event,
    Context,
)
from llama_index.llms.openai import OpenAI


class JokeEvent(Event):
    joke: str


class JokeFlow(Workflow):
    llm = OpenAI(api_key=api_key)

    @step
    async def generate_joke(self, ev: StartEvent) -> JokeEvent:
        topic = ev.topic

        prompt = f"Write your best joke about {topic}."
        response = await self.llm.acomplete(prompt)
        return JokeEvent(joke=str(response))

    @step
    async def critique_joke(self, ev: JokeEvent) -> StopEvent:
        joke = ev.joke

        prompt = f"Give a thorough analysis and critique of the following joke: {joke}"
        response = await self.llm.acomplete(prompt)
        return StopEvent(result=str(response))

### Running With Checkpointing Disabled (Default)

By default, automatic checkpointing is disabled. It can be enabled for any individual run via the `store_checkpoints` parameter.

In [None]:
# instantiate JokeFlow
workflow = JokeFlow()

handler = workflow.run(
    topic="chemistry",
    store_checkpoints=False,
)
await handler

'This joke plays on the double meaning of the word "rates," which can refer to both the cost of something and the passage of time. The punchline suggests that chemists prefer nitrates because they are cheaper than day rates, implying that chemists are frugal or cost-conscious individuals.\n\nOverall, the joke is clever and plays on a pun that is likely to be appreciated by those familiar with chemistry and the concept of nitrates. However, the humor may be lost on those who are not well-versed in chemistry terminology. Additionally, the joke relies on a somewhat niche subject matter, which may limit its appeal to a broader audience.\n\nIn terms of structure, the joke follows a classic setup and punchline format, with the punchline providing a surprising twist that subverts the listener\'s expectations. The wordplay is effective in creating humor, and the joke is concise and to the point.\n\nIn conclusion, while the joke may not be universally appealing, it is a clever play on words tha

In [None]:
workflow.checkpoints

{}

As we can see, there are no entries in our `Workflow.checkpoints` attribute.

### Running the Workflow With Checkpointing Enabled

In [None]:
# run the workflow again, but this time with checkpointing enabled
handler = workflow.run(topic="math", store_checkpoints=True)
await handler

'Analysis:\nThis joke plays on the mathematical concept of the equal sign, which is used to show that two quantities are the same. The humor comes from personifying the equal sign and attributing human emotions and characteristics to it. The joke relies on the double meaning of "humble," which can mean both modest and also not thinking of oneself as better or worse than others.\n\nCritique:\nThis joke is clever and plays on a common mathematical symbol in a humorous way. It is a light-hearted and punny joke that is likely to elicit a chuckle from those who understand the mathematical reference. However, the joke may not be universally understood by all audiences, particularly those who are not familiar with mathematical concepts. Additionally, the humor may be considered somewhat simplistic or corny by some, as it relies on a basic pun and wordplay. Overall, while the joke is amusing in its simplicity, it may not be considered particularly sophisticated or original.'

In [None]:
workflow.checkpoints

{'f1e6733f-9bea-4ef7-8f84-db7c0e344217': [Checkpoint(id_='3a85a574-4334-4305-8d54-67bc760c290f', last_completed_step=None, input_event=None, output_event=StartEvent(run_id='f1e6733f-9bea-4ef7-8f84-db7c0e344217'), ctx_state={'globals': {}, 'streaming_queue': '[]', 'queues': {'_done': '[]', 'critique_joke': '[]', 'generate_joke': '[]'}, 'stepwise': False, 'events_buffer': {}, 'accepted_events': [('generate_joke', 'StartEvent'), ('critique_joke', 'JokeEvent')], 'broker_log': [], 'is_running': False}),
  Checkpoint(id_='99b941c0-f887-4752-8500-7066ea1b5c9d', last_completed_step='generate_joke', input_event=StartEvent(run_id='f1e6733f-9bea-4ef7-8f84-db7c0e344217'), output_event=JokeEvent(run_id='f1e6733f-9bea-4ef7-8f84-db7c0e344217', joke="Why was the equal sign so humble?\n\nBecause he knew he wasn't less than or greater than anyone else."), ctx_state={'globals': {}, 'streaming_queue': '[]', 'queues': {'_done': '[]', 'critique_joke': '[]', 'generate_joke': '[]'}, 'stepwise': False, 'events

After enabling checkpointing by specifying `store_checkpoints=True` in the `run()` call, we see a new entry in the `checkpoints` dict. Checkpoints are organized by `run_id`, and each invocation of `run()` or `run_from()` (demonstrated later in this notebook) kicks off a new run with its own unique `run_id`.

We can see that there are 3 stored checkpoints after this first run. The first checkpoint is created just before the `StartEvent` is emitted. You can think of it as the result of a fictitious "startup" step that has completed (and is therefore checkpointed), with a null input event and the `StartEvent` as its output event. The remaining two checkpoints are created at the completion of each of the two steps in this `Workflow`, `generate_joke` and `critique_joke`.
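The bookkeeping described above can be imitated in a few lines of plain Python. This is only a sketch of the counting logic, under the assumption of a linear two-step flow. `simulate_run`, `STEPS`, and the module-level `checkpoints` dict are hypothetical names for this illustration, and no LLM calls or library internals are involved.

```python
import uuid
from collections import defaultdict

# Hypothetical stand-in for `Workflow.checkpoints`: run_id -> list of records.
checkpoints = defaultdict(list)

# (step name, input event type, output event type) for the two JokeFlow steps.
STEPS = [
    ("generate_joke", "StartEvent", "JokeEvent"),
    ("critique_joke", "JokeEvent", "StopEvent"),
]


def simulate_run(store_checkpoints: bool) -> str:
    """Simulate one run and return its run_id (no LLM calls are made)."""
    run_id = str(uuid.uuid4())
    if store_checkpoints:
        # Fictitious "startup" step: no input event, emits the StartEvent.
        checkpoints[run_id].append((None, None, "StartEvent"))
        # One record after each completed step.
        for record in STEPS:
            checkpoints[run_id].append(record)
    return run_id


run_id = simulate_run(store_checkpoints=True)
for last_step, input_event, output_event in checkpoints[run_id]:
    print(f"{last_step!s:>14}: {input_event} -> {output_event}")
```

Each simulated run with checkpointing enabled records one "startup" entry plus one entry per step, which matches the three checkpoints per run seen in this notebook.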

In [None]:
for run_id, ckpts in workflow.checkpoints.items():
    print(f"Run: {run_id} has {len(ckpts)} stored checkpoints")

Run: f1e6733f-9bea-4ef7-8f84-db7c0e344217 has 3 stored checkpoints


### Filtering the Checkpoints

With checkpointing enabled, every run of a `Workflow` creates a new entry in the `Workflow.checkpoints` dict attribute. To help navigate these stored checkpoints, we provide the `Workflow.filter_checkpoints()` method.

Before showing how to filter checkpoints with this method, let's make things a bit more interesting by executing the workflow a few more times. Here, we run the workflow for two additional topics. Since each complete run results in 3 stored checkpoints, we should have a total of 9 checkpoints after executing the cell below.

In [None]:
additional_topics = ["biology", "history"]

for topic in additional_topics:
    handler = workflow.run(topic=topic, store_checkpoints=True)
    await handler

In [None]:
for run_id, ckpts in workflow.checkpoints.items():
    print(f"Run: {run_id} has {len(ckpts)} stored checkpoints")

Run: f1e6733f-9bea-4ef7-8f84-db7c0e344217 has 3 stored checkpoints
Run: 12329630-81ba-4244-b00f-9972c5c642bb has 3 stored checkpoints
Run: 9a6ff65f-f619-45dc-82c2-b100787ba165 has 3 stored checkpoints


At this point, we can filter by:

- The name of the last completed step by specifying the param `last_completed_step`
- The event type of the last completed step's output event by specifying `output_event_type`
- Similarly, the event type of the last completed step's input event by specifying `input_event_type`

When multiple filters are specified, they are combined with a logical AND. As the next cell shows, results can also be restricted to a single run by passing `run_id`.
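The AND semantics can be sketched in plain Python as follows. This is not the implementation of `Workflow.filter_checkpoints()`. `filter_records` and `make_run` are hypothetical helpers, checkpoints are modelled as dicts, and event types are represented as strings rather than classes, all as assumptions for the illustration.

```python
from typing import Dict, List, Optional


def make_run(run_id: str) -> List[Dict[str, Optional[str]]]:
    """Build the three records a full JokeFlow run would produce."""
    rows = [
        (None, None, "StartEvent"),
        ("generate_joke", "StartEvent", "JokeEvent"),
        ("critique_joke", "JokeEvent", "StopEvent"),
    ]
    return [
        {
            "run_id": run_id,
            "last_completed_step": step,
            "input_event_type": in_type,
            "output_event_type": out_type,
        }
        for step, in_type, out_type in rows
    ]


def filter_records(
    records: List[Dict[str, Optional[str]]],
    run_id: Optional[str] = None,
    last_completed_step: Optional[str] = None,
    input_event_type: Optional[str] = None,
    output_event_type: Optional[str] = None,
) -> List[Dict[str, Optional[str]]]:
    """Keep records that satisfy ALL of the criteria that were specified."""
    criteria = {
        "run_id": run_id,
        "last_completed_step": last_completed_step,
        "input_event_type": input_event_type,
        "output_event_type": output_event_type,
    }
    # A criterion left as None is treated as "not specified" in this sketch.
    active = {key: value for key, value in criteria.items() if value is not None}
    return [r for r in records if all(r[k] == v for k, v in active.items())]


records = make_run("run-a") + make_run("run-b")

# Two criteria at once: both must hold for a record to be kept.
matches = filter_records(records, run_id="run-b", last_completed_step="generate_joke")
# A single criterion applied across all runs.
stop_matches = filter_records(records, output_event_type="StopEvent")
print(len(matches), len(stop_matches))
```

Adding a criterion can only narrow the result, which is why combining `run_id` with `last_completed_step` isolates a single checkpoint here.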

In [None]:
# Filter by the name of last completed step
checkpoints_right_after_generate_joke_step = workflow.filter_checkpoints(
    last_completed_step="generate_joke",
    run_id=list(workflow.checkpoints.keys())[1],
)

# display the matching checkpoints
[ckpt for ckpt in checkpoints_right_after_generate_joke_step]

[Checkpoint(id_='d6d5f38b-f031-43b5-a08d-69551a9ad70d', last_completed_step='generate_joke', input_event=StartEvent(run_id='12329630-81ba-4244-b00f-9972c5c642bb'), output_event=JokeEvent(run_id='12329630-81ba-4244-b00f-9972c5c642bb', joke='Why did the biologist break up with the mathematician?\n\nBecause they couldn\'t find a common "cell"!'), ctx_state={'globals': {}, 'streaming_queue': '[]', 'queues': {'_done': '[]', 'critique_joke': '[]', 'generate_joke': '[]'}, 'stepwise': False, 'events_buffer': {}, 'accepted_events': [('generate_joke', 'StartEvent'), ('critique_joke', 'JokeEvent')], 'broker_log': ['{"__is_pydantic": true, "value": {"run_id": "12329630-81ba-4244-b00f-9972c5c642bb", "_data": {"topic": "biology"}}, "qualified_name": "llama_index.core.workflow.events.StartEvent"}'], 'is_running': True})]

In [None]:
# Filter by output event StopEvent
checkpoints_that_emit_stop_event = workflow.filter_checkpoints(
    output_event_type=StopEvent
)

# display the matching checkpoints
[ckpt for ckpt in checkpoints_that_emit_stop_event]

[Checkpoint(id_='5e6528d7-0907-4113-a2f6-78043bae6624', last_completed_step='critique_joke', input_event=JokeEvent(run_id='f1e6733f-9bea-4ef7-8f84-db7c0e344217', joke="Why was the equal sign so humble?\n\nBecause he knew he wasn't less than or greater than anyone else."), output_event=StopEvent(run_id='f1e6733f-9bea-4ef7-8f84-db7c0e344217', result='Analysis:\nThis joke plays on the mathematical concept of the equal sign, which is used to show that two quantities are the same. The humor comes from personifying the equal sign and attributing human emotions and characteristics to it. The joke relies on the double meaning of "humble," which can mean both modest and also not thinking of oneself as better or worse than others.\n\nCritique:\nThis joke is clever and plays on a common mathematical symbol in a humorous way. It is a light-hearted and punny joke that is likely to elicit a chuckle from those who understand the mathematical reference. However, the joke may not be universally underst

### Re-Running the Workflow From a Specific Checkpoint

To run from a chosen `Checkpoint`, we can use the `Workflow.run_from()` method. Note that doing so starts a new run, and its checkpoints (if enabled) are stored under the newly assigned `run_id`.

In [None]:
ckpt = checkpoints_right_after_generate_joke_step[0]
num_runs_from_checkpoint = 2  # to make things interesting

for _ in range(num_runs_from_checkpoint):
    handler = workflow.run_from(checkpoint=ckpt, store_checkpoints=True)
    await handler

In [None]:
for run_id, ckpts in workflow.checkpoints.items():
    print(f"Run: {run_id} has {len(ckpts)} stored checkpoints")

Run: f1e6733f-9bea-4ef7-8f84-db7c0e344217 has 3 stored checkpoints
Run: 12329630-81ba-4244-b00f-9972c5c642bb has 3 stored checkpoints
Run: 9a6ff65f-f619-45dc-82c2-b100787ba165 has 3 stored checkpoints
Run: 15e12fc8-2508-4e77-b6c2-2052d5484c7a has 1 stored checkpoints
Run: 4aefa694-62f1-4f0a-a7e9-55525cf79a5a has 1 stored checkpoints


Since we executed from the checkpoint that represents the end of the `generate_joke` step, only one additional checkpoint is stored in each of these "partial" runs.
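The count of one checkpoint per partial run follows from which steps remain after the chosen checkpoint. The sketch below illustrates this with plain Python. It assumes a strictly linear two-step flow, which is a simplification because real workflows are event-driven. `STEPS` and `remaining_steps` are hypothetical names and are not part of the library.

```python
from typing import List

# Step order for JokeFlow, assumed linear for this sketch.
STEPS = ["generate_joke", "critique_joke"]


def remaining_steps(last_completed_step: str) -> List[str]:
    """Return the steps that still need to run after the given step."""
    index = STEPS.index(last_completed_step)
    return STEPS[index + 1 :]


to_run = remaining_steps("generate_joke")
# One checkpoint is stored per completed step, so this is also the
# number of checkpoints expected for a run resumed from that point.
print(to_run, len(to_run))
```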