MapStates replays stale sub-state on repeated invocations when parent has a cascading state_initializer #761

@hippalectryon-0

Description

When a MapStates (or any MapActionsAndStates / TaskBasedParallelAction) is invoked more than once in the same parent application's lifetime, and the parent application was built with a state_initializer (e.g. via initialize_from(...) for a resumed/forked run), every invocation after the first silently replays the first invocation's sub-state instead of executing the sub-actions. The cause is that sub-application IDs are a deterministic function of (parent_app_id, i, j) only, so they collide across invocations. The cascaded initializer then finds the state persisted by the prior call under the same ID and loads it.

This does not surface on fresh runs (no initialize_from on the parent): the cascaded initializer is None, so sub-apps never load existing state. Their IDs still collide, but each call runs fresh.
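To make the mechanism concrete, here is a toy model of the collision in plain Python. It is not Burr's code: `sub_app_id`, `invoke_fan`, `simulate`, and the dict standing in for the persister are all hypothetical names, and the only thing taken from the description above is that the sub-app ID depends on (parent_app_id, i, j) alone. Treat it as a sketch of the failure mode under those assumptions.

```python
import hashlib
import random


def sub_app_id(parent_app_id, i, j, sequence_id=None):
    """Hypothetical stand-in for sub-app ID derivation.

    Without sequence_id the ID depends only on (parent_app_id, i, j),
    so it is identical on every invocation of the parallel action.
    """
    key = f"{parent_app_id}:{i}:{j}"
    if sequence_id is not None:
        key += f":{sequence_id}"
    return hashlib.sha256(key.encode()).hexdigest()


def invoke_fan(store, parent_app_id, sequence_id, cascade_initializer, mix_sequence_id):
    """One invocation of a 3-way fan-out against a dict-backed 'persister'."""
    results = []
    for i in range(3):
        app_id = sub_app_id(
            parent_app_id, i, 0, sequence_id if mix_sequence_id else None
        )
        if cascade_initializer and app_id in store:
            # Cascaded initializer finds state from an earlier call: stale replay.
            x = store[app_id]
        else:
            # No initializer (or no prior state): the sub-action actually runs.
            x = random.random()
        store[app_id] = x  # the tracker persists the sub-app's final state
        results.append(x)
    return results


def simulate(cascade_initializer, mix_sequence_id):
    """Invoke the fan three times within one parent application's lifetime."""
    store = {}
    return [
        invoke_fan(store, "parent", seq, cascade_initializer, mix_sequence_id)
        for seq in range(3)
    ]


if __name__ == "__main__":
    for label, args in [
        ("initializer, colliding ids", (True, False)),
        ("no initializer, colliding ids", (False, False)),
        ("initializer, sequence_id mixed in", (True, True)),
    ]:
        runs = simulate(*args)
        replayed = all(r == runs[0] for r in runs[1:])
        print(f"{label}: {'STALE REPLAY' if replayed else 'fresh outputs'}")
```

In this model only the first configuration replays: the IDs collide in the second one as well, but nothing ever reads the stored state back, which matches the fresh-run behaviour described above.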

Reproduction

import hashlib
import random
import tempfile

from burr.core import ApplicationBuilder, State, action
from burr.core.action import create_action
from burr.core.parallelism import MapStates
from burr.tracking.client import LocalTrackingClient


@action(reads=[], writes=["x"])
def pick(state: State) -> State:
    return state.update(x=random.random())


@action(reads=[], writes=[])
def back(state: State) -> State:
    return state


class Fan(MapStates):
    def action(self, state, inputs):
        return create_action(pick, "pick")

    def states(self, state, context, inputs):
        for _ in range(3):
            yield state

    def reduce(self, state, results):
        return state.update(xs=[s["x"] for s in results])

    @property
    def reads(self):
        return []

    @property
    def writes(self):
        return ["xs"]


class FixedFan(Fan):
    """Workaround: mix context.sequence_id into the sub-app id so the per-
    invocation sub-apps get distinct ids and the cascaded initializer can't
    collide them onto a prior call's persisted state."""

    def tasks(self, state, context, inputs):
        for task in super().tasks(state, context, inputs):
            task.application_id = hashlib.sha256(
                f"{task.application_id}:{context.sequence_id}".encode()
            ).hexdigest()
            yield task


def run(fan: MapStates) -> list[list[float]]:
    tracker = LocalTrackingClient(project="test", storage_dir=tempfile.mkdtemp())
    app = (
        ApplicationBuilder()
        .with_actions(fan=fan, back=back)
        .with_transitions(("fan", "back"), ("back", "fan"))
        .with_tracker(tracker)
        .initialize_from(
            tracker,
            resume_at_next_action=True,
            default_state={},
            default_entrypoint="fan",
        )
        .build()
    )
    results = []
    for _ in range(3):
        app.run(halt_after=["fan"])
        results.append(list(app.state["xs"]))
    return results


def main() -> int:
    rc = 0
    for label, fan in [("buggy Fan", Fan()), ("FixedFan", FixedFan())]:
        results = run(fan)
        print(f"\n[{label}]")
        for i, xs in enumerate(results):
            print(f"  invocation {i}: {xs}")
        replayed = all(xs == results[0] for xs in results[1:])
        print(f"  -> {'STALE REPLAY' if replayed else 'fresh outputs'}")
        if (label == "buggy Fan") != replayed:
            rc = 1
    return rc


if __name__ == "__main__":
    raise SystemExit(main())

Output:

❯ uv run repro_mapstates_bug.py
[buggy Fan]
  invocation 0: [0.3601082508430954, 0.47278782334648684, 0.052613734197709316]
  invocation 1: [0.3601082508430954, 0.47278782334648684, 0.052613734197709316]
  invocation 2: [0.3601082508430954, 0.47278782334648684, 0.052613734197709316]
  -> STALE REPLAY

[FixedFan]
  invocation 0: [0.07898183874474951, 0.9919101156042253, 0.5423793059392615]
  invocation 1: [0.5005441699503071, 0.3534622090272228, 0.17871141506633137]
  invocation 2: [0.4519801578765156, 0.07313228344734668, 0.9619878312667838]
  -> fresh outputs

Library & System Information

Burr 0.40.2
