Prune scheduling sometimes crashing #135

@davramov

When move.py schedules pruning via prune_controller.prune(), the Schedule Prefect Flow task occasionally crashes with "Execution was cancelled by the runtime environment" before the deferred prune run is registered with the Prefect server. The dispatcher flow itself still completes successfully and logs "Scheduled delete from spot832 at ...", but no scheduled flow run actually exists, so the files are never pruned.

This appears to have become more frequent since the reconstruction flow was turned off, presumably because the dispatcher now finishes faster and gives the submitted task less time to complete before teardown.

Failed:
[two screenshots, not included]

Completed example:
[screenshot, not included]

I think the solution is to make sure the async submit has completed before Prefect considers the task/flow done.

In GlobusPruneController.prune(), block on the future:

future = schedule_prefect_flow.submit(...)
future.wait()  # or .result() to surface exceptions
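
To illustrate why blocking on the future should help, here is a minimal sketch of the same race using only the standard library. It is an analogy, not Prefect code: `ThreadPoolExecutor` stands in for the task runner, `dispatch`, `registry`, and `busy` are hypothetical names, and the `busy` event exists only to make the timing deterministic. Note that `concurrent.futures.Future` exposes `.result()` rather than a `.wait()` method.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def dispatch(block_on_future: bool) -> tuple[list[str], bool]:
    """Simulate a dispatcher that submits a scheduling task, then tears down.

    Returns the registered runs and whether the scheduling task was cancelled.
    """
    registry: list[str] = []   # stand-in for runs registered with the server
    busy = threading.Event()   # keeps the single worker occupied at first

    pool = ThreadPoolExecutor(max_workers=1)
    pool.submit(busy.wait)     # other work holding the only worker
    # The scheduling task is queued behind the busy worker, so it is pending.
    future = pool.submit(registry.append, "prune spot832")

    if block_on_future:
        busy.set()             # let the worker reach the queued task
        future.result()        # block until done; re-raises its exceptions

    # Simulated runtime teardown: anything still pending is cancelled.
    pool.shutdown(wait=False, cancel_futures=True)
    busy.set()                 # release the worker so its thread can exit
    return registry, future.cancelled()


if __name__ == "__main__":
    print(dispatch(block_on_future=False))  # scheduling task never ran
    print(dispatch(block_on_future=True))   # scheduling task completed
```

Without the blocking call, the dispatcher returns normally while the submitted work is cancelled during teardown, which matches the symptom above: a success log with no scheduled run behind it. Using `.result()` instead of a bare wait also surfaces any exception raised inside the scheduling task.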
