Add utilities to serialize context to a dictionary and hydrate context from a dictionary #13529

desertaxle · 2024-05-22T20:34:49Z

When running tasks on remote infrastructure like Dask and Ray with the task engine, we lose some of the context surrounding the current task run. This makes it difficult to link the task run back to its parent run and to ensure the correct settings are propagated to the remote infrastructure.

This PR introduces two complementary utilities: serialize_context and hydrated_context. Used together, they let us serialize the current context to a pickleable dictionary, send it to the remote infrastructure, and recreate it there. This helps to make the task engine highly portable and helps us maintain consistent dependency tracking.
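To make the round trip concrete, here is a minimal stdlib-only sketch of the serialize/hydrate pattern. It is not Prefect's implementation: `serialize_ctx`, `hydrated_ctx`, and the two context variables are hypothetical stand-ins, and the real utilities handle far more state than this.

```python
import pickle
from contextlib import contextmanager
from contextvars import ContextVar

# Hypothetical stand-ins for the pieces of run context worth carrying along.
flow_run_id = ContextVar("flow_run_id", default=None)
settings = ContextVar("settings", default=None)


def serialize_ctx():
    """Snapshot the active context as a plain, pickleable dictionary."""
    return {"flow_run_id": flow_run_id.get(), "settings": settings.get()}


@contextmanager
def hydrated_ctx(serialized):
    """Re-enter the snapshot for the duration of the with-block, then restore."""
    flow_token = flow_run_id.set(serialized["flow_run_id"])
    settings_token = settings.set(serialized["settings"])
    try:
        yield
    finally:
        settings.reset(settings_token)
        flow_run_id.reset(flow_token)


# Caller side: take a snapshot while the context is active.
flow_token = flow_run_id.set("flow-run-123")
settings_token = settings.set({"api_url": "http://example.invalid/api"})
payload = pickle.dumps(serialize_ctx())  # the bytes that would cross the wire
settings.reset(settings_token)
flow_run_id.reset(flow_token)

# "Remote" side: nothing is set until the snapshot is hydrated.
assert flow_run_id.get() is None
with hydrated_ctx(pickle.loads(payload)):
    restored = (flow_run_id.get(), settings.get()["api_url"])
```

Keeping hydration as an explicit context manager means the restored state has a clear scope: it is set on entry and reset on exit, rather than being rebuilt implicitly during unpickling.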

The task run engine has been updated to accept a context kwarg, which consumes the output of serialize_context.
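As a rough illustration of what an entry point with an optional context kwarg can look like, the sketch below hydrates the supplied dictionary before calling the function and falls back to the ambient context when none is given. The names `run_fn`, `hydrated_ctx`, and `parent_run_id` are hypothetical and chosen for this sketch; the actual engine's signature and behavior may differ.

```python
from contextlib import contextmanager, nullcontext
from contextvars import ContextVar

# Hypothetical context variable standing in for the parent run link.
parent_run_id = ContextVar("parent_run_id", default=None)


@contextmanager
def hydrated_ctx(serialized):
    """Apply a serialized context dictionary for the duration of the block."""
    token = parent_run_id.set(serialized.get("parent_run_id"))
    try:
        yield
    finally:
        parent_run_id.reset(token)


def run_fn(fn, parameters=None, context=None):
    """Call fn(**parameters), hydrating `context` first when one is given."""
    scope = hydrated_ctx(context) if context is not None else nullcontext()
    with scope:
        # Report which parent the call could see while it was running.
        return {"result": fn(**(parameters or {})), "parent": parent_run_id.get()}


def square(x):
    return x**2


without_context = run_fn(square, parameters={"x": 42})
with_context = run_fn(
    square,
    parameters={"x": 42},
    context={"parent_run_id": "flow-run-123"},
)
```

Without the kwarg the call cannot see a parent run; with it, the parent link is available for exactly as long as the function executes.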

Example

Submit a task to Dask with the current context:

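from distributed import Client

from prefect import flow, task
from prefect.context import serialize_context

# Assumption: run_task is importable from the task engine module; the exact
# module path may differ between Prefect versions.
from prefect.task_engine import run_task


@task
def square(x):
    return x**2


@flow
def my_flow():
    # Capture the flow run's context as a pickleable dictionary
    context = serialize_context()
    with Client() as client:
        # Run the task on a Dask worker, handing it the serialized context
        future = client.submit(
            run_task,
            square,
            parameters={"x": 42},
            context=context,
        )
        return future.result()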

Checklist

This pull request references any related issue by including "closes <link to issue>"
- If no issue exists and your change is not a small fix, please create an issue first.
If this pull request adds new functionality, it includes unit tests that cover the changes
This pull request includes a label categorizing the change e.g. maintenance, fix, feature, enhancement, docs.

For documentation changes:

This pull request includes redirect settings in netlify.toml for files that are removed or renamed.

For new functions or classes in the Python SDK:

This pull request includes helpful docstrings.
If a new Python file was added, this pull request contains a stub page in the Python SDK docs and an entry in mkdocs.yml navigation.

cicdw

I like how explicit this approach is vs. making context natively pickleable (which would probably lead to some confusion around __enter__ and __exit__ mechanics). Nice.

desertaxle added 3 commits May 22, 2024 15:33

Add utlities to serialize context to a dictionary and hydrate context…

be49527

… from a dictionary

Adds tests

e47011e

Fixes failing tests

cbac245

desertaxle added the enhancement label (An improvement of an existing feature) May 23, 2024

desertaxle marked this pull request as ready for review May 23, 2024 01:38

desertaxle requested a review from a team as a code owner May 23, 2024 01:38

desertaxle requested review from cicdw and removed request for a team May 23, 2024 01:38

cicdw approved these changes May 23, 2024

desertaxle merged commit bb3d2c2 into main May 23, 2024
26 checks passed

desertaxle deleted the serialize-context branch May 23, 2024 13:07
