Add utilities to serialize context to a dictionary and hydrate context from a dictionary #13529

Merged
desertaxle merged 3 commits into main from serialize-context
May 23, 2024

@desertaxle desertaxle commented May 22, 2024

When running tasks on remote infrastructure like Dask and Ray with the task engine, we lose some of the context surrounding the current task run. This makes it difficult to link the task run back to its parent run. This can also make it difficult to ensure the correct settings are propagated to the remote infrastructure.

This PR introduces two complementary utilities: serialize_context and hydrated_context. Used together, they let us serialize the current context to a pickleable dictionary, send it to the remote infrastructure, and recreate it there. This makes the task engine highly portable and helps us maintain consistent dependency tracking.

The task run engine has been updated to accept a context keyword argument, which consumes the output of serialize_context.

Example

Submit a task to Dask with the current context:

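from distributed import Client

from prefect import flow, task
from prefect.context import serialize_context

# NOTE: run_task must also be imported from Prefect's task engine module;
# the exact module path may depend on the Prefect version.


@task
def square(x):
    return x**2


@flow
def my_flow():
    # Capture the current run context as a pickleable dictionary.
    context = serialize_context()
    with Client() as client:
        # Ship the task, its parameters, and the serialized context to Dask.
        future = client.submit(
            run_task,
            square,
            parameters={"x": 42},
            context=context,
        )
        # Wait for the result before the client shuts down.
        return future.result()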

Checklist

  • This pull request references any related issue by including "closes <link to issue>"
    • If no issue exists and your change is not a small fix, please create an issue first.
  • If this pull request adds new functionality, it includes unit tests that cover the changes
  • This pull request includes a label categorizing the change e.g. maintenance, fix, feature, enhancement, docs.

For documentation changes:

  • This pull request includes redirect settings in netlify.toml for files that are removed or renamed.

For new functions or classes in the Python SDK:

  • This pull request includes helpful docstrings.
  • If a new Python file was added, this pull request contains a stub page in the Python SDK docs and an entry in mkdocs.yml navigation.

@desertaxle desertaxle added the enhancement An improvement of an existing feature label May 23, 2024
@desertaxle desertaxle marked this pull request as ready for review May 23, 2024 01:38
@desertaxle desertaxle requested a review from a team as a code owner May 23, 2024 01:38
@desertaxle desertaxle requested review from cicdw and removed request for a team May 23, 2024 01:38

@cicdw cicdw left a comment

I like how explicit this approach is vs. making context natively pickleable (which would probably lead to some confusion around __enter__ and __exit__ mechanics). Nice.

@desertaxle desertaxle merged commit bb3d2c2 into main May 23, 2024
26 checks passed
@desertaxle desertaxle deleted the serialize-context branch May 23, 2024 13:07