Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make context pickleable #1710

Closed
joshmeek opened this issue Nov 6, 2019 · 3 comments · Fixed by #1711
Closed

Make context pickleable #1710

joshmeek opened this issue Nov 6, 2019 · 3 comments · Fixed by #1711
Labels
enhancement An improvement of an existing feature

Comments

@joshmeek
Copy link

joshmeek commented Nov 6, 2019

Current behavior

In the following flow:

from prefect import context, Flow, task
from prefect.engine.executors import DaskExecutor

@task
def extract():
    """Get a list of data"""
    return [1, 2, 3]


@task
def transform(data):
    """Multiply the input by 10"""
    return [i * 10 for i in data]


@task
def load(data):
    """Print the data to indicate it was received"""
    logger = context.get("logger")
    logger.info("Here's your data: {}".format(data))

with Flow("ETL") as flow:
    e = extract()
    t = transform(e)
    l = load(t)

executor = DaskExecutor(address="localhost:8786"
flow.run(executor=executor)

We will see the error Unexpected error occured in FlowRunner: TypeError("can't pickle Context objects") because context is used as an attribute of prefect. This requires users to import prefect in their tasks in order to use context.

Proposed behavior

We could adjust the behavior so that context would be unpickled in the task and return prefect.context therefore avoiding this extra step of having to import prefect.

@joshmeek joshmeek added the enhancement An improvement of an existing feature label Nov 6, 2019
@cicdw
Copy link
Member

cicdw commented Nov 7, 2019

This is a good idea and in principle is trivial to implement. However, all implementations I've tried so far preserve context as it was at task creation, which means that all of the things we put into context during a task run (e.g., logger) aren't available.

The reason for this is that the serialization occurs at task submission time, and the deserialization occurs prior to the task runner actually running anything, so there's never really a chance to update context with the appropriate information.

I think instead of making context pickleable, for now I'm going to simply raise an informative error message.

@evanokeefe39
Copy link

As it is, if you follow documentation as a noob and put logger = context.get("logger") anywhere in your code - it breaks flow registration... my first thought was "But i'm not doing anything with context?" up seeing error message. Was only this issue that made me realise it was the logger!

@visualapproachanalytics

To follow up with this, I agree this is a trivial change, but an unimplemented trivial change that ultimately sent us away from Prefect. The flow runs locally, then after a myriad of problems using Prefect Cloud, we found this trivial issue and gave up. You should fix this. It's messy and caused an issue of confidence that we weren't going to constantly run into trivial, yet flow-breaking issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An improvement of an existing feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants