Can't serialize functions decorated with @span
#7954
Comments
I've reproduced the issue; it's a cloudpickle bug (cloudpipe/cloudpickle#509). This said, this is not how spans work. The correct way to write your example is

    import distributed
    from distributed import span

    client = distributed.Client()

    def double(x):
        return x * 2

    with span("my-double"):
        X = client.map(double, range(10))

or as a decorator:

    def double(x):
        return x * 2

    @span("my-double")
    def gen_graph(client):
        return client.map(double, range(10))

    X = gen_graph(client)

which of course in the above minimal example feels a bit silly, but makes a lot more sense if instead of a one-liner you have complicated code, for example:

    @span("load")
    def load(url: str) -> dd.DataFrame:
        ...

    @span("preprocess")
    def preprocess(data: dd.DataFrame) -> dd.DataFrame:
        ...

    @span("train")
    def train(training_data: dd.DataFrame) -> xgboost.dask.Model:
        ...

    raw_data = load("s3://mybucket/mydata.parquet")
    training_data = preprocess(raw_data)
    model = train(training_data)

    def f():
        client = distributed.get_client()
        with span("bar"):
            x = ...  # define dask collection
            fut = client.compute(x)
        distributed.secede()
        return fut.result()

    def main():
        client = distributed.Client()
        with span("foo"):
            client.submit(f).result()

The above example will generate span foo and then a subspan foo->bar. I'm leaving this issue open (tracking the upstream cloudpickle ticket) as your usage is potentially useful in this latter use case; in other words it would make sense to write

    @span("bar")
    def f():
        client = distributed.get_client()
        x = ...  # define dask collection
        fut = client.compute(x)
        distributed.secede()
        return fut.result()

    def main():
        client = distributed.Client()
        with span("foo"):
            client.submit(f).result()
@span
I understand the reasoning behind what the decorator does, but I think this distinction is confusing for users. Given this limitation, I wonder whether it would be best to remove the decorator entirely.
This gives me an error:
I get
This is with dask and distributed 2023.6.2a230627, installed from dask/label/dev, as well as 2023.6.1 and 2023.6.0.

Am I supposed to be able to do this? Or am I doing something wrong? The same code minus the @span decorator works.