Hi,
I am using distributed to read a dataframe on a LocalCluster, which works well. I am then trying to client.map a function that takes said dataframe as an argument, following the 'launching tasks from tasks' guidelines.
I get a cloudpickle error on the function I try to map, pointing to the c.compute line below, when data is a dask dataframe. If I persist the dataframe prior to mapping, though, this seems to work:
from dask.distributed import get_client, secede, rejoin

def function(key, data=None):
    c = get_client()  # client of the cluster running this task
    df = c.compute(data.loc[key])  # cloudpickle error raised here when data is a dask dataframe
    secede()  # leave the worker's thread pool while waiting
    df = c.gather(df)
    rejoin()  # rejoin the thread pool
    return df
Is this a limitation of dask, a bug or just something I am missing?
Thanks
Olivier