Initially found in #5713, while trying to use `wait_partitions` to wait for remote computations to finish in the read functions.
# at the moment it is not possible to use `wait_partitions` function;
# in a situation where the reading function is called in a row with the
# same parameters, `wait_partitions` considers that we have waited for
# the end of remote calculations, however, when trying to materialize the
# received data, it is clear that the calculations have not yet ended.
# for example, `test_io_exp.py::test_read_evaluated_dict` is failed because of that
For Dask, this can be solved by making the `pure` parameter `False` by default. The problem was also observed with Ray and unidist.
It seems that if an engine caches the result of a function call with the same parameters, then futures pointing to the same object (or to a copy of it) should be returned. Instead, the futures are reported as finished while the computations are actually still running, which looks very much like a bug. Further research is needed.
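To make the expected behaviour concrete, here is a toy model built only on the standard library. It is not Dask, Ray, or unidist code: `ToyClient`, `slow_read`, and the `(function, args)` cache key are all hypothetical, and the `pure` flag only mimics the idea behind the parameter of the same name on Dask's `Client.submit`. It sketches what one would expect from an engine that deduplicates identical calls: the repeated call gets the same future back, and that future is not reported as done until the underlying work has finished.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyClient:
    """Hypothetical stand-in for an engine client (not a real engine API)."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._by_key = {}  # (function, args) -> future of the first submission

    def submit(self, func, *args, pure=True):
        if not pure:
            # Impure call: always schedule a fresh task, never reuse a future.
            return self._pool.submit(func, *args)
        key = (func, args)
        if key not in self._by_key:
            self._by_key[key] = self._pool.submit(func, *args)
        # Repeated pure call: hand back the future that is already in flight.
        return self._by_key[key]

    def shutdown(self):
        self._pool.shutdown(wait=True)


gate = threading.Event()


def slow_read(path):
    # Simulates a remote read that has not finished until the gate opens.
    gate.wait(timeout=5)
    return f"data from {path}"


client = ToyClient()
first = client.submit(slow_read, "file.csv")
second = client.submit(slow_read, "file.csv")  # same function, same arguments
fresh = client.submit(slow_read, "file.csv", pure=False)

shared_future = first is second
done_before_release = (first.done(), second.done())

gate.set()  # let the simulated reads finish
results = (first.result(), second.result(), fresh.result())
done_after_release = (first.done(), second.done())
client.shutdown()
```

In this model a wait on `second` cannot return early, because `second` is the very future that is still running. The behaviour described in this issue is the opposite: the repeated call appears finished while the work is still in progress.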
The current workaround of materializing dtypes can be problematic: for example, if you load a dataset with a very large `pd.Categorical` that can't fit into the memory of a single worker. This works fine in `AsyncReadMode` but not in the default, synchronous mode, because `_ = query_compiler.dtypes` will crash the worker.
This is obviously quite an edge case. However, I am a bit surprised that synchronous reading is the default; I see why it is necessary in the test suite but I can't imagine it is common to delete data files as soon as they have been loaded.