Raise in P2P if column dtype is wrong #8167
def test_set_index_p2p_with_existing_index():
    df = pd.DataFrame({"a": np.random.randint(0, 3, 20)}, index=np.random.random(20))
    ddf = dd.from_pandas(
        df,
        npartitions=4,
    )
    with Client() as c:
        with pytest.raises(TypeError, match="_partitions.*integer"):
            ddf.set_index("a", shuffle="p2p")


def test_sort_values_p2p_with_existing_divisions():
    "Regression test for #8165"
    df = pd.DataFrame(
        {"a": np.random.randint(0, 3, 20), "b": np.random.randint(0, 3, 20)}
    )
    ddf = dd.from_pandas(
        df,
        npartitions=4,
    )
    with Client() as c:
        with dask.config.set({"dataframe.shuffle.method": "p2p"}):
            with pytest.raises(TypeError, match="_partitions.*integer"):
                ddf = ddf.set_index("a").sort_values("b")
These two tests need to be updated once dask/dask#10493 is merged.
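A side note on the `match="_partitions.*integer"` argument used in both tests: `pytest.raises` checks the pattern against the string form of the exception with `re.search`, so the pattern only has to occur somewhere in the message. Below is a minimal standard-library sketch of that behaviour. The helper name `message_matches` and the error text are made up for illustration and are not necessarily what this PR raises.

```python
import re

PATTERN = "_partitions.*integer"


def message_matches(exc: BaseException, pattern: str = PATTERN) -> bool:
    """Mimic how pytest.raises(match=...) tests an exception message."""
    # re.search scans the whole string, so the pattern need not start at index 0.
    return re.search(pattern, str(exc)) is not None


# Hypothetical messages, for illustration only.
err = TypeError("Expected column '_partitions' to be an integer column, is float64")
unrelated = TypeError("unsupported operand type(s)")

print(message_matches(err))
print(message_matches(unrelated))
```

Because the pattern is deliberately loose, the tests would keep passing if the wording around the column name and the word "integer" changes.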
Unit Test Results
See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.
21 files ±0, 21 suites ±0, 10h 40m 28s (-7m 53s)
For more details on these failures, see this check.
Results for commit 6d3a76a. ± Comparison against base commit 20def28.
This comment has been updated with latest results.
from dask.dataframe.core import new_dd_object

meta = df._meta
if not pd.api.types.is_integer_dtype(meta[column]):
    raise TypeError(
This used to be tested before dask/dask#10493 was merged.
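For context, the dtype guard in the hunk above can be tried out with pandas alone. This is a rough sketch, not the PR's code: the function name `check_integer_column` and the message wording are assumptions, and only `pd.api.types.is_integer_dtype` is taken from the diff.

```python
import pandas as pd


def check_integer_column(meta: pd.DataFrame, column: str) -> None:
    """Raise TypeError unless meta[column] has an integer dtype."""
    dtype = meta[column].dtype
    if not pd.api.types.is_integer_dtype(dtype):
        # Message wording is an assumption, not copied from the PR.
        raise TypeError(
            f"Expected column {column!r} to be an integer column, is {dtype}"
        )


# An integer partition column passes; a float one is rejected.
ok = pd.DataFrame({"_partitions": pd.Series([0, 1, 2], dtype="int64")})
bad = pd.DataFrame({"_partitions": pd.Series([0.0, 1.5], dtype="float64")})

check_integer_column(ok, "_partitions")
try:
    check_integer_column(bad, "_partitions")
except TypeError as exc:
    print(exc)
```

Checking the metadata up front means the problem shows up as a `TypeError` when the shuffle is set up, not as a harder-to-read failure partway through it.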
couple comments
with LocalCluster(
    n_workers=2, dashboard_address=":0", loop=loop
) as cluster, Client(cluster) as c:
    ddf.set_index("a", shuffle="p2p")
Why are you not using the result of this op?
Whoopsie, changed this test too many times.
Fixed
Co-authored-by: Patrick Hoefler <61934744+phofl@users.noreply.github.com>
pre-commit run --all-files