Test numeric edge case for repartition with npartitions. #5433
We hit an issue with Dask 1.2.0 where repartitioning a certain number of source partitions into a certain number of target partitions dropped one partition from the result. We traced this to a numeric instability when determining divisions: dividing by a number and then multiplying by the same number does not always yield exactly the original value, and can instead yield slightly less, which causes a problem here: https://github.com/dask/dask/blob/master/dask/dataframe/core.py#L5516
With Dask >=2, the problem was fixed by introducing this check: c4b6770#diff-492da7893fa5fa6ad3a6ef6cdef985f2R4821
Nevertheless, it would be safer to preclude any chance of the same bug being introduced again by adding a test, as done by this PR.
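
For illustration, the floating-point effect described above can be reproduced without Dask. The sketch below is not the Dask code path; the helper name `round_trip_mismatches` is hypothetical and only demonstrates that a divide-then-multiply round trip can land just below the original value, so that a subsequent truncation picks a smaller integer than expected.

```python
def round_trip_mismatches(limit):
    """Return the integers n in [1, limit] where (1.0 / n) * n != 1.0.

    Hypothetical helper for illustration only; it is not part of Dask.
    """
    return [n for n in range(1, limit + 1) if (1.0 / n) * n != 1.0]


mismatches = round_trip_mismatches(100)
print(mismatches)

# n = 49 is a well-known case: the round trip lands slightly below 1.0.
value = (1.0 / 49) * 49
print(value, value < 1.0)

# Truncating such a value yields 0 instead of 1, which is the kind of
# off-by-one that can drop a boundary when positions are computed this way.
print(int(value))
```

Whether a given pair of partition counts triggers the problem depends on the exact values involved, which is why a regression test pinned to a known failing combination is useful.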