Compatibility with pandas 1.1.0rc0 #6429
Thanks as always for doing this @TomAugspurger
This should be good to go.
@@ -16,6 +17,9 @@
from dask.dataframe.io.parquet.core import ParquetSubgraph
from dask.utils import natural_sort_key, parse_bytes


pytestmark = pytest.mark.skipif(da.numpy_compat._numpy_120, reason="Unsupported")
We won't want this long-term, since it's a blanket skip of the whole module whenever we have NumPy 1.20. I've added it for now just to keep the number of failures on this branch manageable.
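For reference, a minimal sketch of the narrower alternative: marking only the affected tests instead of setting a module-level `pytestmark`. The flag `NUMPY_120` and both test names are hypothetical stand-ins (the real flag in the diff is `da.numpy_compat._numpy_120`), so this illustrates the pattern rather than the eventual fix.

```python
import pytest

# Hypothetical stand-in for da.numpy_compat._numpy_120 from the diff.
NUMPY_120 = True

# Module-wide form, as in the diff: every test in the file is skipped.
# pytestmark = pytest.mark.skipif(NUMPY_120, reason="Unsupported")

# Narrower form: a reusable mark applied only to the tests known to fail.
skip_numpy_120 = pytest.mark.skipif(NUMPY_120, reason="Unsupported on NumPy 1.20")


@skip_numpy_120
def test_affected():
    # Hypothetical test that breaks on NumPy 1.20; skipped while the flag is set.
    raise AssertionError("would fail on NumPy 1.20")


def test_unaffected():
    # Hypothetical test that keeps running regardless of the NumPy version.
    assert 1 + 1 == 2
```

With this layout the unaffected tests keep running on NumPy 1.20, and the skip can be removed test by test as each one is fixed.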
None,
None,
None,
marks=pytest.mark.xfail(PANDAS_GT_110, reason="upstream changes"),
We'll need to do some work here, but it isn't especially time-sensitive or high-priority. Right now, for datetime columns, you'll just see a FutureWarning from `.describe` that can't be silenced through dask:
In [15]: df = pd.DataFrame({"A": pd.date_range("2000", periods=2)})
In [16]: ddf = dd.from_pandas(df, npartitions=1)
In [17]: df.describe()
/Users/taugspurger/.virtualenvs/dask-dev/bin/ipython:1: FutureWarning: Treating datetime data as categorical rather than numeric in `.describe` is deprecated and will be removed in a future version of pandas. Specify `datetime_is_numeric=True` to silence this warning and adopt the future behavior now.
#!/Users/taugspurger/Envs/dask-dev/bin/python
Out[17]:
A
count 2
unique 2
top 2000-01-01 00:00:00
freq 1
first 2000-01-01 00:00:00
last 2000-01-02 00:00:00
In [18]: ddf.describe()
/Users/taugspurger/sandbox/dask/dask/dataframe/core.py:2230: FutureWarning: Treating datetime data as categorical rather than numeric in `.describe` is deprecated and will be removed in a future version of pandas. Specify `datetime_is_numeric=True` to silence this warning and adopt the future behavior now.
meta = data._meta_nonempty.describe()
/Users/taugspurger/sandbox/dask/dask/dataframe/core.py:2128: FutureWarning: Treating datetime data as categorical rather than numeric in `.describe` is deprecated and will be removed in a future version of pandas. Specify `datetime_is_numeric=True` to silence this warning and adopt the future behavior now.
meta = self._meta_nonempty.describe(include=include, exclude=exclude)
Out[18]:
Dask DataFrame Structure:
A
npartitions=1
object
...
Dask Name: describe, 19 tasks
In [19]: _.compute()
Out[19]:
A
unique 2
count 2
top 2000-01-02 00:00:00
freq 1
first 2000-01-01 00:00:00
last 2000-01-02 00:00:00
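Until dask handles this itself, one possible user-side workaround is to filter the warning with the standard `warnings` module. The sketch below uses only the standard library: `describe_stand_in` is a hypothetical function that emits a warning with the same category and opening text as the one in the session above, in place of the real `ddf.describe()` call.

```python
import warnings


def describe_stand_in():
    # Hypothetical stand-in for the describe call that triggers the warning.
    warnings.warn(
        "Treating datetime data as categorical rather than numeric in "
        "`.describe` is deprecated and will be removed in a future version "
        "of pandas.",
        FutureWarning,
    )
    return {"count": 2}


# record=True collects any warning that gets past the filters, so we can
# check that the targeted one was suppressed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Filters added later are checked first; `message` is a regex matched
    # against the start of the warning text.
    warnings.filterwarnings(
        "ignore",
        message="Treating datetime data as categorical",
        category=FutureWarning,
    )
    result = describe_stand_in()
```

The filter is scoped to the `with` block, so other FutureWarnings outside it are still shown.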
Hey @TomAugspurger -- does this supersede #6156?
It does, thanks.
There's one test failure in the upstream-dev env on Travis, but it's a pyarrow error and seems unrelated to this: https://travis-ci.org/github/dask/dask/jobs/710430328
Should be good to go @dask/maintenance
Thanks!