Compatibility with pandas 1.1.0rc0 #6429

Merged
merged 12 commits into from Jul 23, 2020

TomAugspurger
Member

No description provided.

@mrocklin
Member

Thanks as always for doing this @TomAugspurger

Member Author

@TomAugspurger TomAugspurger left a comment

This should be good to go.

@@ -16,6 +17,9 @@
from dask.dataframe.io.parquet.core import ParquetSubgraph
from dask.utils import natural_sort_key, parse_bytes


pytestmark = pytest.mark.skipif(da.numpy_compat._numpy_120, reason="Unsupported")
Member Author

We won't want this long-term, since it's a blanket skip of the whole module whenever we have NumPy 1.20. I've added it for now just to keep the number of failures on this branch manageable.
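For readers unfamiliar with version-gated flags such as `_numpy_120` or `PANDAS_GT_110`, here is a minimal, stdlib-only sketch of how a flag like that could be computed. The helper `version_at_least`, the lowercase flag names, and the hard-coded version strings are hypothetical stand-ins for illustration; they are not dask's actual implementation.

```python
import re


def version_at_least(version, minimum):
    """Hypothetical helper: compare the leading numeric components of a
    version string (e.g. "1.1.0rc0") against a tuple such as (1, 1)."""
    parts = []
    for piece in version.split(".")[: len(minimum)]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    return tuple(parts) >= tuple(minimum)


# Assumed version strings; a real test suite would read these from
# np.__version__ and pd.__version__ instead of hard-coding them.
numpy_version = "1.20.0.dev0"
pandas_version = "1.1.0rc0"

numpy_120 = version_at_least(numpy_version, (1, 20))
pandas_gt_110 = version_at_least(pandas_version, (1, 1))
```

A flag like this can be passed to `pytest.mark.skipif(flag, reason=...)` on individual tests, which would be one way to narrow the skip later instead of assigning it to the module-wide `pytestmark`.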

None,
None,
None,
marks=pytest.mark.xfail(PANDAS_GT_110, reason="upstream changes"),
Member Author

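We'll need to do some work here, but it isn't especially time-sensitive or high-priority. Right now, calling `describe` on datetime columns just emits a `FutureWarning` that can't be silenced through dask, because there is no way to pass `datetime_is_numeric=True` through.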

In [15]: df = pd.DataFrame({"A": pd.date_range("2000", periods=2)})

In [16]: ddf = dd.from_pandas(df, npartitions=1)

In [17]: df.describe()
/Users/taugspurger/.virtualenvs/dask-dev/bin/ipython:1: FutureWarning: Treating datetime data as categorical rather than numeric in `.describe` is deprecated and will be removed in a future version of pandas. Specify `datetime_is_numeric=True` to silence this warning and adopt the future behavior now.
  #!/Users/taugspurger/Envs/dask-dev/bin/python
Out[17]:
                          A
count                     2
unique                    2
top     2000-01-01 00:00:00
freq                      1
first   2000-01-01 00:00:00
last    2000-01-02 00:00:00

In [18]: ddf.describe()
/Users/taugspurger/sandbox/dask/dask/dataframe/core.py:2230: FutureWarning: Treating datetime data as categorical rather than numeric in `.describe` is deprecated and will be removed in a future version of pandas. Specify `datetime_is_numeric=True` to silence this warning and adopt the future behavior now.
  meta = data._meta_nonempty.describe()
/Users/taugspurger/sandbox/dask/dask/dataframe/core.py:2128: FutureWarning: Treating datetime data as categorical rather than numeric in `.describe` is deprecated and will be removed in a future version of pandas. Specify `datetime_is_numeric=True` to silence this warning and adopt the future behavior now.
  meta = self._meta_nonempty.describe(include=include, exclude=exclude)
Out[18]:
Dask DataFrame Structure:
                    A
npartitions=1
               object
                  ...
Dask Name: describe, 19 tasks

In [19]: _.compute()
Out[19]:
                          A
unique                    2
count                     2
top     2000-01-02 00:00:00
freq                      1
first   2000-01-01 00:00:00
last    2000-01-02 00:00:00
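Until an option is forwarded through dask, one possible workaround (an assumption on my part, not something proposed in this thread) is to filter the warning at the call site with the standard-library `warnings` module. The sketch below uses a hypothetical stand-in function, `describe_like`, that emits the same kind of `FutureWarning`, so it runs without pandas or dask installed; `quiet_describe` is likewise a made-up name.

```python
import warnings


def describe_like():
    """Hypothetical stand-in for a call such as ``ddf.describe()`` that
    emits the pandas FutureWarning shown in the session above."""
    warnings.warn(
        "Treating datetime data as categorical rather than numeric in "
        "`.describe` is deprecated",
        FutureWarning,
    )
    return "summary"


def quiet_describe():
    """Call describe_like() with only the matching FutureWarning suppressed."""
    with warnings.catch_warnings():
        # `message` is a regex matched against the start of the warning text.
        warnings.filterwarnings(
            "ignore",
            message="Treating datetime data as categorical",
            category=FutureWarning,
        )
        return describe_like()


# Record any warnings that escape the filter; none should.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = quiet_describe()
```

Filtering by message prefix and category keeps unrelated warnings visible. Note that `warnings.catch_warnings` changes process-global state, so this may not cover warnings raised in other processes or on remote workers.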

@gforsyth
Contributor

gforsyth commented Jul 21, 2020

Hey @TomAugspurger, does this supersede #6156?

@TomAugspurger
Member Author

It does, thanks.

@TomAugspurger TomAugspurger mentioned this pull request Jul 23, 2020
Contributor

@gforsyth gforsyth left a comment

There's one test failure in the upstream-dev env on Travis but it's a pyarrow error and seems unrelated to this:
https://travis-ci.org/github/dask/dask/jobs/710430328

Should be good to go @dask/maintenance

@TomAugspurger TomAugspurger merged commit f212b76 into dask:master Jul 23, 2020
@TomAugspurger
Member Author

Thanks!

@TomAugspurger TomAugspurger deleted the pandas-compat-3 branch July 23, 2020 13:38
kumarprabhu1988 pushed a commit to kumarprabhu1988/dask that referenced this pull request Oct 29, 2020