Dask dataframe isna #3294

Merged
merged 16 commits into from Mar 25, 2018
4 changes: 4 additions & 0 deletions dask/dataframe/__init__.py
@@ -16,3 +16,7 @@
    from .io import read_parquet, to_parquet
except ImportError:
    pass
try:
    from .core import isna
except ImportError:
    pass
6 changes: 6 additions & 0 deletions dask/dataframe/core.py
@@ -4123,6 +4123,12 @@ def to_timedelta(arg, unit='ns', errors='raise'):
meta=meta)


if hasattr(pd, isna):
Member

This should be `if hasattr(pd, 'isna'):` (as seen in the Travis logs).
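
To illustrate why the attribute name needs quotes, here is a minimal sketch. It uses the standard-library `math` module as a stand-in for `pd` (an assumption made only so the snippet runs without pandas installed), and `has_isnan` and `error_name` are illustrative names that do not appear in this PR.

```python
import math

# hasattr takes the attribute name as a string and returns a bool.
has_isnan = hasattr(math, "isnan")

# With a bare name, Python first evaluates `isnan` as a variable.
# No such variable is defined here, so a NameError is raised
# before hasattr is ever called.
try:
    hasattr(math, isnan)  # noqa: F821 (intentionally undefined name)
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__

print(has_isnan, error_name)
```

The same reasoning applies to `hasattr(pd, isna)`: unless a variable called `isna` already exists at that point in the module, the check fails before it can test anything.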


In general, I recommend running the tests yourself before pushing to GitHub. We use py.test for testing:

$ conda install pytest
# or
$ pip install pytest

$ py.test dask  # run the whole test suite
$ py.test dask/dataframe/tests/test_dataframe.py -k test_isna  # test just your added test

Contributor Author

Thanks! Should be fixed now.

    @wraps(pd.isna)
    def isna(arg):
        return map_partitions(pd.isna, arg)
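
As a rough sketch of why applying `pd.isna` partition by partition is safe: the function is elementwise, so running it on each chunk and concatenating gives the same answer as running it on the whole Series. The manual split below is only a stand-in for dask partitions, not how dask implements `map_partitions`, and the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([float("nan"), 0.0, 1.0, float("nan")])

# Stand-in for dask partitions: two contiguous chunks of the same Series.
partitions = [s.iloc[:2], s.iloc[2:]]

# Apply pd.isna to each chunk independently, then stitch the results together.
partitionwise = pd.concat([pd.isna(part) for part in partitions])

# Apply pd.isna to the whole Series in one call for comparison.
whole = pd.isna(s)

# Elementwise results agree regardless of how the data was chunked.
matches = partitionwise.tolist() == whole.tolist()
print(matches)
```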


def _repr_data_series(s, index):
    """A helper for creating the ``_repr_data`` property"""
    npartitions = len(index) - 1
10 changes: 10 additions & 0 deletions dask/dataframe/tests/test_dataframe.py
@@ -2811,6 +2811,16 @@ def test_to_timedelta():
dd.to_timedelta(ds, errors='coerce'))


@pytest.mark.skipif(PANDAS_VERSION < '0.22.0',
Contributor Author

@mrocklin, should this use the `hasattr` check for `isna` instead? It just occurred to me that `isna` could be deprecated in future versions of pandas.

Member

I wouldn't worry about isna being deprecated.

                    reason="No isna method")
@pytest.mark.parametrize('values', [[np.NaN, 0], [1, 1]])
def test_isna(values):
    s = pd.Series(values)
    ds = dd.from_pandas(s, npartitions=2)

    assert_eq(pd.isna(s), dd.isna(ds))


@pytest.mark.parametrize('drop', [0, 9])
def test_slice_on_filtered_boundary(drop):
    # https://github.com/dask/dask/issues/2211