DataFrame[sparse].__getitem__ should be Series, not SparseSeries #23559

Closed
TomAugspurger opened this issue Nov 8, 2018 · 2 comments

@TomAugspurger
Contributor

commented Nov 8, 2018

Can we break API here to return a Series[Sparse] instead?

In [12]: df = pd.DataFrame({"A": pd.SparseArray([1, 2, 3])})

In [13]: df.dtypes
Out[13]:
A    Sparse[int64, 0]
dtype: object

In [14]: type(df['A'])
Out[14]: pandas.core.sparse.series.SparseSeries
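
For comparison, here is a small sketch of how the requested behavior could be checked. It assumes a pandas version where `pd.arrays.SparseArray` and `pd.SparseDtype` are available and where this change has landed; the variable names are only illustrative.

```python
import pandas as pd

# Build a regular DataFrame whose single column holds sparse data.
df = pd.DataFrame({"A": pd.arrays.SparseArray([1, 2, 3])})

col = df["A"]

# The column should come back as a plain Series, not a Series subclass.
is_plain_series = type(col) is pd.Series

# Sparsity is carried by the dtype rather than by the container type.
has_sparse_dtype = isinstance(col.dtype, pd.SparseDtype)

print(is_plain_series, has_sparse_dtype)
```

The point is that the sparse information lives in the column's dtype, so nothing is lost when the container is an ordinary Series.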

cc @rkern

@TomAugspurger TomAugspurger added the Sparse label Nov 8, 2018

@TomAugspurger TomAugspurger added this to the 0.24.0 milestone Nov 8, 2018

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Nov 8, 2018

API: DataFrame.__getitem__ returns Series for sparse column
Breaking API change for

```python
In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"A": pd.SparseSeries([1, 0])})

In [3]: type(df['A'])
Out[3]: pandas.core.sparse.series.SparseSeries
```

Now Out[3] is a Series.

closes pandas-dev#23559

@TomAugspurger

Contributor Author

commented Nov 8, 2018

FYI, these sparse-related PRs are all working towards deprecating SparseDataFrame and SparseSeries. To deprecate them, we need to break API in a few places (like here) where the user has no control over the return type, unless we go with some kind of pd.options route. Do we want to do that?

pd.options.mode.slice_sparse : {None, 'Series', 'SparseSeries'}

The default of None would warn and return a SparseSeries, and setting 'Series' would opt in to the future behavior? The strange thing is that if we're deprecating SparseSeries, you'll get a warning anyway with 'SparseSeries' :)
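
A rough sketch of how the three option values could map to return types and warnings. This is not pandas code: `slice_sparse` is only the option name floated above, and `resolve_slice_type` is a hypothetical helper that returns a type name rather than a real object.

```python
import warnings

# Values proposed above for the hypothetical pd.options.mode.slice_sparse option.
VALID_OPTIONS = (None, "Series", "SparseSeries")


def resolve_slice_type(option=None):
    """Return the name of the type df[col] would produce for a sparse column.

    Hypothetical helper for illustration only; it does not exist in pandas.
    """
    if option not in VALID_OPTIONS:
        raise ValueError(f"slice_sparse must be one of {VALID_OPTIONS}, got {option!r}")

    if option == "Series":
        # Explicit opt-in to the future behavior: no warning.
        return "Series"

    if option is None:
        # Default: keep the old return type, but warn that it will change.
        message = "df[col] will return a Series for sparse columns in the future"
    else:
        # 'SparseSeries' was requested explicitly, but SparseSeries itself
        # would be deprecated, so this path still warns.
        message = "SparseSeries is deprecated"

    warnings.warn(message, FutureWarning, stacklevel=2)
    return "SparseSeries"
```

Written out this way, only the 'Series' setting is warning-free, which is the oddity noted above.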

@jreback

Contributor

commented Nov 8, 2018

yeah this seems fine
