DataFrame[sparse].__getitem__ should be Series, not SparseSeries #23559

Closed
TomAugspurger opened this issue Nov 8, 2018 · 2 comments · Fixed by #23561

@TomAugspurger

Can we break API here to return a Series[Sparse] instead?

In [12]: df = pd.DataFrame({"A": pd.SparseArray([1, 2, 3])})

In [13]: df.dtypes
Out[13]:
A    Sparse[int64, 0]
dtype: object

In [14]: type(df['A'])
Out[14]: pandas.core.sparse.series.SparseSeries
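
For comparison, a minimal sketch of what the requested behavior looks like from the user's side. It assumes a pandas version in which the change has landed (0.24 or later, going by the milestone) and builds the column with `pd.arrays.SparseArray`; the variable names are illustrative only.

```python
import pandas as pd

# Build a frame with one sparse column, as in the report above.
df = pd.DataFrame({"A": pd.arrays.SparseArray([1, 2, 3])})
col = df["A"]

# With the requested behavior, column access yields a plain Series ...
is_plain_series = type(col) is pd.Series

# ... and the sparseness is carried by the dtype rather than by the class.
is_sparse_dtype = isinstance(col.dtype, pd.SparseDtype)

print(type(col).__name__, col.dtype)
```

Code that previously checked `isinstance(x, pd.SparseSeries)` would, under this assumption, check the dtype instead.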

cc @rkern

TomAugspurger added the Sparse (Sparse Data Type) label Nov 8, 2018
TomAugspurger added this to the 0.24.0 milestone Nov 8, 2018
TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Nov 8, 2018
Breaking API change for

```python
In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"A": pd.SparseSeries([1, 0])})

In [3]: type(df['A'])
Out[3]: pandas.core.sparse.series.SparseSeries
```

Now Out[3] is a Series.

closes pandas-dev#23559

TomAugspurger commented Nov 8, 2018

FYI, these sparse-related PRs are all working towards deprecating SparseDataFrame and SparseSeries. Being able to deprecate means we need to break API in a few places (like here) where the user has no control over the return type, unless we go with some kind of pd.options route. Do we want to do that?

pd.options.mode.slice_sparse : {None, 'Series', 'SparseSeries'}

The default of None would warn and return a SparseSeries, and setting 'Series' would opt in to the future behavior? The strange thing about that is that if we're deprecating SparseSeries, you'll get a warning anyway with 'SparseSeries' :)
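
A rough pure-Python sketch of the three-way option described above, just to make the semantics concrete. `slice_sparse` is only the name floated in this comment, not an existing pandas option, and `resolve_getitem_type` is a hypothetical helper that returns the name of the type `__getitem__` would produce.

```python
import warnings

# Hypothetical values for the proposed pd.options.mode.slice_sparse.
# This option is only suggested in the discussion; it is not a real pandas option.
VALID_VALUES = (None, "Series", "SparseSeries")


def resolve_getitem_type(slice_sparse=None):
    """Return the name of the type a sparse column lookup would produce."""
    if slice_sparse not in VALID_VALUES:
        raise ValueError(f"invalid value for slice_sparse: {slice_sparse!r}")
    if slice_sparse == "Series":
        # Explicit opt-in to the future behavior, no warning.
        return "Series"
    if slice_sparse is None:
        # Default: keep the old return type but warn about the upcoming change.
        warnings.warn(
            "column access will return Series instead of SparseSeries",
            FutureWarning,
            stacklevel=2,
        )
    # Both None and 'SparseSeries' keep the legacy return type.
    return "SparseSeries"
```

The oddity noted above shows up here too: the 'SparseSeries' branch is silent in this sketch, but constructing a deprecated SparseSeries would itself warn.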


jreback commented Nov 8, 2018

yeah this seems fine
