DEPR: Series.view #20251

Closed
Tracked by #3
jreback opened this issue Mar 10, 2018 · 6 comments · Fixed by #56054
Labels
Deprecate Functionality to remove in pandas

Comments

@jreback
Contributor

jreback commented Mar 10, 2018

we do not fully control memory in pandas currently; it's deferred to numpy, so a view() rarely makes sense, especially since we will change the dtype as appropriate upon setting, breaking the view association.

In [35]: s = pd.Series([1, 2, 3])

In [36]: s_view = s.view()

In [39]: s_view
Out[39]: 
0    4.940656e-324
1    9.881313e-324
2    1.482197e-323
dtype: float64

we should just remove .view() entirely.
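
The odd floats in the output above are consistent with the int64 bytes being reinterpreted as float64 rather than converted. A minimal NumPy-level sketch of that difference (the variable names are illustrative, and this is not the pandas implementation):

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int64)

# Reinterpret the same 8-byte elements as float64; no data is copied.
as_float = ints.view(np.float64)
print(as_float)                            # tiny subnormal floats, not 1.0, 2.0, 3.0
print(np.shares_memory(ints, as_float))    # the two arrays use the same buffer

# A value-preserving conversion allocates a new array instead,
# so the link to the original buffer is gone.
converted = ints.astype(np.float64)
print(converted)
print(np.shares_memory(ints, converted))
```

Once a conversion like the second one happens, changes to one array no longer show up in the other, which is the "breaking the view association" problem described above.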

@jreback jreback added Deprecate Functionality to remove in pandas Difficulty Intermediate labels Mar 10, 2018
@jreback jreback added this to the 0.23.0 milestone Mar 10, 2018
@jorisvandenbossche
Member

Didn't see this one yet, but just opened: #20281.
Deprecating is also a solution, but just fixing the behaviour when no dtype is specified (the one you show above) is also an option.

But I agree it is hardly useful given the constraint on keeping the same number of elements.

@villasv
Contributor

villasv commented Mar 11, 2018

The only meaningful use case I could think of is the one I chose for the example in the docstring: converting unsigned int from/to signed int. Basically every other byte conversion is very unlikely to be useful for regular pandas users.

Contrived scenario:

Let's say I have a really big Series of int8 integers that barely fits in memory (which is basically why I'm trying to use int8 instead of int64). I then realize that actually int8 isn't enough because I have some data points in the 127~255 range, and because none of the data points are negative I could have been using uint8 all along. Creating a uint8 view would be my salvation - assuming that for some reason I don't want to simply redo the computation steps this time using uint8 from the beginning.
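
A rough sketch of that scenario at the NumPy level, assuming the values above 127 wrapped around when they were stored as int8 (the data and names here are made up for illustration):

```python
import numpy as np

# Hypothetical data: non-negative values that ended up stored as int8,
# so anything above 127 wrapped around to a negative number.
true_values = np.array([0, 127, 200, 255], dtype=np.uint8)
stored = true_values.astype(np.int8)   # 200 and 255 become negative

# Reinterpret the same bytes as uint8; no second buffer is allocated.
recovered = stored.view(np.uint8)
print(stored)
print(recovered)
print(np.shares_memory(stored, recovered))
```

The same reinterpretation can be applied to the array behind a Series (for example via `Series.to_numpy()`), although whether wrapping the result back into a Series copies the data depends on the pandas version.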

@jreback jreback modified the milestones: 0.23.0, 0.24.0 Apr 24, 2018
@jreback jreback modified the milestones: 0.24.0, 0.25.0 Oct 23, 2018
@jreback jreback added the Blocker Blocking issue or pull request for an upcoming release label Jun 28, 2019
@jorisvandenbossche jorisvandenbossche modified the milestones: 0.25.0, Contributions Welcome Jun 30, 2019
@mroeschke mroeschke removed the Blocker Blocking issue or pull request for an upcoming release label Nov 2, 2019
@mroeschke
Member

The original example has been fixed since

In [4]: s = pd.Series([1, 2, 3])
   ...: s_view = s.view()

In [5]: s_view
Out[5]:
0    1
1    2
2    3
dtype: int64

And view is in the ExtensionArray interface too, so it would need deprecating there as well. Not sure how pervasively this is used internally; cc @jbrockmendel

@jbrockmendel
Member

We use .view a lot in the dt64/td64/period code. I don't think we use it often in the general EA case outside of (Series|Index).view
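
For context, the kind of view meant here is the zero-copy switch between datetime64 values and their underlying int64 representation. A small NumPy sketch of the idea (this is an illustration, not code taken from pandas internals):

```python
import numpy as np

stamps = np.array(["1970-01-01", "1970-01-02"], dtype="datetime64[ns]")

# datetime64[ns] is stored as int64 nanoseconds since the epoch,
# so viewing as "i8" exposes those integers without copying.
as_int = stamps.view("i8")
print(as_int)
print(np.shares_memory(stamps, as_int))

# Viewing back as datetime64[ns] round-trips to the original values.
back = as_int.view("M8[ns]")
print(back)
```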

@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@jorisvandenbossche
Member

jorisvandenbossche commented Mar 13, 2023

I would like to bring this up again, at least for the Series method. It's a method on our core object's precious namespace, but I don't think there is an actual use case for users where you need this (apart from the example of int8 -> uint8 given above in #20251 (comment), but I think that's a very advanced use case that can be worked around in other ways as well).
Users who want to avoid the copy for the subset of casts where it is not needed can always use .astype(.., copy=False) (which will actually become the default with CoW enabled).

Deprecating / removing it for Series doesn't mean that we need to do the same for the EAs, if that is something we use ourselves a lot.
I think it's fine to have this method on the array (np.ndarray has it as well), but not on Series (I think one of the reasons we have this method on Series in the first place is that at some point in time Series was an ndarray subclass, and so inherited this from np.ndarray).
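
To illustrate what "avoid the copy where it is not needed" means, here is a sketch using `np.ndarray.astype` with `copy=False`. It is shown at the NumPy level only; the assumption is that the Series method follows the same general idea, and how the `copy` keyword behaves there varies across pandas versions.

```python
import numpy as np

arr = np.array([1, 2, 3], dtype=np.int64)

# dtype already matches: with copy=False the input array itself is returned.
same = arr.astype(np.int64, copy=False)
print(same is arr)

# dtype differs: a real conversion is required, so a new buffer is allocated
# even though copy=False was requested.
other = arr.astype(np.float64, copy=False)
print(np.shares_memory(arr, other))
```

Unlike a view, this never reinterprets bytes: the values are always preserved, and the copy is skipped only when no conversion work is required.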

@jbrockmendel
Member

No objection to deprecating it on Series.

5 participants