Compatibility issues with numpy's fromnumeric.py #12644

Closed
gfyoung opened this Issue Mar 16, 2016 · 12 comments

Member

gfyoung commented Mar 16, 2016

A recent spate of issues and PRs stems from calling functions defined in numpy's fromnumeric.py module that have identically named but differently implemented methods/functions in pandas. This is indicative of a much larger compatibility issue between the two libraries involving this module. A thorough overview of all of the functions in fromnumeric.py, cross-referenced against their implementations in pandas, is needed to avoid similar issues.

Relevant PRs:
#12413 (issue: #12238)
#12603 (issue: #12600)
#12638

#7325 (from numpy)

Member

gfyoung commented Mar 16, 2016

As mentioned in #12600, I'll tackle this as a follow-up once these PRs have landed.

@jreback jreback added this to the 0.18.1 milestone Mar 16, 2016

Contributor

jreback commented Mar 16, 2016

As discussed, we basically have 2 classes of issues:

  • Arguments like sorter: a seemingly innocuous argument that numpy needs but pandas does not. The solution for now is to pass it through with no checks (it is still a named argument, currently passed by position from numpy). We should note this behavior in the docstring.
  • Methods like .round, .idxmax, and the stat functions: mainly the out argument, which is not needed (and is confusing for pandas). The solution is to allow **kwargs but check them for invalid arguments (to catch misspellings and such), and to raise if the argument in question (here, out) is not None.
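
The two cases above could be handled roughly as follows. This is a minimal sketch, not the pandas implementation: `ToySeries` and `validate_compat_kwargs` are hypothetical names made up for illustration, and only the `sorter` and `out` parameters come from the discussion.

```python
import bisect


def validate_compat_kwargs(fname, kwargs, compat_defaults):
    """Hypothetical helper: allow numpy-only kwargs only at their default value."""
    for key, value in kwargs.items():
        if key not in compat_defaults:
            # Catches misspellings such as `ot=` instead of `out=`.
            raise TypeError(
                f"{fname}() got an unexpected keyword argument {key!r}"
            )
        if value is not compat_defaults[key]:
            raise ValueError(
                f"the {key!r} parameter is not supported by {fname}()"
            )


class ToySeries:
    """Toy container standing in for a pandas-like object (assumption)."""

    def __init__(self, values):
        self.values = list(values)

    def searchsorted(self, value, side="left", sorter=None):
        # Case 1: `sorter` is accepted so a caller can pass it by position,
        # but it is neither checked nor used in this toy version.
        if side == "left":
            return bisect.bisect_left(self.values, value)
        return bisect.bisect_right(self.values, value)

    def round(self, decimals=0, **kwargs):
        # Case 2: accept **kwargs, reject unknown names, and raise if
        # `out` is anything other than None.
        validate_compat_kwargs("round", kwargs, {"out": None})
        return ToySeries(round(v, decimals) for v in self.values)
```

With this shape, `ToySeries([1.234, 5.678]).round(1, out=None)` goes through, while `out=[]` raises `ValueError` and a misspelled `ot=None` raises `TypeError`.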
Contributor

jreback commented Mar 16, 2016

Member

wesm commented Mar 16, 2016

IMHO we should not be striving to make the pandas API compatible with NumPy (except offering an __array__ API, of course), but we should avoid unnecessary / common conflicts if possible.

Member

gfyoung commented Mar 16, 2016

@wesm: Agreed. I think in this case though trying to "align" the API with numpy's makes sense because it should be perfectly legal for example to call either np.searchsorted or Series.searchsorted without Python blowing up on the user.
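
To make the failure mode concrete, here is a toy sketch that imports neither numpy nor pandas. The module-level `searchsorted` function is a hypothetical stand-in for a numpy-style function that forwards its arguments by position to the object's own method, as described earlier in this thread; `StrictSeries` and `CompatSeries` are made-up classes.

```python
import bisect


def searchsorted(a, v, side="left", sorter=None):
    """Stand-in for a numpy-style function (assumption): forward by position."""
    return a.searchsorted(v, side, sorter)


class StrictSeries:
    """Method signature without `sorter`, so the forwarded call fails."""

    def __init__(self, values):
        self.values = list(values)

    def searchsorted(self, v, side="left"):
        find = bisect.bisect_left if side == "left" else bisect.bisect_right
        return find(self.values, v)


class CompatSeries(StrictSeries):
    """Same behavior, but tolerates the extra positional `sorter`."""

    def searchsorted(self, v, side="left", sorter=None):
        return super().searchsorted(v, side)


data = [1, 2, 3, 5]
print(CompatSeries(data).searchsorted(4))   # method call works
print(searchsorted(CompatSeries(data), 4))  # function call works too
try:
    searchsorted(StrictSeries(data), 4)
except TypeError as exc:
    print("function call fails:", exc)
```

Both call styles succeed on `CompatSeries`, while the function-style call on `StrictSeries` raises `TypeError` because of the extra positional argument.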

Contributor

jreback commented Mar 16, 2016

@gfyoung numpy's behavior is a bug really, in that it shouldn't just call _wrap_it and assume everything is a sub-class (like it does). I know you are fixing that, so this is really for compat.
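
One way a dispatcher could avoid assuming that every object's method matches the array signature is to fall back to a plain-array code path when the method call fails. The sketch below is illustrative only and is not numpy's code: `dispatch_take`, `_fallback_take`, and `Picky` are hypothetical names, and a Python list stands in for an ndarray.

```python
def _fallback_take(obj, indices):
    """Hypothetical fallback: coerce to a plain list and index it directly."""
    values = list(obj)
    return [values[i] for i in indices]


def dispatch_take(obj, indices, out=None):
    """Try the object's own `take`; fall back if its signature differs."""
    try:
        method = obj.take
    except AttributeError:
        # No such method at all, so use the plain-list path.
        return _fallback_take(obj, indices)
    try:
        # Forward `out` by position, as an array method would accept it.
        return method(indices, out)
    except TypeError:
        # The method exists but does not accept the extra argument.
        return _fallback_take(obj, indices)


class Picky:
    """Iterable whose `take` accepts only `indices` (made up for this example)."""

    def __init__(self, values):
        self._values = list(values)

    def __iter__(self):
        return iter(self._values)

    def take(self, indices):
        return [self._values[i] for i in indices]
```

A caveat with this design: catching `TypeError` broadly can also hide a genuine error raised inside the method, so a real implementation would likely want to be more selective.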

Member

gfyoung commented Mar 16, 2016

@jreback : Right, I guess "align" gives the connotation that pandas is doing something wrong, when we're really just trying to "accommodate" numpy's buggy API.

Member

gfyoung commented Mar 21, 2016

Well, now pandas is not alone. numpy's close cousin scipy has these exact same compatibility issues too, as I reported there just now.

Member

gfyoung commented Mar 22, 2016

My numpy PR has been merged. So now (hopefully) we can just worry about backwards compatibility.

Contributor

jreback commented Mar 22, 2016

Maybe add the order arg from np.argsort as well (IOW we could remove it from pandas).

Contributor

jreback commented Mar 25, 2016

We could probably start by seeing which functions call _wrap_it on the numpy side.

Member

gfyoung commented Apr 6, 2016

A massive PR (#12810) addressing this issue is finally up. There were a lot more incompatibilities than I had expected. Hopefully this PR addresses most, if not all, of them.

@jreback jreback modified the milestones: 0.18.2, 0.18.1 Apr 27, 2016

@jreback jreback modified the milestones: 0.18.1, 0.18.2 Apr 30, 2016

gfyoung added a commit to gfyoung/pandas that referenced this issue May 1, 2016

COMPAT: Expand compatibility with fromnumeric.py
Augment pandas array-like methods with appropriate parameters
(generally, '*args' and '**kwargs') so that they can be called
via analogous functions in the numpy library that are defined in
'fromnumeric.py'.

Closes gh-12638.
Closes gh-12644.
Closes gh-12687.

@jreback jreback closed this in 23eb483 May 1, 2016

nps added a commit to nps/pandas that referenced this issue May 17, 2016

COMPAT: Expand compatibility with fromnumeric.py
Closes pandas-dev#12638
Closes pandas-dev#12644
Closes pandas-dev#12687

Author: gfyoung <gfyoung17@gmail.com>

Closes pandas-dev#12810 from gfyoung/fromnumeric-compat and squashes the following commits:

429bc51 [gfyoung] COMPAT: Expand compatibility with fromnumeric.py