Pandas: sort(columns=) is deprecated, use sort_values(by=) #1321

Merged
merged 3 commits into from Nov 27, 2015

4 participants
@pratapvardhan
Contributor

pratapvardhan commented Nov 23, 2015

Pandas 0.17.0 emits a warning when sort is used:

FutureWarning: sort(columns=....) is deprecated, use sort_values(by=.....)

Probably move to sort_values, as suggested?
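For reference, a minimal sketch of the call change the warning asks for. The DataFrame and its column names are hypothetical and not taken from the blaze code base; the sketch assumes pandas 0.17.0 or later, where the `by` keyword replaces `columns`.

```python
import pandas as pd

# Hypothetical example data, only for illustration.
df = pd.DataFrame({"name": ["b", "a", "c"], "amount": [20, 10, 30]})

# Old spelling (deprecated in pandas 0.17.0):
#     df.sort(columns="amount")
# New spelling suggested by the warning:
result = df.sort_values(by="amount")
amounts = result["amount"].tolist()
```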

@cpcloud cpcloud added this to the 0.9.0 milestone Nov 24, 2015

@cpcloud

Member

cpcloud commented Nov 24, 2015

We want to maintain compatibility with a few versions of pandas so we can't change to sort_values just yet.

@cpcloud

Member

cpcloud commented Nov 24, 2015

If you want to wrap up sort in an internal function that would be okay:

def sort(df, *args, **kwargs):
    try:
        return df.sort_values(*args, **kwargs)
    except AttributeError:
        return df.sort(*args, **kwargs)

Could also do

def sort(df, *args, **kwargs):
    return getattr(df, 'sort_values', df.sort)(*args, **kwargs)
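
One caveat with the one-liner: the default passed to getattr, df.sort, is evaluated eagerly, so the call would raise AttributeError on a later pandas release that no longer provides DataFrame.sort. A sketch of a lazy variant follows; the helper name sort_compat is hypothetical.

```python
import pandas as pd


def sort_compat(df, *args, **kwargs):
    """Sort with sort_values when available, otherwise fall back to sort."""
    method = getattr(df, "sort_values", None)
    if method is None:
        # Only touch df.sort when sort_values is missing (pandas < 0.17.0).
        method = df.sort
    return method(*args, **kwargs)


df = pd.DataFrame({"x": [3, 1, 2]})
sorted_x = sort_compat(df, "x")["x"].tolist()
```

The sort key is passed positionally here because the keyword name differs between the two methods.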
@jreback

Contributor

jreback commented Nov 24, 2015

@cpcloud it might be safer to do actual version detection (as you may need to handle the by keyword)

e.g.

if LooseVersion(pd.__version__) < '0.17.0':
    return df.sort(*args, **kwargs)
return df.sort_values(...)
@llllllllll

Member

llllllllll commented Nov 24, 2015

Ideally we would do this check only once, at module scope, and then conditionally define a sort function.

@cpcloud

Member

cpcloud commented Nov 24, 2015

@llllllllll that seems like a good idea. @pratapvardhan, up for implementing @llllllllll's suggestion?

@pratapvardhan pratapvardhan force-pushed the pratapvardhan:pd branch from 6545efc to cc7b9aa Nov 25, 2015

@pratapvardhan

Contributor

pratapvardhan commented Nov 25, 2015

@cpcloud, @llllllllll: does the new internal pandas_sort function seem right?

@pratapvardhan pratapvardhan force-pushed the pratapvardhan:pd branch from 5823d51 to cc7b9aa Nov 25, 2015

if LooseVersion(pd.__version__) < '0.17.0':
    pandas_sort = _pandas_sort_old
else:
    pandas_sort = _pandas_sort_new

@cpcloud

cpcloud Nov 25, 2015

Member

you could probably do this with less code:

pdsort = getattr(
    pd.DataFrame, 
    'sort' if LooseVersion(pd.__version__) < '0.17.0' else 'sort_values'
)
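
Because getattr is applied to the pd.DataFrame class rather than to an instance, the result is a plain function that takes the frame as its first argument. A sketch of how such a pdsort would be called follows; note that it substitutes a hasattr check for the LooseVersion comparison (an assumption of this sketch, so that it runs without distutils), and the example data is hypothetical.

```python
import pandas as pd

# Feature detection stands in for the version comparison here; this is an
# assumption of the sketch, not what the review comment proposes.
pdsort = getattr(
    pd.DataFrame,
    "sort_values" if hasattr(pd.DataFrame, "sort_values") else "sort",
)

df = pd.DataFrame({"a": [2, 1], "b": ["x", "y"]})
# The frame is passed explicitly; the sort key goes in positionally because
# the keyword differs between the two methods (`by` vs `columns`).
ordered = pdsort(df, "a")
letters = ordered["b"].tolist()
```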

@cpcloud

cpcloud Nov 25, 2015

Member

You should define this in compute/pandas.py since it's only relevant there; the compatibility module is for language compatibility.

@pratapvardhan pratapvardhan force-pushed the pratapvardhan:pd branch from 5e87a45 to 53ce2ff Nov 26, 2015

@pratapvardhan

Contributor

pratapvardhan commented Nov 26, 2015

@cpcloud -- thanks for the tip, pushed the updated commit.

@cpcloud

Member

cpcloud commented Nov 27, 2015

@pratapvardhan nice! thanks for the update ... merging!

cpcloud added a commit that referenced this pull request Nov 27, 2015

Merge pull request #1321 from pratapvardhan/pd
Pandas: sort(columns=) is deprecated, use sort_values(by=)

@cpcloud cpcloud merged commit 0b38a21 into blaze:master Nov 27, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
@pratapvardhan

Contributor

pratapvardhan commented Nov 28, 2015

Thanks @cpcloud =)
