Latest numpy and pandas #1339

Merged
merged 51 commits into from Sep 21, 2016

ssanderson commented Jul 21, 2016

Bump us up to the latest major versions of pandas and numpy.

Notable breakages in the latest:

  • Pandas:
    • rolling, expanding, resample, and ewm* all changed to behave more like groupby. This PR adds backwards-compat shims to support both the new and old syntax. This change is the only one that required a material amount of work to preserve compat with pandas 0.17.
    • .loc with an integer argument on an index of Asset objects no longer works in pandas 0.18. This is probably the change I'm most worried about from a user breakage perspective.
    • Timezone information is now preserved in Series and DataFrame columns. This means that some fields that were previously tz-naive may now be tz-aware, leading to breakages. This is the second most worrisome change for user breakage.
    • Group-label conventions for DataFrame/Series.nth() changed in pandas 0.18. The only affected usage has been re-written in a way that's 2x faster, so not much cost here. See Also: pandas-dev/pandas#13666.
    • Passing null-ish values to pd.Categorical is deprecated in pandas 0.18. In particular, the default missing value of None for string-dtype Pipeline columns is no longer appropriate if we want to avoid pandas warnings and/or future breakages. This PR currently deprecates support for custom string-dtype missing values, and makes string-dtyped categorical output use the pandas-recommended missing value of NaN. A future change should likely remove support for custom missing values entirely in favor of using categoricals with NaN missing values for both strings and ints. This is the code change I'm most conflicted about in this PR. I think a better change might be to just silence the warning for now, and remove support for custom missing values in one consistent change. As-is, the semantics for column missing values are inconsistent between strings and every other dtype. @llllllllll I'd be interested in your thoughts on this. See Also: pandas-dev/pandas#13648
    • DataFrame.sort was deprecated in favor of sort_values. This is trivial to fix.
    • DataFrame.convert_objects was deprecated in favor of type-specific functions. We only had one, unnecessary invocation of convert_objects.
    • DataFrame deprecated indexing with a float on .iloc. We only did this in one place, and it was almost certainly a bug.
  • Numpy:
    • np.full started warning that passing an integer value would produce an integer array in the future (it currently produces a float array). This PR fixes most of those warnings by passing explicit float values, or passing an explicit dtype. This is probably the largest change in LoC, but there's no cost to users.
    • Comparing NaT with itself started warning that NaT != NaT will be true in the future. An isnat function has been added to numpy_utils, and it is now used everywhere we previously checked for NaT. See Also: numpy/numpy#5610.
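To illustrate the pandas items above, here is a minimal sketch of what a backwards-compat shim for the rolling API could look like. `rolling_mean_compat` is a hypothetical helper written for this example and is not the shim this PR adds; it assumes that the presence of a `.rolling` method is a good enough signal for the newer, groupby-like API. The last lines show the `sort` to `sort_values` rename.

```python
import pandas as pd


def rolling_mean_compat(series, window):
    """Rolling mean that works with either pandas rolling API (hypothetical helper)."""
    if hasattr(series, "rolling"):
        # pandas >= 0.18: groupby-like, method-chained syntax.
        return series.rolling(window).mean()
    # pandas < 0.18: module-level function (only reached on old pandas).
    return pd.rolling_mean(series, window)


prices = pd.Series([1.0, 2.0, 3.0, 4.0])
smoothed = rolling_mean_compat(prices, window=2)

# DataFrame.sort was deprecated in favor of sort_values.
frame = pd.DataFrame({"a": [3, 1, 2]})
ordered = frame.sort_values("a")
```

Feature detection with `hasattr` keeps call sites free of version-string parsing, at the cost of one small wrapper per rolling operation.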
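The two numpy items can be sketched as follows. `isnat_sketch` is an illustrative stand-in, not the actual `isnat` added to numpy_utils; it assumes NaT is stored as the minimum int64 value and compares raw integers so that no `==` or `!=` is ever evaluated on NaT itself. The `np.full` calls show the two ways of making the float dtype explicit.

```python
import numpy as np

# int64 bit pattern that numpy uses to represent NaT.
NAT_AS_INT64 = np.iinfo(np.int64).min


def isnat_sketch(values):
    """Return a boolean mask that is True where `values` is NaT."""
    values = np.asarray(values)
    if values.dtype.kind not in "Mm":
        raise TypeError(
            "expected datetime64 or timedelta64, got %s" % values.dtype
        )
    # Compare the underlying integers instead of comparing NaT directly.
    return values.view("int64") == NAT_AS_INT64


dates = np.array(["2016-07-21", "NaT"], dtype="datetime64[ns]")
nat_mask = isnat_sketch(dates)

# Make the float dtype explicit rather than relying on how np.full
# treats an integer fill value.
zeros_from_float = np.full(3, 0.0)
zeros_from_dtype = np.full(3, 0, dtype="float64")
```

Later numpy releases ship a built-in `np.isnat`, which would make a helper like this unnecessary on those versions.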

@ssanderson ssanderson force-pushed the latest-numpy-pandas branch 4 times, most recently from 2308b71 to 25efea2 Jul 21, 2016

@ssanderson ssanderson force-pushed the latest-numpy-pandas branch from f3b15a5 to e8fc7ac Jul 29, 2016

coveralls commented Jul 29, 2016

Coverage Status

Coverage decreased (-0.08%) to 85.012% when pulling e8fc7ac on latest-numpy-pandas into a937d6e on master.

@ssanderson ssanderson force-pushed the latest-numpy-pandas branch from e8fc7ac to 245fae2 Aug 1, 2016

coveralls commented Aug 1, 2016

Coverage Status

Coverage decreased (-0.05%) to 85.477% when pulling 245fae2 on latest-numpy-pandas into 9103516 on master.

coveralls commented Aug 1, 2016

Coverage Status

Coverage increased (+0.1%) to 85.66% when pulling 03efb1c on latest-numpy-pandas into 9103516 on master.

coveralls commented Aug 1, 2016

Coverage Status

Coverage decreased (-0.02%) to 85.508% when pulling 03efb1c on latest-numpy-pandas into 9103516 on master.

@ssanderson ssanderson force-pushed the latest-numpy-pandas branch 2 times, most recently from 6542b0d to 2a2e92c Aug 2, 2016

coveralls commented Aug 2, 2016

Coverage Status

Coverage decreased (-0.02%) to 85.524% when pulling 2a2e92c on latest-numpy-pandas into 129d16f on master.

@ssanderson ssanderson force-pushed the latest-numpy-pandas branch from 9be8d4e to 30ff125 Aug 2, 2016

coveralls commented Aug 2, 2016

Coverage Status

Coverage decreased (-0.02%) to 85.494% when pulling 30ff125 on latest-numpy-pandas into f244dea on master.

@ssanderson ssanderson force-pushed the latest-numpy-pandas branch 4 times, most recently from bdd6c8c to 829284b Aug 2, 2016

coveralls commented Aug 8, 2016

Coverage Status

Coverage decreased (-0.02%) to 85.519% when pulling 829284b on latest-numpy-pandas into a260fb1 on master.

@ssanderson ssanderson force-pushed the latest-numpy-pandas branch from b21699e to 1d6ec4a Aug 9, 2016

coveralls commented Aug 9, 2016

Coverage Status

Coverage decreased (-0.02%) to 85.519% when pulling 1d6ec4a on latest-numpy-pandas into 24f2ef8 on master.

@ssanderson ssanderson force-pushed the latest-numpy-pandas branch from 1d6ec4a to 15d7105 Aug 16, 2016

coveralls commented Aug 16, 2016

Coverage Status

Coverage decreased (-0.02%) to 85.698% when pulling 15d7105 on latest-numpy-pandas into 4642fd2 on master.

coveralls commented Aug 16, 2016

Coverage Status

Coverage decreased (-0.02%) to 85.698% when pulling df0e748 on latest-numpy-pandas into 4642fd2 on master.

@ssanderson ssanderson force-pushed the latest-numpy-pandas branch from ed367f6 to 1bb6f7a Aug 18, 2016

@ssanderson ssanderson force-pushed the latest-numpy-pandas branch 3 times, most recently from d2ce509 to 9770f1c Aug 29, 2016

@ssanderson ssanderson force-pushed the latest-numpy-pandas branch from f5f3384 to 15b5cbf Sep 20, 2016

coveralls commented Sep 20, 2016

Coverage Status

Coverage decreased (-0.08%) to 86.589% when pulling c23dd5b on latest-numpy-pandas into 3fff659 on master.

@ssanderson ssanderson merged commit 7441369 into master Sep 21, 2016

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed

@ssanderson ssanderson deleted the latest-numpy-pandas branch Sep 21, 2016
