Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: fix tzaware dataframe transpose bug #26825

Merged
merged 23 commits into from Jun 27, 2019

Conversation

@jbrockmendel
Copy link
Member

commented Jun 13, 2019

This allows us to get rid of a bunch of xfails and FIXMEs in arithmetic tests. Changes in groupby are the most likely to need attention; this is not an area of the code I know well.

Not sure if TestTranspose belongs somewhere else. Suggestions welcome.

Split a couple of too-widely-scoped groupby tests.

Will follow-up with issues this closes, whatsnew entry, and GH references added to tests.

xref #23988

  • closes #13287
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry
@codecov

This comment has been minimized.

Copy link

commented Jun 13, 2019

Codecov Report

Merging #26825 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #26825      +/-   ##
==========================================
- Coverage   92.01%      92%   -0.01%     
==========================================
  Files         180      180              
  Lines       50754    50766      +12     
==========================================
+ Hits        46699    46708       +9     
- Misses       4055     4058       +3
Flag Coverage Δ
#multiple 90.64% <100%> (ø) ⬆️
#single 41.84% <35%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/core/groupby/generic.py 89.37% <100%> (+0.04%) ⬆️
pandas/core/internals/construction.py 96.04% <100%> (+0.08%) ⬆️
pandas/io/gbq.py 88.88% <0%> (-11.12%) ⬇️
pandas/core/frame.py 96.89% <0%> (-0.12%) ⬇️
pandas/core/sorting.py 98.35% <0%> (ø) ⬆️
pandas/compat/_optional.py 100% <0%> (ø) ⬆️
pandas/core/dtypes/cast.py 90.89% <0%> (+0.16%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a7f1d69...8b2372e. Read the comment docs.

pandas/core/groupby/generic.py Show resolved Hide resolved
@@ -160,7 +160,29 @@ def init_ndarray(values, index, columns, dtype=None, copy=False):
# on the entire block; this is to convert if we have datetimelike's
# embedded in an object type
if dtype is None and is_object_dtype(values):
values = maybe_infer_to_datetimelike(values)

if values.ndim == 2 and values.shape[0] != 1:

This comment has been minimized.

Copy link
@jreback

jreback Jun 13, 2019

Contributor

this is much more messy, can we change something else to make this nicer?

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Jun 13, 2019

Author Member

Not really. I'm looking into the other places where maybe_infer_to_datetimelike is used to see if some of this can go into that. We could separate this whole block into a dedicated function. But one way or another we need to bite the bullet.

This comment has been minimized.

Copy link
@jreback

jreback Jun 13, 2019

Contributor

so the inside of list loop should be in pandas.core.dtypes.cast, no? (obviously up until you make the blocks themselves)

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Jun 13, 2019

Author Member

I'd like to leave this for the next pass when I'm taking a more systematic look at maybe_infer_to_datetimelike

@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Jun 13, 2019

Updated with requested edit, GH references in tests, and whatsnew.

@@ -1664,3 +1655,36 @@ def _normalize_keyword_aggregation(kwargs):
order.append((column,
com.get_callable_name(aggfunc) or aggfunc))
return aggspec, columns, order


def _recast_datetimelike_result(result):

This comment has been minimized.

Copy link
@jreback

jreback Jun 13, 2019

Contributor

so i would put this in pandas.core.dtypes.cast, our dumping ground for casting, right after / before maybe_infer_to-datetimelike (and you can de-privatize)

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Jun 13, 2019

Author Member

This is really kludgy code (and is replacing equally kludgy code; kind of like turtles all the way down). I'd rather keep it close to its only usage and hope it is ripped out by someone who knows this part of the code better

This comment has been minimized.

Copy link
@jreback

jreback Jun 13, 2019

Contributor

as i said this look a lot like maybe_convert_objects, try to use that here

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Jun 13, 2019

Author Member

Note also: we only have 5 tests that go through this path

pandas/core/groupby/generic.py Outdated Show resolved Hide resolved
Notes
-----
- Assumes Groupby._selected_obj has ndim==2 and at least one

This comment has been minimized.

Copy link
@jreback

jreback Jun 13, 2019

Contributor

this note doesn't seem relevant as you are passing in frame right?

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Jun 13, 2019

Author Member

That wasn't obvious to me bc were talking about the dimensions of two separate objects. Are they necessarily the same?

pandas/core/groupby/generic.py Outdated Show resolved Hide resolved
pandas/core/groupby/generic.py Outdated Show resolved Hide resolved
pandas/core/groupby/generic.py Outdated Show resolved Hide resolved
@@ -160,7 +160,29 @@ def init_ndarray(values, index, columns, dtype=None, copy=False):
# on the entire block; this is to convert if we have datetimelike's
# embedded in an object type
if dtype is None and is_object_dtype(values):
values = maybe_infer_to_datetimelike(values)

if values.ndim == 2 and values.shape[0] != 1:

This comment has been minimized.

Copy link
@jreback

jreback Jun 13, 2019

Contributor

so the inside of list loop should be in pandas.core.dtypes.cast, no? (obviously up until you make the blocks themselves)

jbrockmendel added some commits Jun 13, 2019


from pandas.core.internals.blocks import make_block

# TODO: What about re-joining object columns?

This comment has been minimized.

Copy link
@jreback

jreback Jun 17, 2019

Contributor

pls reuse the block creation routines below

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Jun 17, 2019

Author Member

attempts so far have broken everything. do you have a particular routine in mind?

This comment has been minimized.

Copy link
@jreback

jreback Jun 21, 2019

Contributor

what I mean is you can remove the create_block_manager_from_blocks and let it fall thru to 184 with I think a very small change, e.g.


if .....
     blocks = bvals
else:
     dvals = .......
      blocks = [dvals]

of course pls use a longer name than dvals

# unnecessary if we ever allow 2D DatetimeArray

dvals_list = [maybe_infer_to_datetimelike(values[n, :])
for n in range(len(values))]

This comment has been minimized.

Copy link
@jreback

jreback Jun 17, 2019

Contributor

make this a list comprehension & use enumerate if you must., this is very hard to read.

pandas/tests/arithmetic/test_datetime64.py Outdated Show resolved Hide resolved
@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Jun 20, 2019

@jreback anything else here actionable?

@jreback

This comment has been minimized.

Copy link
Contributor

commented Jun 20, 2019

haven’t had a chance to relook

pandas/core/groupby/generic.py Outdated Show resolved Hide resolved

from pandas.core.internals.blocks import make_block

# TODO: What about re-joining object columns?

This comment has been minimized.

Copy link
@jreback

jreback Jun 21, 2019

Contributor

what I mean is you can remove the create_block_manager_from_blocks and let it fall thru to 184 with I think a very small change, e.g.


if .....
     blocks = bvals
else:
     dvals = .......
      blocks = [dvals]

of course pls use a longer name than dvals

@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Jun 27, 2019

I think comments have been addressed.

@jreback jreback added this to the 0.25.0 milestone Jun 27, 2019

@jreback jreback merged commit 27f9d05 into pandas-dev:master Jun 27, 2019

12 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
pandas-dev.pandas Build #20190626.19 succeeded
Details
pandas-dev.pandas (Checks) Checks succeeded
Details
pandas-dev.pandas (Docs) Docs succeeded
Details
pandas-dev.pandas (Linux py35_compat) Linux py35_compat succeeded
Details
pandas-dev.pandas (Linux py36_locale_slow) Linux py36_locale_slow succeeded
Details
pandas-dev.pandas (Linux py36_locale_slow_old_np) Linux py36_locale_slow_old_np succeeded
Details
pandas-dev.pandas (Linux py37_locale) Linux py37_locale succeeded
Details
pandas-dev.pandas (Linux py37_np_dev) Linux py37_np_dev succeeded
Details
pandas-dev.pandas (Windows py36_np15) Windows py36_np15 succeeded
Details
pandas-dev.pandas (Windows py37_np141) Windows py37_np141 succeeded
Details
pandas-dev.pandas (macOS py35_macos) macOS py35_macos succeeded
Details

@jbrockmendel jbrockmendel deleted the jbrockmendel:from2d branch Jun 27, 2019

alimcmaster1 added a commit to alimcmaster1/pandas that referenced this pull request Jun 30, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.