BUG: DataFrame created with tzinfo cannot use to_dict(orient="records") #18372

Closed
bolkedebruin opened this Issue Nov 19, 2017 · 7 comments

@bolkedebruin
Contributor

bolkedebruin commented Nov 19, 2017

Code Sample, a copy-pastable example if possible

import psycopg2
import datetime
import pandas as pd

data = [(datetime.datetime(2017, 11, 18, 21, 53, 0, 219225, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=60, name=None)),), (datetime.datetime(2017, 11, 18, 22, 6, 30, 61810, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=60, name=None)),)]

df = pd.DataFrame(data)

df.to_dict(orient='records') # fails

Problem description

The above code fails in pandas 0.21.0 with the following traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-30-34cf441e3f50> in <module>()
----> 1 df.to_dict(orient='records')

/Users/bolke/Documents/dev/airflow_env/lib/python2.7/site-packages/pandas-0.21.0-py2.7-macosx-10.12-x86_64.egg/pandas/core/frame.pyc in to_dict(self, orient)
    897             return [dict((k, _maybe_box_datetimelike(v))
    898                          for k, v in zip(self.columns, row))
--> 899                     for row in self.values]
    900         elif orient.lower().startswith('i'):
    901             return dict((k, v.to_dict()) for k, v in self.iterrows())

TypeError: izip argument #2 must support iteration

Expected Output

A list of dicts, one per record.
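
For illustration, the expected result has roughly the shape sketched below: a list with one dict per row, keyed by column label. This is a standard-library sketch only. It assumes Python 3 and substitutes `datetime.timezone` for `psycopg2.tz.FixedOffsetTimezone` (both describing a fixed +60 minute offset), and pandas itself would return `Timestamp` values rather than plain `datetime` objects.

```python
import datetime

# Assumption: a fixed +60 minute offset, standing in for
# psycopg2.tz.FixedOffsetTimezone(offset=60, name=None).
tz = datetime.timezone(datetime.timedelta(minutes=60))

data = [
    (datetime.datetime(2017, 11, 18, 21, 53, 0, 219225, tzinfo=tz),),
    (datetime.datetime(2017, 11, 18, 22, 6, 30, 61810, tzinfo=tz),),
]

# A DataFrame built from unnamed tuples gets integer column labels, here [0].
columns = [0]

# One dict per row, mapping column label to cell value.
expected = [dict(zip(columns, row)) for row in data]
```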

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Darwin
OS-release: 17.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.21.0
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.13.3
scipy: None
pyarrow: None
xarray: None
IPython: 5.5.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0b10
sqlalchemy: 1.1.4
pymysql: None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Contributor

jreback commented Nov 19, 2017

Yeah, this looks buggy. A PR to fix it would be great!

Contributor

bolkedebruin commented Nov 19, 2017

If you can give me some hints on where to look, that would be appreciated. I ‘hacked’ the code in core/frame.py to check for an iterable, but I don’t think that is really the fix (or is it?).

Contributor

jreback commented Nov 19, 2017

See also https://github.com/pandas-dev/pandas/pull/18167/files

diff --git a/pandas/core/frame.py b/pandas/core/frame.py
index 7145fa7..02a1cf4 100644
--- a/pandas/core/frame.py
+++ b/pandas/core/frame.py
@@ -994,7 +994,7 @@ class DataFrame(NDFrame):
         elif orient.lower().startswith('r'):
             return [into_c((k, _maybe_box_datetimelike(v))
                            for k, v in zip(self.columns, row))
-                    for row in self.values]
+                    for row in np.atleast_2d(self.values)]
         elif orient.lower().startswith('i'):
             return into_c((k, v.to_dict(into)) for k, v in self.iterrows())
         else:

The issue is that a DataFrame with a single datetimetz column returns a 1-d array (rather than a 2-d array) from .values. This is a more general issue, but the change above will solve it here.
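
A small sketch of the failure mode described above, using plain Python lists in place of `.values` so that it runs without pandas. `rows_to_records` is a hypothetical stand-in for the `orient='records'` branch, not the pandas code itself.

```python
def rows_to_records(values, columns):
    # Hypothetical stand-in for the orient='records' branch:
    # pair each column label with the matching cell of each row.
    return [dict(zip(columns, row)) for row in values]


columns = ["ts"]

# 2-d case: every row is itself a sequence, so zip() can iterate it.
two_d = [[1], [2]]
records = rows_to_records(two_d, columns)

# 1-d case: every "row" is a bare scalar, so zip() raises TypeError,
# which mirrors the "must support iteration" error in the traceback.
one_d = [1, 2]
try:
    rows_to_records(one_d, columns)
    error = None
except TypeError as exc:
    error = exc
```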

Contributor

jreback commented Nov 19, 2017

xref #13407

Contributor

bolkedebruin commented Nov 19, 2017

Thx! I’ll have a look (to add tests) - or are you applying this patch yourself?

Contributor

jreback commented Nov 19, 2017

Nope, would love a PR (with a test and a whatsnew note).

Contributor

bolkedebruin commented Nov 20, 2017

Will do. I will need to change your code slightly as it is not the proper fix, but it put me on the right path.
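
The thread does not spell out why the suggested patch is "not the proper fix". One plausible reading (an assumption, not a statement of what the final PR does) is that `np.atleast_2d` turns a 1-d array into a single row rather than a single column, which NumPy alone can show:

```python
import numpy as np

# A 1-d array standing in for the .values of a one-column frame with 3 rows.
one_d = np.array([10, 20, 30])

# np.atleast_2d prepends an axis: one row holding all three values.
as_row = np.atleast_2d(one_d)

# Reshaping to (-1, 1) gives three rows with one value each,
# which is the layout a row-wise loop over a one-column frame needs.
as_col = one_d.reshape(-1, 1)

rows_from_row = [list(r) for r in as_row]
rows_from_col = [list(r) for r in as_col]
```

Iterating `as_row` yields a single "row" containing every value, whereas iterating `as_col` yields one single-element row per record.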

bolkedebruin added a commit to bolkedebruin/incubator-superset that referenced this issue Nov 20, 2017

Workaround pandas bug in datetimes with time zones
A bug in to_dict(orient="records") in pandas/core/frame.py prevents
datetimes with time zones to be worked with. This works around the
issue in superset by re-implementing the logic of pandas in the
correct way. Until pandas fixes the issue this code should stay.

pandas-dev/pandas#18372

This closes apache#1929

mistercrunch added a commit to apache/incubator-superset that referenced this issue Nov 20, 2017

Workaround pandas bug in datetimes with time zones (#3910)
A bug in to_dict(orient="records") in pandas/core/frame.py prevents
datetimes with time zones to be worked with. This works around the
issue in superset by re-implementing the logic of pandas in the
correct way. Until pandas fixes the issue this code should stay.

pandas-dev/pandas#18372

This closes #1929

bolkedebruin added a commit to bolkedebruin/pandas that referenced this issue Nov 21, 2017

[BUG-FIX] DataFrame created with tzinfo cannot use to_dict(orient="records")

Columns with datetimez are not returning arrays.

Closes pandas-dev#18372

bolkedebruin added a commit to bolkedebruin/pandas that referenced this issue Nov 22, 2017

[BUG-FIX] DataFrame created with tzinfo cannot use to_dict(orient="records")

Columns with datetimez are not returning arrays.

Closes pandas-dev#18372

@jreback jreback modified the milestones: Next Major Release, 0.22.0, 0.21.1 Nov 22, 2017

bolkedebruin added a commit to bolkedebruin/pandas that referenced this issue Nov 23, 2017

[BUG-FIX] DataFrame created with tzinfo cannot use to_dict(orient="records")

Columns with datetimez are not returning arrays.

Closes pandas-dev#18372

jreback added a commit that referenced this issue Nov 23, 2017

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Dec 8, 2017

TomAugspurger added a commit that referenced this issue Dec 11, 2017

dimaslv added a commit to dimaslv/superset that referenced this issue Feb 2, 2018

Workaround pandas bug in datetimes with time zones (apache#3910)
A bug in to_dict(orient="records") in pandas/core/frame.py prevents
datetimes with time zones to be worked with. This works around the
issue in superset by re-implementing the logic of pandas in the
correct way. Until pandas fixes the issue this code should stay.

pandas-dev/pandas#18372

This closes apache#1929

michellethomas added a commit to michellethomas/panoramix that referenced this issue May 24, 2018

Workaround pandas bug in datetimes with time zones (apache#3910)
A bug in to_dict(orient="records") in pandas/core/frame.py prevents
datetimes with time zones to be worked with. This works around the
issue in superset by re-implementing the logic of pandas in the
correct way. Until pandas fixes the issue this code should stay.

pandas-dev/pandas#18372

This closes apache#1929