json round trip exception #3867

Closed
hayd opened this Issue Jun 12, 2013 · 13 comments

Contributor

hayd commented Jun 12, 2013

This CSV (from the baseball database) reads fine into a DataFrame and converts fine to JSON, but reading that JSON back raises:

In [6]: df = pd.read_csv('https://raw.github.com/hayd/lahman2012/master/csvs/Teams.csv')

In [7]: s = df.to_json()

In [8]: pd.read_json(s)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-ebde42cd0695> in <module>()
----> 1 pd.read_json(s)

/Users/234BroadWalk/pandas/pandas/io/json.pyc in read_json(path_or_buf, orient, typ, dtype, numpy, parse_dates, keep_default_dates)
    158     obj = None
    159     if typ == 'frame':
--> 160         obj = FrameParser(json, orient, dtype, numpy, parse_dates, keep_default_dates).parse()
    161
    162     if typ == 'series' or obj is None:

/Users/234BroadWalk/pandas/pandas/io/json.pyc in parse(self)
    185
    186     def parse(self):
--> 187         self._parse()
    188         if self.obj is not None:
    189             self._convert_axes()

/Users/234BroadWalk/pandas/pandas/io/json.pyc in _parse(self)
    284             try:
    285                 if orient == "columns":
--> 286                     args = loads(json, dtype=dtype, numpy=True, labelled=True)
    287                     if args:
    288                         args = (args[0].T, args[2], args[1])

TypeError: long() argument must be a string or a number, not 'NoneType'

cc #3804

Contributor

jreback commented Jun 12, 2013

was a bug, but ran into another feature/bug

here's my new test:

df = pd.read_csv('https://raw.github.com/hayd/lahman2012/master/csvs/Teams.csv')
s = df.to_json()
result = pd.read_json(s)
result.index = result.index.astype(int)
result = result.reindex(columns=df.columns,index=df.index)
assert_frame_equal(result,df)

so, I am not sure json guarantees order?
and should I try to do automatic index conversion on other types (I am doing it on datetimes now)?

Contributor

hayd commented Jun 12, 2013

Guess it's not so surprising, python dictionaries don't... (I don't think?). Quite a big file to test against!

Not sure, what were you thinking?

Contributor

jreback commented Jun 12, 2013

I think @cpcloud had sort of the same problem in html; he added an infer_types kw. I am doing that for dates now. I mean it's not hard to do a soft conversion, i.e. no forcing.
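One way to read "soft conversion" is: try to turn the axis labels into a more specific type, and leave them untouched if any label does not fit. A minimal sketch of that idea in plain Python follows. `soft_convert_labels` is a hypothetical helper written for illustration, not the code that went into pandas.

```python
def soft_convert_labels(labels):
    """Return the labels as ints if every one converts losslessly, else unchanged.

    Hypothetical helper for illustration only; not part of pandas.
    """
    converted = []
    for label in labels:
        try:
            value = int(label)
        except (TypeError, ValueError):
            # At least one label is not integer-like: give up, keep the originals.
            return list(labels)
        # Reject lossy conversions such as "007" -> 7 or 1.5 -> 1.
        if str(value) != str(label):
            return list(labels)
        converted.append(value)
    return converted


numeric = soft_convert_labels(["0", "1", "2"])   # every label converts
mixed = soft_convert_labels(["0", "teamID"])     # one label does not, so nothing changes
```

The point of "no forcing" is the early return: a single label that does not fit leaves the whole axis as it was, rather than raising or producing a partially converted axis.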

Member

cpcloud commented Jun 12, 2013

do all valid json objects have a total ordering in python? if they do why not guarantee ordering, unless of course that goes against json spec...

python dicts don't because there are hashable objects that don't define an ordering, e.g. complex numbers, custom objects, among other reasons
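On the ordering question: the JSON specification describes an object as an unordered collection of name/value pairs, so key order is something a parser may preserve but a consumer cannot rely on. The sketch below uses only Python's standard `json` module (not the parser pandas uses) to show both views of the same two objects.

```python
import json
from collections import OrderedDict

text_a = '{"a": 1, "b": 2}'
text_b = '{"b": 2, "a": 1}'

# As plain dicts the two objects compare equal: key order is not part of equality.
same_as_dicts = json.loads(text_a) == json.loads(text_b)

# object_pairs_hook lets the caller keep the order in which keys appear in the text.
ordered_a = json.loads(text_a, object_pairs_hook=OrderedDict)
ordered_b = json.loads(text_b, object_pairs_hook=OrderedDict)
keys_a = list(ordered_a)
keys_b = list(ordered_b)

# OrderedDict equality is order-sensitive, so these two differ.
same_as_ordered = ordered_a == ordered_b
```

So a reader that wants a stable column order has to either record the order explicitly or reindex after parsing, as in the `reindex(columns=df.columns, index=df.index)` step of the test above.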

Contributor

hayd commented Jun 12, 2013

Hmmm, different bug?

In [5]: pd.read_json('[{"a": 1, "b": 2}, {"b":2, "a" :1}]')
Out[5]:
   0  1
a  1  2
b  2  1

Contributor

jreback commented Jun 12, 2013

which one is more useful to round-trip exactly?

biggie = DataFrame(np.zeros((200, 4)),
                   columns=[str(i) for i in range(4)],
                   index=[str(i) for i in range(200)])
biggie2 = DataFrame(np.zeros((200, 4)),
                    columns=range(4),
                    index=range(200))

Contributor

jreback commented Jun 12, 2013

@cpcloud any thoughts?

Member

cpcloud commented Jun 12, 2013

roundtrip doesn't look like it can be invertible...they both json'd the same because of json's rules about keys in objects (must be string).
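The string-key rule can be seen with Python's standard `json` module alone. This is only an illustration of the rule, not of the serializer pandas uses: integer keys are written as strings, so the encoded text no longer records which type the labels had.

```python
import json

int_keyed = {0: 0.0, 1: 0.0}
str_keyed = {"0": 0.0, "1": 0.0}

# Integer keys are coerced to strings on the way out,
# so both dicts produce the same JSON text.
encoded_int = json.dumps(int_keyed)
encoded_str = json.dumps(str_keyed)

# Decoding always yields string keys; the original key type is lost.
decoded_keys = list(json.loads(encoded_int))
```

Since two different inputs map to one output, no decoder can recover both without being told which one the caller wants.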

Contributor

jreback commented Jun 12, 2013

I am going to set up some options so the second will roundtrip,
hence convert_axes=True

while the 1st will work if you pass
convert_axes=False
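A small sketch of how the two cases might look from the caller's side, assuming the `convert_axes` keyword of `pd.read_json` named above. The 3x2 frames are stand-ins for `biggie` and `biggie2`, and wrapping the JSON text in `StringIO` is an assumption made here so the snippet also runs on newer pandas releases, which expect a file-like object; exact behavior may vary by pandas version.

```python
from io import StringIO

import numpy as np
import pandas as pd

# Small stand-ins for biggie2 (integer labels) and biggie (string labels).
int_labelled = pd.DataFrame(np.zeros((3, 2)), columns=range(2), index=range(3))
str_labelled = pd.DataFrame(np.zeros((3, 2)),
                            columns=[str(i) for i in range(2)],
                            index=[str(i) for i in range(3)])

# convert_axes=True asks read_json to try converting the axis labels,
# which is what lets integer labels come back as integers.
as_ints = pd.read_json(StringIO(int_labelled.to_json()), convert_axes=True)

# convert_axes=False leaves the labels as the strings found in the JSON text.
as_strs = pd.read_json(StringIO(str_labelled.to_json()), convert_axes=False)

int_index = sorted(as_ints.index.tolist())
str_index = sorted(as_strs.index.tolist())
```

The sketch only inspects the axis labels. Column dtypes can also change on the way back, since `read_json` applies its own dtype inference to the values.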

Member

cpcloud commented Jun 12, 2013

this might present a problem for nested json, no? that's a different beast though so for "frame/series-able" json that's probably ok

Contributor

jreback commented Jun 12, 2013

conversion is done at the end
so should work

Contributor

jreback commented Jun 13, 2013

fixed by #3876

Contributor

jreback commented Jun 13, 2013

closing this as incorporated in #3876

jreback closed this Jun 13, 2013
