
ENH: Better read_json error when handling bad keys #4838


Merged


jtratner
Contributor

Raises a clearer error from read_json when JSON data with orient='split' contains an unexpected key.

Fixes #4730.

@jtratner
Contributor Author

@Komnomnomnom does this look okay to you? It just changes the exception to be slightly clearer.

@jtratner
Contributor Author

E.g.

import pandas as pd
from pandas.compat import StringIO
import json
d = {u'columns': [0],
 u'data': [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9]],
 u'index': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 u'badkey': [5]}
s = StringIO(json.dumps(d))
pd.read_json(s, orient='split')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/io/json.py", line 182, in read_json
    date_unit).parse()
  File "pandas/io/json.py", line 242, in parse
    self._parse_no_numpy()
  File "pandas/io/json.py", line 478, in _parse_no_numpy
    self.obj = _run_with_kwarg_exception(DataFrame, decoded, dtype=None)
  File "pandas/io/json.py", line 391, in _run_with_kwarg_exception
    raise ValueError(msg)
ValueError: JSON data had unexpected key: 'badkey'

@jtratner
Contributor Author

and I think it's clear enough in Python 3 too:

Traceback (most recent call last):
  File "./pandas/io/json.py", line 382, in _run_with_kwarg_exception
    return func(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'badkey'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "./pandas/io/json.py", line 182, in read_json
    date_unit).parse()
  File "./pandas/io/json.py", line 242, in parse
    self._parse_no_numpy()
  File "./pandas/io/json.py", line 478, in _parse_no_numpy
    self.obj = _run_with_kwarg_exception(DataFrame, decoded, dtype=None)
  File "./pandas/io/json.py", line 391, in _run_with_kwarg_exception
    raise ValueError(msg)
ValueError: JSON data had unexpected key: 'badkey'
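
For readers following the tracebacks, here is a rough sketch of the kind of wrapper they imply: call the constructor with the decoded keys as keyword arguments, and translate an "unexpected keyword argument" TypeError into a ValueError. Only the helper's name and the final message come from the traceback; the body, the regular expression, and the `frame_like` stand-in for DataFrame are assumptions made to keep the example stdlib-only.

```python
import re

# Hypothetical reconstruction: the pattern below is an assumption about how the
# offending key could be pulled out of the TypeError text.
_UNEXPECTED_KWARG = re.compile(r"unexpected keyword argument '(?P<key>[^']+)'")


def run_with_kwarg_exception(func, kwargs, **extra):
    """Call func with kwargs plus extra; turn a bad-keyword TypeError into ValueError."""
    try:
        return func(**dict(kwargs, **extra))
    except TypeError as err:
        match = _UNEXPECTED_KWARG.search(str(err))
        if match is None:
            raise  # some other TypeError: leave it untouched
        raise ValueError("JSON data had unexpected key: %r" % match.group("key"))


def frame_like(data=None, index=None, columns=None, dtype=None):
    """Stand-in for the DataFrame constructor (assumption, not pandas code)."""
    return {"data": data, "index": index, "columns": columns}
```

Because the ValueError is raised inside the `except` block, Python 3 chains it to the original TypeError, which is why both exceptions appear in the output above.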

@Komnomnomnom
Contributor

I agree that the exception is much clearer. Rather than relying on the text of the message in the raised TypeError, which seems kind of fragile, would it be better to instead check that the decoded keys are only those encoded by split and raise a ValueError directly if not?

columns, index, data for DataFrame
name, index, data for Series
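
A minimal sketch of that suggestion, assuming the allowed keys are exactly the two sets listed above. The names `SPLIT_KEYS` and `check_split_keys` and the message wording are hypothetical, not taken from the pandas source.

```python
import json

# Assumed key sets, copied from the list above; the names are hypothetical.
SPLIT_KEYS = {
    "frame": frozenset(["columns", "index", "data"]),
    "series": frozenset(["name", "index", "data"]),
}


def check_split_keys(decoded, typ="frame"):
    """Raise ValueError if split-oriented JSON has keys outside the allowed set."""
    bad_keys = set(decoded) - SPLIT_KEYS[typ]
    if bad_keys:
        raise ValueError(
            "JSON data had unexpected key(s): %s" % ", ".join(sorted(bad_keys))
        )
    return decoded


decoded = json.loads('{"columns": [0], "data": [[0], [1]], "index": [0, 1]}')
check_split_keys(decoded)  # passes silently; a "badkey" entry would raise
```

Checking the keys up front avoids parsing the text of a TypeError, so the check does not depend on how a given Python version words that message.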

@jtratner
Contributor Author

@Komnomnomnom thanks for the cleaner matchgroup usage. I did it this way only to make it easier to maintain, but I guess we can just define the JSON keys once, so it's no big problem.

@jtratner
Contributor Author

@Komnomnomnom do any other elements need guards for bad keys? Otherwise, this is probably good.

@Komnomnomnom
Contributor

Nope, split should be the only orient using named keys. Thanks @jtratner!
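
To illustrate why only split needs the guard, here is a hand-built sketch of the JSON layout each orient uses for the same small table. The payloads are constructed with plain Python rather than generated by pandas, and the sample labels and values are made up. In every orient other than split, the top-level keys (if any) are the data's own row or column labels, so there is no fixed set of names to validate.

```python
import json

# Made-up sample table: two rows, two columns.
index = ["r0", "r1"]
columns = ["a", "b"]
data = [[1, 2], [3, 4]]

layouts = {
    # Fixed, named keys: the only layout where a stray key can appear.
    "split": {"columns": columns, "index": index, "data": data},
    # List of {column: value} records.
    "records": [dict(zip(columns, row)) for row in data],
    # {row label: {column: value}}
    "index": {i: dict(zip(columns, row)) for i, row in zip(index, data)},
    # {column: {row label: value}}
    "columns": {
        c: {i: row[j] for i, row in zip(index, data)}
        for j, c in enumerate(columns)
    },
    # Bare nested list of values.
    "values": data,
}

for orient, payload in layouts.items():
    print(orient, json.dumps(payload))
```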

(in json data with orient=split)
jtratner added a commit that referenced this pull request Sep 17, 2013
…json-key

ENH: Better read_json error when handling bad keys
@jtratner jtratner merged commit ec28c9d into pandas-dev:master Sep 17, 2013
@jtratner jtratner deleted the better-error-message-with-bad-json-key branch September 17, 2013 04:40

Successfully merging this pull request may close these issues.

Better error message when handling bad keys in json data