Join GitHub today
GitHub is home to over 20 million developers working together to host and review code, manage projects, and build software together.
Added errors{'raise','ignore'} for keys not found in meta for json_normalize #14583
Conversation
jorisvandenbossche
referenced
this pull request
Nov 4, 2016
Closed
Avoids exception when pandas.io.json.json_normalize contains items in… #14505
jorisvandenbossche
added Error Reporting IO JSON
labels
Nov 4, 2016
codecov-io
commented
Nov 4, 2016
•
Current coverage is 85.26% (diff: 100%)@@ master #14583 diff @@
==========================================
Files 144 144
Lines 50980 50985 +5
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 43470 43474 +4
- Misses 7510 7511 +1
Partials 0 0
|
|
Cleaning it up, should be better now. Is there a new whatsnew file so I can add the comments? What's the next version number? |
|
Why is nobody merging this? |
| @@ -740,6 +742,8 @@ def json_normalize(data, record_path=None, meta=None, | ||
| If True, prefix records with dotted (?) path, e.g. foo.bar.field if | ||
| path to records is ['foo', 'bar'] | ||
| meta_prefix : string, default None | ||
| + error: {'raise', 'ignore'}, default 'raise' |
| @@ -80,3 +80,4 @@ Performance Improvements | ||
| Bug Fixes | ||
| ~~~~~~~~~ | ||
| +- Enhancement in ``pandas.io.json.json_normalize``If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14505`) |
| @@ -80,3 +80,4 @@ Performance Improvements | ||
| Bug Fixes | ||
| ~~~~~~~~~ | ||
| +- Enhancement in ``pandas.io.json.json_normalize``If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14505`) |
jreback
Nov 17, 2016
Contributor
space after quotes and before If. put errors='ignore' in double-backticks.
| + meta_val = np.nan | ||
| + else: | ||
| + raise KeyError( | ||
| + "Try running with errors='ignore' as the following key may not always be present: " + str(e)) |
jorisvandenbossche
Nov 18, 2016
Owner
We check for PEP8, so this line will be too long. Normally you can set your editor to check for those things, but can also see the output of the check on travis (3d build): https://travis-ci.org/pandas-dev/pandas/jobs/174180562 (at the bottom)
pandas/io/json.py:746:80: E501 line too long (85 > 79 characters)
pandas/io/json.py:853:80: E501 line too long (129 > 79 characters)
pandas/io/tests/json/test_json_norm.py:229:5: E303 too many blank lines (2)
pandas/io/tests/json/test_json_norm.py:271:80: E501 line too long (81 > 79 characters)
pandas/io/tests/json/test_json_norm.py:272:80: E501 line too long (104 > 79 characters)
pandas/io/tests/json/test_json_norm.py:273:17: E225 missing whitespace around operator
pandas/io/tests/json/test_json_norm.py:274:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:275:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:276:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:277:10: E128 continuation line under-indented for visual indent
| @@ -225,6 +225,59 @@ def test_nested_flattens(self): | ||
| self.assertEqual(result, expected) | ||
| + | ||
| + def test_json_normalise_fix(self): | ||
| + # issue 14505 |
jreback
Nov 17, 2016
Contributor
put a comment explaining what this is. rename test -> test_json_normalize_errors
| + 'price': {0: '0', 1: '0', 2: '0', 3: '0'}, | ||
| + 'symbol': {0: 'AAPL', 1: 'GOOG', 2: 'AAPL', 3: 'GOOG'}} | ||
| + | ||
| + self.assertEqual(j.fillna('').to_dict(), expected) |
|
@dickreuter we have 107 open PR's. very few people do code review. simply ping if you want someone to look. |
|
@dickreuter just so you know: we don't get notifications when you push new code, only from comments. So it's always good to put a comment after you pushed new code to indicated that you updated. And as @jreback says, pinging like you now did always helps |
|
Ok great. Please have a look. I think this can now be merged. |
|
@dickreuter Did you push the changes? As there is still a merge conflict and there is no new commit since @jreback 's comments |
|
Just pushed the changes as instructed. |
|
Anything else I need to fix in this or can this be merged? |
|
Just did another rebase and pushed again. Can this be merged now? thanks |
| @@ -49,6 +49,7 @@ Other enhancements | ||
| ^^^^^^^^^^^^^^^^^^ | ||
| - ``pd.read_excel`` now preserves sheet order when using ``sheetname=None`` (:issue:`9930`) | ||
| +- ``pandas.io.json.json_normalize`` If meta keys are not always present a new option to set errors="ignore" (:issue:`14583`) |
jreback
Dec 1, 2016
Contributor
In pandas.io.json.json_normalize gained the option errors='ignore'|raise (in double-backticks); the default is raise and is backward compatible.
| @@ -104,4 +105,4 @@ Performance Improvements | ||
| .. _whatsnew_0200.bug_fixes: | ||
| Bug Fixes | ||
| -~~~~~~~~~ |
| @@ -740,6 +742,8 @@ def json_normalize(data, record_path=None, meta=None, | ||
| If True, prefix records with dotted (?) path, e.g. foo.bar.field if | ||
| path to records is ['foo', 'bar'] | ||
| meta_prefix : string, default None | ||
| + error: {'raise', 'ignore'}, default 'raise' | ||
| + * ignore: will ignore keyErrors if keys listed in meta are not always present |
jreback
Dec 1, 2016
Contributor
will ignore KeyError, if the keys passed in meta are not all present
| + meta_val = np.nan | ||
| + else: | ||
| + raise \ | ||
| + KeyError( |
| @@ -225,6 +225,65 @@ def test_nested_flattens(self): | ||
| self.assertEqual(result, expected) | ||
| + | ||
| + def test_json_normalize_errors(self): | ||
| + # If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14583`) |
|
just some doc comments, otherwise lgtm. |
|
Implemented doc amendments as suggested. Let me know if anything else needs to be changed. thanks |
| @@ -740,6 +742,9 @@ def json_normalize(data, record_path=None, meta=None, | ||
| If True, prefix records with dotted (?) path, e.g. foo.bar.field if | ||
| path to records is ['foo', 'bar'] | ||
| meta_prefix : string, default None | ||
| + error: {'raise', 'ignore'}, default 'raise' | ||
| + * ignore: will ignore KeyError if keys listed in meta are not always present | ||
| + * raise: will raise KeyError if keys listed in meta are not always present |
jorisvandenbossche
Dec 2, 2016
Owner
I suspect those two lines above are too long (> 80, we check for PEP8, which will let travis fail)
| + if errors == 'ignore': | ||
| + meta_val = np.nan | ||
| + else: | ||
| + raise KeyError("Try running with errors='ignore' as key %s is not always present.", e) |
| @@ -225,6 +225,65 @@ def test_nested_flattens(self): | ||
| self.assertEqual(result, expected) | ||
| + | ||
| + def test_json_normalize_errors(self): | ||
| + # GH14583: If meta keys are not always present a new option to set errors='ignore' has been implemented |
|
@dickreuter You still have some style issues (that's the reason that travis is failing). You can always look in the log of the third travis build to see the issues (https://travis-ci.org/pandas-dev/pandas/jobs/182143127, scroll to the bottom):
(it also a good idea to set up your IDE to warn for this) |
jorisvandenbossche
added this to the
0.20.0
milestone
Dec 10, 2016
|
I believe this is fixed now. Some checks are still not successful but I can't find any reason why. Any suggestions why? All it says is: The command "ci/lint.sh" exited with 1. |
|
git diff master | flake8 -diff |
|
@dickreuter You can also always look at the third travis build (https://travis-ci.org/pandas-dev/pandas/jobs/183014087, scroll to bottom), which says "pandas/io/json.py:857:45: E211 whitespace before '('" |
|
Linting still seems to fail, even though when I do "git diff master | flake8 --diff" I only get problems in v0.20.0.txt, all the rest is fine. Any suggestions how I can find the remaining issues? |
|
@dickreuter It was not linting this time, the mac build failed. |
|
And it is related to my code? I can't find any details. |
|
@dickreuter you need to rebase on master
|
dickreuter
added some commits
Oct 26, 2016
|
Now all checked have passed. |
jreback
closed this
in 7d8bc0d
Dec 13, 2016
|
thanks! |
ischurov
added a commit
to ischurov/pandas
that referenced
this pull request
Dec 19, 2016
|
|
dickreuter + ischurov |
4d40d6c
|
dickreuter commentedNov 4, 2016
•
edited by jorisvandenbossche
Continuation of #14505
Added errors{'raise','ignore'} for keys not found in meta for json_normalize