Added errors{'raise','ignore'} for keys not found in meta for json_normalize #14583

dickreuter · 2016-11-04T01:56:51Z

Continuation of #14505

Added errors{'raise','ignore'} for keys not found in meta for json_normalize

codecov-io · 2016-11-04T17:41:55Z

Current coverage is 85.26% (diff: 100%)

Merging #14583 into master will decrease coverage by <.01%

@@             master     #14583   diff @@
==========================================
  Files           144        144          
  Lines         50980      50985     +5   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43470      43474     +4   
- Misses         7510       7511     +1   
  Partials          0          0

Powered by Codecov. Last update 14e4815...701c140

dickreuter · 2016-11-04T20:46:25Z

Cleaning it up, should be better now. Is there a new whatsnew file so I can add the comments? What's the next version number?

dickreuter · 2016-11-17T22:07:45Z

Why is nobody merging this?

jreback · 2016-11-17T23:01:11Z

pandas/io/json.py

@@ -740,6 +742,8 @@ def json_normalize(data, record_path=None, meta=None,
        If True, prefix records with dotted (?) path, e.g. foo.bar.field if
        path to records is ['foo', 'bar']
    meta_prefix : string, default None
+    error: {'raise', 'ignore'}, default 'raise'


need a version in 0.20.0 tag

need a version added tag

jreback · 2016-11-17T23:01:44Z

doc/source/whatsnew/v0.20.0.txt

@@ -80,3 +80,4 @@ Performance Improvements

 Bug Fixes
 ~~~~~~~~~
+- Enhancement in ``pandas.io.json.json_normalize``If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14505`)


you can put in enhancements section

jreback · 2016-11-17T23:02:25Z

doc/source/whatsnew/v0.20.0.txt

@@ -80,3 +80,4 @@ Performance Improvements

 Bug Fixes
 ~~~~~~~~~
+- Enhancement in ``pandas.io.json.json_normalize``If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14505`)


space after quotes and before If. put errors='ignore' in double-backticks.

jreback · 2016-11-17T23:02:48Z

pandas/io/json.py

+                                meta_val = np.nan
+                            else:
+                                raise KeyError(
+                                    "Try running with errors='ignore' as the following key may not always be present: " + str(e))


this won't pass linting

Why not? What do I need to change?

We check for PEP8, so this line will be too long. Normally you can set your editor to check for those things, but can also see the output of the check on travis (3d build): https://travis-ci.org/pandas-dev/pandas/jobs/174180562 (at the bottom)

pandas/io/json.py:746:80: E501 line too long (85 > 79 characters) pandas/io/json.py:853:80: E501 line too long (129 > 79 characters) pandas/io/tests/json/test_json_norm.py:229:5: E303 too many blank lines (2) pandas/io/tests/json/test_json_norm.py:271:80: E501 line too long (81 > 79 characters) pandas/io/tests/json/test_json_norm.py:272:80: E501 line too long (104 > 79 characters) pandas/io/tests/json/test_json_norm.py:273:17: E225 missing whitespace around operator pandas/io/tests/json/test_json_norm.py:274:10: E128 continuation line under-indented for visual indent pandas/io/tests/json/test_json_norm.py:275:10: E128 continuation line under-indented for visual indent pandas/io/tests/json/test_json_norm.py:276:10: E128 continuation line under-indented for visual indent pandas/io/tests/json/test_json_norm.py:277:10: E128 continuation line under-indented for visual indent

jreback · 2016-11-17T23:03:11Z

pandas/io/tests/json/test_json_norm.py

@@ -225,6 +225,59 @@ def test_nested_flattens(self):

        self.assertEqual(result, expected)

+
+    def test_json_normalise_fix(self):
+        # issue 14505


put a comment explaining what this is. rename test -> test_json_normalize_errors

jreback · 2016-11-17T23:03:34Z

pandas/io/tests/json/test_json_norm.py

+         'price': {0: '0', 1: '0', 2: '0', 3: '0'},
+         'symbol': {0: 'AAPL', 1: 'GOOG', 2: 'AAPL', 3: 'GOOG'}}
+
+        self.assertEqual(j.fillna('').to_dict(), expected)


test the same with errors='raise'

jreback · 2016-11-17T23:04:29Z

@dickreuter we have 107 open PR's. very few people do code review. simply ping if you want someone to look.

jorisvandenbossche · 2016-11-17T23:29:16Z

@dickreuter just so you know: we don't get notifications when you push new code, only from comments. So it's always good to put a comment after you pushed new code to indicated that you updated. And as @jreback says, pinging like you now did always helps

dickreuter · 2016-11-17T23:35:15Z

Ok great. Please have a look. I think this can now be merged.

jorisvandenbossche · 2016-11-18T09:41:03Z

@dickreuter Did you push the changes? As there is still a merge conflict and there is no new commit since @jreback 's comments

dickreuter · 2016-11-18T17:28:47Z

Just pushed the changes as instructed.

dickreuter · 2016-11-28T20:58:22Z

Anything else I need to fix in this or can this be merged?

dickreuter · 2016-12-01T12:48:41Z

Just did another rebase and pushed again. Can this be merged now? thanks

jreback · 2016-12-01T15:30:01Z

doc/source/whatsnew/v0.20.0.txt

@@ -49,6 +49,7 @@ Other enhancements
 ^^^^^^^^^^^^^^^^^^

 - ``pd.read_excel`` now preserves sheet order when using ``sheetname=None`` (:issue:`9930`)
+- ``pandas.io.json.json_normalize`` If meta keys are not always present a new option to set errors="ignore" (:issue:`14583`)


In pandas.io.json.json_normalize gained the option errors='ignore'|raise (in double-backticks); the default is raise and is backward compatible.

jreback · 2016-12-01T15:30:11Z

doc/source/whatsnew/v0.20.0.txt

@@ -104,4 +105,4 @@ Performance Improvements
 .. _whatsnew_0200.bug_fixes:

 Bug Fixes
-~~~~~~~~~


revert this

jreback · 2016-12-01T15:30:46Z

pandas/io/json.py

@@ -740,6 +742,8 @@ def json_normalize(data, record_path=None, meta=None,
        If True, prefix records with dotted (?) path, e.g. foo.bar.field if
        path to records is ['foo', 'bar']
    meta_prefix : string, default None
+    error: {'raise', 'ignore'}, default 'raise'
+        * ignore: will ignore keyErrors if keys listed in meta are not always present


add an entry for raise

will ignore KeyError, if the keys passed in meta are not all present

jreback · 2016-12-01T15:31:38Z

pandas/io/json.py

+                                meta_val = np.nan
+                            else:
+                                raise \
+                                    KeyError(


better comment please

jreback · 2016-12-01T15:31:50Z

pandas/io/tests/json/test_json_norm.py

@@ -225,6 +225,65 @@ def test_nested_flattens(self):

        self.assertEqual(result, expected)

+
+    def test_json_normalize_errors(self):
+        # If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14583`)


jreback · 2016-12-01T15:32:21Z

just some doc comments, otherwise lgtm.

dickreuter · 2016-12-02T15:20:27Z

Implemented doc amendments as suggested. Let me know if anything else needs to be changed. thanks

jorisvandenbossche

Some PEP8 style issues, otherwise looks good!

jorisvandenbossche · 2016-12-02T23:10:42Z

pandas/io/json.py

@@ -740,6 +742,9 @@ def json_normalize(data, record_path=None, meta=None,
        If True, prefix records with dotted (?) path, e.g. foo.bar.field if
        path to records is ['foo', 'bar']
    meta_prefix : string, default None
+    error: {'raise', 'ignore'}, default 'raise'
+        * ignore: will ignore KeyError if keys listed in meta are not always present
+        * raise: will raise KeyError if keys listed in meta are not always present


I suspect those two lines above are too long (> 80, we check for PEP8, which will let travis fail)

jorisvandenbossche · 2016-12-02T23:11:32Z

pandas/io/json.py

+                            if errors == 'ignore':
+                                meta_val = np.nan
+                            else:
+                                raise KeyError("Try running with errors='ignore' as key %s is not always present.", e)


same here, for PEP8 line too long

jorisvandenbossche · 2016-12-02T23:12:11Z

pandas/io/tests/json/test_json_norm.py

@@ -225,6 +225,65 @@ def test_nested_flattens(self):

        self.assertEqual(result, expected)

+
+    def test_json_normalize_errors(self):
+        # GH14583: If meta keys are not always present a new option to set errors='ignore' has been implemented


here and in other places below line length is also a problem

jorisvandenbossche · 2016-12-10T11:15:12Z

@dickreuter You still have some style issues (that's the reason that travis is failing). You can always look in the log of the third travis build to see the issues (https://travis-ci.org/pandas-dev/pandas/jobs/182143127, scroll to the bottom):

Linting  *.py
pandas/io/json.py:750:1: W293 blank line contains whitespace
pandas/io/json.py:857:80: E501 line too long (81 > 79 characters)
pandas/io/json.py:858:80: E501 line too long (85 > 79 characters)
pandas/io/tests/json/test_json_norm.py:229:5: E303 too many blank lines (2)
pandas/io/tests/json/test_json_norm.py:266:80: E501 line too long (81 > 79 characters)
pandas/io/tests/json/test_json_norm.py:267:80: E501 line too long (87 > 79 characters)
pandas/io/tests/json/test_json_norm.py:269:17: E225 missing whitespace around operator
pandas/io/tests/json/test_json_norm.py:270:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:271:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:272:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:273:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:278:17: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:278:80: E501 line too long (86 > 79 characters)
pandas/io/tests/json/test_json_norm.py:279:28: E127 continuation line over-indented for visual indent
pandas/io/tests/json/test_json_norm.py:279:80: E501 line too long (87 > 79 characters)
Linting *.py DONE

(it also a good idea to set up your IDE to warn for this)

dickreuter · 2016-12-11T19:27:39Z

I believe this is fixed now. Some checks are still not successful but I can't find any reason why. Any suggestions why? All it says is:

The command "ci/lint.sh" exited with 1.
0.00s$ echo "script done"
script done

jreback · 2016-12-11T19:38:54Z

git diff master | flake8 -diff

jorisvandenbossche · 2016-12-11T20:16:27Z

@dickreuter You can also always look at the third travis build (https://travis-ci.org/pandas-dev/pandas/jobs/183014087, scroll to bottom), which says "pandas/io/json.py:857:45: E211 whitespace before '('"

dickreuter · 2016-12-12T22:20:33Z

Linting still seems to fail, even though when I do "git diff master | flake8 --diff" I only get problems in v0.20.0.txt, all the rest is fine. Any suggestions how I can find the remaining issues?

jorisvandenbossche · 2016-12-12T22:38:57Z

@dickreuter It was not linting this time, the mac build failed.

dickreuter · 2016-12-12T23:38:50Z

And it is related to my code? I can't find any details.

jreback · 2016-12-13T11:19:05Z

@dickreuter you need to rebase on master

git rebase -i origin/master then
git push -f

… meta parameter that don't always occur in every item of the list Added documentation and test for issue #14505 Added keyword errors {'raise'|'ignore} Shortened what's new Removed commas in dictionary for linting compatibility Updated doc

dickreuter · 2016-12-13T23:07:27Z

Now all checked have passed.

jreback · 2016-12-13T23:21:52Z

thanks!

…on_normalize Author: dickreuter <dickreuter@yahoo.com> Closes pandas-dev#14583 from dickreuter/json_normalize_enhancement and squashes the following commits: 701c140 [dickreuter] adjusted formatting 3c94206 [dickreuter] shortened lines to pass linting 2028924 [dickreuter] doc changes d298588 [dickreuter] Fixed as instructed in pull request page bcfbf18 [dickreuter] Avoids exception when pandas.io.json.json_normalize

jorisvandenbossche mentioned this pull request Nov 4, 2016

Avoids exception when pandas.io.json.json_normalize contains items in… #14505

Closed

jorisvandenbossche added Error Reporting Incorrect or improved errors from pandas IO JSON read_json, to_json, json_normalize labels Nov 4, 2016

jreback reviewed Nov 17, 2016

View reviewed changes

jreback reviewed Dec 1, 2016

View reviewed changes

pandas/io/json.py

meta_val = np.nan

else:

raise \

KeyError(

Copy link

Contributor

jreback Dec 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better comment please

jreback reviewed Dec 1, 2016

View reviewed changes

jorisvandenbossche reviewed Dec 2, 2016

View reviewed changes

jorisvandenbossche added this to the 0.20.0 milestone Dec 10, 2016

dickreuter added 5 commits December 13, 2016 21:50

Fixed as instructed in pull request page

d298588

doc changes

2028924

shortened lines to pass linting

3c94206

adjusted formatting

701c140

jreback closed this in 7d8bc0d Dec 13, 2016

Added errors{'raise','ignore'} for keys not found in meta for json_normalize #14583

Added errors{'raise','ignore'} for keys not found in meta for json_normalize #14583

Conversation

dickreuter commented Nov 4, 2016 • edited by jorisvandenbossche Loading

codecov-io commented Nov 4, 2016 • edited Loading

Current coverage is 85.26% (diff: 100%)

dickreuter commented Nov 4, 2016

dickreuter commented Nov 17, 2016

jreback Nov 17, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jreback commented Nov 17, 2016

jorisvandenbossche commented Nov 17, 2016

dickreuter commented Nov 17, 2016 • edited by jorisvandenbossche Loading

jorisvandenbossche commented Nov 18, 2016

dickreuter commented Nov 18, 2016

dickreuter commented Nov 28, 2016

dickreuter commented Dec 1, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jreback commented Dec 1, 2016

dickreuter commented Dec 2, 2016

jorisvandenbossche left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jorisvandenbossche commented Dec 10, 2016

dickreuter commented Dec 11, 2016

jreback commented Dec 11, 2016

jorisvandenbossche commented Dec 11, 2016

dickreuter commented Dec 12, 2016

jorisvandenbossche commented Dec 12, 2016

dickreuter commented Dec 12, 2016

jreback commented Dec 13, 2016

dickreuter commented Dec 13, 2016

jreback commented Dec 13, 2016

dickreuter commented Nov 4, 2016 •

edited by jorisvandenbossche

Loading

codecov-io commented Nov 4, 2016 •

edited

Loading

jreback Nov 17, 2016 •

edited

Loading

dickreuter commented Nov 17, 2016 •

edited by jorisvandenbossche

Loading