-
Notifications
You must be signed in to change notification settings - Fork 387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LzoJsonRecordReader and JsonLoader may fail with NullPointerException #65
Comments
Oh, and this is latest ElephantBird 2.0.x from the eb-dev integration branch (commit 50ad2a6). |
JSONParser#parse does mention null as on of the possible returns. But could not easily see how that occurs. yes, checking for null and logging a warning would be a good fix (same as a parse error). |
I do have the input data that produces this problem, but I'd have to narrow it down to the exact snippet that triggers the bug. I already have a patch available, but I want to add a proper unit test + test data to it. Will work on this next week. |
I have not forgotten this issue. Stay tuned! |
I sent you a pull request at https://github.com/kevinweil/elephant-bird/pull/99. The commit is based on the eb-dev branch as of commit 06608de. |
Edit: New pull request with slightly improved patch at https://github.com/kevinweil/elephant-bird/pull/99. I closed the old pull request #98. |
See my follow-up comments on commit 86bfa834d0, addressing Raghu's questions. |
The patch was merged in, see https://github.com/kevinweil/elephant-bird/pull/102. |
We have run into some issues with LzoJsonRecordReader and JsonLoader where on some JSON data -- we haven't been able to narrow it down yet -- both classes will fail with a NullPointerException.
For instance, here is the relevant snippet of LzoJsonRecordReader#decodeLineToJson(). What seems to happen is that LzoJsonRecordReader#nextKeyValue() passes a null value to decodeLineToJson(). The problem might therefore originate in the line "int newSize = in_.readLine(currentLine_);" in nextKeyValue(). From what I remember, json-simple's parse method might actually return a null value as well (= adding a safeguard to LzoJsonLoader and JsonLoader might be a good idea anyways) but in this case the null value is already passed to json-simple, before the latter will actually return itself with a null value.
A similar NPE issue exists for the JsonLoader#parseStringToTuple().
A quick fix is to simply catch for a NPE in LzoJsonRecordReader#decodeLineToJson() and JsonLoader#parseStringToTuple() as is already being done for e.g. ParseException. This fix has worked fine for us. As I said above, however, this might not fix the root of the problem.
I am not so familiar with the code, so I am not sure whether just catching NPEs is the best fix. Maybe you have a better idea!
The text was updated successfully, but these errors were encountered: