ARROW-1817: [Java] Configure JsonReader to read floating point NaN values#1375
ARROW-1817: [Java] Configure JsonReader to read floating point NaN values#1375BryanCutler wants to merge 2 commits intoapache:masterfrom
Conversation
|
LGTM. Do you want to add "NaN" in integration/data/simple.json? |
|
The background for this is that an existing Spark unit test for Arrow fails with the current master branch due to no longer being able to parse these values. The configuration fixes this and allows for the following JSON: This is not standard JSON, but is supported by the Jackson library, I don't think it is something to be concerned about though. cc @icexelloss |
Thanks @icexelloss for the quick review! Yeah, I'll try adding this to the integration tests and see if it's handled |
|
It doesn't look like C++ is able to parse the NaN value, but I'm not sure if this is something that we need to be running integration for. @wesm what do you think? |
|
Hm, I can try to fix the C++ reader. Up to you |
|
Given that the issue was only on the Java side and it's not standard JSON, I would say not to bother with the C++ reader and integration tests for now. |
|
LGTM. +1. With or without NaN in integration test. |
0b7295b to
4c4682a
Compare
|
Thanks @icexelloss , I'll try merging this later today if no more comments |
…lues The last upgrade of the Jackson JSON library changed behavior to no longer allow reading of "NaN" values by default. This change configures the JSON generator and parser to allow for NaN values (unquoted) alongside standard floating point numbers. A test was added for JSON writing/reading and modified the test for Arrow file and stream . Author: Bryan Cutler <cutlerb@gmail.com> Closes apache#1375 from BryanCutler/java-JsonReader-all_non_numeric-ARROW-1817 and squashes the following commits: 4c4682a [Bryan Cutler] configure JsonWriter to write NaN not as strings, add test for read and write of float with NaN 1fa24f4 [Bryan Cutler] added conf for JacksonParser to allow NaN tokens
The last upgrade of the Jackson JSON library changed behavior to no longer allow reading of "NaN" values by default. This change configures the JSON generator and parser to allow for NaN values (unquoted) alongside standard floating point numbers. A test was added for JSON writing/reading and modified the test for Arrow file and stream .