Nested objects in a nested array - key disappears #24

Closed
soyuka opened this Issue Dec 18, 2014 · 8 comments

soyuka commented Dec 18, 2014

To reproduce the behavior, modify example.json to:

[
  {
    "name": "example document for wicked fast parsing of huge json docs",
    "integer": 123,
    "totally sweet scientific notation": -123.123e-2,
    "unicode? you betcha!": "ú™£¢∞§\u2665",
    "zero character": "0",
    "null is boring": null,
    //this is the important part
    "lines": [
      { "test" : 1, "test2" : 2},
      { "test" : 1, "test2" : 2}
    ]
  },
  ...
]

Using the example.php script, it'll produce:

  [0]=>
  array(7) {
    ...
    [0]=> //this key should be "lines" not "0"
    array(2) {
      [0]=>
      array(2) {
        ["test"]=>
        int(1)
        ["test2"]=>
        int(2)
      }
      [1]=>
      array(2) {
        ["test"]=>
        int(1)
        ["test2"]=>
        int(2)
      }
    }
  }

I first thought it was because of the start_object method, but it's not.
I haven't looked into the core script to see if I could fix this, but if someone has a clue about it, that would be great! Is it a bug or is this the intended behavior?

Member

dfreeman commented Dec 18, 2014

The problem appears to be in the Listener implementation in example.php rather than in the core parser. Its internal $_key variable only remembers a single value, which means if you have nested objects, the outer key gets forgotten.

In the simplest case, you can see that the "outer" key gets forgotten once the "inner" one is parsed, so

{
  "outer": {
    "inner": true
  }
}

ends up parsing as though it were

[{ "inner": true }]

@gonzofy Seems like the example listener probably needs to keep a stack of keys rather than just the most recent one seen.
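
A minimal sketch of that key-stack idea follows. It is not the project's example listener: the class name KeyStackListener is hypothetical, and the method names only mirror the event callbacks mentioned in this thread (start_object, key, value and so on), so the real listener interface may differ. The events are fired by hand here instead of by the parser, so the snippet runs on its own.

```php
<?php
// Hypothetical listener sketch. Names are illustrative assumptions, not the
// library's actual example code.
class KeyStackListener
{
    private $stack = array(); // containers being built, outermost first
    private $keys = array();  // pending key per open container (null inside arrays)
    public $result = null;    // finished document

    public function start_object() { $this->open(); }
    public function start_array()  { $this->open(); }
    public function end_object()   { $this->close(); }
    public function end_array()    { $this->close(); }

    public function key($key)
    {
        // Remember the key for the current depth only, so an inner key
        // can no longer overwrite the outer one.
        $this->keys[count($this->keys) - 1] = $key;
    }

    public function value($value)
    {
        if (empty($this->stack)) {
            $this->result = $value; // document is a bare scalar
        } else {
            $this->insert($value);
        }
    }

    private function open()
    {
        $this->stack[] = array();
        $this->keys[] = null;
    }

    private function close()
    {
        $finished = array_pop($this->stack);
        array_pop($this->keys);
        if (empty($this->stack)) {
            $this->result = $finished;
        } else {
            $this->insert($finished); // attach to the parent under the parent's key
        }
    }

    private function insert($value)
    {
        $depth = count($this->stack) - 1;
        $key = $this->keys[$depth];
        if ($key === null) {
            $this->stack[$depth][] = $value;     // inside an array: append
        } else {
            $this->stack[$depth][$key] = $value; // inside an object: use its key
            $this->keys[$depth] = null;
        }
    }
}

// Events for {"outer": {"inner": true}}, fired manually.
$listener = new KeyStackListener();
$listener->start_object();
$listener->key('outer');
$listener->start_object();
$listener->key('inner');
$listener->value(true);
$listener->end_object();
$listener->end_object();

var_dump($listener->result === array('outer' => array('inner' => true))); // prints bool(true)
```

Keeping one pending key per open container means the "outer" key is still available when the inner object closes, which is the step the single $_key variable loses.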


soyuka commented Dec 18, 2014

But with a simple array containing values, it works: https://github.com/salsify/jsonstreamingparser/blob/master/example/example.json#L16.

Member

dfreeman commented Dec 18, 2014

If you look carefully at the output of the example listener on that JSON document, you'll see it has the same problem. The object in which the "nested array" key is nested should itself have the key "nested object" in its parent, but instead it's just 0.

array(2) {
  [0]=>
  // <snip>
  [1]=>
  array(4) {
    ["name"]=>
    string(14) "another object"
    ["cooler than first object?"]=>
    bool(true)
    [0]=> // should be ["nested object"]
    array(3) {
      ["nested object?"]=>
      bool(true)
      ["is nested array the same combination i have on my luggage?"]=>
      bool(true)
      [0]=> // should be ["nested array"]
      array(5) {
        ["nested array"]=> // should be [0], and all subsequent indices should be one higher
        int(1)
        [0]=>
        int(2)
        [1]=>
        int(3)
        [2]=>
        int(4)
        [3]=>
        int(5)
      }
    }
    ["false"]=>
    bool(false)
  }
}

soyuka commented Dec 19, 2014

It's not easy to build a Listener that does what I want with such an undocumented tool :/.

There are no comments at all in the core files; it'd be nice to have generated PHPDoc to understand how everything works.

Do you have some concrete use cases on how you use this json parser?


nosweat commented Dec 19, 2014

Same problem here, but I am trying to solve the issue. I really need to get this working, as I work on large JSON files that are full of nested arrays... "Hacking" parser.php


soyuka commented Dec 19, 2014

I'll find a solution. I've spent the whole day working on a listener that calls a function on every node of the root array, with no success so far. I'll of course share it when it's working ;).


soyuka commented Jan 5, 2015

http://soyuka.me/streaming-big-json-files-the-good-way/

@nosweat My Listener is on the blog post. Feel free to use it.
Here is a small json test file

@dfreeman Thanks for this nice script; a few more comments in the main code would be helpful though!

@soyuka soyuka closed this Jan 5, 2015

Member

gonzofy commented Jan 24, 2015

@soyuka I meant to compliment you on the blog post a while ago but forgot about it. Great post; thanks for writing! I agree that the code could be better documented, and I will merge helpful comments from anyone who wants to spend time writing them (right now I'm too distracted with other projects to do more than fix bugs).
