Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get() key search can bleed through levels of JSON hierarchy #5

Closed
daboyuka opened this issue Mar 22, 2016 · 11 comments
Closed

Get() key search can bleed through levels of JSON hierarchy #5

daboyuka opened this issue Mar 22, 2016 · 11 comments

Comments

@daboyuka
Copy link
Contributor

I want to first thank you, @buger, for your work on this library. Looking up a few JSON key paths in large JSON blobs is a significant bottleneck in a project I'm working on, and your library could give us a big speedup without changing our data format.

Unfortunately, I've discovered an issue in Get(): when searching for a key, Get() may locate that key outside the current JSON object. Here is an example test case that breaks (written using check.v1):

package jsonparser_test

import (
    "github.com/buger-jsonparser"
    . "gopkg.in/check.v1"
    "testing"
)

func (s *JsonParserTests) TestJsonParserSearchBleed(c *C) {
    killer := []byte(`{
      "parentkey": {
        "childkey": {
          "grandchildkey": 123
        },
        "otherchildkey": 123
      }
    }`)

    var jtype int

    _, jtype, _, _ = jsonparser.Get(killer, "childkey")
    c.Assert(jtype, Equals, jsonparser.NotExist) // fails, returns data parentkey.childkey

    _, jtype, _, _ = jsonparser.Get(killer, "parentkey", "childkey", "otherchildkey")
    c.Assert(jtype, Equals, jsonparser.NotExist) // fails, returns data parentkey.otherchildkey
}

// Boilerplate
func Test(t *testing.T) { TestingT(t) }
type JsonParserTests struct{}
var _ = Suite(&JsonParserTests{})

The issue is that Get() uses bytes.Index() to find the next key it's looking for, but only validates it by checking that it is surrounded by double quotes and followed by a colon. In particular, it does not check whether it has crossed an unmatched sequence of braces, which would indicate transitioning into another JSON object level.

I don't have a great suggestion as to how to fix this, sadly. Best of luck.

@buger
Copy link
Owner

buger commented Mar 22, 2016

Good point, i have few ideas how to fix it, will keep you updated, thanks!

@buger
Copy link
Owner

buger commented Mar 22, 2016

Should be fixed now, for each next key it will limit scope to parent key object.

@buger buger closed this as completed Mar 22, 2016
@daboyuka
Copy link
Contributor Author

Sorry to say, your patch only fixes the second failing assertion, but not the first (which I think is the more difficult one, sadly).

@buger
Copy link
Owner

buger commented Mar 22, 2016

I see what you mean, thank you for pointing. Yes, it is definitely a bit harder, will try to get it fixed.

@buger buger reopened this Mar 22, 2016
@daboyuka
Copy link
Contributor Author

For what it's worth, my suggestion would be to use your parsing functions to more carefully step through the keys in an object. A (very) rough sketch:

  1. nextValue to seek to a key
  2. bytes.Equal to compare the key
  3. if failed, stringEnd to skip the key
  4. skip whitespace, colon, whitespace
  5. check type of value, use trailingBracket to skip object, etc.
  6. Repeat

Unfortunately this won't be nearly as efficient as bytes.Index, so perhaps there is a more efficient way.

@daboyuka
Copy link
Contributor Author

I found another case that breaks the current parser: key "<anything>\"abc": matches a lookup for key "abc". Here's an expanded test case including that:

package jsonparser_test

import (
    "github.com/buger-jsonparser"
    . "gopkg.in/check.v1"
    "testing"
)

func (s *JsonParserTests) TestJsonParserSearchBleed(c *C) {
    //c.Skip("jsonparser is broken as of 2016-03-21")

    killer := []byte(`{
          "parentkey": {
            "childkey": {
              "grandchildkey": 123
            },
            "otherchildkey": 123
          },
          "bad key\"good key": 123,
        }`)

    var jtype int

    _, jtype, _, _ = jsonparser.Get(killer, "childkey")
    c.Assert(jtype, Equals, jsonparser.NotExist) // fails, returns data from parentkey.childkey

    _, jtype, _, _ = jsonparser.Get(killer, "parentkey", "childkey", "otherchildkey")
    c.Assert(jtype, Equals, jsonparser.NotExist) // fails, returns data from parentkey.otherchildkey

    _, jtype, _, _ = jsonparser.Get(killer, "good key")
    c.Assert(jtype, Equals, jsonparser.NotExist) // fails, returns data from badkey"goodkey
}

// Boilerplate
func Test(t *testing.T) { TestingT(t) }
type JsonParserTests struct{}
var _ = Suite(&JsonParserTests{})

@buger
Copy link
Owner

buger commented Mar 22, 2016

Can you try this branch https://github.com/buger/jsonparser/tree/key-search, i rewrote how key search works. Thanks!

@buger
Copy link
Owner

buger commented Mar 22, 2016

(to test branch, create vendor folder and put jsonparser to vendor/github.com/buger/jsonparser, then checkout needed branch there)

@daboyuka
Copy link
Contributor Author

Thanks, I'll check it in a little while. I'll also make a PR with the tests above for your convenience. Thanks for tackling this!

daboyuka referenced this issue in pendo-io/jsonparser Mar 22, 2016
@daboyuka
Copy link
Contributor Author

Your key-search branch passes the three cases identified in this issue (test cases added in #7). Thanks!

buger added a commit that referenced this issue Mar 23, 2016
Added tricky JSON tests (issue #5)
@buger
Copy link
Owner

buger commented Mar 24, 2016

Merged to master, thanks!

@buger buger closed this as completed Mar 24, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants