Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to simdjson 0.6.0 #52

Merged
merged 4 commits into from
Oct 24, 2020
Merged

Update to simdjson 0.6.0 #52

merged 4 commits into from
Oct 24, 2020

Conversation

eddelbuettel
Copy link
Owner

This updates to the just released 0.6.0 of simdjson. I only needed to flip four test predicates to now expect error.

@knapply Give a quick lookover; we can probably release as is to get this to our users and can then ponder in parallel if there is a need (and time !!) for additional work from our end. I have been musing if we should maybe think about adding simple accessors for utf-validation.

/cc @lemire @jkeiser with big thanks as always

@jkeiser
Copy link

jkeiser commented Oct 23, 2020

Not needed for this obviously but I'd love see a sample using the new (preview) on demand api. It seems like the r binding is the only binding that could let users use it., because of the wondrous c++Ness.

@eddelbuettel
Copy link
Owner Author

Good plan. We can hopefully add something. Is there an example in simdjson I should be aware of? Anything else that would help bootstrap our efforts a little?

@lemire
Copy link
Collaborator

lemire commented Oct 23, 2020

@eddelbuettel

One caveat: we do not have runtime dispatching working yet for On Demand, so the code must be compiled to the machine (e.g., -march=native).

    auto json = "{\"coordinates\":[{\"x\":1.1,\"y\":2.2,\"z\":3.3}, {\"x\":3.1,\"y\":2.2,\"z\":4.3}, {\"x\":14.1,\"y\":2.1,\"z\":3.3}]}"_padded;
    ondemand::parser parser{};
    auto doc = parser.iterate(json);
    double x{0};
    double y{0};
    double z{0};
    for (ondemand::object point_object : doc["coordinates"]) {
      x += double(point_object["x"]);
      y += double(point_object["y"]);
      z += double(point_object["z"]);
    }

We'll be happy to provide more examples, but I think you can get the idea from this above.

Note that there are still rough edges to the On Demand API. We expect that it is fully correct (save maybe for corner cases), but there are things to watch for, like you cannot iterate over two objects at the same time...
https://github.com/simdjson/simdjson/blob/master/doc/ondemand.md

Despite all my warnings, we really want people to try it out. We really think it is stuff from the future.

@jkeiser
Copy link

jkeiser commented Oct 23, 2020

Yep! https://github.com/simdjson/simdjson/blob/master/doc/ondemand.md

It's essentially the DOM api, but it parses the json progressively while you iterate through it (forward only).

@jkeiser
Copy link

jkeiser commented Oct 23, 2020

To be clear, you can iterate two nested objects at the same time as long as you finish with inner one before continuing the outer (like nested for loops). What @lemire means is you can't keep references to two sibling objects at the same time. In fact, right now you need to ensure that any object or array variable is destroyed / goes out of scope before you continue iterating a parent array or look up another field in a parent object.

Basically make sure you scope object and array variables the same way they appear in the json file.

@jkeiser
Copy link

jkeiser commented Oct 23, 2020

Note that you should be able to use "auto" Instead of ondemand::object, in that example.

@jkeiser
Copy link

jkeiser commented Oct 23, 2020

Oh. Wait. No, you can't.

@lemire
Copy link
Collaborator

lemire commented Oct 23, 2020

@jkeiser My example is copied-pasted from our tests. It should work.

(The rough edges become smooth after one reads the short documentation, but I have been told that programmers in 2020 don't read documentation or books.)

Copy link
Collaborator

@knapply knapply left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Dirk. Looks good (of course!).

With the changes in query behavior requiring some test modifications on this and the last update, I recommend we enable codecov to run on Travis. We're largely set up for it already - and I just ran it locally to check if there are any dramatic coverage changes: we're still 80+% - so I'll see if I can throw that together in a separate PR now.

@knapply knapply mentioned this pull request Oct 24, 2020
@eddelbuettel
Copy link
Owner Author

Good idea regarding coverage -- I just added a simple commit doing just that via run.sh as I do in a number of other repos too.

@knapply
Copy link
Collaborator

knapply commented Oct 24, 2020

Good idea regarding coverage -- I just added a simple commit doing just that via run.sh as I do in a number of other repos too.

Welp, here I am digging through Rcpp to match it. Perfect!

@eddelbuettel
Copy link
Owner Author

eddelbuettel commented Oct 24, 2020

Yes, sorry, Rcpp uses Docker so different setup. I at first grep'ed wrong too, then looked into run.sh and indeed:

edd@rob:~/git/rcppsimdjson(feature/simdjson_0.6.0)$ grep coverage ~/git/*/.travis.yml
/home/edd/git/anytime/.travis.yml:  - ./run.sh coverage
/home/edd/git/asioheaders/.travis.yml:#  - ./run.sh coverage
/home/edd/git/bh/.travis.yml:#  - ./run.sh coverage
/home/edd/git/dang-azure/.travis.yml:#  - ./run.sh coverage
/home/edd/git/dang.orig/.travis.yml:#  - ./run.sh coverage
/home/edd/git/dang/.travis.yml:#  - ./run.sh coverage
/home/edd/git/digest/.travis.yml:  - ./run.sh coverage
/home/edd/git/n2r/.travis.yml:## no tests, coverage
/home/edd/git/nanotime-lsilvest/.travis.yml:  - ./run.sh coverage
/home/edd/git/nanotime/.travis.yml:  - ./run.sh coverage
/home/edd/git/random/.travis.yml:#  - ./run.sh coverage
/home/edd/git/rapidatetime/.travis.yml:  - ./run.sh coverage
/home/edd/git/rapiserialize/.travis.yml:#  - ./run.sh coverage
/home/edd/git/rblpapi-alfred/.travis.yml:#  - ./run.sh coverage
/home/edd/git/rblpapi/.travis.yml:#  - ./run.sh coverage
/home/edd/git/rcppapt/.travis.yml:  - ./run.sh coverage
/home/edd/git/rcppcctz/.travis.yml:#  - ./run.sh coverage
/home/edd/git/rcppclassic/.travis.yml:#  - ./run.sh coverage
/home/edd/git/rcppeigen/.travis.yml:#  - ./run.sh coverage
/home/edd/git/rcppnloptexample/.travis.yml:  - ./run.sh coverage
/home/edd/git/rcppquantuccia/.travis.yml:#  - ./run.sh coverage
/home/edd/git/rcppsimdjson/.travis.yml:  - ./run.sh coverage
/home/edd/git/rcppspdlog/.travis.yml:#  - ./run.sh coverage
/home/edd/git/rcpptoml/.travis.yml:  - ./run.sh coverage
/home/edd/git/rcpp/.travis.yml:    - name: coverage
/home/edd/git/rfoaas/.travis.yml:  - ./run.sh coverage
/home/edd/git/rpushbullet/.travis.yml:  - if [[ "${TRAVIS_PULL_REQUEST}" == "false" ]]; then ./run.sh coverage; fi
/home/edd/git/tldbclr/.travis.yml:#  - if [[ "${TRAVIS_PULL_REQUEST}" == "false" ]]; then ./run.sh coverage; fi
/home/edd/git/ttdo/.travis.yml:  - ./run.sh coverage

Always good to have a local corpus of work to copy from ;-)

@codecov
Copy link

codecov bot commented Oct 24, 2020

Codecov Report

❗ No coverage uploaded for pull request base (master@74ffa56). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master      #52   +/-   ##
=========================================
  Coverage          ?   84.15%           
=========================================
  Files             ?       18           
  Lines             ?     1218           
  Branches          ?        0           
=========================================
  Hits              ?     1025           
  Misses            ?      193           
  Partials          ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 74ffa56...31061e0. Read the comment docs.

@eddelbuettel eddelbuettel merged commit eb4f03e into master Oct 24, 2020
@eddelbuettel eddelbuettel deleted the feature/simdjson_0.6.0 branch October 24, 2020 17:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants