-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update to simdjson 0.6.0 #52
Conversation
Not needed for this obviously but I'd love see a sample using the new (preview) on demand api. It seems like the r binding is the only binding that could let users use it., because of the wondrous c++Ness. |
Good plan. We can hopefully add something. Is there an example in |
One caveat: we do not have runtime dispatching working yet for On Demand, so the code must be compiled to the machine (e.g., auto json = "{\"coordinates\":[{\"x\":1.1,\"y\":2.2,\"z\":3.3}, {\"x\":3.1,\"y\":2.2,\"z\":4.3}, {\"x\":14.1,\"y\":2.1,\"z\":3.3}]}"_padded;
ondemand::parser parser{};
auto doc = parser.iterate(json);
double x{0};
double y{0};
double z{0};
for (ondemand::object point_object : doc["coordinates"]) {
x += double(point_object["x"]);
y += double(point_object["y"]);
z += double(point_object["z"]);
} We'll be happy to provide more examples, but I think you can get the idea from this above. Note that there are still rough edges to the On Demand API. We expect that it is fully correct (save maybe for corner cases), but there are things to watch for, like you cannot iterate over two objects at the same time... Despite all my warnings, we really want people to try it out. We really think it is stuff from the future. |
Yep! https://github.com/simdjson/simdjson/blob/master/doc/ondemand.md It's essentially the DOM api, but it parses the json progressively while you iterate through it (forward only). |
To be clear, you can iterate two nested objects at the same time as long as you finish with inner one before continuing the outer (like nested for loops). What @lemire means is you can't keep references to two sibling objects at the same time. In fact, right now you need to ensure that any object or array variable is destroyed / goes out of scope before you continue iterating a parent array or look up another field in a parent object. Basically make sure you scope object and array variables the same way they appear in the json file. |
Note that you should be able to use "auto" Instead of ondemand::object, in that example. |
Oh. Wait. No, you can't. |
@jkeiser My example is copied-pasted from our tests. It should work. (The rough edges become smooth after one reads the short documentation, but I have been told that programmers in 2020 don't read documentation or books.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Dirk. Looks good (of course!).
With the changes in query behavior requiring some test modifications on this and the last update, I recommend we enable codecov to run on Travis. We're largely set up for it already - and I just ran it locally to check if there are any dramatic coverage changes: we're still 80+% - so I'll see if I can throw that together in a separate PR now.
Good idea regarding coverage -- I just added a simple commit doing just that via |
Welp, here I am digging through Rcpp to match it. Perfect! |
Yes, sorry, Rcpp uses Docker so different setup. I at first grep'ed wrong too, then looked into edd@rob:~/git/rcppsimdjson(feature/simdjson_0.6.0)$ grep coverage ~/git/*/.travis.yml
/home/edd/git/anytime/.travis.yml: - ./run.sh coverage
/home/edd/git/asioheaders/.travis.yml:# - ./run.sh coverage
/home/edd/git/bh/.travis.yml:# - ./run.sh coverage
/home/edd/git/dang-azure/.travis.yml:# - ./run.sh coverage
/home/edd/git/dang.orig/.travis.yml:# - ./run.sh coverage
/home/edd/git/dang/.travis.yml:# - ./run.sh coverage
/home/edd/git/digest/.travis.yml: - ./run.sh coverage
/home/edd/git/n2r/.travis.yml:## no tests, coverage
/home/edd/git/nanotime-lsilvest/.travis.yml: - ./run.sh coverage
/home/edd/git/nanotime/.travis.yml: - ./run.sh coverage
/home/edd/git/random/.travis.yml:# - ./run.sh coverage
/home/edd/git/rapidatetime/.travis.yml: - ./run.sh coverage
/home/edd/git/rapiserialize/.travis.yml:# - ./run.sh coverage
/home/edd/git/rblpapi-alfred/.travis.yml:# - ./run.sh coverage
/home/edd/git/rblpapi/.travis.yml:# - ./run.sh coverage
/home/edd/git/rcppapt/.travis.yml: - ./run.sh coverage
/home/edd/git/rcppcctz/.travis.yml:# - ./run.sh coverage
/home/edd/git/rcppclassic/.travis.yml:# - ./run.sh coverage
/home/edd/git/rcppeigen/.travis.yml:# - ./run.sh coverage
/home/edd/git/rcppnloptexample/.travis.yml: - ./run.sh coverage
/home/edd/git/rcppquantuccia/.travis.yml:# - ./run.sh coverage
/home/edd/git/rcppsimdjson/.travis.yml: - ./run.sh coverage
/home/edd/git/rcppspdlog/.travis.yml:# - ./run.sh coverage
/home/edd/git/rcpptoml/.travis.yml: - ./run.sh coverage
/home/edd/git/rcpp/.travis.yml: - name: coverage
/home/edd/git/rfoaas/.travis.yml: - ./run.sh coverage
/home/edd/git/rpushbullet/.travis.yml: - if [[ "${TRAVIS_PULL_REQUEST}" == "false" ]]; then ./run.sh coverage; fi
/home/edd/git/tldbclr/.travis.yml:# - if [[ "${TRAVIS_PULL_REQUEST}" == "false" ]]; then ./run.sh coverage; fi
/home/edd/git/ttdo/.travis.yml: - ./run.sh coverage Always good to have a local corpus of work to copy from ;-) |
Codecov Report
@@ Coverage Diff @@
## master #52 +/- ##
=========================================
Coverage ? 84.15%
=========================================
Files ? 18
Lines ? 1218
Branches ? 0
=========================================
Hits ? 1025
Misses ? 193
Partials ? 0 Continue to review full report at Codecov.
|
This updates to the just released 0.6.0 of simdjson. I only needed to flip four test predicates to now expect error.
@knapply Give a quick lookover; we can probably release as is to get this to our users and can then ponder in parallel if there is a need (and time !!) for additional work from our end. I have been musing if we should maybe think about adding simple accessors for utf-validation.
/cc @lemire @jkeiser with big thanks as always