Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate parsing into user structures directly #627

Closed
pdimov opened this issue Oct 2, 2021 · 8 comments · Fixed by #924
Closed

Investigate parsing into user structures directly #627

pdimov opened this issue Oct 2, 2021 · 8 comments · Fixed by #924

Comments

@pdimov
Copy link
Member

pdimov commented Oct 2, 2021

E.g.

std::vector<X> v;
std::string_view json = /*...*/;

boost::json::parse_into( v, json );

where X is something that Boost.Json recognizes, such as a struct with value_to defined, or (given #626) a described struct.

As an example of why this is useful, on one popular benchmark

https://github.com/kostya/benchmarks#json
https://github.com/kostya/benchmarks/blob/master/json/test_boost_json.cpp

having the ability to parse directly into either std::vector<coordinate> or a custom container that accumulates in its push_back member function instead of storing the elements could potentially gain a lot of performance.

@pdimov
Copy link
Member Author

pdimov commented Oct 3, 2021

I made a prototype, https://gist.github.com/pdimov/bcdaa4fef606fdcd0f21195a79bf8648

Unfortunately it's slower. But still cooler. :-)

1.json: 74181420 bytes
boost::json::parse: 456 ms
  x: -5.00063e-30, y: 5.00037e+30, z: 0.500451: 115 ms
parse_into: 702 ms
  x: -5.00063e-30, y: 5.00037e+30, z: 0.500451: 2 ms

@madmongo1
Copy link
Collaborator

The manual loop, parsing via the JSON DOM is not performing error-checking on the types in the DOM. It'll exhibit UB if the JSON is valid but not correctly representing the co-ordinate array, so I think it's getting a better score than it should.

@pdimov
Copy link
Member Author

pdimov commented Oct 3, 2021

It's copied from the benchmark code.

This needs to be refactored to contain the subhandlers inline in the composite handlers, rather than allocating them with new each time. ("Structured concurrency" :-) ) (Also, needs a proper error enum. And a handler for tuples. And perform range checking on the numbers. I ran out of night.)

@pdimov
Copy link
Member Author

pdimov commented Oct 3, 2021

Also needs a handler for json::value, so that json files that have random unstructured parts can still fit into a structure. I had to remove the opts part of the original json generated by https://github.com/kostya/benchmarks/blob/ef8dbdec90bbcb3b242da8211ff7db5396e23979/json/generate_json.rb because it's too chaotic.

(opts would have to be a map<string, pair<int, bool>>, and I don't have support for pair. I could have declared an extra struct for that I suppose, but that's kind of not the spirit of the benchmark - opts are clearly intended as free form json object.)

@pdimov
Copy link
Member Author

pdimov commented Oct 3, 2021

This needs to be refactored to contain the subhandlers inline in the composite handlers, rather than allocating them with new each time. ("Structured concurrency" :-) ) (Also, needs a proper error enum. And a handler for tuples. And perform range checking on the numbers. I ran out of night.)

https://gist.github.com/pdimov/c74627565d35cdff97d7f4d0ae7a7b00

This one is faster.

1.json: 74181420 bytes
boost::json::parse: 431 ms
  x: -5.00063e-30, y: 5.00037e+30, z: 0.500451: 131 ms
parse_into coordinates: 234 ms
  x: -5.00063e-30, y: 5.00037e+30, z: 0.500451: 2 ms
parse_into coordinates2: 182 ms
  x: 0.500451, y: 5.00037e+30, z: 0.500451: 0 ms

@vinniefalco
Copy link
Member

It needs its own suite of parsing functions, e.g.

parse_object_into
parse_array_into
parse_string_into
parse_number_into
parse_bool_into
parse_null_into

Using pull-parsing with this signature

template< typename V >
bool
parse_string_into(
  char const*& begin,
  char const* end,
  V& v,
  error_code& ec );

@vinniefalco
Copy link
Member

template<class V> using parser_for = basic_parser<detail::handler<V>>;

@vinniefalco
Copy link
Member

there should probably be is_bool trait for consistency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants