Skip to content

Conversation

fedormsv
Copy link

functions for parsing directly from string, without unnecessary pass through string->stringstream->string pipeline

@coveralls
Copy link

Coverage Status

Coverage increased (+0.008%) to 93.81% when pulling a9025c7 on fedormsv:parseFromString into 45733df on open-source-parsers:master.

@BillyDonahue
Copy link
Contributor

i don't see how this is an optimization.

it looks like just adding some new parse variants. I don't really want to make a parse variant for each storage format a user might have, so we have a stream based parse and we have a contiguous char array parse. We don't favor String as source of that buffer. Users are free to write wrappers.

@fedormsv
Copy link
Author

fedormsv commented Sep 16, 2020

if I have a string to parse Json from, I should actually do the following (non throwing version):

std::string stringToParse;
std::istringstream iss(stringToParse);
Json::Value value;
Json::parseFromStream(builder, iss, &value, nullptr);

With the parseFromString being as following

bool parseFromStream(CharReader::Factory const& fact, IStream& sin, Value* root,
                     String* errs) {
  OStringStream ssin;
  ssin << sin.rdbuf();
  String doc = ssin.str();
  char const* begin = doc.data();
  char const* end = begin + doc.size();
  // Note that we do not actually need a null-terminator.
  CharReaderPtr const reader(fact.newCharReader());
  return reader->parse(begin, end, root, errs);
}

we actually arrive to the following: outside the parseFromStream we wrap a string to a stringstream. Inside parseFromStream we are redirecting it to another stringstream, from which it gets copied to doc object. The latter is then used to get begin and end of the string buffer.

@BillyDonahue
Copy link
Contributor

if I have a string to parse Json from, I should actually do the following (non throwing version):

std::string stringToParse;
std::istringstream iss(stringToParse);
Json::Value value;
Json::parseFromStream(builder, iss, &value, nullptr);

This isn't what a user with a string should do.
They should make a reader and call reader->parse with the string bounds:

std::unique_ptr<Json::CharReader> reader(builder.newCharReader());

reader->parse(
        stringToParse.data(),
        stringToParse.data() + stringToParse.size(),
        &value,
        nullptr);

parseFromStream is just capturing a common pattern for a convenient but inefficient dump of an istream. That pattern is inconvenient to formulate from scratch so we wrote it down, but it's not a primary API for Jsoncpp. The reader range parse is really the fundamental kind of parse, and a user with a string can use it readily.

With the parseFromString being as following

bool parseFromStream(CharReader::Factory const& fact, IStream& sin, Value* root,
                     String* errs) {
  OStringStream ssin;
  ssin << sin.rdbuf();
  String doc = ssin.str();
  char const* begin = doc.data();
  char const* end = begin + doc.size();
  // Note that we do not actually need a null-terminator.
  CharReaderPtr const reader(fact.newCharReader());
  return reader->parse(begin, end, root, errs);
}

we actually arrive to the following: outside the parseFromStream we wrap a string to a stringstream. Inside parseFromStream we are redirecting it to another stringstream, from which it gets copied to doc object. The latter is then used to get begin and end of the string buffer.

@fedormsv
Copy link
Author

If I compare

std::unique_ptr<Json::CharReader> reader(builder.newCharReader());

reader->parse(
        stringToParse.data(),
        stringToParse.data() + stringToParse.size(),
        &value,
        nullptr);

with
parseFromString(builder, stringToParse, &value, nullptr);

I will say length as well as one reference to stringToParse makes difference.

And it makes even more sense comparing to now deprecated now deprecated API where it was possible to have
Json::Reader().parse(stringToParse, value);

So it evolved to dynamic allocation instead of stack, manual memory management (with or without smart pointers) and function call that needs refering the same string 3 times.

Wrapping at least memory management and references to string seemed natural.
I would probably refactor newCharReader to return unique_ptr, but I understand it's potentially problematic for DLL build.

@fedormsv
Copy link
Author

BTW, my PR was based on the suggestion to submit it in #1209.
I'm ok with retracting

@cdunn2001
Copy link
Contributor

My only concern would be to avoid binary incompatible changes, and I don't see any of that. This should have a minor version bump though.

@BillyDonahue , you've looked at the code pretty closely. You can merge or close this.

@BillyDonahue
Copy link
Contributor

I'm gonna close it.

@cdunn2001
Copy link
Contributor

Sorry, @fedormsv. I thought it sounded like a good idea. I didn't mean to waste your time. We do appreciate the help.

@BillyDonahue
Copy link
Contributor

Same here.

I just think that taking a String, one of our typedefs, is not quite right.
And std::string is convertible from other types, which would make a string copy happen anyway, and we were trying to avoid that. I feel we should emphasize the primary parse mechanism being a contiguous range of const char* and not get into how that range is stored or allocated. In C++17 a std::string_view would be a good choice but we aren't there yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants