Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StringToDoubleConverter::StringToDouble slower than dtoa #153

Open
jandem opened this issue Feb 12, 2021 · 3 comments
Open

StringToDoubleConverter::StringToDouble slower than dtoa #153

jandem opened this issue Feb 12, 2021 · 3 comments

Comments

@jandem
Copy link
Contributor

jandem commented Feb 12, 2021

I'm trying to switch the string-to-double code in SpiderMonkey from dtoa.c to StringToDoubleConverter. This code is used by the JS parser, JSON parser and parseFloat, so is somewhat perf sensitive. The code I'm using (slightly simplified to show only the 1-byte-chars code):

  using SToDConverter = double_conversion::StringToDoubleConverter;
  SToDConverter converter(SToDConverter::ALLOW_TRAILING_JUNK, emptyStringValue,
                          /* junk_string_value = */ JS::GenericNaN(),
                          "Infinity", "NaN");
  int processed = 0;
  double d = converter.StringToDouble(reinterpret_cast<const char*>(s), length, &processed);

This works and passes all tests, but is quite a bit slower than dtoa.c when the string contains more than a few digits. Some micro-benchmark results:

code (10 million times) dtoa double-conversion
parseFloat("1.1") 318 ms 402 ms
parseFloat("3.14") 335 ms 423 ms
parseFloat("12345.678") 385 ms 513 ms
parseFloat((i % 1024) + ".5") 617 ms 725 ms

A big part of this is that StringToIeee supports a number of flags and this adds overhead: if I add a const int flags_ = ALLOW_TRAILING_JUNK to the start of that function, as a hack to let the C++ compiler eliminate branches, I get better times (346, 370, 453, 665 ms for the ones above). Still not quite as fast as dtoa, but it closes half of the gap. The JSC folks ran into this too, they changed the code to eliminate those branches (and used a static method instead of the converter instance).

Is there anything that could be done to provide a faster implementation for this use case?

I also don't know yet where the rest of the slowdown is from. I know dtoa.c is horrible code so maybe this was an intentional trade-off to reduce complexity?

@floitsch
Copy link
Collaborator

The string-to-double part of the library isn't really optimized. There are many ways to make it faster. Mostly, probably, by optimizing the code (as you already noticed), but also by finding simpler shortcuts.

I don't have the time to optimize the code myself, but I would review patches.

For optimal performance we might want to use C++'s template capabilities to remove the unnecessary branches. That would make the library harder to use for C projects though...

@jandem
Copy link
Contributor Author

jandem commented Feb 14, 2021

Thanks. I have patches that close most of the gap on this, I'll create some PRs the coming days.

@jandem
Copy link
Contributor Author

jandem commented Feb 20, 2021

#154 improved this a bit. I have one other patch that allows eliminating most flag checks, I'll polish that and post it the coming week. Hopefully with that I can switch over SpiderMonkey's string-to-double code.

(I realized an issue SpiderMonkey's dtoa-based code currently has is that it copies strings to a temporary char Vector, but websites sometimes pass fairly long non-numeric garbage strings to parseFloat. Moving to double-conversion is the simplest way to fix that perf cliff so is appealing...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants