Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing can become accidentally quadratic because of sscanf #40

Closed
leoetlino opened this issue Feb 10, 2020 · 6 comments · Fixed by #94
Closed

Parsing can become accidentally quadratic because of sscanf #40

leoetlino opened this issue Feb 10, 2020 · 6 comments · Fixed by #94

Comments

@leoetlino
Copy link
Contributor

leoetlino commented Feb 10, 2020

While investigating a performance issue in a library I am working on, I noticed that parsing a particular document (warning: very large file) causes the program to spend a lot of time in memchr (roughly 70% of its time). Pausing execution in gdb gives the following stack trace: (this is an image to keep the colors)

image

to the point that parsing is only ~3x faster than doing it with PyYAML. The speedup I get for other files is >10x.

It looks like the problem is that reading floats is done using c4::from_chars -> c4::atof -> sscanf which ultimately calls _IO_str_init_static_internal in the glibc implementation. As far as I understand, this essentially causes strlen to be called on the string that rapidyaml passes to it, essentially causing strlen to be called O(n^2) times.

Since there are a lot of floats throughout the entire document, parsing appears to become accidentally quadratic.

Not sure how to fix this. Even using the standard atof might not help since that function doesn't take the length either...

@biojppm
Copy link
Owner

biojppm commented Feb 11, 2020

First of all, I'm happy that as a matter of course ryml is already expected to be faster than alternatives by at least 10x, to the point that a mere 3x speedup is considered an issue!

But of course, this is a nasty issue, so thanks for detecting it and reporting.

The stringifying landscape in C/C++ is bleak, even after 50 years of language life. Stringifying/destringifying floats is really hard, and it took until C++17 to have that with a non-zero-terminated bounded string.

Are you using C++17 or later? I've been meaning to use std::to_chars() and std::from_chars() in these atof()/ftoa() functions, so I will take the opportunity and do it now. Maybe this will solve your problem.

But anyway, ryml has to support everything down to and including C++11, and here we have no recourse in the standard. You've likely looked into these c4::atof() functions, and seen the song and dance they have to perform to guarantee boundedness. And now it looks as glibc's sscanf() is looking for a non-existing terminating zero character, so it fails (and is likely incurring a read-after-end bug).

I'm going to look for alternatives to snprintf()/sscanf() for float types; I was already thinking about doing this. The first one is hard to find (because it's really hard to do) and I won't even attempt to write it, but the second one is easier. Maybe I'll implement snscanf() and just keep the calls to snprintf().

Also, bear in mind that this issue is with the c4core repo. I'm tracking the issue there, and will close this after the c4core submodule is updated here.

@leoetlino
Copy link
Contributor Author

leoetlino commented Feb 12, 2020

Oh right, I forgot about the C++17 or later only std::from_chars/to_chars... Yeah it's unfortunate that there doesn't seem to be any standard function that can handle non-zero terminated strings until C++17. As a workaround I am currently getting the value from the NodeRef manually and since I am already using abseil I simply used SimpleAtod (which internally uses a port of the C++17 from_chars).

@biojppm
Copy link
Owner

biojppm commented Sep 25, 2020

Quick note to say I have not forgotten this and I mean to return to it when I can.

@biojppm
Copy link
Owner

biojppm commented Nov 22, 2020

@leoetlino I finally found something suitable, and also exceedingly faster than sscanf: fast_float. It should fix your problem.

@dataf3l
Copy link

dataf3l commented Oct 4, 2021

I'll ask the stupid question:
does it make sense to use a preprocessor directive to select which c++ standard to use, and then proceed to use the right function on each case?

@biojppm
Copy link
Owner

biojppm commented Oct 4, 2021

@dataf3l exactly something like that is in effect: see the code here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants