Fixing charconv static assert for Microsoft STL C++20. #16
Hey @can1357 - thanks for addressing this! Actually, I'd like to avoid the So if you do that change, I'd be happy to merge this in.
Thanks!
std::size being constexpr, it shouldn't really be relevant but stl iterators are indeed slow in general so I understand avoiding anything that has to do with it completely. I've pushed the style change, should be g2g. P.S. A little off topic but just so you have another reference, I'm dealing with some large YAML files so I've switched from
You are right that It's the As for the speedup, that's incredible. On my benchmarks the best I've seen was ~100x speedup, but you got twice that! If possible, could you share your file(s) with me? I'd be interested in profiling the parse sometime in the future.
I'm really intrigued by the 200x speedup. Is it apples to apples? E.g., maybe you are comparing optimized (ryml) vs non-optimized (yaml-cpp)?
Of course both are built with O3. The only difference from your tests would be that I process the data, I suppose (though even the initial parsing time shows a similar gap), and that I do random access into large arrays through indexes that other fields store as references. I'm processing llvm-pdbtool's yaml output on some PDBs if you want the samples. It indeed is pretty incredible, because previously I was unable to test any change in release without waiting minutes, debug being totally out of the question haha. This was ofc without any changes to my code; I simply implemented NodeRef::as mapped to >> to match the API and replaced the types.
So let me get this right:
If so, it is almost too good to be true, and I'd like to investigate. Sorry for insisting, but could you provide the largest file you have? A link is fine, even a temporary one. I'd like to profile, and of course I can generate one myself, but it will be different from yours and then the numbers will be different, whereas I'd prefer to be able to reproduce the numbers you're seeing.
Here's the sample.
I just ran the rapidyaml parse benchmark (parse only). The results are more in line with what I usually see: On vs2019:
So a ~10x-16x speedup from yaml-cpp and ~1.5x-3x speedup from libyaml. Of course if the tree traversal has significant logic including lookup for remote nodes, then the figures will improve a lot for rapidyaml, as the data is contiguous so the number of cache misses will be significantly lower than yaml-cpp. 200x could be possible to see in such situations, but can you confirm the figures you reported above? I want to be able to brag :-)
If you've noticed the data structure, most of it is just indexes back into the root, so there's definitely a LOT of random lookups, and I can assure you the speedup is nowhere near as low as 10x or 16x when doing real work with it. It's not an exaggeration by any means either: it got me angry enough to switch libraries mid-project, since I was unable to debug even in release(!) builds without sitting in front of the computer waiting for the file to load, whereas ever since I've been dealing with even more files without any issues. I did a few speed measurements before switching completely to ryml and those were the figures I got; the other library is long gone from the project by now, so I'd have to integrate it again to get a new measurement. I can probably try isolating a benchmark from the project I'm working on sometime, but I have a bit too much to do right now so it would take a while. With any kind of basic processing of the data I'm fairly certain you'd get similar results though.
Thanks for replying.
Please, no point going through that just to put some figures down. I just wanted to confirm that the numbers you gave were right. Usually I see between 10x and 70x speedup vs yaml-cpp, which if you think about it is already amazing: just imagine that instead of buying one house you can now buy 70 houses! The best I had seen before was 100x. But 200x is almost unbelievable; we tend to be skeptical of such results. While our ego wants to believe them, our reasoning starts trying to poke holes in them. Again, thanks for being patient with my prompting. Now the ego has free roam!
Hey!
I've just found your library for parsing YAML files after getting tired of waiting 10 seconds to parse a 20 MB YAML file with some other library, and I've been trying to build it with Clang in C++20 mode against the Microsoft STL, but ran into this issue where enum class chars_format is not strictly char, so it ends up breaking this static assert (since sizeof(...) ends up 16). After the fix it should build for C++20 as well; C++11 was working fine either way, it seems.
C4_STATIC_ASSERT(sizeof(fmt) == _FTOA_COUNT);
P.S. I had to include the <iterator> header for std::size. I'm not sure whether the changes fit the coding style of this repo since I've just cloned it, so my apologies if they don't.
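For readers who want to see the failure mode in isolation, here is a minimal sketch of why comparing sizeof of a table against an element count only holds for one-byte elements. All names below (kFormatCount, WideFormat, narrow_table, wide_table, array_count) are hypothetical stand-ins, not the rapidyaml or Microsoft STL definitions, and the count of 4 is an assumption inferred from the reported size of 16. The array_count helper is one possible way to count elements without including <iterator>; it is not necessarily what the merged change uses.

```cpp
#include <cstddef>

// Hypothetical stand-in for the expected number of table entries.
enum : std::size_t { kFormatCount = 4 };

// A scoped enum with no explicit underlying type uses int,
// so each element typically occupies 4 bytes instead of 1.
enum class WideFormat { scientific, fixed, hex, general };

// Element count of a built-in array, computed from its type alone.
template <class T, std::size_t N>
constexpr std::size_t array_count(const T (&)[N]) noexcept
{
    return N;
}

// Table of one-byte elements: byte size and element count coincide.
constexpr char narrow_table[] = {'e', 'f', 'a', 'g'};

// Table of enum-class elements: byte size is count * sizeof(element).
constexpr WideFormat wide_table[] = {
    WideFormat::scientific, WideFormat::fixed,
    WideFormat::hex, WideFormat::general,
};

// Holds only because sizeof(char) == 1.
static_assert(sizeof(narrow_table) == kFormatCount,
              "byte size equals count for char elements");

// The same comparison would fail here: the byte size is scaled
// by the element size (16 bytes where int is 4 bytes).
static_assert(sizeof(wide_table) == kFormatCount * sizeof(WideFormat),
              "byte size is count times element size");

// Counting elements is independent of the element type.
static_assert(array_count(wide_table) == kFormatCount,
              "element count matches regardless of element size");
static_assert(array_count(narrow_table) == kFormatCount,
              "element count matches for char elements too");
```

Counting elements rather than bytes keeps the check valid whichever representation the standard library picks for the format type.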