
Fixing charconv static assert for Microsoft STL C++20. #16

Merged
merged 4 commits into biojppm:master
Sep 29, 2020
Conversation

can1357
Contributor

@can1357 can1357 commented Sep 28, 2020

Hey!

I just found your library for parsing YAML files after getting tired of waiting 10 seconds to parse a 20 MB YAML file with another library. I've been trying to build it with Clang in C++20 mode against the Microsoft STL, but ran into an issue: enum class chars_format is not strictly char-sized, so it breaks this static assert (sizeof(...) ends up 16). After the fix it builds for C++20 as well; C++11 was working fine either way, it seems.
C4_STATIC_ASSERT(sizeof(fmt) == _FTOA_COUNT);

P.S. I had to include the <iterator> header for std::size. I'm not sure whether the changes fit the coding style of this repo, since I've only just cloned it, so my apologies if they don't.

@coveralls

coveralls commented Sep 28, 2020

Pull Request Test Coverage Report for Build 380

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 95.664%

Totals (Coverage Status):
  • Change from base Build 376: 0.0%
  • Covered Lines: 1699
  • Relevant Lines: 1776

💛 - Coveralls

@biojppm
Owner

biojppm commented Sep 29, 2020

Hey @can1357 - thanks for addressing this!

Actually, I'd like to avoid the <iterator> include, as I've seen some reports that it is costly to compile. So instead of std::size, let's use C4_COUNTOF() as the alternative.

If you make that change, I'd be happy to merge this in.

@biojppm
Owner

biojppm commented Sep 29, 2020

Thanks!

@biojppm biojppm merged commit 2fe974d into biojppm:master Sep 29, 2020
@can1357
Contributor Author

can1357 commented Sep 29, 2020

Since std::size is constexpr, it shouldn't really matter at runtime, but STL iterators are indeed slow in general, so I understand avoiding anything related to them entirely. I've pushed the style change; should be good to go.

P.S. A little off topic, but just so you have another reference: I'm dealing with some large YAML files, so I switched from yaml-cpp to ryml, and throughput went from approximately 0.5 MB/s to 111 MB/s, a total life saver. Thank you for your work; it has been very helpful.

@biojppm
Owner

biojppm commented Sep 29, 2020

You are right that std::size is lightweight and indeed it would be the ideal solution here.

It's the <iterator> include itself that I'd prefer not to have here. For application-level code, I guess it could be fine, but not here.

As for the speedup, that's incredible. On my benchmarks the best I've seen was ~100x speedup, but you got twice that! If possible, could you share your file(s) with me? I'd be interested in profiling the parse sometime in the future.

@biojppm
Owner

biojppm commented Sep 29, 2020

I'm really intrigued by the 200x speedup. Is it apples to apples? Eg, maybe you are comparing optimized (ryml) vs non-optimized (yaml-cpp)?

@biojppm biojppm mentioned this pull request Sep 30, 2020
@can1357
Contributor Author

can1357 commented Oct 1, 2020

I'm really intrigued by the 200x speedup. Is it apples to apples? Eg, maybe you are comparing optimized (ryml) vs non-optimized (yaml-cpp)?

Of course, both are built at O3. The only difference from your tests would be that I also process the data, I suppose (though even the initial parsing time is similar), including random access into large arrays through indexes stored as references in other fields. I'm processing llvm-pdbutil's YAML output on some PDBs, if you want the samples. It is indeed pretty incredible, because previously I couldn't test any changes in release builds without waiting for minutes, with debug builds totally out of the question, haha.

This was of course without any changes to my code; I simply implemented a NodeRef::as mapped to operator>> to match the yaml-cpp API, and replaced the types.

@biojppm
Owner

biojppm commented Oct 1, 2020

So let me get this right:

  • the parsing speedup is in line with the overall speedup (~200x)?
  • the subsequent traversal and deserialization speedups are also in line with the overall speedup?

If so, it is almost too good to be true, and I'd like to investigate. Sorry for insisting, but could you provide the largest file you have? A link is fine, even a temporary one. I'd like to profile, and of course I can generate one myself, but it will be different from yours and then the numbers will be different, whereas I'd prefer to be able to reproduce the numbers you're seeing.

@can1357
Contributor Author

can1357 commented Oct 1, 2020

Here's the sample.

combase.pdb.zip

@biojppm
Owner

biojppm commented Oct 1, 2020

I just ran the rapidyaml parse benchmark (parse only). The results are more in line with what I usually see:

On vs2019:

file: D:/proj/rapidyaml/bm/cases/combase.pdb.yaml
Run on (12 X 3400 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x6)
  L1 Instruction 32 KiB (x6)
  L2 Unified 256 KiB (x6)
  L3 Unified 15360 KiB (x1)
---------------------------------------------------------------------------
Benchmark                 Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------
yamlcpp_ro       23089836500 ns   23093750000 ns            1 bytes_per_second=6.05868M/s
libyaml_ro       4727608400 ns   4718750000 ns            1 bytes_per_second=29.6514M/s
libyaml_ro_reuse 4725386900 ns   4718750000 ns            1 bytes_per_second=29.6514M/s
ryml_ro          2854529200 ns   2859375000 ns            1 bytes_per_second=48.9329M/s items_per_second=1.40594M/s
ryml_rw          2715757900 ns   2718750000 ns            1 bytes_per_second=51.464M/s items_per_second=1.47866M/s
ryml_ro_reuse    2479569100 ns   2484375000 ns            1 bytes_per_second=56.319M/s items_per_second=1.61816M/s
ryml_rw_reuse    1685076200 ns   1687500000 ns            1 bytes_per_second=82.9141M/s items_per_second=2.38228M/s

So that's a ~8x-14x speedup over yaml-cpp and a ~1.7x-2.8x speedup over libyaml.

Of course, if the tree traversal has significant logic, including lookups of remote nodes, then the figures will improve a lot for rapidyaml: the data is contiguous, so the number of cache misses will be significantly lower than with yaml-cpp. 200x could be possible in such situations. But can you confirm the figures you reported above? I want to be able to brag :-)

@can1357
Contributor Author

can1357 commented Oct 1, 2020

I just ran the rapidyaml parse benchmark (parse only). [...] 200x could be possible to see in such situations, but can you confirm the figures you reported above?

If you've looked at the data structure, most of it is just indexes back into the root, so there are definitely a LOT of random lookups, and I can assure you it's nowhere near the parse-only figures when doing real work with it. It's no exaggeration by any means, either: it made me angry enough to switch libraries mid-project, because I couldn't debug in release(!) builds, having to sit in front of the computer waiting for the file to load. Ever since, I've been dealing with even more files without any issues.

I did a few speed measurements before switching completely to ryml, and those were the figures I got. The other library is long gone from the project by now, so I'd have to integrate it again to get a new measurement. I can probably try isolating a benchmark from the project I'm working on at some point, but I have a bit too much to do right now, so it would take a while. With any kind of basic processing of the data, though, I'm somewhat certain you'd get similar results.

@biojppm
Owner

biojppm commented Oct 1, 2020

Thanks for replying.

I can probably try isolating a benchmark from the project I'm working on sometime

Please don't, there's no point going through that just to put some figures down. I just wanted to confirm that the numbers you gave were right. Usually I see between a 10x and 70x speedup vs yaml-cpp, which, if you think about it, is already amazing: just imagine that instead of buying one house, you could now buy 70 houses! The best I had seen before was 100x. But 200x is almost unbelievable; we tend to be skeptical of such results: while our ego wants to believe them, our reasoning starts trying to poke holes.

Again, thanks for being patient with my prompting. Now the ego has free rein!

3 participants