Added C++ benchmark. #1525

Merged
merged 8 commits into protocolbuffers:master Sep 23, 2016

4 participants
@haberman
Contributor

haberman commented May 12, 2016

Here are initial benchmark results:

Run on (12 X 3201 MHz CPU s)
2016-05-11 17:49:45
Benchmark                                     Time           CPU Iterations
---------------------------------------------------------------------------
google_message1_proto2_parse_noarena        274 ns        274 ns    2773101   792.218MB/s
google_message1_proto2_parse_arena          996 ns        993 ns     707578   218.903MB/s
google_message1_proto2_serialize            155 ns        156 ns    4489021   1.36459GB/s
google_message1_proto3_parse_noarena        520 ns        519 ns    1268185   419.151MB/s
google_message1_proto3_parse_arena         1204 ns       1205 ns     604370   180.504MB/s
google_message1_proto3_serialize            293 ns        292 ns    2403943   722.365MB/s
google_message2_parse_noarena            125942 ns     126397 ns       5557   638.088MB/s
google_message2_parse_arena              284564 ns     285310 ns       2464   282.683MB/s
google_message2_serialize                 94871 ns      94737 ns      10123   851.324MB/s

@googlebot googlebot added the cla: yes label May 12, 2016

@haberman
Contributor

haberman commented May 12, 2016

Review to @xfxyjwf.

@haberman
Contributor

haberman commented May 12, 2016

retest this please

benchmarks/cpp_benchmark.cc
while (state.KeepRunning()) {
  const std::string& payload = payloads_[i.Next()];
  total += payload.size();
  m->ParseFromString(payload);


@xfxyjwf

xfxyjwf May 18, 2016

Contributor

In the ArenaParseFixture, a new message is created on every parsing iteration; in this NoArenaParseFixture, however, the same message is reused. A fairer comparison would probably recreate the message in this parsing loop as well.

If we want to benchmark the case where a message is reused, I guess we can change the ArenaParseFixture to something like:

Arena arena;
Message* m = Arena::CreateMessage<T>(&arena);
while (state.KeepRunning()) {
  const std::string& payload = payloads_[i.Next()];
  total += payload.size();
  m->ParseFromString(payload);
  if (counter++ % kArenaThreshold == 0) {
    arena.Reset();  // frees everything allocated on the arena
    m = Arena::CreateMessage<T>(&arena);
  }
}

@haberman

haberman Sep 22, 2016

Contributor

To make this more fair, I split the "NoArena" case into two: one that creates a message from scratch (parse_new) and one that reuses an existing message (parse_reuse).

I'm not sure it makes sense to allocate multiple top-level messages in a single arena, but reset it periodically. Does anybody use arenas this way?

If you can point me to some real-world uses of arena that work this way, I'll update the benchmark (or maybe add a new one for that pattern).


benchmarks/cpp_benchmark.cc
std::vector<Message*> messages;
for (size_t i = 0; i < payloads_.size(); i++) {
  messages.push_back(prototype_->New());
  messages.back()->ParseFromString(payloads_[i]);


@xfxyjwf

xfxyjwf May 18, 2016

Contributor

Move this out of the benchmark method?


@haberman

haberman Sep 22, 2016

Contributor

Done.


benchmarks/cpp_benchmark.cc
while (state.KeepRunning()) {
  str.clear();
  messages[i.Next()]->SerializeToString(&str);


@xfxyjwf

xfxyjwf May 18, 2016

Contributor

How about we just allocate a large enough char array beforehand? (to exclude the cost of allocating strings from the benchmark).


@haberman

haberman Sep 22, 2016

Contributor

Allocation should only happen the first time through the loop; str.clear() won't release the memory.


@haberman
Contributor

haberman commented Sep 23, 2016

Ping @xfxyjwf, cc @gerben-s.

@haberman
Contributor

haberman commented Sep 23, 2016

Results on my desktop:

Run on (12 X 3201 MHz CPU s)
2016-09-23 10:49:28
Benchmark                                      Time           CPU Iterations
----------------------------------------------------------------------------
google_message1_proto2_parse_new             602 ns        604 ns    1150294    359.99MB/s
google_message1_proto2_parse_reuse           255 ns        254 ns    2665418   855.155MB/s
google_message1_proto2_parse_newarena        926 ns        926 ns     769950   234.796MB/s
google_message1_proto2_serialize             165 ns        165 ns    4214354   1.28456GB/s
google_message1_proto3_parse_new             828 ns        825 ns     855202   263.405MB/s
google_message1_proto3_parse_reuse           471 ns        470 ns    1476628   462.167MB/s
google_message1_proto3_parse_newarena       1046 ns       1049 ns     659339   207.186MB/s
google_message1_proto3_serialize             231 ns        232 ns    2993231   909.373MB/s
google_message2_parse_new                 318212 ns     317106 ns       2223   254.338MB/s
google_message2_parse_reuse               113398 ns     113764 ns       6129   708.942MB/s
google_message2_parse_newarena            252076 ns     252894 ns       2802   318.918MB/s
google_message2_serialize                  65855 ns      65689 ns      10722   1.19901GB/s
@xfxyjwf
Contributor

xfxyjwf commented Sep 23, 2016

LGTM

Please squash the commits before merging.

WrappingCounter(size_t limit) : value_(0), limit_(limit) {}
size_t Next() {
  size_t ret = value_;


@gerben-s

gerben-s Sep 23, 2016

Contributor

(value + 1) % limit


@haberman

haberman Sep 23, 2016

Contributor

I think what I currently have is much faster. "limit" isn't a compile-time constant, so % will turn into a real idiv instruction, which is very slow. Mine is a single extremely predictable branch.


@gerben-s

gerben-s Sep 23, 2016

Contributor

I consider this the wrong abstraction of the above one-liner.

If you want to do this wrapping as an abstraction, then just abstract the whole payload.

const string& NextPayload() { ...}


@haberman

haberman Sep 23, 2016

Contributor

I disagree. I think what I have is simpler. It doesn't need to know anything about the type or storage or lifetime of the things being iterated over. It is just a simple wrapping counter.


@gerben-s

LGTM overall, minor comment.

@haberman haberman merged commit a289d43 into protocolbuffers:master Sep 23, 2016

2 of 4 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
default Build finished.
cla/google All necessary CLAs are signed
continuous-integration/travis-ci/pr The Travis CI build passed