
Added C++ benchmark. #1525

Merged
merged 8 commits into protocolbuffers:master on Sep 23, 2016

Conversation

haberman (Member)

Here are initial benchmark results:

Run on (12 X 3201 MHz CPU s)
2016-05-11 17:49:45
Benchmark                                     Time           CPU Iterations
---------------------------------------------------------------------------
google_message1_proto2_parse_noarena        274 ns        274 ns    2773101   792.218MB/s
google_message1_proto2_parse_arena          996 ns        993 ns     707578   218.903MB/s
google_message1_proto2_serialize            155 ns        156 ns    4489021   1.36459GB/s
google_message1_proto3_parse_noarena        520 ns        519 ns    1268185   419.151MB/s
google_message1_proto3_parse_arena         1204 ns       1205 ns     604370   180.504MB/s
google_message1_proto3_serialize            293 ns        292 ns    2403943   722.365MB/s
google_message2_parse_noarena            125942 ns     126397 ns       5557   638.088MB/s
google_message2_parse_arena              284564 ns     285310 ns       2464   282.683MB/s
google_message2_serialize                 94871 ns      94737 ns      10123   851.324MB/s

@haberman (Member, Author)

Assigning review to @xfxyjwf.

@haberman (Member, Author)

retest this please

while (state.KeepRunning()) {
const std::string& payload = payloads_[i.Next()];
total += payload.size();
m->ParseFromString(payload);
Contributor

In the ArenaParseFixture, a new message is created on every iteration of the parsing loop; however, in this NoArenaParseFixture you are reusing the same message. A fairer comparison would probably recreate the message in this parsing loop as well.

If we want to benchmark the case where a message is reused, I guess we can change the ArenaParseFixture to something like:

Arena arena;
Message* m = Arena::CreateMessage<T>(&arena);
while (state.KeepRunning()) {
  const std::string& payload = payloads_[i.Next()];
  total += payload.size();
  m->ParseFromString(payload);
  if (counter++ % kArenaThreshold == 0) {
    arena.Reset();  // periodically discard all arena-owned messages
    m = Arena::CreateMessage<T>(&arena);  // allocate a fresh message in the now-empty arena
  }
}

Member Author

To make this more fair, I split the "NoArena" case into two: one that creates a message from scratch (parse_new) and one that reuses an existing message (parse_reuse).

I'm not sure it makes sense to allocate multiple top-level messages in a single arena and only reset it periodically. Does anybody use arenas this way?

If you can point me to some real-world uses of arena that work this way, I'll update the benchmark (or maybe add a new one for that pattern).
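
For reference, here is a minimal sketch of how the two non-arena variants could look, assuming a google/benchmark loop and a generated message type T; the payload handling and names are illustrative, not the exact code in this PR:

#include <string>
#include <vector>
#include <benchmark/benchmark.h>

// parse_new: construct a fresh message on every iteration.
template <class T>
void ParseNew(benchmark::State& state, const std::vector<std::string>& payloads) {
  size_t i = 0;
  while (state.KeepRunning()) {
    const std::string& payload = payloads[i];
    if (++i == payloads.size()) i = 0;
    T m;                         // new message each time
    m.ParseFromString(payload);
  }
}

// parse_reuse: keep one message alive and reparse into it.
template <class T>
void ParseReuse(benchmark::State& state, const std::vector<std::string>& payloads) {
  size_t i = 0;
  T m;                           // reused across iterations
  while (state.KeepRunning()) {
    const std::string& payload = payloads[i];
    if (++i == payloads.size()) i = 0;
    m.ParseFromString(payload);  // ParseFromString clears m before parsing
  }
}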

@haberman (Member, Author)

Ping @xfxyjwf , cc @gerben-s.

@haberman (Member, Author)

Results on my desktop:

Run on (12 X 3201 MHz CPU s)
2016-09-23 10:49:28
Benchmark                                      Time           CPU Iterations
----------------------------------------------------------------------------
google_message1_proto2_parse_new             602 ns        604 ns    1150294    359.99MB/s
google_message1_proto2_parse_reuse           255 ns        254 ns    2665418   855.155MB/s
google_message1_proto2_parse_newarena        926 ns        926 ns     769950   234.796MB/s
google_message1_proto2_serialize             165 ns        165 ns    4214354   1.28456GB/s
google_message1_proto3_parse_new             828 ns        825 ns     855202   263.405MB/s
google_message1_proto3_parse_reuse           471 ns        470 ns    1476628   462.167MB/s
google_message1_proto3_parse_newarena       1046 ns       1049 ns     659339   207.186MB/s
google_message1_proto3_serialize             231 ns        232 ns    2993231   909.373MB/s
google_message2_parse_new                 318212 ns     317106 ns       2223   254.338MB/s
google_message2_parse_reuse               113398 ns     113764 ns       6129   708.942MB/s
google_message2_parse_newarena            252076 ns     252894 ns       2802   318.918MB/s
google_message2_serialize                  65855 ns      65689 ns      10722   1.19901GB/s

@xfxyjwf (Contributor) commented Sep 23, 2016

LGTM

Please squash the commits before merging.

WrappingCounter(size_t limit) : value_(0), limit_(limit) {}

size_t Next() {
size_t ret = value_;
Contributor

(value + 1) % limit

Member Author

I think what I currently have is much faster. "limit" isn't a compile-time constant, so % will turn into a real idiv instruction, which is very slow. Mine is a single extremely predictable branch.
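
For concreteness, a minimal sketch of the counter being described, assuming the truncated snippet above continues as a plain wrap-around (the full body is not shown in this excerpt):

#include <cstddef>

class WrappingCounter {
 public:
  WrappingCounter(size_t limit) : value_(0), limit_(limit) {}

  // Returns the current index, then advances, wrapping with a single
  // well-predicted branch rather than "% limit_" (limit_ is not a
  // compile-time constant, so % would compile to a slow idiv).
  size_t Next() {
    size_t ret = value_;
    if (++value_ == limit_) {
      value_ = 0;
    }
    return ret;
  }

 private:
  size_t value_;
  size_t limit_;
};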

Contributor

I consider this the wrong abstraction for the above one-liner.

If you want to do this wrapping as an abstraction, then just abstract the whole payload:

const string& NextPayload() { ...}
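
A rough sketch of what that alternative might look like, with illustrative names (PayloadSource is not from this PR):

#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class PayloadSource {
 public:
  explicit PayloadSource(std::vector<std::string> payloads)
      : payloads_(std::move(payloads)) {}

  // Hands out payloads in order, wrapping around at the end.
  const std::string& NextPayload() {
    const std::string& ret = payloads_[index_];
    if (++index_ == payloads_.size()) {
      index_ = 0;
    }
    return ret;
  }

 private:
  std::vector<std::string> payloads_;
  size_t index_ = 0;
};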

Member Author

I disagree. I think what I have is simpler. It doesn't need to know anything about the type or storage or lifetime of the things being iterated over. It is just a simple wrapping counter.

@gerben-s (Contributor) left a comment

LGTM overall, minor comment.

@haberman haberman merged commit a289d43 into protocolbuffers:master Sep 23, 2016
@haberman haberman deleted the cppbenchmark branch August 15, 2019 12:32