Support hash computation for protobuf messages. #2304
Thanks for your pull request. The automated tests will run as soon as one of the admins verifies this change is ok for us to run on our infrastructure.
Sorry for not getting back to you earlier. I see that you have added a generated function for each message, but we probably can't afford that because of code size concerns. Right now, some Google server binaries contain so much generated protobuf code that they either compile very slowly or are on the brink of not compiling at all. We are trying hard to reduce code size, so another generated function would not be acceptable. Can you switch to a pure reflection-based implementation instead? As for the public API, I would prefer an interface similar to the existing MessageDifferencer class, for example MessageHasher::HashCode(const Message& message). I can foresee that some users might want to skip fields or treat fields differently in the future, just as MessageDifferencer already supports. Could you update this pull request in that direction?
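To make the shape of the requested API concrete, here is a minimal, standard-library-only model of a reflection-style hasher. It is a sketch, not the protobuf implementation: `FakeMessage` and `FieldValue` are hypothetical stand-ins for what reflection would expose about a message's set fields, the non-static `HashCode` and the `IgnoreField` method are assumptions modeled loosely on the MessageDifferencer style mentioned above, and the combiner is the common Boost-style mix rather than anything protobuf prescribes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical stand-in for one set field as seen through reflection:
// the field number plus its current value. Real reflection is much richer.
struct FieldValue {
  int number;
  std::variant<std::int64_t, double, std::string> value;
};

// Hypothetical stand-in for a message: its set fields, ordered by number.
using FakeMessage = std::vector<FieldValue>;

// Boost-style hash combiner (an assumption; any reasonable mixer would do).
inline std::size_t HashCombine(std::size_t seed, std::size_t h) {
  return seed ^ (h + 0x9e3779b9u + (seed << 6) + (seed >> 2));
}

// Model of the proposed hasher: walks fields generically instead of relying
// on a per-message generated function.
class MessageHasher {
 public:
  // Hypothetical option: exclude a field number from the hash.
  void IgnoreField(int number) { ignored_.push_back(number); }

  std::size_t HashCode(const FakeMessage& message) const {
    std::size_t seed = 0;
    for (const FieldValue& field : message) {
      if (std::find(ignored_.begin(), ignored_.end(), field.number) !=
          ignored_.end()) {
        continue;  // Skipped fields do not influence the result.
      }
      seed = HashCombine(seed, std::hash<int>{}(field.number));
      // Dispatch on the runtime value type, as a reflection walk would.
      seed = HashCombine(seed, std::visit(
          [](const auto& v) -> std::size_t {
            return std::hash<std::decay_t<decltype(v)>>{}(v);
          },
          field.value));
    }
    return seed;
  }

 private:
  std::vector<int> ignored_;
};
```

The point of the model is that a single generic walk serves every message type, which is what keeps generated code size flat; the cost is the per-field runtime dispatch that the following comment objects to.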
/cc @xfxyjwf I would still consider rethinking this kind of architecture (similar to MessageDifferencer). I started to use hashes because I wanted to use messages as keys in an unordered_map. Computing hashes via reflection for all messages raises performance concerns. As for MessageDifferencer, I can understand that reflection is needed because there might be unknown fields, but still, if the developer does not need to compare them, explicit specialization of If you are concerned about generated code size, it's possible to add an option controlling whether or not to generate this kind of specialization. And for all integrated protobuf message types add I've already provided Unfortunately, it looks to me that if I implement MessageHasher as you propose, I probably won't use it in my own protobuf-based project, because of performance concerns. In my initial proposal I just wanted protobuf-generated message classes to be better integrated with the stdlib.
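For readers unfamiliar with the use case described above, the following sketch shows what "more integrated with stdlib" means in practice: once `std::hash` is specialized for a type, that type can be used directly as an `unordered_map` key. `Point` is a hypothetical stand-in for a generated message class, not real protoc output, and the hash mixing is an illustrative choice. Note that a real protobuf message would also need an equality predicate supplied to the map, since generated C++ messages do not define `operator==`.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a generated message class with two int fields.
struct Point {
  int x = 0;
  int y = 0;
  // unordered_map needs equality as well as a hash.
  bool operator==(const Point& other) const {
    return x == other.x && y == other.y;
  }
};

namespace std {
// Explicit specialization: the kind of per-type code a generator (or a
// hand-written header) would have to provide for each message type.
template <>
struct hash<Point> {
  size_t operator()(const Point& p) const noexcept {
    size_t h = hash<int>{}(p.x);
    // Boost-style mix of the second field into the first (an assumption).
    return h ^ (hash<int>{}(p.y) + 0x9e3779b9u + (h << 6) + (h >> 2));
  }
};
}  // namespace std

// With the specialization in place, no custom hasher argument is needed.
using PointLabels = std::unordered_map<Point, std::string>;
```

A map declared as `PointLabels labels;` then supports `labels[Point{1, 2}] = "a";` with no further boilerplate, which is the convenience the per-message specialization was meant to buy.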
@slavanap I agree that the performance of a reflection-based implementation will be a concern, though we just can't afford the code size increase. Adding an option is also unlikely to be accepted, as we generally don't allow new options unless necessary. The good news is that we are currently experimenting with a lightweight reflection implementation that is designed to replace most of the generated methods, and it is much faster than the existing reflection API. When that's ready, we should be able to implement the hash function on top of it and get both benefits (i.e., a fast hash without generated code). Until then, we can only accept a reflection-based hash implementation in the protobuf library. You can still implement a generated hash function using a protocol compiler plugin though, i.e., instead of making it a part of the protobuf library, use a standalone compiler plugin that generates the hash<> functions separately (or injects them into the protoc-generated files).
Okay. I've cut the generator code and other unrelated code from my pull request. Still, there are a few things I want you to know about.
I welcome any changes to this pull request.
Will the new reflection interface be similar to the one we already have here? Note that I squashed both of my commits into a new one. It can still be merged with current
* Reverting back compiler and other unrelated changes.
* Support hash computation for protobuf messages.
Closing old issues/PRs. Feel free to reopen.
This pull request relates to the following issue: #2066
It adds hash computation support for the ::google::protobuf::Message class and for all generated message classes in C++ code.
This pull request is a draft because I need advice or help about how to
And of course I need a code review.
@xfxyjwf
We can move discussion from issue ticket here, if you want.
And sorry for the delay. Life is hard sometimes.