mrpi/pbview
pbview

Read protobuf messages without deserialization

If you only need to read a few fields of large Google Protocol Buffers messages, this library lets you read them directly from the serialized bytes, which can be much faster than parsing the whole message with the original google/protobuf parser.

The pbview compiler (pbviewc) generates classes that have the same (read-only) interface as the C++ classes generated by the protobuf compiler (protoc).

Usage

Generate the classes the same way you generate regular protobuf C++ messages:

$ pbviewc --cpp_out=out_dir --proto_path=in_dir mymessage.proto

Access the fields of a binary message directly, in place:

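#include <mymessage.pbview.h>

// Build and serialize a message with the regular protobuf API.
MyMessage msg;
msg.set_id(42);
auto binStr = msg.SerializeAsString();

// Wrap the serialized bytes in a view; fields are read in place.
auto view = pbview::View<MyMessage>::fromBytesString(binStr);
REQUIRE(view.id() == 42);  // REQUIRE: assertion macro of the test framework (e.g. Catch2)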

Requirements

  • C++17 compiler
  • google/protobuf
  • range-v3

Features

  • The parser seeks quickly to the requested fields. Large strings and even sub-messages are skipped in a single step
  • Working with serialized messages consumes significantly less memory than holding deserialized messages in memory
  • No memory allocations: string fields are returned as std::string_view pointing directly into the serialized message, instead of std::string (so the serialized buffer must outlive the views)
  • Variant types that contain either a binary view or a google::protobuf::Message
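To illustrate the first and third points, here is a minimal sketch of how a reader can seek to one field in the protobuf wire format. It is not pbview's actual implementation; the helper names `readVarint` and `findBytesField` are made up for this example. Groups (wire types 3 and 4) are not handled, and the first occurrence of the field is returned.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Decodes a base-128 varint starting at `pos` and advances `pos` past it.
// Returns std::nullopt on truncated or over-long input.
inline std::optional<std::uint64_t> readVarint(std::string_view buf, std::size_t& pos) {
    std::uint64_t value = 0;
    for (int shift = 0; shift < 64 && pos < buf.size(); shift += 7) {
        const auto byte = static_cast<std::uint8_t>(buf[pos++]);
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return value;
    }
    return std::nullopt;
}

// Hypothetical helper (not part of pbview): returns the payload of the first
// length-delimited field with number `wanted`, without copying any bytes.
inline std::optional<std::string_view> findBytesField(std::string_view buf,
                                                      std::uint32_t wanted) {
    std::size_t pos = 0;
    while (pos < buf.size()) {
        const auto tag = readVarint(buf, pos);
        if (!tag) return std::nullopt;
        const std::uint64_t fieldNumber = *tag >> 3;
        switch (*tag & 0x7) {
        case 0:  // varint: has to be decoded to find where it ends
            if (!readVarint(buf, pos)) return std::nullopt;
            break;
        case 1:  // fixed 64-bit
            if (buf.size() - pos < 8) return std::nullopt;
            pos += 8;
            break;
        case 2: {  // length-delimited: string, bytes, sub-message, packed repeated
            const auto len = readVarint(buf, pos);
            if (!len || *len > buf.size() - pos) return std::nullopt;
            const auto n = static_cast<std::size_t>(*len);
            if (fieldNumber == wanted) return buf.substr(pos, n);  // view into buf
            pos += n;  // skip the whole payload in one step
            break;
        }
        case 5:  // fixed 32-bit
            if (buf.size() - pos < 4) return std::nullopt;
            pos += 4;
            break;
        default:  // groups and unknown wire types are out of scope for this sketch
            return std::nullopt;
        }
    }
    return std::nullopt;
}
```

The returned `std::string_view` points into `buf`, so the serialized buffer has to stay alive for as long as the result is used. The cost of skipping a string or sub-message does not depend on its size, because only the length prefix is decoded.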

Drawbacks

Please note that every field access requires a (partial) parse of the containing message.

This means that there is always a break-even point beyond which deserializing the message once is faster than reading it repeatedly through a view. (Hybrid approaches are possible: with this library it is easy to deserialize only individual sub-messages of a larger structure.)

Benchmark your exact use cases and then make a well-founded decision!
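As a rough way to reason about the break-even point, the sketch below uses a deliberately simple cost model: the view costs `viewAccess` per read, while the deserialized object costs `deserialize` once plus `parsedAccess` per read. The struct, the function name, and the model itself are assumptions for illustration; real view costs also depend on where the field sits in the message, so the inputs should come from your own measurements.

```cpp
#include <cmath>
#include <optional>

// All costs are in the same unit (for example nanoseconds) and are expected
// to come from your own benchmark. Names are hypothetical, not pbview API.
struct AccessCosts {
    double deserialize;   // one-time cost of fully parsing the message
    double parsedAccess;  // cost of one field read on the parsed object
    double viewAccess;    // cost of one field read through the view
};

// Smallest number of field reads n for which
//   deserialize + n * parsedAccess <= n * viewAccess.
// Returns std::nullopt if a view read is never more expensive than a parsed
// read, in which case the view wins for any number of reads under this model.
inline std::optional<long long> breakEvenReads(const AccessCosts& c) {
    const double extraPerRead = c.viewAccess - c.parsedAccess;
    if (extraPerRead <= 0.0) return std::nullopt;
    return static_cast<long long>(std::ceil(c.deserialize / extraPerRead));
}
```

For example, with a deserialization cost of 10000, a parsed read cost of 2, and a view read cost of 52, the function returns 200: below 200 reads per message the view is cheaper under this model, above it full deserialization is.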

TODO

  • Compatibility with proto3 syntax
  • Reflection+Descriptor interface
  • libfuzzer + asan tests
  • Evaluate caching strategies
    • value or offset of already accessed fields
    • id of first field in each cache-line (for binary search)
    • offset of each field (as done by Cap’n Proto or FlatBuffers)
  • Support ZeroCopyInputStreams instead of only flat memory data (for compressed data)
  • Support non-canonically serialized messages
    • fields not ordered by field number (untested)
    • repeated fields marked as packed but serialized without packing
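For context on the last item: in the protobuf wire format, a repeated scalar field may arrive packed (one length-delimited record holding all values) or unpacked (one tagged record per value), and a tolerant reader accepts both. The sketch below shows one way to do this for a repeated varint field. It is an illustration only, not pbview code; `decodeVarint` and `readRepeatedVarints` are hypothetical names, and unlike pbview's views it collects values into a `std::vector`, which allocates.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Decodes a base-128 varint starting at `pos` and advances `pos` past it.
inline std::optional<std::uint64_t> decodeVarint(std::string_view buf, std::size_t& pos) {
    std::uint64_t value = 0;
    for (int shift = 0; shift < 64 && pos < buf.size(); shift += 7) {
        const auto byte = static_cast<std::uint8_t>(buf[pos++]);
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return value;
    }
    return std::nullopt;  // truncated or over-long varint
}

// Hypothetical helper: collects every value of the repeated varint field
// `wanted`, accepting packed and unpacked records in any mix.
inline std::optional<std::vector<std::uint64_t>> readRepeatedVarints(
        std::string_view buf, std::uint32_t wanted) {
    std::vector<std::uint64_t> out;
    std::size_t pos = 0;
    while (pos < buf.size()) {
        const auto tag = decodeVarint(buf, pos);
        if (!tag) return std::nullopt;
        const bool match = (*tag >> 3) == wanted;
        switch (*tag & 0x7) {
        case 0: {  // unpacked: one value per tagged record
            const auto v = decodeVarint(buf, pos);
            if (!v) return std::nullopt;
            if (match) out.push_back(*v);
            break;
        }
        case 2: {  // packed values, or some other length-delimited field
            const auto len = decodeVarint(buf, pos);
            if (!len || *len > buf.size() - pos) return std::nullopt;
            const std::size_t end = pos + static_cast<std::size_t>(*len);
            if (match) {
                const std::string_view payload = buf.substr(0, end);
                while (pos < end) {  // values are concatenated varints
                    const auto v = decodeVarint(payload, pos);
                    if (!v) return std::nullopt;
                    out.push_back(*v);
                }
            }
            pos = end;  // non-matching payloads are skipped in one step
            break;
        }
        case 1:  // fixed 64-bit
            if (buf.size() - pos < 8) return std::nullopt;
            pos += 8;
            break;
        case 5:  // fixed 32-bit
            if (buf.size() - pos < 4) return std::nullopt;
            pos += 4;
            break;
        default:  // groups and unknown wire types are not handled
            return std::nullopt;
        }
    }
    return out;
}
```

Values are returned in the order they appear in the buffer, regardless of which encoding each record used.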

License

Boost Software License 1.0
