Protocol Changes #8

llchan · 2019-11-01T17:32:29Z

This PR is a proposal for two protocol-level changes. Some stuff is not pushed yet, just getting this PR started.

1. Use Flatbuffers rather than Protobuf. Preliminary tests are looking promising (1.6x higher throughput in publishing, for example), and I think the added control over allocations and encoding/decoding logic will be beneficial in pushing performance. The tradeoff is that fewer people are familiar with flatbuffers, but the Go and C++ code generators at least have a protobuf-like object API that makes it pretty intuitive for protobuf folks, and honestly the lower-level message builder is pretty straightforward as well.
2. Add a NATS message envelope, which every Liftbridge message should use when communicating over NATS. This affords us internal consistency as well as safety if other messages are sent on those subjects.

llchan · 2019-11-01T18:16:48Z

A few comments and open questions:

I think it makes sense to make the Message as lean as possible, and additionally keep it immutable through its lifecycle. For publishing, we can include ack inbox/policy etc. in the publish request, and for subscriptions we include metadata in what I'm calling SubscriptionMessage.
I'm not sure reply and message headers need to be in Liftbridge. Those seem like things that should live inside the value (aka payload). The key makes sense, because it is used for log compaction, but the others seem application-level to me. I think we should keep Liftbridge as lean as possible unless it needs to know about these fields. Thoughts?

tylertreat

lgtm, only real question is around MsgType.

tylertreat · 2019-11-01T19:55:18Z

README.md

+| MsgType | Description |
+| ------- | ----------- |
+| 0       | Publish     |
+| 1       | Replication |


I question the value of specifying the message type. Do you have a use for this information?

I thought we were considering using the same envelope for the replication messages as well, in which case we need to know which message type to decode the payload with.

How would we be handling decoding differently between publishes vs. replication? In either case we should know if it's a publish or a replicated message based on the context in which the message is received, right?

Yes, I suppose maybe I'm just being overly cautious. If the user happens to send liftbridge-encoded publishes over the replication subject, it could be catastrophic. But maybe we can just choose a different magic number for the replication header and call it a day? I'm fine with that, and we can keep this byte reserved for better use cases down the road.

Different magic number makes sense to me, but I can be convinced otherwise.
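
To make the "different magic number" idea concrete, here is a minimal Go sketch. The magic values (`LIFT`, `LREP`) and the `stripEnvelope` helper are hypothetical and chosen only for illustration; the actual envelope layout and values are not specified in this thread.

```go
package main

import (
	"bytes"
	"errors"
)

// Hypothetical magic numbers, one per subject kind. The real values are not
// given in this thread; these are placeholders for illustration only.
var (
	publishMagic     = []byte("LIFT")
	replicationMagic = []byte("LREP")
)

var errWrongEnvelope = errors.New("envelope magic number mismatch")

// stripEnvelope returns the bytes that follow the expected magic number.
// It rejects data that is too short or that starts with a different magic
// number, so a publish sent on a replication subject is never decoded as a
// replication message.
func stripEnvelope(data, magic []byte) ([]byte, error) {
	if !bytes.HasPrefix(data, magic) {
		return nil, errWrongEnvelope
	}
	return data[len(magic):], nil
}
```

With this shape, the receiver picks the expected magic number from the context in which the message arrived (publish subject or replication subject), so no separate MsgType byte is needed to tell the two apart.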

tylertreat · 2019-11-01T20:03:48Z

README.md

+## NATS Message Envelope
+
+Every Liftbridge message that is sent over NATS should be sent with the following
+envelope header:


It's important that we still support "plain" NATS messages for cases where people just want transparent recording of NATS subjects without publisher code changes. The trade-off obviously is that you give up the features of the envelope (key, acking, etc.), but that's the expectation.

Yes, absolutely. I should be more clear in the wording here. Plain messages on subjects with streams attached will be treated as value-only messages with no acks etc., just like you have it now. This note only refers to the Liftbridge-generated messages sent over NATS.
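
A rough Go sketch of that fallback, under stated assumptions: `envelopeMagic`, the `record` type, and `classify` are hypothetical names, and a real server would go on to decode the enveloped body (key, ack settings, and so on) rather than just stripping the marker.

```go
package main

import "bytes"

// envelopeMagic is a hypothetical marker identifying Liftbridge-generated
// messages; the real value is not given in this thread.
var envelopeMagic = []byte("LIFT")

// record is a hypothetical, simplified view of what gets appended to a stream.
type record struct {
	Value     []byte // payload to store
	Enveloped bool   // true when the sender used the Liftbridge envelope
}

// classify treats data carrying the envelope marker as a Liftbridge message
// and everything else as a plain NATS message that is recorded value-only,
// with no key and no acks.
func classify(data []byte) record {
	if bytes.HasPrefix(data, envelopeMagic) {
		return record{Value: data[len(envelopeMagic):], Enveloped: true}
	}
	return record{Value: data}
}
```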

tylertreat · 2019-11-01T20:10:53Z

To answer your questions...

> I think it makes sense to make the Message as lean as possible, and additionally keep it immutable through its lifecycle. For publishing, we can include ack inbox/policy etc. in the publish request, and for subscriptions we include metadata in what I'm calling SubscriptionMessage.

I'm not sure we can do this. What of the case where you publish a Message directly to NATS, not through Liftbridge's Publish API?

> I'm not sure reply and message headers need to be in Liftbridge. Those seem like things that should live inside the value (aka payload). The key makes sense, because it is used for log compaction, but the others seem application-level to me. I think we should keep Liftbridge as lean as possible unless it needs to know about these fields. Thoughts?

Reply is a NATS concept which Liftbridge includes on the message in order to expose the reply on the original NATS message. This is mainly for cases where publishers are publishing plain, Liftbridge-agnostic NATS messages. We want to expose the info set on the original message.

Headers I'm torn on since this was a frequently requested feature in NATS Streaming and Kafka supports it.

llchan · 2019-11-01T20:34:11Z

The first note about refactoring the Message is mostly internal to the client/server encoding. The user would still interact with the Publish API or NewMessage function, it would just be packed differently in the flatbuffers, such that the Message is self-contained and contains no pointers to anything outside of it (i.e. it is memcpy-able opaquely). Perhaps this will be more clear once I push up the next batch.
Gotcha, did not realize NATS has a built-in reply concept. That makes sense to keep as a first-class field.
If headers are very commonly-requested we can keep them first-class as well. Now that I think about it, if we ever want to do application-agnostic header-based filtering/routing we would need that so we should leave that possibility open.

tylertreat · 2019-11-01T21:33:53Z

> The first note about refactoring the Message is mostly internal to the client/server encoding. The user would still interact with the Publish API or NewMessage function, it would just be packed differently in the flatbuffers, such that the Message is self-contained and contains no pointers to anything outside of it (i.e. it is memcpy-able opaquely). Perhaps this will be more clear once I push up the next batch.

I think I follow now.

> Now that I think about it, if we ever want to do application-agnostic header-based filtering/routing we would need that so we should leave that possibility open.

Yeah, that is what I was thinking as well. Headers give us a lot of flexibility to do interesting things in the future. Routing/filtering is a good example.
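
As a sketch of why first-class headers enable application-agnostic filtering, the Go snippet below selects messages by a header value without ever looking inside the payload. The `message` type and both functions are hypothetical and not part of any Liftbridge API.

```go
package main

// message is a hypothetical, simplified message with first-class headers.
type message struct {
	Headers map[string][]byte
	Value   []byte // opaque application payload, never inspected here
}

// matchHeader reports whether msg carries the header key with exactly the
// wanted value.
func matchHeader(msg message, key, want string) bool {
	got, ok := msg.Headers[key]
	return ok && string(got) == want
}

// filterByHeader returns the messages whose header key equals want. The
// payload stays opaque, which is what makes the filter application-agnostic.
func filterByHeader(msgs []message, key, want string) []message {
	var out []message
	for _, m := range msgs {
		if matchHeader(m, key, want) {
			out = append(out, m)
		}
	}
	return out
}
```

If the same information lived only inside the value, the server would need to understand each application's payload format to do this.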

llchan · 2019-11-01T23:31:47Z

I'm also starting to look at the on-disk encoding, which I think can be unified with the wire protocol.
One thing I didn't quite understand: what are Attributes? It doesn't currently exist in the wire protocol, just want to make sure I'm not missing something.

tylertreat · 2019-11-01T23:42:58Z

> One thing I didn't quite understand: what are Attributes? It doesn't currently exist in the wire protocol, just want to make sure I'm not missing something.

The thought was to use attributes for flags such as compression, e.g. a compression codec. The on-disk protocol could definitely use some cleaning up though.

llchan · 2019-11-01T23:58:22Z

Ah got it, good idea. We can probably wrap the messages in an on-disk flatbuffers message, so that the only things we need to manually encode are the size and CRC-32. I think that will be more future-proof in the event that we want to add more metadata to the message (e.g. compression algorithm). Might also be good to add a header to the segment for segment-level configuration; arguably compression would be able to do more with large batches of messages than within a single message. Anyway, I'm digressing; we should keep this PR focused on the client-facing API.
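
For illustration, a minimal Go sketch of framing an opaque, already-serialized message with only a size and a CRC-32. The field order, big-endian byte order, IEEE polynomial, and the names `encodeFrame`/`decodeFrame` are all assumptions, not the actual Liftbridge on-disk format.

```go
package main

import (
	"encoding/binary"
	"errors"
	"hash/crc32"
)

// frameHeaderLen is the assumed header size: 4-byte size + 4-byte CRC-32.
const frameHeaderLen = 8

var errCorrupt = errors.New("frame is truncated or fails the CRC check")

// encodeFrame prefixes an opaque payload with its length and its CRC-32
// (IEEE). The payload itself is copied as-is and never parsed.
func encodeFrame(payload []byte) []byte {
	buf := make([]byte, frameHeaderLen+len(payload))
	binary.BigEndian.PutUint32(buf[0:4], uint32(len(payload)))
	binary.BigEndian.PutUint32(buf[4:8], crc32.ChecksumIEEE(payload))
	copy(buf[frameHeaderLen:], payload)
	return buf
}

// decodeFrame checks the length and checksum and returns the payload bytes
// without unpacking them.
func decodeFrame(buf []byte) ([]byte, error) {
	if len(buf) < frameHeaderLen {
		return nil, errCorrupt
	}
	size := binary.BigEndian.Uint32(buf[0:4])
	sum := binary.BigEndian.Uint32(buf[4:8])
	if uint64(len(buf)-frameHeaderLen) < uint64(size) {
		return nil, errCorrupt
	}
	payload := buf[frameHeaderLen : frameHeaderLen+int(size)]
	if crc32.ChecksumIEEE(payload) != sum {
		return nil, errCorrupt
	}
	return payload, nil
}
```

Any extra metadata (such as a compression codec) would then live inside the wrapped message rather than in the hand-encoded header.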

tylertreat

lgtm, one question around the use of required.

tylertreat · 2019-11-02T23:01:12Z

api.fbs

+// CreateStreamRequest is sent to create a new stream.
+table CreateStreamRequest {
+  subject           : string (id: 0); // Stream NATS subject
+  name              : string (id: 1); // Stream name (unique per subject)


I'm not very familiar with flatbuffers, but why not mark more fields as required?

Required is a very strong guarantee that the field will never change, and using it very sparingly is one of the lessons learned from protobuf (they removed required fields entirely in proto3, see [1] and the quote below). Once something is required, it can never be modified without changing all clients at the same time, which defeats the purpose of backwards-compatible schema changes. In fact, I'd say maybe we should go the other way and mark nothing as required here. I have it on a few key fields, but maybe for upgradeability I should remove those.

> We dropped required fields in proto3 because required fields are generally considered harmful and violating protobuf's compatibility semantics. The whole idea of using protobuf is that it allows you to add/remove fields from your protocol definition while still being fully forward/backward compatible with newer/older binaries. Required fields break this though. You can never safely add a required field to a .proto definition, nor can you safely remove an existing required field because both of these actions break wire compatibility. For example, if you add a required field to a .proto definition, binaries built with the new definition won't be able to parse data serialized using the old definition because the required field is not present in old data. In a complex system where .proto definitions are shared widely across many different components of the system, adding/removing required fields could easily bring down multiple parts of the system. We have seen production issues caused by this multiple times and it's pretty much banned everywhere inside Google for anyone to add/remove required fields. For this reason we completely removed required fields in proto3.

[1] protocolbuffers/protobuf#2497

Got it. I'm ok with removing required altogether.
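
One way to live without schema-level `required` is to check for presence in the handler instead, so the rule can change later without breaking the wire format. The Go sketch below is a hypothetical illustration: `createStreamRequest` is a hand-written stand-in for the decoded request, and it assumes an absent string field decodes as the empty string.

```go
package main

import "errors"

// createStreamRequest is a hypothetical decoded form of CreateStreamRequest.
// Every field is optional on the wire; an absent string is assumed to decode
// as "".
type createStreamRequest struct {
	Subject string
	Name    string
}

var (
	errMissingSubject = errors.New("subject must be set")
	errMissingName    = errors.New("name must be set")
)

// validateCreateStream enforces presence rules in application code rather
// than in the schema, so they can be relaxed or extended without a
// wire-incompatible schema change.
func validateCreateStream(req createStreamRequest) error {
	if req.Subject == "" {
		return errMissingSubject
	}
	if req.Name == "" {
		return errMissingName
	}
	return nil
}
```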

llchan · 2019-11-06T04:57:31Z

Also I'm overseas for a work trip at the moment but I'll try to carve out some time this weekend to get some of the remaining things I have in my worktree pushed up. Hopefully we can get this closer to merge-ready sooner rather than later so everyone has some time to absorb the changes.

llchan · 2019-11-20T03:16:29Z

Hey sorry didn't get a chance to take a look while I was away but I'm back now so will give this some attention over the next few days.

Most notably, we make the core message payload its own table, so that it is self-contained and wrapper messages can simply memcpy it around.

tylertreat · 2019-12-15T02:22:39Z

@llchan Wondering what the state of this is?

llchan · 2019-12-17T04:37:08Z

Was running around a bit for Thanksgiving + a conference, but can pick up on this tomorrow. Last I left off I was reading through the existing on-disk serialization code for the liftbridge server. The implementation there may influence the way we should structure our messages; ideally we can memcpy blobs without parsing/unpacking, and possibly keep the door open for compression/mmap in the future. I think the current state of the PR already includes some adjustments towards this end, but will need to review.

caioaao · 2020-01-03T10:20:54Z

api.fbs

+  EARLIEST  = 2, // Start at the oldest message
+  LATEST    = 3, // Start at the newest message
+  TIMESTAMP = 4, // Start at a specified (ack) timestamp
+}


Imo start position could be a union instead. This should make consistency easier (e.g. not setting an offset when the start position type is timestamp).
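
To show what the union buys, here is a Go sketch that models the start position as a closed set of variants, each carrying only the data it needs. All type names are hypothetical, and the offset variant is an assumption since the enum's earlier values are not shown in this hunk.

```go
package main

import (
	"strconv"
	"time"
)

// startPosition is a hypothetical sealed interface standing in for a
// FlatBuffers union: exactly one variant is set at a time.
type startPosition interface{ isStartPosition() }

type startEarliest struct{}
type startLatest struct{}
type startAtOffset struct{ Offset int64 }           // assumed variant
type startAtTimestamp struct{ Timestamp time.Time } // ack timestamp

func (startEarliest) isStartPosition()    {}
func (startLatest) isStartPosition()      {}
func (startAtOffset) isStartPosition()    {}
func (startAtTimestamp) isStartPosition() {}

// describe renders a start position. Because each variant holds only its own
// field, "timestamp start with an offset set" cannot be expressed at all.
func describe(p startPosition) string {
	switch v := p.(type) {
	case startEarliest:
		return "earliest"
	case startLatest:
		return "latest"
	case startAtOffset:
		return "offset " + strconv.FormatInt(v.Offset, 10)
	case startAtTimestamp:
		return "timestamp " + v.Timestamp.UTC().Format(time.RFC3339)
	default:
		return "unknown"
	}
}
```

With an enum plus separate offset and timestamp fields, the same invariant has to be checked by hand on every request.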

tylertreat · 2020-02-14T23:36:31Z

Closed in favor of #14.

Add NATS message envelope documentation to README

764e13f

llchan force-pushed the protocol-changes branch from 00f9c2d to 764e13f Compare November 1, 2019 17:35

Use box drawing characters

3f2dab5

tylertreat reviewed Nov 1, 2019

llchan and others added 2 commits November 1, 2019 18:07

Update README

e39b799

Initial flatbuffers conversion (direct port)

ab96e80

Remove MsgType

b7607cf

tylertreat reviewed Nov 2, 2019

tylertreat mentioned this pull request Nov 16, 2019

Failed to parse server response paambaati/node-liftbridge#2

Open

llchan added 2 commits November 30, 2019 20:47

Reorganize protocol message types

5aebfe1

Most notably, we make the core message payload its own table, so that it is self-contained and wrapper messages can simply memcpy it around.

Fix typo in README

d7c17bf

llchan force-pushed the protocol-changes branch from f5be7e6 to d7c17bf Compare December 1, 2019 02:47

caioaao reviewed Jan 3, 2020

tylertreat closed this Feb 14, 2020
