-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Protocol Changes #8
Conversation
00f9c2d
to
764e13f
Compare
A few comments and open questions:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, only real question is around MsgType
.
README.md
Outdated
| MsgType | Description | | ||
| ------- | ----------- | | ||
| 0 | Publish | | ||
| 1 | Replication | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I question the value of specifying the message type. Do you have a use for this information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we were considering using the same envelope for the replication messges as well, in which case we need to know which message type to decode the payload with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would we be handling decoding differently between publishes vs. replication? In either case we should know if it's a publish or a replicated message based on the context in which the message is received, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I suppose maybe I'm just being overly cautious. If the user happens to send liftbridge-encoded publishes over the replication subject, it could be catastrophic. But maybe we can just choose a different magic number for the replication header and call it a day? I'm fine with that, and we can keep this byte reserved for better use cases down the road.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Different magic number makes sense to me, but I can be convinced otherwise.
README.md
Outdated
## NATS Message Envelope | ||
|
||
Every Liftbridge message that is sent over NATS should be sent with the following | ||
envelope header: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's important that we still support "plain" NATS messages for cases where people just want transparent recording of NATS subjects without publisher code changes. The trade-off obviously is that you give up the features of the envelope (key, acking, etc.), but that's the expectation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, absolutely. I should be more clear in the wording here. The plain messages on subjects with streams attached will be treated as value-only messages with no acks etc, just like you have it now. This note is only referring to the Liftbridge-generated messges sent over nats.
To answer your questions...
I'm not sure we can do this. What of the case where you publish a Message directly to NATS, not through Liftbridge's Publish API?
Headers I'm torn on since this was a frequently requested feature in NATS Streaming and Kafka supports it. |
|
I think I follow now.
Yeah, that is what I was thinking as well. Headers give us a lot of flexibility to do interesting things in the future. Routing/filtering is a good example. |
I'm also starting to look at the on-disk encoding, which I think can be unified with the wire protocol. |
The thought was to use attributes for flags such as compression, e.g. a compression codec. The on-disk protocol could definitely use some cleaning up though. |
Ah got it, good idea. We can probably wrap the messages in a on-disk flatbuffers message, so that the only things we need to manually encode are the size and CRC-32. I think that will be more future-proof in the event that we want to add more metadata to the message (e.g. compression algorithm). Might also be good to add a header to the segment for segment-level configuration---arguably compression would be able to do more with large batches of messages, vs just within a single message. Anyways I'm kind of digressing, should keep this PR focused on the client-facing API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, one question around the use of required
.
api.fbs
Outdated
// CreateStreamRequest is sent to create a new stream. | ||
table CreateStreamRequest { | ||
subject : string (id: 0); // Stream NATS subject | ||
name : string (id: 1); // Stream name (unique per subject) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very familiar with flatbuffers, but why not mark more fields as required
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Required is kind of a very strong guarantee that the field will never change, and the idea of using it very sparingly is one of the lessons learned from protobuf (they removed required fields entirely in proto3, see [1] and the quote below). Once something is required, it can never be modified without changing all clients at the same time, which defeats the purpose of backwards-compatibility of schema changes. In fact, I'd say maybe we should go the other way and mark nothing as required here. I have it on a few key fields but maybe for upgradeability i should remove those.
We dropped required fields in proto3 because required fields are generally considered harmful and violating protobuf's compatibility semantics. The whole idea of using protobuf is that it allows you to add/remove fields from your protocol definition while still being fully forward/backward compatible with newer/older binaries. Required fields break this though. You can never safely add a required field to a .proto definition, nor can you safely remove an existing required field because both of these actions break wire compatibility. For example, if you add a required field to a .proto definition, binaries built with the new definition won't be able to parse data serialized using the old definition because the required field is not present in old data. In a complex system where .proto definitions are shared widely across many different components of the system, adding/removing required fields could easily bring down multiple parts of the system. We have seen production issues caused by this multiple times and it's pretty much banned everywhere inside Google for anyone to add/remove required fields. For this reason we completely removed required fields in proto3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. I'm ok with removing required
altogether.
Also I'm overseas for a work trip at the moment but I'll try to carve out some time this weekend to get some of the remaining things I have in my worktree pushed up. Hopefully we can get this closer to merge-ready sooner rather than later so everyone has some time to absorb the changes. |
Hey sorry didn't get a chance to take a look while I was away but I'm back now so will give this some attention over the next few days. |
Most notably, we make the core message payload its own table, so that it is self-contained and wrapper messages can simply memcpy it around.
f5be7e6
to
d7c17bf
Compare
@llchan Wondering what the state of this is? |
Was running around a bit for Thanksgiving + a conference, but can pick up on this tomorrow. Last I left off I was reading through the existing on-disk serialization code for the liftbridge server. The implementation there may influence the way we should structure our messages---ideally we can memcpy blobs without parsing/unpacking, and possibly keep the door open for compression/mmap in the future. I think the current state of the PR already includes some adjustments towards this end, but will need to review. |
EARLIEST = 2, // Start at the oldest message | ||
LATEST = 3, // Start at the newest message | ||
TIMESTAMP = 4, // Start at a specified (ack) timestamp | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imo start position could be an Union instead. This should make consistency easier (eg: not setting an offset when start position type is timestamp
)
Closed in favor of #14. |
This PR is a proposal for two protocol-level changes. Some stuff is not pushed yet, just getting this PR started.