This repository was archived by the owner on Dec 21, 2021. It is now read-only.
5. Message Sequencing – Guarantee sequence follows publish order & Prevent backdated messages silently breaking future publishes #166
Merged
mirpo approved these changes on Sep 22, 2020.
src/Publisher.js:

```js
// than they were issued we will end up generating the wrong sequence numbers
const streamId = getStreamId(streamObjectOrId)
const key = streamId
const queue = this.pending.get(key) || this.pending.set(key, pLimit(1)).get(key)
```
Contributor comment:
👍
timoxley added a commit that referenced this pull request on Nov 24, 2020:
* Refactor & co-locate publish code.
* Avoid using expensive ethers.Wallet.createRandom() calls in test. e.g. 1000x calls with ethers: 14s, randomBytes: 3.5ms.
* Ensure messageCreationUtil is cleaned up after test.
* Fix non-functional MessageCreationUtil test.
* Swap out receptacle for more flexible mem/p-memoize/quick-lru.
* Convert LoginEndpoints test to async/await.
* Remove calls to ensureConnected/ensureDisconnected in test.
* 5. Message Sequencing – Guarantee sequence follows publish order & Prevent backdated messages silently breaking future publishes (#166)
* Improve authFetch logging.
* Update message sequencer to strictly enforce message order.
* Queue publishes per-stream otherwise can skip forward then back even when messages published in correct sequence.
* Add partial solution to broken backdated messages, at least doesn't break regular sequential publishes.
* Tidy up, add some comments.
* Move publish queue fn into utils.
timoxley added another commit that referenced this pull request on Nov 24, 2020:
* Promisify subscribe/unsubscribe.
* Fail _request call if unsubscribe/subscribe request fails to send.
* Fix missing import in resend test.
* Fix browser tests.
* Clean up unused function in test.
* 4. Refactor Publish (#164)
* Refactor & co-locate publish code.
* Avoid using expensive ethers.Wallet.createRandom() calls in test. e.g. 1000x calls with ethers: 14s, randomBytes: 3.5ms.
* Ensure messageCreationUtil is cleaned up after test.
* Fix non-functional MessageCreationUtil test.
* Swap out receptacle for more flexible mem/p-memoize/quick-lru.
* Convert LoginEndpoints test to async/await.
* Remove calls to ensureConnected/ensureDisconnected in test.
* 5. Message Sequencing – Guarantee sequence follows publish order & Prevent backdated messages silently breaking future publishes (#166)
* Improve authFetch logging.
* Update message sequencer to strictly enforce message order.
* Queue publishes per-stream otherwise can skip forward then back even when messages published in correct sequence.
* Add partial solution to broken backdated messages, at least doesn't break regular sequential publishes.
* Tidy up, add some comments.
* Move publish queue fn into utils.
This fixes two edge-case issues with sequencing & publishing. Both result in silent/subtle corruption, so they may or may not affect existing streams and simply haven't been noticed.
Problem 1: Backdated messages corrupt sequencing
Publishing backdated (i.e. non-sequentially timestamped) messages breaks sequencing. This can happen if the user publishes old data with a custom timestamp using the same client instance + stream that they're publishing realtime data to, i.e. the same message chain.
This happens because we lose track of the biggest sequence number whenever the timestamp changes for a stream id + partition combo. There's an assumption that if the timestamp changes, the new timestamp is actually newer, but this isn't verified in the client. Backdated messages therefore start at sequence 0 again, regardless of the sequencing of existing messages, and the `prevMessageRef` of these backdated messages points into the future.
e.g. consider a publish sequence where each message is written as `[timestamp, sequence] -> prevMessageRef`. After publishing a few messages in order, you then publish a new message with timestamp 0, and then a new message with timestamp 1:
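A minimal sketch of the naive sequencing rule shows how the refs end up (illustrative only; `makeSequencer` is a made-up name, not the client's actual implementation):

```javascript
// Illustrative sketch of the naive rule: sequence resets to 0 whenever the
// timestamp changes, with no check that the new timestamp is actually newer.
function makeSequencer() {
    let prevTimestamp = null
    let prevSequence = 0
    let prevMsgRef = null
    return function next(timestamp) {
        const sequence = timestamp === prevTimestamp ? prevSequence + 1 : 0
        const msgRef = [timestamp, sequence]
        const result = { msgRef, prevMsgRef }
        prevMsgRef = msgRef
        prevTimestamp = timestamp
        prevSequence = sequence
        return result
    }
}

const next = makeSequencer()
next(0) // [0, 0] -> null
next(1) // [1, 0] -> [0, 0]
next(1) // [1, 1] -> [1, 0]
next(0) // backdated: [0, 0] -> [1, 1], duplicate key, ref points into the future
next(1) // [1, 0] -> [0, 0], same key as the earlier [1, 0]
```

The last two messages reuse already-used `[timestamp, sequence]` keys, and the first of them carries a `prevMessageRef` that points into the future.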
Were these messages to be accepted by the network, then, since it considers a timestamp + sequence number pair unique, the newer messages carrying the same timestamp + sequence number would clobber the older messages, while also having a messed-up, future-dated & cyclic `prevMessageRef`. 🤢

Thankfully this doesn't seem to happen: currently, backdated messages simply disappear from the client's perspective. The backdated messages aren't delivered to a stream's subscription, nor are they resent. Unfortunately, all future publishes to this stream also disappear; only messages published before the first backdated message survive.
The backend seems to be responsible for the disappearing behaviour. No error messages appear in the logs on publish, though resends on affected streams do report:
I'm not sure of a good way around this other than blocking backdated messages on the same chain. We should probably enforce this in the protocol, e.g. the `StreamMessage` constructor could error on a `prevMessageRef` with a timestamp greater than the message's own timestamp. It would also be good if the backend sent some information to the client when such things occur. I found a partial workaround that at least keeps non-backdated publishing working (see below).

We could keep a single global sequence counter per stream; this would prevent duplicate timestamp + sequence number entries from occurring and keep everything sortable, but it doesn't prevent `prevMessageRef` from pointing into the future.

We should also figure out and document an official way to safely publish backdated messages, e.g. publishing archived data. Sequencing should be fine so long as the backdated messages end up on a different message chain, but currently this requires using a different client instance. Perhaps we could add a new method to the client like:
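A hypothetical sketch (the method name `publishBackdated` and its options are illustrative placeholders, not the actual proposal, which was elided from this page):

```js
// Hypothetical API – publishes on a dedicated message chain so backdated
// timestamps cannot collide with the realtime chain's sequencing.
await client.publishBackdated(streamId, message, {
    timestamp: historicalTimestamp,
})
```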
or something else specifically for this use-case.
Partial Workaround
streamr-client-javascript/src/Publisher.js, lines 68 to 70 in e2fcccc:
I've added a partial workaround for backdated messages silently breaking sequencing: updates to the latest sequence number/timestamp are ignored if the new timestamp is older than the previous message's timestamp. Without this, all messages published after the backdated message silently disappear. With this change, backdated messages are still sent to the network and still silently disappear, but at least the "properly" sequenced messages continue to work. Not good, but better.
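A minimal sketch of the idea (assumed shape; `updateHighWaterMark` and its state fields are illustrative, not the actual Publisher.js code):

```javascript
// Illustrative sketch – not the actual Publisher.js code.
// Only advance the stored [timestamp, sequence] high-water mark when the new
// timestamp is not older than the last one, so a backdated publish cannot
// corrupt the sequence counter for subsequent messages.
function updateHighWaterMark(state, timestamp, sequence) {
    if (state.timestamp == null || timestamp >= state.timestamp) {
        state.timestamp = timestamp
        state.sequence = sequence
    }
    return state
}
```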
Possible alternative workaround
Another option could be to automatically start a new message chain when an out-of-order timestamp is detected. Using the same example as before: if, after publishing a message at timestamp 1, you publish a new message at timestamp 0 and then another at timestamp 1, the client creates a new chain on the first backdated message, which prevents the backdated message from being lost:
This results in inconsistent message sequencing on a realtime subscription, but seems to produce reliable sequencing for resends. The upside of this method is that no data is lost, and the user doesn't have to explicitly opt into thinking about message chains. The downside is that it introduces breaks in the chain, resulting in less reliable ordering.
Another option could be to remove user-supplied timestamps altogether; if users need to sort, they can add their own timestamp field to the message body. This makes the timestamp explicitly refer to "publish time" rather than anything about the data's time, which is probably the intended purpose anyway.
Problem 2: Sequencing occurs after unsequenced async calls
A number of async calls are made before the message is sequenced, specifically:
If something causes these async functions to resolve in a different order than they were issued, the sequencing will be applied out of order. Both of the above calls are cached remote calls, so in practice this should never occur, but a bug here would be very subtle: it may only appear under certain circumstances and leads to only slightly corrupt data, so it would likely go undetected.
I've added a few tests which catch this type of bug going forward, and fixed the publishing code to guarantee the async calls always resolve in the same order they were issued. Publish call-order sequencing is guaranteed by pushing these async calls into a per-stream queue:
streamr-client-javascript/src/Publisher.js, lines 222 to 237 in e2fcccc:
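The PR builds this queue on a per-stream `pLimit(1)`; a dependency-free sketch of the same serialization idea (illustrative, not the actual utils code):

```javascript
// Illustrative sketch: serialize publish tasks per stream by chaining each
// task onto the previous task's promise for that stream, so the async steps
// always resolve in the order the publish calls were issued.
const pending = new Map()

function queuedPublish(streamId, task) {
    const prev = pending.get(streamId) || Promise.resolve()
    // Chain even if the previous task failed, so one rejected publish
    // does not wedge the whole stream's queue.
    const next = prev.catch(() => {}).then(task)
    pending.set(streamId, next)
    return next
}
```

With this in place, a slow first publish can no longer be overtaken by a fast second publish to the same stream, which is exactly the reordering that would corrupt sequence numbers.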
Builds on: