Sync store baseline understanding #62

Open · ABresting opened this issue Nov 20, 2023 · 13 comments
Labels: enhancement (New feature or request), track:message-reliability (Improve message reliability guarantees)

ABresting commented Nov 20, 2023

Sync store is a vital feature of the Waku protocol: a node can synchronize with peer nodes in the hope of retrieving messages it missed while offline or inactive. Every message in the Waku protocol can be uniquely identified by a messageHash, which is a DB attribute. Using the messageHash, a node can easily determine whether its store already holds a given message. The following are the potential features of Waku store sync:

  1. A sync request can be triggered:
    • when a node boots/powers on
    • when X time (e.g. 5 minutes) has elapsed since the last received message
    • manually, by the client
  2. Sync requests are passive, i.e. only a node missing data/messages triggers one; the provider node should not actively trigger or advertise it.
  3. An outdated client node (one with no provision to support such requests) that receives a sync request responds with a 501 (Not Implemented) or 405 (Method Not Allowed) status code.
  4. Once the missing hashes are agreed upon, the provider node should prepare to transport the corresponding messages to the requesting peer node (a rough sketch of the request/response shape follows this list).
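
A rough illustration of the request/response shape implied by the list above, as a minimal Python sketch; all type, field, and status names here (SyncRequest, known_hashes, etc.) are hypothetical and not part of any existing Waku specification.

```python
# Hypothetical sketch only: message and field names are illustrative,
# not part of any existing Waku specification.
from dataclasses import dataclass, field
from enum import Enum


class SyncStatus(Enum):
    OK = 200
    METHOD_NOT_ALLOWED = 405   # peer refuses to serve sync requests
    NOT_IMPLEMENTED = 501      # outdated peer without sync support


@dataclass
class SyncRequest:
    """Sent by the node that suspects it is missing messages (passive model:
    only the requester initiates; the provider never advertises)."""
    known_hashes: list[bytes]                  # messageHashes the requester already has
    time_range: tuple[int, int] | None = None  # optional bounds on the sync window


@dataclass
class SyncResponse:
    status: SyncStatus
    missing_hashes: list[bytes] = field(default_factory=list)  # hashes the requester lacks
    # Once both sides agree on missing_hashes, the provider transfers the
    # corresponding full messages in a follow-up exchange.
```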

There are some open questions such as:

  1. What to consider when implementing the sync mechanism: Understanding the data structure for the Sync mechanism (#63)
  2. How far back can Waku messages be requested for sync from peer nodes? Sync store - how old messages can be requested for Sync (#64)

Eventually, once the operating details of the Prolly-tree-based synchronization mechanism are established, integrating the synchronization layer into the Waku protocol will require careful consideration, with a deep understanding of its operational nuances and a thoughtful approach to its implementation (#73).

Topics such as incentives to serve sync requests are out of scope for this document.

ABresting (Author)

@jm-clius @waku-org/waku

SionoiS commented Nov 20, 2023

waku-org/pm#101 (comment)

SionoiS commented Nov 20, 2023

Also, I would like to say that we should aim for a solution geared towards specific apps.

I believe that apps using TWN will naturally form sync groups among themselves, meaning an app would have a couple of TWN nodes but only sync the messages it cares about.

Supporting that should be our first priority IMO.

Only then should we think about a general store provider that stores all messages, which would be the more general use case.

ABresting (Author)

> Also, I would like to say that we should aim for a solution geared towards specific apps.
>
> I believe that apps using TWN will naturally form sync groups among themselves, meaning an app would have a couple of TWN nodes but only sync the messages it cares about.

Oh yes, 100%. That's also what I have gathered from the way Status functions, the XMTP implementation, the Tribes requirements, and a nice brainstorming session with @chaitanyaprem!

> Supporting that should be our first priority IMO.
>
> Only then should we think about a general store provider that stores all messages, which would be the more general use case.

I am wondering if we should let the client provide a configuration parameter that builds a Prolly tree (or some other sync mechanism) per content topic, since most client nodes will only be interested in the content topics that serve their apps.

SionoiS commented Nov 21, 2023

> I am wondering if we should let the client provide a configuration parameter that builds a Prolly tree (or some other sync mechanism) per content topic, since most client nodes will only be interested in the content topics that serve their apps.

If the sync mechanism is Prolly-tree based, a sync request becomes a set diff. The diff of the two local trees becomes the list of message hashes to send to the other node; it's beautifully symmetric!
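
To make the symmetry concrete, here is a minimal Python sketch using plain sets instead of Prolly trees; a real implementation would walk the two trees and skip any subtree whose root hash already matches, so the full hash sets are never enumerated.

```python
# Minimal sketch of the symmetric outcome described above, using plain sets
# rather than Prolly trees.

def sync_diff(local_hashes: set[bytes], remote_hashes: set[bytes]):
    """Return (hashes to send to the peer, hashes to request from the peer)."""
    to_send = local_hashes - remote_hashes     # messages only we have
    to_request = remote_hashes - local_hashes  # messages only the peer has
    return to_send, to_request


# Each side ends up transferring exactly what the other is missing.
ours = {b"h1", b"h2", b"h3"}
theirs = {b"h2", b"h3", b"h4"}
print(sync_diff(ours, theirs))  # ({b'h1'}, {b'h4'})
```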

jm-clius (Contributor)

Thanks for opening up this issue, @ABresting!

A couple of comments:

> Sync request can be triggered

At some point we may want to periodically sync while the node is online too, ensuring less fragmented histories due to unnoticed down periods or other short lapses in good connectivity.

> Sync request is passive

This seems fine for now as a simple evolution of Store requests and responses. If we build a sync mechanism that periodically syncs, though, we may want to take inspiration from GossipSub's IHAVE and IWANT mechanisms, where nodes periodically advertise which messages they HAVE and others request what they WANT (fewer round trips).
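
For illustration, a hedged sketch of what such an advertise/request exchange could look like, loosely modelled on GossipSub's IHAVE/IWANT control messages; the function and variable names below are purely illustrative.

```python
# Illustrative sketch only, loosely modelled on GossipSub's IHAVE/IWANT.

def build_ihave(recent_hashes: set[bytes]) -> list[bytes]:
    """Provider side: periodically advertise the message hashes it HAS."""
    return sorted(recent_hashes)


def build_iwant(advertised: list[bytes], local_hashes: set[bytes]) -> list[bytes]:
    """Requester side: reply with the subset of advertised hashes it WANTs."""
    return [h for h in advertised if h not in local_hashes]


# One advertisement round replaces several request/response round trips.
provider = {b"h1", b"h2", b"h3"}
requester = {b"h1"}
iwant = build_iwant(build_ihave(provider), requester)  # [b"h2", b"h3"]
```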

> outdated client...when receives a Sync request

In the simplest version of this protocol, I envision it could simply be a better Store protocol, with a HistoryQuery either for a list of message hashes or for the full contents belonging to those message hashes. In this case, if the other node doesn't support this version of the Store protocol, libp2p would fail to establish a protocol stream (dial failure). This happens before the service side can respond with an error code within the protocol.
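
A rough sketch of the two query modes described above; the field names are hypothetical and do not reflect the actual Store protocol schema.

```python
# Hypothetical query shape, not the actual Store protocol schema.
from dataclasses import dataclass


@dataclass
class HistoryQuery:
    message_hashes: list[bytes]    # hashes the client is asking about
    return_contents: bool = False  # False: hashes only; True: full message contents


# Typical two-step usage: first learn which hashes the peer has,
# then fetch the full contents of the ones we are missing.
probe = HistoryQuery(message_hashes=[b"h1", b"h2"], return_contents=False)
fetch = HistoryQuery(message_hashes=[b"h2"], return_contents=True)
```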

jm-clius (Contributor)

One thing that is important for the baseline understanding is to consider the layered architecture here and where the synchronisation mechanism lives:

Option 1: Store protocol layer

The Store protocol itself can evolve to exchange information about keys (message hashes) and full message contents. However, the store node would still need to be able to determine which hashes it is missing and request the full contents for these from other store nodes. In the simplest, but most inefficient, version of such an architecture, the Store node would have to query its own archive backend (the key-value store, likely a DB such as Postgres) for a full list of keys and compare it with the full list of keys it receives from other nodes (which are doing the same inefficient DB queries).

However, if we introduce an efficient "middle layer" between the DB/archive backend and the Store protocol, we could vastly improve the efficiency of doing a "diff" between the indexes/message hashes known to both nodes. The Store protocol would still be responsible for communicating which message hashes it knows about, comparing them to those known by other nodes, and finding what's missing, but with an efficient way to compare its own history with the histories of other nodes. One such method is building efficient search trees, such as the Prolly trees described here: https://docs.canvas.xyz/blog/2023-05-04-merklizing-the-key-value-store.html
The archive would remain the persistence layer underlying all of this - any DB/storage/persistence technology that is compatible with key-value storage.
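
To illustrate the difference, here is a hedged Python sketch contrasting the naive full-key comparison with a tree-based diff. fetch_all_hashes and the MerkleNode shape are hypothetical stand-ins, and the tree walk is simplified: it assumes both trees have the same shape, whereas a real Prolly-tree diff aligns subtrees by key boundaries.

```python
# Hypothetical sketch; fetch_all_hashes and MerkleNode are stand-ins only.

def naive_missing(local_db, remote_hashes: set[bytes]) -> set[bytes]:
    # Inefficient baseline: pull every key out of the archive DB and diff.
    local = set(local_db.fetch_all_hashes())
    return remote_hashes - local


class MerkleNode:
    def __init__(self, digest: bytes, children: list, keys: list):
        self.digest = digest      # hash over this subtree's contents
        self.children = children  # empty for leaf nodes
        self.keys = keys          # message hashes stored in a leaf


def tree_missing(local: MerkleNode, remote: MerkleNode) -> set[bytes]:
    # Identical subtrees are skipped entirely; only differing branches are walked.
    # Simplification: assumes both trees have the same shape/child alignment.
    if local.digest == remote.digest:
        return set()
    if not remote.children:  # differing leaf: compare its keys directly
        return set(remote.keys) - set(local.keys)
    missing: set[bytes] = set()
    for l_child, r_child in zip(local.children, remote.children):
        missing |= tree_missing(l_child, r_child)
    return missing
```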

Option 2: New middleware, synchronised "backend" for Store

With this option, we will not change the Store protocol - it will remain a way for clients to query the history stored in Store service nodes according to a set of filter criteria. However, the Store nodes themselves would build on a synchronised mechanism with its own protocol for synchronising between nodes (e.g. GossipLog, based on Prolly trees). The archive would remain the persistence layer where the synchronised messages are inserted and retrievable when queried.

Option 3: Synchronised backend/archive

In this option the Store protocol would not have to be modified, and we would not need to introduce any "middleware" to effect synchronisation, messageHash exchange, etc. Instead, the Store protocol would assume that it builds on top of a persistence layer that handles synchronisation between instances. For example, all Store nodes could somehow persist and query messages from Codex-backed distributed storage for global history with reliability and redundancy guarantees. A simpler analogy would be if all Store nodes somehow had access to the same PostgreSQL instance and simply wrote to and queried from there.
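
As a minimal sketch of the shared-archive analogy, assuming a table keyed by message_hash; the table and column names below are illustrative, not the actual archive schema.

```python
# Illustrative only: a shared Postgres table that every Store node writes to
# and reads from. Assumes message_hash is the table's unique key.

INSERT_MESSAGE = """
INSERT INTO messages (message_hash, pubsub_topic, content_topic, payload, timestamp)
VALUES (%s, %s, %s, %s, %s)
ON CONFLICT (message_hash) DO NOTHING;  -- idempotent: concurrent nodes may insert the same message
"""

QUERY_BY_HASH = """
SELECT payload FROM messages WHERE message_hash = ANY(%s);
"""
```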

jm-clius (Contributor)

> If the sync mechanism is Prolly-tree based, a sync request becomes a set diff. The diff of the two local trees becomes the list of message hashes to send to the other node; it's beautifully symmetric!

I like this!

ABresting (Author)

Weekly Update

  • achieved: clarity on the Store sync protocol; nearly finalized the research document (creating visual images/diagrams) explaining the architecture and the issues with potential approaches.

  • next: prepare the workshop for Store sync and publish the research document on Prolly trees with the Waku use case.

ABresting (Author)

Weekly Update

  • achieved: baseline clarity on how the Sync protocol will be done and what it will cover; a supplementary document on the interaction of Waku node components.

  • next: a workshop with the team to reach an agreement on what the Sync store will look like.

chair28980 added the track:message-reliability (Improve message reliability guarantees) label on Dec 11, 2023
ABresting (Author)

Weekly Update

  • achieved: PoC of the Prolly tree (fixed a bug); insertion and deletion of data into it.

  • next: a write-up about the Prolly tree PoC in the issue, further testing, and generating operational details such as memory consumption using RLN specs.

chair28980 added the enhancement (New feature or request) label on Dec 18, 2023
ABresting (Author)

Weekly Update

  • achieved: Prolly tree PoC feature-complete; Postgres retention policy PR; groundwork on the diff protocol started.

  • next: the pending technical write-up about the Prolly tree PoC in the issue, the diff protocol, and generating operational details such as memory consumption using RLN specs.

ABresting (Author)

Weekly Update

  • achieved: one day of work this week due to time off; Nim implementation of Prolly trees.

  • next: diff protocol discussion, discussion of the sync mechanism's on-wire query protocol, and generating operational details such as memory consumption using RLN specs.
