Protocol Spec

y-protocols implements different binary communication protocols for efficient exchange of information.

This is the recommended approach for exchanging Awareness updates and for syncing Yjs documents and incremental updates.

Base encoding approaches

We use efficient variable-length encoding where possible.

The protocol operates on byte arrays. We use the • operator to denote concatenation of byte arrays (i.e. [1, 2] = [1] • [2]). A "buffer" shall refer to any byte array.

  • varUint(number) - encodes a 53bit unsigned integer to 1-8 bytes.
    • unsigned integers are serialized 7 bits at a time, starting with the least significant bits.
    • the most significant bit (msb) in each output byte indicates if there is a continuation byte (msb = 1).
    • A reference implementation can be found in lib0: encoding.writeVarUint decoding.readVarUint
  • varByteArray(buffer) := varUint(length(buffer)) • buffer - allows us to read any buffer by prepending the size of the buffer
  • utf8(string) - transforms a string to a utf8-encoded byte array.
  • varString(string) := varByteArray(utf8(string))
  • json(object) := varString(JSON.stringify(object)) - Write a JavaScript object as a JSON string.
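The base encodings above can be sketched in plain JavaScript as follows. This is an illustrative sketch only, not the lib0 implementation: the function names (encodeVarUint, decodeVarUint, concat and the rest) are chosen for this example, and it assumes an environment where TextEncoder is available as a global.

```javascript
// Encode an unsigned integer (up to 53 bits) 7 bits at a time, least significant bits first.
function encodeVarUint (num) {
  if (!Number.isSafeInteger(num) || num < 0) {
    throw new RangeError('expected an unsigned integer of at most 53 bits')
  }
  const bytes = []
  // Division is used instead of bit shifts because JavaScript bitwise operators truncate to 32 bits.
  while (num > 127) {
    bytes.push(128 + (num % 128)) // low 7 bits, msb = 1 (a continuation byte follows)
    num = Math.floor(num / 128)
  }
  bytes.push(num) // last byte, msb = 0
  return Uint8Array.from(bytes)
}

// Decode a varUint starting at `pos`. Returns the value and the position after it.
function decodeVarUint (bytes, pos = 0) {
  let num = 0
  let mult = 1
  while (pos < bytes.length) {
    const b = bytes[pos++]
    num += (b % 128) * mult
    mult *= 128
    if (b < 128) return { value: num, next: pos }
  }
  throw new Error('unexpected end of buffer')
}

// The • operator: concatenate byte arrays.
function concat (...buffers) {
  return Uint8Array.from(buffers.flatMap(buffer => Array.from(buffer)))
}

const varByteArray = buffer => concat(encodeVarUint(buffer.length), buffer)
const utf8 = string => new TextEncoder().encode(string)
const varString = string => varByteArray(utf8(string))
const json = object => varString(JSON.stringify(object))
```

For example, encodeVarUint(300) produces the two bytes [172, 2]: the low seven bits of 300 are 44, stored with the continuation bit set (128 + 44), followed by the remaining bits (2).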

Sync protocol (v1 encoding)

The Sync protocol defines a few message types that allow two peers to efficiently sync Yjs documents with each other. For more information about Yjs updates and sync messages, please look at Yjs Docs / Document Updates.

We initially sync using the state-vector approach. First, each client sends a SyncStep1 message to the other peer that contains a state-vector (see Yjs docs). When receiving SyncStep1, one should reply with SyncStep2, which contains the missing document updates (Y.encodeStateAsUpdate(remoteYdoc, sv)). Once a client receives SyncStep2, it knows that it is now synced with the other peer. From now on, all changes on the Yjs document should be sent to the remote client using an Update message containing the update messages generated by Yjs.

Message types

  • SyncStep1MessageType := 0
  • SyncStep2MessageType := 1
  • UpdateMessageType := 2

Encodings

  • syncStep1(sv) := varUint(SyncStep1MessageType) • varByteArray(sv) - Initial sync request. The state vector can be obtained by calling Y.encodeStateVector(ydoc).
  • syncStep2(documentState) := varUint(SyncStep2MessageType) • varByteArray(documentState) - The reply to SyncStep1. The document state can be obtained by calling Y.encodeStateAsUpdate(ydoc, sv).
  • documentUpdate(update) := varUint(UpdateMessageType) • varByteArray(update) - Incremental updates emitted by the Yjs event handler ydoc.on('update', update => sendUpdate(update)). The receiving party should apply incremental updates to the Yjs document with Y.applyUpdate(ydoc, update).
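The three encodings share the same frame, which can be sketched without any dependencies as shown below. The helper names frame and parseSyncMessage are invented for this example, and the payloads are treated as opaque bytes; in practice they would come from Y.encodeStateVector, Y.encodeStateAsUpdate, or an 'update' event.

```javascript
const SyncStep1MessageType = 0
const SyncStep2MessageType = 1
const UpdateMessageType = 2

// varUint as described under "Base encoding approaches" (returns a plain array of bytes)
function varUint (num) {
  const bytes = []
  while (num > 127) {
    bytes.push(128 + (num % 128))
    num = Math.floor(num / 128)
  }
  bytes.push(num)
  return bytes
}

// message := varUint(type) • varByteArray(payload)
function frame (type, payload) {
  return Uint8Array.from([...varUint(type), ...varUint(payload.length), ...payload])
}

const syncStep1 = sv => frame(SyncStep1MessageType, sv)
const syncStep2 = documentState => frame(SyncStep2MessageType, documentState)
const documentUpdate = update => frame(UpdateMessageType, update)

// Inverse of frame: split a sync message into its type and payload.
function parseSyncMessage (bytes) {
  let pos = 0
  const readVarUint = () => {
    let num = 0
    let mult = 1
    while (pos < bytes.length) {
      const b = bytes[pos++]
      num += (b % 128) * mult
      mult *= 128
      if (b < 128) return num
    }
    throw new Error('unexpected end of buffer')
  }
  const type = readVarUint()
  const length = readVarUint()
  if (pos + length > bytes.length) throw new Error('payload is truncated')
  return { type, payload: bytes.subarray(pos, pos + length) }
}
```

A receiver would typically switch on the parsed type: answer type 0 with a syncStep2 message, and pass the payload of types 1 and 2 to Y.applyUpdate.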

Awareness protocol

The Awareness protocol synchronizes the pure state-based Awareness CRDT between peers. This can be useful to exchange ephemeral data like presence, cursor positions, etc.

Each peer is allocated a unique entry in the Awareness CRDT that only they can modify. Eventually, this state is going to be removed by either a timeout or by the owner setting the state to null.

Since the Awareness CRDT is purely state-based, we always exchange the whole state of all locally known clients. Eventually, all Awareness CRDT instances will synchronize. The Awareness CRDT must remove the state of clients that haven't been updated for longer than 30 seconds. With each generated update, the clock: uint counter must increase. The Awareness CRDT must only apply updates if the received clock is newer / larger than the currently known clock for that client.

awarenessUpdate(clients) := varUint(clients.length) • clients.map(client => varUint(client.clientid) • varUint(client.clock) • json(client.state))
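The two rules above (apply only newer clocks, drop clients after 30 seconds of silence) can be sketched as a small in-memory store. This is a simplified illustration and not the Awareness class shipped with the package: the class name AwarenessStore, its methods, and the explicit now parameter (milliseconds, passed in so the behaviour is deterministic) are all assumptions made for this example.

```javascript
const OUTDATED_TIMEOUT_MS = 30000

class AwarenessStore {
  constructor () {
    // clientid -> { clock, state, lastUpdated }
    this.clients = new Map()
  }

  // Apply one decoded entry of an awareness update. Returns true if it was accepted.
  applyEntry (clientid, clock, state, now) {
    const known = this.clients.get(clientid)
    // Only accept updates whose clock is strictly larger than the known clock.
    if (known !== undefined && clock <= known.clock) return false
    // A null state marks the client as removed, but the clock is kept so that
    // older updates cannot bring the client back.
    this.clients.set(clientid, { clock, state, lastUpdated: now })
    return true
  }

  // Drop every client that has not been updated for longer than 30 seconds.
  removeOutdated (now) {
    for (const [clientid, entry] of this.clients) {
      if (now - entry.lastUpdated > OUTDATED_TIMEOUT_MS) this.clients.delete(clientid)
    }
  }

  // The visible states: every client whose state is not null.
  getStates () {
    const states = new Map()
    for (const [clientid, entry] of this.clients) {
      if (entry.state !== null) states.set(clientid, entry.state)
    }
    return states
  }
}
```

A peer would call applyEntry for each client in a received awarenessUpdate and run removeOutdated on a timer.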

Combining protocols

The base protocols can be mixed with your own protocols. The y-protocols package only defines the "base" protocol layers that can be reused across communication providers.

  • SyncProtocolMessageType := 0
  • AwarenessProtocolMessageType := 1

A message should start with the message-type (e.g. SyncProtocolMessageType), followed by a specific protocol message (e.g. a SyncStep1 message).

  • E.g. encoding a SyncStep1 message over the communication protocol: varUint(SyncProtocolMessageType) • syncStep1(sv)
  • E.g. encoding an awareness update over the communication protocol: varUint(AwarenessProtocolMessageType) • awarenessUpdate(clients)

A communication provider could parse protocols as follows:

import * as decoding from 'lib0/decoding'
import * as encoding from 'lib0/encoding'
import * as sync from 'y-protocols/sync'
import * as awareness from 'y-protocols/awareness'

// Outer message types, as defined above
const SyncProtocolMessageType = 0
const AwarenessProtocolMessageType = 1

// Maps each outer message type to the handler that reads the rest of the message.
// YourCustomMessageType, readCustomMessage, ydoc, and provider are placeholders
// that your application is expected to supply.
const messageTypes = {
  [SyncProtocolMessageType]: sync.readSyncMessage,
  [AwarenessProtocolMessageType]: awareness.readAwarenessMessage,
  [YourCustomMessageType]: readCustomMessage
}

function readMessage (buffer) {
  const decoder = decoding.createDecoder(buffer)
  const messageType = decoding.readVarUint(decoder)
  // Handlers write their reply (if any) into this encoder
  const encoder = encoding.createEncoder()

  const messageHandler = messageTypes[messageType]
  if (messageHandler) {
    messageHandler(decoder, encoder, ydoc)
    if (encoding.length(encoder) > 0) {
      // the message handler wants to send a reply (e.g. after receiving SyncStep1 the client should respond with SyncStep2)
      provider.sendMessage(encoding.toUint8Array(encoder))
    }
  } else {
    throw new Error('Unknown message type')
  }
}

Handling read-only users

Yjs itself doesn't distinguish between read-only and read-write users. However, you can enforce that no modifying operations are accepted by the server/peer if the client doesn't have write-access.

It suffices to read the first two bytes in order to determine whether a message should be accepted from a read-only user.

  • [0, 0, ..] is a SyncStep1. It is a request to receive the missing state (it contains a state-vector that the server uses to compute the missing updates).
  • [0, 1, ..] is a SyncStep2, which is the reply to a SyncStep1. It contains the missing updates. You want to ignore this message as it contains document updates.
  • [0, 2, ..] is a regular document update message.
  • [1, ..] is an Awareness message. This information is only used to represent shared cursors and the name of each user. However, with enough malicious intent you could temporarily assign false identities to other users.

It suffices to block messages that start with [0, 1, ..] or [0, 2, ..]. Optionally, awareness can be disabled by blocking [1, ..].
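The byte check described above could look like the following sketch. The function name isAllowedFromReadOnlyClient and the allowAwareness option are invented for this example. Rejecting unknown message types is a conservative assumption of this sketch, not something the protocol prescribes; adjust it if you mix in your own protocols. Reading single bytes is sufficient here because the type values 0 to 2 each fit into a one-byte varUint.

```javascript
const SyncProtocolMessageType = 0
const AwarenessProtocolMessageType = 1
const SyncStep1MessageType = 0

// Decide whether a raw message (Uint8Array) may be accepted from a read-only client.
function isAllowedFromReadOnlyClient (message, allowAwareness = true) {
  if (message.length === 0) return false
  if (message[0] === SyncProtocolMessageType) {
    // Only SyncStep1 ([0, 0, ..]) is harmless. SyncStep2 ([0, 1, ..]) and
    // Update ([0, 2, ..]) both carry document updates and are blocked.
    return message.length > 1 && message[1] === SyncStep1MessageType
  }
  if (message[0] === AwarenessProtocolMessageType) {
    // Awareness can optionally be disabled for read-only clients.
    return allowAwareness
  }
  // Assumption of this sketch: anything unknown is rejected.
  return false
}
```

A server would run this check on each incoming message before handing it to its message handlers, and silently drop messages for which it returns false.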