
Misordered updates result in temporarily missing Y.Map keys #591

Open
mattc opened this issue Oct 26, 2023 · 9 comments · Fixed by himself65/refine#32
mattc commented Oct 26, 2023

Describe the bug
As discussed on the YJS Community, a property of a Y.Map will be lost if:

  • Two Y.Map values are modified in document A, producing two updates.
  • Document B, which started as a copy of document A, applies only the second update. At this point, one of the modified properties will be entirely missing from document B.

The documents will converge again when the first update is finally applied to document B, but the intermediate state is highly unexpected and difficult for code to tolerate.

To Reproduce
Run the following code:

const Y = require('yjs')

const doc1 = new Y.Doc()
const doc2 = new Y.Doc()

const doc1Map = doc1.getMap('top')
const doc2Map = doc2.getMap('top')

let updates = []
doc1.on('updateV2', update => {
    updates.push(update)
})

// Set initial values on doc 1 and apply the resulting updates to doc 2
doc1Map.set('x', 1)
doc1Map.set('y', 2)
Y.applyUpdateV2(doc2, updates[0])
Y.applyUpdateV2(doc2, updates[1])
updates = []
console.log(doc2Map.toJSON())   // => { x: 1, y: 2 }

// Change two values, but apply updates to doc 2 in reverse order
doc1Map.set('x', 3)
doc1Map.set('y', 4)
Y.applyUpdateV2(doc2, updates[1])
console.log(doc2Map.toJSON())   // => { x: 1 }  y is entirely missing!
Y.applyUpdateV2(doc2, updates[0])
console.log(doc2Map.toJSON())   // => { x: 3, y: 4 }

Expected behavior
Both keys should remain present at all times. Any combination of the values would be reasonable when only the second update has been applied, e.g. { x: 1, y: 2 }.

Environment Information

  • Both Chrome 118 and Node v18
  • Both YJS 13.6.8 and YJS 13.5.30

Additional Information
I also have some signal that the missing keys can be permanent, even after all updates have been applied. However, I have not reproduced this in an isolated way yet.

@mattc mattc added the bug label Oct 26, 2023

@himself65
Contributor

I think the workaround is to check pendingStructs every time and see whether it's non-null; a non-null value indicates the document is broken (because of a missing update).

@himself65
Contributor

import * as Y from 'yjs'

const doc1 = new Y.Doc()
const doc2 = new Y.Doc()

const doc1Map = doc1.getMap('top')
const doc2Map = doc2.getMap('top')

let updates = []
doc1.on('updateV2', update => {
    updates.push(update)
})

// Set initial values on doc 1 and apply the resulting updates to doc 2
doc1Map.set('x', 1)
doc1Map.set('y', 2)
Y.applyUpdateV2(doc2, updates[0])
Y.applyUpdateV2(doc2, updates[1])
updates = []
console.log(doc2Map.toJSON())   // => { x: 1, y: 2 }

// Change two values, but apply updates to doc 2 in reverse order
doc1Map.set('x', 3)
doc1Map.set('y', 4)
Y.applyUpdateV2(doc2, updates[1])
console.log(doc2Map.toJSON())   // => { x: 1 }  y is entirely missing!
console.log('pendingStruct', doc2.store.pendingStructs)
Y.applyUpdateV2(doc2, updates[0])
console.log(doc2Map.toJSON())   // => { x: 3, y: 4 }
console.log('pendingStruct', doc2.store.pendingStructs)

Output:

{ x: 1, y: 2 }
{ x: 1 }
pendingStruct {
  missing: Map(1) { 1669215651 => 2 },
  update: Uint8Array(26) [
    0, 0, 6, 227, 214, 241, 183,  12,
    0, 1, 2,   0,   1, 136,   1,   0,
    0, 0, 1,   1,   1,   1,   3, 125,
    4, 0
  ]
}
{ x: 3, y: 4 }
pendingStruct null

In your application logic, you should query the state vector from the missing map.
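
For illustration, a minimal sketch of that idea (requestMissing is a hypothetical transport hook supplied by your provider, not a yjs API):

import * as Y from 'yjs'

// requestMissing(client, clock) would ask the peer for its updates for
// `client` starting at `clock`; hypothetical, provider-specific.
function applyAndRequestGaps (
  doc: Y.Doc,
  update: Uint8Array,
  requestMissing: (client: number, clock: number) => void
) {
  Y.applyUpdateV2(doc, update)
  const pending = doc.store.pendingStructs
  if (pending !== null) {
    // pending.missing maps clientId -> first missing clock for that client
    pending.missing.forEach((clock, client) => requestMissing(client, clock))
  }
}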

@himself65
Contributor

himself65 commented Nov 6, 2023

I have made a script that can check whether an update would break the data:

import { decodeUpdate, decodeUpdateV2, Doc } from 'yjs'

function willMissingUpdateImpl (
  doc: Doc,
  update: Uint8Array,
  decode: typeof decodeUpdateV2 | typeof decodeUpdate
): false | Map<number, number> {
  const { structs } = decode(update)
  // clientId -> first missing clock (everything before it is absent)
  const missingMap = new Map<number, number>()
  // clientId -> next expected clock, seeded from the local store and
  // advanced as we walk the structs contained in this update
  const nextClocks = new Map<number, number>()

  // find if the update references clocks the doc does not have yet
  for (let i = 0; i < structs.length; i++) {
    const struct = structs[i]
    const client = struct.id.client
    if (!nextClocks.has(client)) {
      const items = doc.store.clients.get(client) ?? []
      const lastItem = items.at(-1)
      nextClocks.set(client, lastItem ? lastItem.id.clock + lastItem.length : 0)
    }
    const nextClock = nextClocks.get(client)!
    if (nextClock < struct.id.clock && !missingMap.has(client)) {
      missingMap.set(client, struct.id.clock)
    }
    nextClocks.set(client, Math.max(nextClock, struct.id.clock + struct.length))
  }
  return missingMap.size > 0 ? missingMap : false
}

export function willMissingUpdate (
  doc: Doc, update: Uint8Array): false | Map<number, number> {
  return willMissingUpdateImpl(doc, update, decodeUpdate)
}

export function willMissingUpdateV2 (
  doc: Doc, update: Uint8Array): false | Map<number, number> {
  return willMissingUpdateImpl(doc, update, decodeUpdateV2)
}
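
A hedged usage sketch: gate applyUpdateV2 on the check above and buffer updates that would leave a gap (the buffer and the import path are hypothetical):

import * as Y from 'yjs'
import { willMissingUpdateV2 } from './will-missing-update' // hypothetical path

// updates held back until the gaps they depend on are filled
const buffered: Uint8Array[] = []

function safeApply (doc: Y.Doc, update: Uint8Array) {
  const missing = willMissingUpdateV2(doc, update)
  if (missing === false) {
    Y.applyUpdateV2(doc, update)
  } else {
    // missing maps clientId -> first missing clock; hold the update for now
    buffered.push(update)
  }
}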

@himself65
Contributor

The idea is from the applyUpdate source code, which checks whether an update is missing information before performing the operations.

Checking for missing data is easy: just check that the clocks are contiguous between the end of each client's existing structs and the beginning of the incoming structs.
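
A worked example of that check, with made-up numbers:

// The doc's last struct for client 42 covers clocks [0, 5), so the next
// expected clock is 5. An incoming struct starting at clock 7 means clocks
// 5..6 are missing; one starting at clock 5 would be contiguous.
const lastItem = { id: { client: 42, clock: 0 }, length: 5 }
const incoming = { id: { client: 42, clock: 7 } }
const nextClock = lastItem.id.clock + lastItem.length // 5
console.log(nextClock < incoming.id.clock) // true => there is a gap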

@cacosandon

hey @mattc! did you manage to solve this problem?

it still removes the key temporarily.. so I am getting some errors. but it is never permanent tho

@mattc
Author

mattc commented Apr 19, 2024

> hey @mattc! did you manage to solve this problem?
>
> it still removes the key temporarily.. so I am getting some errors. but it is never permanent tho

No, this is still a painful issue for us. We mitigated it with a version of what @himself65 describes: simply not applying changes that result in pendingStructs. However, this eliminates part of why we wanted to use a CRDT at all, and there are traffic patterns that can result in extended periods of not applying changes.
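
A sketch of one way to implement that kind of gate (my construction, not necessarily what mattc's system does): probe the update on a throwaway clone and skip it while it would leave pendingStructs.

import * as Y from 'yjs'

// Try the update on a clone first; if it leaves pendingStructs behind,
// applying it to the real doc would produce the broken intermediate state.
function wouldLeaveGap (doc: Y.Doc, update: Uint8Array): boolean {
  const probe = new Y.Doc()
  Y.applyUpdateV2(probe, Y.encodeStateAsUpdateV2(doc)) // copy current state
  Y.applyUpdateV2(probe, update)
  const gap = probe.store.pendingStructs !== null
  probe.destroy()
  return gap
}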

@himself65
Contributor

> hey @mattc! did you manage to solve this problem?
> it still removes the key temporarily.. so I am getting some errors. but it is never permanent tho
>
> No, this is still a painful issue for us. We mitigated it with a version of what @himself65 describes: simply not applying changes that result in pendingStructs. However, this eliminates part of why we wanted to use a CRDT at all, and there are traffic patterns that can result in extended periods of not applying changes.

I would say your yjs provider must make sure data is pushed and pulled reliably. The network can drop a data packet mid-connection, and then the yjs doc will appear broken for no obvious reason. My suggestion is to add a signature to each data packet and do checksums frequently; if data is lost or the doc is broken, fall back to a full-state sync.
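
A minimal sketch of that fallback (fetchFullStateV2 is a hypothetical transport call; the signature/checksum machinery is left out):

import * as Y from 'yjs'

// fetchFullStateV2 would return the peer's complete state as one V2 update.
function resyncIfBroken (doc: Y.Doc, fetchFullStateV2: () => Uint8Array) {
  if (doc.store.pendingStructs !== null) {
    // a full-state update contains every struct, so applying it should fill
    // the gaps and clear pendingStructs
    Y.applyUpdateV2(doc, fetchFullStateV2())
  }
}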

@mattc
Author

mattc commented Apr 19, 2024

Yes, absolutely. We have a panic moment if a pending struct doesn't arrive within a reasonable amount of time. It doesn't happen frequently but it happens.

Some access patterns can lead to extended periods with at least one missing change, even if no single change is missing for very long. We now avoid client operations that produce a very large change followed by very small changes, since the smaller ones will often be processed first unless we introduce a bottleneck. In chatty documents where multiple pairs of misordered changes overlap, this can create longer periods of invalid documents. That requires our system for persisting in-flight changes to consider an arbitrarily large number of changes in each batch to guarantee that we're only persisting a stable document.

We managed to mitigate most of this, but the complexity of these mitigations far exceeds the complexity of the rest of our document synchronization system.
