Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

23.05.1 Writer garbled during collaborative editing #6897

Closed
pedropintosilva opened this issue Jul 13, 2023 · 10 comments
Closed

23.05.1 Writer garbled during collaborative editing #6897

pedropintosilva opened this issue Jul 13, 2023 · 10 comments
Assignees
Labels
bug Something isn't working

Comments

@pedropintosilva
Copy link
Contributor

pedropintosilva commented Jul 13, 2023

Describe the bug
Writer garbled during collaborative editing in today's COOL Weekly. Seen in Firefox.
https://staging.eu.collaboraonline.com/nextcloud/index.php/s/JTFyA6nZoYZWZKJ

To Reproduce

  1. Selected
  * Accelerator keys improvements: Focus back to document when Escape key pressed
     * Focus back to document when Escape key pressed by Darshan-upadhyay1110 · Pull Request #6838 · CollaboraOnline/online · GitHub
  1. Press backsapce
    3 Press down key
  2. Press up key
    5 Undo via ctrl +Z
  3. Garbled text

Screenshots
image

Desktop (please complete the following information)
COOLWSD version: 23.05.1.2 (git hash: 4c3be85 (E))
LOKit version: Collabora Office 23.05.1.2 (git hash: 272de6e)
Served by: openSUSE Leap 15.5
Server ID: 79edb2a9

@pedropintosilva pedropintosilva added the bug Something isn't working label Jul 13, 2023
@mmeeks
Copy link
Contributor

mmeeks commented Jul 13, 2023

Szymon also got the same wrong text at the same time - perhaps didn't get invalidation when doing something here. Although looks like a wrong delta is applied.

@mmeeks
Copy link
Contributor

mmeeks commented Jul 14, 2023

So - the 'undo' ctrl+Z would have done a full invalidate of the document - which is interesting; it seems clear the garbled tiles are due to having the wrong deltas applied somehow on top of them. It may be an artifact of GC'ing and re-hydrating tiles when we run out of them; I'll try shrinking the thresholds here to see if that helps reproduce:

_garbageCollect: function() {
	// 4k screen -> 8Mpixel, each tile is 64kpixel uncompressed
	var highNumCanvases = 250; // ~60Mb.
	var lowNumCanvases = 125;  // ~30Mb
	// real RAM sizes for keyframes + delta cache in memory.
	var highDeltaMemory = 120 * 1024 * 1024; // 120Mb
	var lowDeltaMemory = 60 * 1024 * 1024;   // 60Mb
	// number of tiles
	var highTileCount = 2 * 1024;
	var lowTileCount = 1024;

Lets hope Szymon was following you pedro so had the same tile counts and the same output.

@mmeeks mmeeks self-assigned this Jul 14, 2023
@mmeeks
Copy link
Contributor

mmeeks commented Jul 14, 2023

I'm thinking that when we have multiple subscribers, and we get a blank delta back from coolwsd - it is not the case that we shouldn't send anything to the clients - since it depends what sequence they're at - perhaps they didn't catch up with the last change. So;

wsd/TileCache.cpp- // notify that the tile was re-rendered, with no change.

is unhelpful and needs a re-think =)

@mmeeks
Copy link
Contributor

mmeeks commented Jul 14, 2023

With a multi-copy of the ODT from the last call - it's possible to reproduce some things like this with various crass hacks locally ;-) I rather suspect that the idea of bumping the wid in the on the last delta stored in the TileData miight be quite unhelpful. Possibly it could encourage other subscribers to get the same delta again and re-apply it so:

wid: 4 -> 6 -> 8
data: T D

If we get a fetch from one client for a delta from 7 - where we'd normally get nothing since it's up to date, but now we re-wrote this to:

wid: 4 -> 8
data: T D

by bumping the last item - then possibly we re-send the 'D', re-apply it, and get screen corruption as shown.

Possibly we need to append an empty delta to the tile-data, and just bump that - or ... .

Quite probably this is the root cause.

mmeeks added a commit that referenced this issue Jul 14, 2023
Also: address parts of #6897, primarily:

* remove the problematic aspect of bumping the last wid in our
  TileData, when this could trigger a re-send of a previously
  sent delta, causing tile corruption.
    * instead append an empty wid entry.
    * as an optimization - if the last entry is empty update
      the wid - since re-sending an empty delta is of no
      concern.

* simplify a number of code-paths that special-case zero length
  deltas. All deltas now commence with 'D'.

* still track updates in the JS - by detecting empty deltas.

* shares more code and simplifies various paths.

Signed-off-by: Michael Meeks <michael.meeks@collabora.com>
Change-Id: I02af6d4b152524c201b6985b7a3497da7f08a517
@mmeeks
Copy link
Contributor

mmeeks commented Jul 14, 2023

There is some hope that this patch: #6908 may help - at least there is a plausible mechanism whereby it might, in the absence of a reproducer. Need to write some more unit tests targetting that really.

@mmeeks
Copy link
Contributor

mmeeks commented Jul 17, 2023

If I'm right - to reproduce this you would need multiple views on the document with markedly different latency, and editing by one perhaps two of them - the unit test should reflect this from tilewins8 ;-)

eszkadev pushed a commit that referenced this issue Jul 17, 2023
Also: address parts of #6897, primarily:

* remove the problematic aspect of bumping the last wid in our
  TileData, when this could trigger a re-send of a previously
  sent delta, causing tile corruption.
    * instead append an empty wid entry.
    * as an optimization - if the last entry is empty update
      the wid - since re-sending an empty delta is of no
      concern.

* simplify a number of code-paths that special-case zero length
  deltas. All deltas now commence with 'D'.

* still track updates in the JS - by detecting empty deltas.

* shares more code and simplifies various paths.

Signed-off-by: Michael Meeks <michael.meeks@collabora.com>
Change-Id: I02af6d4b152524c201b6985b7a3497da7f08a517
@mmeeks
Copy link
Contributor

mmeeks commented Jul 17, 2023

This persists; its most curious - if I disable _garbageCollect - it stops; so related to client side / JS goodness it seems. As soon as I free canvas' and call _reclaimTileCanvasMemory I can reproduce the issue ... exciting; presumably then related to re-constituting the canvas from the rawData.

@mmeeks
Copy link
Contributor

mmeeks commented Jul 17, 2023

This is/was cause by the somewhat unusual reentrancy between var ctx = this._ensureContext(tile); and _ensureContext calling _applyDelta which can re-constitute the canvas from the 'rawDeltas'. Hopefully quite simple. Good thing to fix a lot of other issues on the way =)

@mmeeks
Copy link
Contributor

mmeeks commented Jul 17, 2023

Tilewins8 has - #6921 with a fix for this one - turned out to be lots simpler than worried - but also fixes a number of other problems that might be related and/or a problem in future. Main fix is:

Author: Michael Meeks michael.meeks@collabora.com
Date: Mon Jul 17 20:59:52 2023 +0100

deltas: fix re-consitution of canvas' from rawDeltas under pressure.

@eszkadev
Copy link
Contributor

I think we can close

rparth07 pushed a commit to rparth07/online that referenced this issue Aug 5, 2023
Also: address parts of CollaboraOnline#6897, primarily:

* remove the problematic aspect of bumping the last wid in our
  TileData, when this could trigger a re-send of a previously
  sent delta, causing tile corruption.
    * instead append an empty wid entry.
    * as an optimization - if the last entry is empty update
      the wid - since re-sending an empty delta is of no
      concern.

* simplify a number of code-paths that special-case zero length
  deltas. All deltas now commence with 'D'.

* still track updates in the JS - by detecting empty deltas.

* shares more code and simplifies various paths.

Signed-off-by: Michael Meeks <michael.meeks@collabora.com>
Change-Id: I02af6d4b152524c201b6985b7a3497da7f08a517
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Archived in project
Development

No branches or pull requests

3 participants