
mixed up buffered data in demo #4485

Closed
jerch opened this issue Apr 20, 2023 · 4 comments · Fixed by #4527
Labels
type/bug Something is misbehaving

Comments

@jerch
Member

jerch commented Apr 20, 2023

Repo:

  • spawn the demo terminal with a column width that is a multiple of 10
  • exec python3 -c "print(''.join(['e\u0301e\u0301e\u0301e\u0301e\u0301e\u0301e\u0301e\u0301e\u0301#' for i in range(1000000)]))", which inserts 9 combined chars ('e' + U+0301) followed by '#' over and over

Expected:

  • the '#' should always land in the same column

Results:

  • Firefox: all good, no weirdness in scrollback
  • Safari: all good, no weirdness in scrollback
  • recent Chromium (on Linux, 112.0.5615.49): '#' shifts around, indicating that something is wrong with the buffer contents. Scrolling through the scrollback reveals this:
    [screenshot of the corrupted scrollback]

For some non-obvious reason the initial shell prompt appears again in the middle of the scrollback. This behavior is stable across all renderers in Chromium, so my guess is that the buffer is partially broken. I cannot provoke the same in Firefox or Safari, which makes me wonder whether we have Chrome-specific code somewhere or whether this is a real Chrome bug.

Still needs more digging...

Edit - some more tests:

  • ttyd 1.7.3 (based on xterm.js 5.1)
    • with Firefox: cannot repro
    • with WebKit: cannot repro
    • with Chrome (again the same Chrome as above): repro'ed
  • vscode 1.77.3 terminal (internal Chromium 102.0.5005.196): cannot repro
@jerch jerch added the type/bug Something is misbehaving label Apr 20, 2023
@jerch
Member Author

jerch commented Apr 20, 2023

This is a bug in server.js of the demo: the binary message buffering does not work as expected. For some reason socket.send mixes data up here:

socket.send(Buffer.from(dataBuffer.buffer, 0, length));

I could not quite get to the bottom of the issue, as it points deep into ws and the Node.js socket implementation. At least one aspect looks suspicious: using the underlying array buffer with a byteOffset of 0 is most certainly wrong in Node.js, as a Buffer might not always start at byte 0 of its underlying array buffer (iirc it is even mentioned somewhere in the docs as a speed optimization underneath).
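To illustrate the byteOffset point (a sketch, not the demo's actual buffers): small Buffers in Node.js are allocated from a shared internal pool, so a Buffer view often does not start at byte 0 of its underlying ArrayBuffer, and rebuilding with a hardcoded offset of 0 can grab the wrong bytes:

```javascript
const small = Buffer.from('hello');          // pooled allocation
console.log(small.byteOffset);               // frequently non-zero

// correct: respect the view's own byteOffset
const view = Buffer.from(small.buffer, small.byteOffset, small.length);
console.log(view.equals(small));             // true

// suspicious: hardcoded offset 0, as in the demo's line above;
// this only matches 'small' if the view happened to start at byte 0
const wrong = Buffer.from(small.buffer, 0, small.length);
```

(That said, dataBuffer in the demo is allocated via new Uint8Array, which does start at byte 0, so this alone cannot be the whole story.)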

@Tyriar I saw that you changed it from the old Buffer.concat method to the current version for GC reasons in 7908b7c. To me it seems we now run into edge cases where basic assumptions in ws or in Socket are no longer fulfilled:

One guess is that with a "static" buffer chunk we'd have to wait for the callback of socket.send(data, opts, callback) before we can reuse the same memory. I did not find any clue about how ws handles ownership of the data chunk, whether and for how long send would block, or when we can write to that memory location again. I also looked at socket.write in Node.js but don't know how exactly that maps to ws.send.
With Buffer.concat we don't have the same issue, as it always makes an interim copy of the accumulated data, but sure, that keeps alloc/GC busy.
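A minimal sketch of that guess, with a hypothetical makeSender helper (not part of the demo): treat the shared buffer as owned by ws until the send callback fires, which is presumably the earliest safe point to overwrite it (an assumption, since ws does not document ownership explicitly):

```javascript
// hypothetical helper: serialize sends over a shared buffer
function makeSender(socket, dataBuffer) {
  let inFlight = false;
  return function sendChunk(length, done) {
    if (inFlight) throw new Error('previous send still owns the buffer');
    // borrow the filled region without copying
    const chunk = dataBuffer.subarray(0, length);
    inFlight = true;
    socket.send(chunk, { binary: true }, (err) => {
      inFlight = false;          // memory may be reused from here on
      if (done) done(err);
    });
  };
}
```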

I don't care much about perf optimization on the Node.js side of our demo, but if you want to stick with the no-alloc/GC approach we will have to implement a bigger static FIFO buffer battery with some sort of backpressure until we can write to that buffer again. We cannot have both at once (no alloc and no blocking): we either have to deal with tons of data (alloc), or have to tell the pty to stop the data madness for some time. (For the sake of correctness: of course blocking at times is the right approach; Buffer.concat is just a lame coding excuse with OOM risk...)

While changing the buffering in server.js fixes things for Chrome, I still could not figure out why it only affects Chrome in the first place, hmm.


Edit:
It is definitely related to the borrowed nature of the memory; I replaced the line above with these two variants:

// also borrowed, but without the byteOffset issue - still repros!
socket.send(dataBuffer.subarray(0, length));

// interim copy with slice - does not repro anymore
socket.send(dataBuffer.slice(0, length));

So the reason is not a non-zero byte offset, but the borrow mechanics messing things up. That's surprising; a non-zero byte offset would have explained the recurrence pretty well (like the shell prompt written in the past at pos 0 getting reprinted by a later non-zero buffer chunk wrongly pulling from pos 0).
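The distinction is easy to demonstrate with a plain Uint8Array (which is what dataBuffer is): subarray returns a borrowed view, while slice returns an independent copy, so later writes into the buffer retroactively corrupt any still-queued views:

```javascript
const data = new Uint8Array([1, 2, 3, 4]);
const borrowed = data.subarray(0, 2);  // view onto the same memory
const copied = data.slice(0, 2);       // independent copy

data[0] = 9;                           // later write, e.g. the next pty chunk
console.log(borrowed[0]);              // 9 - the view sees the mutation
console.log(copied[0]);                // 1 - the copy is unaffected
```

(Careful: on Node's Buffer, as opposed to Uint8Array, slice is a deprecated alias for subarray and does not copy.)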

@jerch jerch changed the title serious buffer bug mixed up buffered data in demo Apr 20, 2023
@jerch
Member Author

jerch commented Apr 20, 2023

Here is a quickly hacked version based on the current code with proper backpressure:

    // binary message buffering with backpressure
    function bufferUtf8(socket, timeout, maxSize) {
      const dataBuffer = new Uint8Array(maxSize);
      let sender = null;
      let length = 0;

      // whether we paused the pty due to websocket pressure
      let isPaused = false;
      // must hold excess chunks somewhere temporarily;
      // those excess chunks still create some alloc/GC actions,
      // hopefully not many, depending on how fast the pty
      // can stop the data inflow
      const excessChunks = [];

      function flush() {
        // cannot send while the socket is undrained;
        // buffered data stays put until the next write or 'drain'
        if (isPaused) return;
        socket.send(Buffer.from(dataBuffer.buffer, dataBuffer.byteOffset, length));
        // ws does not expose the return value of socket.write,
        // use the writableNeedDrain property instead
        if (socket._socket.writableNeedDrain) {
          term.pause();
          isPaused = true;
        }
        length = 0;
        if (sender) {
          clearTimeout(sender);
          sender = null;
        }
      }

      function write(data) {
        // cannot write to our memory while the socket is undrained;
        // accumulate chunks in excessChunks, caught up on 'drain' below
        if (isPaused) {
          excessChunks.push(data);
          return;
        }
        if (length + data.length > maxSize) {
          flush();
          if (isPaused) {
            excessChunks.push(data);
            return;
          }
        }
        dataBuffer.set(data, length);
        length += data.length;
        // userInput and term come from the surrounding demo code
        if (length === maxSize || userInput) {
          userInput = false;
          flush();
        } else if (!sender) {
          sender = setTimeout(() => {
            sender = null;
            flush();
          }, timeout);
        }
      }

      socket._socket.on('drain', () => {
        // resume early, so the pty has time to prepare the next chunk
        // (opportunistic, might get blocked again right away
        // if there was too much data in excessChunks)
        if (isPaused) {
          term.resume();
          isPaused = false;
        }
        // need to catch up writing the excess data
        let i = 0;
        for (; i < excessChunks.length; ++i) {
          write(excessChunks[i]);
          // too much data in excessChunks,
          // need to wait for the next drain event :(
          if (isPaused) break;
        }
        // drop the chunks consumed so far (a chunk that re-paused
        // the socket was re-queued at the end by write above)
        excessChunks.splice(0, Math.min(i + 1, excessChunks.length));
        // flush anything a paused timer left sitting in the buffer
        if (length && !isPaused) flush();
      });

      return write;
    }
    const send = (USE_BINARY ? bufferUtf8 : buffer)(ws, 5, 262144);

The code is not nice by any means. Better would be to return the blocking state from send/write and handle it at the caller level, but that needs a full rewrite of the buffering logic, meh.
Also the excessChunks array is unfortunate; in C one would just request only as much data as one can really digest/hold, but we have no such possibility from an event callback (writing alloc-free code in a mainly GC-driven language, where really everything relies on fire-and-forget memory, is quite challenging).

@Tyriar
Member

Tyriar commented May 17, 2023

We can revert my change if it introduced a bug; perf on the demo isn't critically important anyway.

@jerch
Member Author

jerch commented May 18, 2023

Ah ok, then reverting is probably the easiest fix here. It still has the "stocking up data" issue, but a fully-fledged integration should deal with that anyway via proper flow control.
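For reference, a minimal sketch of the copying approach this would revert to (simplified from memory, not the exact pre-7908b7c code): chunks are collected and merged with Buffer.concat on a short timer, so every send hands ws a fresh allocation it can safely own:

```javascript
function bufferConcat(socket, timeout) {
  let chunks = [];
  let sender = null;
  return function write(data) {
    chunks.push(Buffer.from(data));   // interim copy: GC pays for safety
    if (!sender) {
      sender = setTimeout(() => {
        socket.send(Buffer.concat(chunks));
        chunks = [];
        sender = null;
      }, timeout);
    }
  };
}
```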
