Redraw hangs with the client decoding error on all >=6.x versions #217

Closed
Rush-iam opened this issue Jan 22, 2023 · 19 comments
Labels
bug Something isn't working

Comments

@Rush-iam

It can happen a second after connecting or when a lot of redrawing happens (scrolling the text or selecting text).
With minor screen updates, it can work longer.

When the freeze happens, the server logs this:

Warning: client decoding error:
2023-01-22 20:29:02,468  DataCloneError: Failed to execute 'postMessage' on 'Worker': Value at index 0 does not have a transferable type

After refreshing the browser page, it works for a while before freezing again.
It also unfreezes if I switch browser tabs.
While the redraw is frozen, all input (mouse/keyboard) continues to work.

I tried almost every connection option on the client side and different options on the server side, but nothing helped.
I tried 6.0, 6.2 and 7.0 beta clients.
The v5.x branch HTML5 client works perfectly without any problems.

My laptop runs ChromeOS + Chrome; I also tried a different laptop running Windows 11 + Opera, with the same results.
The server is running on Ubuntu 18.04 (v4.3.4-r0).

I continue to use v5.x, but I am ready to provide more information. Thank you for your time!

@totaam added the bug label Jan 23, 2023
@TijZwa
Collaborator

TijZwa commented Jan 24, 2023

@Rush-iam do you connect to an SSL server or localhost? Or just over an unencrypted websocket?
The main difference is that an SSL or localhost context allows native video decoders.

@totaam
Collaborator

totaam commented Jan 24, 2023

@Rush-iam apart from the SSL question from TijZwa, could you also run with -d compress so we can see which type of draw packet is causing the problem?

@Rush-iam
Author

> @Rush-iam do you connect to a SSL server or localhost? Or just over unencrypted websocket? The main difference is that SSL or localhost context allows native videodecoders.

I connect to a remote server in the cloud.
The connection uses the defaults - I suppose it is WSS (it shows wss://myserver when connecting).
I haven't figured out how to connect without SSL: if I uncheck Secure Sockets, it refuses to connect with the error "You were disconnected for the following reason: connection failed". I tried with --ssl=off. Is this flag enough to make it unencrypted?

> @Rush-iam apart from the SSL question from TijZwa, could you also run with -d compress so we can see which type of draw packet is causing the problem.

log
Freeze happened with the line:
2023-01-26 00:08:27,112 Warning: client decoding error:

I also tried with nvjpeg encoding, with the same result.
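
When reading a long -d compress log, it can help to pull out only the lines just before the failure. The following is a minimal sketch, assuming the log is available as a list of text lines and that the freeze is marked by the "client decoding error" message quoted above. The function name lines_before_marker and the context parameter are hypothetical and not part of xpra.

```python
from collections import deque

# Marker text taken from the server warning quoted in this thread.
MARKER = "client decoding error"


def lines_before_marker(lines, marker=MARKER, context=5):
    """Return (preceding_lines, marker_line) for the first line containing
    `marker`, or None if the marker never appears.

    Hypothetical helper for illustration; it makes no assumption about the
    format of the other log lines.
    """
    recent = deque(maxlen=context)  # keeps only the last `context` lines
    for line in lines:
        if marker in line:
            return list(recent), line
        recent.append(line)
    return None
```

With a real log, something like lines_before_marker(open(path).read().splitlines()) would return the last few draw-related lines seen before the warning, where path is wherever the server log was saved.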

@totaam
Collaborator

totaam commented Jan 26, 2023

> it refuses to connect with the error

Without your server command line and log, it's impossible to say why that is.

> I tried with --ssl=off. Is this flag enough to make it unencrypted?

As per the documentation, this turns off ssl socket upgrades.
Without knowing your server command line, it's impossible to say what effect that would have on your specific setup.

> Freeze happened with the line:

There's nothing unexpected before that, just a mix of rgb24, scroll, webp and jpeg screen updates.
Maybe try running with --compressors=none to avoid using lz4 and with --encodings=all,-scroll to skip scroll encoding.
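
To illustrate what a value like all,-scroll is meant to express (start from every encoding, then drop scroll), here is a minimal sketch. It assumes tokens are applied left to right and that a leading "-" removes an entry. This is not xpra's actual option parser: resolve_encodings and the available list are hypothetical, with encoding names taken from the ones mentioned above.

```python
def resolve_encodings(spec, available):
    """Resolve a comma-separated spec such as "all,-scroll" against `available`.

    Hypothetical helper for illustration only; not xpra's parser.
    """
    selected = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        if token == "all":
            # "all" adds every available encoding that is not selected yet
            for name in available:
                if name not in selected:
                    selected.append(name)
        elif token.startswith("-"):
            # a leading "-" removes that encoding if it was selected
            name = token[1:]
            if name in selected:
                selected.remove(name)
        elif token in available and token not in selected:
            selected.append(token)
    return selected


if __name__ == "__main__":
    print(resolve_encodings("all,-scroll", ["rgb24", "scroll", "webp", "jpeg"]))
```

Under these assumptions the spec leaves rgb24, webp and jpeg enabled, so the server would never send scroll draw packets to the client.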

> The server is running on Ubuntu 18.04 (v4.3.4-r0).

That's well out of date and not a supported version.
Please update first.

@Rush-iam
Author

> Maybe try running with --compressors=none to avoid using lz4 and with --encodings=all,-scroll to skip scroll encoding.

--encodings=all,-scroll fixes the problem! It now doesn't freeze on scrolling. Thank you, @totaam!
BTW, I tried with PyCharm Community.
Can I help with narrowing down the cause of the problem? Or is it related to the v4.3.4-r0 version?

> The server is running on Ubuntu 18.04 (v4.3.4-r0).

> That's well out of date and not a supported version. Please update first.

Unfortunately, it is the only system in my company, and v4.3.4-r0 is the latest binary package available for Bionic Beaver.
Should I expect any newer builds for Ubuntu 18.04?
I might try building xpra from sources if it can be run on that Ubuntu version.
Should I try?

@totaam
Collaborator

totaam commented Jan 27, 2023

> --encodings=all,-scroll fixes the problem!

@TijZwa that should help - let me know if you don't have time for this and I will have a go at it.

> Or is it related to the v4.3.4-r0 version?

No, the bug is in the html5 client.

> and v4.3.4-r0 is the latest binary package available for Bionic Beaver
> Should I expect any newer builds for Ubuntu 18.04?

The official xpra repositories have newer builds than that:
https://xpra.org/dists/bionic/main/binary-amd64/
If your system does not update past 4.3.4 and you have the repositories correctly configured, you may be hitting repository metadata issues (can be seen in the apt-get update command output) which can be fixed by re-configuring the repositories and GPG keys.

@TijZwa
Copy link
Collaborator

TijZwa commented Jan 27, 2023

> @TijZwa that should help - let me know if you don't have time for this and I will have a go at it.

It does help. I will try to fix this soon.

@Rush-iam
Author

> The official xpra repositories have newer builds than that:
> https://xpra.org/dists/bionic/main/binary-amd64/
> If your system does not update past 4.3.4...

I've opened this link, and the latest version there is xpra_4.3.4-r0-1_amd64.deb 🤔

@totaam
Collaborator

totaam commented Jan 27, 2023

@Rush-iam oh, sorry, my bad!
Now I remember - it will be EOL in March so I have already removed support for it to simplify #3592

@Rush-iam
Author

@totaam, should I build a newer version of xpra for Ubuntu 18.04 by myself?
(I also have trouble with nvenc on a Tesla T4, and I am curious whether a newer xpra version could help.)

@totaam
Collaborator

totaam commented Jan 28, 2023

> should I build a newer version of xpra for Ubuntu 18.04 by myself?

I guess you could do that. Compiling with --without-enc_ffmpeg --without-csc_swscale --without-dec_avcodec may be enough to get it to build.
Video encodings will be much slower, but it should run.


> (I also have troubles with nvenc on Tesla T4, and I am curious if the newer xpra version could help)

What specific problem?
I am not aware of any major nvenc fixes in 4.4.x

@Rush-iam
Author

> What specific problem?
> I am not aware of any major nvenc fixes in 4.4.x

2023-01-28 17:59:00,185 Error: failed to create data packet
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/xpra/server/window/window_source.py", line 2057, in make_data_packet_cb
    packet = self.make_data_packet(damage_time, process_damage_time, image, coding, sequence, options, flush)
  File "/usr/lib/python3/dist-packages/xpra/server/window/window_source.py", line 2566, in make_data_packet
    ret = encoder(coding, image, options)
  File "/usr/lib/python3/dist-packages/xpra/server/window/window_video_source.py", line 2082, in video_encode
    return self.do_video_encode(encoding, image, options)
  File "/usr/lib/python3/dist-packages/xpra/server/window/window_video_source.py", line 2176, in do_video_encode
    return self.video_fallback(image, options, warn=False)
  File "/usr/lib/python3/dist-packages/xpra/server/window/window_video_source.py", line 2078, in video_fallback
    return encode_fn(encoding, image, options)
  File "xpra/codecs/nvjpeg/encoder.pyx", line 684, in xpra.codecs.nvjpeg.encoder.encode
  File "xpra/codecs/nvjpeg/encoder.pyx", line 362, in xpra.codecs.nvjpeg.encoder.Encoder.init_context
  File "/usr/lib/python3/dist-packages/xpra/codecs/cuda_common/cuda_context.py", line 451, in __enter__
    assert self.lock.acquire(False), "failed to acquire cuda device lock"
AssertionError: failed to acquire cuda device lock
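
The last frame of the traceback asserts on a non-blocking lock acquire, which returns False immediately when the lock is already held instead of waiting. The following is a minimal sketch of that pattern, assuming a plain threading.Lock. DeviceLockContext and try_encode are hypothetical stand-ins and not xpra's cuda_context code; the sketch raises RuntimeError rather than using assert so that the caller can handle the busy case.

```python
import threading


class DeviceLockContext:
    """Hypothetical context manager guarding a shared device with a lock."""

    def __init__(self, lock):
        self.lock = lock

    def __enter__(self):
        # acquire(False) is non-blocking: it returns False straight away
        # if the lock is currently held, rather than waiting for it.
        if not self.lock.acquire(False):
            raise RuntimeError("failed to acquire device lock")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.lock.release()
        return False  # do not swallow exceptions from the with-block


def try_encode(lock):
    """Return "encoded" if the device lock was free, otherwise "busy"."""
    try:
        with DeviceLockContext(lock):
            return "encoded"
    except RuntimeError:
        return "busy"
```

If the lock is held elsewhere when try_encode runs, it returns "busy" straight away. That is analogous to the encode call above failing outright rather than waiting for the device.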

I tested in Blender (OpenGL, xorg + virtualgl wrapper), and it happens after a few seconds of working correctly, when a large screen redraw is involved.
The client hangs, and after reconnecting, redraw fps is choppy (it seems like compressed formats become disabled on the server).

Tested with HTML clients:

  • v7 beta (master branch) - a few seconds of work before the error
  • v6.2 (v6.x branch) - it sometimes works longer, but the image is more compressed/blurry. The v5/v6 clients also have an annoying behaviour in Blender where scaling repeatedly turns on and off (about 0.5 sec before switching back), which produces very pixelated (image enc) or very soft (video enc) images.

BTW, v7 beta works perfectly with nvjpeg, with super smooth fps! (I'm sticking with it for now.)
v5/v6 have that annoying scaling problem, and I had no luck trying to solve it with server flags.

TijZwa added a commit to Tribion/xpra-html5 that referenced this issue Feb 9, 2023
@TijZwa closed this as completed in 5608f69 Feb 9, 2023
TijZwa added a commit that referenced this issue Feb 9, 2023
@TijZwa
Collaborator

TijZwa commented Feb 9, 2023

Fixed in 5608f69.
This causes a regression: scroll is now out of order.

@totaam
Collaborator

totaam commented Mar 9, 2023

@TijZwa does the fix above look right to you?
@Rush-iam does that work for you?
Can I release this?

@TijZwa
Collaborator

TijZwa commented Mar 9, 2023

> does the fix above look right to you?

Yes! This hotfix is fine for now.

@Rush-iam
Author

> @Rush-iam does that work for you?

@totaam do you mean I should apply 6be1053 from 5.x to the master branch and see how it works? It seems it just disables scroll encoding (I no longer see the "using scroll as x rectangles" message in the log), the same as running the server with --encodings=all,-scroll.

@totaam
Collaborator

totaam commented Mar 10, 2023

@Rush-iam that's what it does.

I'm going to close this ticket then.

@totaam closed this as completed Mar 10, 2023
@Rush-iam
Author

Rush-iam commented Mar 13, 2023

@totaam you probably forgot to make a commit to the master, v6.x and v7.x branches before releasing 7.0. The patch was applied only to v5.x

@totaam
Collaborator

totaam commented Mar 14, 2023

@Rush-iam 5608f69 fixed the redraw hangs.
6be1053 disables scroll altogether to avoid the occasional visual corruption - I'll apply it since @TijZwa doesn't have a fix for that.
