noVNC eats CPU and hangs Firefox #431
Comments
I am not able to reproduce using TigerVNC as the VNC server. Tested Firefox 33 on Mac OS X and Firefox 34 on Fedora 20. |
Definitely can be replicated with qemu as the vnc server :-) |
(i.e. qemu causes Firefox, but not Chrome, to start eating CPU, but a traditional VNC server (e.g. tightvnc) does not) |
Ok, so, I did some quick debugging and comparison, and here's what I've found so far: Both TightVNC and qemu use the TIGHT encoding. However, they each make different use of the various compression modes. I inserted some quick metrics code into the TIGHT encoding handler, and here's what I got: TightVNC uses a mix of copy, fill, jpeg, and palette filter operations, and tends to send larger chunks of data per operation. On the other hand, qemu tends to use copy and fill operations (with a , with small chunks of data. I did two sets of runs to capture some numbers. The first set of runs consisted of 5 second bursts, and captured data about copy, fill, and jpeg operations. The second set consisted of 10 second bursts, and captured data about copy, fill, jpeg, and filter operations (note that this run was a bit "slower" since the computations and logging for the filter average also had to be done). The commands used were
As you can see from the table above, qemu uses many more (30x!) operations with much smaller amounts of data. I suspect that this is somehow overwhelming our code in Firefox. We probably didn't notice it earlier because it really only becomes an issue with high-frequency refreshes with large amounts of change (such as rapidly scrolling text). An important thing to note here is that the display in Chrome seems to slow down, but Chrome doesn't freeze like Firefox. I'll investigate further and see if I can pin down whether there's a specific factor that's causing the slowdown. |
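The metrics code mentioned in this comment is not shown in the thread. As a rough illustration only, a per-operation tally of the kind described (how many times each operation ran, and the average payload per operation) might look like the sketch below. It is written in Python for brevity rather than in noVNC's JavaScript, and the `OpStats` class, its method names, and the sample numbers are all hypothetical, not taken from noVNC or from the measurements above.

```python
class OpStats:
    """Tally how often each operation ran and how many bytes it carried."""

    def __init__(self):
        self.count = {}        # operation name -> number of calls
        self.total_bytes = {}  # operation name -> sum of payload sizes

    def record(self, op, nbytes):
        # Call this once per decoded operation (e.g. "copy", "fill", "jpeg").
        self.count[op] = self.count.get(op, 0) + 1
        self.total_bytes[op] = self.total_bytes.get(op, 0) + nbytes

    def average(self, op):
        # Average payload size for one operation; 0.0 if it never ran.
        n = self.count.get(op, 0)
        return self.total_bytes.get(op, 0) / n if n else 0.0

    def summary(self):
        # Map each operation to (call count, average bytes per call).
        return {op: (self.count[op], self.average(op)) for op in sorted(self.count)}


# Made-up sample data, chosen only to show the shape of the comparison:
# many tiny operations versus one large one.
stats = OpStats()
for _ in range(30):
    stats.record("fill", 16)
stats.record("jpeg", 4096)
print(stats.summary())
```

Comparing the call counts against the averages is what would expose a "many small operations" pattern like the one reported for qemu.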
Interesting. Out of interest I connected with guacamole+vnc and it is fine with both chrome and firefox. This may of course be because they are handling things server side. |
@abligh yeah, guacamole is completely different. The question would be if the Java VNC client on the server starts to grow memory use or slow down when connecting to QEMU. @DirectXMan12 Any chance you could do similar tests with the memory profiler in Firefox or Chrome? There might be a memory leak/cycle that QEMU VNC traffic is exacerbating, and profiling would probably tell us where. We might need to self-manage a memory pool for something. Actually, I suspect switching everything to use typed arrays throughout would probably address this issue too (partly because, with typed arrays everywhere, we would probably be managing more of our own memory too). |
@kanaka: that's what I've been investigating this afternoon -- I suspected it was either leaks or GC pauses (or both) that were doing us in. I'll let you know if I find conclusive data, but preliminary results indicate that we have a lot of array allocations (unsurprising). |
@kanaka: here's a couple of memory timelines that you can open up in Chrome/Chromium (open up the Dev Tools, go to "Timeline", right click -> "Load Timeline Data", and then make sure that up top the whole range is selected (not in grey)): https://gist.github.com/DirectXMan12/6cbec585cfe23679ae06. The sharp drop at the end is me triggering a forced GC. Since it goes all the way down, it looks like (at least in Chrome) we don't have a leak. However, it looks to me like we're triggering GC fairly frequently. |
I've added Firefox profile data to the above multi-file gist as well. From a peek at the Firefox timeline, it doesn't seem like it's GCing frequently like Chromium. |
@kanaka: So, after a bit more investigating, it turns out that one major issue is that our zlib decompressor is slow. I attempted to replace it with pako (https://github.com/nodeca/pako), but encountered some difficulties where a couple of messages result in outputs that are much larger than what the old library generated (additionally, you have to tell it to use a suitably big buffer, since it uses fixed-size Uint8Arrays). Ignoring, for the moment, a small bit of graphical distortion initially, the resulting output is quite fast, and does not seem to crash Firefox. The main issue that I've found with other zlib implementations in JavaScript is that they assume the use case of "decompress this whole object" and not "decompress this next part of a stream", and thus get unreasonably fussy when you try to use them with VNC. I may be a bit delayed in following up on this, so if someone else wants to do the research, go ahead. |
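The difference between "decompress this whole object" and "decompress this next part of a stream" can be shown with a small sketch. This one uses Python's standard `zlib` module purely as an illustration; it is not noVNC or pako code, and the message contents are made up. It assumes the sender keeps one compressor alive for the connection and sync-flushes after each message, which is the pattern the comment describes for VNC.

```python
import zlib

# Sender side: ONE compressor for the whole connection. Z_SYNC_FLUSH ends each
# message on a byte boundary without closing the stream or resetting its state.
compressor = zlib.compressobj()
messages = [b"first rectangle " * 8, b"second rectangle " * 8]
chunks = [
    compressor.compress(m) + compressor.flush(zlib.Z_SYNC_FLUSH)
    for m in messages
]

# Receiver side: one decompressor object kept alive between messages, fed the
# next part of the stream each time.
stream = zlib.decompressobj()
decoded = [stream.decompress(c) for c in chunks]

# Treating a later chunk as a complete, standalone zlib object does not work:
# it has no stream header of its own and the stream never ends.
try:
    zlib.decompress(chunks[1])
    standalone_ok = True
except zlib.error:
    standalone_ok = False

print(decoded == messages, standalone_ok)
```

A library that only offers the one-shot style has no place to keep the decompressor state between messages, which is why it fits this use case poorly.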
Fixes implemented in #488 |
noVNC running in Firefox and connected to a qemu instance causes extreme CPU usage and essentially hangs Firefox. A large amount of RFB output appears to be what triggers the hang.
Here's the easiest way to replicate:
ls -lR /
or indeed anything else that produces a lot of screen output.
The 'death' involves one page of output, a multisecond delay, then a second page of output, then a complete hang with the CPU pegged. Firefox is then unresponsive until the window containing VNC is closed (by hitting the close button on the window), at which point after a few seconds it recovers.
I am using the following:
There is no problem using Chrome.
I conclude the problem is thus somewhere between Firefox and noVNC, possibly only when qemu is the server, and when there's a lot of screen output and scrolling.