youtube.com takes really long time to load #266

Closed
dmik opened this issue Apr 24, 2018 · 83 comments

@dmik
Contributor

dmik commented Apr 24, 2018

It appears that sometimes opening https://youtube.com takes minutes before anything appears on the page. In the meantime you only see the progress ring spinning and messages such as Read www.youtube.com, Connecting to s.ytimg.com... and the like in the status tooltip at the bottom left corner of the page. You may make it work faster by reloading the page several times with Ctrl+R (in this respect it's similar to #242).

All 45.9.0 builds as well as 45.5.0 from May 2017 are affected, while 38.x and earlier builds do not seem to be.

My current guess is still that it has something to do with the network connection. However, it looks like it's on the Firefox side, not on the TCP/IP stack side. It might be some security issues, missing certificates or such and delays in the connection caused by them. At least I sometimes see a lot of errors in JS regarding certificates. I need to study it closer. The problem is, as usual, that the failure is irregular and once it starts working, it's quite hard to make it fail again.

@dmik dmik added this to the 45esr GA2 milestone Apr 24, 2018
@dmik
Contributor Author

dmik commented Apr 24, 2018

Note that builds from Dave Yeo (e.g. https://bitbucket.org/dryeo/dry-comm-esr31/downloads/firefox-45.9.0.en-US.os2.zip) don't make any difference here. I couldn't make it hang so far, but I can't make my own test builds (or our official RPM builds) hang any more either. I doubt that the compiler optimization options (the only thing that differs in Dave's builds, apart from his use of the unofficial gcc 5.1.0 build) play any significant role here. It looks more like a timing issue which gets triggered at different moments in different builds depending on the optimization options. Which means that Dave's builds are subject to these hangs too, sooner or later.

One possible workaround for the issue is to use a clean Mozilla profile (by renaming %HOME%/Mozilla to %HOME%/Mozilla.old to let Firefox create a new one). But if I'm right about timings, this is also only a workaround.

@dspiatkowski

@dmik After several days of running the GA1.1 drop I noticed that the video performance actually got worse compared to the previous test builds. Specifically, the native YouTube video codec (VP9) would simply hang at start (both GA1.0 and GA1.1), exhibiting the symptoms you described above. However, using the h264ify add-on allowed me to play back videos smoothly. The looping sound was addressed with the upgraded UniAudio drivers, so prior to the GA1.1 build I finally had well-working YouTube playback. Following the upgrade, VP9 continued to be a problem, just as it was before, but to make matters worse the h264ify add-on no longer produces smooth playback; it now stutters.

A couple of days ago Dave released his build. I installed it and for the first time ever can successfully play back the YouTube VP9 stuff, no more need to run the h264ify add-on, the videos simply play. Smooth, no stuttering, etc, etc. His build also does not show the larger CPU consumption, although compared to the previous releases, such as GA1.0, the CPU usage is still higher. I previously commented on that in #265.

I have a Fibre-120 (125 Mbit/s) network connection with sustained high data rates to YouTube, so I believe it is not a network speed issue. Whatever changes Dave implemented, be it just GCC 5.1.0 or the other optimizations, they seem to have been a step in the right direction from my perspective. Dave's release is also an i686 build; I would like to test a pentium4 build for a closer 'apples to apples' comparison. Not sure that has anything to do with it, but it's another question mark that we can address.

@dmik
Contributor Author

dmik commented Apr 24, 2018

@dspiatkowski still, what you say doesn't prove it's not a timing issue. I'm pretty sure you will see similar problems with Dave's build one day. I will do a test build with the same optimizer options Dave used, but even that won't be proof. It simply must not depend on those options this way. And if it does, something is wrong somewhere else (kernel task scheduling, network stack, bad Firefox code design, whatever).

@dspiatkowski

@dmik I agree, not hard proof at all, but it is the only working configuration I seem to have at the moment, so I share my experience in hopes of providing additional data points for analysis. For what it's worth, given that I see a consistent outcome (and a different video playback result) between these two builds, I would be more than happy to do whatever additional data capture/debug you need to try to narrow this one down. In #264 I included screenshots of the YouTube video 'Nerd Info' for that very reason, to show side-by-side VP9 vs h264ify outcomes. As others have pointed out in the OS2World thread, they are also seeing different results.

There is something else I wanted to point out about the behaviour of Dave's build that is consistent with what I have seen in the past, including with the BWW FF builds. When using an i686 build of FF, sometimes I get a solid 100% CPU spike that persists for about 5-10 sec (usually) and then all of a sudden goes away. Your GA1.0 and 1.1 would never show this, but again, these were pentium4 builds. Dave's test is an i686 build and it shows these very same spikes once again. This is part of the reason why I asked Dave to produce a pentium4 build. Given the various results people are reporting, I wonder if there is a specific correlation with the RPM/YUM platform settings and how this affects FF. After all, something as simple as having ffmpeg libraries installed for video playback will get you either pentium4 or i686 versions. Maybe this is all that is required to cause the sort of timing issues that you are concerned about?

@dmik
Contributor Author

dmik commented Apr 24, 2018

@dspiatkowski thanks for the feedback but so far I have no idea which info besides what you already said you could provide. We need at least some hint to understand what's behind all of it. But I doubt it's just machine type. Here I test both pentium4 as well as i686 and haven't noticed any consistent difference.

Maybe finishing the profiler code (#264, your reference was to #242 I believe) and actually enabling profiling could shed some light on it.

@dmik
Contributor Author

dmik commented Apr 24, 2018

I've uploaded a test build with the same optimization as Dave's as http://rpm.netlabs.org/test/firefox-45.9.0-3.t1.i686.7z (and it's also i686, yes). Please test. And compare with the official GA1.1.

@dryeo
Contributor

dryeo commented Apr 25, 2018

This build plays YouTube videos fine. I never could get the previous build to play one: YouTube suggested restarting my device, and waiting or pressing F5 to reload didn't help.
I've found that straight -O3 optimization with SeaMonkey results in the least CPU usage, but -Os helps a bit with memory. You may want to test both.
The -O2 optimization in configure.in has been there since forever, probably originally for EMX builds where -O3 was likely unstable.
I actually have a different problem when compiling with 4.9.2, where VP9 videos display static and then a message comes up about an error occurring.

@dryeo
Contributor

dryeo commented Apr 25, 2018

This build also shut down cleanly, unlike the previous build, which would leave the icon hatched until I used TOP or such to kill it.

@dspiatkowski

Confirming Dave's findings here as well. There appears to be a substantial difference between GA1.1 and this T1 build. Not only does FF now play the YouTube VP9 videos, but they are pretty smooth (short of some stutter when the system is being heavily tasked by other processes; I actually stress tested it by running full-screen FF playback while opening an OpenOffice document), and even running the h264ify add-on also gives smooth playback. Basically, VP9 quality now matches (if not actually exceeds) the h264ify playback quality here.

Also, what I noticed is that this test build appears to be less CPU cycle hungry. I've only got a few hours of runtime at the moment, so this could be just the result of limited browsing, when FF is generally well behaved. However, even compared to a similar runtime on GA1.1, the CPU appears to be less tasked. Further update to follow, so this last comment may be a bit premature...

@dspiatkowski

...some YouTube playback stats screenshots using VP9 and h264ify add-on.
[Screenshots: t1-youtube_vp9, t1-youtube_h264ify]

@dmik
Contributor Author

dmik commented Apr 25, 2018

@dryeo are you sure about GA1.1? Can you just wait for several minutes? Can you let FF create a new profile? As I said, both work here, but I had to wait and Ctrl-R my GA1.1 build to make it work. And since then it works like a charm. And I still don't see how compiler optimization could break logic here. So it still looks like a random (=timing) issue to me, which is just rarer with different optimization options for some reason. At least we proved that GCC 5.1.0 is not involved here.

I will also try full -O3 here and pentium4 to see if it makes any difference. Regarding the default of -O2, it's not an OS/2-only thing: this is the default on many other platforms too. I don't know the reasons behind that though. I guess it's just a balance between size and speed. Again, this should not make a difference function-wise. If it does, something is seriously wrong somewhere else.

@dmik
Contributor Author

dmik commented Apr 25, 2018

And I will just state it once again that the only difference between ga1.1 and t1 is forcing -O3 for JS and -Os for the rest and -march=i686 instead of pentium4. The rest is fully identical.
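
The comparison above can be written down as flag sets and diffed. This is only a sketch: the GA1.1 flags (-O2 and -march=pentium4 for everything) are taken from statements elsewhere in this thread, and the `BUILDS` table and `diff_flags` helper are made-up names for illustration, not part of the build system.

```python
# Flag sets as described in this thread (assumption: GA1.1 uses the same
# flags for JS and for the rest). "js" is the JavaScript engine, "rest" is
# everything else (XUL etc.).
BUILDS = {
    "ga1.1": {
        "js": {"-O2", "-march=pentium4"},
        "rest": {"-O2", "-march=pentium4"},
    },
    "t1": {
        "js": {"-O3", "-march=i686"},
        "rest": {"-Os", "-march=i686"},
    },
}


def diff_flags(a, b):
    """Return {component: (only_in_a, only_in_b)} for components that differ."""
    result = {}
    for component in sorted(set(BUILDS[a]) | set(BUILDS[b])):
        flags_a = BUILDS[a].get(component, set())
        flags_b = BUILDS[b].get(component, set())
        if flags_a != flags_b:
            result[component] = (sorted(flags_a - flags_b),
                                 sorted(flags_b - flags_a))
    return result


if __name__ == "__main__":
    for component, (old, new) in diff_flags("ga1.1", "t1").items():
        print(component, old, "->", new)
```

Keeping the flags as sets makes it easy to add further test builds to the table and check that two builds really differ only where intended.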

@dmik
Contributor Author

dmik commented Apr 25, 2018

Note also that it’s not -march=pentium4 either as 45.5.0 released in May 2017 is also i686 and it was broken here too when I found 45.9.0 to be broken. Now both work.

@NeilWaldhauer

With t1, I get good YouTube playback on a Lenovo ThinkCentre Tiny M92p.
With ga1.1 I did not see it play; perhaps I didn't wait long enough.
With Dave Yeo's build YouTube playback is good.

@dryeo
Contributor

dryeo commented Apr 25, 2018 via email

@dmik
Contributor Author

dmik commented Apr 25, 2018

Well, different ads? Hmm, are we talking about the same thing? Here I saw a blank page in all 45.* builds when it broke, and some constant loading progress of the background content (like scripts and so on). Can you give me a screenshot of GA1.1 when it's trying to open YouTube.com? Maybe several if it changes while you wait.

@dryeo
Contributor

dryeo commented Apr 25, 2018 via email

@dryeo
Contributor

dryeo commented Apr 26, 2018

Here are a couple of screen shots, one of a commercial and one of a hung video. I misspoke earlier, I only get a new ad when reloading. After a while, the ad turns into a black screen.
While taking a screenshot, an ad finally played, but afterwards no more videos would play.
[Screenshots: yt_test1, yt_test2]

@dryeo
Contributor

dryeo commented Apr 26, 2018

Today I built Firefox with GCC 5.1.0 and -O2 -march=pentium4 and it consistently plays videos here.

@lerdmann

I have now tried dmik's test build:

  1. Playability of YouTube videos is entirely tied to using the h264ify add-on. If I have it installed I can play YouTube videos, if I don't have it installed I cannot. It does not matter what build I use.
  2. On my system, the test build works worse than the latest release build. I am pretty sure that dmik is right that it is some sort of timing issue (and not a network issue). The test build blocks the browser for extended periods of time, but funnily enough I can use the file menu and close it.
  3. Both the test build and the latest release do not properly delete the temporary files they create, for example the "parent.lock" file in the profile directory. I think that is a good indicator that FF does not properly clean up after itself.

I have tried this with the 14.106 SMP kernel as well as with the latest OS4 kernel. I have an 8-core AMD system. What I could try is to enable only one core per package (where a package contains 2 cores) to avoid any HT issues, but the past has shown that this helps only marginally.

@dspiatkowski

dspiatkowski commented Apr 26, 2018 via email

@dmik
Contributor Author

dmik commented Apr 26, 2018

@dryeo well, ok, your situation is different from what this ticket originates from. Your case looks more like #242. Anyway, I'm sure the reason behind both problems is the same. However, -O2 and -march=pentium4 is exactly what the GA1.1 build in 7z is (except for GCC, but my t1 is proof that it doesn't matter). Which, in turn, proves that compiler options are just a side effect here and the main problem is timing. My current guess is that the new JS VM "task scheduler" (that was heavily changed after Firefox 38) is very unstable on OS/2, so that some JS scripts start starving like hell, especially when several are executed in parallel. Looks like we are missing some platform-specific implementation detail and the generic code is just too dumb (it's also possible that the new JS VM scheduler design is just not good). All this requires quite complex research.

Anyway, I will try another build: -O3 and -march=pentium4 to see if it makes any difference.

@lerdmann re parent.lock, are you sure it should go away at shutdown? This might be a usual Unix code path that deletes an open file (which doesn't work on OS/2 or Windows of course). Maybe some fix is needed in the JS routines responsible for that. It's worth a separate ticket, please create one if you are sure it should be gone.
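
To illustrate the hypothesis about deleting an open file: an ordering that works regardless of platform is to close the handle first and only then remove the file. The sketch below is not Firefox code; `release_lock` and `demo` are hypothetical helpers, and whether parent.lock is actually handled this way in Firefox is not established in this thread.

```python
import os
import tempfile


def release_lock(handle, path):
    """Close the lock file first, then delete it.

    Removing a file that is still open succeeds on Unix-like systems but
    fails on platforms that refuse to delete open files, so the close must
    come first.
    """
    handle.close()
    try:
        os.remove(path)
    except FileNotFoundError:
        pass  # already gone, nothing to clean up


def demo():
    """Create a zero-length lock file, release it, report whether it remains."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "parent.lock")
    handle = open(path, "w")
    release_lock(handle, path)
    still_there = os.path.exists(path)
    os.rmdir(directory)
    return still_there
```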

@dryeo
Contributor

dryeo commented Apr 26, 2018 via email

@dmik
Contributor Author

dmik commented Apr 26, 2018

BTW, I'm looking at optimization options of other platforms now. In fact, current platforms like Linux and Darwin use -O3 by default. I'm also seeing they play around with -freorder-functions and -freorder-blocks which reorder code in the executable to make it more local and reduce the number of branches. And -freorder-functions actually requires support from the linker which we might not have.

The GCC documentation is a bit contradictory about these options (as usual): one place says that both are enabled in -O2, -O3 and -Os, while another place says that -Os disables -freorder-blocks as well as some code alignment optimizations. This might be the reason why timings change that much when using different options. I might also try to disable -freorder-functions on OS/2 given there is no support in the linker. I see that Android builds use -Os for XUL, -O3 for JS and in both cases disable function reordering but enable block reordering. BTW, Linux also uses -Os for XUL and -O3 for JS. However, it doesn't disable function reordering.
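
When the documentation is ambiguous, one option is to ask the compiler itself: `gcc -Q <level> --help=optimizers` lists each optimization together with an enabled/disabled state for that level. The sketch below parses that listing. The function names are made up for illustration, and the exact output layout can differ between GCC versions, so treat the regular expression as an assumption to verify against the GCC build in use.

```python
import re
import subprocess

# Matches lines such as "  -freorder-blocks        [enabled]".
_FLAG_LINE = re.compile(r"^\s*(-f[\w-]+)(?:=\S*)?\s+\[(enabled|disabled)\]\s*$")


def parse_optimizers(text):
    """Turn the optimizer listing into {flag: True/False}."""
    flags = {}
    for line in text.splitlines():
        match = _FLAG_LINE.match(line)
        if match:
            flags[match.group(1)] = match.group(2) == "enabled"
    return flags


def query_gcc(level, gcc="gcc"):
    """Run the compiler and return the parsed flag states for one -O level."""
    result = subprocess.run(
        [gcc, "-Q", level, "--help=optimizers"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return parse_optimizers(result.stdout)
```

For example, comparing `query_gcc("-Os").get("-freorder-blocks")` with `query_gcc("-O3").get("-freorder-blocks")` would show what the installed compiler actually does for each level.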

@dmik
Contributor Author

dmik commented Apr 26, 2018

Dave, I finally proved that it's not compiler options or such. I got it broken again and your build dated 23.04.2018 also doesn't work, this is what I get with it and it's hanging there for many minutes already:

[Screenshot]

I made a couple of other test builds though, will upload them shortly.

@dryeo
Contributor

dryeo commented Apr 26, 2018 via email

@lerdmann

lerdmann commented Apr 27, 2018

It must be a timing problem or such. I now went back to GA 1.1 with h264ify still activated.
All of a sudden YouTube again ceased to work. If I had to guess, I would say that some thread tries to preload some data in order to start playing the video, gets interrupted by the OS scheduler switching to another thread, and later cannot properly pick up where it left off. Or, especially on multi-core, two FF threads are not properly synchronized and the problem starts to show if these two threads are executed simultaneously on two cores.

As to "parent.lock": it's fairly obvious that this serves some sort of notification (the file always has zero content). And in the past people stated that having that file in your profile will eventually lead to Firefox asking you to refresh your profile (killing all your preferences and such). So yes I am fairly sure it has to go away on FF shutdown.
But it's well possible that this has been broken for a long time and therefore we all got used to finding that file in our profile ...
On the other hand: Thunderbird also creates that file and does not delete it ...

@dmik
Contributor Author

dmik commented Apr 27, 2018

BTW, an attempt to build XUL with -O3 failed, XPCSHELL.EXE fails with this:

______________________________________________________________________

 Exception C0000005 - Access Violation
______________________________________________________________________

 Process:  D:\USERS\DMIK\RPMBUILD\BUILD\MOZILLA-OS2-FIREFOX_45_9_0ESR_RELEASE_OS2_GA1_1\OBJDIR\DIST\BIN\XPCSHELL.EXE (04/27/2018 04:12:28 274,602)
 PID:      8BAF (35759)
 TID:      01 (1)
 Priority: 200

 Filename: D:\USERS\DMIK\RPMBUILD\BUILD\MOZILLA-OS2-FIREFOX_45_9_0ESR_RELEASE_OS2_GA1_1\OBJDIR\DIST\BIN\XUL.DLL (04/27/2018 04:12:22 439,005,274)
 Address:  005B:10910AC2 (0001:01160AC2)
 Cause:    Unknown access fault

______________________________________________________________________

 Failing Instruction
______________________________________________________________________

 10910AAC  MOV    BYTE [EAX+0x7], 0xa        (c640 07 0a)
 10910AB0  MOV    EAX, 0x8                   (b8 08000000)
 10910AB5  JMP    0x10910983                 (e9 c9feffff)
 10910ABA  MOVDQA XMM0, DQWORD [0x1093ecc0]  (660f6f05 c0ec9310)
 10910AC2 >MOVDQA DQWORD [0x17a01588], XMM0  (660f7f05 8815a017)
 10910ACA  MOVDQA XMM0, DQWORD [0x1093ecd0]  (660f6f05 d0ec9310)
 10910AD2  MOVDQA DQWORD [0x17a01598], XMM0  (660f7f05 9815a017)
 10910ADA  MOVDQA XMM0, DQWORD [0x1093ece0]  (660f6f05 e0ec9310)

______________________________________________________________________

 Registers
______________________________________________________________________

 EAX : 0000003A   EBX  : 00000008   ECX : 0000003A   EDX  : 20202DE0
 ESI : 0000003A   EDI  : 2006E2F8
 ESP : 0013F7F0   EBP  : 2006E2F8   EIP : 10910AC2   EFLG : 00010246
 CS  : 005B       CSLIM: FFFFFFFF   SS  : 0053       SSLIM: FFFFFFFF

 EAX : not a valid address
 EBX : not a valid address
 ECX : not a valid address
 EDX : read/write memory allocated by LIBC066
 ESI : not a valid address
 EDI : read/write memory allocated by LIBC066

______________________________________________________________________

 Stack Info for Thread 01
______________________________________________________________________

   Size       Base        ESP         Max         Top
 00100000   00140000 -> 0013F7F0 -> 0013C000 -> 00040000

______________________________________________________________________

 Call Stack
______________________________________________________________________

   EBP     Address    Module     Obj:Offset    Nearest Public Symbol
 --------  ---------  --------  -------------  -----------------------
 Trap  ->  10910AC2   XUL       0001:01160AC2  nsTextFragment.cpp#60 __ZN14nsTextFragment4InitEv + 180 0001:01160942 (D:\Users\dmik\rpmbuild\BUILD\mozilla-os2-FIREFOX_45_9_0esr_RELEASE_OS2_GA1_1\objdir\dom\base\Unified_cpp_dom_base8.cpp)

 2006E2F8  200712A0   *Unknown*

 Lost Stack chain - new EBP below previous

Given that there is no significant benefit from -O3 (and Linux builds specifically use -Os for some reason), I'm not going to debug this.
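
A side note on the dump above: the faulting instruction is MOVDQA, which faults when its memory operand is not aligned on a 16-byte boundary. The check below is a minimal sketch that applies this rule to the two addresses shown for that instruction pair in the dump. Whether misalignment is really the cause of this particular trap is an assumption and is not confirmed here; `is_aligned` is a made-up helper name.

```python
def is_aligned(address, boundary=16):
    """Return True if the address is a multiple of the boundary."""
    return address % boundary == 0


# Addresses copied from the "Failing Instruction" section of the dump.
LOAD_SOURCE = 0x1093ECC0        # MOVDQA XMM0, DQWORD [0x1093ecc0]
STORE_DESTINATION = 0x17A01588  # MOVDQA DQWORD [0x17a01588], XMM0 (marked ">")

if __name__ == "__main__":
    for name, address in (("load source", LOAD_SOURCE),
                          ("store destination", STORE_DESTINATION)):
        state = "aligned" if is_aligned(address) else "NOT 16-byte aligned"
        print("%-18s 0x%08X  %s" % (name, address, state))
```

The store destination ends in 0x8, so it sits 8 bytes past a 16-byte boundary, while the load source ends in 0x0.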

So, here is the other test build: http://rpm.netlabs.org/test/firefox-45.9.0-3.t2.pentium4.7z. It has JS compiled with -O3 -fno-reorder-functions -freorder-blocks -march=pentium4 and the rest (XUL etc.) is the same but -Os is used instead of -O3. Given that I don't see any effect from -fno-reorder-functions -freorder-blocks, I'm going to leave them out (and -freorder-blocks should be on anyway for both -O3 and -Os according to the GCC docs). The only thing I think these reorder options may affect is crashes in LIBC at exit. So those who experience them, please test to see if t2 is any different from t1 and GA1.1 in this regard.

@dspiatkowski

@dmik T2 test result: FF traps, the window frame shows up and it's "game over" after that. I've attached the trap dump.


Exception C0000005 - Access Violation


Process: G:\APPS\TCPIP\FIREFOX\FIREFOX.EXE (04/27/2018 08:18:16 50,900)
PID: 60 (96)
TID: 01 (1)
Slot: CF (207)
Priority: 200

Module: XUL
Filename: G:\APPS\TCPIP\FIREFOX\XUL.DLL (04/27/2018 08:18:16 29,284,524)
Address: 005B:11484AE5 (0001:02564AE5)
Cause: Unknown access fault


Failing Instruction


11484AD6 MOV EAX, [ESI+0x8] (8b46 08)
11484AD9 MOV [ESP+0x4], EAX (894424 04)
11484ADD MOV [ESP], EBP (892c24)
11484AE0 CALL 0x113fbab4 (e8 cf6ff7ff)
11484AE5 >MOVDQA XMM0, DQWORD [ESP+0x50] (660f6f4424 50)
11484AEB MOV DWORD [EBX], 0x0 (c703 00000000)
11484AF1 MOV DWORD [EBX+0x4], 0x0 (c743 04 00000000)
11484AF8 MOV DWORD [EBX+0x8], 0x0 (c743 08 00000000)


FF_T2_0060_01_TRP.zip

@dmik
Contributor Author

dmik commented May 7, 2018

Ok guys, please test another build: http://rpm.netlabs.org/test/ff45_9_0_t5.7z. The only difference from T4 is that I use -Os again for everything but JS (to prove that it's -O3 and -march=pentium4 w/o -mno-sse -mno-sse2 which are responsible for crashes). I also want to check how much -Os affects performance compared to -O3 for XUL. Given that with -Os the size of XUL (with debug info removed) drops from 40 MB to just 30 MB, we may consider using it if the performance drop is not significant because saving 10 MB in our tight shared memory arena is not that bad, actually.

@an64 well, plugin-container.exe should just not be there. It doesn't work right ATM (there is a respective ticket). IIRC, I had to remove it from RPM-based builds because that turned out to be the simplest way to completely disable out-of-process for plugins or such. So please remove it and see how it goes. Simple plugins should work then. I made them work this way for 45.x back then at least. If not, please find the latest release where they still do.

@dspiatkowski

@dmik Build T5 feedback: working fine here, no crash, no issues to report so far. Compared to builds T3 & T4 there is a noticeable improvement in YouTube VP9 playback. I would say it is on par with the previous working test release, which I think was T1 b/c T2 was trapping right away (sorry, I should have been logging these things here, or labelling my ticket updates accordingly, like I'm doing with this one).

@dmik
Contributor Author

dmik commented May 7, 2018

@dspiatkowski Interesting. T1 was also -Os. Are you sure it's not an observer's effect? -Os is optimize for size which should be generally slower than any other type of optimization (for the price of a better in-memory footprint). Do you have any numbers indicating that Youtube behaves better on T5 than on T4?

@dspiatkowski

@dmik Well, I am basing my YouTube feedback on how actual playback proceeds. It is either smooth, no video and/or audio drop-outs, or stutters, hangs, etc., or it is not. I have no hard numbers to use, other than the few screenshots of YouTube "nerd data" I have captured in the past as we were looking at potential network speed issues. All of these pointed to a significant amount of network speed availability.

Would capturing the YouTube "nerd data" for the very same video in both T4 & T5 help in any way?

@an64

an64 commented May 8, 2018

@dmik I don't understand...
In #229 you said that plugins can only work in OOP mode, and you made a lot of commits to support this. Isn't that right?

@an64

an64 commented May 8, 2018

t5 here shows slightly lower CPU usage on VP9 and H.264 playback, no hangs.
But sound on any version (38, 45, all builds) sometimes drops. Maybe some thread needs a higher priority?

@lerdmann

lerdmann commented May 8, 2018

Trying http://rpm.netlabs.org/test/ff45_9_0_t5.7z: with the YouTube web page open, a main window resize takes an extended period of time (I did not try this with http://rpm.netlabs.org/test/ff45_9_0_t4.7z, maybe I should). At least, initially. Sluggish scrolling behavior. YouTube video playback is sluggish, and moving the mouse cursor over the Firefox window gets the video out of sync with the sound (but I think it has been this way for quite some time). I am using "layout.frame_rate" = 0.
But no hangs or traps or the like, neither during normal operation nor on exit.
By the way, setting "layout.frame_rate" to 1200 still makes Firefox "cycle" if it is minimized.

@dmik
Contributor Author

dmik commented May 9, 2018

@an64 hmm I refreshed my memory and you are right, thanks for popping it up. Plugins should generally work in OOP mode now and no reason to remove/disable plugin-container.exe anymore. At least it was the case when I closed that issue. Can you please tell me what is the latest release where plugins work for you?

@dmik
Contributor Author

dmik commented May 9, 2018

@lerdmann remember t4 and t5 differ only in -O3 vs -Os for XUL.DLL (all code except JS). Please test t4 then. Though I don't think these options make that much difference so please make sure you don't have some weird frame_rate value (and restart FF each time you change it — it might not pick it up everywhere on the fly). And why you still see high CPU load when minimized and frame_rate != 0 is also a puzzle since nobody else is seeing that.

@dmik
Contributor Author

dmik commented May 9, 2018

@dspiatkowski re new YouTube nerd data, if you see any significant difference between the builds, then yes, post it.

@an64 regarding your sound issues, I doubt it has anything to do with FF per se. Please try to install the latest UNIAUD from Netlabs — there are reports it helps with sound in FF.

@lerdmann

lerdmann commented May 9, 2018

  1. yes, http://rpm.netlabs.org/test/ff45_9_0_t5.7z is not significantly worse than http://rpm.netlabs.org/test/ff45_9_0_t4.7z. Changing window size while a movie is playing takes a long time to finish for both versions
  2. I always close FF when I change "layout.frame_rate". I realized right away that apparently this value is only read once on program start.
  3. Specifying anything but -1 for "layout.frame_rate" will indeed eventually set all threads to "Blocked" on minimizing the window. But that apparently takes some time to happen. Strange.
  4. Specifying 10000 is equivalent to specifying 0 indeed. Typing something into an entry field is mostly ok (apart from some occasional blocking).

I wonder what "layers.offmainthreadcomposition.frame_rate" variable can do for us. It is set to -1.

I think FF is much too greedy in releasing its threads. Why would you want to handle ANY messages if the program is in the background ? I think that this WinPeekMsg is really counterproductive.

@dryeo
Contributor

dryeo commented May 9, 2018 via email

@dmik
Contributor Author

dmik commented May 10, 2018

@lerdmann okay, I see. Thanks for testing! layers.offmainthreadcomposition.frame_rate is irrelevant for us ATM as we have OMTC disabled for now (see #200 for details).

Regarding "greediness". This is how the Presentation Manager is designed. A PM application running a message queue is obliged to process all incoming messages as fast as possible in order for the whole PM desktop to function properly. There are many system messages which require immediate processing by the receiver (and that's besides the application-specific notifications mentioned by @dryeo). If an application needs more than a dozen milliseconds to process some message, it is supposed to do so asynchronously WRT reading and processing other incoming messages. And this is where the FF problem actually lies. It 1) emits too many messages and 2) takes too long to process some of them synchronously (i.e. on the same thread that is supposed to process other incoming messages). And this negatively affects the whole PM. While in theory one may indeed blame the PM design for that, it is what it is and we don't (and won't) have a different PM (and many other platforms have similar requirements). So it's FF which needs to behave properly here. It used to do so more or less, but a lot of things have changed in it since then, and modern platforms offer better parallelism and less strict requirements, which FF seems to utilize w/o caring about older systems that don't. And here lies a fundamental problem, as there might be just too many things to change in FF to make it behave like a native PM citizen again: so many that it might be equivalent to writing some of its subsystems from scratch (which is apparently beyond our resources given the general complexity and quality of the FF codebase).
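
The rule described above (keep the message thread responsive, do long work asynchronously) can be sketched in a platform-neutral way. This is not PM or Firefox code: the message names, `run_message_loop`, and the use of a single worker thread are all assumptions made for illustration.

```python
import queue
import threading
import time


def run_message_loop(messages, is_slow, slow_handler):
    """Dispatch messages in order; hand slow ones to a worker thread.

    Returns (handled, results): the order in which the loop got through the
    messages, and the results produced by the worker.
    """
    work = queue.Queue()
    handled = []
    results = []

    def worker():
        while True:
            message = work.get()
            if message is None:      # sentinel: no more work
                break
            results.append(slow_handler(message))

    thread = threading.Thread(target=worker)
    thread.start()
    for message in messages:
        if is_slow(message):
            work.put(message)        # returns at once, the loop is not blocked
        handled.append(message)      # the loop moves on to the next message
    work.put(None)
    thread.join()
    return handled, results


def slow_decode(message):
    time.sleep(0.05)                 # stands in for work well above ~10 ms
    return message.upper()
```

With this split, the loop reaches the messages that follow a slow one without waiting for the slow handler to finish.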

@dmik
Contributor Author

dmik commented May 10, 2018

One thing that FF surely employs here is hardware acceleration of 2D rendering, something we miss almost completely on OS/2. On modern platforms some composition and paint operations take much less time than on OS/2, which means that a paint request (which always arrives on the main thread) is processed faster and so are all other upcoming messages. Another thing is that major platforms have been using OMTC for some time already, and OMTC is also about transferring resource-greedy 2D rendering (especially if we take all those modern HTML5/CSS3 features into account) to other threads in order to reduce the main thread load and increase UI responsiveness. Things could have been slightly better if we had OMTC enabled on OS/2, but that's a task of its own.

@dmik
Contributor Author

dmik commented May 11, 2018

While trying to debug & understand the complex (well, very complex, I'd even say overcomplicated) FF messaging pipeline, I see one message that gets posted to the same window (most likely, the top one) at a very high rate (like 4-7 messages every 10 ms): 0xF588. This isn't any of the system messages, but I still can't find where it originates from. It's not something like WM_USER + XXX at least: all WM_USER-based ones that I could find in the source don't exceed 0x403 (WM_USER + 3). Perhaps its value is generated with WinAddAtom. I have to check that and find its origin. Most likely it's some wake-up message. But still I wonder why it happens so often.

@dmik
Contributor Author

dmik commented May 11, 2018

Ok, that was pretty simple. It's a special message used by nsAppShell to trigger native (PM) event processing from other (non-PM) event loops. Still, the question is why it happens that often. I need to analyze further.

@dmik
Contributor Author

dmik commented May 11, 2018

Somehow I guess that this special event is a cause of high CPU load (and perhaps of #248 as well). There is cross-platform logic like this (roughly): process native (in our case, PM) events for a maximum of 10 ms, then, when this maximum is reached, break off this processing and let other Firefox events get processed. And there is also a check: if there are more native events pending when this happens, then another native processing cycle is scheduled by posting this 0xF588 message to the native message queue. It turns out that under heavy load there are always some more messages to process, so 0xF588 gets posted over and over from within its own handler at very high rates, until eventually there are no new pending messages within the next 10 ms. This gives extremely high CPU load when such "recursion" happens, and the only way out of it is to wait until the messages get sorted out (which heavily depends on the hardware and the complexity of the web content, of course).
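
The behaviour described above can be reproduced with a toy model that uses simulated time. It is a rough sketch of the described logic only, not the nsAppShell implementation: the per-event cost, the arrival rates and the name `count_wakeups` are assumptions.

```python
WAKEUP_MSG = 0xF588   # message value reported in this thread
BUDGET_MS = 10        # native events are processed for at most this long


def count_wakeups(initial, arrivals_per_cycle, cost_ms=1, max_cycles=100):
    """Count how often the wake-up message would be re-posted.

    initial            -- native events already queued
    arrivals_per_cycle -- new native events arriving before each cycle
    cost_ms            -- simulated time needed to handle one event
    """
    pending = initial
    wakeups = 0
    for _ in range(max_cycles):
        pending += arrivals_per_cycle
        elapsed = 0
        # Process native events until the time budget is used up.
        while pending and elapsed + cost_ms <= BUDGET_MS:
            pending -= 1
            elapsed += cost_ms
        if not pending:
            break             # queue drained, no need to re-post
        wakeups += 1          # events still pending: post WAKEUP_MSG again
    return wakeups
```

In this model a small backlog drains without any re-post, a one-off backlog of 35 events needs three re-posts, and a steady arrival rate above what fits into the 10 ms budget re-posts on every cycle, which matches the "recursion" described above.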

Other platforms have various means to prevent this from happening but in general it all looks too complex and hackish. They clearly overdid all the logic there. Given that there is also a merge with the Chromium message loop here (which also integrates with PM on its own), it becomes just a nightmare....

I will try to apply various hacks too to reduce the rate of this special message and see if it helps. I still don't fully understand the logic.

@dryeo
Contributor

dryeo commented May 12, 2018 via email

@lerdmann

lerdmann commented May 12, 2018 via email

@an64

an64 commented May 13, 2018

@dmik
"Can you please tell me what is the latest release where plugins work for you?"
Flash does not work with any 45 release, only in 38.
I have now tested npwv (the WarpVision plugin) and it works with t5 and plugin-container.exe from t4.
Flash does not work. I have the new Odin, but maybe a new npflos2.dll with your fixes is needed. Can I download it somewhere?

@an64

an64 commented May 13, 2018

@dmik
"regarding your sound issues, I doubt it has anything to do with FF per se. Please try to install the latest UNIAUD from Netlabs"
I'm not sure; VLC with the KAI interface plays well, and sound drops only when CPU usage is near 100%.

@an64

an64 commented May 13, 2018

Setting layout.frame_rate=10000 makes menus, sliders and input fields react slowly and gives YouTube video a very low frame rate; the YouTube page contents are never shown, only the video.

@dmik
Contributor Author

dmik commented May 14, 2018

@an64 thanks for testing plugins. I will then just include plugin-container.exe in the archive. Re Odin, I believe libodin RPM 0.9.0-1 from netlabs-exp contains the latest version with the necessary fixes. Can you try it? If it's fine I will move it to rel. Re frame_rate, your results are really strange as they don't match what others are seeing. Are you sure you restarted FF etc.?

@dmik
Contributor Author

dmik commented May 14, 2018

BTW, applying Windows hacks does seem to help with #265 but doesn't help with the Gmail issue.

@dmik
Contributor Author

dmik commented May 18, 2018

It seems that this issue is more or less gone with the recent fixes. Closing this.

@dmik dmik closed this as completed May 18, 2018
@dmik
Contributor Author

dmik commented May 20, 2018

@an64 re flash, IIRC, it is only available to you if you have a Software Subscription from Mensys/ArcaOS or such.
