youtube.com takes really long time to load #266
Comments
Note that builds from Dave Yeo (e.g. https://bitbucket.org/dryeo/dry-comm-esr31/downloads/firefox-45.9.0.en-US.os2.zip) don't make any difference here. I couldn't make it hang so far, but I can't make my own test builds (or our official RPM builds) hang any more either. I doubt that the compiler optimization options (the only thing that is different in Dave's builds, apart from the fact that he also uses the unofficial gcc 5.1.0 build) play any significant role here. It looks more like a timing issue which gets triggered at different moments in different builds depending on the optimization options. Which means that Dave's builds are subject to these hangs too, sooner or later. One possible workaround for the issue is to use a clean Mozilla profile (by renaming |
@dmik After several days of running the GA1.1 drop I noticed that the video performance actually got worse compared to the previous test builds. Specifically, the native YouTube video codec (VP9) would simply hang at start (both GA1.0 and GA1.1), exhibiting the symptoms you described above. However, using the h264ify add-on allowed me to play back videos smoothly. The looping sound was addressed with the upgraded UniAudio drivers, so prior to the GA1.1 build I finally had well-working YouTube playback. Following the upgrade, VP9 continued to be a problem, just as it was before, but to make matters worse the h264ify add-on no longer produced smooth playback, it now stutters. A couple of days ago Dave released his build. I installed it and for the first time ever can successfully play back the YouTube VP9 stuff, no more need to run the h264ify add-on, the videos simply play. Smooth, no stuttering, etc. His build also does not show the larger CPU consumption, although compared to the previous releases, such as GA1.0, the CPU usage is still higher. I previously commented on that in #265. I have a Fibre-120 (125 Mbit/s) network connection, with sustained high data rates to YouTube, therefore I believe it is not a network speed issue. Whatever changes Dave implemented (it could be just GCC 5.1.0 or the other optimizations), they seem to have been a step in the right direction from my perspective. Dave's release is also an i686 build, so I would like to test a pentium4 build for a closer 'apples to apples' comparison. Not sure that has anything to do with it, but it's another question mark that we can address. |
@dspiatkowski still, what you say doesn't prove it's not a timing issue. I'm pretty sure you will see similar problems with Dave's build one day. I will do a test build with the same optimizer options Dave used, but even that won't be a proof. It simply must not depend on those options this way. And if it does, something is wrong somewhere else (kernel task scheduling, network stack, bad Firefox code design, whatever). |
@dmik I agree, not a hard proof at all, but it is the only working configuration I seem to have at the moment and so I share my experience in hopes of providing additional data points for analysis. For what it's worth, given that I see a consistent outcome (and a different video playback result) between these two builds, I would be more than happy to do whatever additional data capture/debug you need to try to narrow this one down. In #264 I included screenshots of the YouTube video 'Nerd Info' for that very reason, to show side-by-side VP9 vs h264ify outcomes. As others have pointed out in the OS2World thread, they are also seeing different results. There is something else I wanted to point out that pertains to the behaviour of Dave's build and is consistent with what I have seen in the past, and that included the BWW FF builds. When using an i686 build of FF I will sometimes get a solid 100% CPU spike that persists for about 5-10 sec (usually) and then all of a sudden goes away. Your GA1.0 and 1.1 would never show this, but again, these were pentium4 builds. Dave's test is an i686 build and it shows these very same spikes once again. This is part of the reason why I asked Dave to produce a pentium4 build. Given the various results people are reporting, I wonder if there is a specific correlation to the RPM/YUM platform settings and how this affects FF. After all, something as simple as having ffmpeg libraries installed for video playback will get you either pentium4 or i686 versions. Maybe this is all that is required to cause the sort of timing issues that you are concerned about? |
@dspiatkowski thanks for the feedback, but so far I have no idea what info you could provide besides what you already said. We need at least some hint to understand what's behind all of it. But I doubt it's just the machine type. Here I test both pentium4 and i686 and haven't noticed any consistent difference. Maybe finishing the profiler code (#264, your reference was to #242 I believe) and actually enabling profiling could shed some light on it. |
I've uploaded a test build with the same optimization as Dave's as http://rpm.netlabs.org/test/firefox-45.9.0-3.t1.i686.7z (and it's also i686, yes). Please test. And compare with the official GA1.1. |
This build plays YouTube videos fine. I never could get the previous build to play one: YouTube would suggest restarting my device, and waiting or pressing F5 to reload didn't help. |
This build also shuts down cleanly, unlike the previous build which would leave the icon hatched until I used TOP or such to kill it. |
Confirming Dave's findings here as well. There appears to be a substantial difference between GA1.1 and this T1 build. Not only does FF now play the YouTube VP9 videos, but they are pretty smooth (short of some stutter when the system is being heavily tasked by other processes; I actually stress tested it by having a full-screen FF playback while opening an OpenOffice document), and even running the h264ify add-on also gives smooth playback. Basically VP9 quality now matches (if not actually exceeds) the h264ify playback quality here. Also, what I noticed is that this test build appears to be less CPU cycle hungry. I've only got a few hours of runtime at the moment, so this could be just the result of limited browsing when FF is generally well behaved. However, even compared to a similar runtime on GA1.1 the CPU appears to be less tasked. Further update to follow, so this last comment may be a bit premature... |
@dryeo are you sure about GA1.1? Can you just wait for several minutes? Can you let FF create a new profile? As I said, both work here but I had to wait and Ctrl-R my GA1.1 build to make it work. And since then it works like a charm. And I still don’t see how compiler optimization could break logic here. So it still looks like a random (=timing) issue to me, which is just less frequent with different optimization options for some reason. At least we proved that GCC 5.1.0 is not involved here. I will also try full -O3 here and pentium4 to see if it makes any difference. Regarding the default of -O2, it’s not an OS/2-only thing, this is the default on many other platforms too. I don’t know the reasons behind that though. I guess it’s just a balance between the size and the speed. Again, this should not make a difference function-wise. If it does, something is seriously wrong somewhere else. |
And I will just state it once again that the only difference between ga1.1 and t1 is forcing -O3 for JS and -Os for the rest and -march=i686 instead of pentium4. The rest is fully identical. |
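For readers keeping track of the builds, here is a small Python sketch that lists the flag sets and prints what differs. The t1 values come from the comment above; the GA1.1 values (-O2, -march=pentium4) are the ones stated for the GA1.1 7z build later in this thread. The dictionary keys ("js", "rest", "march") and the helper name are just illustrative labels, not anything from the actual build system.

```python
# Flag sets as described in this thread; the keys are illustrative labels only.
GA1_1 = {"js": "-O2", "rest": "-O2", "march": "pentium4"}
T1 = {"js": "-O3", "rest": "-Os", "march": "i686"}


def diff_flags(a, b):
    """Return {key: (value_in_a, value_in_b)} for every setting that differs."""
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


if __name__ == "__main__":
    for key, (old, new) in diff_flags(GA1_1, T1).items():
        print(f"{key}: {old} -> {new}")
```

All three settings differ between the two builds, which is why neither test build alone can tell which option matters.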
Note also that it’s not -march=pentium4 either as 45.5.0 released in May 2017 is also i686 and it was broken here too when I found 45.9.0 to be broken. Now both work. |
With t1, I get good YouTube playback on Lenovo ThinkCentre Tiny M92p |
Using GA1.1, new profile, waiting for 10 minutes just sees different ads
trying to play, same with CTRL-R, F5, and CTRL-F5 all result in the same
hang. National Film Board of Canada (NFB) videos do play, but they're
likely MP4.
This is a LTE connection if that matters.
See the hang on exit again as well.
I suspect it is JS that needs the -O3, though that needs testing and
then more testing to see which option is required, which at 2.5 hours a
build would be slow :(
We could be seeing a compiler bug.
|
Well, different ads? Hmm, are we talking about the same thing? Here I saw a blank page in all 45.* builds when it broke and some constant loading progress of the background content (like scripts and so on). Can you give me a screenshot of GA1.1 when it’s trying to open YouTube.com? Maybe several if it changes while you wait. |
Yes, Youtube wants me to watch ads before the video I clicked, it would
paint the first frame and then do the circle thing and eventually
suggest restarting my device.
I have to go to work but will try to post a screenshot later.
|
Today I built Firefox with GCC 5.1.0 and -O2 -march=pentium4 and it consistently plays videos here. |
I now tried dmiks test build:
I have tried this with the 14.106 SMP kernel as well as with the latest OS4 kernel. I have an 8-core AMD system. What I could try is to enable only one core per package (where a package contains 2 cores) to avoid any HT issues, but the past has shown that this helps only marginally. |
On Wed, 25 Apr 2018 23:08:24 -0700 lerdmann wrote:
> I now tried dmik's test build:
> 1) playability of YouTube videos is entirely tied to using the h264ify
> add-on. If I have it installed I
> can play YouTube videos, if I don't have it installed I cannot. It does
> not matter what build I use.
By saying you have it "installed", do you mean installed AND Enabled?
My testing was always done with the add-on installed but in both Enabled and
Disabled states. I would expect the Disabled setting to entirely move that
code out of execution, but maybe that is not the case after all?
|
@dryeo well, ok, your situation is different from what this ticket originates from. Your case looks more like #242. Anyway, I'm sure the reason behind both problems is the same. However, -O2 and -march=pentium4 is exactly what the GA1.1 build in 7z is (except GCC, but my t1 is a proof that it doesn't matter). Which, in turn, proves that compiler options are just a side effect here and the main problem is timing. My current guess is that the new JVM "task scheduler" (that was heavily changed after Firefox 38) is very unstable on OS/2, so that some JS scripts start starving like hell, especially when several are executed in parallel. Looks like we miss some platform-specific implementation detail and the generic code is just too dumb (it's also possible that the new JVM scheduler design is just not good). All this requires quite complex research. Anyway, I will try another build: -O3 and -march=pentium4 to see if it makes any difference. @lerdmann re |
Interesting that this problem only seems to exist in the latest release.
I just tried the firefox 45.9.0-2 rpm and that plays youtube videos fine.
I'm also pretty sure that the parent.lock file has always stayed in the
profile
|
BTW, I'm looking at the optimization options of other platforms now. In fact, current platforms like Linux and Darwin use -O3 by default. I'm also seeing they play around with -freorder-functions and -freorder-blocks, which reorder code in the executable to make it more local and reduce the number of branches. And -freorder-functions actually requires support from the linker, which we might not have. The GCC documentation is a bit contradictory about these options (as usual): one place says that both are enabled in -O2, -O3 and -Os, while the other place says that -Os disables -freorder-blocks as well as some code alignment optimizations. This might be the reason why timings change that much when using different options. I might also try to disable -freorder-functions on OS/2 given there is no support in the linker. I see that Android builds use -Os for XUL, -O3 for JS and in both cases disable function reordering but enable block reordering. BTW, Linux also uses -Os for XUL and -O3 for JS. However, it doesn't disable function reordering. |
Reminds me of trying to load Youtube on dial-up, which sometimes sorta
worked and sometimes similarly hung.
|
It must be a timing problem or such. I now went back to GA 1.1 with h264ify still being activated. As to "parent.lock": it's fairly obvious that this serves some sort of notification (the file always has zero content). And in the past people stated that having that file in your profile will eventually lead to Firefox asking you to refresh your profile (killing all your preferences and such). So yes I am fairly sure it has to go away on FF shutdown. |
BTW, an attempt to build XUL with -O3 failed, XPCSHELL.EXE fails with this:
Given that there is no significant benefit from -O3 (and Linux builds specifically use -Os for some reason), I'm not going to debug this. So, here is the other test build: http://rpm.netlabs.org/test/firefox-45.9.0-3.t2.pentium4.7z. It has JS compiled with |
@dmik T2 test result: FF traps, the window frame shows up and it's "game over" after that. I've attached the trap dump.
Exception C0000005 - Access Violation
Process: G:\APPS\TCPIP\FIREFOX\FIREFOX.EXE (04/27/2018 08:18:16 50,900)
Module: XUL
Failing Instruction 11484AD6 MOV EAX, [ESI+0x8] (8b46 08) |
Ok guys, please test another build: http://rpm.netlabs.org/test/ff45_9_0_t5.7z. The only difference from T4 is that I use @an64 well, |
@dmik Build T5 feedback: working fine here, no crash, no issues to report so far. Compared to builds T3 & T4 there is a noticeable improvement in YouTube VP9 playback, I would say it is on par with the previous working test release, which I think was T1 because T2 was trapping right away (sorry, I should have been logging these things here, or labelling my ticket updates accordingly, like I'm doing with this one). |
@dspiatkowski Interesting. T1 was also -Os. Are you sure it's not an observer effect? -Os optimizes for size, which should generally be slower than any other type of optimization (in exchange for a smaller in-memory footprint). Do you have any numbers indicating that YouTube behaves better on T5 than on T4? |
@dmik Well, I am basing my YouTube feedback on how actual playback proceeds. It is either smooth, no video and/or audio drop-outs, or stutters, hangs, etc., or it is not. I have no hard numbers to use, other than the few screenshots of YouTube "nerd data" I have captured in the past as we were looking at potential network speed issues. All of these pointed to a significant amount of network speed availability. Would capturing the YouTube "nerd data" for the very same video in both T4 & T5 help in any way? |
t5 here shows slightly lower cpu usage on vp9 and h264 playback, no hangs |
Trying http://rpm.netlabs.org/test/ff45_9_0_t5.7z: with the YouTube web page open, a main window resize takes an extended period of time (I did not try this with http://rpm.netlabs.org/test/ff45_9_0_t4.7z, maybe I should), at least initially. Sluggish scrolling behavior. YouTube video playback is sluggish, and moving the mouse cursor over the Firefox window will get the video out of sync with the sound (but I think it has been this way for quite some time). I am using "layout.frame_rate" = 0. |
@an64 hmm I refreshed my memory and you are right, thanks for popping it up. Plugins should generally work in OOP mode now and no reason to remove/disable plugin-container.exe anymore. At least it was the case when I closed that issue. Can you please tell me what is the latest release where plugins work for you? |
@lerdmann remember t4 and t5 differ only in -O3 vs -Os for XUL.DLL (all code except JS). Please test t4 then. Though I don't think these options make that much difference so please make sure you don't have some weird frame_rate value (and restart FF each time you change it — it might not pick it up everywhere on the fly). And why you still see high CPU load when minimized and frame_rate != 0 is also a puzzle since nobody else is seeing that. |
@dspiatkowski re new YouTube nerd data, if you see any significant difference between the builds, then yes, post it. @an64 regarding your sound issues, I doubt it has anything to do with FF per se. Please try to install the latest UNIAUD from Netlabs — there are reports it helps with sound in FF. |
I wonder what "layers.offmainthreadcomposition.frame_rate" variable can do for us. It is set to -1. I think FF is much too greedy in releasing its threads. Why would you want to handle ANY messages if the program is in the background ? I think that this WinPeekMsg is really counterproductive. |
On 05/09/18 10:03 AM, lerdmann wrote:
> I think FF is much too greedy in releasing its threads. Why would you
> want to handle ANY messages if the program is in the background? I
> think that this WinPeekMsg is really counterproductive.
Notifications. Here, downloads finishing or ChatZilla conversations
mentioning me by name cause SM to display a notification on the lower
right of the desktop. There are other notifications that are supported on
other platforms.
|
@lerdmann okay, I see. Thanks for testing! Regarding "greediness". This is how the Presentation Manager is designed. A PM application running a message queue is obliged to process all incoming messages as fast as possible in order for the whole PM desktop to function properly. There are many system messages which require immediate processing by the receiver (and that's besides the application-specific notifications mentioned by @dryeo). If an application needs more than a dozen milliseconds to process some message, it is supposed to do so asynchronously WRT reading and processing other incoming messages. And this is where the FF problem actually lies. It 1) emits too many messages and 2) takes too long to process some of them synchronously (i.e. on the same thread that is supposed to process other incoming messages). And this negatively affects the whole PM. While in theory one may indeed blame the PM design for that, it is what it is and we don't (and won't) have a different PM (and many other platforms have similar requirements). So it's FF which needs to behave properly here. It used to do so more or less, but a lot of things have changed in it since then, and modern platforms offer better parallelism and less strict requirements, which FF seems to utilize w/o caring about older systems that don't. And here lies a fundamental problem, as there might be just too many things to change in FF to make it behave like a native PM citizen again, so many that it might be equivalent to writing some of its subsystems from scratch (which is apparently beyond our resources given the general complexity and quality of the FF codebase). |
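To illustrate the rule described above (slow work is done asynchronously so the message thread keeps reading its queue), here is a minimal, generic Python sketch. It is not PM or Firefox code: the "fast"/"slow" message kinds, the function name and the single worker thread are all assumptions made up for the example.

```python
import queue
import threading

_STOP = object()  # private sentinel telling the worker to exit


def run_message_loop(messages, slow_handler):
    """Drain `messages` on the calling thread, offloading slow ones to a worker.

    `messages` is a list of (kind, payload) tuples; kind "slow" marks work that
    would block the loop for too long (a hypothetical classification).
    """
    work = queue.Queue()
    slow_results = []

    def worker():
        # Runs on its own thread so slow handlers never block the loop.
        while True:
            payload = work.get()
            if payload is _STOP:
                return
            slow_results.append(slow_handler(payload))

    thread = threading.Thread(target=worker)
    thread.start()

    handled_inline = []
    for kind, payload in messages:
        if kind == "slow":
            work.put(payload)               # returns at once; keep reading messages
        else:
            handled_inline.append(payload)  # cheap message, handled right here

    work.put(_STOP)
    thread.join()  # only for the demo: wait until the worker has finished
    return handled_inline, slow_results


if __name__ == "__main__":
    msgs = [("fast", "paint"), ("slow", "decode"), ("fast", "mouse move")]
    print(run_message_loop(msgs, lambda p: p.upper()))
```

The point of the design is that the loop never calls `slow_handler` itself, so the time it takes to get through the queue does not depend on how long the slow work runs.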
One thing that FF surely employs here is hardware acceleration of 2d rendering, something we miss almost completely on OS/2. On modern platforms some composition and paint operations take much less time than on OS/2, which means that a paint request (which always arrives on the main thread) is processed faster and so are all other upcoming messages. Another thing is that major platforms have been using OMTC for some time already, and OMTC is also about transferring resource-greedy 2d rendering (especially if we take all those modern HTML5/CSS3 features into account) to other threads in order to reduce the main thread load and increase UI responsiveness. Things could have been slightly better if we had OMTC enabled on OS/2, but that's a task of its own. |
While trying to debug & understand the complex (well, very complex, I'd even say overcomplicated) FF messaging pipeline, I see one message that gets posted to the same window (most likely, the top one) at a very high rate (like 4-7 messages every 10 ms): 0xF588. This isn't any of the system messages, but I still can't find where it originates from. It's not something like WM_USER + XXX at least; all WM_USER based ones that I could find in the source don't exceed 0x403 (WM_USER + 3). Perhaps its value is generated with WinAddAtom. I have to check that and find its origin. Most likely it's some wake-up message. But still I wonder why it happens so often. |
Ok, that was pretty simple. It's a special message used by |
Somehow I guess that this special event is a cause of high CPU load (and perhaps of #248 as well). There is a cross-platform logic like this (roughly): process native (in our case, PM) events for a maximum of 10 ms, then, when this maximum is reached, break this processing and let other Firefox events get processed. And there is also a check: if there are more native events pending when this happens, then another native processing cycle is scheduled by posting this 0xF588 message to the native message queue. It turns out that under heavy load there are always more messages to process, so 0xF588 gets posted over and over from within its own handler at very high rates, until eventually there are no new pending messages within the next 10 ms. This gives extremely high CPU load when such "recursion" happens, and the only way out of it is to wait until the messages get sorted out (which heavily depends on the hardware and the complexity of the web content, of course). Other platforms have various means to prevent this from happening, but in general it all looks too complex and hackish. They clearly overdid all the logic there. Given that there is also a merge with the Chromium message loop here (which also integrates with PM on its own), it becomes just a nightmare... I will try to apply various hacks too to reduce the rate of this special message and see if it helps. I still don't fully understand the logic. |
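The re-posting behaviour described above can be reproduced with a toy model. The Python sketch below is only an illustration under made-up assumptions, not the actual Firefox logic: time is simulated in whole milliseconds, one native event arrives every `interval_ms`, each takes `cost_ms` to handle, and the loop idles for `IDLE_MS` when the queue is empty. Only the 10 ms budget and the "re-post if events are still pending" rule are taken from the comment.

```python
BUDGET_MS = 10  # native-event budget per cycle, as described in the thread
IDLE_MS = 1     # assumed pause when the queue is empty (made up for the model)


def count_reposts(interval_ms, cost_ms, total_ms):
    """Count cycles that end with pending events, i.e. wake-up message re-posts.

    interval_ms: one native event arrives every `interval_ms` (hypothetical load)
    cost_ms:     time needed to handle one native event (hypothetical)
    total_ms:    length of the simulated run
    """
    now = 0        # simulated clock in ms
    enqueued = 0   # arrivals already placed in the queue
    pending = 0    # events waiting to be handled
    reposts = 0
    while now < total_ms:
        cycle_start = now
        while True:
            # Enqueue everything that has arrived up to the current time.
            arrived = now // interval_ms + 1
            pending += arrived - enqueued
            enqueued = arrived
            if pending == 0 or now - cycle_start >= BUDGET_MS:
                break
            pending -= 1
            now += cost_ms  # handling an event takes time; more may arrive
        if pending > 0:
            reposts += 1    # budget used up with work left: post the wake-up again
        else:
            now += IDLE_MS  # queue drained: wait for new events
    return reposts


if __name__ == "__main__":
    print("light load:", count_reposts(interval_ms=50, cost_ms=1, total_ms=1000))
    print("heavy load:", count_reposts(interval_ms=1, cost_ms=2, total_ms=1000))
```

With these made-up numbers the light-load run never re-posts the wake-up message, while the heavy-load run re-posts it at the end of every 10 ms cycle, because events arrive faster than they can be handled.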
On 05/10/18 06:13 AM, Dmitriy Kuminov wrote:
> One thing that FF surely employs here is hardware acceleration of 2d
> rendering, something we miss almost completely on OS/2
Actually, if using SNAP, we do get hardware acceleration, using DIVE.
Case in point, for the hell of it, I installed the latest beta of ArcaOS
with SNAP instead of Panorama. Drawing PM programs is super slow, doing
a window drag with animation on is very jerky and scrolling is very
slow: with a large text file, I can spin the mouse wheel and sit back and
watch it scroll for minutes, with the PM blocked.
With SeaMonkey, I get fast scrolling, quick page draws and such and they
actually feel as fast or faster than with Panorama.
Typing this message in TB is still slow though.
|
You get hardware acceleration for those chips that are supported by SNAP.
Else you get the same old lousy support that GENGRADD has to offer which
of course is worse than Panorama (if you have shadow buffering enabled
in Panorama which is a trick to speed up things).
DIVE does not mean HW acceleration. It just means that the device driver can
write data ("draw") to the screen aperture directly instead of going
through GPI calls.
|
@dmik |
@dmik |
Setting layout.frame_rate=10000 makes menu sliders and input fields react slowly, gives YouTube video a very slow frame rate, and the YouTube page contents are never shown, only the video. |
@an64 thanks for testing plugins. I will then just include |
BTW, applying Windows hacks does seem to help with #265 but doesn't help with the Gmail issue. |
Seems that this issue is more or less gone with the recent fixes. Closing this. |
@an64 re flash, IIRC, it is only available to you if you have a Software Subscription from Mensys/ArcaOS or such. |
It appears that sometimes opening https://youtube.com takes minutes before anything appears on the page. In the meantime you only see the progress ring spinning and messages such as "Read www.youtube.com", "Connecting to s.ytimg.com..." and the like on the status tooltip at the bottom left corner of the page. You may make it work faster by reloading the page several times with Ctrl+R (in this respect it's similar to #242). All 45.9.0 builds as well as 45.5.0 from May 2017 are affected, while 38.x and earlier builds seem not to be.
My current guess is still that it has something to do with the network connection. However, it looks like it's on the Firefox side, not on the TCP/IP stack side. It might be some security issues, missing certificates or such and delays in the connection caused by them. At least I sometimes see a lot of errors in JS regarding certificates. I need to study it closer. The problem is, as usual, that the failure is irregular and once it starts working, it's quite hard to make it fail again.