I was getting consistently very slow performance in Trine 2 on my Zenbook UX32VD (the Full HD model): about 8-9 fps at 1920x1080, regardless of the game settings. I tried other resolutions and performance was still subpar. Then I discovered that if the resolution width differed from the standard widths (for instance, if I set 1919x1079), FPS almost tripled to about 25. The same goes for lower resolutions.
Unfortunately this also affects fullscreen mode, where I can't change the resolution. Is this a known issue?
Interesting. No, it's not a known issue. What GPU do you have, the 620M? Does image quality look the same at 1919x1079? It could be that some costly image-enhancement pass is skipped at odd resolutions (do you really mean "regardless of game settings"?). Also, please paste logs captured with PRIMUS_VERBOSE=2 for both the slow and the fast case (you might need to remove nohup and redirections from the launcher).
620M, yes. The image quality is the same at the different resolution.
Yes, "regardless of game settings" means exactly that: no matter what I change in options.txt, the image quality changes but the FPS doesn't, in both the slow and the fast mode. That leads me to the conclusion that the 620M can handle the quality at reasonably high fps and that primus/bumblebee are the bottleneck. I can check how Windows handles Trine 2 on the same machine if you want?
I'll post the logs later.
Slow log (1920x990):
primus: profiling: readback: 0.1 fps, 99.8% app, 0.2% map, 0.0% wait
primus: profiling: display: 0.1 fps, 99.0% wait, 1.0% upload, 0.1% draw+swap
primus: profiling: readback: 6.5 fps, 25.2% app, 9.0% map, 65.9% wait
primus: profiling: display: 6.5 fps, 16.4% wait, 82.7% upload, 0.9% draw+swap
primus: profiling: readback: 7.8 fps, 0.0% app, 10.3% map, 89.7% wait
primus: profiling: display: 7.8 fps, 0.0% wait, 98.9% upload, 1.1% draw+swap
primus: profiling: readback: 7.7 fps, 0.0% app, 10.4% map, 89.6% wait
primus: profiling: display: 7.7 fps, 0.0% wait, 98.6% upload, 1.4% draw+swap
primus: profiling: readback: 7.7 fps, 0.0% app, 9.9% map, 90.1% wait
primus: profiling: display: 7.7 fps, 0.0% wait, 99.2% upload, 0.8% draw+swap
AL lib: FreeDevice: (0xe680cdf0) Deleting 12 Buffer(s)
Fast log (1919x990):
primus: profiling: readback: 0.1 fps, 99.8% app, 0.2% map, 0.0% wait
primus: profiling: display: 0.1 fps, 99.7% wait, 0.3% upload, 0.1% draw+swap
primus: profiling: readback: 19.8 fps, 23.2% app, 25.2% map, 51.6% wait
primus: profiling: display: 19.8 fps, 22.1% wait, 77.2% upload, 0.8% draw+swap
primus: profiling: readback: 24.8 fps, 0.0% app, 32.2% map, 67.8% wait
primus: profiling: display: 24.9 fps, 0.0% wait, 99.0% upload, 0.9% draw+swap
primus: profiling: readback: 25.1 fps, 0.0% app, 33.4% map, 66.6% wait
primus: profiling: display: 25.0 fps, 0.0% wait, 99.2% upload, 0.8% draw+swap
primus: profiling: readback: 25.3 fps, 0.1% app, 57.4% map, 42.5% wait
primus: profiling: display: 25.3 fps, 1.5% wait, 97.6% upload, 0.8% draw+swap
AL lib: FreeDevice: (0xe690cdf0) Deleting 12 Buffer(s)
No changes in settings, just one pixel :)
Hm, you seem to be bottlenecked on the Intel side. What is your Mesa version? Texture upload was optimized in Mesa's i965 driver in August; you might get better results with Mesa 9.
But it's really odd that you appear to be limited to just 7 fps.
I'm on Mesa 9, Intel driver 2.20.15 :)
I also tried with UXA and SNA, no visible difference.
It's also kind of strange that a single pixel makes a difference. This happens even at low resolutions: for instance 720p is slow (~20 fps), but 1279x720 is fast (~50 fps).
Just tested with Intel's 2.20.3 driver, same thing.
An interesting observation: the performance ratio (7.8/25.3 fps) is very similar to your Intel GPU base/max clock frequency ratio (350/1150 MHz, assuming an i7-3517U). Could be a red herring, though: the Intel GPU shouldn't be the frame limiter, especially not at 7 or 25 fps. Is there something else competing for Intel GPU resources (a compositor)? Have you tried other tests, not just Trine 2?
Yes, there is a compositor involved: Cinnamon's (Muffin, I think). However, I tried MATE, GNOME 3 fallback mode and Cinnamon 2D, and it was slow everywhere.
I also tried other games and Steam's Big Picture mode, and performance was equally slow in fullscreen.
The CPU is a 3517U, yes, though the max clock should be 1.9 GHz if I'm not mistaken.
Just tested: you're correct, at the slow resolutions the CPU fails to clock up to 1.9 GHz, while at the fast ones it does... This gets weirder by the minute :)
Now the question is just why :)
Just tried with TWM, same thing: the CPU clock stays at 800 MHz, even if I run Trine directly on the Intel GPU...
Update: changed the governor to performance; cpufreq-info reports 2.4 GHz, but the FPS stays the same...
Another update: running Trine directly on the Intel GPU doesn't scale the CPU frequency either, but it also doesn't show the resolution-dependent FPS behavior: different resolutions are equally slow.
I know this is a dirty hack, but changing the primus.dfns.glTexImage2D() call on line 358 to init with width+1 instead of width increases FPS at the slow resolutions. I can probably live with this until a better solution is found.
I meant the GPU clock, not the CPU, when mentioning 1150 MHz. You can try asking about this lack of GPU frequency scaling in the #intel-graphics IRC channel on irc.freenode.net.
Ah, you're right, the GPU scaling frequencies are 350/1150.
Since I'm on 3.7-rc8, I've used the new sysfs interface to control the integrated GPU's scaling, and it seems scaling has no effect on performance (maybe a placebo, but forcing the GPU down to 350 MHz actually added 1 FPS overall). Also, the GPU appears to be scaled correctly when in use.
In any case, Intel's GPU is probably not to blame after all.
The fact that your hack works is very important, because it eliminates the possibility that something is fishy on the nVidia side. I don't see why you say that the Intel GPU is not to blame.
Yes, the 620M seems to be working okay too. I don't know what to think anymore, except that clocking the Intel GPU up or down doesn't seem to have any effect on either the bad or the good performance, so it shouldn't be the scaling. There must be something else I'm overlooking.
Hmm, I managed to somewhat "reproduce" the bad performance starting from the good one :)
If I use a texture format in glTexImage2D() that requires automatic conversion (tried with GL_RGBA4), the performance drops to about 7 FPS. Is it possible that something like that happens at the bad resolutions?
That would be odd. Try asking in #intel-graphics
Hm, I discovered that disabling texture tiling has a positive effect on the framerate; at least lowering the fullscreen resolution now raises the framerate instead of having no effect whatsoever. For instance, Trine 2 at 1280x720 fullscreen now runs at ~35-40 FPS instead of ~25.
It's been a while. Any news on this issue?
Unfortunately I can't test right now, as I have issues with one of the SSDs in the laptop and have to do a clean OS install on the other first.
And what's new as of today?
If the problem is observed again, use perf and INTEL_DEBUG=perf PRIMUS_VERBOSE=2 to investigate further. The situation is probably different with newer Intel drivers, and it's not a primus bug anyway, so: closing.
Yes, I can confirm now that the issue has been resolved with newer Intel drivers. Thanks.