Slow performance on some video modes #33

Closed
bundyo opened this Issue Dec 10, 2012 · 23 comments

3 participants

@bundyo
bundyo commented Dec 10, 2012

I was getting consistently very slow performance on my Zenbook UX32VD (the FullHD one): about 8-9 fps at 1920x1080 in Trine 2, regardless of the game settings. I tried switching to other resolutions and the performance was still subpar. Then I discovered that if the resolution width differed from the standard resolutions (for instance, if I set 1919x1079), FPS almost tripled to about 25. The same goes for lower resolutions.

Unfortunately this also affects fullscreen and I can't change the resolution there. Is this a known issue?

@amonakov
Owner

Interesting. No, it's not a known issue. What GPU do you have, the 620M? Does image quality look the same in 1919x1079? It could be that some costly image enhancement is omitted in odd resolutions (do you really mean "regardless of game settings"?). Also, please paste logs with PRIMUS_VERBOSE=2 for both the slow and the fast case (you might need to remove nohup and redirections from the launcher).

@bundyo
bundyo commented Dec 10, 2012

620M, yes. The image quality is the same in the different resolution.

Yes, "regardless of game settings" means exactly that: no matter what I change in options.txt, the image quality changes but the FPS doesn't, in both the slow and the fast mode. That leads me to the conclusion that the 620M is able to handle the quality at a reasonably high fps, and primus/bumblebee are the bottleneck. I can check how Windows handles Trine 2 on the same machine if you want?

I'll post the logs later.

@bundyo
bundyo commented Dec 10, 2012

Slow log (1920x990):

primus: profiling: readback: 0.1 fps, 99.8% app, 0.2% map, 0.0% wait
primus: profiling: display: 0.1 fps, 99.0% wait, 1.0% upload, 0.1% draw+swap
primus: profiling: readback: 6.5 fps, 25.2% app, 9.0% map, 65.9% wait
primus: profiling: display: 6.5 fps, 16.4% wait, 82.7% upload, 0.9% draw+swap
primus: profiling: readback: 7.8 fps, 0.0% app, 10.3% map, 89.7% wait
primus: profiling: display: 7.8 fps, 0.0% wait, 98.9% upload, 1.1% draw+swap
primus: profiling: readback: 7.7 fps, 0.0% app, 10.4% map, 89.6% wait
primus: profiling: display: 7.7 fps, 0.0% wait, 98.6% upload, 1.4% draw+swap
primus: profiling: readback: 7.7 fps, 0.0% app, 9.9% map, 90.1% wait
primus: profiling: display: 7.7 fps, 0.0% wait, 99.2% upload, 0.8% draw+swap
AL lib: FreeDevice: (0xe680cdf0) Deleting 12 Buffer(s)

Fast log (1919x990):

primus: profiling: readback: 0.1 fps, 99.8% app, 0.2% map, 0.0% wait
primus: profiling: display: 0.1 fps, 99.7% wait, 0.3% upload, 0.1% draw+swap
primus: profiling: readback: 19.8 fps, 23.2% app, 25.2% map, 51.6% wait
primus: profiling: display: 19.8 fps, 22.1% wait, 77.2% upload, 0.8% draw+swap
primus: profiling: readback: 24.8 fps, 0.0% app, 32.2% map, 67.8% wait
primus: profiling: display: 24.9 fps, 0.0% wait, 99.0% upload, 0.9% draw+swap
primus: profiling: readback: 25.1 fps, 0.0% app, 33.4% map, 66.6% wait
primus: profiling: display: 25.0 fps, 0.0% wait, 99.2% upload, 0.8% draw+swap
primus: profiling: readback: 25.3 fps, 0.1% app, 57.4% map, 42.5% wait
primus: profiling: display: 25.3 fps, 1.5% wait, 97.6% upload, 0.8% draw+swap
AL lib: FreeDevice: (0xe690cdf0) Deleting 12 Buffer(s)

No changes in settings, just one pixel :)
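For comparing runs like the two above, a small parser can pull the steady-state numbers out of the profiling lines. This is only a sketch: the regular expressions are inferred from the log lines pasted in this thread, not from the primus source, and the helper names (parse_profile, steady_state) are made up for illustration.

```python
import re

# Matches lines such as:
#   primus: profiling: display: 7.7 fps, 0.0% wait, 98.6% upload, 1.4% draw+swap
# The pattern is inferred from the logs in this thread (an assumption).
LINE_RE = re.compile(r"primus: profiling: (\w+): ([\d.]+) fps, (.*)")
PART_RE = re.compile(r"([\d.]+)% ([\w+]+)")


def parse_profile(log_text):
    """Return a list of (thread, fps, {stage: percent}) tuples."""
    records = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue  # skip unrelated output such as the "AL lib" line
        thread, fps, rest = match.groups()
        stages = {name: float(pct) for pct, name in PART_RE.findall(rest)}
        records.append((thread, float(fps), stages))
    return records


def steady_state(records, thread):
    """Return the last record for the given thread, i.e. after warm-up."""
    matching = [r for r in records if r[0] == thread]
    return matching[-1]


# Last two lines of each log posted above.
slow_log = """\
primus: profiling: readback: 7.7 fps, 0.0% app, 9.9% map, 90.1% wait
primus: profiling: display: 7.7 fps, 0.0% wait, 99.2% upload, 0.8% draw+swap
"""
fast_log = """\
primus: profiling: readback: 25.3 fps, 0.1% app, 57.4% map, 42.5% wait
primus: profiling: display: 25.3 fps, 1.5% wait, 97.6% upload, 0.8% draw+swap
"""

slow = steady_state(parse_profile(slow_log), "display")
fast = steady_state(parse_profile(fast_log), "display")
print(slow)
print(fast)
```

In both cases the display thread spends nearly all of its time in the upload stage; only the frame rate differs.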

@amonakov
Owner

Hm, you seem to be bottlenecked on the Intel side. What is your Mesa version? Texture upload was optimized in Mesa/i965 in August, you might get better results with Mesa 9.

But it's really odd that it looks like you're limited to just 7 fps.

@bundyo
bundyo commented Dec 10, 2012

I'm with Mesa 9, Intel driver 2.20.15 :)

@bundyo
bundyo commented Dec 10, 2012

I also tried with UXA and SNA, no visible difference.

It's also kind of strange that one pixel makes a difference (this happens even at low resolutions: for instance 720p is slow (~20 fps), but 1279x720 is fast (~50 fps)).

Just tested with Intel's 2.20.3 driver, same thing.

@amonakov
Owner

An interesting observation: the performance ratio (7.8/25.3 fps) is very similar to your Intel GPU base/max clock frequency ratio (350/1150 MHz), assuming an i7-3517U. It could be a red herring though: the Intel GPU shouldn't be the frame limiter, especially not at 7 or 25 fps. Is there something else competing for Intel GPU resources (a compositor)? Have you tried other tests, not just Trine 2?
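The comparison can be checked with a couple of divisions. The numbers are the ones quoted in this thread; the variable names are only for illustration.

```python
# Steady-state frame rates from the slow and fast logs in this thread.
slow_fps, fast_fps = 7.8, 25.3
# Intel GPU base and max clocks quoted above (assuming an i7-3517U).
base_mhz, max_mhz = 350, 1150

fps_ratio = slow_fps / fast_fps
clock_ratio = base_mhz / max_mhz

print(f"fps ratio:   {fps_ratio:.3f}")
print(f"clock ratio: {clock_ratio:.3f}")
```

The two ratios agree to within about half a percentage point, which is suggestive but, as noted, not proof of a causal link.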

@bundyo
bundyo commented Dec 10, 2012

Yes, there is a compositor involved - Cinnamon's one (Muffin I think). However I tried with MATE, Gnome 3 Fallback mode and Cinnamon 2D and it was slow everywhere.

I also tried other games and Steam's Big Picture, and performance was equally slow in fullscreen.

The CPU is 3517, yes, though max clock should be at 1.9GHz if I'm not mistaken.

@bundyo
bundyo commented Dec 10, 2012

Just tested - you're correct, the slow resolutions fail to clock the CPU at 1.9GHz, while the fast ones do... Gets weirder by the minute :)

Now the question is just why :)

Just tried with TWM, same thing - the CPU clock stays at 800MHz - even if I run Trine directly on the Intel CPU...

Update: Changed the governor to performance, cpufreq-info reports 2.4GHz, however FPS stays the same...

Another update: running Trine directly on the CPU doesn't scale the frequency, but it also doesn't have the different FPS behavior - different resolutions are equally slow.

@bundyo
bundyo commented Dec 10, 2012

I know this is a dirty hack, but changing primus.dfns.glTexImage2D() on line 358 to init with width+1 instead of width increases FPS in the slow resolutions. I can probably live with this until a better solution is found.
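One thing the width+1 hack changes is the size of each pixel row in bytes. The sketch below only does that arithmetic; it assumes 4 bytes per pixel (an assumption, not something checked against the primus source) and it does not show that row size is the cause of the slowdown.

```python
def row_bytes(width, bytes_per_pixel=4):
    """Bytes in one row of pixels; 4 bytes per pixel is an assumption."""
    return width * bytes_per_pixel


def largest_pow2_divisor(n):
    """Largest power of two that divides n (isolates the lowest set bit)."""
    return n & -n


# Standard widths from this thread next to their off-by-one neighbours.
for width in (1920, 1919, 1921, 1280, 1279):
    size = row_bytes(width)
    print(f"width {width}: {size} bytes per row, "
          f"divisible by {largest_pow2_divisor(size)}")
```

Under that assumption the standard widths give rows that are multiples of 512 or 1024 bytes, while the off-by-one widths are only multiples of 4. Whether the driver treats those two cases differently is a question for the Intel side.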

@amonakov
Owner

I meant the GPU clock, not the CPU, when mentioning 1150 MHz. You can try asking about this lack of GPU frequency scaling in the #intel-graphics IRC channel at irc.freenode.net

@bundyo
bundyo commented Dec 11, 2012

Ah, you're right, the GPU scaling frequencies are 350/1150.

Since I'm on 3.7 rc8, I've used the new sysfs means to control the IGC scaling and it seems scaling has no effect on the performance (maybe a placebo but forcing down 350 MHz on IGC actually added 1 FPS to overall perf). Also it seems the GPU is correctly scaled when used.
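For anyone who wants to watch the clocks while testing, here is a minimal sketch that reads the GPU frequency files. It assumes the sysfs interface meant above is the i915 gt_min_freq_mhz / gt_cur_freq_mhz / gt_max_freq_mhz set, and that the Intel GPU is card0; both are assumptions and may differ on other machines.

```python
from pathlib import Path

# i915 frequency attributes (assumed to be the sysfs files referred to above).
FREQ_FILES = ("gt_min_freq_mhz", "gt_cur_freq_mhz", "gt_max_freq_mhz")


def read_gpu_freqs(card_dir="/sys/class/drm/card0"):
    """Return {file name: MHz or None} for the frequency files in card_dir.

    The default directory assumes the Intel GPU is card0. A file that is
    missing or unreadable yields None instead of raising.
    """
    freqs = {}
    for name in FREQ_FILES:
        path = Path(card_dir) / name
        try:
            freqs[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            freqs[name] = None
    return freqs


print(read_gpu_freqs())
```

Calling it once in the slow mode and once in the fast mode, while the game is running, shows whether the current clock actually differs between the two.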

In any case, Intel's GPU is probably not to blame after all.

@amonakov
Owner

The fact that your hack works is very important, because it eliminates the possibility that something is fishy on the nVidia side. I don't see why you say that Intel GPU is not to blame.

@bundyo
bundyo commented Dec 11, 2012

Yes, the 620M seems to be working okay too. Dunno what to think anymore, just that clocking the Intel GPU up or down doesn't seem to have any effect on either the bad or the good performance. So it shouldn't be the scaling. There must be something else I'm overlooking.

@bundyo
bundyo commented Dec 11, 2012

Hmm, I managed to somewhat "reproduce" the bad performance from the good one :)

If I use a texture format in glTexImage2D() that requires automatic conversion (tried with GL_RGBA4), the performance drops to about 7 FPS. Is it possible that something like that happens with the bad resolutions?

@amonakov
Owner

That would be odd. Try asking in #intel-graphics

@bundyo
bundyo commented Dec 12, 2012

Okay, thanks.

@bundyo
bundyo commented Dec 13, 2012

Hm, discovered that stopping texture tiling has a positive effect on the framerate - at least lowering fullscreen resolution now ups the framerate instead of having no effect whatsoever. For instance Trine 2 in 1280x720 fullscreen now runs with ~35-40 FPS instead of ~25.

@amonakov
Owner

It's been a while. Any news on this issue?

@bundyo
bundyo commented Mar 2, 2013

Unfortunately can't test now as I have some issues with one of my SSDs on the laptop, have to do a clean OS install on the other first.

@ArchangeGabriel

And what's new as of today?

@amonakov
Owner

If the problem is observed again, use perf and INTEL_DEBUG=perf PRIMUS_VERBOSE=2 to investigate further. The situation is probably different with newer Intel drivers, and it's not a primus bug anyway, so: closing.
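A minimal sketch of launching a game with both variables set. PRIMUS_VERBOSE and INTEL_DEBUG are the variables named above and primusrun is the usual primus launcher; the helper names and the ./game-launcher.sh path are hypothetical placeholders.

```python
import os
import shlex


def debug_env(base_env=None):
    """Copy the environment and add the two debug variables named above."""
    env = dict(os.environ if base_env is None else base_env)
    env["PRIMUS_VERBOSE"] = "2"   # primus profiling output
    env["INTEL_DEBUG"] = "perf"   # Mesa i965 performance warnings
    return env


def debug_command(game_cmd):
    """Prefix the game's command line with primusrun."""
    return ["primusrun"] + shlex.split(game_cmd)


# "./game-launcher.sh" is a placeholder, not a real path from this thread.
cmd = debug_command("./game-launcher.sh")
env = debug_env()
print(cmd)
print({key: env[key] for key in ("PRIMUS_VERBOSE", "INTEL_DEBUG")})
```

To actually start the game, pass both to subprocess.run(cmd, env=env) from a terminal so the output stays visible (no nohup or redirections, as mentioned earlier).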

@amonakov amonakov closed this May 19, 2013
@bundyo
bundyo commented May 20, 2013

Yes, I can confirm now that the issue has been resolved with newer Intel drivers. Thanks.
