backends: xrender less laggy than glx? #620
I am on NVIDIA (proprietary driver) and also noticed that glx tends to be laggier, but in a different way: GPU-accelerated terminals (alacritty, kitty) take ~1.5x longer to start than with xrender.
Tests
With glx:
Alacritty takes ~190ms to launch. Same with the legacy backend:
Now with xrender:
Experimental:
Legacy:
Notice how it takes only ~125ms on average now.
Probably related: #641
This. Specifically, there have been changes since v8.2 that affect floating window performance (I noticed this very clearly while doing my testing). @ro-i If you try the current git branch (next), I think you will observe much less floating window lag.
@MahouShoujoMivutilde Which version of picom are you using? The release version (8.2) or the current branch? Also, since you're on NVIDIA, I think you'll benefit from both the current branch and the flag testing I'm doing.
@kwand I am on latest
Driver version 460.56, from nvidia-all; kernel is linux-tkg 5.10.y LTS with
Also, my GPU is quite old: a GTX 970.
So I tried your patch, but sadly there seems to be no significant difference.
Tests
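test.sh
#!/usr/bin/env sh
# Benchmark alacritty start-up time under picom with a given backend.
pid=/tmp/picom.pid
run() {
    echo "$1"
    # Start picom daemonized (-b), vsync off, ignoring any user config.
    picom -b --write-pid-path="$pid" --experimental-backends --no-vsync --backend="$1" --config=/dev/null
    sleep 0.5s
    # `-e false` runs `false` in the terminal, so alacritty exits right after starting.
    hyperfine 'alacritty -e false'
    kill "$(cat "$pid")"
    sleep 0.5s
}
run glx
run xrender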
(I made sure the patch was applied, the tests are on the new picom, etc.)
Also, you've been talking about triple buffering, so I think it may be worth noting that I have it enabled in my Xorg config: /etc/X11/xorg.conf.d/20-nvidia.conf
I've been talking about disabling triple buffering. The fact that you have it enabled might conflict with the patch, which tries to turn it off for picom. There's no real reason to have triple buffering enabled for a compositor, unless you want to prioritize smoothness at the cost of responsiveness. Though, I will note that the difference with my patch is only slight.
There is another method that I tried (it's mentioned in the first post of the PR), which may or may not benefit you, but I've abandoned it for now since it introduces a bunch of old glitches and would require a major rewrite of the code, for just 5-10ms less latency. Don't get me wrong: that amount of latency reduction is still significant. It's just that the rewrite is so much larger that I can't really justify it (at least for my implementation).
Also, I don't really believe your GPU is old enough to be the problem here. I think it's likely it is not even boosting while rendering picom, which should make it perform almost on par with newer GPUs.
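To put a figure like "5-10ms" in perspective, it can help to express it in refresh intervals. The sketch below is plain arithmetic, not anything taken from picom or the patch; the 60 Hz default is an assumption, since the refresh rate of the display is not stated in this thread, and the function names are made up for illustration.

```python
def frame_interval_ms(refresh_hz: float) -> float:
    """Duration of one refresh cycle in milliseconds."""
    return 1000.0 / refresh_hz


def as_frames(latency_ms: float, refresh_hz: float = 60.0) -> float:
    """Express a latency in units of refresh intervals.

    The 60 Hz default is an assumption; pass the real refresh rate if known.
    """
    return latency_ms / frame_interval_ms(refresh_hz)


if __name__ == "__main__":
    for hz in (60.0, 144.0):
        print(f"{hz:.0f} Hz: one frame = {frame_interval_ms(hz):.2f} ms")
    # The 5-10ms range mentioned above, expressed in frames at the assumed 60 Hz.
    for saved_ms in (5.0, 10.0):
        print(f"{saved_ms:.0f} ms = {as_frames(saved_ms):.2f} frames at 60 Hz")
```

At the assumed 60 Hz, a saving of 5-10ms is less than one refresh interval, which is consistent with describing it as noticeable but small.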
Also, I just noticed that you have this line in your Xorg config. Is this turned on (i.e. ForceCompositionPipeline and ForceFullCompositionPipeline)? I don't really understand the syntax of this config file, unfortunately. If it is, I would turn it off while testing: I've noticed increased lag when it's on, and it's a bit superfluous as picom is already trying to do the same thing. The comment you have above it seems to say the same thing: "for vsync without compositor".
I still can't quite explain why glx takes 1.5x longer though, even in theory. My best guess (without evidence) is that glx may be more graphically demanding than xrender. If that is the case, I would try re-running the test with "Prefer Maximum Performance" enabled, as shown here:
This should force the GPU to run at its maximum boost clock.
Actually, it seems that triple buffering gives me significantly better latency.
No triple buffering,
No triple buffering and no
No
* When I write "no something", I mean that I just commented that option out of the config above.
Should not make any difference, according to the defaults.
Doesn't hurt to try it anyway. And thank you for trying to come up with improvements 👍
Yeah, makes sense.
Yes, but it doesn't actually add any significant latency (see above), in fact - the biggest increase in latency regarding xorg config was from disabling
I wrote that myself, and that's why all my tests are with
Theoretically, having vsync on driver level should give less latency compared to
However, at the moment, fastest runs were with
I just tried that, and the results are the same. The P-state switches to 4 even with the default setting, fast enough to not matter.
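For anyone who wants to watch the P-state while the benchmark runs, here is a minimal sketch that polls `nvidia-smi`. It assumes `nvidia-smi` is on the PATH and that the `pstate` and `clocks.gr` query fields are available on your driver; the helper names are made up for illustration and this is not how the check above was done.

```python
import shutil
import subprocess
from typing import List, Tuple

# Query one CSV line per GPU: performance state and graphics clock (MHz, units stripped).
QUERY = ["nvidia-smi", "--query-gpu=pstate,clocks.gr", "--format=csv,noheader,nounits"]


def parse_pstate_line(line: str) -> Tuple[int, int]:
    """Parse a CSV line of the form '<P-state>, <clock>' into (pstate number, clock in MHz)."""
    pstate, clock = (field.strip() for field in line.split(","))
    return int(pstate.lstrip("P")), int(clock)


def sample() -> List[Tuple[int, int]]:
    """Return (pstate, clock) for each GPU, or an empty list if nvidia-smi is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_pstate_line(line) for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    for pstate, mhz in sample():
        print(f"P{pstate} at {mhz} MHz")
```

Running this in a loop next to hyperfine would show whether the GPU has already left its idle state by the time alacritty starts.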
Well, this was a complete oversight of mine. I didn't notice that at all! I can see why my patch probably has no effect then, since I believe it would only work if vsync is enabled.
Thank you for doing the testing; I'm currently unable to do much due to time constraints. The results are all quite interesting, and unfortunately, I have no idea how to explain them. (Actually, it's possible I was mistaken in thinking that enabling triple buffering in the Xorg conf forces picom to use triple buffering as well. To reiterate, this does not really matter in your case since you disabled vsync, but it's possible you're seeing gains because of enabling/disabling triple buffering for alacritty.)
Would love to see the results for this as well, whenever you have the time, since my PR mainly improves vsync inside picom.
@kwand Okay, I tested a whole bunch of things. Buckle up.
"Sync to vblank" and "allow flipping" are from the NVIDIA OpenGL settings.
I found ffcp is
Patched means #641 applied.
Each row is 100 counted runs + 3 warm-up runs.
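If you want to slice the results by a single option, a minimal sketch like the one below may help. It assumes rows shaped like the JSON version of the table further down; `average_by` is a hypothetical helper, and the four sample rows are copied from that table purely for illustration, so their averages say nothing about the full data set.

```python
from collections import defaultdict
from statistics import fmean
from typing import Dict, Hashable, List


def average_by(rows: List[dict], key: str) -> Dict[Hashable, float]:
    """Average the 'mean' latency (ms) of all rows sharing the same value of `key`."""
    groups: Dict[Hashable, List[float]] = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["mean"])
    return {value: round(fmean(means), 1) for value, means in groups.items()}


# Four rows copied from the table (fastest xrender, fastest glx,
# one xrender row without triple buffering, slowest glx).
SAMPLE_ROWS = [
    {"backend": "xrender", "picom_vsync": True, "sync_to_vblank": True,
     "allow_flipping": False, "triple_buffer": True, "ffcp": False,
     "patched": True, "mean": 113, "median": 114, "stddev": 3.56},
    {"backend": "glx", "picom_vsync": False, "sync_to_vblank": False,
     "allow_flipping": False, "triple_buffer": True, "ffcp": False,
     "patched": True, "mean": 114, "median": 114, "stddev": 2.88},
    {"backend": "xrender", "picom_vsync": False, "sync_to_vblank": False,
     "allow_flipping": False, "triple_buffer": False, "ffcp": False,
     "patched": True, "mean": 157, "median": 164, "stddev": 8.57},
    {"backend": "glx", "picom_vsync": True, "sync_to_vblank": False,
     "allow_flipping": True, "triple_buffer": False, "ffcp": False,
     "patched": True, "mean": 288, "median": 290, "stddev": 20.69},
]

if __name__ == "__main__":
    for option in ("backend", "triple_buffer", "allow_flipping"):
        print(option, average_by(SAMPLE_ROWS, option))
```

To use it on the real data, load the full JSON array with `json.load` and pass it in place of `SAMPLE_ROWS`.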
For reference, this is how latency looks without picom running
Same table in JSON:
[
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 113,
"median": 114,
"stddev": 3.56
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 113,
"median": 114,
"stddev": 3.46
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 113,
"median": 114,
"stddev": 3.41
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 113,
"median": 114,
"stddev": 3.77
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 114,
"median": 113,
"stddev": 2.93
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 114,
"median": 114,
"stddev": 2.62
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 114,
"median": 114,
"stddev": 3.34
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 114,
"median": 114,
"stddev": 3.6
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 114,
"median": 114,
"stddev": 2.88
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 114,
"median": 114,
"stddev": 5.0
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 114,
"median": 114,
"stddev": 2.94
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 114,
"median": 114,
"stddev": 3.17
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 114,
"median": 114,
"stddev": 3.26
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 114,
"median": 114,
"stddev": 3.4
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 114,
"median": 114,
"stddev": 3.28
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 114,
"median": 114,
"stddev": 2.4
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 114,
"median": 114,
"stddev": 3.14
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 115,
"median": 114,
"stddev": 3.71
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 115,
"median": 114,
"stddev": 3.23
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 115,
"median": 114,
"stddev": 3.54
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 115,
"median": 114,
"stddev": 3.28
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 115,
"median": 114,
"stddev": 4.08
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 115,
"median": 114,
"stddev": 2.99
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 115,
"median": 114,
"stddev": 4.35
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 115,
"median": 114,
"stddev": 3.17
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 115,
"median": 114,
"stddev": 2.9
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 115,
"median": 114,
"stddev": 3.42
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 115,
"median": 114,
"stddev": 3.43
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 115,
"median": 114,
"stddev": 2.82
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 115,
"median": 114,
"stddev": 3.53
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 115,
"median": 114,
"stddev": 3.14
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 115,
"median": 114,
"stddev": 3.39
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 115,
"median": 115,
"stddev": 3.88
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 115,
"median": 114,
"stddev": 2.99
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 116,
"median": 115,
"stddev": 3.83
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 116,
"median": 114,
"stddev": 3.63
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 116,
"median": 114,
"stddev": 4.21
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 117,
"median": 115,
"stddev": 5.02
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 118,
"median": 116,
"stddev": 4.97
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 118,
"median": 115,
"stddev": 5.13
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 137,
"median": 137,
"stddev": 10.22
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 138,
"median": 134,
"stddev": 10.57
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 142,
"median": 146,
"stddev": 9.62
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 142,
"median": 147,
"stddev": 9.51
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 144,
"median": 147,
"stddev": 8.18
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 144,
"median": 147,
"stddev": 8.42
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 147,
"median": 148,
"stddev": 9.89
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 147,
"median": 148,
"stddev": 8.99
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 148,
"median": 148,
"stddev": 10.72
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 149,
"median": 148,
"stddev": 9.88
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 149,
"median": 148,
"stddev": 9.46
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 149,
"median": 148,
"stddev": 9.92
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 150,
"median": 148,
"stddev": 11.08
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 150,
"median": 148,
"stddev": 9.64
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 150,
"median": 148,
"stddev": 9.62
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 150,
"median": 148,
"stddev": 10.61
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 150,
"median": 148,
"stddev": 9.51
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 151,
"median": 148,
"stddev": 10.77
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 151,
"median": 148,
"stddev": 9.63
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 151,
"median": 148,
"stddev": 10.3
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 152,
"median": 148,
"stddev": 10.12
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 152,
"median": 148,
"stddev": 9.54
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 152,
"median": 148,
"stddev": 10.22
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 152,
"median": 148,
"stddev": 8.93
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 153,
"median": 148,
"stddev": 9.55
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 153,
"median": 148,
"stddev": 9.68
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 153,
"median": 148,
"stddev": 8.19
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 153,
"median": 148,
"stddev": 9.62
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 153,
"median": 148,
"stddev": 9.51
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 154,
"median": 148,
"stddev": 10.67
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 154,
"median": 148,
"stddev": 9.42
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 154,
"median": 148,
"stddev": 8.89
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 154,
"median": 148,
"stddev": 10.25
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 154,
"median": 148,
"stddev": 10.16
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 155,
"median": 148,
"stddev": 9.48
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 156,
"median": 163,
"stddev": 8.89
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 156,
"median": 156,
"stddev": 9.73
},
{
"backend": "xrender",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 156,
"median": 151,
"stddev": 8.25
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 156,
"median": 162,
"stddev": 8.97
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 156,
"median": 149,
"stddev": 8.47
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 156,
"median": 163,
"stddev": 9.56
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 156,
"median": 164,
"stddev": 9.01
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 157,
"median": 164,
"stddev": 8.74
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 157,
"median": 164,
"stddev": 8.67
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 157,
"median": 164,
"stddev": 8.57
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 157,
"median": 164,
"stddev": 8.63
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 158,
"median": 164,
"stddev": 8.63
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 158,
"median": 164,
"stddev": 9.89
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 158,
"median": 164,
"stddev": 8.25
},
{
"backend": "xrender",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 158,
"median": 164,
"stddev": 8.26
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 158,
"median": 164,
"stddev": 8.26
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 159,
"median": 164,
"stddev": 9.93
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 159,
"median": 164,
"stddev": 11.03
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 159,
"median": 164,
"stddev": 10.28
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 161,
"median": 164,
"stddev": 14.34
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 161,
"median": 164,
"stddev": 13.07
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 162,
"median": 164,
"stddev": 10.37
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": false,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 164,
"median": 164,
"stddev": 15.33
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 168,
"median": 165,
"stddev": 6.59
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 169,
"median": 166,
"stddev": 6.1
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 169,
"median": 166,
"stddev": 8.23
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 169,
"median": 170,
"stddev": 16.2
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 169,
"median": 166,
"stddev": 8.9
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 169,
"median": 165,
"stddev": 7.09
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 170,
"median": 168,
"stddev": 10.32
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": false,
"mean": 170,
"median": 165,
"stddev": 8.91
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": true,
"patched": true,
"mean": 172,
"median": 168,
"stddev": 10.47
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 175,
"median": 177,
"stddev": 10.42
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 200,
"median": 198,
"stddev": 5.73
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 203,
"median": 198,
"stddev": 10.89
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 205,
"median": 198,
"stddev": 11.78
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 206,
"median": 198,
"stddev": 14.83
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": true,
"mean": 207,
"median": 213,
"stddev": 9.38
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 209,
"median": 213,
"stddev": 7.82
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": true,
"ffcp": false,
"patched": false,
"mean": 209,
"median": 213,
"stddev": 7.88
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 210,
"median": 214,
"stddev": 14.74
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 211,
"median": 213,
"stddev": 13.85
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 212,
"median": 206,
"stddev": 16.12
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 213,
"median": 214,
"stddev": 12.3
},
{
"backend": "glx",
"picom_vsync": false,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 215,
"median": 214,
"stddev": 16.15
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 234,
"median": 231,
"stddev": 14.11
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": false,
"mean": 236,
"median": 232,
"stddev": 12.62
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 236,
"median": 232,
"stddev": 14.75
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": true,
"patched": true,
"mean": 238,
"median": 232,
"stddev": 14.03
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 283,
"median": 281,
"stddev": 21.35
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": false,
"mean": 284,
"median": 289,
"stddev": 19.28
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": true,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 284,
"median": 281,
"stddev": 22.15
},
{
"backend": "glx",
"picom_vsync": true,
"sync_to_vblank": false,
"allow_flipping": true,
"triple_buffer": false,
"ffcp": false,
"patched": true,
"mean": 288,
"median": 290,
"stddev": 20.69
}
]
So what I noticed:
Note: If you use a tiling window manager, it is important to launch alacritty
Methodology
To at least partially automate testing, I wrote a new script.
test2.sh
It assumes it is running in tmux (to collect hyperfine's output as it was on screen),
It runs picom/nvidia with different options and tests them on alacritty via
It can't, however, change Xorg options (ffcp, triple buffering), so that is
It also doesn't install patched/unpatched picom for you.
The first argument is expected to be a name for the results folder of a particular set of Xorg options
This folder will be created under
#!/usr/bin/env bash
pid=/tmp/picom.pid
[ -z "$1" ] && echo -e "usage: \ntest2.sh 'results-folder-name'" && exit 1
clear
results="$HOME/picom/$1"
mkdir -p "$results"
run() {
opts="$1, $4, SyncToVBlank=$2, AllowFlipping=$3"
echo "Running with $opts"
nvidia-settings -a "SyncToVBlank=$2" >/dev/null
nvidia-settings -a "AllowFlipping=$3" >/dev/null
picom $4 -b --write-pid-path="$pid" \
--experimental-backends --backend="$1" \
--config=/dev/null
sleep 0.5s
hyperfine --export-json "$results/$opts.json" \
-w 3 -r 100 \
'alacritty --class floatme,floatme -e false'
kill "$(cat "$pid")"
sleep 0.5s
}
figlet 'Xorg options:'
grep -P '(Triple|ForceComp)' /etc/X11/xorg.conf.d/20-nvidia.conf
echo -e '\n\n'
figlet 'GLX'
echo -e '\n*** --no-vsync ***\n'
run glx 0 0 --no-vsync
run glx 0 1 --no-vsync
run glx 1 0 --no-vsync
run glx 1 1 --no-vsync
echo -e '\n*** --vsync ***\n'
run glx 0 0 --vsync
run glx 0 1 --vsync
run glx 1 0 --vsync
run glx 1 1 --vsync
echo -e '\n\n'
figlet 'XRENDER'
echo -e '\n*** --no-vsync ***\n'
run xrender 0 0 --no-vsync
run xrender 0 1 --no-vsync
run xrender 1 0 --no-vsync
run xrender 1 1 --no-vsync
echo -e '\n*** --vsync ***\n'
run xrender 0 0 --vsync
run xrender 0 1 --vsync
run xrender 1 0 --vsync
run xrender 1 1 --vsync
tmux capture-pane -pS -1000000 > "$results.txt"

sort-times.py

This script goes through every folder (Xorg options) under set

#!/usr/bin/env python3
import json
import os
from typing import Iterable

# https://pypi.org/project/tabulate/
from tabulate import tabulate

# $results from test2.sh
results = './'


def find(results: str) -> Iterable[str]:
    '''
    Finds every .json file in subdirectories of the `results` directory,
    ignores files in `results`'s root.
    '''
    for dirpath, _, filenames in os.walk(results):
        for fn in filenames:
            # os.walk yields the root exactly as passed in, so this skips root-level files
            if os.path.splitext(fn)[1] == '.json' and dirpath != results:
                yield os.path.join(dirpath, fn)


def parse(fp: str) -> dict:
    '''
    Takes file path to .json result of a run, returns dict with parsed options used / latency
    '''
    run = {}
    # The file name encodes the picom/nvidia-settings options of the run
    name = os.path.splitext(os.path.basename(fp))[0]
    run['backend'] = name.split(',')[0]
    run['picom_vsync'] = '--vsync' in name
    run['sync_to_vblank'] = 'SyncToVBlank=1' in name
    run['allow_flipping'] = 'AllowFlipping=1' in name

    # The folder name encodes the Xorg options and whether picom was patched
    folder = os.path.basename(os.path.dirname(fp))
    run['triple_buffer'] = 'no-3b' not in folder
    run['ffcp'] = 'no-ffcp' not in folder
    run['patched'] = 'def' not in folder

    with open(fp, 'r', encoding='utf-8') as f:
        timings = json.load(f)

    # to ms
    run['mean'] = round(timings['results'][0]['mean'] * 1000)
    run['median'] = round(timings['results'][0]['median'] * 1000)
    run['stddev'] = round(timings['results'][0]['stddev'] * 1000, 2)
    return run


runs = [parse(fp) for fp in find(results)]
runs = sorted(runs, key=lambda r: r['mean'])

with open('./runs.json', 'w', encoding='utf-8') as ff:
    json.dump(runs, ff, indent=2)

# for markdown set tablefmt='github'
print(tabulate(runs, tablefmt="fancy_grid", headers='keys'))

Here is a notebook selecting rows with various options to get a feel for latency.

Also, if you notice that after reboot everything is slow again, remember that
from nvidia-settings man page. So don't forget to add |
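For anyone who wants to slice the results without the notebook, here is a minimal sketch of that kind of row selection. It assumes records shaped like the entries of runs.json above; the three sample records are copied from the listing in this comment, and the helper names `select` and `average_mean` are made up for illustration and are not part of sort-times.py.

```python
from statistics import mean

# Three records copied from the results listed above. In practice the
# list would come from json.load() on the runs.json written by sort-times.py.
SAMPLE_RUNS = [
    {"backend": "glx", "picom_vsync": True, "sync_to_vblank": False,
     "allow_flipping": True, "triple_buffer": False, "ffcp": False,
     "patched": False, "mean": 284, "median": 289, "stddev": 19.28},
    {"backend": "glx", "picom_vsync": True, "sync_to_vblank": True,
     "allow_flipping": True, "triple_buffer": False, "ffcp": False,
     "patched": True, "mean": 284, "median": 281, "stddev": 22.15},
    {"backend": "glx", "picom_vsync": True, "sync_to_vblank": False,
     "allow_flipping": True, "triple_buffer": False, "ffcp": False,
     "patched": True, "mean": 288, "median": 290, "stddev": 20.69},
]


def select(runs, **criteria):
    """Return the runs whose fields equal every given keyword criterion."""
    return [r for r in runs
            if all(r.get(key) == value for key, value in criteria.items())]


def average_mean(runs):
    """Average the per-run mean latency in ms, or None if nothing matched."""
    return mean(r["mean"] for r in runs) if runs else None


if __name__ == "__main__":
    for patched in (False, True):
        rows = select(SAMPLE_RUNS, backend="glx", patched=patched)
        print(f"patched={patched}: {len(rows)} run(s), "
              f"average mean {average_mean(rows)} ms")
```

Passing more keywords (for example `ffcp=True, triple_buffer=False`) narrows the selection further, which makes it easy to compare one option at a time while holding the others fixed.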
@MahouShoujoMivutilde Very sorry for the late reply. This is not much of an update, but I just wanted to let you know that I have read your reply and actually switched to the settings that give the lowest latency (as per your results) a month ago.

The results seem to be right, as I do notice some latency improvement. But it still puzzles me why there's such a huge difference and how we could improve picom's performance. (OK, maybe not a "huge difference". 22-23ms seems to be just slightly more than one frame of latency worse than FFCP, assuming your display is 60Hz.)

I have yet to run your tests on my own machine, but I imagine I'll probably discover something once I do (as I'm now using a 160Hz display). I also have access to a laptop running AMD graphics now, so I want to do some investigation in that area to see if this is an NVIDIA-specific problem or something inherent in picom.

Sorry, I've just been really short on time lately, though I would really love to fix this problem myself (I get pretty annoyed at the latency difference when I notice how fast everything is when I need to kill

If anyone has the time, I think this is a pretty high-priority issue that could be looked into. Investigating how KDE handles their compositing v-sync algorithm might also be worthwhile, since apparently they have a superior algorithm than

I haven't daily-driven Plasma on my main machine yet (running |
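To put the "slightly more than one frame" remark above into numbers, here is a small sketch that converts a latency into frames at a given refresh rate. The function name `latency_in_frames` is just an illustration, and the 22.5 ms input is an assumed midpoint of the 22-23ms range quoted above.

```python
def latency_in_frames(latency_ms: float, refresh_hz: float) -> float:
    """Express a latency as a number of display frames at refresh_hz."""
    frame_time_ms = 1000.0 / refresh_hz  # duration of one refresh cycle
    return latency_ms / frame_time_ms


if __name__ == "__main__":
    # 22.5 ms is an assumed midpoint of the 22-23 ms difference.
    for hz in (60, 160):
        frames = latency_in_frames(22.5, hz)
        print(f"{hz} Hz: 22.5 ms is about {frames:.2f} frames")
```

At 60Hz one frame lasts about 16.7ms, so 22-23ms works out to roughly 1.3 to 1.4 frames. At 160Hz a frame lasts 6.25ms, so the same difference would span about 3.5 to 3.7 frames.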
Sorry to notify literally anyone associated with this thread, but just a question:
for me, picom can take up to 20% CPU (I use animations), and the GPU sits at 70% with 70W average. |
Hi! :) This is not really a bug report, but rather a question out of curiosity.

I wonder why, with my current configuration and setup, the xrender backend is less laggy than the glx backend. Both prevent tearing, but when I move a floating window around, it lags much more with the glx backend than with the xrender backend.

Note: I am referring to the "experimental" backends!

I would be really interested to learn why this could be the case or if I did something wrong in my config. I'm a long-term user of compton/picom, but most of the time I used the intel Xorg driver with the TearFree option and disabled vsync in the compositor. A few months ago, I finally switched to the modesetting driver because there has been a bug in the intel driver that affected me (and because the modesetting driver is said to be more performant anyway).

Platform
Fedora 34 (pre-release), kernel 5.11.14-300.fc34.x86_64
GPU, drivers, and screen setup
Intel Corporation WhiskeyLake-U GT2 [UHD Graphics 620], modesetting driver for Xorg, external 1920x1080 monitor connected to laptop via DisplayPort (over USB-C).
vainfo: Driver version: Intel i965 driver for Intel(R) Coffee Lake - 2.4.1
glxinfo -B:

Environment
i3 :)

picom version
picom --diagnostics:

Configuration:
grep '^[^#]' .config/picom.conf:

Thank you very much! ❤️