softgpu: Optimize (bi-)linear texture filtering #17609

fp64 · 2023-06-21T17:16:09Z

Seeing as SampleLinearLevel is near the top in the profiler, this optimizes the actual bilinear filtering using SSE2. It is a solid win in the synthetic benchmark (https://godbolt.org/z/fqh3xvbGx, which also doubles as a correctness check), but there is no visible difference in actual PPSSPP. Note: the profiler suggests that the hot part of SampleLinearLevel is elsewhere.
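For readers unfamiliar with the technique, here is a minimal sketch of what bilinear filtering of four RGBA8888 texels with SSE2 can look like. This is not the code from this PR: the function names `BilinearScalar` and `BilinearSSE2`, the texel layout, and the 4-bit fraction weights (0..15) are assumptions made for illustration only. The scalar version serves as a reference to check the vector version against.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical scalar reference: blend four RGBA8888 texels.
// fu and fv are assumed to be 4-bit fractions in 0..15.
static uint32_t BilinearScalar(uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br, int fu, int fv) {
    assert(fu >= 0 && fu <= 15 && fv >= 0 && fv <= 15);
    uint32_t out = 0;
    for (int c = 0; c < 4; ++c) {
        const int shift = c * 8;
        const int a = (tl >> shift) & 0xFF, b = (tr >> shift) & 0xFF;
        const int e = (bl >> shift) & 0xFF, d = (br >> shift) & 0xFF;
        const int top = a * (16 - fu) + b * fu;  // horizontal blend, top row
        const int bot = e * (16 - fu) + d * fu;  // horizontal blend, bottom row
        const int v = (top * (16 - fv) + bot * fv) >> 8;  // vertical blend, drop 8 fraction bits
        out |= (uint32_t)v << shift;
    }
    return out;
}

// Hypothetical SSE2 version: all four channels of both rows are blended at once.
static uint32_t BilinearSSE2(uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br, int fu, int fv) {
    assert(fu >= 0 && fu <= 15 && fv >= 0 && fv <= 15);
    const __m128i zero = _mm_setzero_si128();
    // Widen bytes to 16-bit lanes: lanes 0-3 hold the top texel, lanes 4-7 the bottom texel.
    const __m128i left = _mm_unpacklo_epi8(_mm_set_epi32(0, 0, (int)bl, (int)tl), zero);
    const __m128i right = _mm_unpacklo_epi8(_mm_set_epi32(0, 0, (int)br, (int)tr), zero);
    // Horizontal blend; the largest value is 255 * 16 = 4080, so it fits in 16 bits.
    const __m128i horiz = _mm_add_epi16(
        _mm_mullo_epi16(left, _mm_set1_epi16((short)(16 - fu))),
        _mm_mullo_epi16(right, _mm_set1_epi16((short)fu)));
    // Move the bottom row down into lanes 0-3 so it lines up with the top row.
    const __m128i bot = _mm_srli_si128(horiz, 8);
    // Vertical blend; the largest value is 4080 * 16 = 65280, still within unsigned 16 bits.
    __m128i vert = _mm_add_epi16(
        _mm_mullo_epi16(horiz, _mm_set1_epi16((short)(16 - fv))),
        _mm_mullo_epi16(bot, _mm_set1_epi16((short)fv)));
    vert = _mm_srli_epi16(vert, 8);  // logical shift, matches the scalar >> 8
    // Narrow back to bytes; only the low 32 bits (lanes 0-3) are meaningful.
    return (uint32_t)_mm_cvtsi128_si32(_mm_packus_epi16(vert, vert));
}
```

Packing the top and bottom texels into one register means the horizontal blend costs two multiplies for both rows. Whether that is the layout used in the actual change is not something this sketch claims.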

hrydgard · 2023-06-21T17:39:45Z

I keep making various optimizations myself that locally look like great wins but seem to have barely any measurable effect overall... but it's hard to measure. Machines clock up and down according to load, etc.

This one has to be a win on some dimension, maybe power consumption :P I'm all for merging it, though I'll let @unknownbrackets click merge.

fp64 · 2023-06-21T18:22:05Z

Well, in my case an "observable difference" would mean going from 7 FPS on average to 8, i.e. 12.5% less time per frame, which is pretty significant for a single-function change. The frame rate oscillating between 6 and 9 FPS does not help with measuring.

Off-topic, but while eyeing softgpu for more optimization opportunities, I've come up with several questions that I'm not sure where to ask. Discord would seem the logical choice... if the damn thing actually worked for me. Maybe I'll just create a "softgpu optimization opportunities" issue, or something.

unknownbrackets · 2023-06-21T18:31:29Z

This would only apply to 32-bit Intel; you're not going to end up in this function on x86_64. So it probably won't actually make any difference for most users. I'd tried to avoid over-optimizing this code for SSE given that we're already using a jit for it that is much faster (especially with AVX2).

-[Unknown]

hrydgard · 2023-06-21T18:34:20Z

Oh right, forgot about that, hah.

Do feel free to create a discussion issue if you want.

fp64 · 2023-06-21T18:44:50Z

Oh, looks like I'm blind. I somehow thought that DrawPixelX86.cpp was the only special JIT path, but SamplerX86.cpp is a thing too.
While I care about 32-bit perf on x86 (and am mildly hopeful about improving it to more palatable levels there in softgpu), I realize that most people don't.

unknownbrackets

Well, this seems reasonable, so I'll merge.

-[Unknown]

hrydgard added this to the v1.16.0 milestone Jun 21, 2023

hrydgard added the Software Rasterizer label Jun 21, 2023

hrydgard approved these changes Jun 21, 2023


unknownbrackets approved these changes Jun 22, 2023


unknownbrackets merged commit 76990ae into hrydgard:master Jun 22, 2023

fp64 mentioned this pull request Jun 22, 2023 in SoftGPU perf opportunities #17613 (open, 2 tasks)

fp64 deleted the optimize-softgpu-tex-linear branch June 30, 2023 16:13
