Slow default pass begin/end calls #867

Closed
ArtemkaKun opened this issue Aug 11, 2023 · 5 comments

ArtemkaKun commented Aug 11, 2023

Hi, I'm creating a game with sokol in V (there is an "official" wrapper for sokol called gg). I need a stable 144 FPS on mobile, and currently my game is struggling with that.

When I profiled the game, it seemed that sokol's begin/end functions take a lot of time to complete, even if I'm not rendering anything.

I'm not sure if I profiled correctly, but why do the gg begin/end functions take so much time to execute?

Function: gg__Context_end, Self Time: 6897599ns, Calls: 4235
Function: sokol__gfx__begin_default_pass, Self Time: 6895704ns, Calls: 4235

Please, ask me questions if you have any, since the area of the problem is so wide that I don't know where to start.

Right now I'm using sokol_app and probably the sokol_gl "API" instead of raw gfx calls. On mobile I'm using GLES 3 (the V version of the sokol headers is about 7 months out of date, so it still has GLES 2 support).

@ArtemkaKun
Author

Ok, the problem is caused by vsync. I disabled it with the vblank_mode=0 env var. Is there a way to disable vsync on Android?

@floooh
Owner

floooh commented Aug 11, 2023

Not hitting the display refresh rate with an empty render pass sounds strange, it shouldn't be related to vsync.

For typical rendering work it's not unusual in GL that expensive actions are delayed until later in the frame, though. Measuring individual GL calls is pretty much useless, unfortunately, because it's entirely unpredictable what happens inside (e.g. a command buffer with queued-up work might need to be flushed, which then causes an otherwise cheap function to look very expensive).

Self Time: 6897599ns, Calls: 4235

Does this mean 6897599ns for one call, or for 4235 calls? If it is for all calls, then I wouldn't call that surprising, especially on a mobile device: it would mean around 1600 nanoseconds for one call, which is 1.6 microseconds or 0.0016 milliseconds, for up to around 10 GL calls that might happen inside sg_begin_pass().
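The per-call arithmetic above can be sketched in C. This is only a sketch under one assumption: that the reported self time is a total accumulated over all calls, which is just one of the two readings being asked about. The helper names `avg_ns_per_call` and `ns_to_ms` are made up for illustration and are not part of sokol or the V profiler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: average time per call, ASSUMING total_self_ns is the
   accumulated self time over all calls rather than the time of one call. */
double avg_ns_per_call(uint64_t total_self_ns, uint64_t calls) {
    assert(calls > 0); /* guard against division by zero */
    return (double)total_self_ns / (double)calls;
}

/* Hypothetical helper: convert nanoseconds to milliseconds. */
double ns_to_ms(double ns) {
    return ns / 1e6;
}
```

With the numbers from the profiler output, `avg_ns_per_call(6897599, 4235)` comes out at roughly 1629 ns, or about 0.0016 ms per call.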

If I were analyzing the problem, I would start by searching for Android graphics debuggers/profilers, which could help get a clearer picture of where the time is actually spent on the GPU side (sometimes mobile GPU vendors offer such tools).

PS: I'm not sure how vsync can be disabled programmatically on Android, but I would be careful with that, because if the loop isn't throttled some other way, that's a sure way to heat up the device and drain the battery very quickly.

@ArtemkaKun
Author

Thanks for the response 👍

Facts so far:

  • With vsync disabled on Linux (on PC), the profiler now reports much better results:
    gg__Context_end, Self Time: 58091ns, Calls: 7763
    sokol__gfx__begin_default_pass, Self Time: 40569ns, Calls: 7763

  • The earlier time (6897599ns) is the time (avg, I think) of one call of the function. And this is on a Linux PC. I haven't tried to profile on Android with the V profiler yet.

  • I have a custom game logic loop (with fixed frametime) and call draws from it. So I "disabled" the sokol_app draw loop.

@floooh
Owner

floooh commented Aug 13, 2023

If the 6.9 million nanoseconds (or 6.9 milliseconds) is the time for one call to sg_begin_default_pass() with vsync enabled, then this looks like the underlying GL implementation is most likely waiting for a swapchain surface to become available for clearing. For example, the GPU may have run so far ahead that no free swapchain surfaces are available because they are all currently waiting to be presented; as soon as a flip happens, the oldest swapchain surface becomes available again and the CPU-side render loop can continue.

This 6.9 milliseconds is almost exactly the frame duration for a 144Hz refresh rate (1/144 s ≈ 6.94 ms), so that would totally make sense. It also explains why the time goes down when disabling vsync: the swapchain then just runs unthrottled and flips through the surfaces as fast as possible, which essentially discards most rendered frames without showing them.
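The comparison between the measured time and the 144Hz frame duration can be checked with a few lines of C. This is a minimal sketch; `frame_interval_ns` and `deviation_percent` are illustrative names, not functions from sokol or gg.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: duration of one refresh interval in nanoseconds,
   rounded down to a whole nanosecond. */
uint64_t frame_interval_ns(uint32_t refresh_hz) {
    assert(refresh_hz > 0);
    return 1000000000ULL / refresh_hz;
}

/* Hypothetical helper: how far a measured time is from one refresh
   interval, in percent (negative means shorter than the interval). */
double deviation_percent(uint64_t measured_ns, uint64_t interval_ns) {
    assert(interval_ns > 0);
    return ((double)measured_ns - (double)interval_ns) * 100.0
           / (double)interval_ns;
}
```

For a 144Hz display, `frame_interval_ns(144)` gives 6944444 ns, and the reported 6897599 ns is within about 0.7% of that, which fits the reading that the call is blocked for roughly one refresh interval.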

This 6.9ms wait doesn't indicate a problem, though. At some point the CPU-side render loop needs to be throttled to the vsync frequency even if there's nothing to render (and as I said above, where this waiting happens is pretty much unpredictable in GL). As soon as you start to add some noticeable rendering payload, the time spent waiting in sg_begin_default_pass() should go down, as long as the "rendering payload" fits into 6.9ms. Above that, you'd start to miss frames.
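The relationship between rendering payload and wait time can be modeled roughly as follows. This is an idealized sketch that assumes every frame is presented on the first vsync after its work finishes; real GL drivers with several swapchain surfaces may behave differently, and the waiting may show up in a different call. The names `intervals_used` and `expected_wait_ns` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: number of whole vsync intervals one frame occupies
   when its work takes payload_ns. A result above 1 means at least one
   refresh was missed. */
uint64_t intervals_used(uint64_t payload_ns, uint64_t interval_ns) {
    assert(interval_ns > 0);
    if (payload_ns == 0) {
        return 1; /* an empty frame still waits for one flip */
    }
    return (payload_ns + interval_ns - 1) / interval_ns; /* ceiling division */
}

/* Hypothetical model: idealized time spent blocked until the next flip. */
uint64_t expected_wait_ns(uint64_t payload_ns, uint64_t interval_ns) {
    return intervals_used(payload_ns, interval_ns) * interval_ns - payload_ns;
}
```

In this model, with a 6944444 ns interval, an empty frame waits the full interval, a 3 ms payload leaves about 3.9 ms of waiting, and an 8 ms payload spills into a second interval, which corresponds to a missed frame.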

@ArtemkaKun
Author

Thanks for the extended explanation. So I assume these profile results don't indicate any problem with rendering, and this issue can be closed.
