Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Texture output by replacing SDL's Lock/UnlockTexture with TextureUpdate #177

Closed
kas1e opened this issue Feb 16, 2020 · 17 comments · Fixed by #222
Closed

Improve Texture output by replacing SDL's Lock/UnlockTexture with TextureUpdate #177

kas1e opened this issue Feb 16, 2020 · 17 comments · Fixed by #222
Assignees
Labels
enhancement New feature or enhancement of existing features

Comments

@kas1e
Copy link

kas1e commented Feb 16, 2020

Found lately, that SDL2 version of DosBox (be it original patch from Vogons, or be it current staging version) do things between locking and unlocking of texture (while shouldn't). It seems that on popular oses such as Win, Unixes, and macOS, OS can deal with it, but on less robust oses doing a lot of things between lock/unlock causes deadlocks.

In the case of DosBox, we have between a lock and unlock calls to SDL_SetWindwoTitle(which should be outside of course), as well as do call SDL_GetTicks, SDL_Events and probably some random Prinfs() there and there.

So, the solution for that is to remove the lock and unlock, and replace it in favor of SDL_UpdateTexture, which will mean, that all lock/unlock will happens internally inside of SDL_UpdateTexture, and DosBox can't do anything in-between so deadlocks should go.

Also, as krcroft point out on Vogons, in that thread: https://forums.libsdl.org/viewtopic.php?t=9728
authors explain that usage of SDL_UpdateTexture can be even a little bit faster, which also will improve things for DosBox.

Is anyone in interest to make some tests about? One of our programmers suggest about what we need to do probably:

  1. Create SDL_Surface similar (width, height, format) to SDL_Texture
  2. Remove SDL_LockTexture, give DB pixels and pitch to the surface
  3. Remove SDL_UnlockTexture, call SDL_UpdateTexture instead of
  4. When SDL_Texture is destroyed, remember to destroy SDL_Surface too

I for myself, not a programmer so only can help with the idea and testing, so if anyone able to try, I will be happy to test / measure / check it all.

Thanks!

@kas1e
Copy link
Author

kas1e commented Feb 16, 2020

Ok, with help of one developer was able to get rid of lock/unlock in favor of texture_update, so now will made test to see if deadlocks gone.

But speed, as I can see, didn't drop.

@kas1e
Copy link
Author

kas1e commented Feb 16, 2020

Ok so, deadlocks are gone once SDL_LockTexture() and SDL_UnlockTexture removed in favor of SDL_UpdateTexture(). Speed seems the same (if not faster). That what we do for:

At top: SDL_Surface *surface_test;

Then, where we create a texture, we made a similar SDL_Surface:

		/* SDL_PIXELFORMAT_ARGB8888 is possible with most
		rendering drivers, "opengles" being a notable exception */
		sdl.texture.texture = SDL_CreateTexture(sdl.renderer, SDL_PIXELFORMAT_ARGB8888,
		                                        SDL_TEXTUREACCESS_STREAMING, width, height);

#ifdef NOLOCK
		surface_test = SDL_CreateRGBSurface(0, width, height, 32, 0, 0, 0, 0);
#endif

		/* SDL_PIXELFORMAT_ABGR8888 (not RGB) is the
		only supported format for the "opengles" driver */
		if (!sdl.texture.texture) {
			if (flags & GFX_RGBONLY) goto dosurface;
			sdl.texture.texture = SDL_CreateTexture(sdl.renderer, SDL_PIXELFORMAT_ABGR8888,
			                                        SDL_TEXTUREACCESS_STREAMING, width, height);
		}

Then, in the GFX_StartUpdate() , remove SDL_LockTexture and give DB pixels and pitch:

#if SDL_VERSION_ATLEAST(2,0,0)
    case SCREEN_TEXTURE:
    {
        void * texPixels;
        int texPitch;

#ifdef NOLOCK
        pixels=(Bit8u *)surface_test->pixels;
        pitch = surface_test->pitch;
#else
        if (SDL_LockTexture(sdl.texture.texture, NULL, &texPixels, &texPitch) < 0)
            return false;
        pixels = (Bit8u *)texPixels;
        pitch = texPitch;
#endif
        sdl.updating=true;
        return true;
    }

And at the end, in the GFX_EndUpdate, remove SDL_UnlockTexture and call SDL_UpdateTexture instead:

#if SDL_VERSION_ATLEAST(2,0,0)
    case SCREEN_TEXTURE:

#ifdef NOLOCK
        SDL_UpdateTexture(sdl.texture.texture, NULL /* update whole texture */, surface_test->pixels, surface_test->pitch);
#else
        SDL_UnlockTexture(sdl.texture.texture);
#endif
        SDL_RenderClear(sdl.renderer);
        SDL_RenderCopy(sdl.renderer, sdl.texture.texture, NULL, &sdl.clip);
        SDL_RenderPresent(sdl.renderer);
        break;
#else    // !SDL_VERSION_ATLEAST(2,0,0)

And after SDL_DestroyTexture(sdl.texture.texture); also add SDL_FreeSurface(surface_test);.

So this way it works, no deadlocks, etc. Sure it can be done better probably, but the idea works, and worth implementing.

@kcgen
Copy link
Member

kcgen commented Feb 16, 2020

@kas1e , another great investigation!
Can you make a diff of the modified file, versus the existing master-branch file? (or just attach the modified file as-is here; rename to .txt or zip it). We can then test and prepare a commit against it.

Given Ryan recommends using update if possible instead of locking/unlocking, it shouldn't be too controversial either.

Edit: I now see your message above, and this should be sufficient to prepare a commit; thank you!

@kas1e
Copy link
Author

kas1e commented Feb 16, 2020

edit: about another SDL2 issue create another ticket: #178

@kcgen kcgen changed the title Imrove SDL2's Texture output by removing SDL_LockTexture/SDL_UnlockTexture in favor of SDL_TextureUpdate Improve Texture output by replacing SDL_(Un)LockTexture with SDL_TextureUpdate Feb 17, 2020
@kcgen kcgen changed the title Improve Texture output by replacing SDL_(Un)LockTexture with SDL_TextureUpdate Improve Texture output by replacing SDL's Lock/UnlockTexture with TextureUpdate Feb 17, 2020
@dreamer
Copy link
Member

dreamer commented Feb 17, 2020

Code suggested above is not for dosbox-staging - it is for DOSBox upstream with SDL 2.0 patch applied (as I can tell by #if SDL_VERSION_ATLEAST(2,0,0), which are absent from dosbox-staging code).

I for myself, not a programmer so only can help with the idea and testing, so if anyone able to try, I will be happy to test / measure / check it all.

Setting help wanted label - anyone who has spare cycles - feel free to take on this issue and either assign yourself or ask for being assigned :)

@dreamer dreamer added the help wanted Community help wanted label Feb 17, 2020
@kas1e
Copy link
Author

kas1e commented Feb 17, 2020

@dreamer
As far as i can read from latest krcroft's post he will do commit about. I for sure shouldnt be one commited anything to dosbox repo as i not programmer for sure. Who know if the way i do it correct at all, and shouldnt be done better :)

@dreamer
Copy link
Member

dreamer commented Feb 17, 2020

@kas1e if something wouldn't be right, we would point it out during review (in PR), asking for adjustments, before new code would reach master branch.

I really need to prepare contributing guidelines - I have it on my TODO list for too long already - basically I am ok with accepting changes from non-programmers (not only code needs fixes! documentation, translations, packaging work, etc).

I guess @krcroft will assign himself as soon as he'll start working on this - until then, this improvement is up for grabs for anyone :)

@kcgen
Copy link
Member

kcgen commented Feb 17, 2020

Confirming: I'm finishing up CD-DA, then this will be next (if it's not picked off by someone else).

@kcgen kcgen added this to Queued in Ongoing tasks Feb 19, 2020
@dreamer dreamer self-assigned this Feb 22, 2020
@dreamer dreamer added enhancement New feature or enhancement of existing features and removed help wanted Community help wanted labels Feb 22, 2020
@dreamer
Copy link
Member

dreamer commented Feb 23, 2020

That discussion on SDL forum was from 2013, it might not reflect the current state of affairs. Right now, documentation for SDL_UpdateTexture says:

This is a fairly slow function, intended for use with static textures that do not change often.

This is not our use-case, we want to update screen as fast as possible.

It also says:

If the texture is intended to be updated often, it is preferred to create the texture as streaming and use the locking functions referenced below.

Which makes sense, if the locked texture is indeed optimised for write-only operation.

Anyways, there's an important note in the documentation for SDL_UnlockTexture:

See SDL bug #1586 before using this function!

Which is: Bug 1586 - SDL_LockTexture has inconsistent behaviour with different render drivers

Closed as WONTFIX, because inconsistent behaviour is deliberate optimization.

Where do we go from here

We need to measure and put some numbers and facts in here instead of mere assertions. More info is needed.

In the case of DosBox, we have between a lock and unlock calls to SDL_SetWindwoTitle(which should be outside of course), as well as do call SDL_GetTicks, SDL_Events and probably some random printfs() there and there.

SDL_SetWindowTitle happens sometimes between locks, sometimes outside locks, same for all other ones.

But even if those calls were happening always while texture lock is on - why is this a problem? Writing to texture has nothing to do with those calls - I event think it's GOOD that those operations are happening while the texture is locked, as it should keep SDL busy while we write to the texture.

I guess the real problem here are deadlocks in SDL2 implementation on AmigaOS, right? Those happen with all texture_renderers or only some specific ones? Can you loop in developers you consulted, who suggested replacing lock/unlock with an update call?

As for the code: I don't like the ifdefs in here - this means an additional build configuration, so one would always be under-tested - it's bad from a maintenance perspective, as it doubles testing effort when developing new features. Also, I don't want to implement a seemingly random solution without fully understanding what is the root cause of the problem.

@kas1e please confirm if deadlocks are happening with all available texture renderers.

I will try to devise some tests to measure the performance difference between lock/unlock/update and various renderers, but I consider this issue a medium priority (because output=opengl is default in dosbox-staging).

@kas1e
Copy link
Author

kas1e commented Feb 23, 2020

Yes, deadlocks happen with opengl and texture for sure. They did not happen with surface.

About why you can do stuck between locks: i, not skilled programmer, but I know the golden rule of all times: if you lock something, better to unlock it faster, without doing much in between.

And yes, about SDL_UpdateTexture it strange: in a wiki, they say it slower, but Rian itself says it's in another way around faster and better: https://forums.libsdl.org/viewtopic.php?t=9728

My tests show, that at least on my side, SDL_UpdateTexture feels even a bit faster. At least by tests via pcpbench it's not slower for sure.

As for ifdefs, of course they didn't mean be there. I only show how it in my local build now, but they should be used as default, and lock/unlock needs to be removed. Of course, if you think you can do so. I can easily do it myself locally for amigaos4 builds.

@dreamer
Copy link
Member

dreamer commented Feb 23, 2020

Yes, deadlocks happen with opengl and texture for sure. They did not happen with surface.

You mean with output=opengl? If that's the case, then source of the problem is not in texture lock/unlock.

About why you can do stuck between locks: i, not skilled programmer, but I know the golden rule of all times: if you lock something, better to unlock it faster, without doing much in between.

In this case it's not really a synchronization primitive… Functions Lock/Unlock could be named GetFastWriteableSurface/UploadFastWriteableSurface.

And yes, about SDL_UpdateTexture it strange: in a wiki, they say it slower, but Rian itself says it's in another way around faster and better: https://forums.libsdl.org/viewtopic.php?t=9728

My tests show, that at least on my side, SDL_UpdateTexture feels even a bit faster. At least by tests via pcpbench it's not slower for sure.

As for ifdefs, of course they didn't mean be there. I only show how it in my local build now, but they should be used as default, and lock/unlock needs to be removed. Of course, if you think you can do so. I can easily do it myself locally for amigaos4 builds.

If it'll turn out to be faster overall on other OSes, then I will likely switch, I just need some empirical data to make an informed decision. I would prefer to avoid ifdefs or introducing new user settings.

It is possible, that source of lockups is something completely different, and if we won't address the real issue here, it will come back and more workarounds will be required.

@dreamer
Copy link
Member

dreamer commented Mar 13, 2020

Interesting development: I made test implementation of this and it fixed #149 on Windows 10 (at least for texture_renderer=direct3d11).

@kcgen
Copy link
Member

kcgen commented Mar 13, 2020

Lots of complexity reduction in that commit, @dreamer - this is great.

In the prior code, Lock() and Unlock() existed across separate functions, conditionally called in yet another layer of separate functions. In these frozen-screen scenarios, I suspect these two are able to get out of sync with each-other leaving the texture locked (and not unlocked).

I suspect if We dropped print statements beside each, we would see a tit-for-tat pattern in the good scenarios (Lock->Unlock; Lock->Unlock), but not in the bad (Lock, Lock->Unlock, Unlock, Lock).

New code boils that away and does it atomically :-) Very nice.

@dreamer
Copy link
Member

dreamer commented Mar 14, 2020

I did some performance testing for Lock/Unlock vs UpdateTexture; tests so far were rather simple, but here are the results:

Quake 1, timedemo 1, output=texture, texture_renderer=opengl, in window:

Lock/Unlock
320x200 109.9 111.1 111.5
640x480  47.9  40.7  40.6

UpdateTexture
320x200 107.2 110.5 108.4
640x480  48.3  40.7  41.1

So it seems like a tiny loss of performance using low resolution (bordering statistical error) and no significant change on medium resolution. I think I will do some small code adjustments to mitigate possible performance loss and it'll be ok to replace old implementation.

@kas1e
Copy link
Author

kas1e commented Mar 14, 2020

Yeah, sounds good ! thanks!

@dreamer
Copy link
Member

dreamer commented Mar 21, 2020

@kas1e Can be tested on branch po/gfx-splash-6-ppc, as other related fixes. Please confirm if it works ok :)

@dreamer
Copy link
Member

dreamer commented Mar 26, 2020

If the problem returns (or was not completely fixed), feel free to reopen this issue :)

@dreamer dreamer moved this from To do to Done in First release (0.75.x) Mar 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or enhancement of existing features
Projects
No open projects
3 participants