New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Texture output by replacing SDL's Lock/UnlockTexture with TextureUpdate #177
Comments
Ok, with help of one developer was able to get rid of lock/unlock in favor of texture_update, so now will made test to see if deadlocks gone. But speed, as I can see, didn't drop. |
Ok so, deadlocks are gone once SDL_LockTexture() and SDL_UnlockTexture removed in favor of SDL_UpdateTexture(). Speed seems the same (if not faster). That what we do for: At top: Then, where we create a texture, we made a similar SDL_Surface: /* SDL_PIXELFORMAT_ARGB8888 is possible with most
rendering drivers, "opengles" being a notable exception */
sdl.texture.texture = SDL_CreateTexture(sdl.renderer, SDL_PIXELFORMAT_ARGB8888,
SDL_TEXTUREACCESS_STREAMING, width, height);
#ifdef NOLOCK
surface_test = SDL_CreateRGBSurface(0, width, height, 32, 0, 0, 0, 0);
#endif
/* SDL_PIXELFORMAT_ABGR8888 (not RGB) is the
only supported format for the "opengles" driver */
if (!sdl.texture.texture) {
if (flags & GFX_RGBONLY) goto dosurface;
sdl.texture.texture = SDL_CreateTexture(sdl.renderer, SDL_PIXELFORMAT_ABGR8888,
SDL_TEXTUREACCESS_STREAMING, width, height);
} Then, in the GFX_StartUpdate() , remove SDL_LockTexture and give DB pixels and pitch: #if SDL_VERSION_ATLEAST(2,0,0)
case SCREEN_TEXTURE:
{
void * texPixels;
int texPitch;
#ifdef NOLOCK
pixels=(Bit8u *)surface_test->pixels;
pitch = surface_test->pitch;
#else
if (SDL_LockTexture(sdl.texture.texture, NULL, &texPixels, &texPitch) < 0)
return false;
pixels = (Bit8u *)texPixels;
pitch = texPitch;
#endif
sdl.updating=true;
return true;
} And at the end, in the GFX_EndUpdate, remove SDL_UnlockTexture and call SDL_UpdateTexture instead: #if SDL_VERSION_ATLEAST(2,0,0)
case SCREEN_TEXTURE:
#ifdef NOLOCK
SDL_UpdateTexture(sdl.texture.texture, NULL /* update whole texture */, surface_test->pixels, surface_test->pitch);
#else
SDL_UnlockTexture(sdl.texture.texture);
#endif
SDL_RenderClear(sdl.renderer);
SDL_RenderCopy(sdl.renderer, sdl.texture.texture, NULL, &sdl.clip);
SDL_RenderPresent(sdl.renderer);
break;
#else // !SDL_VERSION_ATLEAST(2,0,0) And after So this way it works, no deadlocks, etc. Sure it can be done better probably, but the idea works, and worth implementing. |
@kas1e , another great investigation! Given Ryan recommends using Edit: I now see your message above, and this should be sufficient to prepare a commit; thank you! |
edit: about another SDL2 issue create another ticket: #178 |
Code suggested above is not for dosbox-staging - it is for DOSBox upstream with SDL 2.0 patch applied (as I can tell by
Setting |
@dreamer |
@kas1e if something wouldn't be right, we would point it out during review (in PR), asking for adjustments, before new code would reach master branch. I really need to prepare contributing guidelines - I have it on my TODO list for too long already - basically I am ok with accepting changes from non-programmers (not only code needs fixes! documentation, translations, packaging work, etc). I guess @krcroft will assign himself as soon as he'll start working on this - until then, this improvement is up for grabs for anyone :) |
Confirming: I'm finishing up CD-DA, then this will be next (if it's not picked off by someone else). |
That discussion on SDL forum was from 2013, it might not reflect the current state of affairs. Right now, documentation for SDL_UpdateTexture says:
This is not our use-case, we want to update screen as fast as possible. It also says:
Which makes sense, if the locked texture is indeed optimised for write-only operation. Anyways, there's an important note in the documentation for SDL_UnlockTexture:
Which is: Bug 1586 - SDL_LockTexture has inconsistent behaviour with different render drivers Closed as WONTFIX, because inconsistent behaviour is deliberate optimization. Where do we go from hereWe need to measure and put some numbers and facts in here instead of mere assertions. More info is needed.
SDL_SetWindowTitle happens sometimes between locks, sometimes outside locks, same for all other ones. But even if those calls were happening always while texture lock is on - why is this a problem? Writing to texture has nothing to do with those calls - I event think it's GOOD that those operations are happening while the texture is locked, as it should keep SDL busy while we write to the texture. I guess the real problem here are deadlocks in SDL2 implementation on AmigaOS, right? Those happen with all As for the code: I don't like the ifdefs in here - this means an additional build configuration, so one would always be under-tested - it's bad from a maintenance perspective, as it doubles testing effort when developing new features. Also, I don't want to implement a seemingly random solution without fully understanding what is the root cause of the problem. @kas1e please confirm if deadlocks are happening with all available texture renderers. I will try to devise some tests to measure the performance difference between lock/unlock/update and various renderers, but I consider this issue a medium priority (because |
Yes, deadlocks happen with opengl and texture for sure. They did not happen with surface. About why you can do stuck between locks: i, not skilled programmer, but I know the golden rule of all times: if you lock something, better to unlock it faster, without doing much in between. And yes, about SDL_UpdateTexture it strange: in a wiki, they say it slower, but Rian itself says it's in another way around faster and better: https://forums.libsdl.org/viewtopic.php?t=9728 My tests show, that at least on my side, SDL_UpdateTexture feels even a bit faster. At least by tests via pcpbench it's not slower for sure. As for ifdefs, of course they didn't mean be there. I only show how it in my local build now, but they should be used as default, and lock/unlock needs to be removed. Of course, if you think you can do so. I can easily do it myself locally for amigaos4 builds. |
You mean with
In this case it's not really a synchronization primitive… Functions Lock/Unlock could be named GetFastWriteableSurface/UploadFastWriteableSurface.
If it'll turn out to be faster overall on other OSes, then I will likely switch, I just need some empirical data to make an informed decision. I would prefer to avoid ifdefs or introducing new user settings. It is possible, that source of lockups is something completely different, and if we won't address the real issue here, it will come back and more workarounds will be required. |
Interesting development: I made test implementation of this and it fixed #149 on Windows 10 (at least for |
Lots of complexity reduction in that commit, @dreamer - this is great. In the prior code, I suspect if We dropped print statements beside each, we would see a tit-for-tat pattern in the good scenarios (Lock->Unlock; Lock->Unlock), but not in the bad (Lock, Lock->Unlock, Unlock, Lock). New code boils that away and does it atomically :-) Very nice. |
I did some performance testing for Quake 1, timedemo 1,
So it seems like a tiny loss of performance using low resolution (bordering statistical error) and no significant change on medium resolution. I think I will do some small code adjustments to mitigate possible performance loss and it'll be ok to replace old implementation. |
Yeah, sounds good ! thanks! |
@kas1e Can be tested on branch |
If the problem returns (or was not completely fixed), feel free to reopen this issue :) |
Found lately, that SDL2 version of DosBox (be it original patch from Vogons, or be it current staging version) do things between locking and unlocking of texture (while shouldn't). It seems that on popular oses such as Win, Unixes, and macOS, OS can deal with it, but on less robust oses doing a lot of things between lock/unlock causes deadlocks.
In the case of DosBox, we have between a lock and unlock calls to SDL_SetWindwoTitle(which should be outside of course), as well as do call SDL_GetTicks, SDL_Events and probably some random Prinfs() there and there.
So, the solution for that is to remove the lock and unlock, and replace it in favor of SDL_UpdateTexture, which will mean, that all lock/unlock will happens internally inside of SDL_UpdateTexture, and DosBox can't do anything in-between so deadlocks should go.
Also, as krcroft point out on Vogons, in that thread: https://forums.libsdl.org/viewtopic.php?t=9728
authors explain that usage of SDL_UpdateTexture can be even a little bit faster, which also will improve things for DosBox.
Is anyone in interest to make some tests about? One of our programmers suggest about what we need to do probably:
I for myself, not a programmer so only can help with the idea and testing, so if anyone able to try, I will be happy to test / measure / check it all.
Thanks!
The text was updated successfully, but these errors were encountered: