-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First pass at backwards compatible pygame alpha blit #2213
Conversation
Accidentally left the test on for SDL1 builds.
Some speed tests: new-default SDL1-style alpha-blit (using SSE2):
old-default SDL2 style blit (uses some form of intrinsics in SDL2):
new-default SDL1 style alpha-blit (without any SSE2/NEON intrinsics)
So the new SDL1 blitter is fairly comparable, though slightly slower, than the default SDL2 one and both of them are a lot faster than not using intrinsics. It's possible we could improve the speed a bit without affecting the accuracy, I'm not an SSE2 intrinsics expert. |
Was using this to do speed testing (changing the flag to switch to the SDL2 blitter):
|
There is a separate path for the arm blitters in SDL2 and it and, at least in the SDL version used in the CI, outputs different results than on other platforms. This behaviour is likely to change in the future as changes have gone into the arm blitters. This is not something we can fix as this test is specifically for the SDL 2 blitter.
Various SSE2 shift functions require compile time constants so if we wanted RGBA surfaces to have SSE2 optimised blends they would need their own version of the function.
This will bring them into line with SDL1 but they are about half as fast as 16bit surfaces using the SDL2 blitter.
This doesn't bring us into line with SDL1 *exactly* in this case but it does put the alpha blits down the same path. The new SDL1 mimic-ing blitter seems closer than the SDL2 blitter to the original SDL1 blitter here. A test was also added but then disabled because the blit is still slightly off the SDL1 original. Surfaces with colorkeys have also been excluded from the new path.
Also makes sure that surfaces with just surface alpha get passed through, though they can't use the SSE2 enhanced blit so we need to see how that works with surfaces with per pixel alpha AND surface alpha. Clarified some test descriptions as well and re-enabled original test for this issue as it now passes.
Adding this means we can have pixel alpha and surface alpha in the same source surface for a blit, but ti does slow the whole blit function down doing the extra maths. Think the next step may be to split the SSE2 blitter into several with different feature support, this seems to be the approach taken by SDL2. Also this needs a test.
Supporting surface alpha is makes the blitter slower, but because it is a surface level thing we now test for it at the start and send the blitter down two different SSE2 paths - one with surface alpha support, and one without. The one with support seems faster than the SDL2 version. Also added a test for source surfaces with source alpha.
Checked over the test cases in the original report (#1289) and they all look/print like they do in SDL1 now. So that's nice. |
Thought the name was a bit clearer, env var is actually 'PYGAME_BLEND_ALPHA_SDL2'. I added getting the environment variable into pygame_init() because I didn't want to get it on every blit. adding: os.environ['PYGAME_BLEND_ALPHA_SDL2'] = '1' before you call pygame.init() should now set the blit mode for every blit.
With this PR you should now be able to do:
To set the blitter back to how it worked before this PR. That's in addition to the |
…aque This version seems equivalent to SDL2 speed wise which is good as I expect alpha surfaces to an opaque background is a fairly common operation.
Ok, that is probably enough fiddling. To summarise:
This should resolve all the cases raised in #1289 and maybe will help with some others in the tracker, I haven't checked through carefully yet. |
Speed wise, in my tests (rough code posted above):
I don't have any particular ideas for improving that right now. |
Nice work :) This also fixes the issue in #742 This example used to error with:
import pygame as pg
pg.init()
surf = pg.display.set_mode((320,240))
font = pg.font.SysFont("Arial", 24)
image = font.render("Test", 0, (255,255,255), (0,0,0))
print(image.get_colorkey()) #returns None in both SDL1/SDL2.
image.set_alpha(255)
surf.blit(image,(0,0)) But now it doesn't crash :) I committed this test case to the branch in font_test.py. |
Hard to debug this remotely but I have a suspicion that the issues might be caused by big endian bit shifting. This commit attempts to alter that by not bit-shifting the alpha around and flipping some masks I was using. I understand that SIMD registers are always little endian so any shifting done there should not be affected.
Previously we were checking the Ashift was 24, which it won't be if this is an endianness caused issue.
Just setting the src alpha to 128 to see what happens in the output, if anything.
Remote debugging is tricky.
Looks like the arm CI build doesn't use the SSE2 optimised path at all, so the test issue was likely coming from a lack of support for surface_alpha() in the non-sse2 version of the alpha blend which I've now changed. Fingers crossed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_/=====\_
| great! |
---------\
`_
🦜
Hopefully this doesn't cause chaos in everyone's apps, but I'm bracing for it just in case :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's something weird going on with solarwolf.
The masks were 16 bit rather than 32 bit. Oops.
This is my current solarwolf related issue with this PR on windows: import pygame
pygame.init()
def textshadowed(color, text, center=None, pos='center'):
font = pygame.font.Font(None, 32)
darkcolor = [int(c // 2) for c in color]
if text is None: text = ' '
try:
img1 = img2 = font.render(text, 1, color)
img2 = font.render(text, 1, darkcolor)
except (pygame.error, TypeError):
img1 = img2 = pygame.Surface((10, 10))
newsize = img1.get_width() + 2, img1.get_height() + 2
img = pygame.Surface(newsize)
print("start:")
print("img.get_flags():", img.get_flags())
print("img.get_at((0, 0)):", img.get_at((0, 0)))
print("img.get_colorkey():", img.get_colorkey())
print("img.get_shifts():", img.get_shifts())
print("img.get_masks():", img.get_masks())
print("img.get_bitsize():", img.get_bitsize())
print("img.get_bitsize():", img.get_bytesize())
img.blit(img1, (0, 0))
img = img.convert()
print("\nafter blit:")
print("img.get_flags():", img.get_flags())
print("img.get_at((0, 0)):", img.get_at((0, 0)))
print("img.get_colorkey():", img.get_colorkey())
print("img.get_shifts():", img.get_shifts())
print("img.get_masks():", img.get_masks())
print("img.get_bitsize():", img.get_bitsize())
print("img.get_bitsize():", img.get_bytesize())
img.set_colorkey((0, 0, 0), pygame.RLEACCEL)
print("\npost set_colorkey:")
print("img.get_flags():", img.get_flags())
print("img.get_at((0, 0)):", img.get_at((0, 0)))
print("img.get_colorkey():", img.get_colorkey())
print("img.get_shifts():", img.get_shifts())
print("img.get_masks():", img.get_masks())
print("img.get_bitsize():", img.get_bitsize())
print("img.get_bitsize():", img.get_bytesize())
return img
screen = pygame.display.set_mode((320, 240))
screen.fill((255,255,255))
text_render = textshadowed((160, 200, 250), 'A', (190, 170), "midright")
screen.blit(text_render, (50, 50))
running = True
while running:
for event in pygame.event.get():
if event.type == pygame.QUIT:
running = False
pygame.display.update() So far I can't figure out why the colorkey isn't working after using the new blitter before it is set. |
…ygame into alpha-blend-back-compat
Tracked this back to SDL and it seems colorkeys for RGB surfaces always have 00 in the alpha spot in SDL even if it reports 255 on the pygame side.
This adds an SDL1 style blitter for alpha blending where both source and destination surfaces have an alpha channel. It also adds a flag to disable this blitter in favour of reverting to the SDL2 one.
To Do
Currently we are only handling ARGB surfaces in the SSE2 version, can we tweak this for other 32 bit surface formats e.g. RGBA. Do we need to?Looks like SDL only handles 'ARGB/ABGR' formats in my tests, basically where the alpha is at shift 24, though whether to call this ARGB or RGBA with big endian and little endian seems to be a minefield in itself. The current code should handle any 32bit surfaces where the alpha is shifted by 24bits anyway.is it worth doing anything for these? They should be handled by the slow, non-sse2, blitter but we aren't testing that right now.These are now handled by the non-sse2 blitter & tested. It is about half the speed of the SDL2 16 bit surface blitter, if you care about speed over accuracy in 16bit blits you'll want to use the SDL2 blitter.blitting surface with alpha to surface without and visa versa - does this need handling or do the SDL1 & SDL2 blitters behave the same?Investigated this, it appears that SDL1 goes down a different path when the destination surface is opaque so the new blitter path doesn't match the SDL1 output. I put these blits down the new path anyway because it looks more accurate to SDL1 than the SDL2 blitter.