OpenGL: Always use a PBO in EncodeToRamUsingShader #4505
To be honest, I have no clue why this should be faster. But I'm all for less code. Using the PBO redundantly shouldn't be any issue either, so LGTM.
I'm a bit wary of running the memcpy loop when it doesn't need to be a loop (i.e. in the cases that would have gone to the other branch, which was removed). When that condition would have been false, perhaps make it just one big memcpy instead of a loop?
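One way to read the suggestion above, as a minimal sketch rather than the code in this PR: when the destination rows are packed as tightly as the source rows, a single memcpy covers the whole image, and the per-row loop is only needed when the destination has padding between rows. The helper name `CopyRows` and the parameters `line_size` and `dst_stride` are hypothetical, and `src` stands in for the pointer returned by mapping the PBO.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not taken from the PR: copies `height` rows of
// `line_size` bytes from a tightly packed source (standing in for the mapped
// PBO) into a destination whose rows start `dst_stride` bytes apart.
void CopyRows(std::uint8_t* dst, const std::uint8_t* src, std::size_t line_size,
              std::size_t dst_stride, std::size_t height)
{
  if (dst_stride == line_size)
  {
    // Rows are contiguous in both buffers, so one large copy is enough.
    std::memcpy(dst, src, line_size * height);
    return;
  }

  // Destination rows are padded, so copy one row at a time.
  for (std::size_t i = 0; i < height; ++i)
    std::memcpy(dst + i * dst_stride, src + i * line_size, line_size);
}
```

Whether the single-copy path is measurably faster would need profiling; the loop of small copies may already be cheap compared to the readback itself.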
Apparently this does make the transition A LOT faster. Weird. Edit: More testing has made it apparent that the gain on the transition is small. Still a 3-5% average improvement across many transitions.
nullptr);
u8* pbo = (u8*)glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, dstSize, GL_MAP_READ_BIT);

for (size_t i = 0; i < dstHeight; ++i)
// But instead we always copy the data via a PBO, because macOS inexplicably prefers this for some
// reason.
glBindBuffer(GL_PIXEL_PACK_BUFFER, s_PBO);
glBufferData(GL_PIXEL_PACK_BUFFER, dstSize, nullptr, GL_STREAM_READ);
glReadPixels(0, 0, (GLsizei)(dst_line_size / 4), (GLsizei)dstHeight, GL_BGRA, GL_UNSIGNED_BYTE,
I switched to doing the |
I'm fine with it as it is.
This improves performance significantly on macOS, most noticeably in the Super Mario Sunshine transition, which goes from ~5FPS to ~17FPS.
I'm new to OpenGL and GPU stuff, so I have no idea if there's a performance penalty on other drivers/platforms, or if this change even makes sense, but it's definitely a more-than-3x speedup in some situations.