OpenGL: Always use a PBO in EncodeToRamUsingShader #4505

hthh · 2016-12-09T08:50:24Z

This improves performance significantly on macOS, most noticeably in the Super Mario Sunshine transition, which goes from ~5 FPS to ~17 FPS.

I'm new to OpenGL and GPU stuff, so I have no idea if there's a performance penalty on other drivers/platforms, or if this change even makes sense, but it's definitely a more-than-3x speedup in some situations.

degasus · 2016-12-09T09:14:35Z

To be honest, I have no clue why this should be faster. But I'm all for less code. Using the PBO redundantly shouldn't be any issue either, so LGTM.

endrift · 2016-12-09T09:15:55Z

I'm a bit wary of running the memcpy loop when it doesn't need to be a loop (i.e. when the code would have taken the other branch, the one that was removed). When that condition would have been false, perhaps do a single big memcpy instead of a loop?

JMC47 · 2016-12-09T09:21:37Z

Apparently this does make the transition A LOT faster. Weird.

Edit: More testing has made it apparent that the gain on the transition is small. Still, it's a 3-5% improvement on average across many transitions.

Source/Core/VideoBackends/OGL/TextureConverter.cpp

+  // But instead we always copy the data via a PBO, because macOS inexplicably prefers this for some
+  // reason.
+  glBindBuffer(GL_PIXEL_PACK_BUFFER, s_PBO);
+  glBufferData(GL_PIXEL_PACK_BUFFER, dstSize, nullptr, GL_STREAM_READ);
+  glReadPixels(0, 0, (GLsizei)(dst_line_size / 4), (GLsizei)dstHeight, GL_BGRA, GL_UNSIGNED_BYTE,
+               nullptr);
+  u8* pbo = (u8*)glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, dstSize, GL_MAP_READ_BIT);
+
+  for (size_t i = 0; i < dstHeight; ++i)



hthh · 2016-12-12T09:45:11Z

I switched to doing the memcpy for the whole block when possible, as suggested by endrift and stenzek. Let me know if there's anything else anyone wants changed, but I'm happy for it to be merged as-is.
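For readers following along, the "one memcpy when possible, otherwise a loop" idea can be sketched roughly as below. This is not the code from this PR: `CopyRows` and its parameter names are hypothetical, and the sketch assumes the source (for example a mapped PBO) is tightly packed while the destination rows may be padded.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not the code from this PR. Copies `height` rows of
// `line_size` bytes from a tightly packed source (e.g. a mapped PBO) into a
// destination whose rows start `dst_stride` bytes apart.
void CopyRows(std::uint8_t* dst, const std::uint8_t* src, std::size_t line_size,
              std::size_t dst_stride, std::size_t height)
{
  if (dst_stride == line_size)
  {
    // Rows are contiguous on both sides, so one memcpy covers the whole block.
    std::memcpy(dst, src, line_size * height);
    return;
  }

  // Destination rows are padded, so copy one row at a time and skip the padding.
  for (std::size_t y = 0; y < height; ++y)
    std::memcpy(dst + y * dst_stride, src + y * line_size, line_size);
}
```

The single-copy path only applies when the destination stride equals the line size; any padding between destination rows forces the per-row loop.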

degasus · 2016-12-12T09:56:09Z

I'm fine with it as it is.

stenzek reviewed Dec 9, 2016


Tinob added a commit to Tinob/Ishiiruka that referenced this pull request Dec 10, 2016

Merge dolphin-emu/dolphin#4505

abd2570

OpenGL: Always use a PBO in EncodeToRamUsingShader

801d1d1


hthh force-pushed the macos-likes-pbos branch from 1ebb1b9 to 801d1d1 Compare December 12, 2016 09:34

hthh changed the title ~~[RFC] OpenGL: Always use a PBO in EncodeToRamUsingShader~~ OpenGL: Always use a PBO in EncodeToRamUsingShader Dec 12, 2016

degasus merged commit 989cdc0 into dolphin-emu:master Dec 19, 2016
