-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
D3D11: Implement non-blocking CPU EFB access #3652
Conversation
I'd really prefer if we could have one implementation of that, however it's done. |
Some performance numbers: Monster Hunter Tri: Before - 4 FPS, After - 71 FPS, OGL - 3 FPS Fixes issue 8427 and issue 8247, one of which are regression. Thus, this should be considered for 5.0. Also, look at those perf numbers. |
I don't think this should go into 5.0. It's a new feature, not a fix. As we're an emulator, of course most new features are fixes for games, but not fixes for the emulator itself. But the byte order fix might go to 5.0, indeed. |
Correct, however this is more a precaution against drivers being silly. Keeping a texture mapped while calling Present (or, submitting commands) seems like something that would never pop up during "normal" workloads and is probably an untested path. |
CD3D11_RECT src_rect(0, 0, m_target_width, m_target_height); | ||
CD3D11_VIEWPORT vp(0.0f, 0.0f, EFB_WIDTH, EFB_HEIGHT); | ||
D3D::context->RSSetViewports(1, &vp); | ||
D3D::context->OMSetRenderTargets(1, &m_efb.depth_read_texture->GetRTV(), nullptr); |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
if you are interested in make it work for dx12 here you have it implemented. |
Review status: 0 of 9 files reviewed at latest revision, 3 unresolved discussions. Source/Core/VideoBackends/D3D/FramebufferManager.cpp, line 124 [r1] (raw file): Comments from the review on Reviewable.io |
Review status: 0 of 9 files reviewed at latest revision, 3 unresolved discussions. Source/Core/VideoBackends/D3D/FramebufferManager.cpp, line 124 [r1] (raw file): Comments from the review on Reviewable.io |
Huge speedup, more accurate, what's not to like? Verified to work at > 1x IR now. |
Code duplication and feature freeze? :) |
Review status: 0 of 9 files reviewed at latest revision, 7 unresolved discussions. Source/Core/VideoBackends/D3D/FramebufferManager.cpp, line 265 [r2] (raw file): Source/Core/VideoBackends/D3D/FramebufferManager.cpp, line 282 [r2] (raw file): Source/Core/VideoBackends/D3D/FramebufferManager.cpp, line 292 [r2] (raw file): Source/Core/VideoBackends/D3D/FramebufferManager.cpp, line 302 [r2] (raw file): Comments from the review on Reviewable.io |
@Tinob It's not safe to typecast across pointer types like what is in your code. Consider using memcpy instead. |
I'm tempted to not call this an "EFB cache" anymore, since we're just accessing a texture that's potentially in vram, not making a copy and storing it in system memory. A note on accuracy: For EFB pokes, should we be applying the same logic and handling 16-bit Z/16-bit color in the same way as peeks? IRT to sharing code with GL, I'm not sure there is much that can actually be shared. The majority of the work is in FramebufferManager::MapEFB{Color,Depth}Copy, and this is all D3D-specific code. The only stuff that could be shared is in Renderer::AccessEFB, and that's just a few lines now. I've implemented the same idea for GL using pixel buffer objects, but it seems slower when only reading a few pixels (but faster when reading pixels further apart, or a larger count than the block sizes). D3D has to go through a copy/map path always, so there isn't really a comparison here. So I'm not sure what the best option is? |
Any update on this? |
What's holding this PR back? |
"Code duplication and feature freeze? :)" |
@BhaaLseN It fixes a regression. |
This is done through a copy of the buffers that is invalidated once the next draw occurs. If multiple accesses occur inbetween draw calls, the read won't block.
@RisingFog It looks like just fixing what lioncash and possibly Tinob have mentioned on Reviewable. |
for me this causes performance problems |
@emko What is "for me"? Which GPU do you have? |
Are you sure it isn't because D3D EFB Access got fixed recently? |
just to clarify emko report: he was testing performance betwen master and ishiiruka and find out that there is a big performance diff betwen dx11 master vs dx11 ishiruka, we started to track down the source and i finish discovering that the difference was that ishiirukka have this pr already merged, i send him a link to this version to be sure and yes the speed lost for him was the same for this pr with master code. for his profile i see that he use a nvidia 970 with a intel 2600k, it seems to be a problem in that specific card because the rest of the configurations i tested have huge improvements. |
@degasus for me because i see others reporting improved performance, could it be my GPU GTX 970? i will test out the games JMC47 seen performance improvements in maybe its just this game |
High IR 4X native what i use most of the time lower the resolution to 3X native i get Before - 27 FPS, After - 27FPS |
Now that we've released 5.0 and are out of feature freeze, is there anything holding this PR back? (Other than a rebase) |
Trying implementing the EFB cache as a staging texture that is kept mapped until it is invalidated. Will be interesting to see what the overheads are here compared to the GL implementation which copies blocks at a time.
Should also work for MSAA and >1xIR cases.