Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GSDX-DX11: Partial port of OGL SW blend to DX11 / changes to blend map of both OGL/DX11. #2973

Merged
merged 3 commits into from Jun 13, 2019

Conversation

Projects
None yet
6 participants
@hibye8313
Copy link
Contributor

commented May 27, 2019

Changes:
GSDevice11.h:

  • class PSConstantBuffer: Removed variable (MinF) that was not being used and structured to be closer to OGL side.
  • class PSSelector: Added variable to for selecting accu blend in the shader.

GSRenderer11.cpp:

  • EmulateTextureSampler(): Removed variable (MinF) that was not being used and structured to be similar to OGL.
  • DrawPrims(): Added checking/enabling accu blend. Small restructure to alpha test code so it is clearer.

GSRendererOGL.cpp:

  • DrawPrims(): Same restructure as above for alpha test clarity.

GSDeviceOGL.h:

  • class PSConstantBuffer: Some things seem incorrect here, I'm not sure. Seems as if Update() does not fully update the buffer as it is. I put in a FIXME comment.

GSTextureFX11.cpp:

  • SetupPS(): Added defines for shader to enable accu blend.
  • SetupOM(): Added changes to the HW blend state when accu blend is enabled.

tfx.fx:

  • cbuffer cb1: Changed structure to match OGL side and the changes made above to PSConstantBuffer in
    GSDevice11.h.
  • added PS_ACCU_BLEND define
  • ps_main(): Added code for doing the accu blend when enabled. Also added formatting to 16 bit output when accu blend is enabled since needed for fixing Tag Team Racing shadows.

GSVertexTrace.cpp:

  • CorrectDepthTrace(): seems like this function is unnecessarily complex. Added a FIXME as a comment.

Maybe I made too many changes and it will be hard to review them all? Let me know if I should remove some of structural changes that were not 100% necessary.

@hibye8313 hibye8313 changed the title ported accu blend to DX11. some small changes to code stucture. GSDX-DX11: ported accu blend to DX11. some small changes to code stucture. May 27, 2019

@lightningterror lightningterror added this to the Release 1.6 milestone May 27, 2019

@lightningterror

This comment has been minimized.

Copy link
Member

commented May 27, 2019

There seems to be an issue with the shadows, they appear fully black.

@lightningterror

This comment has been minimized.

Copy link
Member

commented May 27, 2019

I recommend only doing the accumulation blend changes and everything else reverted.
Anything else can be done in a future pr. It's quite a bit messy atm.

Edit: Otherwise you can also split the commit in smaller ones: accu blend, texture sampler changes, constant buffer changes for example

@hibye8313

This comment has been minimized.

Copy link
Contributor Author

commented May 27, 2019

OK, sounds good I'll split the commits into several smaller changes and remove the unnecessary changes like you said. Might be till tomorrow that I can fully revise, have to go to work soon =(.

@gregory38

This comment has been minimized.

Copy link
Contributor

commented May 27, 2019

Take your time. A good and tidy PR is 100 times faster to review than a bad one.

Besides it would be easier to spot bug and discuss on your code

@gregory38

This comment has been minimized.

Copy link
Contributor

commented May 28, 2019

By the way the depth test code purpose was to avoid the branch (but as it is never hit probably not important). Early exit seems a good idea but truth is that depth will be equal nearly all times.

It would be nicer to benchmark code because I don't know which code is faster.

@gregory38

This comment has been minimized.

Copy link
Contributor

commented May 28, 2019

Otherwise it is much better with only accu change

@hibye8313

This comment has been minimized.

Copy link
Contributor Author

commented May 28, 2019

@gregory38 Hmm, didn't think of that but yeah makes sense the old way if most of the time we have to loop the whole way anyway.

@lightningterror It seems to be working from my end with default DX11 setting and compiling for debug build. Is it showing black in the full shadow volume or just the ground?
Also I thought accu blend was only useful when CLAMP=0 and using HDR, or am I wrong? Otherwise the blend equation can be done fully in HW and since there is no wrapping there is no problem for HW to clamp values right? Or perhaps the reason for using accu blend is also so that HW will not clamp Af/As values? Just to clarify, does HW blending on DX11/OGL clamp the blend factors (source alpha/color, fixed blend factor) to range [0, 1] before the blending or just clamp the final color/alpha when writing to the framebuffer? i.e if we used As = 192/128 = 1.5 will HW use 1.5 when blending or 1.0? I know I should probably test this myself but its easier to ask : )

Edit: I checked the OGL side in GSRendererOGL.cpp, EmulateBlending() where is does the check for COLCLAMP==0, accumulation blend seems to be only used with HDR algo.

I revised the commit to only contain the changes needed for the accu blend, hope that is better.

I think it is still important to make the changes to constant buffer so that will be more maintainable in the future. Is it ok to do this as a separate commit on top of the current one or should that be a separate pull request?

Thanks for feedback both.

@lightningterror

This comment has been minimized.

Copy link
Member

commented May 28, 2019

Fifa Street and Genji should be a good testcase for accumulation blend.

Fifa Street fbmask.zip
GENJI [SCPS 15095] (J) Misaligned Bloom - Leaves.zip

Keep in mind both of them use fbmask, basic for fifa, genji high on d3d.

@hibye8313

This comment has been minimized.

Copy link
Contributor Author

commented May 28, 2019

Fifa seems to be working correclty on DX11 as compared with SW renderer. However on Genji I am getting problems on both OGL side and DX11. On DX11 the colors are not bright enough in the leaves so probably some alpha multiplier is wrong, on OGL the colors in the leaves are good but the door on the right side it not showing correctly (tried highest blending accuracy). Looking into the issues now.
DX11:
image
OGL:
image
SW:
image

@gregory38

This comment has been minimized.

Copy link
Contributor

commented May 28, 2019

Feel like a missing effect on dx (and a texture cache issue on gl)

@gregory38

This comment has been minimized.

Copy link
Contributor

commented May 28, 2019

Btw, GS uses integer blending. So it doesn't round but trunc. Even with standard clamping it could have an impact. Need to check god of war, I think it uses it at the start of the game but don't remember for what.

@hibye8313

This comment has been minimized.

Copy link
Contributor Author

commented May 28, 2019

@gregory38 Right, but what I was referring to was on the GS, alpha = 0x80 means multiply by 1.0. But if alpha > 0x80, say 0xC0 then the multiply will be by 1.5 so colors get brighter than normal. Is this true or does the GS clamp alpha to [0, 0x80] before blending? I'm 95% sure on GPU's alpha/color output from the pixel shader are clamped to [0,1] before blending so I'm guessing that could lead to certain blending inaccuracy when HW blending.

@gregory38

This comment has been minimized.

Copy link
Contributor

commented May 28, 2019

Ah yes. GPU will clamp to 1 whereas GS can go up to 255/128. Maybe the reason I did it also for clamping and 32 bits. I don't remember but you need to do it for all cases.

@gregory38

This comment has been minimized.

Copy link
Contributor

commented May 28, 2019

Actually GPU it depends on the format. With float you can have any alpha but also any color. as far I know It isn't possible to clamp only color (which would solve most of our issue).

@lightningterror

This comment has been minimized.

Copy link
Member

commented May 28, 2019

The issue with Genji fbmask breaking is it because it also needs sw blending (not sure if accumulation or not) for fbmask to work correctly.

Anyway what I said earlier about these two being good test cases was because they use accumulation blend when clamp is 1. Current dx pr code doesn't allow using accumulation when clamp is 1.

@hibye8313

This comment has been minimized.

Copy link
Contributor Author

commented May 29, 2019

@gregory38 I see, thank you.

@lightningterror Working on it now, I think I might have broken something. Ok I looked at the OGL code in GSRendererOGL::EmulateBlending() again and you are right accumulation blend is used in cases where COLCLAMP=1, I will fix this.

Show resolved Hide resolved plugins/GSdx/res/tfx.fx Outdated
@hibye8313

This comment has been minimized.

Copy link
Contributor Author

commented Jun 2, 2019

@lightningterror Just general question to make sure I understand the purpose of accumulation blend without the HDR algo for COLCLAMP=0. I think we want to compute the Value of CsAf/CsAs in the pixel shader to prevent early clamping of Af/As since it is better to have CsAf/CsAs clamped incorrectly than As/Af (which are probably > 1 more often)?

@lightningterror

This comment has been minimized.

Copy link
Member

commented Jun 9, 2019

Sure, we can do the optimizations later.

@gregory38

This comment has been minimized.

Copy link
Contributor

commented Jun 9, 2019

My feeling about this PR. Short story

Pr should be split into 2 pr.
1st with accu blend (can keep sw blending infrastructure on shader and such)
2nd with remaining bits of sw blending.

You can keep mostly everything but the part that enable sw_blending in renderer.
Basically impossible blend, alpha check, fbmask check.... shall be removed

Accu blend is really accuracy. Sw blending in this pr is about fixing broken code with broken differently code.

@hibye8313

This comment has been minimized.

Copy link
Contributor Author

commented Jun 9, 2019

@gregory38 Okey doke, this PR was originally supposed to be only for accu blend so I will keep it that way. I structured the PR again into 3 commits as before but on the last commit kept only accu blend in EmulateBlending().

On pixel shader every opcode counts. A PS could run millions of time. Nothing is free.
Masking doesn't make sense when you need to clamp. It is either clamping or masking.

I see, makes sense why you made it like that then. I removed the unnecessary mask from DX side as well.
COLCLAMP=0 is now handled only by HDR algo so COLCLIP is not needed in the shader and so is removed.

@lightningterror Fixed the PR titles, added the CLR1 check to EmulateBlending() and early exit when blending is disabled. Also removed the isCLR1() method from OMBSelector struct since it was no longer needed.

@lightningterror lightningterror self-requested a review Jun 10, 2019

@hibye8313

This comment has been minimized.

Copy link
Contributor Author

commented Jun 11, 2019

Latest update it mostly just fixing my own mistakes/minor structure changes.

I see now that alpha is already formatted correctly in the D3D11 shader in the alpha correction stage so I removed it from the blend stage (I had added it the last update).

@gregory38 I still don't know exactly what to do about the private/protected/public modifiers. I think it might be the simplest/cleanest to just make m_blendMap a protected (instead of private) member of GSDevice even though it will give access to GSDeviceOGL/GSDevice11.

@lightningterror

This comment has been minimized.

Copy link
Member

commented Jun 11, 2019

You'd still end up with two same modifiers depending on where the code is placed, either two protected or two public because code is read in order. Honestly I'd leave it as it is.

I'm fine with the current changes, some cleanups/improvements can be done here and there but let's get the base pr merged, they can be done afterwards.
If Greg is fine with the changes I'll merge it.

hibye8313 added some commits Jun 6, 2019

GSdx-d3d11: Partial port of EmulateBlending() from OGL renderer to DX…
…11 renderer and ps_blend() from OGL shader to DX11 shader (only accumulation blend).
@hibye8313

This comment has been minimized.

Copy link
Contributor Author

commented Jun 12, 2019

@gregory38 yes, I moved the struct out of the class so now we can have just three access modifiers, private, protected, public. I also changed the #defines for blend flags to an enum since that's how it seems to be handled for other constants in this file like ShaderConvert/ChannelFetch. I changed ONE, ZERO => CONST_ONE, CONST_ZERO, and ADD, SUBTRACT, REV_SUBTRACT => OP_ADD, OP_SUBTRACT, OP_REV_SUBTRACT.

@lightningterror Great, glad to see things are moving forward with this PR. Of course, much thanks/credit to you and gregory for helping to improve it a lot.

@lightningterror lightningterror merged commit 60cf62f into PCSX2:master Jun 13, 2019

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@lightningterror

This comment has been minimized.

Copy link
Member

commented Jun 13, 2019

Merged, I did some minor cleanups myself as well, anyway nice work :)
Now we can focus on other stuff, maybe the constant buffer issues can be next.

@hibye8313

This comment has been minimized.

Copy link
Contributor Author

commented Jun 15, 2019

@lightningterror Thanks for merging the PR! Unfortunately not sure how much time I'm going to have for future issues for at least some time because of job stuff coming up : (, but I'm glad we managed to finish this one.

@avih

This comment has been minimized.

Copy link
Member

commented Jun 15, 2019

Unfortunately not sure how much time I'm going to have for future issues for at least some time because of job stuff coming up : (, but I'm glad we managed to finish this one.

@hibye8313 Code is rarely "final", and many times new code requires fixes and improvements after the initial merge, etc. As it's currently your code, you're expected to stay around and handle related issues, if such issues come up.

@lightningterror

This comment has been minimized.

Copy link
Member

commented Jun 15, 2019

@hibye8313 We can include BLEND_NO_BAR too, needs to be tested if it's actually useful for games that use it.

@ssakash

This comment has been minimized.

Copy link
Member

commented Jun 15, 2019

@avih

As it's currently your code, you're expected to stay around and handle related issues, if such issues come up.

Not really, @hibye8313 is just a contributor making changes to a module in PCSX2. He's under no obligation to stick around and fix issues in case the code ends up being wrong. The responsibility falls on the person who merges the code.

The way you phrase it, it makes it seem like there's some clause bounding him to stick to the project and look into the issues, you could've been a bit verbose to avoid a potential of it being interpreted as an imperative tone.

@avih

This comment has been minimized.

Copy link
Member

commented Jun 15, 2019

No one is under any obligation, issues do come up occasionally with new code, and the best person to handle them on such case is typically whoever wrote the new code (assuming it's not some minor patch).

Please let's not turn this into anything which it isn't.

@ssakash

This comment has been minimized.

Copy link
Member

commented Jun 15, 2019

No one is under any obligation, issues do come up occasionally with new code,


I'm personally aware that you didn't mean it that way, just saying there's the possibility of it being interpreted that way, which could've been avoided.

the best person to handle them on such case is typically whoever wrote the new code

Yes, but that doesn't mean the team can pester contributors to keep maintaining the code by having unnecessary expectations on them.

@avih

This comment has been minimized.

Copy link
Member

commented Jun 15, 2019

The only one pestering here is you. Just drop it. If my clarification isn't enough for you then please just go on and on and on. Preferably elsewhere.

@lightningterror

This comment has been minimized.

Copy link
Member

commented Jun 15, 2019

@avih Sometimes the author can no longer maintain said feature for various reasons. In this case responsibility falls to team members to take care of (worst case scenario code can be disabled/reverted) until fixed properly.

What you should be doing here as a fellow team member which has some responsibility is encourage contribution, if someone can't continue maintaining it then that's perfectly fine, maintainers/team members can decide the best course of action so that everyone is happy.

@avih

This comment has been minimized.

Copy link
Member

commented Jun 15, 2019

you two really take it the wrong way. So be it, apologies all around. I'm out of here.

@gregory38

This comment has been minimized.

Copy link
Contributor

commented Jun 15, 2019

I think you did a god contribution to the code. You likely spent quite some times to understand GS drawing. And then read my awful GL code. Hopefully you can enjoy your favourite games with less bug and that all that matters :)

If you have time and are willing to contribute, you are more than welcome. Don't hesitate to ping me or Lt (whatever for an idea of contribution or specific issue to tackle)

@lightningterror

This comment has been minimized.

Copy link
Member

commented Jun 16, 2019

Regarding Genji fbmask maybe it is better to detect it based on alpha abcd value and disable it if it matches instead of mask value.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.