AtlasEngine: Reduce shader power draw with explicit branching #12552

Merged
1 commit merged into from
Mar 14, 2022

lhecker
Member

@lhecker lhecker commented Feb 22, 2022

Many articles I read while writing this engine claimed that GPUs can't
branch the way CPUs can. One common approach to branching on GPUs is
apparently to "mask" out results, a technique called branch predication.
The GPU simply executes all instructions in the shader linearly,
and if a branch isn't taken, it discards the results of that computation.
This is unfortunate for our shader, since most of its branches are
seldom taken. The cursor, for instance, is only drawn
on a single cell, and underlines are rarely used.

But apparently modern GPUs (2010s and later?) are actually entirely
capable of branching, if all lanes ("pixels") processed by a
wave ("GPU core") take the same branch.
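To make the difference concrete, here is a minimal cost model in C++. It is not shader code and not part of this PR. The instruction costs and the names `predicatedCost` and `branchingCost` are made up for illustration, and real GPU scheduling is more involved than this.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical instruction costs, chosen only for illustration.
constexpr int kCommonCost = 10; // work every pixel performs
constexpr int kRareCost = 40;   // work inside a rarely taken branch (e.g. the cursor)

// Predication: the wave always executes the branch body and masks out
// the results for lanes that did not take the branch.
int predicatedCost(const std::vector<bool>& laneTakesBranch)
{
    (void)laneTakesBranch; // the cost does not depend on what the lanes do
    return kCommonCost + kRareCost;
}

// Real branching: the wave skips the branch body entirely unless at
// least one of its lanes needs it.
int branchingCost(const std::vector<bool>& laneTakesBranch)
{
    const bool anyLaneTakesBranch = std::any_of(
        laneTakesBranch.begin(), laneTakesBranch.end(),
        [](bool taken) { return taken; });
    return kCommonCost + (anyLaneTakesBranch ? kRareCost : 0);
}
```

Under this model, a wave in which no pixel touches the cursor pays only the common cost when branching, while a wave with a single cursor pixel pays the same as the predicated version.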

On both my Nvidia GPU (RTX 3080) and my Intel iGPU (Intel HD Graphics 530),
this change reduces power draw. The effect is most noticeable on the
latter, where power draw drops from 900 mW to 600 mW at 60 FPS.

PR Checklist

  • I work here
  • Tests added/passed

Validation Steps Performed

It seems to work fine on Intel and Nvidia GPUs.
Unfortunately I don't have an AMD GPU to test this on, but I suspect it can't be worse.

@miniksa
Member

miniksa commented Feb 22, 2022

How are you testing power draw? I might be able to find an AMD GPU to test.

@lhecker
Member Author

lhecker commented Feb 22, 2022

> How are you testing power draw? I might be able to find an AMD GPU to test.

OpenHardwareMonitor

@lhecker lhecker added the Area-AtlasEngine, Area-Rendering, and Product-Terminal labels Feb 23, 2022
@lhecker lhecker force-pushed the dev/lhecker/atlas-engine-power-draw-reduction branch from 149f236 to 69ae3d2 March 3, 2022 19:34
Comment on lines -791 to -792
desc.AlphaMode = DXGI_ALPHA_MODE_IGNORE;

Member Author

This one is redundant with a ternary expression a few lines above, where we already set the correct AlphaMode if an HWND is present.

@@ -79,7 +79,7 @@ float4 DWrite_GrayscaleBlend(float4 gammaRatios, float grayscaleEnhancedContrast
     float3 foregroundStraight = DWrite_UnpremultiplyColor(foregroundColor);
     float contrastBoost = isThinFont ? 0.5f : 0.0f;
     float blendEnhancedContrast = contrastBoost + DWrite_ApplyLightOnDarkContrastAdjustment(grayscaleEnhancedContrast, foregroundStraight);
-    float intensity = DWrite_CalcColorIntensity(foregroundColor.rgb);
+    float intensity = DWrite_CalcColorIntensity(foregroundStraight);
Member Author

This was a minor bug in my DirectWrite implementation.
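
For illustration, the effect of that one-line fix can be sketched in C++. The helpers below are assumptions modeled on what `DWrite_UnpremultiplyColor` and `DWrite_CalcColorIntensity` presumably do. The 0.25/0.5/0.25 weights in particular are an assumption and are not taken from this diff.

```cpp
struct Float3 { float r, g, b; };
struct Float4 { float r, g, b, a; };

// Assumed behavior of DWrite_UnpremultiplyColor: divide RGB by alpha,
// leaving the color untouched when alpha is zero.
Float3 unpremultiply(Float4 c)
{
    if (c.a != 0.0f)
    {
        return { c.r / c.a, c.g / c.a, c.b / c.a };
    }
    return { c.r, c.g, c.b };
}

// Assumed behavior of DWrite_CalcColorIntensity: a weighted sum of the
// channels. The weights here are hypothetical.
float colorIntensity(Float3 c)
{
    return 0.25f * c.r + 0.5f * c.g + 0.25f * c.b;
}

// Before the fix: intensity computed from the premultiplied RGB.
float intensityBeforeFix(Float4 premultiplied)
{
    return colorIntensity({ premultiplied.r, premultiplied.g, premultiplied.b });
}

// After the fix: intensity computed from the straight (unpremultiplied) RGB.
float intensityAfterFix(Float4 premultiplied)
{
    return colorIntensity(unpremultiply(premultiplied));
}
```

With these assumed helpers, a half-transparent white foreground (premultiplied `{0.5, 0.5, 0.5, 0.5}`) gets an intensity of 0.5 before the fix and 1.0 after it, so the old code underestimated the intensity of any translucent foreground color.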

Comment on lines +135 to +151
float3 foregroundStraight = DWrite_UnpremultiplyColor(fg);
float blendEnhancedContrast = DWrite_ApplyLightOnDarkContrastAdjustment(enhancedContrast, foregroundStraight);

[branch] if (useClearType)
{
    // See DWrite_ClearTypeBlend
    float3 contrasted = DWrite_EnhanceContrast3(glyph.rgb, blendEnhancedContrast);
    float3 alphaCorrected = DWrite_ApplyAlphaCorrection3(contrasted, foregroundStraight, gammaRatios);
    color = float4(lerp(color.rgb, foregroundStraight, alphaCorrected * fg.a), 1.0f);
}
else
{
    // See DWrite_GrayscaleBlend
    float intensity = DWrite_CalcColorIntensity(foregroundStraight);
    float contrasted = DWrite_EnhanceContrast(glyph.a, blendEnhancedContrast);
    color = fg * DWrite_ApplyAlphaCorrection(contrasted, intensity, gammaRatios);
}
Member Author

I inlined the two DWrite functions to offset the shader's binary size cost a bit. Because of all the [branch] annotations, the compiler can't inline as much, which increases the shader's binary size by about 50%.

@miniksa
Member

miniksa commented Mar 4, 2022

Sorry Leonard, it turns out I didn't have AMD hardware at home anymore.

@lhecker
Member Author

lhecker commented Mar 4, 2022

It'll probably be fine... 🙂

@miniksa miniksa added the Needs-Second label Mar 11, 2022
Member

@zadjii-msft zadjii-msft left a comment

sure

@lhecker lhecker added the AutoMerge label Mar 14, 2022
@ghost

ghost commented Mar 14, 2022

Hello @lhecker!

Because this pull request has the AutoMerge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@ghost ghost merged commit 5964060 into main Mar 14, 2022
@ghost ghost deleted the dev/lhecker/atlas-engine-power-draw-reduction branch March 14, 2022 13:39
@ghost

ghost commented May 24, 2022

🎉 Windows Terminal Preview v1.14.143 has been released which incorporates this pull request. 🎉
