Skip to content

Ensure minimum contrast ratio as in xterm.js#8420

Merged
kovidgoyal merged 10 commits intokovidgoyal:masterfrom
arne314:min-contrast-ratio
Mar 19, 2025
Merged

Ensure minimum contrast ratio as in xterm.js#8420
kovidgoyal merged 10 commits intokovidgoyal:masterfrom
arne314:min-contrast-ratio

Conversation

@arne314
Copy link
Copy Markdown
Contributor

@arne314 arne314 commented Mar 9, 2025

This PR adds an alternative algorithm (same as in xterm.js) to ensure readability in low contrast scenarios, as discussed in #8417.

The algorithm wil be used when a negative value is provided to the text_fg_override_threshold config option. To for example meet WCAG 2 level AA a value of -4.5 could be provided.

I am very much open to changes and happy to provide documentation as well.

@kovidgoyal
Copy link
Copy Markdown
Owner

Avoid control flow in shaders, it's quite costly. If you read the kitty shaders you will see I go to great lengths to avoid control flow. I can see you have basically copied the xterm.js code, which is fine for a first stab at it, but it can be improved in a couple of ways:

  1. xterm.js avoids doing actual HSL conversion for performance, but we are doing these calculations on a massively parallel GPU so we can afford HSL conversion.
  2. Rewrite the algorithm to avoid control flow, that means no loops and no if statements.
    The way to do this is to basically calculate both values of the branch and mix them together using the mix() and step() functions. This is fairly painful to do but also a fun intellectual challenge :)

Finally please add docs to the option in options/definition.py

@arne314
Copy link
Copy Markdown
Contributor Author

arne314 commented Mar 9, 2025

Thank you for the feedback. Though, it seems to me (not very familiar with branchless programming) that some kind of iterative approach will be necessary to converge towards an L (as in HSL) value that fulfills the required luminance.

Maybe this is a crazy approach, but we could have a lookup table (maybe a 3d texture to provide efficient interpolation, generated in python) to map from a given H, S and target luminance to the required L. I think the resolution (especially on the S axis) could be kept reasonably small.

I have created a simple plot to illustrate my idea where the red line marks the L required for 60% luminance.
image

@kovidgoyal
Copy link
Copy Markdown
Owner

kovidgoyal commented Mar 10, 2025 via email

@kovidgoyal
Copy link
Copy Markdown
Owner

From a quick look at the xterm.js algorithm it is trying to change the contrast ratio, which is just a function of the luminance of the two colors. But it does so by adjusting the rgb colors. Rather than doing that implement the algorithm in HSL space. Adjust the luminance in that space it should be easy to derive a closed form function to do that given the under and over luminances. And once adjusted convert from HSL space back to RGB.

@arne314
Copy link
Copy Markdown
Contributor Author

arne314 commented Mar 10, 2025

The problem is that there is a difference between the lightness as in the HSL model and the perceived one (what we call luminance). For example, at the same lightness we perceive the color yellow to have a higher luminance than blue, which is why in the plot the luminance graphs (upper ones) don't scale linearly despite the linear lightness on the y-axis.

The plot was created in the most straightforward way, I looped over all L and H values and on the L value that first met the target luminance (60% in this case) I inserted the red pixel. The luminance is computed from an RGB value (which I needed for the lower graphs anyway) using the same formula as in the shader code (which is default for sRGB space).

Regarding the uniform limit, I wonder if using textures which are stored in a different memory is still fast enough as we would literally just need one lookup per cell.

@kovidgoyal
Copy link
Copy Markdown
Owner

kovidgoyal commented Mar 10, 2025 via email

@kovidgoyal
Copy link
Copy Markdown
Owner

Here is GLSL code for HSLuv:
https://github.com/williammalo/hsluv-glsl

I havent reviewed that code to see how performant it is however.

@kovidgoyal
Copy link
Copy Markdown
Owner

Quickly glancing over the code, it has some simple if statements and ? operators that should be easy to replace with branchless versions. No loops.

@arne314
Copy link
Copy Markdown
Contributor Author

arne314 commented Mar 10, 2025

This plot shows the difference between the L from HSLuv and the luminance formula from the shader code, which is (I guess?) relatively small.
image
So HSLuv does indeed look like the solution to the problem, thanks for pointing that out. I will definitely take on the challenge to implement the conversion in GLSL with fewer branches :)

@arne314
Copy link
Copy Markdown
Contributor Author

arne314 commented Mar 11, 2025

Regarding the mix operations in the branchless hsluv implementation, I'm not entirely sure if they are safe without additional clamping in terms NaN/inf. Could you maybe take a quick look?

@kovidgoyal
Copy link
Copy Markdown
Owner

glsl has isnan() and isinf() functions you can use them to normalize values in combination with step. I dont think you have to worry about NaN IIRC that mostly happens multiplying zero by infinity. Inf will be more likely to happen since there are plenty of divisions with variable denominators. I suggest make a function name safe_divide(num, den, sentinel) or similar that divides, and returns sentinel when isinf() is true. And then replace all potentially unsafe divisions in the hsluv shader with that.

@kovidgoyal
Copy link
Copy Markdown
Owner

In fact coming to think of it you can implement safe_divide without isinf() by mixing on denominator == 0. Since GPUs dont trap on divide by zero this should actually be more performant.

@arne314
Copy link
Copy Markdown
Contributor Author

arne314 commented Mar 12, 2025

In fact coming to think of it you can implement safe_divide without isinf() by mixing on denominator == 0. Since GPUs dont trap on divide by zero this should actually be more performant.

But wouldn't that mix mean that for denom = 0 we get something like mix(x/denom, y, denom == 0) = 0 * (x/0) + 1 * y = 0 * inf + y = Nan + y = Nan?
Or did I misunderstand you there?

@kovidgoyal
Copy link
Copy Markdown
Owner

Yes, that's true, will need to be more subtle than that.

@kovidgoyal
Copy link
Copy Markdown
Owner

Depending on what you need the division to be in the zero case simply doing

num / max(0.0000001, denom)

might be acceptable.

@kovidgoyal
Copy link
Copy Markdown
Owner

kovidgoyal commented Mar 12, 2025

Or something like this:

float safe_divide(float numerator, float denominator) {
    float epsilon = 1e-10;
    float safe_denominator = abs(denominator) + epsilon;
    float result = numerator / safe_denominator;
    result *= sign(denominator);
    return result;
}

@kovidgoyal
Copy link
Copy Markdown
Owner

There are still a couple of branches left in the fragment shader and I dont think mix() works with a bool for the third argument.

Also, thinking about it rather than using negative numbers, lets use a unit suffix. so the setting can be

text_fg_override_threshold 50%
or
text_fg_override_threshold 4.5 ratio

when no unit is supplied it defaults to % for backwards compat.

In the future if we want to add another algorithm it can be easily accomodated by adding another unit.

@arne314
Copy link
Copy Markdown
Contributor Author

arne314 commented Mar 14, 2025

I do like the config solution a lot better. Unless built-in functions introduce branching, there is only one if statement left. Are you sure we should always perform the hsluv computations within it, as they contain quite a few expensive pow operations and usually only kick in for a few pixels at a time in rare scenarios anyway?

@kovidgoyal
Copy link
Copy Markdown
Owner

kovidgoyal commented Mar 14, 2025 via email

@arne314
Copy link
Copy Markdown
Contributor Author

arne314 commented Mar 14, 2025

Honestly, I don't know enough about shaders to take on that kind of refactoring. Also, how would that work with emojis which have more than one color? Is there a (straightforward) way to just ignore them in the contrast computations or a way to still process every pixel of them in the fragment shader while normal characters are processed in the vertex shader only?

@kovidgoyal
Copy link
Copy Markdown
Owner

kovidgoyal commented Mar 15, 2025 via email

@kovidgoyal
Copy link
Copy Markdown
Owner

kovidgoyal commented Mar 15, 2025

And just some general background on GPU branching. GPU's generally run the same code (machine instructions) on blocks of vertices/fragments. Older GPUs used to use larger blocks more modern ones have more shader execution units so use smaller blocks. This reduces the cost of branching. However even modern ones have blocks of ~10 fragments. Here we are talking of a "dynamic" branch, aka a branch that depends on per fragment data, this makes it not possible to execute the same code on a block, thereby giving us a significant slowdown per branch. Although it's more complicated than this, GPUs will actually perform both sides of the branch in parallel and mix the results anyway. So the actual cost is a halving of the available number of shader execution units. What that translates into in terms of actual performance is hard to say, it's very hardware dependent. Indeed many compilers will actually elide branches by doing the mixing themselves in the generated machine code.

I prefer to be explicit about, so we are not relying on compiler behavior.

@arne314
Copy link
Copy Markdown
Contributor Author

arne314 commented Mar 15, 2025

And I dont think this feature is that important for emoji, so it might be a worthwhile tradeoff.

I do very much agree with that.
Also, thank you for the clarification about branching on GPUs :)

@kovidgoyal
Copy link
Copy Markdown
Owner

Cool, am travelling fora few days will merge this after I return.

@kovidgoyal kovidgoyal merged commit 80db9b4 into kovidgoyal:master Mar 19, 2025
@kovidgoyal
Copy link
Copy Markdown
Owner

Merged, and moved code into vertex shader.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants