Use the texture filtering hardware to reduce the number of sample operations in the blur shader. #3122

Merged
merged 1 commit into from Dec 6, 2018

Use the texture filtering hardware to reduce the number of sample operations in the blur shader.

Linear filtering allows us to evaluate the formula

    k₀c₀ + k₁c₁                 (Formula 1)

where c₀ and c₁ are the colors of adjacent texels and k₀ and k₁ are arbitrary
factors (in this case, the results of evaluating the Gaussian function) with a
single lookup. Linear filtering evaluates the following expression for some t
between 0 and 1:

    lerp(c₀, c₁, t)

It can be shown algebraically that Formula 1 is equivalent to:

                 ⎛           k₁  ⎞
    (k₀ + k₁)lerp⎜c₀, c₁, ───────⎟
                 ⎝        k₀ + k₁⎠

This can be readily evaluated by letting `t = k₁/(k₀ + k₁)`, performing a
single texture lookup, and scaling the result by `k₀ + k₁`.
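
The equivalence follows from expanding `lerp(c₀, c₁, t) = (1 − t)c₀ + tc₁` with `t = k₁/(k₀ + k₁)`: the `c₀` term becomes `k₀/(k₀ + k₁)` and the `c₁` term becomes `k₁/(k₀ + k₁)`, so multiplying by `(k₀ + k₁)` recovers Formula 1. A minimal CPU-side sketch of that check is below. The helper names `lerp` and `weighted_pair` are illustrative and not part of the patch; in GLSL the built-in `mix` plays the role of `lerp`.

```python
def lerp(c0, c1, t):
    """Linear interpolation between c0 and c1, the operation a linearly
    filtered lookup between two adjacent texels performs."""
    return (1.0 - t) * c0 + t * c1


def weighted_pair(c0, c1, k0, k1):
    """Evaluate k0*c0 + k1*c1 (Formula 1) with a single lerp.

    Assumes k0 + k1 is nonzero. For t to land in [0, 1], as the filtering
    hardware requires, k0 and k1 must have the same sign, which holds for
    Gaussian weights.
    """
    t = k1 / (k0 + k1)
    return (k0 + k1) * lerp(c0, c1, t)


if __name__ == "__main__":
    # Arbitrary sample colors and weights, chosen only for illustration.
    c0, c1, k0, k1 = 0.2, 0.9, 0.6, 0.3
    print(weighted_pair(c0, c1, k0, k1), k0 * c0 + k1 * c1)
```

The two printed values should agree up to floating-point rounding.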
pcwalton committed Dec 6, 2018
commit 38ec7db6f165ff7aaad12d91eca7d7a5f51557b5
@@ -99,9 +99,6 @@ void main(void) {
 // with a offset / weight uniform table and a constant
 // loop iteration count!

-// TODO(gw): Make use of the bilinear sampling trick to reduce
-// the number of texture fetches needed for a gaussian blur.
-
 void main(void) {
     SAMPLE_TYPE original_color = SAMPLE_TEXTURE(vUv);

@@ -119,24 +116,47 @@ void main(void) {
     gauss_coefficient.y = exp(-0.5 / (vSigma * vSigma));
     gauss_coefficient.z = gauss_coefficient.y * gauss_coefficient.y;

-    float gauss_coefficient_sum = 0.0;
+    float gauss_coefficient_total = gauss_coefficient.x;
     SAMPLE_TYPE avg_color = original_color * gauss_coefficient.x;
-    gauss_coefficient_sum += gauss_coefficient.x;
     gauss_coefficient.xy *= gauss_coefficient.yz;

-    for (int i = 1; i <= vSupport; i++) {
-        vec2 offset = vOffsetScale * float(i);
+    // Evaluate two adjacent texels at a time. We can do this because, if c0
+    // and c1 are colors of adjacent texels and k0 and k1 are arbitrary
+    // factors, this formula:
+    //
+    //     k0 * c0 + k1 * c1          (Equation 1)
+    //
+    // is equivalent to:
+    //
+    //                                k1
+    //     (k0 + k1) * lerp(c0, c1, -------)
+    //                              k0 + k1
+    //
+    // A texture lookup of adjacent texels evaluates this formula:
+    //
+    //     lerp(c0, c1, t)
+    //
+    // for some t. So we can let `t = k1/(k0 + k1)` and effectively evaluate
+    // Equation 1 with a single texture lookup.
+
+    for (int i = 1; i <= vSupport; i += 2) {
+        float gauss_coefficient_subtotal = gauss_coefficient.x;
+        gauss_coefficient.xy *= gauss_coefficient.yz;
+        gauss_coefficient_subtotal += gauss_coefficient.x;
+        float gauss_ratio = gauss_coefficient.x / gauss_coefficient_subtotal;
+
+        vec2 offset = vOffsetScale * (float(i) + gauss_ratio);

         vec2 st0 = clamp(vUv.xy - offset, vUvRect.xy, vUvRect.zw);
-        avg_color += SAMPLE_TEXTURE(vec3(st0, vUv.z)) * gauss_coefficient.x;
+        avg_color += SAMPLE_TEXTURE(vec3(st0, vUv.z)) * gauss_coefficient_subtotal;

         vec2 st1 = clamp(vUv.xy + offset, vUvRect.xy, vUvRect.zw);
-        avg_color += SAMPLE_TEXTURE(vec3(st1, vUv.z)) * gauss_coefficient.x;
+        avg_color += SAMPLE_TEXTURE(vec3(st1, vUv.z)) * gauss_coefficient_subtotal;

-        gauss_coefficient_sum += 2.0 * gauss_coefficient.x;
+        gauss_coefficient_total += 2.0 * gauss_coefficient_subtotal;
         gauss_coefficient.xy *= gauss_coefficient.yz;
     }

-    oFragColor = vec4(avg_color) / gauss_coefficient_sum;
+    oFragColor = vec4(avg_color) / gauss_coefficient_total;
 }
 #endif
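
To make the loop change concrete, here is a rough one-dimensional CPU model of both loops, not the shader itself. Several things are assumptions made for illustration: the texture is a plain list of floats, `sample_linear` stands in for a linearly filtered `SAMPLE_TEXTURE` with `vOffsetScale` equal to one texel, the clamp against `vUvRect` is reduced to an index clamp, and the initial `gauss_coefficient.x` (not visible in this hunk) is set to 1.0, since any positive constant cancels in the final division.

```python
import math


def sample_linear(texels, pos):
    """Emulate a linearly filtered 1D lookup; an integer pos hits a texel center."""
    i0 = math.floor(pos)
    t = pos - i0
    last = len(texels) - 1
    c0 = texels[min(max(i0, 0), last)]
    c1 = texels[min(max(i0 + 1, 0), last)]
    return (1.0 - t) * c0 + t * c1


def blur_per_texel(texels, x, sigma, support):
    """Model of the removed loop: one lookup per texel on each side."""
    gx = 1.0  # assumed starting weight; the real value is outside the hunk
    gy = math.exp(-0.5 / (sigma * sigma))
    gz = gy * gy
    total = gx
    acc = texels[x] * gx
    gx, gy = gx * gy, gy * gz  # mirrors gauss_coefficient.xy *= gauss_coefficient.yz
    for i in range(1, support + 1):
        acc += (sample_linear(texels, x - i) + sample_linear(texels, x + i)) * gx
        total += 2.0 * gx
        gx, gy = gx * gy, gy * gz
    return acc / total


def blur_paired(texels, x, sigma, support):
    """Model of the added loop: one lookup per pair of texels on each side."""
    gx = 1.0
    gy = math.exp(-0.5 / (sigma * sigma))
    gz = gy * gy
    total = gx
    acc = texels[x] * gx
    gx, gy = gx * gy, gy * gz
    for i in range(1, support + 1, 2):
        subtotal = gx                  # weight of texel i
        gx, gy = gx * gy, gy * gz
        subtotal += gx                 # plus weight of texel i + 1
        ratio = gx / subtotal          # t = k1 / (k0 + k1)
        offset = i + ratio
        acc += (sample_linear(texels, x - offset)
                + sample_linear(texels, x + offset)) * subtotal
        total += 2.0 * subtotal
        gx, gy = gx * gy, gy * gz
    return acc / total


if __name__ == "__main__":
    texels = [((7 * n) % 11) / 10.0 for n in range(40)]
    print(blur_per_texel(texels, 20, 2.0, 6), blur_paired(texels, 20, 2.0, 6))
```

With an even `support` and a center far from the edges, the two functions should agree up to rounding while the paired version issues half as many lookups. With an odd `support`, the paired loop as written also folds in texel `support + 1`, so in this model the results differ slightly.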
Binary file not shown.
Binary file not shown.
@@ -23,7 +23,7 @@ platform(linux,mac) == inset-subpx.yaml inset-subpx.png
 platform(linux,mac) fuzzy(1,4) == inset-downscale.yaml inset-downscale.png
 platform(linux,mac) fuzzy(1,50) == box-shadow-cache.yaml box-shadow-cache.png
 platform(linux,mac) fuzzy(1,685) == overlap1.yaml overlap1.png
-== overlap2.yaml overlap2.png
+fuzzy(1,61) == overlap2.yaml overlap2.png
 platform(linux,mac) fuzzy(1,48) == no-stretch.yaml no-stretch.png
 platform(linux,mac) fuzzy(1,9) == box-shadow-stretch-mode-x.yaml box-shadow-stretch-mode-x.png
 platform(linux,mac) fuzzy(1,41) == box-shadow-stretch-mode-y.yaml box-shadow-stretch-mode-y.png
@@ -29,7 +29,7 @@ platform(linux,mac) fuzzy(1,133) == filter-large-blur-radius.yaml filter-large-b
 == filter-saturate-blue-alpha-1.yaml filter-saturate-blue-alpha-1-ref.yaml
 == filter-hue-rotate-1.yaml filter-hue-rotate-1-ref.yaml
 == filter-hue-rotate-alpha-1.yaml filter-hue-rotate-alpha-1-ref.yaml
-== filter-long-chain.yaml filter-long-chain.png
+fuzzy(1,14) == filter-long-chain.yaml filter-long-chain.png
 platform(linux,mac) == filter-drop-shadow.yaml filter-drop-shadow.png
 platform(linux,mac) == filter-drop-shadow-on-viewport-edge.yaml filter-drop-shadow-on-viewport-edge.png
 platform(linux,mac) == blend-clipped.yaml blend-clipped.png
@@ -14,7 +14,7 @@
 != shadow-clipped-text.yaml blank.yaml
 != non-opaque.yaml non-opaque-notref.yaml
 == decorations.yaml decorations-ref.yaml
-fuzzy(1,100) == decorations-suite.yaml decorations-suite.png
+fuzzy(1,173) == decorations-suite.yaml decorations-suite.png
 == 1658.yaml 1658-ref.yaml
 == split-batch.yaml split-batch-ref.yaml
 == shadow-red.yaml shadow-red-ref.yaml
@@ -54,7 +54,7 @@ platform(linux) == clipped-transform.yaml clipped-transform.png
 platform(mac) == color-bitmap-shadow.yaml color-bitmap-shadow-ref.yaml
 platform(linux) == writing-modes.yaml writing-modes-ref.yaml
 platform(linux) == blurred-shadow-local-clip-rect.yaml blurred-shadow-local-clip-rect-ref.png
-platform(linux) == two-shadows.yaml two-shadows.png
+fuzzy(1,1) platform(linux) == two-shadows.yaml two-shadows.png
 == shadow-clip.yaml shadow-clip-ref.yaml
 == shadow-fast-clip.yaml shadow-fast-clip-ref.yaml
 == shadow-partial-glyph.yaml shadow-partial-glyph-ref.yaml