Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CPU shaders] Assertion failure in path_tiling stage when rendering Coat of Arms of Poland.svg #378

Closed
armansito opened this issue Oct 11, 2023 · 5 comments
Labels
bug Something isn't working

Comments

@armansito
Copy link
Collaborator

To reproduce run the following:

 cargo run -p with_winit -- ./examples/assets/downloads/Coat\ of\ Arms\ of\ Poland.svg --use-cpu

Which fails with: thread 'main' panicked at 'assertion failed: xy0.x >= tile_xy.x && xy0.x <= tile_xy1.x', src/cpu_shader/path_tiling.rs:129:9

With some more debug information:

thread 'main' panicked at 'p0: Vec2 {
    x: -0.18100956,
    y: 8.772132e-14,
}, p1: Vec2 {
    x: 677.49286,
    y: 8.772132e-14,
}, xy0: Vec2 {
    x: -0.18100956,
    y: 8.772132e-14,
}, xy1: Vec2 {
    x: 16.0,
    y: 0.0,
}, tile_xy: Vec2 {
    x: 0.0,
    y: -16.0,
}, tile_xy1: Vec2 {
    x: 16.0,
    y: 0.0,
}', src/cpu_shader/path_tiling.rs:129:9

I'm triggering the assertion consistently with these values.

@armansito armansito added the bug Something isn't working label Oct 11, 2023
@raphlinus
Copy link
Contributor

raphlinus commented Oct 11, 2023

Minimal repro case, based on extracting the offending line segment from the assert failure:

<svg width="1200" height="1200" viewBox="0 0 1200 1200" fill="none" xmlns="http://www.w3.org/2000/svg">
  <path d="M-0.36201912 1.7544264e-13L1354.9857 1.7544264e-13 100 100" fill="#ff0"/>
</svg>

This renders with artifact on GPU, which is an additional indication that the assert is detecting a real problem.

@raphlinus
Copy link
Contributor

After a little more digging, I have a clearer picture what's going on. It's a horizontal line which is not exactly grid aligned (but obviously very close). If it were aligned, then the line would be discarded by the numerical robustness logic. On the CPU, a is being computed as 1, which is correct. However, b should be 1-epsilon, and it's being rounded up to 1. When computing floor(a*i+b) it should round down to i, but is instead i + 1. That in turn selects a grid square which is not part of the rasterization of the line, which triggers the assertion failure and other mischief.

The fix is not yet clear, though I've got the general outline. Obviously 0 <= b < 1 should be enforced, and if that's satisfied, then floor(a*i+b) should be 0 for the i=0 case. But I'm still worried about i=n-1 in particular.

Another thing that can go wrong is that a should be equal to 1 in the horizontal line case, but it can be <1 because it's calculated as dx * (dx + dy).recip(), which can have roundoff errors. As noted in the review for #374, if that's calculated on CPU using division, you get the right answer, but the guarantees of WGSL are weaker than IEEE floating point.

Obviously a patch to fix this specific error is to special case horizontal lines, but that's not satisfying, as it seems clear it wouldn't cover all failures. I think a good next step would be to carefully define what values for a, b are correct, then put in logic to enforce that, possibly some combination of clamping and verifying that floor(a * (n - 1) + b) is the right value. I'll think on it some more.

@raphlinus
Copy link
Contributor

Ok, I've slept on it, and I believe the following will work. Compute b by taking the min with 0.99999994, which will guarantee it's strictly less than 1. Then compute floor(a * (n - 1) + b) and check whether it's equal to max(1, ceil(max(x0, x1)) - floor(min(x0, x1))) - 1. Note: that's the same value as for computing the count, and it's equal to the width in grid cells of the rasterized line minus one. If not equal, bump a by some epsilon (I believe 1e-9 is fine) with the sign needed to make that equal.

There's more work that can be done to validate this, but I believe that would result in fully robust rendering, and also reduces the pressure on the division operation to compute a. Note that this would also apply to multisampled rendering in fine (see the reviews for the msaa PR, linked above).

@raphlinus
Copy link
Contributor

1e-9 is not fine, we're in f32 world, that's less than 1 ulp. I think 1e-7 is adequate, but the argument is subtle.

raphlinus added a commit that referenced this issue Oct 12, 2023
Apply additional logic to make sure that first and last tiles of conservative line rasterization land on the right tile.

Note: this fixes the assert and should avoid artifacts, but still doesn't render the poland svg correctly (there are black splotches).

Fixes #3376 and #378
raphlinus added a commit that referenced this issue Oct 12, 2023
Apply additional logic to make sure that first and last tiles of conservative line rasterization land on the right tile.

Note: this fixes the assert and should avoid artifacts, but still doesn't render the poland svg correctly (there are black splotches).

Fixes #376 and #378
@raphlinus
Copy link
Contributor

I believe this can be considered fixed as per the linked PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants