Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix WGPU Validation Error #1716

Merged
merged 1 commit into from
Jun 9, 2022
Merged

Fix WGPU Validation Error #1716

merged 1 commit into from
Jun 9, 2022

Conversation

zicklag
Copy link
Contributor

@zicklag zicklag commented Jun 4, 2022

Closes #1715 ( hopefully ).

I believe the cause behind #1715 is that the renderer will re-use index/vertex buffers from previous frames, if they exist. These re-used buffers may be longer than the current index buffer needs to be and in that case the data for the current index buffer will be written to our desired length and there will be some left-over data from the previous frame in the buffer after that.

When setting our index buffer for the current frame, though, we were previously setting the index buffer to the entire buffer, which could include extra indexes from previous frames. Now we properly slice the buffer to the index buffer length before rendering.

Also, apparently there is a scenario where the index buffer length is zero ( i.e and empty mesh ), in which case, wgpu didn't like me slicing a buffer to the range 0..0, so we just skip drawing those empty meshes. I'm not sure if the empty mesh scenario warrants further investigation or not, but I think that's separate from this PR ( if it's even an issue at all ).

@zicklag
Copy link
Contributor Author

zicklag commented Jun 4, 2022

Ah, it didn't work. I just reproduced the bug even with this PR, but luckily I think I found a scenario that consistently re-creates the bug, so hopefully I'll be able to trace it now. 🤞

@zicklag
Copy link
Contributor Author

zicklag commented Jun 4, 2022

Putting this code into my app and the demo app reproduces the error when you drag the slider, almost every time:

let mut ppp = ctx.pixels_per_point();
ui.add(egui::Slider::new(&mut ppp, 0.5..=5.0));
ctx.set_pixels_per_point(ppp);

Interestingly, it doesn't cause the problem in a minimal app with only a central panel a sidebar and a label.


Also, the original reason I thought the error was caused was invalid, because the draw call only renders as many triangles as we have. 🤦 I should have noticed that. 😄

rpass.draw_indexed(0..mesh.indices.len() as u32, 0, 0..1);

@zicklag
Copy link
Contributor Author

zicklag commented Jun 5, 2022

OK, I've boiled this down to an assertion that fails that we need to fix, but it's not much new info because it's essentially what WGPU told us:

here:

match primitive {
    Primitive::Mesh(mesh) => {
        let index_buffer = index_buffers.next().unwrap();
        let vertex_buffer = vertex_buffers.next().unwrap();

        // Fails
        more_asserts::assert_le!(
            mesh.indices.len(),
            index_buffer.size / 4,
            "Buffer size miss-match"
        );

        if let Some((_texture, bind_group)) = self.textures.get(&mesh.texture_id) {
            rpass.set_bind_group(1, bind_group, &[]);
            rpass.set_index_buffer(
                index_buffer.buffer.slice(..),
                wgpu::IndexFormat::Uint32,
            );
            rpass.set_vertex_buffer(0, vertex_buffer.buffer.slice(..));
            rpass.draw_indexed(0..mesh.indices.len() as u32, 0, 0..1);
        } else {
            tracing::warn!("Missing texture: {:?}", mesh.texture_id);
        }
    }

But I feel like I must be losing my mind, because it feels like something is somehow coming in and modifying the immutably borrowed self.index_buffers, because if I add this assertion above the failing assertion, it passes!

So it's like something is somehow modifying self.index_buffers while we've borrowed it, which is really spooky. 👀 I'm having a really hard time figuring out what to do about it.

let mut temp_mesh_idx = 0;
for clipped_primitive in paint_jobs {
    if let Primitive::Mesh(mesh) = &clipped_primitive.primitive {
        let index_buffer = &self.index_buffers[temp_mesh_idx];

        // Works!?
        more_asserts::assert_le!(
            mesh.indices.len(),
            index_buffer.size / 4,
            "Buffer size miss-match that somehow won't happen"
        );

        temp_mesh_idx += 1;
    }
}

match primitive {
    Primitive::Mesh(mesh) => {
        let index_buffer = index_buffers.next().unwrap();
        let vertex_buffer = vertex_buffers.next().unwrap();

        // Fails
        more_asserts::assert_le!(
            mesh.indices.len(),
            index_buffer.size / 4,
            "Buffer size miss-match"
        );

        if let Some((_texture, bind_group)) = self.textures.get(&mesh.texture_id) {
            rpass.set_bind_group(1, bind_group, &[]);
            rpass.set_index_buffer(
                index_buffer.buffer.slice(..),
                wgpu::IndexFormat::Uint32,
            );
            rpass.set_vertex_buffer(0, vertex_buffer.buffer.slice(..));
            rpass.draw_indexed(0..mesh.indices.len() as u32, 0, 0..1);
        } else {
            tracing::warn!("Missing texture: {:?}", mesh.texture_id);
        }
    }

So, right before running the assertion that fails, I literally loop through all of the paint jobs and make sure that their index buffers are within bounds. And there is no code between the check loop an the code that fails afterwards other than index_buffers.next().

And I've tried even removing that by using an index instead of an iterator for accessing the index buffer and the result is the same.

I tried to walk through it with a debugger, but I have to find a way to reproduce without me needing to interact with the UI, because I can't set a breakpoint without freezing the UI.

@zicklag
Copy link
Contributor Author

zicklag commented Jun 5, 2022

After doing lots of printing the values as it loops, I've found that for some crazy reason, when the assertion fails, a print of mesh.indices reveals that its somehow returning the indices for the next item in the array. So the for ClippedPrimitive { .. } in paint_jobs loop is somehow skipping an item in the paint_jobs slice somehow ( I think ).

@zicklag
Copy link
Contributor Author

zicklag commented Jun 5, 2022

🤦 I fixed it. Duh, "skipping the loop". The continue statement. I'm figured I'd be missing something obvious by now. Wow.

Oh well, it's fixed now. 🙂

It always seems that the hardest bugs for me to find are the ones that take the smallest amount of code changes to fix.

@zicklag zicklag force-pushed the fix-1715 branch 2 times, most recently from 7e4d278 to 4bb9e5e Compare June 6, 2022 00:46
@emilk emilk changed the title Attempt to Fix #1715 Fix WGPU Validation Error Jun 9, 2022
@emilk
Copy link
Owner

emilk commented Jun 9, 2022

Good job!

@emilk emilk merged commit 62a00c4 into emilk:master Jun 9, 2022
@zicklag zicklag deleted the fix-1715 branch June 9, 2022 14:17
@zicklag
Copy link
Contributor Author

zicklag commented Jun 9, 2022

Thanks!

Do you want to add that assertion you mentioned in a separate PR? ( I saw the Email apparently before you edited your comment ).

It sounded like a great idea to me.

@emilk
Copy link
Owner

emilk commented Jun 9, 2022

I removed the comment when I realized it wasn't a great idea - the buffers only ever expand and never shrink, so the assertions would fail

@zicklag
Copy link
Contributor Author

zicklag commented Jun 9, 2022

Oh, good point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WGPU Validation Error: Index Extends Beyond Limit
2 participants