Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wgpu_hal tries to call glFlushMappedBufferRange on buffer without setting GL_MAP_FLUSH_EXPLICIT_BIT #6137

Closed
jimblandy opened this issue Aug 20, 2024 · 3 comments · Fixed by #6178
Assignees

Comments

@jimblandy
Copy link
Member

Simple programs using the gles backend can get errors like this:

[... ERROR wgpu_hal::gles] GLES: [API/Error] ID 2 : GL_INVALID_OPERATION in glFlushMappedBufferRange(GL_MAP_FLUSH_EXPLICIT_BIT not set)

In GLES 3.2, calling glFlushMappedBufferRange is only allowed if the buffer was previously mapped with the GL_MAP_FLUSH_EXPLICIT flag set. doc But wgpu_hal as driven by wgpu_core violates this rule.

In wgpu_hal, the code in <gles::Device as Device>::create_buffer sets GL_MAP_FLUSH_EXPLICIT_BIT only if BufferUses::MAP_WRITE is set. But when wgpu_core::device::map_buffer is presented with a HostMap::Read mapping of a buffer with uninitialized ranges, it calls wgpu_hal::Device::flush_mapped_ranges, which calls glFlushMappedBufferRange even though the buffer was created only with BufferUses::MAP_READ.

@jimblandy
Copy link
Member Author

Here's a dumb test case that shows the problem.

4 files changed, 180 insertions(+)
tests/tests/bounds_checks/buffer.rs           | 168 ++++++++++++++++++++++++++
tests/tests/bounds_checks/buffer_storage.wgsl |  10 ++
tests/tests/bounds_checks/mod.rs              |   1 +
tests/tests/root.rs                           |   1 +

new file   tests/tests/bounds_checks/buffer.rs
@@ -0,0 +1,168 @@
+use wgpu::{BufferDescriptor, BufferUsages};
+use wgpu_test::{gpu_test, valid, GpuTestConfiguration, TestParameters};
+
+#[gpu_test]
+static BUFFER_STORAGE: GpuTestConfiguration =
+    GpuTestConfiguration::new()
+    .parameters(
+        TestParameters::default()
+            .test_features_limits(),
+    )
+    .run_async(buffer_storage);
+
+#[allow(unused_variables)]
+async fn buffer_storage(ctx: wgpu_test::TestingContext) {
+    // We want to bind a window in the middle of the storage buffer.
+    // Sometimes `min_storage_buffer_offset_alignment` is 256, so the
+    // shortest length that will let us have regions at the start and
+    // end of the buffer that are allegedly not visible to the shader
+    // is 3 * 256.
+    //
+    // But this length is in `u32` elements, so that's 3 * 64.
+    const TOTAL_LEN: usize = 3 * 64;
+    const WINDOW_START: usize = 64;
+    const WINDOW_LEN: usize = 64;
+
+    // Create the buffer for copying compute shader output to the CPU.
+    let readback = ctx.device.create_buffer(&BufferDescriptor {
+        label: Some("readback"),
+        size: u32_bytes(TOTAL_LEN),
+        usage: BufferUsages::MAP_READ | BufferUsages::COPY_DST,
+        mapped_at_creation: false,
+    });
+
+    valid(&ctx.device, || {
+        let label = Some("buffer_storage");
+        let bind_group_layout =
+            ctx.device
+                .create_bind_group_layout(&wgpu::BindGroupLayoutDescriptor {
+                    label,
+                    entries: &[
+                        wgpu::BindGroupLayoutEntry {
+                            binding: 0,
+                            visibility: wgpu::ShaderStages::COMPUTE,
+                            ty: wgpu::BindingType::Buffer {
+                                ty: wgpu::BufferBindingType::Storage { read_only: true },
+                                has_dynamic_offset: false,
+                                min_binding_size: Some(wgpu::BufferSize::new(4).unwrap()),
+                            },
+                            count: None,
+                        },
+                        wgpu::BindGroupLayoutEntry {
+                            binding: 1,
+                            visibility: wgpu::ShaderStages::COMPUTE,
+                            ty: wgpu::BindingType::Buffer {
+                                ty: wgpu::BufferBindingType::Storage { read_only: false },
+                                has_dynamic_offset: false,
+                                min_binding_size: Some(wgpu::BufferSize::new(4).unwrap()),
+                            },
+                            count: None,
+                        },
+                    ],
+                });
+
+        let pipeline_layout = ctx
+            .device
+            .create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
+                label,
+                bind_group_layouts: &[&bind_group_layout],
+                push_constant_ranges: &[],
+            });
+
+        let module = ctx
+            .device
+            .create_shader_module(wgpu::include_wgsl!("buffer_storage.wgsl"));
+
+        let pipeline = ctx
+            .device
+            .create_compute_pipeline(&wgpu::ComputePipelineDescriptor {
+                label,
+                layout: Some(&pipeline_layout),
+                module: &module,
+                entry_point: Some("push_boundaries"),
+                compilation_options: <_>::default(),
+                cache: None,
+            });
+
+        // Create the buffer the shader reads from. We'll only bind
+        // the middle third of this.
+        let input = ctx.device.create_buffer(&BufferDescriptor {
+            label: Some("input"),
+            size: u32_bytes(TOTAL_LEN),
+            usage: BufferUsages::STORAGE,
+            mapped_at_creation: true,
+        });
+
+        let mut view = input.slice(..).get_mapped_range_mut();
+        let words: &mut [u32] = bytemuck::cast_slice_mut(&mut view);
+        words[..].fill(42);
+        words[WINDOW_START..WINDOW_START + WINDOW_LEN].fill(0);
+        drop(view);
+        input.unmap();
+
+        // Create the buffer to which the compute shader copies the
+        // bound portion of `input`.
+        let output = ctx.device.create_buffer(&BufferDescriptor {
+            label: Some("output"),
+            size: u32_bytes(WINDOW_LEN),
+            usage: BufferUsages::STORAGE | BufferUsages::COPY_SRC,
+            mapped_at_creation: false,
+        });
+
+        let bind_group = ctx.device.create_bind_group(&wgpu::BindGroupDescriptor {
+            label,
+            layout: &bind_group_layout,
+            entries: &[
+                wgpu::BindGroupEntry {
+                    binding: 0,
+                    resource: wgpu::BindingResource::Buffer(wgpu::BufferBinding {
+                        buffer: &input,
+                        offset: u32_bytes(WINDOW_START),
+                        size: Some(wgpu::BufferSize::new(u32_bytes(WINDOW_LEN)).unwrap()),
+                    }),
+                },
+                wgpu::BindGroupEntry {
+                    binding: 1,
+                    resource: wgpu::BindingResource::Buffer(wgpu::BufferBinding {
+                        buffer: &output,
+                        offset: 0,
+                        size: None,
+                    }),
+                },
+            ],
+        });
+
+        let mut encoder = ctx
+            .device
+            .create_command_encoder(&wgpu::CommandEncoderDescriptor { label });
+        {
+            let mut pass = encoder.begin_compute_pass(&wgpu::ComputePassDescriptor {
+                label,
+                timestamp_writes: None,
+            });
+
+            pass.set_pipeline(&pipeline);
+            pass.set_bind_group(0, &bind_group, &[]);
+            pass.dispatch_workgroups(1, 1, 1);
+        }
+        encoder.copy_buffer_to_buffer(&output, 0, &readback, 0, u32_bytes(WINDOW_LEN));
+        ctx.queue.submit([encoder.finish()]);
+        readback
+            .slice(..)
+            .map_async(wgpu::MapMode::Read, Result::unwrap);
+
+        log::info!("I love you");
+    });
+
+    ctx.device.poll(wgpu::Maintain::Wait);
+    /*
+    let view = readback.slice(..).get_mapped_range();
+    let words: &[u32] = bytemuck::cast_slice(&view);
+    log::info!("words: {words:?}");
+    */
+}
+
+/// The size of `n` `u32` values, in bytes.
+const fn u32_bytes(n: usize) -> wgt::BufferAddress {
+    (n * std::mem::size_of::<u32>())  as wgt::BufferAddress
+}
new file   tests/tests/bounds_checks/buffer_storage.wgsl
@@ -0,0 +1,10 @@
+@group(0) @binding(0)
+var<storage, read> input: array<u32>;
+
+@group(0) @binding(1)
+var<storage, read_write> output: array<u32>;
+
+@compute @workgroup_size(64)
+fn push_boundaries(@builtin(local_invocation_index) index: u32) {
+  output[index] = input[index];
+}
new file   tests/tests/bounds_checks/mod.rs
@@ -0,0 +1 @@
+mod buffer;
modified   tests/tests/root.rs
@@ -11,6 +11,7 @@ mod regression {
 mod bgra8unorm_storage;
 mod bind_group_layout_dedup;
 mod bind_groups;
+mod bounds_checks;
 mod buffer;
 mod buffer_copy;
 mod buffer_usages;

@jimblandy
Copy link
Member Author

We believe the fix is to enable the GL_MAP_FLUSH_EXPLICIT_BIT flag unconditionally.

@teoxoy
Copy link
Member

teoxoy commented Aug 27, 2024

This is blocking #5714 since one of the tests added in that PR is running into this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
2 participants