How can you pad array of structures properly? #20

kaphula · 2022-12-03T21:19:09Z

Hello,

Is there a way to add proper padding for dynamically sized arrays containing structures? Something like Vec<MyStruct>, so not structures containing dynamically sized arrays as can be seen in the example code. Here's my wgpu 13.1 compute shader program which compiles and runs, but the resulting calculations are wrong. Only first few instances are calculated properly and the rest are initialized to zero, although they should not be zero since that's not how the buffer is being initialized to begin with.

Ideally I would like to directly pad array of cmath Vector3 values to my shader program, with a type signature like this Vec<Vector3<f32>> but even having a wrapper struct that derives ShaderType would be good enough for now.

WGSL file:

struct Vec3 {
    x: f32,
    y: f32,
    z: f32
};

@group(0)
@binding(0)
var<storage, read> test_arr: array<Vec3>;

@group(0)
@binding(1)
var<storage, read_write> output_buf: array<Vec3>;

@compute
@workgroup_size(1)
fn main(@builtin(global_invocation_id) global_id: vec3<u32>) {
    var gg = test_arr[global_id.x];
    gg.x = 7.2;
    gg.y = 2.2;
    gg.z = 1.2;
    output_buf[global_id.x] = gg;
}

Rust program:

         #[derive(ShaderType)]
         pub struct Vec3Wrap {
             pub x: f32,
             pub y: f32,
             pub z: f32
         }


          let shader = include_wgsl!("./compute_calc_vis.wgsl");
          let shader = engine.device.create_shader_module(shader);

          let data = (0..28).map(|_| Vec3Wrap {
              x: 0.0,
              y: 5.0,
              z: 0.0,
          }).collect::<Vec<_>>();

          let mut buf = encase::StorageBuffer::new(Vec::new());
          buf.write(&data).unwrap();
          let byte_buffer = buf.into_inner();

          let input_buffer = engine.device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
              label: Some("Input Buffer"),
              contents: bytemuck::cast_slice(byte_buffer.as_slice()),
              usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_DST
          });

          let output_gpu_buffer = engine.device.create_buffer(&wgpu::BufferDescriptor {
              label: Some("Output Buffer"),
              size: byte_buffer.len() as _,
              usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_SRC,
              mapped_at_creation: false,
          });

          let mapping_buffer = engine.device.create_buffer(&wgpu::BufferDescriptor {
              label: Some("Mapping Buffer"),
              size: byte_buffer.len() as _,
              usage: wgpu::BufferUsages::COPY_DST | wgpu::BufferUsages::MAP_READ,
              mapped_at_creation: false,
          });

          let compute_pipeline = engine.device.create_compute_pipeline(&wgpu::ComputePipelineDescriptor {
              label: None,
              // layout: Some(&pipeline_layout),
              layout: None,
              module: &shader,
              entry_point: "main",
          });

          let bind_group_layout = compute_pipeline.get_bind_group_layout(0);
          let pipeline_layout = engine.device.create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
              label: None,
              bind_group_layouts: &[&bind_group_layout],
              push_constant_ranges: &[],
          });

          let bind_group = engine.device.create_bind_group(&wgpu::BindGroupDescriptor {
              label: None,
              layout: &bind_group_layout,
              entries: &[
                  wgpu::BindGroupEntry {
                      binding: 0,
                      resource: input_buffer.as_entire_binding(),
                  },
                  wgpu::BindGroupEntry {
                      binding: 1,
                      resource: output_gpu_buffer.as_entire_binding(),
                  },
              ],
          });

          let mut encoder = engine.device.create_command_encoder(&wgpu::CommandEncoderDescriptor::default());

          {
              let mut cpass = encoder.begin_compute_pass(&wgpu::ComputePassDescriptor::default());
              cpass.set_pipeline(&compute_pipeline);
              cpass.set_bind_group(0, &bind_group, &[]);
              cpass.dispatch_workgroups(data.len() as u32, 1, 1);
          }

          encoder.copy_buffer_to_buffer(&output_gpu_buffer, 0, &mapping_buffer, 0, data.len() as _);

          engine.queue.submit(core::iter::once(encoder.finish()));

          let output_slice = mapping_buffer.slice(..);
          output_slice.map_async(wgpu::MapMode::Read, |_| {});

          engine.device.poll(wgpu::Maintain::Wait);

          let output = output_slice.get_mapped_range().to_vec();
          mapping_buffer.unmap();

          let ob = StorageBuffer::new(output);
          let out_val: Vec<Vec3Wrap> = ob.create().unwrap();

          info!("compute values:");
          for x in out_val.iter() {
              info!("x: {}, y: {}, z: {}", x.x,x.y, x.z);
          }


INFO [..] x: 7.2, y: 2.2, z: 1.2
INFO [..] x: 7.2, y: 2.2, z: 1.2
INFO [..] x: 7.2, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0

The text was updated successfully, but these errors were encountered:

teoxoy · 2022-12-03T23:06:54Z

You shouldn't have to worry about padding things since that's what the lib is doing automatically behind the scenes.

although they should not be zero since that's not how the buffer is being initialized to begin with

The output buffer wasn't initialized (so it will be all 0's by default).

contents: bytemuck::cast_slice(byte_buffer.as_slice()),

Note that since you are using encase, you shouldn't need bytemuck here.

encoder.copy_buffer_to_buffer(&output_gpu_buffer, 0, &mapping_buffer, 0, data.len() as _);

Here is the actual issue. data.len() is 28, so you end up copying only 28 bytes. You need to use byte_buffer.len() instead.

Hope this helps!

kaphula · 2022-12-04T08:22:25Z

Thanks very much!

teoxoy closed this as not planned Won't fix, can't repro, duplicate, stale Dec 3, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How can you pad array of structures properly? #20

How can you pad array of structures properly? #20

kaphula commented Dec 3, 2022 •

edited

teoxoy commented Dec 3, 2022 •

edited

kaphula commented Dec 4, 2022

How can you pad array of structures properly? #20

How can you pad array of structures properly? #20

Comments

kaphula commented Dec 3, 2022 • edited

teoxoy commented Dec 3, 2022 • edited

kaphula commented Dec 4, 2022

kaphula commented Dec 3, 2022 •

edited

teoxoy commented Dec 3, 2022 •

edited