Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How can you pad array of structures properly? #20

Closed
kaphula opened this issue Dec 3, 2022 · 2 comments
Closed

How can you pad array of structures properly? #20

kaphula opened this issue Dec 3, 2022 · 2 comments

Comments

@kaphula
Copy link

kaphula commented Dec 3, 2022

Hello,

Is there a way to add proper padding for dynamically sized arrays containing structures? Something like Vec<MyStruct>, so not structures containing dynamically sized arrays as can be seen in the example code. Here's my wgpu 13.1 compute shader program which compiles and runs, but the resulting calculations are wrong. Only first few instances are calculated properly and the rest are initialized to zero, although they should not be zero since that's not how the buffer is being initialized to begin with.

Ideally I would like to directly pad array of cmath Vector3 values to my shader program, with a type signature like this Vec<Vector3<f32>> but even having a wrapper struct that derives ShaderType would be good enough for now.

WGSL file:

struct Vec3 {
    x: f32,
    y: f32,
    z: f32
};

@group(0)
@binding(0)
var<storage, read> test_arr: array<Vec3>;

@group(0)
@binding(1)
var<storage, read_write> output_buf: array<Vec3>;

@compute
@workgroup_size(1)
fn main(@builtin(global_invocation_id) global_id: vec3<u32>) {
    var gg = test_arr[global_id.x];
    gg.x = 7.2;
    gg.y = 2.2;
    gg.z = 1.2;
    output_buf[global_id.x] = gg;
}

Rust program:

         #[derive(ShaderType)]
         pub struct Vec3Wrap {
             pub x: f32,
             pub y: f32,
             pub z: f32
         }


          let shader = include_wgsl!("./compute_calc_vis.wgsl");
          let shader = engine.device.create_shader_module(shader);

          let data = (0..28).map(|_| Vec3Wrap {
              x: 0.0,
              y: 5.0,
              z: 0.0,
          }).collect::<Vec<_>>();

          let mut buf = encase::StorageBuffer::new(Vec::new());
          buf.write(&data).unwrap();
          let byte_buffer = buf.into_inner();

          let input_buffer = engine.device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
              label: Some("Input Buffer"),
              contents: bytemuck::cast_slice(byte_buffer.as_slice()),
              usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_DST
          });

          let output_gpu_buffer = engine.device.create_buffer(&wgpu::BufferDescriptor {
              label: Some("Output Buffer"),
              size: byte_buffer.len() as _,
              usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_SRC,
              mapped_at_creation: false,
          });

          let mapping_buffer = engine.device.create_buffer(&wgpu::BufferDescriptor {
              label: Some("Mapping Buffer"),
              size: byte_buffer.len() as _,
              usage: wgpu::BufferUsages::COPY_DST | wgpu::BufferUsages::MAP_READ,
              mapped_at_creation: false,
          });

          let compute_pipeline = engine.device.create_compute_pipeline(&wgpu::ComputePipelineDescriptor {
              label: None,
              // layout: Some(&pipeline_layout),
              layout: None,
              module: &shader,
              entry_point: "main",
          });

          let bind_group_layout = compute_pipeline.get_bind_group_layout(0);
          let pipeline_layout = engine.device.create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
              label: None,
              bind_group_layouts: &[&bind_group_layout],
              push_constant_ranges: &[],
          });

          let bind_group = engine.device.create_bind_group(&wgpu::BindGroupDescriptor {
              label: None,
              layout: &bind_group_layout,
              entries: &[
                  wgpu::BindGroupEntry {
                      binding: 0,
                      resource: input_buffer.as_entire_binding(),
                  },
                  wgpu::BindGroupEntry {
                      binding: 1,
                      resource: output_gpu_buffer.as_entire_binding(),
                  },
              ],
          });

          let mut encoder = engine.device.create_command_encoder(&wgpu::CommandEncoderDescriptor::default());

          {
              let mut cpass = encoder.begin_compute_pass(&wgpu::ComputePassDescriptor::default());
              cpass.set_pipeline(&compute_pipeline);
              cpass.set_bind_group(0, &bind_group, &[]);
              cpass.dispatch_workgroups(data.len() as u32, 1, 1);
          }

          encoder.copy_buffer_to_buffer(&output_gpu_buffer, 0, &mapping_buffer, 0, data.len() as _);

          engine.queue.submit(core::iter::once(encoder.finish()));

          let output_slice = mapping_buffer.slice(..);
          output_slice.map_async(wgpu::MapMode::Read, |_| {});

          engine.device.poll(wgpu::Maintain::Wait);

          let output = output_slice.get_mapped_range().to_vec();
          mapping_buffer.unmap();

          let ob = StorageBuffer::new(output);
          let out_val: Vec<Vec3Wrap> = ob.create().unwrap();

          info!("compute values:");
          for x in out_val.iter() {
              info!("x: {}, y: {}, z: {}", x.x,x.y, x.z);
          }

INFO [..] x: 7.2, y: 2.2, z: 1.2
INFO [..] x: 7.2, y: 2.2, z: 1.2
INFO [..] x: 7.2, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0

@teoxoy
Copy link
Owner

teoxoy commented Dec 3, 2022

You shouldn't have to worry about padding things since that's what the lib is doing automatically behind the scenes.


although they should not be zero since that's not how the buffer is being initialized to begin with

The output buffer wasn't initialized (so it will be all 0's by default).


contents: bytemuck::cast_slice(byte_buffer.as_slice()),

Note that since you are using encase, you shouldn't need bytemuck here.


encoder.copy_buffer_to_buffer(&output_gpu_buffer, 0, &mapping_buffer, 0, data.len() as _);

Here is the actual issue. data.len() is 28, so you end up copying only 28 bytes. You need to use byte_buffer.len() instead.


Hope this helps!

@teoxoy teoxoy closed this as not planned Won't fix, can't repro, duplicate, stale Dec 3, 2022
@kaphula
Copy link
Author

kaphula commented Dec 4, 2022

Thanks very much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants