add support for leading_zeros, trailing_zeros and fix count_ones #213

Merged
merged 15 commits into from
Mar 21, 2025

Conversation

Firestar99
Member

@Firestar99 Firestar99 commented Jan 29, 2025

On main, leading_zeros and trailing_zeros require an obscure Intel extension. This PR changes them to use the GLSL.std.450 functions FindILsb and FindUMsb, which cover the same functionality. They differ slightly for an input of 0: Rust expects the bit width of the type, but they return !0 (-1). They also only accept 32-bit integers, so some careful type conversion and bitcasting is required to emulate them for all other types, like u8, u16 and u64.
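To make the zero case and the width handling concrete, here is a minimal CPU-side sketch in plain Rust. It is not the code this PR emits: `find_u_msb` and `find_i_lsb` are hypothetical software models of the two GLSL.std.450 instructions (32-bit input, -1 for an input of 0), and the wrapper names are made up for illustration.

```rust
/// Hypothetical software model of FindUMsb: index of the most significant
/// set bit of a 32-bit value, or -1 when the input is 0.
fn find_u_msb(x: u32) -> i32 {
    let mut i = 31i32;
    while i >= 0 {
        if (x >> i) & 1 == 1 {
            return i;
        }
        i -= 1;
    }
    -1
}

/// Hypothetical software model of FindILsb: index of the least significant
/// set bit of a 32-bit value, or -1 when the input is 0.
fn find_i_lsb(x: u32) -> i32 {
    for i in 0..32 {
        if (x >> i) & 1 == 1 {
            return i;
        }
    }
    -1
}

fn leading_zeros_u32(x: u32) -> u32 {
    // 31 - (-1) == 32, so the zero case falls out of the signed subtraction.
    (31 - find_u_msb(x)) as u32
}

fn trailing_zeros_u32(x: u32) -> u32 {
    let lsb = find_i_lsb(x);
    // Map the -1 "not found" result to the bit width Rust expects.
    if lsb < 0 { 32 } else { lsb as u32 }
}

// Narrow types: zero-extend to 32 bits, then correct for the extra bits.
fn leading_zeros_u8(x: u8) -> u32 {
    leading_zeros_u32(u32::from(x)) - 24
}

fn leading_zeros_u16(x: u16) -> u32 {
    leading_zeros_u32(u32::from(x)) - 16
}

fn trailing_zeros_u8(x: u8) -> u32 {
    trailing_zeros_u32(u32::from(x)).min(8)
}

fn trailing_zeros_u16(x: u16) -> u32 {
    trailing_zeros_u32(u32::from(x)).min(16)
}

// 64-bit: split into two 32-bit halves and pick the half that decides.
fn leading_zeros_u64(x: u64) -> u32 {
    let hi = (x >> 32) as u32;
    let lo = x as u32;
    if hi != 0 { leading_zeros_u32(hi) } else { 32 + leading_zeros_u32(lo) }
}

fn trailing_zeros_u64(x: u64) -> u32 {
    let hi = (x >> 32) as u32;
    let lo = x as u32;
    if lo != 0 { trailing_zeros_u32(lo) } else { 32 + trailing_zeros_u32(hi) }
}
```

Each wrapper can be compared directly against the standard library, for example `leading_zeros_u64(x) == x.leading_zeros()`.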

This also fixes count_ones and bit_reverse: VUID-StandaloneSpirv-Base-04781 requires their arguments to always be 32-bit integers as well, but we were ignoring it.
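The same widening and splitting idea applies here. A rough sketch in plain Rust, assuming the only available primitives are 32-bit: `bit_count_32` and `bit_reverse_32` are hypothetical stand-ins for the 32-bit-only operations, and none of these names come from the PR.

```rust
/// Stand-in for a 32-bit-only population count (assumption for this sketch).
fn bit_count_32(x: u32) -> u32 {
    x.count_ones()
}

/// Stand-in for a 32-bit-only bit reversal (assumption for this sketch).
fn bit_reverse_32(x: u32) -> u32 {
    x.reverse_bits()
}

// Zero-extension adds no set bits, so narrow counts need no correction.
fn count_ones_u8(x: u8) -> u32 {
    bit_count_32(u32::from(x))
}

fn count_ones_u16(x: u16) -> u32 {
    bit_count_32(u32::from(x))
}

// 64-bit: count both halves and add.
fn count_ones_u64(x: u64) -> u32 {
    bit_count_32(x as u32) + bit_count_32((x >> 32) as u32)
}

// After a 32-bit reversal the narrow value sits in the top bits,
// so shift it back down before truncating.
fn reverse_bits_u8(x: u8) -> u8 {
    (bit_reverse_32(u32::from(x)) >> 24) as u8
}

fn reverse_bits_u16(x: u16) -> u16 {
    (bit_reverse_32(u32::from(x)) >> 16) as u16
}

// 64-bit: reverse each half and swap them.
fn reverse_bits_u64(x: u64) -> u64 {
    let lo_reversed = u64::from(bit_reverse_32(x as u32));
    let hi_reversed = u64::from(bit_reverse_32((x >> 32) as u32));
    (lo_reversed << 32) | hi_reversed
}
```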

While removing the Intel extension, some code managing enabled extensions started triggering an unused warning, so I removed it in 1fb0301, which we can revert if needed.

Closes #210 #215

@LegNeato
Collaborator

LegNeato commented Jan 29, 2025

Sweet! I actually looked at this too, just didn't put my WIP up. Differences between what I did:

I pushed a branch. I think the code is largely correct, but it is only lightly tested:

https://github.com/LegNeato/rust-gpu/tree/clz

@Firestar99 Firestar99 force-pushed the leading_trailing_zeros branch from 702cb97 to c53e03c Compare February 10, 2025 10:06
@Firestar99 Firestar99 changed the title add support for leading_zeros and trailing_zeros add support for leading_zeros, trailing_zeros and fix count_ones Feb 10, 2025
@Firestar99
Member Author

Firestar99 commented Feb 10, 2025

This is ready for review, but completely untested beyond compile tests. I would love some integration tests or fuzzing right now, as it's quite easy to screw up bitcasts or misread documentation. But I guess some careful reviewing will have to do.

@Firestar99 Firestar99 marked this pull request as ready for review February 10, 2025 14:25
@LegNeato
Collaborator

You could rebase on #216?

@Firestar99 Firestar99 force-pushed the leading_trailing_zeros branch from 10f0283 to c0c5879 Compare February 11, 2025 14:12
@Firestar99
Member Author

Sure, there you go, rebased; git could just do it automatically. I assume CI will now run some difftests?

@LegNeato
Collaborator

Yep! Now you can add a runtime diff test for this if you want. But I'll also be revamping that PR a bit and squashing it, so I don't want to make your git stuff more annoying. Up to you whether it is worth it or you want to land as-is!

@Firestar99 Firestar99 force-pushed the leading_trailing_zeros branch from c0c5879 to f22d523 Compare March 13, 2025 17:27
@Firestar99
Member Author

Firestar99 commented Mar 13, 2025

I've removed the differential testing harness again, as you've said it isn't ready for review yet, and it tests rust-gpu vs wgsl whereas I need rust-gpu vs standard rust on the cpu.

Just did some basic diff testing in my engine against a software impl I've been using as a replacement, and something isn't right with trailing / leading zeros. Will look at it at a later date.
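For reference, the kind of comparison described here can be sketched on the CPU as follows. This is only an illustration under assumptions: `first_mismatch` and `XorShift64` are hypothetical helpers, and in a real setup the `candidate` closure would wrap values read back from the GPU rather than a local function.

```rust
/// Tiny xorshift generator so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Returns the first input for which `candidate` disagrees with `reference`,
/// checking edge cases, every single-bit value, and `samples` random values.
fn first_mismatch(
    candidate: impl Fn(u32) -> u32,
    reference: impl Fn(u32) -> u32,
    samples: usize,
) -> Option<u32> {
    let edge_cases = [0u32, 1, 2, 3, 0x7FFF_FFFF, 0x8000_0000, u32::MAX];
    let single_bits = (0..32).map(|i| 1u32 << i);
    let mut rng = XorShift64(0x9E37_79B9_7F4A_7C15);
    let random = (0..samples).map(|_| rng.next_u64() as u32);

    edge_cases
        .iter()
        .copied()
        .chain(single_bits)
        .chain(random)
        .find(|&x| candidate(x) != reference(x))
}
```

A call such as `first_mismatch(my_impl, |x| x.trailing_zeros(), 10_000)` returns `None` when the two agree on all tested inputs, and otherwise the first failing input, which is usually enough to spot a mishandled zero case.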

@Firestar99 Firestar99 force-pushed the leading_trailing_zeros branch from f891f5d to 1449527 Compare March 17, 2025 17:22
@Firestar99
Member Author

Firestar99 commented Mar 21, 2025

I made a test in my bindless framework (shader cpu) and can confirm that u32 is working!

However, I'm running into issues testing out other unsigned types:

error: cannot load type u8 in an untyped buffer load
  --> /home/firestar99/.cargo/git/checkouts/rust-gpu-11142fd2aadc2318/57c7d9d/crates/spirv-std/src/byte_addressable_buffer.rs:97:9
   |
97 |         buffer_load_intrinsic(self.data, byte_index)
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error: cannot store type struct count_leading_trailing::CountLeadingTrailingZerosTransfer<u8, u8, u8> { count: u32, leading: u32, trailing: u32, input: u8, swap_bytes: u8, reverse_bits: u8 } in an untyped buffer store
   --> /home/firestar99/.cargo/git/checkouts/rust-gpu-11142fd2aadc2318/57c7d9d/crates/spirv-std/src/byte_addressable_buffer.rs:150:9
    |
150 |         buffer_store_intrinsic(self.data, byte_index, value);
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: due to containing type 49

I may rework this to cast them to u64 and back, though I'd of course prefer if we fix that one as well. Enabling Int8, Int16, Int64, StorageBufferInt8, StorageBufferInt16 didn't help.
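One possible shape for such a workaround, sketched in plain Rust under the assumption that only 32-bit words can pass through the buffer (which is what the u32 result above suggests, but is not confirmed here). All function names are hypothetical; nothing below is part of spirv-std.

```rust
/// Split a u64 into two 32-bit words: low word first, then high word.
fn u64_to_words(x: u64) -> [u32; 2] {
    [x as u32, (x >> 32) as u32]
}

/// Rebuild a u64 from the two words produced by `u64_to_words`.
fn words_to_u64(words: [u32; 2]) -> u64 {
    u64::from(words[0]) | (u64::from(words[1]) << 32)
}

/// Carry a narrow value in a full 32-bit word (zero-extended).
fn u8_to_word(x: u8) -> u32 {
    u32::from(x)
}

fn u16_to_word(x: u16) -> u32 {
    u32::from(x)
}

/// Truncate back to the narrow type after reading the word.
fn word_to_u8(word: u32) -> u8 {
    word as u8
}

fn word_to_u16(word: u32) -> u16 {
    word as u16
}
```

With this layout the transfer struct would contain only u32 fields, and the shader would convert to the type under test right before calling the intrinsic.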

@Firestar99
Member Author

u64 fails in the exact same way...

@LegNeato I say we just merge it for now and hope anything non-u32 works as well. At the very least I fixed u32.

@Firestar99 Firestar99 enabled auto-merge March 21, 2025 17:35
SpirvType::Integer(bits, signed) => {
    let u32 = SpirvType::Integer(32, false).def(self.span(), self);

    match bits {
Collaborator

Just a style thing, but I generally feel it is clearer to match on signed as well; it makes all the cases clearer, similar to what you did in bitcast.

    .emit()
    .shift_right_logical(ty, None, arg, u32_32)
    .unwrap();
let higher = self.emit().s_convert(u32, None, higher).unwrap();
Collaborator

Doesn't this need to change depending on sign?
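As background for the sign question, here is a small plain-Rust illustration of where signedness changes a conversion and where it does not. It uses Rust `as` casts as an analogy and does not settle which SPIR-V conversion instruction the reviewed line should use; the function names are made up for this sketch.

```rust
/// High 32 bits via a logical shift on the unsigned value, then truncation.
fn high_word_logical(x: u64) -> u32 {
    (x >> 32) as u32
}

/// High 32 bits via an arithmetic shift on the signed reinterpretation.
/// The truncation discards the sign-fill bits, so the result is the same.
fn high_word_arithmetic(x: u64) -> u32 {
    ((x as i64) >> 32) as u32
}

/// Widening an i8 by reinterpreting it as u8 first: the upper bits are zero.
fn widen_zero_extend(x: i8) -> u32 {
    u32::from(x as u8)
}

/// Widening an i8 through i32: the upper bits copy the sign bit.
fn widen_sign_extend(x: i8) -> u32 {
    i32::from(x) as u32
}
```

Narrowing gives the same bits either way, but widening does not: `widen_zero_extend(-1)` is `0xFF` while `widen_sign_extend(-1)` is `0xFFFF_FFFF`. Only the zero-extended form keeps the 24 extra leading zeros that a widened 8-bit leading_zeros needs to subtract.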

@LegNeato
Collaborator

> u64 fails in the exact same way...
>
> @LegNeato I say we just merge it for now and hope anything non-u32 works as well. At the very least I fixed u32.

I'm not sure I parse this...the comment above was saying unsigned types except for u32 do not work, and then confirms u64 does not work, and then says let's land and "hope anything non-u32" works?

@Firestar99
Member Author

Firestar99 commented Mar 21, 2025

To correct things: I implemented it for u8, u16, u32 and u64, and today I wrote a test with my bindless stuff to debug why u32::trailing_zeros() was giving me crap values. But I was not able to easily adapt the test to anything other than u32, due to limitations with ByteAddressableBuffer not allowing you to read or write u8, u16 and u64 for whatever reason (which I was completely unaware of).

Anyways, I noticed that rustc actually does not call any of these methods on the signed types, as the Rust std converts them to unsigned beforehand. So I can just remove all the signed conversions, except for [i8,i16,i32,i64]::leading_zeros() with non_zero: true. I have no idea how these are reachable, but they exist and we now support them by just converting to unsigned and "retrying".
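The "convert to unsigned and retry" idea can be illustrated in plain Rust: reinterpreting a signed integer as the unsigned type of the same width keeps every bit, so the bit-counting results are unchanged. The helper names below are hypothetical and only demonstrate the equivalence; they are not the codegen path in this PR.

```rust
/// leading_zeros of an i8 computed on its unsigned reinterpretation.
fn leading_zeros_i8_via_unsigned(x: i8) -> u32 {
    (x as u8).leading_zeros()
}

/// trailing_zeros of an i8 computed on its unsigned reinterpretation.
fn trailing_zeros_i8_via_unsigned(x: i8) -> u32 {
    (x as u8).trailing_zeros()
}

/// The same idea for a 64-bit signed value.
fn leading_zeros_i64_via_unsigned(x: i64) -> u32 {
    (x as u64).leading_zeros()
}

/// count_ones is likewise unaffected by the reinterpretation.
fn count_ones_i32_via_unsigned(x: i32) -> u32 {
    (x as u32).count_ones()
}
```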

Give me another iteration on this one, but I'd like to land it before we release.

@Firestar99
Member Author

@LegNeato Did another iteration, hopefully a lot cleaner now. Removed the signed-ness everywhere it's not needed, with just one extra branch for the leading_zeros() special case, see previous edited comment.

Collaborator

@LegNeato LegNeato left a comment

LGTM!

@Firestar99 Firestar99 added this pull request to the merge queue Mar 21, 2025
Merged via the queue into main with commit 9a533a3 Mar 21, 2025
7 checks passed
@Firestar99 Firestar99 deleted the leading_trailing_zeros branch March 21, 2025 23:48
Development

Successfully merging this pull request may close these issues.

u32::leading_zeros intrinic requires weird extension to work
2 participants