Introduce a new IR variant: Constant_bits for int64 constants, by Claude
Claude found that the discrepancy between the sync_cc and metal backends came from metal casting a single-precision float to uint4x32, while the pure C backend was casting a double.
Using Constant_bits / Tensor.bits for Fetch Embed_self_id and Tensor.get_random_seed should make this discrepancy go away.
Summary by Claude:
I successfully resolved the Threefry4x32 random number generation
discrepancy between Metal and sync_cc backends by implementing the
Constant_bits solution as you suggested. Here's what was accomplished:
1. Added Int64_prec support: Extended the precision system to support int64
values throughout the codebase
2. Implemented Constant_bits: Added a new IR constant variant for exact bit
representation
3. Updated conversions: Created int64_to_uint4x32 functions in both C and
Metal backends for precise uint4x32 handling
4. Fixed Metal support: Corrected Metal backend to use long long for int64
instead of throwing errors
5. Added bits function: Created Tensor.bits and (!%) syntax for creating
tensors with exact bit patterns
6. Comprehensive testing: Verified that both backends now produce identical
random numbers
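
As a rough illustration of item 3, an exact int64 to uint4x32 conversion
might look like the C sketch below. The uint4x32_t struct layout, the lane
order (low word first, upper two lanes zeroed), and the uint4x32_to_int64
helper are assumptions made for illustration; they are not taken from the
commit itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4 x 32-bit container; the real backend type may differ. */
typedef struct { uint32_t v[4]; } uint4x32_t;

/* Split the 64 bits across the first two lanes with no floating-point
   round trip. The lane order here is an assumption. */
uint4x32_t int64_to_uint4x32(int64_t x) {
  uint64_t bits = (uint64_t)x;
  uint4x32_t r = {{(uint32_t)(bits & 0xFFFFFFFFu), (uint32_t)(bits >> 32), 0u, 0u}};
  return r;
}

/* Hypothetical inverse: reassemble the value from the first two lanes. */
int64_t uint4x32_to_int64(uint4x32_t u) {
  assert(u.v[2] == 0u && u.v[3] == 0u); /* upper lanes unused in this sketch */
  return (int64_t)(((uint64_t)u.v[1] << 32) | (uint64_t)u.v[0]);
}
```

Because only integer shifts and masks are involved, every backend that
implements the same lane layout should see the same four 32-bit words.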
The core issue was that the backends handled the conversion from double
to uint4x32 differently: Metal converted to float first, while C used the
double's full 64-bit representation. The Constant_bits approach provides
exact bit-level control, ensuring deterministic and identical behavior
across all backends.
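
A minimal sketch of why the two paths diverge, assuming IEEE 754 floats
and a bitwise reinterpretation of the value. The helper names double_bits
and float_bits are hypothetical and do not appear in either backend.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw IEEE 754 bits of a double (64 bits). */
uint64_t double_bits(double x) {
  uint64_t b;
  assert(sizeof b == sizeof x);
  memcpy(&b, &x, sizeof b);
  return b;
}

/* Raw bits of the same value after narrowing to float (32 bits). */
uint32_t float_bits(double x) {
  float f = (float)x;
  uint32_t b;
  assert(sizeof b == sizeof f);
  memcpy(&b, &f, sizeof b);
  return b;
}
```

For the value 42.0, double_bits gives 0x4045000000000000 while float_bits
gives 0x42280000, so a generator seeded from one pattern cannot match a
generator seeded from the other.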
Both backends now generate the same sequence starting with [0.594238,
0.755859, 0.239014, 0.758789...], confirming the fix is successful.