Add missing functions for `f16` and `f128` #587

tgross35 · 2024-04-07T07:00:05Z

Fixes #567

README.md

tgross35 · 2024-04-10T19:19:36Z

@Amanieu this will be the corresponding update to builtins once rust-lang/rust#122470 merges. How do I need to gate things here since it will only work with a recent nightly?

Amanieu · 2024-04-10T19:33:47Z

compiler-builtins is effectively an internal implementation detail of rustc, you're not meant to use it directly. So it therefore only supports the latest rustc anyways.

However you may have problems with bootstrap since that uses the latest beta which won't have support for the f16/f128 types. I believe cfg(bootstrap) should work for dependencies of the standard library though.

Amanieu · 2024-04-10T19:35:09Z

README.md

@@ -264,7 +269,6 @@ These builtins involve floating-point types ("`f128`", "`f80`" and complex numbe
 - ~~ppc/gcc_qmul.c~~
 - ~~ppc/gcc_qsub.c~~
 - ~~ppc/multc3.c~~
- ~~subtf3.c~~
 - ~~trunctfdf2.c~~
 - ~~trunctfsf2.c~~


Are you sure these aren't needed? I would expect all the tf builtins to be needed for f128.

They are, I just wanted to get arithmetic passing before working on those.

tgross35 · 2024-04-10T19:36:55Z

Awesome, thanks for the information. Luckily compiler support did get merged right before the beta branch so I have been able to avoid cfg(not(bootstrap))ing in most places.

tgross35 · 2024-04-12T07:01:02Z

What is the best way to test this locally? For everything I have tried, the macros at

compiler-builtins/testcrate/tests/lse.rs

Lines 94 to 100 in 7240849

    
           compiler_builtins::foreach_cas!(cas::test); 
        
           compiler_builtins::foreach_cas16!(test_cas16); 
        
           compiler_builtins::foreach_swp!(swap::test); 
        
           compiler_builtins::foreach_ldadd!(add::test); 
        
           compiler_builtins::foreach_ldclr!(clr::test); 
        
           compiler_builtins::foreach_ldeor!(xor::test); 
        
           compiler_builtins::foreach_ldset!(or::test);

can't be located and running ./ci/run-docker.sh doesn't find the right linker. Then the tests at https://github.com/tgross35/compiler-builtins/blob/7d347bb9e348e070a87b4123a29c3fdd796d4dee/testcrate/tests/addsub.rs#L112 fail because the symbols to compare against aren't available on my machine (which is aarch64, maybe I just need to gate this?).

Looking forward to being able to get some output from CI but I'm still waiting on nightly to catch up 🙂

Amanieu · 2024-04-13T00:53:59Z

Normally run-docker.sh just works, but I guess nobody has tried running it on aarch64. It works fine for me when run on an x86_64 host.

If you're feeling adventurous, you can try fixing the tests to work on an aarch64 host.

Regarding the LSE issues, I think that may be because you are on an Apple system where these symbols don't exist. That test should be gated on target_os = "linux".

tgross35 · 2024-04-13T08:28:33Z

The linker just wasn't available on ubuntu 18, but I had to add a way to not use host rustc. Opened #591 to add a way to use a different image

bjorn3 · 2024-04-14T06:42:45Z

src/float/add.rs

@@ -201,6 +201,10 @@ intrinsics! {
        add(a, b)
    }

+    pub extern "C" fn __addtf3(a: f128, b: f128) -> f128 {
+        add(a, b)
+    }


This will unconditionally codegen f16 and f128 usage, breaking cg_clif and cg_gcc. Maybe add a cargo feature to disable these new intrinsics?

tgross35 · 2024-04-15T09:31:35Z

What is the reason tests don't seem to get run on these platforms? (aarch, i586, thumb, wasm). Wondering if they should be enabled for this.

tgross35 · 2024-04-17T07:14:44Z

My widening function is failing at "0xffffffffffffffffffffffffffffffff * 0xffffffffffffffffffffffffffffffff = 0xfffffffffffffffffffffffffffffffe00000000000000000000000000000001 got 0xfffffffffffffffffffffffffffffffd00000000000000000000000000000001", https://github.com/rust-lang/compiler-builtins/actions/runs/8717331100/job/23912367209?pr=587#step:9:370. This seems to be identical to what LLVM's wideMultiply does, https://clang.godbolt.org/z/Y4vMxd6se (original source), but from Python it seems like the upper half should end with an e rather than a d:

>>> hex(0xffffffffffffffffffffffffffffffff * 0xffffffffffffffffffffffffffffffff)
'0xfffffffffffffffffffffffffffffffe00000000000000000000000000000001'

Relevant code if anyone has any ideas

compiler-builtins/src/int/big.rs

Lines 233 to 284 in 3cc6355

    
           fn zero_widen_mul(self, rhs: Self) -> Self::D { 
        
               let product11: u64 = word!(1, self) * word!(1, rhs); 
        
               let product12: u64 = word!(1, self) * word!(2, rhs); 
        
               let product13: u64 = word!(1, self) * word!(3, rhs); 
        
               let product14: u64 = word!(1, self) * word!(4, rhs); 
        
               let product21: u64 = word!(2, self) * word!(1, rhs); 
        
               let product22: u64 = word!(2, self) * word!(2, rhs); 
        
               let product23: u64 = word!(2, self) * word!(3, rhs); 
        
               let product24: u64 = word!(2, self) * word!(4, rhs); 
        
               let product31: u64 = word!(3, self) * word!(1, rhs); 
        
               let product32: u64 = word!(3, self) * word!(2, rhs); 
        
               let product33: u64 = word!(3, self) * word!(3, rhs); 
        
               let product34: u64 = word!(3, self) * word!(4, rhs); 
        
               let product41: u64 = word!(4, self) * word!(1, rhs); 
        
               let product42: u64 = word!(4, self) * word!(2, rhs); 
        
               let product43: u64 = word!(4, self) * word!(3, rhs); 
        
               let product44: u64 = word!(4, self) * word!(4, rhs); 
        
               let sum0: u128 = u128::from(product44); 
        
               let sum1: u128 = u128::from(product34) + u128::from(product43); 
        
               let sum2: u128 = u128::from(product24) + u128::from(product33) + u128::from(product42); 
        
               let sum3: u128 = u128::from(product14) 
        
                   + u128::from(product23) 
        
                   + u128::from(product32) 
        
                   + u128::from(product41); 
        
               let sum4: u128 = u128::from(product13) + u128::from(product22) + u128::from(product31); 
        
               let sum5: u128 = u128::from(product12) + u128::from(product21); 
        
               let sum6: u128 = u128::from(product11); 
        
               let r0: u128 = 
        
                   (sum0 & u128::from(WORD_FULL_MASK)) + ((sum1 & u128::from(WORD_LO_MASK)) << 32); 
        
               let r1: u128 = (sum0 >> 64) 
        
                   + ((sum1 >> 32) & u128::from(WORD_FULL_MASK)) 
        
                   + (sum2 & u128::from(WORD_FULL_MASK)) 
        
                   + ((sum3 << 32) & u128::from(WORD_HI_MASK)); 
        
               let lo = r0.wrapping_add(r1 << 64); 
        
               let hi = (r1 >> 64) 
        
                   + (sum1 >> 96) 
        
                   + (sum2 >> 64) 
        
                   + (sum3 >> 32) 
        
                   + sum4 
        
                   + (sum5 << 32) 
        
                   + (sum6 << 64); 
        
               u256([ 
        
                   (lo & U128_LO_MASK) as u64, 
        
                   ((lo >> 64) & U128_LO_MASK) as u64, 
        
                   (hi & U128_LO_MASK) as u64, 
        
                   ((hi >> 64) & U128_LO_MASK) as u64, 
        
               ]) 
        
           }

marthadev · 2024-04-24T23:08:20Z

My widening function is failing at "0xffffffffffffffffffffffffffffffff * 0xffffffffffffffffffffffffffffffff = 0xfffffffffffffffffffffffffffffffe00000000000000000000000000000001 got 0xfffffffffffffffffffffffffffffffd00000000000000000000000000000001", https://github.com/rust-lang/compiler-builtins/actions/runs/8717331100/job/23912367209?pr=587#step:9:370. This seems to be identical to what LLVM's wideMultiply does, https://clang.godbolt.org/z/Y4vMxd6se (original source), but from Python it seems like the upper half should end with an e rather than a d:

AFAICT the code is dropping a carry with the wrapping_add

compiler-builtins/src/int/big.rs

Lines 269 to 276 in 3cc6355

let lo = r0.wrapping_add(r1 << 64);

let hi = (r1 >> 64)

+ (sum1 >> 96)

+ (sum2 >> 64)

+ (sum3 >> 32)

+ sum4

+ (sum5 << 32)

+ (sum6 << 64);

Should probably be something like:

    let (lo, carry) = r0.overflowing_add(r1 << 64);
    let hi = (r1 >> 64)
        + (sum1 >> 96)
        + (sum2 >> 64)
        + (sum3 >> 32)
        + sum4
        + (sum5 << 32)
        + (sum6 << 64)
        + u128::from(carry);

tgross35 · 2024-04-25T00:21:34Z

Thanks @marthadev, I just today realized that we have a nightly carrying_add too to make this even easier. Any idea if what LLVM has is also incorrect then, or am I missing something in that comparison?

marthadev · 2024-04-25T00:47:36Z

I just today realized that we have a nightly carrying_add

TIL, it does indeed look nice

Any idea if what LLVM has is also incorrect then, or am I missing something in that comparison?

The LLVM implementation seems to have the same problem, which honestly surprises me given that the code is 10 years old. I have no clue though how widely that function is being used, but having a logic bug in integer arithmetic seems pretty bad, so I am glad you found it.

Edit: Curiously enough, the LLVM 64x64 version even explicitly comments about the carry

tgross35 · 2024-04-25T01:13:07Z

I opened an issue at llvm/llvm-project#90009 for the LLVM side to take a look at, thanks for confirming.

@beetrees

Try splitting part of 'Int' into 'MinInt' so we don't need to implement everything on u256/i256 Add addsub test Add mul/div/rem tests Add cmp test Remove 32-bit div implementation formatting updates disable div tests for now Bigint updates Big update Fix widen mul wrapping add disable duplicate symbols in builtins Apply temporary unord fix from @beetrees rust-lang#593 tests add lowerhex display errors by ref tests fix-test Update big tests Fix core calls Disable widen_mul for signed Test adding symbols in build.rs Add a feature to compile intrinsics that are missing on the system for testing update Disable f128 tests on platforms without system support add missing build.rs file pull cas file from master testgs print more div values Add a benchmark Work on fixing bit widths Update benchmark

tgross35 · 2024-05-13T20:48:38Z

Everything except division has been cleaned up and split off to #606

tgross35 · 2024-05-21T07:24:33Z

Superceded by a few PRs, many have merged already.

tgross35 mentioned this pull request Apr 7, 2024

Tracking Issue for f16 and f128 float types rust-lang/rust#116909

Open

3 tasks

RalfJung reviewed Apr 7, 2024

View reviewed changes

README.md Outdated Show resolved Hide resolved

tgross35 force-pushed the f16-f128-intrinsics branch from fc0051a to 405f0aa Compare April 7, 2024 17:15

Amanieu reviewed Apr 10, 2024

View reviewed changes

tgross35 force-pushed the f16-f128-intrinsics branch 2 times, most recently from d6c05d0 to e5e93e8 Compare April 12, 2024 02:15

tgross35 force-pushed the f16-f128-intrinsics branch from 7d347bb to a467dd2 Compare April 13, 2024 19:31

bjorn3 reviewed Apr 14, 2024

View reviewed changes

tgross35 force-pushed the f16-f128-intrinsics branch 2 times, most recently from cb353e2 to 44c414f Compare April 14, 2024 07:30

beetrees mentioned this pull request Apr 14, 2024

Add builtins for f16/f128 float conversions #593

Merged

tgross35 force-pushed the f16-f128-intrinsics branch 2 times, most recently from 32d6c0f to db3fb45 Compare April 14, 2024 09:05

tgross35 force-pushed the f16-f128-intrinsics branch from a19f4ec to 53203b0 Compare April 17, 2024 06:08

tgross35 force-pushed the f16-f128-intrinsics branch from 5dc5fda to e03afc8 Compare April 19, 2024 17:45

tgross35 mentioned this pull request Apr 25, 2024

Possible incorrect result in compiler-rt 128x128 wideMultiply llvm/llvm-project#90009

Open

tgross35 force-pushed the f16-f128-intrinsics branch from fcd9612 to 6d70c0f Compare May 6, 2024 05:37

Add missing functions for f16 and f128

f3f4dec

tgross35 force-pushed the f16-f128-intrinsics branch from 6d70c0f to bbe3aef Compare May 6, 2024 06:34

tgross35 added 11 commits May 6, 2024 01:40

update

e2b4bbc

Fix some runtime panics

fe8e3bd

fix feature flags

b288658

cfg-fix

72e8223

update

0d2685b

Fix bigint missing carry

9500b7d

Add apfloat fallback

c459693

add fallback for div

de8f9be

Simplify test macros

27ea26c

Make macro work without converting

ba6162c

Macros are working for all f128 cases

0facad1

tgross35 mentioned this pull request May 10, 2024

Add addition, subtraction, multiplication, and compare operations for f128 #606

Merged

tgross35 closed this May 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add missing functions for `f16` and `f128` #587

Add missing functions for `f16` and `f128` #587

tgross35 commented Apr 7, 2024 •

edited

tgross35 commented Apr 10, 2024

Amanieu commented Apr 10, 2024

Amanieu Apr 10, 2024

tgross35 Apr 10, 2024

tgross35 commented Apr 10, 2024

tgross35 commented Apr 12, 2024

Amanieu commented Apr 13, 2024

tgross35 commented Apr 13, 2024

bjorn3 Apr 14, 2024

tgross35 commented Apr 15, 2024 •

edited

tgross35 commented Apr 17, 2024 •

edited

marthadev commented Apr 24, 2024

tgross35 commented Apr 25, 2024

marthadev commented Apr 25, 2024 •

edited

tgross35 commented Apr 25, 2024

tgross35 commented May 13, 2024

tgross35 commented May 21, 2024

Add missing functions for f16 and f128 #587

Add missing functions for f16 and f128 #587

Conversation

tgross35 commented Apr 7, 2024 • edited

tgross35 commented Apr 10, 2024

Amanieu commented Apr 10, 2024

Amanieu Apr 10, 2024

Choose a reason for hiding this comment

tgross35 Apr 10, 2024

Choose a reason for hiding this comment

tgross35 commented Apr 10, 2024

tgross35 commented Apr 12, 2024

Amanieu commented Apr 13, 2024

tgross35 commented Apr 13, 2024

bjorn3 Apr 14, 2024

Choose a reason for hiding this comment

tgross35 commented Apr 15, 2024 • edited

tgross35 commented Apr 17, 2024 • edited

marthadev commented Apr 24, 2024

tgross35 commented Apr 25, 2024

marthadev commented Apr 25, 2024 • edited

tgross35 commented Apr 25, 2024

tgross35 commented May 13, 2024

tgross35 commented May 21, 2024

Add missing functions for `f16` and `f128` #587

Add missing functions for `f16` and `f128` #587

tgross35 commented Apr 7, 2024 •

edited

tgross35 commented Apr 15, 2024 •

edited

tgross35 commented Apr 17, 2024 •

edited

marthadev commented Apr 25, 2024 •

edited