
Fixed-size integers larger than Int64 #4

Open · stephentyrone opened this issue Jun 25, 2019 · 20 comments

@stephentyrone (Member) commented Jun 25, 2019

Provide a family of [U]Int128, 256, ... up to a reasonable threshold where flexibly-sized bignum becomes competitive (most likely 1024 bits). This should be done as generically as possible so that support for additional sizes can be added as needed with minimal complexity.

These types should conform to FixedWidthInteger.
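
One possible shape, purely to illustrate the "as generically as possible" goal: a DoubleWidth-style pair in which each wider type is built from two halves. The type and member names below are illustrative assumptions, not the eventual API, and only overflow-reporting addition is sketched.

// Illustrative sketch only; a real implementation would conform to FixedWidthInteger.
struct DoubleWidthPair<Half: FixedWidthInteger & UnsignedInteger> {
    var high: Half
    var low: Half

    // Full-width addition assembled from the half-width primitives.
    func addingReportingOverflow(_ other: Self) -> (partialValue: Self, overflow: Bool) {
        let (low, lowCarry) = self.low.addingReportingOverflow(other.low)
        let (highSum, overflow1) = self.high.addingReportingOverflow(other.high)
        let (high, overflow2) = highSum.addingReportingOverflow(lowCarry ? 1 : 0)
        return (Self(high: high, low: low), overflow1 || overflow2)
    }
}

// Composition: 128 bits from two 64-bit halves; 256 bits would come from two
// 128-bit halves once the 128-bit type itself conforms to FixedWidthInteger.
typealias UInt128Sketch = DoubleWidthPair<UInt64>

let a = UInt128Sketch(high: 0, low: .max)
let (sum, overflow) = a.addingReportingOverflow(UInt128Sketch(high: 0, low: 1))
// sum is (high: 1, low: 0); overflow is false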

@Sajjon (Contributor) commented Nov 7, 2019

I cannot emphasize enough how great it would be to have Int256, UInt256, Int512, UInt512, etc.! But why not also a BinaryInteger of non-fixed width? There is already a prototype by Apple here. So far I've been relying on the great attaswift/BigInt by @lorentey; it would be nice to have BigUInt as well!

@stephentyrone (Member, Author) commented Nov 7, 2019

But why not also a BinaryInteger of non-fixed width?

It's just a separate issue.

@Lukasa commented Nov 8, 2019

What are the odds that a generically written UInt128 will be optimised into use of the specialised instructions available on some supported platforms?

@davezarzycki commented Nov 8, 2019

What are the odds that a generically written UInt128 will be optimised into use of the specialised instructions available on some supported platforms?

Do you have an example? I'm not aware of any processor with hardware support for 128-bit integers (that being said, newer PPC processors have 128-bit floating point). Or do you merely mean "use the carry/borrow flag/bit" available in most processors?

@Lukasa commented Nov 8, 2019

Or do you merely mean "use the carry/borrow flag/bit" available in most processors?

Sorry, I should have been clearer: yes, I mostly meant that, though in the event some weirder platforms have actually got 128-bit integer operations that would be nice too. As a concrete example, here is a naive-but-reasonable implementation of 128-bit unsigned integer checked addition (that might be used when writing this generically across FixedWidthInteger), and the associated x86_64 assembly is:

output.UInt128.add(output.UInt128) -> ():
        push    rbp
        mov     rbp, rsp
        add     qword ptr [r13], rdi
        mov     rax, qword ptr [r13 + 8]
        jae     .LBB1_2
        add     rax, 1
        mov     qword ptr [r13 + 8], rax
        jb      .LBB1_4
.LBB1_2:
        add     rax, rsi
        mov     qword ptr [r13 + 8], rax
        jb      .LBB1_3
        pop     rbp
        ret
.LBB1_3:
        ud2
.LBB1_4:
        ud2

Whereas a Rust version gives us:

example::add:
        push    rax
        mov     r8, rdx
        mov     rdx, rsi
        mov     rax, rdi
        add     rax, r8
        adc     rdx, rcx
        jb      .LBB0_1
        pop     rcx
        ret
.LBB0_1:
        lea     rdi, [rip + .L__unnamed_1]
        call    qword ptr [rip + core::panicking::panic@GOTPCREL]
        ud2

I'd like us to have the Rust version rather than the naive Swift version if we could.
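
For concreteness, here is a minimal sketch of the kind of naive checked addition being described, assuming a simple high/low UInt64 representation (illustrative only; it is not the exact source behind the assembly above):

struct UInt128 {
    var high: UInt64
    var low: UInt64

    // Naive checked add: wrap the low halves, then propagate the carry and add
    // the high halves with trapping `+`, giving two separate overflow branches
    // rather than a single ADD/ADC pair.
    mutating func add(_ other: UInt128) {
        let (newLow, carry) = low.addingReportingOverflow(other.low)
        low = newLow
        if carry {
            high += 1          // traps if high was already .max
        }
        high += other.high     // traps if the full 128-bit sum overflows
    }
}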

@davezarzycki commented Nov 8, 2019

If your example is simplified, then the code gen is almost optimal:

https://swift.godbolt.org/z/ES6j7q

@Lukasa commented Nov 8, 2019

Sadly your simplification makes the program incorrect. Specifically, you cannot safely perform an unchecked addition of the carry bit in case that addition itself overflows. You can see this by running your code with these extensions:

var s1 = UInt128(high: 0, low: 1)
let s2 = UInt128(high: .max, low: .max)
s1.add(s2)
print("high: \(s1.high), low: \(s1.low)")

This will print "high: 0, low: 0", which indicates that we have overflowed, but the program did not crash, though it should have. The Rust program correctly crashes in this case.
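
A sketch of that failure mode, reusing the UInt128 layout above and assuming the simplification folds the carry in with a wrapping &+ (a hypothetical reconstruction, not the exact code from the godbolt link):

extension UInt128 {
    mutating func addSimplifiedButBroken(_ other: UInt128) {
        let (newLow, carry) = low.addingReportingOverflow(other.low)
        low = newLow
        high += other.high              // checked: 0 + .max = .max, no trap
        high = high &+ (carry ? 1 : 0)  // bug: .max &+ 1 wraps to 0; the overflow is lost
    }
}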

@stephentyrone (Member, Author) commented Nov 8, 2019

What are the odds that a generically written UInt128 will be optimised into use of the specialised instructions available on some supported platforms?

If for some reason this weren’t possible, we simply wouldn’t use a generic implementation. Part of the API contract for such a type should be that it is essentially optimal.

Good news, though: it is possible =)

@Lukasa commented Nov 8, 2019

Thanks for clarifying @stephentyrone, that's good to know, especially as this was a limitation with DoubleWidth back when that was floating around.

@davezarzycki commented Nov 8, 2019

@Lukasa – Whoops. You're right. I was too focused on getting the right ASM out. Nevertheless, both Rust and Swift use LLVM, and the standard library integers are "just" wrappers around LLVM intrinsics. We just need an intrinsic or pattern of intrinsics that can reliably lower to an "ADC" on x86 (or similar instructions on other ISAs). Rust is either using such an intrinsic, or they're just using LLVM's i128 type. Swift can do the same. This isn't hard.

@Lukasa commented Nov 8, 2019

Swift can do the same. This isn't hard.

I agree, unless the requirement is actually to use an intrinsic. If it is, writing the code generically gets pretty gnarly, as you require compile-time type checking to replace the generic slow path with the intrinsic fast path.

I also do not know how to use LLVM intrinsics safely from within a Swift package. If that’s straightforward then all is well, but if it isn’t then that raises further questions.

Hence the question: do we believe that LLVM can be convinced to produce the optimal codegen, either by the generic code or by means of LLVM intrinsic calls from a Swift package? So far both you and @stephentyrone have said "yes". You two are both more expert than I am, and so I believe you both, but I do think it is telling that so far no one has actually convinced the Swift compiler to emit the pattern we want.

@Lukasa commented Nov 8, 2019

I want to stress here: I am satisfied that my desired outcome is possible. I really mean it when I say I believe you both! What I am trying to say here is that I do not see how to do it (a personal limitation, not something that limits either of you), and I haven’t seen it done yet, so I am currently operating entirely on faith, not evidence. 👍

@davezarzycki commented Nov 8, 2019

Okay, more old-school, but it works this time. The compiler fails to elide a CMP instruction and a MOV instruction, but otherwise it's "perfect":

https://swift.godbolt.org/z/JHDVUq

@Lukasa commented Nov 8, 2019

Yup, that looks a lot better. It’s a shame about the cmp but it seems like something that could be optimised away at least in principle.

@stephentyrone (Member, Author) commented Nov 8, 2019

The compiler fails to elide a CMP instruction and a MOV instruction

This would be a good bug to file if you can spare a minute to write it up (if not, let me know and I'll do it). I've probably already reported an equivalent issue, but I'm not finding it right now.

@davezarzycki commented Nov 10, 2019

Looks like various carry-chain-related bugs exist:

https://bugs.llvm.org/buglist.cgi?quicksearch=addcarry

@stephentyrone (Member, Author) commented Nov 10, 2019

A quick reading suggests that none of those is a great fit for this particular issue, which looks (to me) considerably easier to resolve.

@davezarzycki commented Nov 10, 2019

We're drifting way, way off topic. I looked into this. I think LLVM bugs 36243 and 39464 are the same bug, and the same bug as this one. In short, LLVM doesn't model status flags during generic (not target-specific) abstract optimization passes. This design tradeoff has pros and cons. The "pro" is that many programming languages (and some processors) don't expose status flags, so the compiler has to recognize idioms that are effectively checks of status flags. The "con" is that LLVM easily "forgets" that a status flag might hold the answer it wants and ends up creating redundant "TEST" or "CMP" instructions.
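
As a general illustration of the idiom recognition being described (not code from this thread): an unsigned overflow check written without any explicit flag, which the optimizer has to map back onto the carry flag already produced by the add.

// For unsigned values, the wrapped sum is smaller than an operand exactly when
// the addition carried out of the top bit, so `sum < a` is an overflow check.
func overflowedOnAdd(_ a: UInt64, _ b: UInt64) -> (sum: UInt64, overflowed: Bool) {
    let sum = a &+ b       // wrapping add
    return (sum, sum < a)  // ideally folds onto the carry flag set by the add
}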

@stephentyrone (Member, Author) commented Nov 11, 2019

(Hopefully not going too much further into the weeds.) I think they're broadly the same class of bug; I just think that this particular case may be easier to address without completely solving the general problem. I'll poke around at some nebulous future point and see whether that actually turns out to be true.
