
Add support for 128-bit integers #1504

Merged (6 commits) on Jul 29, 2016

Conversation

@Amanieu (Contributor) commented Feb 21, 2016

Relevant issue: #521

Rendered

[edited to link to final rendered version]

@ticki (Contributor) commented Feb 21, 2016

👍

They were originally removed, with the lack of compiler backend support cited as the reason. However, that was incorrect: they are fully supported by LLVM, so there should be nothing blocking this.

@nagisa (Contributor) commented Feb 21, 2016

> they are completely supported by LLVM

AFAIR, the actual (underlying) reason was that LLVM’s support for operations on f128 was slightly(?) broken. Since the 128-bit floats were being removed, the 128-bit integers went along with them.

@Amanieu (Contributor, Author) commented Feb 21, 2016

I don't think Rust has ever had support for 128-bit integers. The f128 stuff was completely separate.

@ticki (Contributor) commented Feb 21, 2016

Right, I misread the title as 128-bit floats.

# Detailed design
[design]: #detailed-design

From a quick look at Clang's source, 128-bit integers are supported on all 64-bit platforms and a few 32-bit ones (those with 64-bit registers: x32 and MIPS n32). To allow users to determine whether 128-bit integers are available, a `target_has_int128` cfg is added. The `i128` and `u128` types are only available when this flag is set.
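
To illustrate how that flag would be used under this draft's design, here is a minimal hypothetical sketch; the function and the fallback message are mine, not part of the RFC.

```rust
// Hypothetical use of the proposed `target_has_int128` cfg: any item that
// mentions the new types would have to be gated on the flag.
#[cfg(target_has_int128)]
fn widening_mul(a: u64, b: u64) -> u128 {
    (a as u128) * (b as u128)
}

fn main() {
    #[cfg(target_has_int128)]
    {
        println!("{}", widening_mul(u64::MAX, 2));
    }
    #[cfg(not(target_has_int128))]
    {
        println!("128-bit integers are not available on this target");
    }
}
```

(Note that later revisions of the RFC, discussed further down the thread, drop this flag in favour of supporting the types on all architectures.)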

@nagisa (Contributor) commented Feb 21, 2016

LLVM supports arbitrary sized integers (and operations on them) and will software-emulate operations on them if necessary – as inefficient as they get. LLVM produces perfectly correct assembly even on targets where clang claims there’s no support, so looking at what clang does is not the way to go here IMO.

@ticki (Contributor) commented Feb 21, 2016

Furthermore, the emulation (which may or may not happen) often performs pretty decently.

@Amanieu (Author, Contributor) commented Feb 21, 2016

I just ran a few quick tests on LLVM bitcode with i128 and i192 (test.ll), here are my results (LLVM 3.7.1):

  • add and sub are compiled down to a sequence of add with carry instructions for both i128 and i192.
  • mul and udiv are compiled to a call to __multi3 and __udivti3 for i128. These functions are implemented in compiler-rt.
  • mul and udiv generate an LLVM error for i192: LLVM ERROR: Unsupported library call operation!
  • compiler-rt contains variations of each function for 32, 64 and 128-bit integers (__mulsi3, __muldi3, __multi3). However, the 128-bit versions are only present on 64-bit platforms (#ifdef __LP64__).

In conclusion, LLVM doesn't support many operations on arbitrary sized integers. While we could in theory support 128-bit integers on all architectures, this would require extending compiler-rt on 32-bit architectures.

@nagisa (Contributor) commented Feb 21, 2016

Aha! I didn’t think of multiplication. How silly of me. I’m not too surprised about i192, since it’s not technically a power of two bytes, but it certainly seems strange to me that there’s no generic C implementation (even a slow one) of the more complex operations, like multiplication, for at least i128.


One possible complication is that primitive types aren't currently part of the prelude; instead, they are added directly to the global namespace by the compiler. The new `i128` and `u128` types will behave differently and will need to be explicitly imported.

Another possible issue is that a `u128` can hold a very large number that doesn't fit in a `f32`. We need to make sure this doesn't lead to any `undef`s from LLVM.
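
To make the concern concrete, the cast in question would look roughly like this (hypothetical at this point in the discussion, since `u128` does not exist yet; the review comments below debate what it should be guaranteed to produce):

```rust
fn main() {
    // 2^128 - 1 exceeds f32::MAX, so the result cannot be a finite f32 after
    // rounding. The question is whether +inf is guaranteed, or whether the
    // conversion inherits whatever LLVM's uitofp does with out-of-range values.
    let x: u128 = !0;
    let y = x as f32;
    println!("{}", y);
}
```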

@nagisa (Contributor) commented Feb 21, 2016

I don’t really see how that could be a problem in practice. If a u128 doesn’t fit into an f32, you get +∞, which is a perfectly valid value?

@Amanieu (Author, Contributor) commented Feb 21, 2016

See this comment. It seems this is actually treated as undefined behavior by LLVM.

@nagisa (Contributor) commented Feb 21, 2016

Right, in my tests so far an inf gets printed for the specified target. Here are my thoughts:

I want to require this to reliably result in ∞ on “overflow”, regardless of what LLVM thinks the right behaviour should be (apparently undefined). I’m inclined to think this is in no way related to deficiencies in LLVM itself, but rather a quirk in compiler-rt.

Uninterestingly enough, compiling and linking the aforementioned file for the i386 target fails with undefined reference to '__floatuntisf'. I don’t want this feature landing in a “works here, but not there” state when it is by all means technically possible to make it work regardless of target capabilities, by employing more or less expensive software emulation. If compiler-rt does not support it, we can roll our own compiler-rt that supports our tier one-to-three platforms for all the necessary operations, or wait for/implement the necessary support in compiler-rt.
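
For reference, `__floatuntisf` is the unsigned-128-bit-to-`f32` conversion routine. Below is a deliberately naive sketch of the same idea (my own, not compiler-rt's correctly rounded implementation), with the 128-bit value passed as two `u64` halves since the type doesn't exist yet:

```rust
// Naive u128 -> f32 conversion: combine the halves and let f32 arithmetic do
// the rounding. compiler-rt's __floatuntisf rounds once instead of twice.
fn u128_halves_to_f32(hi: u64, lo: u64) -> f32 {
    const TWO_POW_64: f32 = 18_446_744_073_709_551_616.0; // 2^64, exactly representable
    (hi as f32) * TWO_POW_64 + (lo as f32)
}

fn main() {
    assert_eq!(u128_halves_to_f32(0, 1 << 20), 1_048_576.0);
    // The full u128::MAX overflows f32's range and ends up as +inf here.
    assert!(u128_halves_to_f32(u64::MAX, u64::MAX).is_infinite());
}
```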

@Amanieu (Author, Contributor) commented Feb 21, 2016

The undef only occurs when the input value is a constant: https://gist.github.com/Amanieu/f87da5f0599b343c5500

Regarding supporting i128 on all architectures, I agree with you. LLVM already has full support for this; it's just compiler-rt that needs to be modified to support 128-bit integer operations on 32-bit systems. I'll update the RFC accordingly.

# Drawbacks
[drawbacks]: #drawbacks

It adds a type to the language that may or may not be present depending on the target architecture. This could lead to surprises, bu

@nagisa (Contributor) commented Feb 21, 2016

“bu”…?

# Unresolved questions
[unresolved]: #unresolved-questions

How should 128-bit literals be handled? The easiest solution would be to limit integer literals to 64 bits, which is what GCC does (no support for `__int128` literals).

@nagisa (Contributor) commented Feb 21, 2016

If we add support for i128, we must in any case introduce some support for i128 calculations in the compiler, for cases like `const X: i128 = !0`, so it seems to me that supporting long literals is the smallest of all the implementation woes in this regard.
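
For concreteness, a sketch of the constant evaluation this implies (the first constant is nagisa's example; the rest are my own illustrative additions):

```rust
// The compiler needs 128-bit constant arithmetic regardless of literal width:
const X: i128 = !0;       // nagisa's example: all bits set, i.e. -1
const Y: u128 = 1 << 127; // requires 128-bit evaluation of the shift
// And, if literals are not capped at 64 bits, a literal wider than u64:
const Z: u128 = 170_141_183_460_469_231_731_687_303_715_884_105_728; // 2^127

fn main() {
    assert_eq!(X, -1);
    assert_eq!(Y, Z);
}
```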

@Aatch (Contributor) commented Feb 22, 2016

LLVM has some APInt and APFloat types we could use. Though we could also make our own.

@nagisa (Contributor) commented Feb 21, 2016

All in all, I’m all +1 for the general idea, but I’m not so keen on following what clang does. The RFC states the integers are not (can’t be?) supported on certain platforms. If there are specific examples of such platforms, it would be good to know some, because otherwise it seems like LLVM’s emulation covers the ground.

@Amanieu force-pushed the Amanieu:int128 branch from b6aa15a to c3b4a8d on Feb 21, 2016

@Amanieu (Contributor, Author) commented Feb 21, 2016

As I said in my comment above, it seems that LLVM's emulation does not cover all the needed operations.

@Amanieu (Contributor, Author) commented Feb 22, 2016

I've updated the RFC to support 128-bit integers on all architectures, even 32-bit ones. This will require implementing some functions that compiler-rt doesn't provide, but this shouldn't be too much trouble.
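
As a rough illustration of the kind of routine in question (a sketch only, not the RFC's plan or compiler-rt's actual code): a 64×64 to 128-bit widening multiply built from 32-bit partial products, with a `(hi, lo)` pair standing in for the `u128` that doesn't exist yet.

```rust
// Sketch of a compiler-rt-style building block: multiply two u64s and return
// the full 128-bit product as (high, low) 64-bit halves, using only 32-bit
// partial products (the same schoolbook approach as __mulddi3/__multi3).
fn widening_mul_u64(a: u64, b: u64) -> (u64, u64) {
    let (a_lo, a_hi) = (a & 0xFFFF_FFFF, a >> 32);
    let (b_lo, b_hi) = (b & 0xFFFF_FFFF, b >> 32);

    // Four 32x32 -> 64-bit partial products.
    let ll = a_lo * b_lo;
    let lh = a_lo * b_hi;
    let hl = a_hi * b_lo;
    let hh = a_hi * b_hi;

    // Accumulate the middle column, tracking the one possible carry.
    let mid = lh + (ll >> 32); // cannot overflow
    let (mid, carry) = mid.overflowing_add(hl);

    let lo = (ll & 0xFFFF_FFFF) | (mid << 32);
    let hi = hh + (mid >> 32) + ((carry as u64) << 32);
    (hi, lo)
}

fn main() {
    // (2^32 - 1) * (2^32 + 1) = 2^64 - 1: high half 0, low half all ones.
    assert_eq!(widening_mul_u64(0xFFFF_FFFF, 0x1_0000_0001), (0, u64::MAX));
}
```

compiler-rt builds `__multi3` out of exactly this kind of helper (`__mulddi3`), so a Rust port should be mostly mechanical.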

# Drawbacks
[drawbacks]: #drawbacks

One possible complication is that primitive types aren't currently part of the prelude; instead, they are added directly to the global namespace by the compiler. The new `i128` and `u128` types will behave differently and will need to be explicitly imported.

@petrochenkov (Contributor) commented Feb 22, 2016

It's still possible to improve name resolution for primitive types and the prelude backward-compatibly.
I'd define the new i128/u128 in some dedicated module like `core::primitives`; there's a good chance other primitive types will be defined there as well in time.

@ticki (Contributor) commented Feb 22, 2016

Adding primitive types is a backwards compatible thing to do (for 99.999% of all code).

@petrochenkov (Contributor) commented Feb 22, 2016

It'll break all the i128/u128 emulations using the same names, like https://crates.io/crates/extprim, and their dependencies. (Also, using playpen often prevents posting misinformation.)

@ticki (Contributor) commented Feb 22, 2016

u128 and i128 are not keywords, so that one is ruled out. However, defining types that shadow primitive types results in E0317. I think this is justified, though, since crates depending on crates using these emulations can easily drop those dependencies without changing any code (except `use` and `extern crate`).
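
For context, the kind of pre-existing code at stake looks something like this (an illustrative stand-in, not extprim's actual definition):

```rust
// A crate emulating 128-bit integers today may define its own type under the
// very name the new primitive would take (illustrative stand-in only).
#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct u128 {
    pub hi: u64,
    pub lo: u64,
}

fn main() {
    let x = u128 { hi: 0, lo: 42 };
    println!("{:?}", x);
}
```

Whatever the shadowing rules end up being, they have to decide what `u128` means inside such a crate and in its downstream users.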

@yberreby commented Feb 22, 2016

👍

@retep998 (Member) commented Feb 22, 2016

Yes please, I'd love to have 128 bit integers.

@hauleth commented Feb 22, 2016

It could be behind a stable feature gate that would give the user more descriptive output when compiling on an unsupported platform.

@Amanieu (Contributor, Author) commented Feb 22, 2016

> It could be behind a stable feature gate that would give the user more descriptive output when compiling on an unsupported platform.

That shouldn't be needed; this RFC proposes adding 128-bit integers for all architectures. LLVM is capable of doing this if we implement some runtime library functions.

@ubsan (Contributor) commented Feb 22, 2016

Seems good to me. I'd love to have 128-bit integers in Rust (if we're sure we're able to support them on all platforms).

@gnzlbg (Contributor) commented Feb 23, 2016

What are the advantages of this RFC over the emulations we currently have?

@ticki (Contributor) commented Feb 23, 2016

@gnzlbg Performance on supported platforms. It will be an order of magnitude faster in some cases.

@hauleth commented Feb 23, 2016

That x86-64 and some other CPUs already have native support for 128-bit integers. IIRC Intel provides support for 256- and 512-bit integers in their newer processors.


@nagisa (Contributor) commented Feb 23, 2016

> IIRC Intel provides support for 256- and 512-bit integers in their newer processors.

Aren’t you confusing that with AVX/SSE? I’m not aware of any features/instructions, even in the newest Intel processors, that would allow for operations on 128/256/512-bit integers.

@sdroege commented Feb 23, 2016

There are at least `divq`, `idivq`, `mulq`, and `imulq` on all x86-64 for 128-bit integer multiplication/division with 64-bit operands.

@nagisa (Contributor) commented Feb 23, 2016

@sdroege Right, x86 does multiplication and division on double-word-sized operands so that all results and operands fit. This has been true all the way back to the 8086, where the multiplication result was `dx:ax`, i.e. 32 bits.

What you cannot do, though, is take a 128-bit number in `rdx:rax` and add a 128-bit immediate to it, shift `rdx:rax` by 76, or perform any other common operation, in a single instruction. That is what I’d consider native support for 128-bit integers.

@sdroege commented Feb 23, 2016

@nagisa True (well, it's not available on old 32-bit x86, is it?). Fortunately that's all I need for my own use case here: https://github.com/sdroege/rust-muldiv

But as long as LLVM emulates i128/u128 and/or all (other) operations are implemented in Rust for now, this seems like a useful feature to have.
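
As a concrete sketch of that use case (my own minimal version, not the rust-muldiv crate's API): scale a value by a ratio of two `u64`s without overflowing the intermediate product.

```rust
// val * num / denom with a 128-bit intermediate so the multiplication cannot
// overflow. A real version (like rust-muldiv) would also handle denom == 0 and
// results that don't fit back into u64; this sketch just truncates.
fn mul_div(val: u64, num: u64, denom: u64) -> u64 {
    ((val as u128 * num as u128) / denom as u128) as u64
}

fn main() {
    // Rescale a tick count from a 90 kHz clock to nanoseconds.
    assert_eq!(mul_div(90_000, 1_000_000_000, 90_000), 1_000_000_000);
}
```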

@gnzlbg (Contributor) commented Feb 23, 2016

@ticki wrote:

> @gnzlbg Performance on supported platforms. It will be an order of magnitude faster in some cases.

Why can't the current crates call the LLVM intrinsics directly?

@nagisa (Contributor) commented Feb 23, 2016

@gnzlbg Because `add i128* %x, i128 123` is not an intrinsic in the first place.

@gnzlbg (Contributor) commented Feb 23, 2016

Oh, I thought LLVM would offer intrinsics for 128-bit integers in the same way it offers intrinsics for SIMD.

@Amanieu (Contributor, Author) commented Mar 30, 2016

@alexcrichton
I have updated the RFC with your feedback. It should now explain better what the implementation plan for this feature is.

@alexcrichton (Member) commented Mar 31, 2016

It seems that we'll have to implement the library routines in Rust for non-64-bit platforms, right? In that case, is there an existing implementation of these functions that we can use, or are we going to be forging new territory here?

I guess one question I might also have is: why does C not support int128 on 32-bit platforms? Maybe that's a reason we should be wary as well?

@Amanieu (Contributor, Author) commented Mar 31, 2016

> It seems that we'll have to implement the library routines in Rust for non-64-bit platforms, right? In that case, is there an existing implementation of these functions that we can use, or are we going to be forging new territory here?

We can just copy/port the existing implementations in compiler-rt.

> I guess one question I might also have is: why does C not support int128 on 32-bit platforms? Maybe that's a reason we should be wary as well?

I'm not quite sure, but my best guess would be that it was done this way in Clang for compatibility with GCC.

@jmesmon commented Mar 31, 2016

Regarding common C impls not having 128-bit integers when targeting 32-bit machines: it might be a gambit to avoid enlarging `uintmax_t` and `intmax_t` (which would potentially slow down operations on those types). It could also be a compatibility issue around users of those types in the public APIs of libraries (though I'd really hope no one has done that, it certainly is possible).

@Amanieu (Contributor, Author) commented Mar 31, 2016

> Regarding common C impls not having 128-bit integers when targeting 32-bit machines: it might be a gambit to avoid enlarging `uintmax_t` and `intmax_t` (which would potentially slow down operations on those types). It could also be a compatibility issue around users of those types in the public APIs of libraries (though I'd really hope no one has done that, it certainly is possible).

That's not it, since `uintmax_t` is a 64-bit type on all systems (at least for GCC and LLVM). My guess is that GCC just felt that supporting 128-bit integers on 32-bit systems was too much trouble.

@jmesmon commented Mar 31, 2016

@Amanieu Huh, I didn't expect that. Thinking it was too much trouble sounds like a reasonable guess. Perhaps they were worried someone might then want 256-bit integers on 64-bit machines? :)

@Amanieu referenced this pull request May 12, 2016: Generic atomic v2 #1477 (closed)

@aturon added the I-nominated label Jun 2, 2016

@Ericson2314 (Contributor) commented Jun 17, 2016

The linked atomic thread explains this in detail, but this currently blocks Atomic{U,I}128 (which have native support on x86_64, by the way). Now, the atomic intrinsics could be hacked up to avoid depending on this in the first place, but that defeats the cleanliness of a pure-Rust software {u,i}128. So I vote +1 on the RFC.

CC @ezrosent

@aturon (Member) commented Jul 11, 2016

This RFC is now entering its final comment period.

@alexcrichton (Member) commented Jul 11, 2016

I personally feel that the addition of i128 and u128 is inevitable, in the sense that they're relatively well supported in LLVM, platforms have enough native support to motivate the speedup over other implementations, and they're a natural extension of the existing primitive types. In that sense, adding new primitive types to the language seems reasonable to me as an idea (pending the shadowing support in resolve).

Implementation-wise, I would predict that this is going to take a lot of work to do well. Much of this RFC leans on "compiler-rt will solve our problems", but historically compiler-rt has been very, very difficult to work with, especially when it comes to portability. It seems that the correctness of a number of operations (like multiplication) would rely on the implementations in compiler-rt, and I'm not sure how well battle-tested those are in LLVM.

That being said, I don't personally feel that it should necessarily block this RFC per se. I would still prefer to have a relatively high-quality implementation before merging (as opposed to just adding a bunch of "if bits == 128" cases in the compiler), but that's a matter for another time!

@aturon (Member) commented Jul 11, 2016

I'm in agreement with @alexcrichton. To clarify the process here, I think we can accept the RFC but with some specific stipulations about the implementation, which perhaps @alexcrichton could elaborate.

@Amanieu (Contributor, Author) commented Jul 11, 2016

Note that, as currently written, this RFC does not propose modifying the compiler-rt code. Instead, some functions will need to be implemented in Rust, because they are required by LLVM on 32-bit platforms but are not provided by compiler-rt on those platforms.

@rkruppe (Member) commented Jul 11, 2016

However, this meshes well with the plans discussed (though by no means made official, IIUC) in #34400: eventually replacing compiler-rt with pure Rust code.

@alexcrichton (Member) commented Jul 11, 2016

My threshold for "this is feature-complete enough to merge as unstable" would be to get a program like this past our CI (x86/x86_64 on three major platforms, ARM Android on one):

fn main() {
    let a: i128 = 1;
    println!("{}", a);
    println!("{}", a + 2);
    println!("{}", a + a);
    println!("{}", a * a);
    println!("{}", a / 5);
    println!("{}", a / -1);
    println!("{}", a << 127);
    assert_eq!(a as u64, 1);
    assert_eq!(a, 1u32 as i128);
    let a: u128 = 147573952589676412928; // 2^67
    println!("{}", a);
    assert_eq!(a, a);
}

Basically:

  • Literals work. Literals that don't fit into `u64` would be a bonus, but I could see how they may not be required in the first implementation pass.
  • All basic arithmetic operations work and produce the expected results.
  • The types have all the common traits: Clone, Display, Debug, Default, Eq, Ord, etc.
  • Conversions to/from other integral types work.
  • The perf wins you'd expect to see are all visible; that is, `u128` is much faster than `[u64; 2]`.

@nikomatsakis (Contributor) commented Jul 18, 2016

I agree with @alexcrichton's comments; and of course we can also use the stability bar to gate if we feel there are flaws in the implementation, I imagine.

@Amanieu (Contributor, Author) commented Jul 18, 2016

I started working on an implementation of this, but progress has stalled since I'm going to be very busy finishing my thesis over the next few months. In any case, the work didn't seem too complicated: it's just a matter of replacing most uses of `u64` in the compiler with `u128` and providing emulated `u128`/`i128` types for stage0.
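
A very rough sketch of what such a stage0 stand-in could look like (the name, layout, and the single operation shown are illustrative only, not the actual implementation):

```rust
// Emulated unsigned 128-bit value as two u64 halves, for bootstrapping on a
// stage0 compiler that has no native u128 (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct U128 {
    hi: u64,
    lo: u64,
}

impl U128 {
    fn wrapping_add(self, rhs: U128) -> U128 {
        let (lo, carry) = self.lo.overflowing_add(rhs.lo);
        let hi = self.hi.wrapping_add(rhs.hi).wrapping_add(carry as u64);
        U128 { hi, lo }
    }
}

fn main() {
    let one = U128 { hi: 0, lo: 1 };
    let max_lo = U128 { hi: 0, lo: u64::MAX };
    // Carry propagates into the high half: (2^64 - 1) + 1 = 2^64.
    assert_eq!(max_lo.wrapping_add(one), U128 { hi: 1, lo: 0 });
}
```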

@nikomatsakis (Contributor) commented Jul 29, 2016

Huzzah! The @rust-lang/lang team has decided to accept this RFC.

@nikomatsakis (Contributor) commented Jul 29, 2016

Tracking issue: rust-lang/rust#35118

If you'd like to keep following the development of this feature, please subscribe to that issue, thanks! :)
