
proposal: spec: add support for int128 and uint128 #9455

Open
runner-mei opened this issue Dec 27, 2014 · 112 comments

Comments

@runner-mei

@runner-mei runner-mei commented Dec 27, 2014

No description provided.

@mikioh mikioh changed the title can add int128 and uint128 support spec: add support for int128 and uint128 Dec 27, 2014
@ianlancetaylor
Contributor

@ianlancetaylor ianlancetaylor commented Dec 27, 2014

Can you provide a real-life use case?

@twoleds

@twoleds twoleds commented Feb 12, 2015

It's good for UUIDs, IPv6, hashing (MD5), etc. We could store an IPv6 address in a uint128 instead of a byte slice and do arithmetic with subnetworks, such as checking whether an IP address falls within a range.

@minux
Member

@minux minux commented Feb 12, 2015

These use cases are not strong enough to justify adding 128-bit types, which would be a big task: they would have to be emulated on all targets.

  1. MD5 is no longer secure, so there is little benefit in adding types to
    store its result.
  2. How often do you need to manipulate a UUID as a number rather than as a
    byte slice (or a string)?
  3. The other use cases can be handled with math/big just as easily.

Also note that GCC doesn't support __int128 on 32-bit targets, and Go does want consistent language features across all supported architectures.

@twoleds

@twoleds twoleds commented Feb 13, 2015

I agree with you that there aren't many benefits to int128/uint128. Perhaps slightly better performance for comparing and hashing in maps when using uint128 to store a UUID or IPv6 address, since byte slices and strings require loops and extra memory, but I don't think that's important.

@runner-mei
Author

@runner-mei runner-mei commented Jan 18, 2017

My use case: I total the traffic (flux) statistics of all interfaces of a device over one day.

@rsc rsc changed the title spec: add support for int128 and uint128 proposal: spec: add support for int128 and uint128 Jun 20, 2017
@rsc rsc added the Go2 label Jun 20, 2017
@the80srobot

@the80srobot the80srobot commented Jul 12, 2017

In addition to crypto, UUID and IPv6, int128 would be enormously helpful for volatile memory analysis, by giving you a safe uintptr diff type.

@iMartyn

@iMartyn iMartyn commented Oct 2, 2017

It also just makes code that much more readable if you have to deal with large IDs, e.g. those you get back from the Google directory API, among others (effectively they're UUIDs encoded as uint128).
Obviously you can use math/big, but it makes the code much harder to reason about, because you have to mentally parse the mechanics first, which distracts from reading the code.

@ericlagergren
Contributor

@ericlagergren ericlagergren commented Dec 22, 2017

Adding a data point: ran into a situation with a current project where I need to compute (x * y) % m where x*y can possibly overflow and require a 128-bit integer. Doing the modulus by hand for the high and low halves is needlessly complicated.
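This can now be sketched with math/bits (which landed in Go 1.12, after this comment was written); `mulMod` below is an illustrative helper, not a stdlib function:

```go
package main

import (
	"fmt"
	"math/bits"
)

// mulMod computes (x * y) % m without a 128-bit type.
// bits.Div64 requires its high word to be < m, which holds here
// because the reduced operands are < m, so their product is < m^2.
func mulMod(x, y, m uint64) uint64 {
	hi, lo := bits.Mul64(x%m, y%m) // full 128-bit product in two halves
	_, rem := bits.Div64(hi, lo, m)
	return rem
}

func main() {
	fmt.Println(mulMod(1<<63+5, 1<<62+7, 1_000_000_007))
}
```

With a built-in uint128 this would simply be `uint128(x) * uint128(y) % uint128(m)`.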

@jfesler

@jfesler jfesler commented Jan 6, 2018

Another +1 for both IPv6 and UUID cases.

@ianlancetaylor
Contributor

@ianlancetaylor ianlancetaylor commented Jan 9, 2018

The examples of UUID and IPv6 are not convincing to me. Those types can be done as a struct just as easily.

It's not clear that this is worth doing if processors do not have hardware support for the type; are there processors with 128-bit integer multiply and divide instructions?

See also #19623.

@ericlagergren
Contributor

@ericlagergren ericlagergren commented Jan 10, 2018

@ianlancetaylor I do not think so. GCC seems to use the obvious 6 instructions for mul, 4 for add and sub, and a more involved routine for quo. I'm not sure how anybody could emulate mul, add, or sub that precisely (in Go) without assembly, but that prohibits inlining and adds function call overhead.

@ianlancetaylor
Contributor

@ianlancetaylor ianlancetaylor commented Jan 10, 2018

The fact that the current tools can't yet inline asm code is not in itself an argument for changing the language. We would additionally need to see a significant need for efficient int128 arithmetic.

If there were hardware support, that in itself would suggest a need, since presumably the processor manufacturers would only add such instructions if people wanted them.

@ericlagergren
Contributor

@ericlagergren ericlagergren commented Jan 10, 2018

If there were hardware support, that in itself would suggest a need

A need that—presumably—compilers couldn't meet by adding their own 128-bit types, which they have. I mean, for all but division it's a couple extra instructions. For most cases that's been sufficient.

I confess I'm not an expert on CPU characteristics, but my understanding is that much of the driving force behind adding larger sizes was the ability to address more memory. That makes me think general 128-bit support is rather unlikely.

Yet major compilers have added support (GCC, Clang, ICC, ...) for C and C++. Rust has them because of LLVM. Julia has them as well.

Other languages and compilers having support isn't sufficient reason to make a language change, sure. But it's evidence there exists a need other than simply UUIDs.

Their domain seems to lie in cryptography and arbitrary-precision calculations, for now.

@FlorianUekermann
Contributor

@FlorianUekermann FlorianUekermann commented Jan 11, 2018

Additional use cases are timestamps, cryptographic nonces, and database keys.

Examples like database keys, nonces, and UUIDs represent a pretty large collection of applications where keys/handles can't ever be reused or number ranges can't overlap.

@ianlancetaylor
Contributor

@ianlancetaylor ianlancetaylor commented Jan 11, 2018

@FlorianUekermann People keep saying UUID, but I see no reason that a UUID could not be implemented using a struct. It's not like people use arithmetic on a UUID once it has been created. The only reason to add int128 to the language is if people are going to use arithmetic on values of that type.

@FlorianUekermann
Contributor

@FlorianUekermann FlorianUekermann commented Jan 11, 2018

It's not like people use arithmetic on a UUID once it has been created

They do. UUIDs don't have to be random. Sequential UUIDs are common in databases for example. Combine sequential UUIDs with some range partitioning and you'll wish for integer ops in practice.

Still, timestamps seem like the most obvious example to me: 64 bits are not sufficient, and the full range of arithmetic operations is obviously meaningful. Had the type been available, I would expect the time package to contain some examples.

How big of an undertaking is the implementation of div? The rest seems rather straightforward.

@ericlagergren
Contributor

@ericlagergren ericlagergren commented Jan 11, 2018

How big of an undertaking is the implementation of div?

The code for naïve 128-bit division exists in the stdlib already (math/big). The PowerPC Compiler Writer’s Guide has a 32-bit implementation of 64-bit division (https://cr.yp.to/2005-590/powerpc-cwg.pdf, page 82) that can be translated upwards.
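For a sense of scale, the schoolbook bit-at-a-time remainder that such routines refine fits in a few lines; this is an illustrative sketch, not production code:

```go
package main

import "fmt"

// rem128by64 returns (hi*2^64 + lo) % d by restoring long division,
// one bit per iteration. The carry bit handles divisors above 2^63.
// Real implementations use hardware division or refined routines.
func rem128by64(hi, lo, d uint64) uint64 {
	if d == 0 {
		panic("division by zero")
	}
	var r uint64
	for i := 0; i < 128; i++ {
		carry := r >> 63  // bit shifted out of the 64-bit remainder
		r = r<<1 | hi>>63 // shift the next dividend bit into r
		hi = hi<<1 | lo>>63
		lo <<= 1
		if carry != 0 || r >= d {
			r -= d
		}
	}
	return r
}

func main() {
	// 2^64 mod 10 == 6, i.e. rem128by64(1, 0, 10).
	fmt.Println(rem128by64(1, 0, 10))
}
```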

@josharian
Contributor

@josharian josharian commented Jan 11, 2018

Use case: [u]int128 can be used to check for overflow of [u]int64 operations in a natural way. Yes, this could make you want int256, but since int64 is the word size of many machines, this particular overflow matters a lot. See e.g. #21588. Other obvious options to address this use case are math/bits and #19623.

Somewhat related use case: #21835 (comment).

@justmax437

@justmax437 justmax437 commented Jul 29, 2021

128-bit arithmetic would definitely help me in my current project, where I have to compute something like UUID mod N to efficiently distribute events between N instances of handler replicas.

For now the only option I see for performing such a computation without involving cgo or assembly is the math/big package, but big.Int performance is not enough for some cases. So here it is: a use case for arithmetic on what are basically random UUIDs.

@phuclv90

@phuclv90 phuclv90 commented Jul 31, 2021

128-bit arithmetic would definitely help me in my current project, where I have to compute something like UUID mod N to efficiently distribute events between N instances of handler replicas.

@justmax437 in this case it's easy, just hash the low and high parts separately then combine:

h := high64 % N
l := low64 % N
bucket := (h ^ l) % N
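Note that `(h ^ l) % N` is a hash rather than the true remainder. If the exact 128-bit value mod N is wanted, math/bits (added in Go 1.12) can compute it directly; a sketch, with `mod128` as an illustrative name:

```go
package main

import (
	"fmt"
	"math/bits"
)

// mod128 returns (hi*2^64 + lo) % n exactly, unlike the XOR-based
// hash above. Reducing hi first guarantees bits.Div64's precondition
// that its high word be < n.
func mod128(hi, lo, n uint64) uint64 {
	_, rem := bits.Div64(hi%n, lo, n)
	return rem
}

func main() {
	fmt.Println(mod128(1, 0, 10)) // 2^64 % 10 = 6
}
```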

@justmax437

@justmax437 justmax437 commented Jul 31, 2021

128-bit arithmetic would definitely help me in my current project, where I have to compute something like UUID mod N to efficiently distribute events between N instances of handler replicas.

@justmax437 in this case it's easy, just hash the low and high parts separately then combine:

h := high64 % N
l := low64 % N
bucket := (h ^ l) % N

Yeah, I know, but I still would like to see some native way to do this, either through SSE/AVX or 128-bit integer support. We already have a complex128 type, which is, well, more complex than plain integers.

@phuclv90

@phuclv90 phuclv90 commented Aug 1, 2021

Yeah, I know, but I still would like to see some native way to do this, either through SSE/AVX or 128-bit integer support. We already have a complex128 type, which is, well, more complex than plain integers.

@justmax437 No, SSE/AVX can't be used for 128-bit integer support, because those instructions are meant for SIMD operations in parallel; there's no fast way to propagate the carry from the low half to the high half.

They're useful for operating on multiple 128-bit integers in parallel, though. But for a single integer, scalar operations are still much faster. complex128 isn't a single integer but two numbers, which makes it much easier to work with, as each number fits in a single register.

@vikmik
Contributor

@vikmik vikmik commented Sep 2, 2021

  1. +1 to the "database" use case. Databases use large integer types (I'll add ClickHouse to the list: https://clickhouse.tech/docs/en/sql-reference/data-types/int-uint/ ), but using them from Go requires a bunch of boilerplate code, and some drivers opt not to support them at all because it's all quite messy.

  2. Hash algorithms often produce a 128-bit or 256-bit output, which is often used as an identifier / DB column. 64 bits does not provide enough collision protection ( https://preshing.com/20110504/hash-collision-probabilities/ ) in a lot of use cases, so custom larger integer types are common. Unfortunately, the language makes it tempting to use 64-bit content hashes just because that removes the need for a lot of boilerplate code, even though it is not good practice.

Indeed, not being able to have a canonical type for hash values stored in DBs attracts a good share of complexity, like:

  • error handling when dealing with []byte / [n]byte <-> [custom library type] conversions. For example, how do you handle nil when the target type is a struct? These concerns tend to leak in many places. It's easy to write code that panics, or conversely to write overly cautious code that handles errors that cannot happen. Similarly, the Go concept of "zero value" does not always translate well, or can be ambiguous.
  • dealing with [16]byte conversions from []byte. DB drivers and libraries may provide one or the other, so we're often left doing awkward invocations of copy() in a compatibility layer.
  • All this is exacerbated when using multiple DBs at the same time (time series, K/V store, relational), or multiple libraries that each use their own custom type. It causes a proliferation of compatibility layers, which can have different failure modes.

  3. Lastly, 64-bit arithmetic can be quite difficult and/or unsafe when underflows/overflows are possible. Using math/big for this feels like driving a rather inefficient tank, and underflow/overflow-safe code can be tricky to write and test. An integer type wider than 64 bits would be welcome simply as a safe, straightforward, fast way to compute 64-bit values with intermediate steps that can overflow.

@robpike
Contributor

@robpike robpike commented Sep 2, 2021

It is not clear to me that the benefit comes close to outweighing the cost, which is substantial when rippled through all the code that must be updated and added to push these types through the library. This is not a small change, this is adding two new basic types to the language, which would touch huge swaths of the compiler, library, and community packages.

Obviously it would be "useful" (people don't propose useless things in earnest), and it might be convenient for some tasks, but not many, and not often. Having survived the transition from 16 to 32, and from 32 to 64, both of which seemed necessary at the time, there is clearly zero urgency for a transition to 128; it would be analogous to having a 64-bit type in the 32-bit era. But to be honest, far less widely used.

Rather than push towards adding another pair of locked-down types, and given the lack of urgency, I suggest instead that we think about other ways to grow the integer types, either by an extension mechanism that lets one define an integer of any size (which I first saw in PL/I; it's nothing new) or the thing I still feel is best: Making int have arbitrary precision, as described in #19623. That approach achieves far more and puts the whole topic to bed.

@JAicewizard

@JAicewizard JAicewizard commented Sep 2, 2021

Although I do like this proposal, and would be in favour, I do not think it would be a good replacement for uint128.

I think waiting until we are closer to real hardware support for 128-bit integers before implementing this would be a better approach.
With Ice Lake only recently supporting 57-bit address ranges, it might take another decade (or two) before we actually need 128 bits for addressing, though hardware support for 128-bit arithmetic could arrive sooner than that.

@ethindp

@ethindp ethindp commented Sep 6, 2021

@robpike This is a pretty nonsensical excuse not to implement 128-bit integers. It doesn't even hold up under any kind of scrutiny. Libraries will use them when they want to. You don't need to update any code at all. Let the authors of the code in question do that. If it requires a breaking change in stdlib, add a new method that does the same thing as the one you need to break and deprecate the old one. People can keep using the old one if they want, and you keep the golang compatibility promise as well.

I don't understand why we're still debating this issue. Do I have to point out that at least GCC implements 128-bit integers, DWORD integers, various floating-point types and even half-precision floating-point types? Or that Rust implements the i128 and u128 types? How about the fact that LLVM allows arbitrary bit integers up to 2^23 bits in size? Or that Ada allows you to limit the ranges of integral types (theoretically with no limit) via subtype declarations?

We're not asking for all of these features to be added. Many would say that adding things like double-word integers or half-precision FP types would be a bit too much for now. We're asking for 128-bit integers to be added and have provided (many) different examples where they're useful (including the fact that using the bigint package is a pain).

Furthermore, this isn't even a "big" language change. It will negatively impact nobody if it's added. If you don't want to use it, don't. Simple as that. This is a change that will only be used when people need it. You might need software emulation to do it, but that's where you take advantage of things like what LLVM provides (or take inspiration from it).

I hate to rant like this, but this discussion has gone back and forth, everyone is dragging their feet, and nothing has even been attempted from what I can tell. All I see from my POV is excuse after excuse about how it's (supposedly) a bad idea, which is funny, because if it were such a bad idea, you wouldn't have that functionality in GCC and LLVM/Rust at minimum. The folks who work on GCC and LLVM aren't idiots. They're incredibly intelligent people. They don't add features like n-sized integers (with a limit that no computer is going to reach within the next century) for no reason. Clearly, if they added it, why not take advantage of it?

I would understand not doing so if the type in question had no use cases. But a number of use cases have been presented on this (very long, very overdrawn) issue that are very legitimate and would greatly ease the implementation of various features including databases and IPv6. There are no excuses for not adding this type at this point. If you want to know how to actually do it, go look at how GCC or LLVM do it.

Like I said, I'm sorry for ranting like this, and I apologize if I went too far in this comment. But this is utterly ridiculous. This does not violate the golang compatibility promise in any way. It will not cause any major problems in the Go ecosystem, because those who want to use it will and those who don't won't. Considering that there are many practical uses for this type, as very clearly demonstrated by a lot of other comments in this issue in support of this proposal, there is absolutely no reason not to accept it and get it added in, perhaps, Go 1.18 or 1.19. You have lots of sources, code, and resources available if you want to know the internals of how to do this properly.

@smasher164
Member

@smasher164 smasher164 commented Sep 7, 2021

@ethindp This is the second time you've both apologized for ranting while continuing to rant anyways. Please keep the discussion civil, without accusing other replies of being "nonsensical" or "ridiculous".

No one is "dragging their feet" on this issue. Other languages and compilers adding a feature doesn't stand on its own as a reason for adding it to Go.

You say that we could just add [u]int128 and "keep everything else the same," but as @josharian replied in #9455 (comment), a concrete proposal for this feature should outline changes to the spec and implications for the standard library (regardless of backwards compatibility).

@robpike
Contributor

@robpike robpike commented Sep 7, 2021

@ethindp Maybe I wasn't clear enough, but your comment doesn't helpfully address what I wrote, which I think was non-nonsensical and non-ridiculous. It can be summed up as: if there is no urgent need for this feature, it can wait while we look for a general approach that avoids adding yet another integer type pair to the language but fixes the problem (whatever it is) in perpetuity.

Yes, it's easy conceptually just to add a couple of new types, although the cost as many have said is significant, but good design means looking for general solutions rather than just adding features. I'm arguing that we should respond to the problem, not the proposed solution.


@golang golang deleted a comment from chai2010 Sep 7, 2021
@seebs
Contributor

@seebs seebs commented Sep 7, 2021

I think that if a change is going to add int128, that change should also either add int256 or make it clear to everyone when and how int256 would be added, if it were to be added later. Like, right now, it's unclear whether we would ever really want int256 (although I'm sure the people who worked on AVX would at least consider advocating for it), but I think that if int128 gets added, the process should also establish what the requirements are for considering new int types, and when/how they would be named, or how the namespace questions would be addressed, and so on.

Personally, I would like parametric integer types to be available and possible. It might be worth looking at what C did, although Go lacks C's namespace problems. On the other hand, I'd love to be able to express the distinction between "I need an integer of at least 32 bits, but whatever's efficient is fine" and "I want exactly 32 bits".

I would sort of like to be able to express things like 24-bit or 48-bit values more naturally, but I also worry about the hidden costs they'd carry on most modern hardware. Similarly, I'd sort of like saturating types, but worry about their hidden costs. And I don't think C's solution of "we describe how they would be specified but you don't have to provide them" is a great fit for Go; one of Go's strengths is that you don't have to worry about the hardware's specific capabilities. On the other hand, "you can definitely have a 24-bit integer type but we can't promise it's fast enough to be usable" is sort of awful. (But if you're on a hypothetical DSP with native 24-bit hardware available, it'd be NICE to be able to use that for values that don't fit in uint16, wouldn't it?)

Long story short, I think Rob's right and that this is a thing important enough to do well. I don't entirely like the state of not having uint128, but I will note that, except for multiplying int64 values, I've never wanted int128, only uint128.

@Bjohnson131

@Bjohnson131 Bjohnson131 commented Sep 8, 2021

I'd just like to add that it seems like most arguments against int128/uint128 are just stalling the inevitable; these types are becoming more commonly used every day. There is a lot of work involved in this change, and it's certainly not trivial for x86 systems, but when I hear that the work is not worth the payoff, I suspect time is not on that argument's side.

@chiro-hiro

@chiro-hiro chiro-hiro commented Sep 8, 2021

Some zk-STARK constructions use 128-bit proofs. The implementation in Rust looks pretty good with u128.

@Shaptic

@Shaptic Shaptic commented Sep 27, 2021

Adding another real-world use case to the list: we use 128-bit integers to hold intermediate calculations about trade outcomes among 64-bit values. https://github.com/stellar/go/blob/f6989352abafe4b7a93d6e35c2eafaf040305091/exp/orderbook/pools.go#L106-L118

This code is a hot path in a tight loop that involves exploring massive orderbook graphs, so it needs to be as performant as possible. As of today, it uses math/big.Int, but obviously having native language support would result in code that is more performant, more readable, and less error-prone.
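The pattern in that code, a 64x64-bit product divided back down to 64 bits, can be done today with math/bits when the quotient fits; `mulDiv64` below is an illustrative sketch, not taken from the linked repository:

```go
package main

import (
	"fmt"
	"math/bits"
)

// mulDiv64 returns x*y/z using a full 128-bit intermediate product.
// Note bits.Div64 panics unless the quotient fits in 64 bits
// (i.e. unless hi < z), so callers must ensure that precondition.
func mulDiv64(x, y, z uint64) uint64 {
	hi, lo := bits.Mul64(x, y)
	q, _ := bits.Div64(hi, lo, z)
	return q
}

func main() {
	fmt.Println(mulDiv64(1<<62, 10, 20)) // (2^62 * 10) / 20 = 2^61
}
```

A native uint128 would let the intermediate product be written directly and leave the overflow handling to the compiler.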

@CAFxX
Contributor

@CAFxX CAFxX commented Oct 3, 2021

I'm still on the fence on this issue, but just for the sake of discussion another use case for uint128 would be providing hash.Hash128 to mirror the existing hash.Hash32 and hash.Hash64. Even without changing their signatures hash/fnv.New128 and hash/fnv.New128a would be obvious candidates to make use of the new type.

(In theory, Hash128 could be defined even today by returning a pair of uint64. While this was not discussed at the time I suspect the reason why this was not done had to do with the uncertainty surrounding whether we would eventually add 128bit types to the language.)


@ernado
Contributor

@ernado ernado commented Nov 17, 2021

Lack of a built-in uint128 leads to [2]uint64, which can be sub-optimal.

I was rewriting CityHash from [2]uint64 to struct{ Low, High uint64 } recently.
This was a very counter-intuitive optimization.

name            old time/op    new time/op    delta
CityHash64-32      336ns ± 0%     145ns ± 2%   -56.92%  (p=0.008 n=5+5)
CityHash128-32     353ns ± 0%     149ns ± 2%   -57.81%  (p=0.008 n=5+5)

name            old speed      new speed      delta
CityHash64-32   3.04GB/s ± 0%  7.07GB/s ± 2%  +132.16%  (p=0.008 n=5+5)
CityHash128-32  2.90GB/s ± 0%  6.87GB/s ± 2%  +137.06%  (p=0.008 n=5+5)
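The struct emulation mentioned above typically looks like the sketch below; bits.Add64 keeps the carry propagation branch-free. The `U128` type and its `Add` method are illustrative, not from the CityHash port:

```go
package main

import (
	"fmt"
	"math/bits"
)

// U128 mirrors the struct{ Low, High uint64 } layout above; Add
// shows the carry propagation a built-in type would hide.
type U128 struct{ Low, High uint64 }

func (a U128) Add(b U128) U128 {
	lo, carry := bits.Add64(a.Low, b.Low, 0)
	hi, _ := bits.Add64(a.High, b.High, carry)
	return U128{Low: lo, High: hi}
}

func main() {
	x := U128{Low: ^uint64(0)}       // 2^64 - 1
	fmt.Println(x.Add(U128{Low: 1})) // carry ripples into High
}
```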

@Al2Klimov

@Al2Klimov Al2Klimov commented Dec 8, 2021

Whether or not you add this, consider float128 as well, for more or less the same reasons and for symmetry.

@JAicewizard

@JAicewizard JAicewizard commented Dec 11, 2021

I don't see a reason for float128: almost all the reasons listed don't apply to floats. It's also off-topic; maybe a new issue should be opened?

@Bjohnson131

@Bjohnson131 Bjohnson131 commented Dec 15, 2021

I don't see a reason for float128: almost all the reasons listed don't apply to floats. It's also off-topic; maybe a new issue should be opened?

There are plenty of reasons for float128. And you're right: they're not listed here and are off-topic.

If an issue is created, feel free to link it here to redirect users.
