
proposal: spec: change all int types to panic on wraparound, overflow #19624

Open
bcmills opened this Issue Mar 20, 2017 · 71 comments

@bcmills (Member) commented Mar 20, 2017

I know this has been discussed before, but I didn't see a specific proposal filed for it yet and I think it's important.

Unexpected integer overflow can lead to serious bugs, including bugs in Go itself. Go's bounds-checking on slices and arrays mitigates some of the harmful effects of overflow, but not all of them. For example, programs that make system calls may pass data structures into the kernel, bypassing Go's usual bounds checks. Programs that marshal data structures to be sent over the wire (such as protocol buffers) may send silently corrupted data instead of returning errors as they ought to. And programs that use unsafe to access addresses with offsets are vulnerable to exactly the same overflow bugs as in C.

In my experience, Go programs and libraries are often written assuming "reasonable inputs" and no overflow. For such programs, it would be clearer for overflow to cause a run-time panic (similar to dividing by zero) rather than silently wrapping around. Even in the case where the unintended overflow is subsequently caught by a slice bounds check, reporting the error at the overflowing operation rather than the slice access would make the source of the bug easier to diagnose.

The potential performance impact of this proposal is similar to bounds-checking in general, and likely lower than using arbitrary-precision ints (#19623). The checks can be omitted when the compiler can prove the result is within bounds, any new branches will be trivially predictable (they'll occupy some CPU resources in the branch-predictor but otherwise add little overhead), and in some cases the checks might be able to use bounds-check instructions or other hardware traps.

For the subset of programs and libraries that intentionally make use of wraparound, we could provide one of several alternatives:

  1. "comma, ok" forms or "comma, carry" forms (#6815) that ignore overflow panics, analogous to how the "comma, ok" form of a type-assertion ignores the panic from a mismatched type.
  2. Separate "integer mod 2ⁿ" types (requiring explicit conversions from ordinary integer types), perhaps named along the lines of int32wrap or int32mod.
  3. Implicit wrapping only for unsigned types (uint32 and friends), since they're used for bit-manipulation code more often than the signed equivalents.

Those alternatives could also be used to optimize out the overflow checks in inner-loop code when the programmer has already validated the inputs by some other means.


[Edit: added this section in response to comments.]

Concretely, the proposed changes to the spec are:

Integer operators

For two integer values x and y, the integer quotient q = x / y and remainder r = x % y satisfy the following relationships:

[…]

As an exception to this rule, if the dividend x is the most negative value for the int type of x, the quotient q = x / -1 is equal to x (and r = 0).

[…]

The shift operators shift the left operand by the shift count specified by the right operand. They implement arithmetic shifts if the left operand is a signed integer and logical shifts if it is an unsigned integer. The result of a logical shift is truncated to the bit width of the type: a logical shift never results in overflow. Shifts behave as if the left operand is shifted n times by 1 for a shift count of n. As a result, x << 1 is the same as x*2 and x >> 1 is the same as x/2 but truncated towards negative infinity.

[…]

Integer overflow

If the result of any arithmetic operator or conversion to an integer type cannot be represented in the type, a run-time panic occurs.

An expression consisting of arithmetic operators and / or conversions between integer types used in an assignment or initialization of the special form

v, ok = expr
v, ok := expr
var v, ok = expr
var v, ok T1 = expr

yields an additional untyped boolean value. The value of ok is true if the results of all arithmetic operators and conversions could be represented in their respective types. Otherwise it is false and the value of v is computed as follows. No run-time panic occurs in this case.

For unsigned integer values, the operations +, -, *, and << are computed modulo 2ⁿ upon overflow, where n is the bit width of the unsigned integer's type. Loosely speaking, these unsigned integer operations discard high bits upon overflow, and programs may rely on "wrap around".

For signed integers, the operations +, -, *, and << are computed using two's complement arithmetic and truncated to the bit width of the signed integer's type upon overflow. No exception is raised as a result of overflow. A compiler may not optimize code under the assumption that overflow does not occur. For instance, it may not assume that x < x + 1 is always true.

If the dividend x of a quotient or remainder operation is the most negative value for the int type of x, evaluation of x / -1 overflows and its result upon overflow is equal to x. In contrast, evaluation of x % -1 does not overflow and yields a result of 0.

[…]

Conversions between numeric types

For the conversion of non-constant numeric values, the following rules apply:

  1. When converting between integer types, if the value is a signed integer, it is sign extended to implicit infinite precision; otherwise it is zero extended. If the value cannot be represented in the destination type, an overflow occurs; see the section on integer overflow. Upon overflow, the result is truncated to fit in the result type's size. For example, if v := uint16(0x10F0), then w, _ := uint32(int8(v)) results in w == 0xFFFFFFF0.

This proposal is obviously not compatible with Go 1, but I think we should seriously consider it for Go 2.

@dr2chase (Contributor) commented Mar 20, 2017

This simplifies bounds check elimination, since guarding-against/proving-impossibility-of overflow in indexing calculations can be tricky.

@randall77 (Contributor) commented Mar 20, 2017

Just a datapoint, the code in https://golang.org/src/runtime/hash64.go takes advantage of overflow in almost every line of source (for the uintptr type).

Maybe we can't do this until Go 2, but we can do experiments to get some data about it now. We could hack panic on overflow into the compiler today. What happens performance-wise with the go1 benchmarks? How many false positives are there? How many true positives are there?

@griesemer (Contributor) commented Mar 20, 2017

From a programmer's point of view, proposal #19623 is a significant simplification (wrap around and overflow disappear, and the new int type is both simpler in semantics and more powerful in use), while this proposal is a significant complication (wrap around and overflow now panic, and the new int type is both more complex in semantics and more difficult to use).

One of Go's ideas is to reduce the intrinsic complexity of the programming language at hand while at the same time provide mechanisms that are general and powerful and thus enable programmer productivity and increased readability. We need to look at language changes not from a compiler writer's point of view, but from a programmer's productivity point of view. I think this proposal would be a step backward in the philosophy of Go.

I am also not convinced about the claim that overflow checking is "likely much lower than using arbitrary-precision ints": The cost of arbitrary precision ints is there when one actually uses them, otherwise their cost is similar to what needs to be done for bounds/overflow checking (it's the same test, essentially). There's a grey area for ints that use all 64 (or 32) bits as they will become "big ints" internally (at least one bit is usually reserved to implement "tagged ints" efficiently) - but in code that straddles this boundary and if it matters one might be better off using a sized intxx type anyway. Finally, there's a GC cost since the garbage collector will need to do extra work. But all that said, dealing with overflow panic will also require extra code, and that has to be written by each programmer by hand. It is also much harder to verify/read that code.

I'm not in favor of this proposal.

@dr2chase (Contributor) commented Mar 20, 2017

I think either proposal is an improvement over the status quo. Programmers who assert the nonexistence of overflow will write the same code they do today, so no cognitive overhead there, and I won't have to worry about silent errors if they're wrong.

@bcmills (Member) commented Mar 20, 2017

@griesemer

proposal #19623 is a significant simplification […], while this proposal is a significant complication

I believe that the two proposals are compatible (and even complementary). We could make int and uint arbitrary-precision types (to make default behavior simpler), and also make the sized integer types panic on overflow (to make complex behavior safer).

But all that said, dealing with overflow panic will also require extra code, and that has to be written by each programmer by hand. It is also much harder to verify/read that code.

Could you elaborate on this point? I would expect that most code would either not handle the panic, or use a wrapping integer type explicitly. Even the latter option does not seem like a lot of extra code.

@bcmills (Member) commented Mar 20, 2017

@randall77

Just a datapoint, the code in https://golang.org/src/runtime/hash64.go takes advantage of overflow in almost every line of source (for the uintptr type).

That is a nice data point, and I think it nicely illustrates the three options I propose for intentional wraparound. Consider this snippet:

	h := uint64(seed + s*hashkey[0])
tail:
	switch {
	case s == 0:
	case s < 4:
		h ^= uint64(*(*byte)(p))
		h ^= uint64(*(*byte)(add(p, s>>1))) << 8
		h ^= uint64(*(*byte)(add(p, s-1))) << 16
		h = rotl_31(h*m1) * m2
	case s <= 8:
		h ^= uint64(readUnaligned32(p))
		h ^= uint64(readUnaligned32(add(p, s-4))) << 32
		h = rotl_31(h*m1) * m2
  1. With the "comma, ok" option it becomes unwieldy: there are many lines which combine the ^ with shifting or multiplication, and it is the latter which may overflow (so using the "comma, ok" form requires splitting lines):
    Edit: See #19624 (comment) below. If we apply the _, ok to the entire expression tree rather than just the immediate expression, it's not unwieldy at all.
	h := uint64(seed + s*hashkey[0])
tail:
	switch {
	case s == 0:
	case s < 4:
		h ^= uint64(*(*byte)(p))
		x, _ := uint64(*(*byte)(add(p, s>>1))) << 8
		h ^= x
		x, _ = uint64(*(*byte)(add(p, s-1))) << 16
		h ^= x
		x, _ = h * m1
		h, _ = rotl_31(x) * m2
	case s <= 8:
		h ^= uint64(readUnaligned32(p))
		x, _ := uint64(readUnaligned32(add(p, s-4))) << 32
		h ^= x
		x, _ = h * m1
		h = rotl_31(x) * m2
  2. With explicitly-wrapping types, only the conversions (which are mostly already present in the code) need to change:
	h := uint64mod(uintptrmod(seed) + uintptrmod(s)*hashkey[0])
tail:
	switch {
	case s == 0:
	case s < 4:
		h ^= uint64mod(*(*byte)(p))
		h ^= uint64mod(*(*byte)(add(p, s>>1))) << 8
		h ^= uint64mod(*(*byte)(add(p, s-1))) << 16
		h = rotl_31(h*m1) * m2
	case s <= 8:
		h ^= uint64mod(readUnaligned32(p))
		h ^= uint64mod(readUnaligned32(add(p, s-4))) << 32
		h = rotl_31(h*m1) * m2
  3. With implicit wrapping only for unsigned types, that function wouldn't change at all, although I think that also demonstrates that the safety advantages of detecting overflow diminish with that approach (since uintptr is a fairly common type to see in code using unsafe).

@griesemer (Contributor) commented Mar 20, 2017

@bcmills The point of the sized integer types is a) to be able to control actual space consumed when laid out in memory, and b) often enough that they do wrap around. There's tons of code that makes use of that. Almost any left-shift operation would become more complicated if there wasn't silent overflow. Thus, a lot of code would have to deal with overflow.

@bcmills (Member) commented Mar 20, 2017

The point of the sized integer types is a) to be able to control actual space consumed when laid out in memory, and b) often enough that they do wrap around.

That's part of my point? At the moment, the sized integer types conflate together (a) and (b). I'm proposing that we make them orthogonal, not eliminate (b).

Almost any left-shift operation would become more complicated if there wasn't silent overflow.

Perhaps that's a good argument for making the left-shift operator not panic? (None of the other bitwise operators can overflow, and that would make the shift operator somewhat less redundant with multiplication.)

@bcmills bcmills changed the title proposal: Go 2: integer overflow should panic by default proposal: Go 2: fixed-width integer overflow should panic by default Mar 20, 2017

@griesemer (Contributor) commented Mar 20, 2017

@bcmills Perhaps. My point is that this all adds extra complexity to the language where I am not convinced that we need more. We need less. Most people couldn't care less about overflow and simply want integers that "just work".

@bcmills (Member) commented Mar 20, 2017

Most people couldn't care less about overflow and simply want integers that "just work".

That's also part of my point? The fact that most people couldn't care less about overflow is what leads to the bugs in the first place. Most people couldn't care less about bounds-checking either, but Go has bounds checks nonetheless.

I agree that 'integers that "just work"' is a desirable goal, and ideally I would like to see this proposal combined with an arbitrary-precision int type. However, it's not obvious to me that that will be sufficient to make a dent in the incidence of overflow bugs in practice.

@gopherbot gopherbot added this to the Proposal milestone Mar 20, 2017

@gopherbot gopherbot added the Proposal label Mar 20, 2017

@bronze1man commented Mar 21, 2017

Is it possible to add this type into Go 1.9?
Just adding a type called uint64OverflowPanic may be enough to start experimenting with this.
I think we may need both an overflow-panicking uint64 and a non-panicking uint64 in Go 2.

@bcmills (Member) commented Mar 21, 2017

@bronze1man
A type called uint64OverflowPanic would be counterproductive. Users who are spelling out verbose type-names are presumably already thinking about overflow, and at that level of verbosity they can just as easily make calls to some library to check the operations.

The point of this proposal is to make detection of overflow the default behavior. That's why it's a language proposal and not just a library.

@bcmills (Member) commented Mar 21, 2017

Regarding cost: overflow-checking is normally one instruction (a conditional branch based on the ALU's overflow flag), and for addition and subtraction my understanding is that modern Intel hardware will fuse the arithmetic instruction and the branch into one µop to be executed on the branch unit.

I don't see how we could implement arbitrary-precision integers with any fewer than two additional instructions per op. If we encode tags in the sign bit then every operation needs a shift in, shift out, and mask. If we encode tags in the least-significant bit then every operation needs at least a mask and a branch, and it's not obvious to me that the branch can be fused.

@griesemer (Contributor) commented Mar 21, 2017

@bcmills Regarding arbitrary-precision integers: It's probably 2-3 additional instructions in the general case. But there's usually no masking needed in the common case if the tag bits are at the bottom (see #19623 (comment)).

@bcmills (Member) commented Mar 21, 2017

How do you test the tag bits without masking them?

I would expect panic-on-overflow to look like:

    add %rax, %rdx
    jo $overflowPanic

with a single overflowPanic defined for the entire program.

Or perhaps, if we need to save the faulting instruction more precisely, something like:

    add %rax, %rdx
    cmovo $0, $overflowPanic

and let the SIGSEGV handler actually produce the panic (the same way we do for dereferencing nil).

If I'm understanding correctly, an arbitrary-precision operation would look like:

    add %rax, %rdx
    test %rax, $0x3
    jnz $addSlow

and it's not at all obvious to me that we could get by with a small number of variants of addSlow without either overconstraining the register allocator or adding even more instructions (perhaps a cmovnz, consuming an additional register?) to tell addSlow which registers (and widths) are involved.

@randall77 (Contributor) commented Mar 21, 2017

@bcmills: on x86 at least, we can use the parity (low bit of op result) condition code.
We only need a single bit of tag. Integers have a low bit of 1, pointers 0. z = x+y translates to:

    SUB   x, $1, a  // remove x's tag bit
    JP    addSlow // x is a pointer
    ADD   a, y, z
    JNP   addSlow // y is a pointer
    JO    addSlow // z overflowed

(Those are 3-operand ADD and SUB, we'd need 2 moves also to do this with 2-operand instructions, I think.)

How to implement addSlow will be a problem. Because we'd have to spill all live registers around any runtime call, we'd need essentially infinite variations. We'd have to generate them as needed and it would probably be a lot of code. We could use faulting instructions, but those are slow and we'd still need a stack + register map for each.

@randall77 (Contributor) commented Mar 21, 2017

Just a dumb experiment - I hacked the compiler to add the following sequence after every int and uint addition:

   TESTQ x, x
   JLT 3(PC)
   TESTQ x, x
   JLT 1(PC)

It doesn't do anything, just adds some cruft to simulate overflow checks.

It makes the go binary 1% larger. That is way undercounting what it would actually cost, as it doesn't include code needed for addSlow.
Here's the go1 benchmarks results:

name                     old time/op    new time/op    delta
BinaryTree17-8              2.36s ± 3%     2.40s ± 3%     ~     (p=0.095 n=5+5)
Fannkuch11-8                2.96s ± 0%     3.57s ± 0%  +20.74%  (p=0.008 n=5+5)
FmtFprintfEmpty-8          43.7ns ± 2%    44.2ns ± 1%     ~     (p=0.119 n=5+5)
FmtFprintfString-8         68.0ns ± 0%    68.8ns ± 0%   +1.06%  (p=0.008 n=5+5)
FmtFprintfInt-8            75.7ns ± 0%    79.6ns ± 0%   +5.15%  (p=0.008 n=5+5)
FmtFprintfIntInt-8          118ns ± 1%     122ns ± 0%   +3.38%  (p=0.008 n=5+5)
FmtFprintfPrefixedInt-8     159ns ± 0%     195ns ± 1%  +22.77%  (p=0.016 n=4+5)
FmtFprintfFloat-8           206ns ± 1%     226ns ± 1%   +9.30%  (p=0.008 n=5+5)
FmtManyArgs-8               469ns ± 1%     505ns ± 1%   +7.54%  (p=0.008 n=5+5)
GobDecode-8                6.53ms ± 1%    6.53ms ± 1%     ~     (p=1.000 n=5+5)
GobEncode-8                5.05ms ± 1%    5.07ms ± 0%     ~     (p=0.690 n=5+5)
Gzip-8                      213ms ± 1%     259ms ± 0%  +21.60%  (p=0.008 n=5+5)
Gunzip-8                   37.3ms ± 2%    38.3ms ± 2%   +2.57%  (p=0.032 n=5+5)
HTTPClientServer-8         84.1µs ± 0%    85.3µs ± 3%   +1.44%  (p=0.016 n=4+5)
JSONEncode-8               14.5ms ± 1%    15.6ms ± 0%   +8.12%  (p=0.008 n=5+5)
JSONDecode-8               52.0ms ± 1%    56.7ms ± 0%   +9.11%  (p=0.008 n=5+5)
Mandelbrot200-8            3.81ms ± 1%    3.71ms ± 0%   -2.72%  (p=0.008 n=5+5)
GoParse-8                  2.93ms ± 1%    2.97ms ± 0%   +1.50%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-8      69.9ns ± 2%    70.3ns ± 1%     ~     (p=0.460 n=5+5)
RegexpMatchEasy0_1K-8       223ns ± 1%     229ns ± 1%   +2.69%  (p=0.008 n=5+5)
RegexpMatchEasy1_32-8      66.3ns ± 1%    67.3ns ± 1%   +1.60%  (p=0.008 n=5+5)
RegexpMatchEasy1_1K-8       352ns ± 1%     360ns ± 1%   +2.04%  (p=0.008 n=5+5)
RegexpMatchMedium_32-8      104ns ± 1%     105ns ± 0%     ~     (p=0.167 n=5+5)
RegexpMatchMedium_1K-8     33.6µs ± 1%    34.8µs ± 1%   +3.52%  (p=0.008 n=5+5)
RegexpMatchHard_32-8       1.77µs ± 5%    1.90µs ± 4%   +7.41%  (p=0.032 n=5+5)
RegexpMatchHard_1K-8       54.3µs ± 5%    56.6µs ± 4%     ~     (p=0.310 n=5+5)
Revcomp-8                   433ms ± 1%     595ms ± 3%  +37.52%  (p=0.008 n=5+5)
Template-8                 64.9ms ± 1%    64.4ms ± 2%     ~     (p=0.222 n=5+5)
TimeParse-8                 305ns ± 0%     332ns ± 0%   +8.93%  (p=0.008 n=5+5)
TimeFormat-8                320ns ± 0%     347ns ± 1%   +8.57%  (p=0.008 n=5+5)

A few are hurt quite a bit, but a surprising number don't care so much.

@bcmills (Member) commented Mar 21, 2017

That still seems like a lot of extra instructions compared to what is needed for panic-on-overflow.

At any rate, my performance point on this issue is more "the cost of overflow checks is fairly low", with arbitrary-length integers as a reference point for an integer cost that a lot of folks believe to be reasonable.

@griesemer (Contributor) commented Mar 21, 2017

@randall77 The problem with using only one bit is that you cannot optimistically add two tagged ints. The problem with using a 1 (instead of a 0) as tag bit is that one has to correct for it each time. Having a 1-offset pointer is trivial to correct when accessing through pointer-indirection. Again, using the scheme I have outlined before, addition is (dst on the right):

    ADD x, y, z
    JO overflow
    TEST $3, z
    JNZ bigint

If both x and y are tagged ints, they have a 00 tag (least significant 2 bits). The result is already correct. If one or both of them have a 01 tag, the result tags are going to be 01 or 10 - either way it's not 00 after masking. In that case we need to run the slow routine. This is 4 instructions per addition in the best case.

@randall77 (Contributor) commented Mar 21, 2017

@griesemer , yes, I guess you're trading a bit in the representation for one less instruction.
Also to your benefit, your scheme doesn't use parity. It exists on x86 but not on ARM, for example.

@cherrymui (Contributor) commented Mar 21, 2017

With panic-on-overflow, the compiler cannot even fold (x + 1) - 1?

@bronze1man commented Mar 21, 2017

@randall77
Does it mean that I have to buy 8.12% more CPUs for a useless overflow check in JSONEncode?

I assume that JSONEncode does not have an integer-overflow bug right now, and JSONEncode/JSONDecode uses 60% of my server's CPUs.😁

I hope Go can do better than that.

JSONEncode-8 14.5ms ± 1% 15.6ms ± 0% +8.12% (p=0.008 n=5+5)

@ianlancetaylor (Contributor) commented Mar 21, 2017

Even without panic on overflow we can't fold x < x + 1, which is important because it means we can't determine the number of iterations for loops like for i := j; i < j + 10; i++, which means we can't unroll the loop without run time checks. I don't think the lack of folding opportunities is going to be significant, at least not compared to the run time overhead of overflow checks.

@ianlancetaylor (Contributor) commented Mar 21, 2017

@bronze1man Do you have reason to think that the cost of JSON encoding is dominated by integer arithmetic? On a modern CPU many algorithms are dominated by the time it takes to load values from memory, and that should not change under this proposal.

@bcmills (Member) commented Jun 22, 2017

if this proposal was extended to cover divide by zero, _, ok could still be used there.

That's an interesting suggestion. For divide-by-zero, what would the other value be when ok is false? For overflows there is an obvious choice on current hardware (two's-complement truncation), but for division that's much less clear to me.

@nathany (Contributor) commented Jun 22, 2017

Maybe just 0 -- the zero value for the type as with type assertions?
https://play.golang.org/p/CBYt8ISMAu

(I'm not thinking of implementation details though)

@bcmills (Member) commented Aug 16, 2017

The proposed , ok extension here would make the fix for #21481 much clearer. Instead of:

if c := cap(b.buf); c > maxInt-c-n {
  panic(ErrTooLarge)
}
newBuf = makeSlice(2*cap(b.buf) + n)

we could write

s, ok := 2*cap(b.buf) + n
if !ok {
  panic(ErrTooLarge)
}
newBuf = makeSlice(s)

@nathany (Contributor) commented Nov 17, 2017

A nice property of the s, ok syntax is that it might not be considered a breaking change in Go 1.x because it's opt-in. But I don't like the idea of tag bits and int63s and such.

From a blog post on Rust:

These should cover all bases of “don’t want overflow to panic in some modes”:
wrapping_* returns the straight two’s complement result,
saturating_* returns the largest/smallest value (as appropriate) of the type when overflow occurs,
overflowing_* returns the two’s complement result along with a boolean indicating if overflow occurred, and
checked_* returns an Option that’s None when overflow occurs.
All of these can be implemented in terms of overflowing_*, but the standard library is trying to make it easy for programmers to do the right thing in the most common cases.
http://huonw.github.io/blog/2016/04/myths-and-legends-about-integer-overflow-in-rust/

The s, ok syntax is effectively the overflowing option, allowing any of the other options to be implemented.

s, ok := 2*cap(b.buf) + n

Though it isn't necessarily the best default option. Panic may be better, just like divide by zero.

If Go were to trap the overflow, does that essentially mean a panic, which could optionally be recovered from? In the case of #21481, would you recover just to panic with bytes.ErrTooLarge instead?

Could an overflow panic be implemented in Go 1.x behind a compiler/build flag (opt-in)? Potentially switching the default behaviour in Go 2.0.

The following may be a pipe dream for Go:

var m int64 = math.MaxInt64

a := m + 1 // panic: runtime error: ... overflows int64
b, ok := m + 1 
if !ok {
    // saturate, panic, return an error, other custom behaviour
}
c, _ := m + 1 // wrapping (current behaviour)

And perhaps similar for divide by zero, more for consistency than anything else:

var z int64 = 0

a := 1 / z // panic: runtime error: integer divide by zero (current behaviour)
b, ok := 1 / z
if !ok {
    // custom behavior
}
c, _ := 1 / z // zero?

(but perhaps not)

While at the same time having arbitrary precision ints to avoid overflows in the common case?

@dr2chase (Contributor) commented Nov 17, 2017

This doesn't work for division by zero, but for addition, subtraction, and multiplication
s, ok := a op b
can assign useful values to s even when ok is false. For subtraction and addition, the result is what you'd get right now. For multiplication, I think also "what you get now", which is (I think) the low-order bits of the overflowing multiplication.

For the add/sub case you can directly compute what the answer should have been; for multiplication, you have the low-x-low part of your ultimate result.

We could do this, without the panics, in Go 1, which both gives people a clear way to write this now, and future-proofs it against panic-on-overflow in Go 2 (because you wrote this where you thought it would happen). It would come into minor conflict with a Go 2 "int just grows as needed" int type (but not the sized types) -- either it always returns "okay" or we statically error out on that case, and we could conceivably remove them automatically (if the "ok" were involved in a complex expression we'd have to be a little more careful).

@nathany (Contributor) commented Nov 18, 2017

I'm wondering about whether this comma, ok syntax should apply to divide by zero or not.

In more complex expressions, would it be better if ok is false for either overflow or divide by zero? Or is it better to only use this construct for overflow, and just panic for division by zero as Go currently does.

var m int64 = math.MaxInt64
var z int64 = 0

b, ok := (m + 1) / z
if !ok {
    // custom behavior
}

It's also not clear to me how feasible it is to support comma, ok on any given mathematical expression (including function calls?) with a standard int64 (no tainted bits and so on to break compatibility with other languages).

@bcmills (Member) commented Nov 18, 2017

I'm wondering about whether this comma, ok syntax should apply to divide by zero or not.

It's not obvious to me either (see the comments starting at https://golang.org/issue/19624#issuecomment-310427675).

It's also not clear to me how feasible it is to support comma, ok on any given mathematical expression

My current draft proposal is that it should only apply to arithmetic expressions, not map lookups (for which , ok would become ambiguous) or function calls (which can have side effects and could thus unintentionally allow an overflowed value to escape).

But I'm not strongly tied to that point: if you have interesting examples or use-cases either way, I'd be happy to see them.

@bcmills (Member) commented Aug 11, 2018

@nathany, here's an interesting analysis of the x/0 = 0 option (for when the overflow is ignored). Seems like it's actually not too bad.
https://www.hillelwayne.com/post/divide-by-zero/

@firelizzard18 commented Dec 30, 2018

I propose that Go adds unchecked and checked blocks, like those of C#: #29472.

For example, the snippet from hash64.go would look like:

h := uint64(seed + s*hashkey[0])
tail:
    switch {
    case s == 0:
    case s < 4:
        unchecked {
            h ^= uint64(*(*byte)(p))
            h ^= uint64(*(*byte)(add(p, s>>1))) << 8
            h ^= uint64(*(*byte)(add(p, s-1))) << 16
            h = rotl_31(h*m1) * m2
        }
    case s <= 8:
        unchecked {
            h ^= uint64(readUnaligned32(p))
            h ^= uint64(readUnaligned32(add(p, s-4))) << 32
            h = rotl_31(h*m1) * m2
        }

@bcmills (Member) commented Jan 8, 2019

@firelizzard18, that's only marginally clearer than the version in #19624 (comment), and costs two keywords and a new block syntax.

@firelizzard18 commented Jan 8, 2019

@bcmills Let's agree to disagree. In this particular case, they're not too different. I find the _, ok syntax to be generally clumsy and awkward, and it reduces expressiveness. In a more complicated scenario, I would be quite frustrated by having to use an assignment for each and every operation.

@bcmills (Member) commented Jan 9, 2019

In a more complicated scenario, I would be quite frustrated by having to use an assignment for each and every operation.

I think you have misunderstood the proposal. You would only need one assignment per “expression consisting of arithmetic operators and / or conversions between integer types”, not one per operation: the check applies to the whole expression tree.

In the current draft, the only place you have to chop up the expression tree is at function-call boundaries, and even that could perhaps be relaxed.
