Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

proposal: spec: extend comma-ok expressions to + - * / arithmetic #6815

Open
griesemer opened this Issue Nov 21, 2013 · 30 comments

Comments

Projects
None yet
Contributor

griesemer commented Nov 21, 2013

This is a proposal for a (backward-compatible) language extension.

The notion of comma-ok expressions can be extended in a natural way to the 4 basic
arithmetic operations (+, -, *, /) such that a 2nd result value (traditionally the
"ok" value) provides the carry, borrow, overflow, and remainder value,
respectively.

Specifically (spec wording):

-----

A binary expression with one the four basic arithmetic operators may be used in an
assignment of initialization of the special form:

   z, c = x op y
   z, c := x op y
   var z, c = x op y
   var z, c T = x op y

The operator op must be one of +, -, *, or /. The value and type of z is the same as in
the single-result form. In the special form, the type of c is the same as the type of z,
and the value of c is defined as follows:

For +, -, and /, the value of c is (x op y) >> s with the operation carried out in
twice the precision of the types of the operands, and with s = operand type size in
bits. For /, the value of c is the remainder x%y.

In other words:

   for +, c is the "carry" value (0 or 1)
   for -, c is the "borrow" value (0 or -1)
   for *, c is the "overflow" value 
   for /, c is the remainder

-----

The implementation is straight-forward since these c values tend to be computed anyway:
For +, and -, the value of c corresponds to the "carry/borrow bit" usually
computed always for these operations. The carry bit simply needs to be converted and
stored (between 1 to 3 machine instructions, depending on architecture, storage
location). Division operations usually leave the remainder in a register. Only
multiplication requires double-width arithmetic, and only if c is required.

Applications:

- Code that needs to check for integer overflow can be simplified. For unsigned ints,
the c value is the desired bit.
- Numeric conversions (from integer to decimal) require both / and % on the same
operands. Providing q, r := x/y is likely to run twice as fast w/o the need for the
compiler to detect that a % operation following a / is using the same operands.
- Some of the core routines required for arbitrary precision arithmetic can be written
in a straight-forward manner in Go and should achieve similar performance as
corresponding assembly code, without the need for fancy compiler optimizations.

Concrete example: If x, y, z represent fixed-size high precision unsigned integers (n*64
bits), + could be written as follows:

   const n = 100
   var x, y, z [n]uint64

   // compute z = x + y
   var c uint64
   for i, x := range x {
            z[i], c = x + y[i] + c
   }

Without the special z, c = x + y form, analogous code written in c will be prohibitively
expensive compared to the respective assembly.
Contributor

griesemer commented Nov 21, 2013

Comment 1:

Amendments:
1) This extension should only apply for integer operands.
2) If general extension (to all 4 operations) is considered too extensive, one might
consider simply permitting the special form for / .
Contributor

rsc commented Nov 27, 2013

Comment 2:

Labels changed: added go1.3maybe.

Contributor

rsc commented Dec 4, 2013

Comment 3:

Labels changed: added release-none, removed go1.3maybe.

Contributor

rsc commented Dec 4, 2013

Comment 4:

Labels changed: added repo-main.

Comment 5 by mdempsky@chromium.org:

I like the idea of an extra value to easily check for overflow conditions, but the
proposed wording seems to have at least a couple issues when extended to signed integer
arithmetic:
If x and y are both int32(-1), then (int64(-1) + int64(-1)) >> 32 evaluates to -1.  This
isn't consistent with the "for +, c is the 'carry' value (0 or 1)" text, as the value is
neither 0 nor 1.
Additionally, if x and y are both math.MaxInt32 (or both math.MinInt32), then (int64(x)
+ int64(y)) >> 32 will evaluate to 0 (or -1, respectively).  In this case, (x + y) will
wrap around, but c won't give any useful indication of that (at least when inspected in
isolation).

@rsc rsc added this to the Unplanned milestone Apr 10, 2015

Contributor

josharian commented Mar 7, 2016

Numeric conversions (from integer to decimal) require both / and % on the same operands. Providing q, r := x/y is likely to run twice as fast w/o the need for the compiler to detect that a % operation following a / is using the same operands.

I know that the suggestion here was to spare the compiler some work. However, I think that it would now be quite easy to write an SSA pass that recognizes code like

q := x / y
r := x % y

and substitutes a specialized op for it, and then add faster arch-specific implementations for that op. Something similar probably holds for carry, borrow, and overflow, if we pick a standard idiomatic way to express them (a la the memclear range expression).

cc the usual numeric/SSA crowd: @dr2chase @brtzsnr @randall77 @tzneal

Contributor

josharian commented Mar 7, 2016

(Picking an idiom to express such calculations also solves @mdempsky's concerns, since it completely specifies the correct behavior.)

Contributor

randall77 commented Mar 8, 2016

In SSA we have so far avoided having ops which return multiple values. I think such a thing is inevitable, and this is one use case for it. Another is ops which generate both a result and a flags value, so we don't have to do things like SUBQ x, y; CMP $0, y; JNE ... . If we had a subtract that generated a result and flags, we could avoid the comparison.

Multiple-value-generating ops require some sort of tuple type and tuple-extraction psuedoops in the SSA form. It isn't hard, but I've avoided it for now because life is simpler without those things.

Member

bcmills commented Mar 8, 2016

@josharian

I know that the suggestion here was to spare the compiler some work. However, I think that it would now be quite easy to write an SSA pass that recognizes code [...] and substitutes a specialized op for it, and then add faster arch-specific implementations for that op.

That would address the point about compiler efficiency, but that doesn't really address the main point of the issue - which is to simplify and improve readability in code that cares about overflow and remainders.

(Adding to the spec is unlikely to spare the compiler much work anyway - and this case is easy enough to optimize using any of a bunch of strategies that don't require extending the language.)

Member

minux commented Mar 8, 2016

Member

bcmills commented Mar 8, 2016

@minux

For example, the proposed
sum, carry := a + b
makes the implicit assumption that the underlying architecture
has the concept of carry bit. What if the architecture doesn't have
carry bit?

Then it compiles to more than one instruction, of course. Same as what you do today if the architecture doesn't have a hardware divide instruction - there's no general requirement that operators in the high-level language be in 1:1 correspondence with machine instructions.

Similarly, the proposed
quo, rem := p / q
assumes that quotient and remainder are available from the same
instruction, but actually most RISC architectures don't offer such
a divide instruction.

No? It only assumes that the programmer is interested in both the quotient and the remainder of the operation. If you can compile those to a single instruction, that's great - but if you can't, you compile it to a set of instructions that compute them independently.

Again, this seems to be missing the point: the main benefit of the proposed syntax is to simplify programs that need carries, remainders, and the like - not to make the compiler's instruction selection easier to implement.

Contributor

griesemer commented Mar 8, 2016

@minux What is the preferred way to implement multi-precision arithmetic (say + for arbitrarily large numbers) on RISC-V (you mentioned the absence of a carry bit). Is a separate sequence of instructions needed to determine the carry explicitly? (What's the relevant section in http://riscv.org/specifications/?)

Member

minux commented Mar 8, 2016

@griesemer griesemer added Go2 and removed Thinking labels Mar 9, 2016

Contributor

griesemer commented Mar 9, 2016

I still believe there is merit to this idea, especially because we already do have the concept of operations returning one or two results, depending on context. No matter the underlying architecture, the code is nicer to write and a compiler will likely be able to generate better code. The ability for a compiler to recognize patterns that amount to the same effect is orthogonal to this feature.

That said, we're not going to make language changes of this magnitude at this point. Marked this for Go2, just so we have a record of this discussion.

CL https://golang.org/cl/25004 mentions this issue.

gopherbot pushed a commit that referenced this issue Jul 18, 2016

[dev.ssa] cmd/compile: use 2-result divide op
We now allow Values to have 2 outputs.  Use that ability for amd64.
This allows x,y := a/b,a%b to use just a single divide instruction.

Update #6815

Change-Id: Id70bcd20188a2dd8445e631a11d11f60991921e4
Reviewed-on: https://go-review.googlesource.com/25004
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Member

dsnet commented Sep 30, 2016

Just noting http://golang.org/cl/29954 where something like this may be useful.

brunokim commented Mar 19, 2017 edited

thread.deposit(Money{Cents: 2})

As an end-user who never considered overflow conditions in my code, I'd indeed expect an 'ok' return instead of the carry. I see some conflicting goals for this feature:

  1. Teach developers about overflow, which is ubiquitous, bug-inducing, and yet largely ignored...
    a. ...except for enthusiasts like me who read blog posts and postmortems, but still never take it into account.
  2. Showcase a powerful Go internal capability: some primitives can have single or multiple return, allowing for extended functionality for those that require it, including something so fundamental as addition. What's more, it's backward compatible!
  3. Ease of use for developers that already consider overflow conditions, and have to make their code not-so-readable to consider all edge cases
  4. Make code that uses carry very performant by using the underlying architecture capabilities directly

The order above matches my perspective on importance; I'd accept a loss in performance in exchange of clarity and consistence. That is, for addition it could be implemented in Go directly, not considering the processor architecture:

func outOfBoundsSum(x, y int32) bool {
	return (x > 0 && y > math.MaxInt32-x) || (x < 0 && y < math.MinInt32-x)
}

(Wouldn't a SufficientlySmartCompiler(TM) take these checks and transform it into very optimized machine code already?)

Also: When this feature is implemented I can already see golint warning on this construct:

// Just use rem := p % q 🙄
_, rem := p / q
Contributor

josharian commented Mar 19, 2017

(Wouldn't a SufficientlySmartCompiler(TM) take these checks and transform it into very optimized machine code already?)

FWIW, there's discussion of adding checked arithmetic library routines in #18616.

Member

bcmills commented Mar 19, 2017 edited

@brunokim

I see some conflicting goals for this feature:

  1. Teach developers about overflow […].

Yes. As I see it, that's the main purpose of extending the operators rather than adding a library: programmers view the core language as "the default thing to use" and libraries as "power-user features", and arguably that encourages the wrong defaults in terms of thinking about (and handling) overflow.

  1. Showcase […] multiple return […].

I think that's a non-goal? The point of adding language features is to improve programs and/or the programming experience. We don't generally add features to Go just to demonstrate that they could fit in.

  1. Ease of use for developers that already consider overflow conditions […].

Yes, although arguably that goal could be satisfied almost as well with a library.

  1. Make code that uses carry very performant by using the underlying architecture capabilities directly

I think that's a non-goal, too: it would be easy enough to add performance optimizations for specific patterns, especially now that the compiler has an SSA backend. The potential performance implications don't seem nearly strong enough to justify adding a language feature.

(Wouldn't a SufficientlySmartCompiler(TM) take these checks and transform it into very optimized machine code already?)

Yes. The point of the proposal is to improve legibility and correctness-by-default ― not performance.

@bcmills, glad we agree on the most important parts.

Point 2. is indeed my personal opinion: it'd be awesome to show off, and perhaps bringing people to Go that care about safety. And I'm certainly not serious about the SufficientlySmartCompiler(TM), though I'm curious what's the actual, generated code today.

I didn't see this brought up yet: I was thinking a bit more about the issue, and now I believe that division should also return an 'ok', to bubble up the result from intermediate operations. We'd also be able to use that for zero-checking!

// Almost free overflow checking for lazy developers
func bucketDivisionOk(a, b, size int) (int, bool) {
	return ((a+b)/2 + size + pad - 1) / (size + pad)
}

// If division is different, code can become ugly...
func bucketDivisionRemainder(a, b, size int) (int, bool) {
        mid, ok1 := a+b
	up, ok2 := mid/2 + size + pad - 1
        down, ok3 := size + pad
        return up / down, ok1 && ok2 && ok3
}

// ...almost like what libraries can already provide today.
func bucket(a, b, size int) (int, bool) {
	mid, ok1 := sumInt(a, b)
	up, ok2 := sumInt(mid/2, size, pad, -1)
	down, ok3 := sumInt(size, pad)
	return up / down, ok1 && ok2 && ok3
}

So my feeling is now against the proposed change, which is to surface the internals of arithmetic operations (carries, remainder), and stick to the 'ok' convention to instead provide an 'out of bounds/invalid argument' error reporting.

There's this proposal for extending the language and the proposal (for a proposal) for a library in #18616.

Of the two, I'd much rather have it be part of the language. I think we definitely need one of the two, and a library is certainly better than nothing.

They both have their pros and cons. (In this I'm assuming that either will perform equally well with compiler intrinsics, and ignoring any particulars up for debate as much as possible).

Language change:

  • pro: makes a point about how important the concept is
  • pro: can apply to all integer types and operations equally
  • pro: terse in the good way
  • pro: natural extension of the "comma-ok" language pattern
  • pro: easily readable and writable, once you know what it means, making it more likely to be used
  • con: easily confusing if you don't know what it means (but this would happen once and have a short learning curve)
  • con: much more work to add than a library: spec, parsers, go/* libs, et al. need to be updated
  • con: the last point could cause weird bugs to surface in downstream code using go/* libs
  • con: programs written with the new syntax would immediately be locked to Go 1.x+, possibly hampering initial adoption.

Library:

  • pro: can be prototyped more easily
  • pro: a pure Go prototype could be backported to those stuck on an earlier Go 1.x
  • pro: only library and compiler (for intrinsics) need to be touched
  • pro: no "what does that mean?" moment: the meaning will be encoded in package/func name
  • con: more verbose than syntax, especially given 4 operations times 10 types†
  • con: given the 40 functions† necessary for full coverage, less likely to cover all integer types/operations, at least initially
  • con: not easy to read the docs in godoc without #18858

All of that said, and while, again, I would prefer a language change, I think a math/checked library seems more reasonable.

† realistically 3 operations (for a maximum of 30 functions) since DivRem isn't symmetric in a math/checked library like it is with the syntax change and the body of that function is trivial, easy to do now safely and tersely, and more amenable to optimization than Add, Sub, and Mul.

Member

bcmills commented Mar 20, 2017

@jimmyfrasche

  • con: programs written with the new syntax would immediately be locked to Go 1.x+, possibly hampering initial adoption.

Yeah, I think that's the biggest downside to the language change. But as @griesemer noted (#6815 (comment)), it's something we can revisit for Go 2.

At this point, I think the best approach is likely a standard library (or /x/ library) for Go 1 with an eye toward built-in operators for Go 2.

  • con: given the 40 functions† necessary for full coverage, less likely to cover all integer types/operations, at least initially

It is indeed unfortunate that the library would need so many functions. Sadly, the fix for that (#15292) is also labeled for Go 2.

Contributor

griesemer commented Mar 20, 2017

I'm going to mark this a proper proposal (for Go 2).

That said, and even though I originally proposed this, I am less and less in favor of this. The original motivation was to write some of the math/big code routines in Go rather than assembly, and get similar performance. Yet, it's probably always possible to squeeze out more performance in assembly for some of these core routines (about a dozen or so), and if we can, we want to because these routines are crucial (e.g crypto applications). Also, it will be much easier to tweak the specific assembly code than the compiler if a better way to write that code emerges. Finally, the places where code like this matters is small.

More philosophically, moving specialized complexity into the language and compiler just doesn't seem justified for these specific routines (it's a bit a different story for built-ins such as append, etc. which are used pervasively).

@griesemer griesemer changed the title from spec: extend comma-ok expressions to 4 basic operations to Proposal: extend comma-ok expressions to 4 basic operations (language change) Mar 20, 2017

@griesemer griesemer added the Proposal label Mar 20, 2017

@griesemer to clarify do you mean that you're marking the proposal for the language change to Go2 (in favor) but that a separate proposal for a math/checked library would be considered?

Contributor

griesemer commented Mar 20, 2017

@jimmyfrasche I'm less and less in favor of this language change no matter Go 1 or Go 2. There's no way this is going in Go 1, so I leave it open for Go 2, to be considered as a proper proposal (others may feel differently). We have accepted the math/bits library and we are still considering extending it with some of these operations, which would be an an alternative approach.

Member

bcmills commented Mar 25, 2017

In an out-of-band discussion, @praetergeek pointed out that the typical use of a carry value from an addition is to combine it with some later (three-operand) addition: x = y + z + carry.

And it's worth noting that you can do around 2^31 additions before the carry bits overflow an int32.

So if we were to pursue this proposal further, perhaps we would want to allow strings of multiple addition or subtraction operations, with the second value representing the sum of the carries rather than a single bit. That would also help cut down on boilerplate when the feature is used: applications that are sensitive to overflow in aggregate but don't care about temporary excursions could simply string together multiple operations and check whether the carry is nonzero.

It isn't obvious to me whether that approach can be applied to expressions that mix addition or subtraction with multiplication or division.

Member

bcmills commented Mar 27, 2017 edited

I think we should also provide "comma, ok" or carry forms for lossy conversions. For example:

x := math.MaxInt64
y, c := int16(x)

would result in y = int16(math.MaxInt16) and c = int64(math.MaxInt64 - math.MaxInt16).

Member

bcmills commented Mar 28, 2017

I've been thinking about this some more. Putting the carry in the second value has a nice mathematical purity to it, but in practice it seems more useful to be able to detect overflow for an expression containing multiple arithmetic operators (including multiple multiplication operators).

So I'm now in favor of using booleans instead (as described in #19624).

@rsc rsc changed the title from Proposal: extend comma-ok expressions to 4 basic operations (language change) to proposal: spec: extend comma-ok expressions to + - * / arithmetic Jun 17, 2017

Contributor

rsc commented Jun 17, 2017

FWIW, given both Robert's Mar 20 comment above and the fact that now we have math/bits for similar compiler trickery, it seems like a language change is overkill for this, since these are only rarely needed. It would be much simpler to add appropriate functions to math/bits than to complicate the core language.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment