Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: operator functions #27605

Open
deanveloper opened this issue Sep 10, 2018 · 31 comments

Comments

@deanveloper
Copy link

commented Sep 10, 2018

The only issue that I could find about operator overloading currently #19770, although it's currently closed and doesn't have many details.

Goal

Operator overloading should be used to create datatypes that represent things that already exist in Go. They should not represent anything else, and should ideally have no other function.

New operators should not be allowed to defined, we should only be able to define currently existing operators. If you look at languages that let you define your own operators (Scala, looking at you!) the code becomes extremely messy and hard to read. It almost requires an IDE because operator overloading is very heavily used.

Using an operator for anything other than it's purpose is a bad idea. We should not be making data structures with fancy looking + operators to add things to the structure. Overloading the + operator in Go should only be used for defining addition/concatenation operators for non-builtin types. Also, as per Go spec, binary operators should only be used for operating on two values of the same type.

Operators should only operate on and return a single type. This keeps things consistent with how operators currently work in Go. We shouldn't allow any type1 + string -> type1 stuff.

Operators should only be defined in the same package as the type they are defined on. Same rule as methods. You can't define methods for structs outside your package, and you shouldn't be able to do this with operators either.

And last but not least, operators should never mutate their operands. This should be a contract that should be listed in the Go spec. This makes operator functions predictable, which is how they should be.

Unary operators should not need to be overloaded.

+x                          is 0 + x
-x    negation              is 0 - x
^x    bitwise complement    is m ^ x  with m = "all bits set to 1" for unsigned x
                                      and  m = -1 for signed x

This part of the spec should always remain true, and should also remain true for anything using these operators. Perhaps ^x may need to have it's own, as there's no good way to define "all bits set to 1" for an arbitrary type, although defining a .Invert() function is no less readable IMO.

Unary operations on structs would then therefore be Type{} + t or Type{} - t, and pointers would be nil + t and nil - t. These may have to be special cases in the implementation of operator functions on pointers to types.

Assignment operators should also never be overloaded.

An assignment operation x op= y where op is a binary arithmetic operator is equivalent to x = x op (y) but evaluates x only once. The op= construct is a single token.

This should remain the same just as unary operators.

If we do not permit overloading the ^x unary operator, this means that we only need to define binary operations.

Issues/Projects aided by operator overloading

#19787 - Decimal floating point (IEEE 754-2008)
#26699 - Same proposal, more detail
#19623 - Changing int to be arbitrary precision
#9455 - Adding int128 and uint128
this code - Seriously it's gross
really anything that uses math/big that isn't micro-optimized
If I went searching for longer, there'd probably be a few more that pop up

Syntax

What's a proposal without proposed syntaxes?

// A modular integer.
type Int struct {
	Val int
	Mod int
}

// ==============
// each of the functions would have the following function body:
//
//    	if a == Int{} { // handle unary +
//		a.Mod = b.Mod
//	}
//
//	checkMod(a, b)
//	nextInt = Int{Val: a.Val + b.Val, Mod: a.Mod}
//	nextInt.Reduce()
//
//	return nextInt
//
// ==============

// In all of these, it would result in a compile error if the types of the arguments
// do not match each other and the return type.

// My new favorite. This makes for a simple grammar. It allows
// people who prefer function calls can instead use the `Add` function.
operator (Int + Int) Add
func Add(a, b Int) Int { ... }

// My old favorite. Abandoning the `func` construct clarifies
// that these should not be used like a standard function, and is much
// more clear that all arguments and the return type must be equal.
op(Int) (a + b) { ... }
operator(Int) (a + b) { ... }      // <- I like this better

// My old second favorite, although this looks a lot like a standard method definition.
// Maybe a good thing?
func (a + b Int) Int { ... }

// It can be fixed with adding an "op" to signify it's an operator function, although
// I do not like it because it just reads wrong. Also, looks like we're defining a package-level
// function named `op`... which is not what we are doing.
func op (a + b Int) Int { ... }

// Although at the same time, I don't like having words
// before the `func`... I feel that all function declarations should begin with `func`
op func (a + b Int) Int { ... }

// Another idea could just be to define a method named "Plus", although this
// would cause confusion between functions like `big.Int.Plus` vs `big.Int.Add`.
// We probably need to preserve `big.Int.Add` for microoptimization purposes.
func (a Int) Plus(b Int) Int { ... }

Considering other languages' implementations.

C++

// there's several ways to declare, but we'll use this one
Type operator+(Type a, Type b)

I think C++ isn't a bad language, but there are a lot of new programmers who use it and think it's "super cool" to implement operator functions for everything they make, including stuff like overloading the = operator (which I have seen before).

I also have a couple friends from college who really enjoyed defining operator functions for everything... no bueno.

It gives too much power to the programmer to do everything that they want to do. Doing this creates messy code.

Swift

static func +(a: Type, b: Type) -> Type

Note that custom operators may be defined, and you can define stuff like the operator's precedence. I have not looked much into how these operators end up being used, though.

C#

public static Type operator+ (Type a, Type b)

Operator functions in C# end up being massively overused in my experience. People define all of the operators for all of their data structures. Might just be a consequence of using a language with many features, though.

Kotlin

operator fun plus(b: Type): Type // use "this" for left hand side

https://kotlinlang.org/docs/reference/operator-overloading.html, actually a really nice read.

Operator functions get used everywhere, and even the standard library is littered with them. Using them leads to unreadable code. For instance, what does mutableList + elem mean? Does it mutate mutableList? Does it return a new MutableList instance? No one knows without looking at the documentation.

Also, defining it as a method instead of a separate function just begs it to mutate this. We do not want to encourage this.

Open Questions

Which operators should be allowed to be overridden?

So, the mathematical operators +, -, *, /, % are pretty clear that if this gets implemented, we'd want to overload these operators.

List of remaining operators that should be considered for overloading:

  • << and >> (I do not think this is a good idea)
  • |, &, and &^ (also do not think this is a good idea)
  • <, >, <=, >=, ==, != (maybe a good idea?)
    • If we include <, we should include == to prevent the confusing case of x <= y && y >= x but x != y.

Overloading equality may be a good thing. big.Int suffers because the only way to test equality is with a.Cmp(b) == 0 which is not readable at all.

I have left out || and && because they should be reserved exclusively for bool or types based on bool (has anyone ever even based a type on bool?) and see no reason to override them.

Should we even allow operator overloading on pointer types?

Allowing operator overloading on a pointer type means the possibility of mutating, which we do not want. On the other hand, allowing pointer types means less copying, especially for large structures such as matrices. This question would be resolved if the read only types proposal is accepted.

Disallowing pointer types
  • Does not allow mutation
  • No need for nil checks in operator implementation
Allowing pointer types
  • Allows operators to be consistent with the rest of the type's methods.
    • ie *big.Int is *big.Int everywhere else, it would be good for consistiency
  • Since it's consistent, it makes it easier to pass into other functions.
    • ie Can't pass big.Int into a function that takes *big.Int

Perhaps it should be a compile-time error to mutate a pointer in an operator function. If read-only types were added then we could require the parameters to be read-only.

Should it reference/dereference as needed?

Methods currently do this with their receivers. For instance:

// Because the receiver for `NewInt` is `*big.Int`,
// the second line is equivalent to `(&num).NewInt(....)`

var num big.Int
num.NewInt(5000000000000000000)

So should the same logic apply to operator functions?


I'm aware that this will probably not be added to Go 2, but I figured it would be a good thing to make an issue for, since the current issue for Operator Functions is quite small and, well, it's closed.


Edits:

  • Added #9455 to list of proposals which could be solved using this instead
  • Fixed a clarity issue
  • Changed examples to use a modular integer rather than a *big.Int since the math/big package is designed to be used in an efficient way, and added that read-only types would benefit this proposal

@gopherbot gopherbot added this to the Proposal milestone Sep 10, 2018

@gopherbot gopherbot added the Proposal label Sep 10, 2018

@OneOfOne

This comment has been minimized.

Copy link
Contributor

commented Sep 10, 2018

@gopherbot please add go2

@gopherbot gopherbot added the Go2 label Sep 10, 2018

@bronze1man

This comment has been minimized.

Copy link
Contributor

commented Sep 11, 2018

I do not like this idea. Because:

  • It increase the complexity of syntax, compilers, tools that read the language.
  • It also increase the learning cost of the language.
  • It decrease the readable of the code. It is much more difficult to answer the question "where is the code defined?"
@deanveloper

This comment has been minimized.

Copy link
Author

commented Sep 11, 2018

It increase the complexity of syntax, compilers, tools that read the language.

Of course it does, it's a new feature. All features inevatibly will do this. Although if we're worrying about complicating anything, it should be the spec. Luckily a lot of the rules remain the same, the only things that change are allowing more things to be "operated" together with arithmetic operators.

It also increase the learning cost of the language.

Again, so does every new feature, and I'd say the learning cost is very small. People typically don't need to write operator functions anyway, as there are meant to be used much more than they are written.

It decrease the readable of the code. It is much more difficult to answer the question "where is the code defined?"

That is a valid critique, but ideally you should not be worried about what it does. It doesn't matter how 1+2 works, you care that it adds properly and you get a 3 in return. Operator functions should not do anything other than what you expect the operator to do.

I'd say it actually increases readability as well. Again, look at literally anything that uses math/big like this quick example. People who need stuff like math/big or a future decimal64 library can take advantage of this. Honestly, APIs for these things are quite terrible, and need to be because we need workarounds for these operators.

@deanveloper

This comment has been minimized.

Copy link
Author

commented Sep 11, 2018

Also, it's defined the same place that methods are. In the same package as the type itself! So really the question "where is it defined" is a question that we already have the tools to answer.

@urandom

This comment has been minimized.

Copy link

commented Sep 11, 2018

It also increase the learning cost of the language.

I would argue the opposite. The current implementation of the math/big package is confusing and very hard to use. A very big improvement would be if it would just support operators and would act like the base number types. That would make it very easy to learn and use.

It decrease the readable of the code. It is much more difficult to answer the question "where is the code defined?"

The code readability is greatly improved. Again, math/big clearly illustrates that. As for "where is the code defined", that has little to do with readability, and the answer, as is right now, is of course: https://godoc.org/math/big

@urandom

This comment has been minimized.

Copy link

commented Sep 11, 2018

Another area that may be improved with such operator functions is the current generics proposal.
As it stands right now, writing a function to add 2 numbers would only be limited to the current numeric types. A hypothetical Sum generic function will never be able to work with big.Int, for example, or say, a custom rational number type.

@davecheney

This comment has been minimized.

Copy link
Contributor

commented Sep 11, 2018

@as

This comment has been minimized.

Copy link
Contributor

commented Sep 11, 2018

// each of the functions would have the following function body:
{
    return new(big.Int).Add(a, b)
}

I would prefer not to allocate a potentially large amount of memory for every arithmetic operation, especially if that operation looks like a normal arithmetic operation.

The package math/big is both well-designed and has a learning curve. The two qualifiers are conflated in some of the examples and comments here. The stated goal of this package was to provide efficient manipulation of big numbers. The current proposal should perform no worse than the current state of affairs.

@deanveloper

This comment has been minimized.

Copy link
Author

commented Sep 11, 2018

I disagree. Only a small percentage of the math/big api can be replaced with the small set of binary operators defined in the language.

Yes, only a small percentage can be replaced, but what about how often those are used? (Also, this really isn't only for math/big, I mainly used it as an example because it has a very steep learning curve, and the code resulting from it is typically very messy and can be hard to maintain.

I would prefer not to allocate a potentially large amount of memory for every arithmetic operation, especially if that operation looks like a normal arithmetic operation.

The package math/big is both well-designed and has a learning curve. The two qualifiers are conflated in some of the examples and comments here. The stated goal of this package was to provide efficient manipulation of big numbers. The current proposal should perform no worse than the current state of affairs.

This is a good critique as well, but when all you want to do is to add numbers without having to worry about overflow, it could be extremely useful.

There could also be a separate type defined somewhere else with a less steep learning curve. After all, there is a pretty popular issue (#19623) to make int arbitrary precision. This could instead be a third party library (or in golang.org/x/) with operator functions defined for it.

@ianlancetaylor ianlancetaylor changed the title proposal: Operator Functions proposal: Go 2: operator functions Sep 12, 2018

@deanveloper

This comment has been minimized.

Copy link
Author

commented Sep 19, 2018

Just as a quick demo as to how this could be implemented outside of math/big, which was a bad example (since it's designed to be efficient).

This could also of course be used for other mathematical constructs such as matrices, vectors, etc. Also for arbitrary-precision integers, a bigdecimals, a elliptical-point, etc.

A simple modular integer implementation:

// A modular integer.
//
// `Reduce` should be called if you ever modify `Val` or `Mod` directly.
// As long as this is done, the `==` operator will work properly.
type Int struct {
	Val int
	Mod int
}

// Adds two modular integers and reduces.
// Panics if a and b do not have the same modulus
operator(Int) (a + b) {
	if a == Int{} { // handle unary +
		a.Mod = b.Mod
	}

	checkMod(a, b)
	nextInt = Int{Val: a.Val + b.Val, Mod: a.Mod}
	nextInt.Reduce()

	return nextInt
}

// Subtracts a modular integer from another and reduces.
// Panics if a and b do not have the same modulus.
operator(Int) (a - b) {
	if a == Int{} { // handle unary -
		a.Mod = b.Mod
	}

	checkMod(a, b)
	nextInt = Int{Val: a.Val - b.Val, Mod: a.Mod}
	nextInt.Reduce()

	return nextInt
}

// Multiplies two modular integers and reduces.
// Panics if a and b do not have the same modulus
operator(Int) (a * b) {
	checkMod(a, b)
	nextInt = Int{Val: a.Val * b.Val, Mod: a.Mod}
	nextInt.Reduce()

	return nextInt
}

// sets i.Val to its least nonnegative residue
func (i *Int) Reduce() {
	if i.Val >= 0 && i.Val < i.Mod {
		return
	}
	i.Val = (i.Val%i.Mod + i.Mod)%i.Mod
}

// makes sure a and b have the same modulus.
// if not, it panics.
func checkMod(a, b Int) {
	if a.Mod != b.Mod {
		panic(fmt.Errorf("mod is not equal: a(%d) b(%d)", a.Mod, b.Mod))
	}
}

And of course the usage:

func main() {
	nine := modmath.Int{ Val: 9, Mod: 2 }
	ten := modmath.Int{ Val: 10, Mod: 2 }
	twentyOne := modmath.Int{ Val: 21, Mod: 2 }.Reduce()
	
	fmt.Printf("9 + 10 == 21 (mod 2):", x + y == twentyOne)
	// output: true
}
@go-li

This comment has been minimized.

Copy link

commented Oct 19, 2018

With operators there are 5 goals:

  1. Disallow operating on different types
  2. Not wildly exported to other packages, force re-declaring
  3. Disallow mutating input args
  4. Interact with generics well
  5. Only one function that returns integer for operators <,>,<=,>=,==,!= negative positive and equal (zero)

I propose the following. Operators should be a function calls with functions that are in general:

type T struct{}
func MyCustomAdd(*T , *T) *T {...}
func MyCustomSub(*T , *T) *T {...}
func MyCustomMul(*T , *T) *T {...}
func MyCustomMod(*T , *T) *T {...}
func MyCustomShr(*T , uint) *T {...}
func MyCustomShl(*T , uint) *T {...}
func MyCustomBwAnd(*T , *T) *T {...}
func MyCustomBwOr(*T , *T) *T {...}
func MyCustomBwClr(*T , *T) *T {...}
func MyCustomCompare(*T , *T) int {...}

Comparison is special, it is the "default operator", and maps all of the operators <,>,<=,>=,==,!=

Comparison is written like:

op default MyCustomCompare

Next the user can define also the other operators:

op + MyCustomAdd

If MyCustomAdd does not match the generic signature func ( * , * ) (*) this would be a compile time error. This satisfies point 1. Also, defining operator on a type that already has it would be a compile time error.

If the function is capitalized then the operator can be exported but must be also declared in the remote package, like so:

op + otherpackage.MyCustomAdd

This satisfies point 2.

All functions linked to operator will go thru additional type checking, to detect cases where you write to args and where you return something else than pointer to your own variable (this rule bans returning nil, also bans returning a and b):

func MyCustomAdd(a *MyInt ,b *MyInt) *MyInt {
    var foo MyInt // this must exist
    a.x = 7 // cannot write to arg
    if (yadda yadda) {
        return nil // this cannot exist
    } else {
       return b // this cannot exist
    }
   return &foo // this must exist
   return //this cannot exist
}

Operators that passed the additional type checking, will be internally optimized to something like

func(a *, b *,ret *)

Those who did not won't.

Next, generic function Maximum:

func Maximum (compare func (*, *)int, ... things) (out *) {...}

Can be specially written as:

func Maximum2 (op default, ... things) (out *) {...}

Now comes the main point: Maximum requires either plain generic function or auto-passed "unoptimized operator".
Maximum2 requires optimized operator. If you pass optimized operator to function that requires unoptimized one, a wrapper code will be generated and the wrapper passed.

Operators can be auto passed, like this:

Maximum(_, T{},T{},T{},T{} )
Maximum2(_, T{},T{},T{},T{} )

And used inside the generic function:

func Maximum2 (op default, things ... ) (out *) {
   if things[1] > things[0] {
          return &things[1]
    }
    return &things[0]
}

And the best part: compiler takes care of the boring details.
Like: promoting values to pointers when passing to operator functions. Creating wrappers for operators. Typechecking operator functions and returning compile time errors when you pass unoptimizable operator function to a generic function that has op param.

@deanveloper

This comment has been minimized.

Copy link
Author

commented Oct 19, 2018

Could you edit to have your examples in code fences?

like this

would be
```
like this
```

@maj-o

This comment has been minimized.

Copy link

commented Oct 20, 2018

Maybe a simpler syntax is also okay?

func (a mytype) +(b mytype) (mytype) {
     return a.Add(b)
}

Just like a normal methods, but with an operator rune as name.

@go-li

This comment has been minimized.

Copy link

commented Oct 21, 2018

To make it sure that operators always return something, and are optimized (by writing it's result to a specific memory location using input argument as output), I revise my proposal to:

type T struct{}
func MyCustomAdd(*T , *T, *T) *T {...}
func MyCustomSub(*T , *T, *T) *T {...}
func MyCustomMul(*T , *T, *T) *T {...}
func MyCustomMod(*T , *T, *T) *T {...}
func MyCustomShr(*T , uint, *T) *T {...}
func MyCustomShl(*T , uint, *T) *T {...}
func MyCustomBwAnd(*T , *T, *T) *T {...}
func MyCustomBwOr(*T , *T, *T) *T {...}
func MyCustomBwClr(*T , *T, *T) *T {...}
func MyCustomCompare(*T , *T) int {...}

If operator is passed to a generic function, the operator function must return it's third argument: Example:

func DoAddition(op +, operand1 *, operand2 *, result *) (*) {
	*result = operand1 + operand2
	return result
}

type T struct {x int}
func AddCorrect(a *T, b *T, ret *T) (ret2 *T) {
	(*ret).x = (*a).x + (*b).x
	return ret
}

func AddIncorrect(a *T, b *T, ret *T) (ret2 *T) {
	r := T{(*a).x + (*b).x}
	return &r
}

this will work:

op + AddCorrect
DoAddition(_, &T{}, &T{}, &T{})
// okay

this won't:

op + AddIncorrect
DoAddition(_, &T{}, &T{}, &T{})
// COMPILE TIME ERROR: Operator + mapped to AddIncorrect on line 54,
//does not return it's third argument on line 87774,
//when passed to generic function DoAddition on line 45645
@deanveloper

This comment has been minimized.

Copy link
Author

commented Oct 21, 2018

Okay, I think you're micro-optimizing a bit here. The syntax should be simple.

Also, as a minor point, you don't need to dereference before using the . operator. The compiler does that for you since it's part of the spec.

I'd also like to add that read-only types (I'd link the proposal if I weren't on my phone) would work well with this proposal, as operator functions could require that all inputs are read-only.

@deanveloper

This comment has been minimized.

Copy link
Author

commented Oct 21, 2018

I think forcing redeclaration of operators may lead to an "operator poisoning" of sorts... It would be especially inconvenient for people who use Go for mathematics to use a matrix, vector, and arbitrary-size int, and need to redeclare 50 operators.

Calling the comparison operators "default" doesn't make any sense to me.

I like the rest of the first part of your proposal, though. The second part not-so-much.

@go-li

This comment has been minimized.

Copy link

commented Oct 21, 2018

I am not merely speculating, I have a very serious intention to add operator overloading to my experimental language, named go-li. In fact I plan to do it next week, exactly how I propose.

Comparisons: Calling them default is wrong you're right. IMHO Most people would want to map all of them to the same function, in the C language tradition that returns negative int when a<b, zero when a==b, and positive int when a>b.

take look at this line:

func Foo(op comparison) {...}

How do you know that it is operator comparison, rather than parameter named op of type comparison? You don't know. Backwards compatibility is not 100% bulletproof. For this reason at least one of the words has to be reserved keyword, I propose:

func Foo(op default) {...}

or something like this:

op <=> MyComparison
func Foo(op <=>) {...}

or, alternatively, not using "op" but the "map" keyword to map operators.

map + MyAddition
func Foo(map +, map -, map comparison) {...}

alternatively, not using any arg names:

map + MyAddition
map <=> MyComparison
func Foo( +,  -, <=>, actual_parameter int) {...}
@go-li

This comment has been minimized.

Copy link

commented Oct 21, 2018

To prove it, here is example of code that uses generics, and would benefit from operator overloading. Fenwick tree (a data-structure that allows you to quickly sum a continuous sub-sequence of a large array.).
Fenwick tree in go-li language
Fenwick tree in folang
Fenwick tree example on folang playground
It needs the addition operator and negation operator , currently implemented using those callback functions.

The thing you probably disliked, is how we pass operators to generic functions. Trust me this is needed, otherwise people who don't prefer operators will have no way to call into generic functions that use operators. In this way they can, by passing the required concrete function that matches the callback signature as the parameter the way it is.

If we don't do this, the use of operators would spread like a disease:
i need package A
package A uses operators
therefore I define operators in my package so A can use them
since i defined operators i start using them (why not? they're already here)
suddenly my code is full of operators
other people need my package
therefore they define operators ... and so on

@deanveloper

This comment has been minimized.

Copy link
Author

commented Oct 21, 2018

I'm 100% okay with binding operators to existing function calls. What I don't like is redeclaring the operators in every package that's going to use them.

I really would like a new "op" or "operator" TLD. It would make sure source code stays readable. It could still be backward-compatible as it is a TLD and would only be used as one.

I think the best syntax would be something like

op + CustomAdd

func CustomAdd(*T, *T) *T { ... }

The only issue is that this would allow

op + func (*T, *T) *T { ... }

Sure, you could say "just don't allow inlining here", but I feel like that'd be against Go's simplistic design. Perhaps instead preventing that should just be a go vet check.

@c69

This comment has been minimized.

Copy link

commented Dec 1, 2018

TL;DR
Dear "go community", please read how its implemented in C++, then come back to this proposal. Also, you will probably need generics. And this is not trolling. @go-li has actually already brought up many valid points from that ecosystem in this thread.

Long list of comments:

  • Operator overloading beyond "i can do this" brings us to value semantics and abstract algebra, and requires several levels of constraints on types of the arguments to be done right. But it also allows to radically simplify all code dealing with Number-like object (i.e: the ones that have conceptual arithmetic operations available upon. E.g.: ints, floats, bigInts, matrices, complex numbers, quaternions, currency), and also provide convenience for things like Set operations (s1 + s2 == s1.union(s2)).
  • without equality (==) and its siblings (ordering / comparison) operator overloading turns into weak syntax sugar at best. Comparison operator definitions are a quite wordy, and so C++ plans to introduce <=> (starship) operator to simplify them.
  • while forcing totality on types (A, A)=>A sounds logical and cool, it will not let you implement [even such a simple thing as] multiplying matrix by scalar
  • regarding developer ergonomics, if operators are defined on types, then only packages using those types will have new semantics, and so developers would not be surprised by behavior of the operator.
@creker

This comment has been minimized.

Copy link

commented Dec 1, 2018

I also don't like the idea of re-declaration. If I import a package that defines operators on its types then I don't have access to those operators? That kinda makes them useless. The whole point of operator overloading is to give them to library consumers. I, as a library implementer, don't care about them and don't really need them.

Also, I don't think we should look at C++. It has too much complexity that's very specific to C++ - value semantics, rvalue/lvalue, move semantics etc etc etc. It looks, frankly, ridiculous how it works rights now and committee understands that. <=> tries to make it slightly better. There're better examples on how it can be done.

@maj-o

This comment has been minimized.

Copy link

commented Dec 1, 2018

Operator declaration without overloading is very useful. With this feature decimals can be made usable, complex numbers can be took out of core, even thinking of a flexible number type gets live.
For me the absolutly useless decimal type today is the reason for woting for operator methods. Operator overloading is no must for me, cause I define a new type when I want to overload, but declaring a new type meens also I can declare new operator methods. Overloading is useful for languages like C++, but (in my opinion) not for Go.

@hydroflame

This comment has been minimized.

Copy link

commented Jan 24, 2019

// Barycentric returns the barycentric coordinates for a point in a triangle.
// The barycentric coordinates cna be used for interpolation of {a, b, c} to
// point p (such as normals, texture coordinates, colors, etc)
func Barycentric(a, b, c, p *glm.Vec3) (u, v, w float32) {
	var v0, v1, v2 glm.Vec3
	v0.Sub(b, a)
	v1.Sub(c, a)
	v2.Sub(p, a)
	d00 := v1.Dot(&v1)
	d01 := v0.Dot(&v1)
	d11 := v1.Dot(&v1)
	d20 := v2.Dot(&v0)
	d21 := v2.Dot(&v1)

	denom := 1 / (d00*d11 - d01*d01)
	v = (d11*d20 - d01*d21) * denom
	w = (d00*d21 - d01*d20) * denom
	u = 1 - v - w
	return
}

^ disgusting, error prone, entirely because I can't define +* on vectors

All geometry/physics libraries are full of unreadable code because this feature is missing.

@as

This comment has been minimized.

Copy link
Contributor

commented Jan 25, 2019

@hydroflame Your example contains a lot of code with little context. Can you provide the example of imagined "non-disgusting, error minimizing" code where you are using the notation?

@StephanVerbeeck

This comment has been minimized.

Copy link

commented Apr 23, 2019

I would like to add some new insights to this topic:

  • when reading the original design goals of GO I thought this (operator overloading) would never happen
  • when going forward with this I would not reflect upon how other languages do it (see end of text why)
  • when implementing this we will quickly run into secondary decisions as operator overloading is closely linked to function overloading (same function name, different function argument types) and that is also something that GO was supposed to never implement

Proposed method for function overloading is via function names that SHOW both the operator and the position of the arguments.

Example 1:
func "_+_"(a,b int) (sum int) {
	sum = a + b 
	return
}

Example 2:
func "_++"(a *int) (result int) {
	result = *a
	*a = result+1
	return
}

Example 3:
func "++_"(a *int) (result int) {
	result = *a + 1
	*a = result
	return
}

Example 4:
func "_[_]=_"(arr map[int]string, idx string, value string){
	...
}

Example 5:
func "_._"(arr map[string]int, value string) int {
	return arr[value]
}

As you see this opens much wider possibilities than old-style operator overloading but (and I have implemented this in an compiler/bytecode-interpreter language so I know) this is bothersome and if I would face it again I would not go for operator overloading at all.

The only thing why you want to do this is that GO currently is lacking some of the more excotic data containers but francly. I do not think that GO needs this. It would change the language profoundly in a manner that makes it less distinct and less systematic (readable/predictable).

It also must be implemented on a low level in the parser and compiler so this change would be affect a large part of the code-base of the language.

@deanveloper

This comment has been minimized.

Copy link
Author

commented May 7, 2019

I reflect on other languages to learn from them by their mistakes. We should avoid the _[_] = _ operator and such similar operators which do not reflect arithmetic types - this proposal is strictly for creating a way to represent arithmetic types in Go. The syntax you provide is intriguing though. I may draft a new proposal in the near future which is less about overloading operators. It's hard to explain in a quick comment but hopefully it should still be simple.

@cosmos72

This comment has been minimized.

Copy link

commented Jul 12, 2019

There is an important but implicit (and probably slightly hidden) trade-off in operator overloading, which is visible if you compare the proposed functions/methods signatures with the methods of math/big:

TL;DR:

it is not possible to pass among the arguments a value or pointer where the result will be stored.

This will cause significant slowdown on large objects as arbitrary precision numbers (math/big.Int, math/big.Float, math/big.Rat), arbitrary size matrices (gonum.org/v1/gonum/mat/Dense) etc.

Full explanation:

Let's consider for example the addition between user-defined types: the syntax is not very relevant, it can be any of the existing proposals:

operator (*big.Int + *big.Int) Add
func Add(a, b *big.Int) *big.Int {
 // must allocate a new big.Int and return its address
}

or

operator(*big.Int) (a + b) {
  // idem, must allocate a new big.Int and return its address
}

or anything else.

In all cases, as highlighted in the comments above, the operator implementation must allocate a (possibly large) object where the result is stored. I used pointers *big.Int instead of values to make the issue more obvious.

On the other hand, math/big methods solve this issue by accepting an additional argument where they store the result - currently the receiver:

package big

// Add sets z to the sum x+y and returns z. 
func (z *Int) Add(x, y *Int) *Int {
  // store the result in z:
  // if its internal buffer is large enough, no allocation is needed.
}

Now, one may argue that the approach taken by math/big is a micro-optimization.
I am not really interested in discussing whether it's a "micro" optimization or not, but I think it's a very important optimization:

Imagine, instead of integers or floats, a library for linear algebra on arbitrary size matrices - for example gonum.org/v1/gonum/mat:
if operator overloading is part of Go, authors and users of such library will be tempted to support it.
At the moment, they have methods similar in spirit to math/big, for example:

package mat

// Add adds a and b element-wise, placing the result in the receiver
func (m *Dense) Add(a, b Matrix) {
  // store the result in m: also here, if its internal buffer
  // is large enough, no allocation is needed.
}

but operator overloading will not have the third argument, forcing to allocate it:

operator (Matrix + Matrix) Add
func Add(a, b Matrix) *Dense {
  // must allocate a new Dense and return its address
}

If you then write something like a + b + c + d on matrices, each + will allocate its result, which will need to be garbage collected at the end of the expression (except for the last one) - I expect it will be much slower than the (micro?) optimized

var *Dense a, b, c, d, z
// initialize a, b, c, d and z. Then:
z.Add(a, b)
z.Add(z, c)
z.Add(z, d)

So my question is about the following trade-off:

The (hopefully) improved readability and simplicity of operator overloading (if used correctly and not abused) is worth the slowdown it causes due to multiple allocations and subsequent garbage collection?

This happens if the the arguments of overloaded operators are values with internally allocated (and possibly large) buffers:
for example arbitrary precision numbers (math/big.Int, math/big.Float, math/big.Rat), arbitrary size matrices (gonum.org/v1/gonum/mat/Dense) etc.

P.S. one may attempt to solve this issue by adding an extra parameter also to overloaded operators, as for example:

operator (*big.Int + *big.Int) Add
func (z *big.Int) Add(a, b *big.Int) *big.Int {
  // fill z with the result of a + b and return z
}

but it does not work: it only moves the problem from method declaration to method use, because when invoking such operator+ in many cases there is no receiver:

var a, b, c, z *big.Int
// initialize a, b, c, d and z. Then:

z = a + b // a clever compiler may use z as receiver of operator+

z = a + b + c // z can be the receiver of the second addition.
                    // but who is the receiver of the first addition, i.e. a + b ?
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Jul 12, 2019

@cosmos72 You make an excellent point, but I note that that compiler can convert z = a + b + c into z = a + b; z = z + c. Any expression no matter how complex is just a sequence of unary and binary operations. So the question is whether and when the compiler can reliably turn

t1 = a + b
z = t1 + c

into

z = a + b
z = z + c

That is, when is it safe to replace a compiler-introduced temporary variable with the final result of the expression, and, similarly, when it is safe to combine compiler-introduced temporaries? It's a problem not dissimilar to register allocation with an unbounded number of registers, although it's possible that there are cases where there is some additional aliasing that must be considered.

Anyhow, I think the main point is, if we have operator functions, when writing the implementation there needs to be a way to express not just the operands but also the destination aka receiver.

@cosmos72

This comment has been minimized.

Copy link

commented Jul 13, 2019

Thanks @ianlancetaylor

Sorry for the second long post in a row - the analysis of "compiler-introduced temporary variables" and "aliasing" contains some subtle points and, if operator overloading is added to the language, they are going to affect Go programmers and compilers writers for a long time.

It is thus important to fully understand the consequences in detail, as they are non-trivial - I hope you agree it's the only way to make an informed decision.

Let's start from a concrete example of operator overloading:

package main
import (
  "math/big"
)
var _2 = big.NewRat(2, 1)

// compute z = ax^2 + 2bxy + cy^2 + 2dx + 2fy + g
func QuadraticCurve(z, x, y, a, b, c, d, f, g *big.Rat) *big.Rat {
  z = a*x*x + _2*b*y + c*y*y + _2*d*x + _2*f*y + g
  return z
}

The implementation surely looks enticingly similar to the mathematical formula.

Without operator overloading, the equivalent code is ugly, difficult to read
and requires some thought to write:

// compute z = ax^2 + 2bxy + cy^2 + 2dx + 2fy + g
func QuadraticCurve(z, x, y, a, b, c, d, f, g *big.Rat) *big.Rat {
  var t big.Rat
  
  z.Mul(a, x).Mul(z, x)               // z = a*x*x
  
  t.Mul(_2, b).Mul(&t, x).Mul(&t, y)  // t = 2*b*x*y
  z.Add(z, &t)                        // we can now reuse t
  
  t.Mul(c, y).Mul(&t, y)              // t = c*y*y
  z.Add(z, &t)                        // we can now reuse t
  
  t.Mul(_2, d).Mul(&t, x)             // t = 2*d*x
  z.Add(z, &t)                        // we can now reuse t

  t.Mul(_2, f).Mul(&t, y)             // t = 2*f*y
  z.Add(z, &t)                        // we can now reuse t

  return z.Add(z, g)
}

There is no need to highlight the advantages of the first version and the disadvantages of the second.
I will just say that the first seems a high-level language, while the second has an assembler-like look,
where variables take the place of CPU registers and are managed explicitly one by one - including temporary variables.

The disadvantages of the first version can be summarized in a question:
can the compiler be sufficiently smart that both compiled versions run equally fast?
(without artificially penalizing the second, obviously)

Let's see what a sufficiently smart compiler should do.

  1. compute how many temporary variables are needed.

This is somewhat similar to register allocation, as you pointed out.
It can be visualized as a tree coloring problem, where the tree is the Abstract Syntax Tree (AST):

      z
      =
      +
     / \________
    *           +
   / \         / \________
  a   *       *           +
     / \     / \         / \________
    x   x   2   *       *           +
               / \     / \         / \________
              b   *   c   *       *           +
                 / \     / \     / \         / \
                x   y   y   y   2   *       *   g
                                   / \     / \
                                  d   x   2   *
                                             / \
                                            f   y

The tree coloring problem can be stated as follows:

  • All leaves have no color - nothing to allocate there, as the values are stored in existing variables.
  • All internal nodes (the + and *) must be colored i.e. their result must be stored
    in some temporary variable. Each color is a different temporary variable.
  • The immediate children of an internal node must all have different colors:
    you cannot store two (or more) partial computations into the same temporary variable.

Also, the compiler cannot take advantage of any mathematical property of the operators,
such as neutral element e op x == x, commutativity a * b == b * a,
associativity (a * b) * c == a * (b * c) and distributivity (a + b) * c == a * c + b * c
because the user-defined operators may not have such properties.
For example, matrix or quaternion multiplication is not commutative
and octonion multiplication is neither commutative nor associative.

In this case, the AST is quite simple and needs only two colors, i.e. two temporary variables:

  • a*x*x is blue (i.e. all its internal nodes are blue)
  • 2*b*x*y is red (i.e. all its internal nodes are red)
  • c*y*y is red
  • 2*d*x is red
  • 2*f*y is red
    and all other internal nodes (the +) are blue
  1. analyze aliasing

What's aliasing actually?
One can attempt to define it as follows: if two values are aliased, they may reference the same mutable data.
Thus modifying the content of one value may modify the other.

In our example, the "sufficiently smart" compiler should realize that z is only used to store the final result, and thus, if it demonstrates that z is not aliased to any other variable, it can directly use z
as one of the temporary variables.
This is a tricky point, both for programmers and for the compiler:

if z is aliased to some other argument passed to QuadraticCurve(), the second version of the function will not behave as expected, i.e. it will produce a different result.

Maybe it's a bug, maybe it's intentional (and hopefully documented), but in any case there is no way to inform the compiler of such peculiarity. In other words, there is currently no Go syntax to tell the compiler which variables may alias each other and which variables are guaranteed not to alias each other.

As a side note, C (not C++) has the restrict keyword to annotate a pointer as "it will not alias anything else", but a single mistake using it will introduce subtle and hard-to-find bugs.
Not something I would recommend for Go.

In Go, analyzing aliasing at compile time can probably be analogous to escape analysis:
when a function is compiled, its callers may not be known yet, so the compiler cannot
assume anything about the aliasing among its arguments, but it can analyze the aliasing
created or destroyed by the function body.
When local variables are created, they are not aliased to anything: aliasing can only be introduced by storing shared data inside them, or by storing the same slice, map or address into multiple places.
(Note: what about closures?)

In short, aliases analysis will probably be able to produce a proof "these two variables are not aliases"
only for local variables, and only if the compiler performs an aliasing analysis on each function
while compiling it, similarly to how it currently performs escape analysis.

  1. actually creating "compiler-introduced temporary variables"

There is currently no uniform syntax for this.

Many types guarantee in their documentation (which a compiler cannot read) that the zero value is valid and useful, but other types give no such guarantee.

Also, in many cases operator overloading will accept pointers to named types, as the *big.Int used in the example above, and the "compiler-introduced temporary variables" of type *big.Int must be a non-nil pointer to big.Int.

In general, if the compiler needs to introduce a temporary variable of type *T, should it call new(T)?
What happens if the zero value of T is not the right thing to use?

The solution I am currently experimenting with (for generics, which have a similar need for "user-introduced temporary variables") is the following:

  • a compiler-generated temporary variable of type T is the value returned by zero.New() below, provided that the code compiles and zero.New() returns a T
    var zero T
    zero.New()
    Note that T may be a pointer type, i.e. a *U, as for the example *big.Int. In such case, the method T.New() must have pointer receiver and will be called with a nil receiver - which must not be dereferenced.

This is a way to introduce C++-style constructors in Go - New() becomes equivalent to a zero-argument constructor, but I could not find a better way.

And it certainly has the disadvantage that Go code using operator overloading becomes harder to reason about: the compiler will invisibly insert calls to T.New()

@maj-o

This comment has been minimized.

Copy link

commented Jul 13, 2019

@cosmos72 much work, chapou.

But the main point is just to have operator methods, like introduced by Mr. Griesemer in 2016. This is absolutely needed for a usable decimal implementation or other useful number formats. And for minimizing the core of Go. For example complex numbers could be implemented in a separate package.

Overloading is not needed or wanted.

That this works, can be seen here https://youtu.be/vLxX3yZmw5Q and here https://github.com/griesemer/dotGo2016?files=1
Maybe I missed the point. For me it seems quiet simple. Here is the proposal and the solution.
Motivation is business calculations: invoices, taxes, ... This is not possible without a good decimal implementation with usual operators.

@cosmos72

This comment has been minimized.

Copy link

commented Jul 14, 2019

@maj-o I was not discussing the syntax to define operator on new types - it can be operator overloading, operator methods, annotations, or any other proposal.

My work above analyzes the amount of garbage (temporary values) produced by operators on user-defined types, the techniques to minimize the garbage, and the limits of such techniques.

The reason is simple: less garbage means fewer allocations and fewer garbage collections, which translates to faster execution (provided the algorithm is not changed in other ways)

The only user-visible part of my analysis is that functions/methods that implement operators should accept an additional parameter, where they store the result before returning it. The Add, Sub, Mul, Div ... methods of types inside math/big do exactly that. Note: this has no visible effect on how operators are used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.