Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntax for maybe-mutating operations #30407

Open
chethega opened this issue Dec 16, 2018 · 7 comments
Open

Syntax for maybe-mutating operations #30407

chethega opened this issue Dec 16, 2018 · 7 comments

Comments

@chethega
Copy link
Contributor

A somewhat typical situation is that we need to compute a = b + c, and we know that whatever object that was previously bound to a is not needed anymore. However, we don't know whether a is effectively immutable (e.g. Int, SArray) or mutable (e.g. BigInt, Array), because we are writing generic code. In the immutable case, a = b + c is the right thing. In the Array case, a .= b .+ c is the right thing, and in the BigInt case the right thing is Base.GMP.MPZ.add!(a, b, c).

Assignments can have three types of mutation semantics: (1) no mutation, use a = ..., (2) mutate, use a .= ... as syntactic sugar for a function call, (3) do whatever is fast.

We currently can't express (3), i.e. we want a way for programmers to express that mutation is a valid optimization. In case of functions with long names, this could be done with keyword argument, e.g. a = union(a, b; inplace = true) instead of a = union(a,b) (immutable set) or union!(a,b) (mutable set). But it is not entirely clear what to use for infix operators.

Something that appears to be still free (i.e. currently a parsing failure) is a ?.= b .+ c, which could use a modification of the broadcast machinery: Lower this to a = broadcast!(MaybeInplace(), +, a, b, c) which then becomes either a = (broadcast!(+, a, b, c); a) / a = (Base.GMP.MPZ.add!(a,b,c); a) in the mutable case, or a = (broadcast(+,b,c)) in the immutable case, depending on types and operations. a !.= ... and a ?= ... are also free.

@antoine-levitt
Copy link
Contributor

I don't think syntax transformations have type information, so I don't see how that would work. a = b is a fundamentally different operation from a .= b (assignment vs function call), and mixing them like this seems tricky. For BigInt, any reason a .= b can't just be made to mutate a? I do agree with the problem however of writing code that works for Arrays and for SArrays.

@chethega
Copy link
Contributor Author

chethega commented Dec 16, 2018

Yes, that's why the user would need to write e.g. a ?.= b, which lowers to a = broadcast!(MaybeInplace(), identity, a, b), and broadcast!(::MaybeInplace, fun, dst, args...) decides, depending on typeof(dst) (during dispatch?), whether to mutate dst to contain the result and then return dst, or rather to return a new object with the result. That way, the same generic code works for both Array and SArray, or Int and BigInt.

@antoine-levitt
Copy link
Contributor

antoine-levitt commented Dec 16, 2018

I see, thanks for the explanation. Compared to now it would basically just add an a=a in the case of a regular array, then. If that works, can't it just be the default for .=?

@JeffBezanson
Copy link
Sponsor Member

I think the right approach for this might be something like the bounds check removal mechanism, which is a collaboration between the compiler and libraries. Basically, provide a way for a function to specify how to do something in-place if the compiler can prove it's safe.

@tkf
Copy link
Member

tkf commented Dec 17, 2018

I was thinking this for some time too. I want to add that it's useful outside broadcast, like */mul!, setindex/setindex!, and setproperty!. But I guess the broadcast-like dispatchable computation graph is the key.

Currently, a .= f.(b, c) is lowered to

CodeInfo(
1%1 = (Base.broadcasted)(f, b, c)
│   %2 = (Base.materialize!)(a, %1)
└──      return %2
)

Would it make sense to change it to something like this?

CodeInfo(
1%1 = (Base.broadcasted)(f, b, c)
│   %2 = (Base.materialize!)(a, %1)
│   a = %2  # added
└──      return %2
)

This way, libraries like StaticArrays.jl can start experimenting with unified interface. It just has to ignore the argument a and return a new modified version of the same type. If this is available, I think libraries like @dlfivefifty's LazyArrays.jl can already combine * and mul! (although it needs a new binary operator for this).

For setindex[!] and setproperty!, I think it would be nice to go with lens-based approach @jw3126's Setfield.jl is using. The basic idea is to lower a.x[i] .= f.(b, c) to something like

il = IndexLens((i,))
pl = PropertyLens(:x)
lv = dotview(a, pl  il)
bc = broadcasted(f, b, c)
a, rhs = materialize!(lv, bc)
return rhs

Like the first example, materialize!(lv, bc) may return a new instance of a (e.g., when a is a named tuple of static arrays). To make it consistent and composable, the fist example can also be written using IdentityLens.

Of course, this approach can be combined with a new symbol like ?.= with a new "execution strategy" like MaybeInplace.

@chethega
Copy link
Contributor Author

chethega commented Dec 17, 2018

I think the right approach for this might be something like the bounds check removal mechanism, which is a collaboration between the compiler and libraries. Basically, provide a way for a function to specify how to do something in-place if the compiler can prove it's safe.

That's the right way of obtaining performant BigInt code from Int code, or performant broadcast! code from broadcast code. That would be super cool, but is completely orthogonal to what I'm asking about.

My proposal goes in the opposite direction: If the programmer has carefully thought about where to reuse objects (as any good Array or BigInt code must, today!), then make it simple to also work on immutable types (Int, SArray). This simply needs two things: The maybe mutating operation / function returns the object (old one if mutated, new one if immutable), and the programmer rebinds the variable to the result. This rebinding / assignment is a nop in case of successful mutation, and this mechanism would ideally introduce no runtime overhead and only small syntactic overhead in case of successful mutation.

Sometimes, mutation is essential: There are long-lived references to the object all over the place. Then, a mutating BigInt algorithm would need to work with Ref{Int} instead of Int; and we can't ask the compiler to read our mind, so we need to error out (e.g. MethodError for setindex!). But in my experience, it is quite common that the hand-optimized mutating code could work on immutable types as well, if we only told the compiler that it is permitted to skip mutation, instead of throwing an exception.

@mbauman
Copy link
Sponsor Member

mbauman commented Dec 17, 2018

See #19992

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants