Syntax for maybe-mutating operations #30407

chethega · 2018-12-16T16:53:22Z

A somewhat typical situation is that we need to compute a = b + c, and we know that whatever object that was previously bound to a is not needed anymore. However, we don't know whether a is effectively immutable (e.g. Int, SArray) or mutable (e.g. BigInt, Array), because we are writing generic code. In the immutable case, a = b + c is the right thing. In the Array case, a .= b .+ c is the right thing, and in the BigInt case the right thing is Base.GMP.MPZ.add!(a, b, c).

Assignments can have three types of mutation semantics: (1) no mutation, use a = ..., (2) mutate, use a .= ... as syntactic sugar for a function call, (3) do whatever is fast.

We currently can't express (3), i.e. we want a way for programmers to express that mutation is a valid optimization. In case of functions with long names, this could be done with keyword argument, e.g. a = union(a, b; inplace = true) instead of a = union(a,b) (immutable set) or union!(a,b) (mutable set). But it is not entirely clear what to use for infix operators.

Something that appears to be still free (i.e. currently a parsing failure) is a ?.= b .+ c, which could use a modification of the broadcast machinery: Lower this to a = broadcast!(MaybeInplace(), +, a, b, c) which then becomes either a = (broadcast!(+, a, b, c); a) / a = (Base.GMP.MPZ.add!(a,b,c); a) in the mutable case, or a = (broadcast(+,b,c)) in the immutable case, depending on types and operations. a !.= ... and a ?= ... are also free.

The text was updated successfully, but these errors were encountered:

antoine-levitt · 2018-12-16T18:00:17Z

I don't think syntax transformations have type information, so I don't see how that would work. a = b is a fundamentally different operation from a .= b (assignment vs function call), and mixing them like this seems tricky. For BigInt, any reason a .= b can't just be made to mutate a? I do agree with the problem however of writing code that works for Arrays and for SArrays.

chethega · 2018-12-16T18:59:53Z

Yes, that's why the user would need to write e.g. a ?.= b, which lowers to a = broadcast!(MaybeInplace(), identity, a, b), and broadcast!(::MaybeInplace, fun, dst, args...) decides, depending on typeof(dst) (during dispatch?), whether to mutate dst to contain the result and then return dst, or rather to return a new object with the result. That way, the same generic code works for both Array and SArray, or Int and BigInt.

antoine-levitt · 2018-12-16T20:02:30Z

I see, thanks for the explanation. Compared to now it would basically just add an a=a in the case of a regular array, then. If that works, can't it just be the default for .=?

JeffBezanson · 2018-12-16T21:13:47Z

I think the right approach for this might be something like the bounds check removal mechanism, which is a collaboration between the compiler and libraries. Basically, provide a way for a function to specify how to do something in-place if the compiler can prove it's safe.

tkf · 2018-12-17T01:57:14Z

I was thinking this for some time too. I want to add that it's useful outside broadcast, like */mul!, setindex/setindex!, and setproperty!. But I guess the broadcast-like dispatchable computation graph is the key.

Currently, a .= f.(b, c) is lowered to

CodeInfo(
1 ─ %1 = (Base.broadcasted)(f, b, c)
│   %2 = (Base.materialize!)(a, %1)
└──      return %2
)

Would it make sense to change it to something like this?

CodeInfo(
1 ─ %1 = (Base.broadcasted)(f, b, c)
│   %2 = (Base.materialize!)(a, %1)
│   a = %2  # added
└──      return %2
)

This way, libraries like StaticArrays.jl can start experimenting with unified interface. It just has to ignore the argument a and return a new modified version of the same type. If this is available, I think libraries like @dlfivefifty's LazyArrays.jl can already combine * and mul! (although it needs a new binary operator for this).

For setindex[!] and setproperty!, I think it would be nice to go with lens-based approach @jw3126's Setfield.jl is using. The basic idea is to lower a.x[i] .= f.(b, c) to something like

il = IndexLens((i,))
pl = PropertyLens(:x)
lv = dotview(a, pl ∘ il)
bc = broadcasted(f, b, c)
a, rhs = materialize!(lv, bc)
return rhs

Like the first example, materialize!(lv, bc) may return a new instance of a (e.g., when a is a named tuple of static arrays). To make it consistent and composable, the fist example can also be written using IdentityLens.

Of course, this approach can be combined with a new symbol like ?.= with a new "execution strategy" like MaybeInplace.

chethega · 2018-12-17T14:01:19Z

I think the right approach for this might be something like the bounds check removal mechanism, which is a collaboration between the compiler and libraries. Basically, provide a way for a function to specify how to do something in-place if the compiler can prove it's safe.

That's the right way of obtaining performant BigInt code from Int code, or performant broadcast! code from broadcast code. That would be super cool, but is completely orthogonal to what I'm asking about.

My proposal goes in the opposite direction: If the programmer has carefully thought about where to reuse objects (as any good Array or BigInt code must, today!), then make it simple to also work on immutable types (Int, SArray). This simply needs two things: The maybe mutating operation / function returns the object (old one if mutated, new one if immutable), and the programmer rebinds the variable to the result. This rebinding / assignment is a nop in case of successful mutation, and this mechanism would ideally introduce no runtime overhead and only small syntactic overhead in case of successful mutation.

Sometimes, mutation is essential: There are long-lived references to the object all over the place. Then, a mutating BigInt algorithm would need to work with Ref{Int} instead of Int; and we can't ask the compiler to read our mind, so we need to error out (e.g. MethodError for setindex!). But in my experience, it is quite common that the hand-optimized mutating code could work on immutable types as well, if we only told the compiler that it is permitted to skip mutation, instead of throwing an exception.

mbauman · 2018-12-17T14:20:12Z

See #19992

Tokazama mentioned this issue Aug 26, 2022

More specific characterization of mutable collections with can_setindex and can_change_size #46500

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Syntax for maybe-mutating operations #30407

Syntax for maybe-mutating operations #30407

chethega commented Dec 16, 2018

antoine-levitt commented Dec 16, 2018

chethega commented Dec 16, 2018 •

edited

antoine-levitt commented Dec 16, 2018 •

edited

JeffBezanson commented Dec 16, 2018

tkf commented Dec 17, 2018 •

edited

chethega commented Dec 17, 2018 •

edited

mbauman commented Dec 17, 2018

Syntax for maybe-mutating operations #30407

Syntax for maybe-mutating operations #30407

Comments

chethega commented Dec 16, 2018

antoine-levitt commented Dec 16, 2018

chethega commented Dec 16, 2018 • edited

antoine-levitt commented Dec 16, 2018 • edited

JeffBezanson commented Dec 16, 2018

tkf commented Dec 17, 2018 • edited

chethega commented Dec 17, 2018 • edited

mbauman commented Dec 17, 2018

chethega commented Dec 16, 2018 •

edited

antoine-levitt commented Dec 16, 2018 •

edited

tkf commented Dec 17, 2018 •

edited

chethega commented Dec 17, 2018 •

edited