Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Fix #31392, unaliasing of broadcast arguments against destinations with repeated indices #31407

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

mbauman
Copy link
Sponsor Member

@mbauman mbauman commented Mar 19, 2019

When introducing the new broadcasting implementation (which came alongside greater
semantic guarantees with regards to aliasing source and destination), I found that
we needed a performance optimization for the common case where the destination was
=== to an argument. This is quite common due to the .op= syntax and indeed is
safe in most cases as the iteration order between the destination and the source
is matched. This performance optimization is not safe, however, in the case that
the destination has multiple locations that refer to the same memory.

This just ensures our unaliasing pass is indeed doing what we want it to. You can still broadcast into these sorts of "self-aliasing" arrays — which will indeed repeatedly assign into the same location — but regardless of how you reference those destinations on the RHS (be it implicitly via .op=, with or without @views, explicitly on the RHS, etc), this will now ensures the RHS arguments are unaliased from the destination before performing the operation. I believe the following example from #31392 is a strong argument in favor of this fix:

Previously:

julia> x = [3]
       v = view(x, [1,1])
       x[[1,1]] .+= v # Both RHS arguments are copies; `v` is unaliased, the implicit `x[[1,1]]` makes a copy
       x
1-element Array{Int64,1}:
 6

julia> x = [3]
       v = view(x, [1,1])
       @views x[[1,1]] .+= v # The implicit `@view x[[1,1]]` is _not_ unaliased
       x
1-element Array{Int64,1}:
 9

julia> x = [3]
       I = [1,1]
       v = view(x, I)
       x[I] .+= v # The implicit `x[I]` is a copy, `v` is _not_ unaliased as it is `=== @view x[I]`
       x
1-element Array{Int64,1}:
 9

julia> x = [3]
       I = [1,1]
       v = view(x, I)
       @views x[I] .+= v # Neither RHS argument is a copy
       x
1-element Array{Int64,1}:
 12

Clearly, this is unsustainable. With this PR the answer is 6 in all cases. I believe this is the correct answer because it matches how things behave with other views:

julia> x = [3,3]
       v = view(x, [1,1])
       x .+= v # v is unaliased so this is `x[1] = x[1] + 3; x[2] = x[2] + 3`
       x
2-element Array{Int64,1}:
 6
 6 # And not 9!

Marking this as both a bugfix and a minor change for triage as apparently some packages may have been depending upon this in certain circumstances.

@mbauman mbauman added kind:bugfix This change fixes an existing bug domain:broadcast Applying a function over a collection needs news A NEWS entry is required for this change status:triage This should be discussed on a triage call kind:minor change Marginal behavior change acceptable for a minor release labels Mar 19, 2019
@JeffBezanson
Copy link
Sponsor Member

Nice. I assume if we merge this, we probably don't need/want #31391, since this PR more accurately detects when a copy is needed.

@mbauman
Copy link
Sponsor Member Author

mbauman commented Mar 19, 2019

Yes, that's exactly right (edit: with the caveat that other broadcast implementations in packages need to be similarly careful about aliasing behaviors, but they need to be careful about aliasing in any case).

@Keno Keno removed status:triage This should be discussed on a triage call labels Mar 28, 2019
@mbauman
Copy link
Sponsor Member Author

mbauman commented Mar 28, 2019

Triage is in favor here given the following observations:

  • This behavior matches our other unaliasing behaviors
  • Hopefully the compiler can work harder in the future to avoid making unnecessary copies
  • This decision is one that lends itself towards a slightly friendlier semantic towards parallelism. There are many other concerns there, but it's good to keep our eyes on that goal.

@mbauman
Copy link
Sponsor Member Author

mbauman commented Mar 28, 2019

Let's see if I missed any obvious cases where a copy is now made:

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson
Copy link
Sponsor Member

Is this ready to merge as-is? If so, we might want to consider squeezing it into 1.2.

@mbauman
Copy link
Sponsor Member Author

mbauman commented May 23, 2019

I'm not terribly thrilled with this because it is making lots of unnecessary preventative copies for custom abstract array types. Perhaps it should only make the preventative copy for the known array types that can self-alias (e.g., just SubArray and a few wrappers thereof?).

…th repeated indices

When introducing the new broadcasting implementation (which came alongside greater
semantic guarantees with regards to aliasing source and destination), I found that
we needed a performance optimization for the common case where the destination was
`===` to an argument. This is quite common due to the `.op=` syntax and indeed is
safe in most cases as the iteration order between the destination and the source
is matched. This performance optimization is not safe, however, in the case that
the destination has multiple locations that refer to the same memory.
but still provide the hook for arrays to improve
@mbauman mbauman force-pushed the mb/self-aliasing-broadcasts branch from d64f74c to 55e10b2 Compare April 27, 2020 22:23
@mbauman
Copy link
Sponsor Member Author

mbauman commented Apr 27, 2020

Ok, we've been letting perfect be the enemy of the good here for far too long. Let's be less conservative — and default to not unaliasing self broadcasts — but still allow SubArrays and other custom array types to catch and fix this situation.

I think this is a good compromise for now.

"""
potentially_self_aliased(::DenseArray) = false
potentially_self_aliased(A::StridedArray) = any(==(0), strides(A))
potentially_self_aliased(A::SubArray) = any(!allunique, A.indices)
potentially_self_aliased(A::SubArray) = any(map(!allunique, A.indices))
Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note for posterity: any(map(!allunique, A.indices)) does better compile-time evaluation with heterogeneous tuples and constant-folds to return 0 more frequently.

@mbauman mbauman added kind:minor change Marginal behavior change acceptable for a minor release needs news A NEWS entry is required for this change and removed kind:minor change Marginal behavior change acceptable for a minor release needs news A NEWS entry is required for this change labels Apr 29, 2020
@vtjnash
Copy link
Sponsor Member

vtjnash commented Oct 29, 2020

GTG then?

@jeremiedb
Copy link

Sorry to bump, but were there reasons preventing merging a fix for this?
AFAIK, this PR may fix a silent flawed gradients retuned by zygote on a common operation (embedding).

@jeremiedb jeremiedb mentioned this pull request Mar 24, 2021
4 tasks
@StefanKarpinski
Copy link
Sponsor Member

bumpity

@StefanKarpinski StefanKarpinski added the status:triage This should be discussed on a triage call label May 7, 2021
@StefanKarpinski
Copy link
Sponsor Member

Marking for triage so discuss if there's actually anything preventing this from being merged.

broadcast_unalias(::Nothing, src) = src

"""
potentially_self_aliased(A)
Copy link
Sponsor Member Author

@mbauman mbauman May 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A year Two years later, I'm not a big fan of this name. I'm guessing I used potentially as an attempt to say that trues are conclusive but falses are not... but I don't think it actually reads that way.

mightalias documents itself as a "conservative" test for sharing the same memory, but I don't even know what that means on its face. Is it more likely to have false positive or a false negative?

In both cases, I used a different word than my usual go-to "maybe_foo" as an attempt to do some work in one direction or the other, but I don't think it was successful because I'm having a hard time piecing together what they mean right now. In short, here's what we have between these two systems:

function returns wrong if...
mightalias true the arrays reference distinct subsets of the same memory region in a complicated manner
mightalias false a custom array has not tracked its internal field(s) with dataids and one of those aliases
potentially_self_aliased true an array goofed its implementation (the default definitions should not have false positives)
potentially_self_aliased false this is the default answer; a self-aliasing array didn't define a method on this obscure function

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Numpy calls this may_have_internal_overlap with tri-state output (known to not overlap, known to overlap, and unknown).

@bkamins
Copy link
Member

bkamins commented May 13, 2021

@mbauman - I think given this change we should add potentially_self_aliased to the Julia manual section on adding broadcasting to custom types (as user defined types might want to decide if they true or false should be returned by this function).

@N5N3
Copy link
Member

N5N3 commented Nov 21, 2021

Something might unrelated:
#43153 founds that x .= f.(x) can't be vectorlized if ndims(x) > 1. (Looks like LLVM failed to prove that LHS and RHS share the same access pattern).
A easy fix is to add ivdep after @simd, but we need to ensure that dest array is not self aliased.
Looks like this PR's potentially_self_aliased could be reused there.

Returns true if multiple locations in `A` reference the same memory
"""
potentially_self_aliased(::DenseArray) = false
potentially_self_aliased(A::StridedArray) = any(==(0), strides(A))
Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:broadcast Applying a function over a collection kind:bugfix This change fixes an existing bug kind:minor change Marginal behavior change acceptable for a minor release needs news A NEWS entry is required for this change status:triage This should be discussed on a triage call
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants