Skip to content

Commit

Permalink
RFC: Effect Preconditions - or - the future of @inbounds
Browse files Browse the repository at this point in the history
This proposal is an attempt to tie together some of the recent discussion around
the future of `@inbounds`, formal interface checking (a long the lines of
my proposal in https://github.com/Keno/InterfaceSpecs.jl), and `--check-bounds`.

# Reviewing `@inbounds`

## History
The `@inbounds` annotation has been in the langauge for a decade at this point [1].
It is a crictical tool to achieve machine performance in high-performance applications
where the overhead of boundschecking is prohibitive [2]. At the same time, as Julia
has matured, the dangers of this macro have become more clear. It is essentially
impossible to use correctly in generic code and instances where code (including
code in base from before we were culturally more careful about inbounds) are
using `@inbounds` have become a not-uncommon complaint-point about the language
as a whole [3]. In current practice, at least in Base, we have become extremely
hesitant about the introduction of new `@inbounds` annotations (see e.g.
discussion in #41200). At the same time, the ability to disable bounds checking
remains a critical capability for several of our user communities (c.f. #50239
and related discussions). So how do we move forward here? I'm hoping we can find
a way forward that achieves the same benefits of `@inbounds` but in a way where
it is possible to re-establish soundness if desired.

## When is inbounds still required?

### LLVM's current capabilities

Let's look at a couple of examples. First:
```
function sum1(A::Vector{Int64})
	a = zero(eltype(A))
	for i in eachindex(A)
		a += A[i]
	end
	return a
end
```

Is inbounds required here in modern julia? The answer is no - LLVM's range analysis
proves that the bounds check is dead and eliminates it.
(The same holds true for the `1:length(A)` version, although this was not always true).

What about the following:
```
function sum2(A::Vector{Int64}, B::Vector{Int64})
	a = zero(eltype(A))
	for i in eachindex(A, B)
		a += A[i] + B[i]
	end
	return a
end
```

Here, LLVM is again able to eliminate the boundscheck, though of course the
`eachindex` introduces an additional check that the array shapes are compatible.
Even this is still ok-ish:

```
# Noinline here just for illustration, think some big internal function that is
# not inlineable.
@noinline function sum_partial(A::Vector{Int64}, upto::Int)
	a = zero(eltype(A))
	for i in 1:upto
		a += A[i]
	end
	return a
end
sum3(A::Vector{Int64}) = sum_partial(A, length(A))
```

The generated code here runs through the loop. LLVM's vectorizer has support for bounds-like
checks and can remove them from the vector body. However, in scalar code (e.g. Float64
without fastmath), LLVM does still perform the bounds check in every loop iteration rather
than hoisting it outside. This is probably a bug somewhere. The IRCE pass is supposed to
take care of this, so for the present purposes, let's assume that this will also go through
eventually.

That said, the takeaway here is that, for simple cases, where everything is inlined, LLVM
is reasonably powerful at eliminating these checks.

### The effect-separation trick

Let's consider a case like this
```
function repeat_outer(a::AbstractMatrix, (m,n)::NTuple{2, Any})
    o, p = size(a,1), size(a,2)
    b = similar(a, o*m, p*n)
    for j=1:n
        d = (j-1)*p+1
        R = d:d+p-1
        for i=1:m
            c = (i-1)*o+1
            b[c:c+o-1, R] = a
        end
    end
    return b
end
```

The setindex! call in here goes through a few layers of dispatch, but eventually
ends up at:

```
function _setindex!(l::IndexStyle, A::AbstractArray, x, I::Union{Real, AbstractArray}...)
    @inline
    @BoundsCheck checkbounds(A, I...)
    _unsafe_setindex!(l, _maybe_reshape(l, A, I...), x, I...)
    A
end
```

This is a very common pattern in our code, where we have an `@inline` function that
first checks for the in-boundedness and then performs an unsafe operation that assumes
everything is inbounds.

This pattern is quite good. By allowing the boundscheck itself to be inlined, LLVM can
eliminate the boundscheck using local context from the calling function (in theory
at least - in practice we still have an `@inbounds` there because LLVM isn't strong
enough currently).

However, the pattern is somewhat hard to use consistently and correctly, so we generally
only use it in the low level indexing code.

# Effect-precondition synthesis

## `@inbounds` motivations

In the previous section I said that the effect-separation was good but hard
to do consistently and correctly. So can we have the compiler implement this
for us. Suppose we had a macro `@hoist_boundschecks` that worked something like
the following:

```
function _setindex!(l::IndexStyle, A::AbstractArray, x, I::Union{Real, AbstractArray}...)
    @inline
    @hoist_boundscheck _safe_setindex!(l, _maybe_reshape(l, A, I...), x, I...)
    A
end
# = Expands to =#
function _setindex!(l::IndexStyle, A::AbstractArray, x, I::Union{Real, AbstractArray}...)
    @inline
    #= All the boundschecks from _safe_setindex! hoisted =# checkbounds(A, I...)
    #= Switched to unsafe_setindex, but now proven correct =#
    _unsafe_setindex!(l, _maybe_reshape(l, A, I...), x, I...)
    A
end
```

Where essentially what the macro would do is go through the `_safe_setindex!` call
and synthesize a check that ensures that all calls are inbounds. Of course
such a synthesis is not always possible in general, but it's not that hard for simple
cases (a trivial way to do it is to duplicate the function, delete all the side
effects other than throwing boundserrors and then let LLVM go to town on it).

## Generalizing

There is a few problems with the macro as suggested in the previous section. In particular,
it changes the ordering of side effects and errors in the function, which may be desirable,
but I would like have explicit control over. My proposal is to have a slight generalization
of it that works roughly as follows:

```
function foo(A, I)
	@split_effects :nothrow bar(A, I)
end
#= Expands to =#
function foo(A, I)
	if precondition(Val{:nothrow}(), bar, A, I)
		@assume_effects :nothrow bar(A, I)
	else
		bar(A, I)
	end
end
```

Where `precondition` like the proposed macro above is some synthesized predicate
that ensures that the resulting function does not throw. Of course this brings up
many questions, such as the implementation of `precondition` and the statement-position
`@assume_effects`, which we currently don't have, but let's explore the implications
of this a bit further.

## Precondition inference and composition

The question of coming up with these kinds of preconditions is a well-known problem in the
compiler literature. The term to look for is "weakest precondition synthesis". How exactly
do this best is somewhat outside the scope of this writeup, but I do want to highlight
a the primary property that suggests the idea: functional composition of functions is boolean
composition of preconditions:

```
function foo()
	bar()
	baz()
end

precondition(n::Val{:nothrow}, foo) = precondition(n, bar) && precondition(n, baz)
```

Also, if a function uses the `@split_effects` macro internally, then an outside
`@assume_effects` can cause the precondition to be assumed to be true. This mirrors
the `@inbounds/@boundscheck` interaction we have right now, but with an alternative
that is provably safe.

# Extensions to `@assume_effects`

So far, we have assumed that these preconditions are synthesized automatically, but
getting this to work well, of course depends on the synthesis strength of the compiler.
To still allow users to take advantage of this mechanism, even if the compiler's synthesis
is not strong enough, we can extend @assume_effects to allow conditional effects:

```
@assume_effects (checkbounds(false, A, I) && :nothrow) setindex!(A, I)
	_safe_setindex!(A, I)
end
```

The idea here is that precondition is overriden by `checkbounds(false, A, I)`, so
any `@split_effects` of this function would use the `checkbounds` function for its
check and if this returned true, could `@assume_effects :nothrow` this function,
which as described above would allow the use of the unsafe function in the interior.

## Call-site `@assume_effects` and effects-assumption specialization

In the foregoing we have, in a few places, used a call-site `@assume_effects`, without
defining what it does. The idea here is pretty simple: We add a new specializaiton
axis based on effects assumption. For example, if a function is `@assume_effects :nothrow`,
then at codegen time, any path that throws an error (in particular bounds checks)
becomes undefined behavior and LLVM is allowed to assume that it does not happen.
Of course, this is macro is extremely dangerous (as the existing @assume_effects
and @inbounds are).

However, one significant benefit of this design is that there is a very clear notion of
what the assertion promises. This is not necessarily clear of `@inbounds`, which we
did not give semantics beyond it's effect on the `@boundscheck` macro. As a result,
even an infinitely-powerful prover could not check the correctness or non-correctness
of `@inbounds` (as currenlty used - we could of course consider an alternative @inbounds
design with stronger semantics). In contrast, the formal meaning of a conditional
`@assume_effects` is well defined and could in principle be checked by a tool (#49273).

# Implications for `--check-bounds`

In the whole, `--check-bounds` removal discussion, we had assumed that we did not want
to keep two copies of all code just for the purposes of `--check-bounds` which thus
required us disable constant propagation. However, in this proposal, the `@assume_effects`
specialization is exactly such a copy set and could thus be used for this purpose.
That said, this proposal would also hopefully provide a much stronger system for boundscheck
removal that would allow us to make `--check-bounds` much less necessary.

# Other uses

There are a few other places where domain checks can interfere with optimization.
For exmaple, consider the following situation:

```
function sin_loop(n)
	for i = 1:n
		# Imagine there was a whole bunch of other code here that used this value,
		# but all that got DCE'd, but we can't in general DCE `sin`, because it
		# may throw.
		sin(Float64(i))
	end
end
```
```
julia> @time sin_loop(1_000_000_000)
 20.383584 seconds
```

With the support in this PR, we can:
```
# The actual condition here is `!isinf`, but we're allowed to overapproximate and
# LLVM can handle `isfinite` better.
# TODO: In a more complete version of this PR, this precondition would be synthesized
@Base.assume_effects (isfinite(x) && :nothrow) @noinline function mysin(x::Float64)
	sin(x)
end

function sin_loop_split(n)
	for i = 1:n
		Core.invoke_split_effects(:nothrow, mysin, Float64(i))
	end
end
```
```
julia> @code_llvm sin_loop_split(1_000_000_000)
;  @ REPL[19]:1 within `sin_loop_split`
define void @julia_sin_loop_split_428(i64 signext %"n::Int64") #0 {
top:
;  @ REPL[19]:4 within `sin_loop_split`
  ret void
}

julia> @time sin_loop_split(1_000_000_000)
  0.000000 seconds
```

# Current status of this PR
This PR contains a complete sketch of this mechanism, including inference support
for the new intrinsic, as well as codegen and runtime support for effect-assumption
specialization. It also includes an (extremely incomplete) sketch of the supporting
macros. It does not implement any precondition synthesis logic. My plan is to support
synthesis for some of the simple `@inbounds` cases in Base, but then delagate fancier
synthesis support for packages, since that can get arbitrarily complicated.

# Implementation Plan

This PR itself is not suitable for merging, but if people like this direction, I would
like to get the basic pieces in in relatively short order. To that end, I would suggest
the following order of implementation as separate PRs once we've agreed on the overall plan:

1. New intrinsics, Method(Instance) fields, inference support
2. @assume_effects extensions
3. Codegen and specialization support, Code(Instance) fields
4. Basic Synthesis and `@inbounds` removal

[1] Introduced in 66ab577 for julia 0.2
[2] Note that it's usually not the boundschecking itself that is the problem,
    but rather that the presence of the boundschecking inhibits other optimizations.
[3] E.g. https://yuri.is/not-julia/
  • Loading branch information
Keno committed Jul 23, 2023
1 parent 6691a75 commit 63db2b0
Show file tree
Hide file tree
Showing 27 changed files with 505 additions and 217 deletions.
2 changes: 1 addition & 1 deletion base/abstractarraymath.jl
Original file line number Diff line number Diff line change
Expand Up @@ -468,7 +468,7 @@ function repeat_outer(a::AbstractMatrix, (m,n)::NTuple{2, Any})
R = d:d+p-1
for i=1:m
c = (i-1)*o+1
@inbounds b[c:c+o-1, R] = a
b[c:c+o-1, R] = a
end
end
return b
Expand Down
8 changes: 4 additions & 4 deletions base/boot.jl
Original file line number Diff line number Diff line change
Expand Up @@ -199,7 +199,7 @@ export
# type reflection
<:, typeof, isa, typeassert,
# method reflection
applicable, invoke,
applicable, invoke, invoke_split_effects,
# constants
nothing, Main

Expand Down Expand Up @@ -439,12 +439,12 @@ function CodeInstance(
mi::MethodInstance, @nospecialize(rettype), @nospecialize(inferred_const),
@nospecialize(inferred), const_flags::Int32, min_world::UInt, max_world::UInt,
ipo_effects::UInt32, effects::UInt32, @nospecialize(argescapes#=::Union{Nothing,Vector{ArgEscapeInfo}}=#),
relocatability::UInt8)
relocatability::UInt8, effect_assumptions::UInt32)
return ccall(:jl_new_codeinst, Ref{CodeInstance},
(Any, Any, Any, Any, Int32, UInt, UInt, UInt32, UInt32, Any, UInt8),
(Any, Any, Any, Any, Int32, UInt, UInt, UInt32, UInt32, Any, UInt8, UInt32),
mi, rettype, inferred_const, inferred, const_flags, min_world, max_world,
ipo_effects, effects, argescapes,
relocatability)
relocatability, effect_assumptions)
end
GlobalRef(m::Module, s::Symbol) = ccall(:jl_module_globalref, Ref{GlobalRef}, (Any, Any), m, s)
Module(name::Symbol=:anonymous, std_imports::Bool=true, default_names::Bool=true) = ccall(:jl_f_new_module, Ref{Module}, (Any, Bool, Bool), name, std_imports, default_names)
Expand Down
37 changes: 37 additions & 0 deletions base/compiler/abstractinterpretation.jl
Original file line number Diff line number Diff line change
Expand Up @@ -1969,6 +1969,43 @@ function abstract_call_known(interp::AbstractInterpreter, @nospecialize(f),
return abstract_apply(interp, argtypes, si, sv, max_methods)
elseif f === invoke
return abstract_invoke(interp, arginfo, si, sv)
elseif f === invoke_split_effects
# First perform inference as usual. We may not need to use the precondition logic
# at all, (e.g. if we get a constant).
new_arginfo = ArgInfo(arginfo.fargs === nothing ? nothing : arginfo.fargs[3:end],
arginfo.argtypes[3:end])
r = abstract_call(interp, new_arginfo, si, sv, max_methods)
# Find applicable preconditions that we may be able to use
which = arginfo.argtypes[2]
if isa(which, Const)
if isa(r.info, MethodMatchInfo) && length(r.info.results) == 1 && which.val === :nothrow
# Check for user-defined preconditions
match = r.info.results[1]
m = match.method

for i = 1:2:length(m.preconditions)
i + 1 <= length(m.preconditions) || break

cond = m.preconditions[1]
check = m.preconditions[2]

isa(cond, EffectsOverride) || continue

# TODO: Should depend on `which`
cond.nothrow || continue

check_arginfo = ArgInfo(nothing,
Any[Const(check), arginfo.argtypes[3:end]...])
rcheck = abstract_call(interp, check_arginfo, StmtInfo(true), sv, max_methods)
rcheck.rt === Bool || continue
is_foldable_nothrow(rcheck.effects) || continue

# TODO: What to do if multiple preconditions match
return CallMeta(r.rt, r.effects, InvokeSplitEffectsInfo(r.info, check, cond, rcheck.info))
end
end
end
return r
elseif f === modifyfield!
return abstract_modifyfield!(interp, argtypes, si, sv)
elseif f === Core.finalizer
Expand Down
4 changes: 2 additions & 2 deletions base/compiler/cicache.jl
Original file line number Diff line number Diff line change
Expand Up @@ -49,11 +49,11 @@ WorldView(wvc::WorldView, wr::WorldRange) = WorldView(wvc.cache, wr)
WorldView(wvc::WorldView, args...) = WorldView(wvc.cache, args...)

function haskey(wvc::WorldView{InternalCodeCache}, mi::MethodInstance)
return ccall(:jl_rettype_inferred, Any, (Any, UInt, UInt), mi, first(wvc.worlds), last(wvc.worlds)) !== nothing
return ccall(:jl_rettype_inferred, Any, (Any, UInt, UInt, UInt32), mi, first(wvc.worlds), last(wvc.worlds), 0) !== nothing
end

function get(wvc::WorldView{InternalCodeCache}, mi::MethodInstance, default)
r = ccall(:jl_rettype_inferred, Any, (Any, UInt, UInt), mi, first(wvc.worlds), last(wvc.worlds))
r = ccall(:jl_rettype_inferred, Any, (Any, UInt, UInt, UInt32), mi, first(wvc.worlds), last(wvc.worlds), 0)
if r === nothing
return default
end
Expand Down
1 change: 1 addition & 0 deletions base/compiler/effects.jl
Original file line number Diff line number Diff line change
Expand Up @@ -240,6 +240,7 @@ struct EffectsOverride
notaskstate::Bool
inaccessiblememonly::Bool
end
EffectsOverride() = decode_effects_override(0x00)

function encode_effects_override(eo::EffectsOverride)
e = 0x00
Expand Down

0 comments on commit 63db2b0

Please sign in to comment.