Box{AbstractFloat}() calling convention #368

Closed
ChrisRackauckas opened this issue Jun 21, 2022 · 12 comments

ChrisRackauckas commented Jun 21, 2022

MWE:

using DiffEqFlux, Zygote, DifferentialEquations, LinearAlgebra
k, α, β, γ = 1, 0.1, 0.2, 0.3
tspan = (0.0,10.0)

function dxdt_train(du,u,p,t)
  du[1] = u[2]
  du[2] = -k*u[1] - α*u[1]^3 - β*u[2] - γ*u[2]^3
end

u0 = [1.0,0.0]
ts = collect(0.0:0.1:tspan[2])
prob_train = ODEProblem{true}(dxdt_train,u0,tspan)
data_train = Array(solve(prob_train,Tsit5(),saveat=ts))

A = [LegendreBasis(10), LegendreBasis(10)]
nn = TensorLayer(A, 1)

f = x -> min(30one(x),x)

function dxdt_pred(du,u,p,t)
  du[1] = u[2]
  du[2] = -p[1]*u[1] - p[2]*u[2] + f(nn(u,p[3:end])[1])
end

θ = zeros(102)
prob_pred = ODEProblem(dxdt_pred, u0, tspan, θ)

function predict_adjoint(θ)
  x = Array(solve(prob_pred,Tsit5(),p=θ,saveat=ts,
                  sensealg=InterpolatingAdjoint(autojacvec=ReverseDiffVJP(true))))
end

function loss_adjoint(θ)
  x = predict_adjoint(θ)
  loss = sum(norm.(x - data_train))
  return loss
end

# ReverseDiffVJP is fine
Zygote.gradient(loss_adjoint, θ)[1]

function predict_adjoint(θ)
  x = Array(solve(prob_pred,Tsit5(),p=θ,saveat=ts,
                  sensealg=InterpolatingAdjoint(autojacvec=EnzymeVJP())))
end

Zygote.gradient(loss_adjoint, θ)[1]
Unreachable reached at 00000000023fcb2b

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ILLEGAL_INSTRUCTION at 0x23fcb2b -- macro expansion at C:\Users\accou\.julia\packages\Enzyme\Wanbg\src\compiler.jl:4484 [inlined]
enzyme_call at C:\Users\accou\.julia\packages\Enzyme\Wanbg\src\compiler.jl:4278
in expression starting at REPL[21]:1
macro expansion at C:\Users\accou\.julia\packages\Enzyme\Wanbg\src\compiler.jl:4484 [inlined]
enzyme_call at C:\Users\accou\.julia\packages\Enzyme\Wanbg\src\compiler.jl:4278
unknown function (ip: 00000000023fcb98)
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
do_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\builtins.c:713
AugmentedForwardThunk at C:\Users\accou\.julia\packages\Enzyme\Wanbg\src\compiler.jl:4269
jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
do_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\builtins.c:713
unknown function (ip: 00000000025d10b4)
Allocations: 307234471 (Pool: 307176051; Big: 58420); GC: 133

It might be due to dynamism, but it would be good if this could throw instead of segfaulting so we can catch it (we have this in a try/catch).

@vchuravy
Member

Could you re-run this on a Linux machine? We get better stacktraces there. Ideally this would also be more minimized, and please provide a Manifest.toml.

@DaniGlez

DaniGlez commented Jun 26, 2022

On KDE neon (basically Ubuntu LTS), Julia 1.7.3:

Linux ----- 5.13.0-51-generic #58~20.04.1-Ubuntu SMP Tue Jun 14 11:29:12 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
warning: didn't implement memmove, using memcpy as fallback which can result in errors
Unreachable reached at 0x7f6709b647db

signal (4): Illegal instruction
in expression starting at REPL[23]:1
macro expansion at /home/dani/.julia/packages/Enzyme/Wanbg/src/compiler.jl:4484 [inlined]
enzyme_call at /home/dani/.julia/packages/Enzyme/Wanbg/src/compiler.jl:4278
unknown function (ip: 0x7f6709b64847)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
do_apply at /buildworker/worker/package_linux64/build/src/builtins.c:713
AugmentedForwardThunk at /home/dani/.julia/packages/Enzyme/Wanbg/src/compiler.jl:4269
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
do_apply at /buildworker/worker/package_linux64/build/src/builtins.c:713
unknown function (ip: 0x7f679a61413d)
unknown function (ip: 0x7f679a61a0dd)
Allocations: 414620738 (Pool: 414550957; Big: 69781); GC: 177
Illegal instruction

@wsmoses
Member

wsmoses commented Jun 26, 2022

The code as is doesn't reproduce locally due to an undefined variable. Moreover, can you reduce this to a direct call to Enzyme.autodiff that crashes, rather than having to discover it through the wrappers?

wmoses@beast:~/git/Enzyme.jl (tmpactive) $ ./julia-1.7.2/bin/julia --project sciml.jl
ERROR: LoadError: UndefVarError: prob_pred not defined
Stacktrace:
 [1] top-level scope
   @ /mnt/Data/git/Enzyme.jl/sciml.jl:26
in expression starting at /mnt/Data/git/Enzyme.jl/sciml.jl:26

@ChrisRackauckas
Author

Typo fixed. It should run now. It would take a few weeks to make the reproducer on my end, but it should just be calling autodiff on the dxdt_pred function.

@wsmoses
Member

wsmoses commented Jun 26, 2022

Reduced to remove all of SciML. From a minimal test, I believe this error is due to the type-unstable TensorProductBasis (and you'll eventually hit a missing reverse pass for jl_reshape, which we should also add). Nevertheless, will investigate.

using Enzyme

u0 = [1.0,0.0]

abstract type TensorProductBasis <: Function end

struct LegendreBasis <: TensorProductBasis
    n::Int
end

function legendre_poly(x, p::Integer)
    a::typeof(x) = one(x)
    b::typeof(x) = x

    if p <= 0
        return a
    elseif p == 1
        return b
    end

    for j in 2:p
        a, b = b, ((2j-1)*x*b - (j-1)*a) / j
    end

    b
end

function (basis::LegendreBasis)(x)
    f = k -> legendre_poly(x,k-1)
    return map(f, 1:basis.n)
end


function TL(model, x, p)
    out = 1
    W = reshape(p, out, Int(length(p)/out))
    tensor_prod = model[1](x[1])
    for i in 2:length(model)
        tensor_prod = kron(tensor_prod,model[i](x[i]))
    end
    z = W*tensor_prod
    return z
end

struct MyTensorLayer{P<:AbstractArray}
    model::Array{TensorProductBasis}
    p::P
    in::Int
    out::Int
    function MyTensorLayer(model,out,p=nothing)
        number_of_weights = 1
        for basis in model
            number_of_weights *= basis.n
        end
        if p === nothing
            p = randn(out*number_of_weights)
        end
        new{typeof(p)}(model,p,length(model),out)
    end
end

function fn(layer::MyTensorLayer, x,p)
    model,out = layer.model,layer.out
    W = reshape(p, out, Int(length(p)/out))
    tensor_prod = model[1](x[1])
    for i in 2:length(model)
        tensor_prod = kron(tensor_prod,model[i](x[i]))
    end
    z = W*tensor_prod
    return z
end

const A = [LegendreBasis(10), LegendreBasis(10)]
const nn = MyTensorLayer(A, 1)

function dxdt_pred(u,p)
  fn(nn, u,p[3:end])[1]
end

th = zeros(102)
dp = zero(th)

du0 = zero(u0)

Enzyme.API.printall!(true)

dxdt_pred(u0, th)

Enzyme.autodiff(dxdt_pred, Active{Float64}, Duplicated(u0, du0), Duplicated(th, dp))
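
The type instability mentioned above can be seen without Enzyme. The following is a minimal sketch in plain Julia, not part of the reproducer: `LooseLayer` mirrors the abstractly typed `model::Array{TensorProductBasis}` field, while `TypedLayer` and `first_basis` are hypothetical names introduced only for comparison. Whether a concretely typed field avoids the Enzyme crash is not confirmed in this thread.

```julia
abstract type TensorProductBasis <: Function end

struct LegendreBasis <: TensorProductBasis
    n::Int
end

# Same trivial call overload as in the reduced reproducer.
(basis::LegendreBasis)(x) = x

# Field with an abstract element type, as in the reproducer: the compiler
# cannot know which basis type model[1] holds, so the call is dispatched
# at runtime.
struct LooseLayer
    model::Array{TensorProductBasis}
end

# Hypothetical variant: the element type is a type parameter, so it is
# concrete for any given instance.
struct TypedLayer{B<:TensorProductBasis}
    model::Vector{B}
end

# Hypothetical helper that calls the first basis of either layer type.
first_basis(layer, x) = layer.model[1](x)

loose = LooseLayer([LegendreBasis(10)])
typed = TypedLayer([LegendreBasis(10)])

println(isconcretetype(eltype(loose.model)))  # false
println(isconcretetype(eltype(typed.model)))  # true
println(first_basis(loose, 1.0) == first_basis(typed, 1.0))  # true
```

Both layers compute the same value; the difference is only in what the compiler can infer about `model[1]`, which is what produces the `jl_apply_generic` calls in the IR.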

@wsmoses
Member

wsmoses commented Jun 26, 2022

using Enzyme

abstract type TensorProductBasis <: Function end

struct LegendreBasis <: TensorProductBasis
    n::Int
end

function (basis::LegendreBasis)(x)
    return x
end

struct MyTensorLayer
    model::Array{TensorProductBasis}
end

function fn(layer::MyTensorLayer, x)
    model = layer.model
    return model[1](x)
end

const nn = MyTensorLayer([LegendreBasis(10)])

function dxdt_pred(x)
  return fn(nn, x)
end

Enzyme.API.printall!(true)

dxdt_pred(1.0)

Enzyme.autodiff(dxdt_pred, Active(1.0))
after simplification :
; Function Attrs: willreturn mustprogress
define double @preprocess_julia_dxdt_pred_1454_inner.1(double %0) local_unnamed_addr #6 !dbg !27 {
entry:
  %1 = call {}*** @julia.get_pgcstack() #7
  %2 = call fastcc double @julia_fn_1457([1 x {} addrspace(10)*] addrspace(11)* nocapture noundef nonnull readonly align 32 dereferenceable(8) addrspacecast ([1 x {} addrspace(10)*]* inttoptr (i64 140718397445472 to [1 x {} addrspace(10)*]*) to [1 x {} addrspace(10)*] addrspace(11)*), double %0) #6, !dbg !28
  ret double %2, !dbg !30
}

after simplification :
; Function Attrs: willreturn mustprogress
define internal fastcc double @preprocess_julia_fn_1457([1 x {} addrspace(10)*] addrspace(11)* nocapture noundef nonnull readonly align 32 dereferenceable(8) %0, double %1) unnamed_addr #6 !dbg !35 {
top:
  %2 = call {}*** @julia.get_pgcstack() #7
  %3 = getelementptr inbounds [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(11)* addrspacecast ([1 x {} addrspace(10)*]* inttoptr (i64 140718397445472 to [1 x {} addrspace(10)*]*) to [1 x {} addrspace(10)*] addrspace(11)*), i64 0, i64 0, !dbg !36
  %4 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %3 unordered, align 32, !dbg !36, !tbaa !12, !invariant.load !4, !nonnull !4
  %5 = call cc37 nonnull {} addrspace(10)* bitcast ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32)* @jl_apply_generic to {} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*)*)({} addrspace(10)* addrspacecast ({}* inttoptr (i64 140719969849312 to {}*) to {} addrspace(10)*), {} addrspace(10)* nonnull %4, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 140720128163936 to {}*) to {} addrspace(10)*)) #7, !dbg !38
  %6 = call {} addrspace(10)* @julia.typeof({} addrspace(10)* nonnull %5) #8, !dbg !38
  %.not = icmp eq {} addrspace(10)* %6, addrspacecast ({}* inttoptr (i64 140720121507808 to {}*) to {} addrspace(10)*), !dbg !38
  br i1 %.not, label %L8, label %L6, !dbg !38

L6:                                               ; preds = %top
  %ptls_field3 = getelementptr inbounds {}**, {}*** %2, i64 2305843009213693954, !dbg !38
  %7 = bitcast {}*** %ptls_field3 to i8**, !dbg !38
  %ptls_load45 = load i8*, i8** %7, align 8, !dbg !38, !tbaa !17
  %8 = call noalias nonnull {} addrspace(10)* @julia.gc_alloc_obj(i8* %ptls_load45, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 140719907280288 to {}*) to {} addrspace(10)*)) #9, !dbg !38
  %9 = bitcast {} addrspace(10)* %8 to double addrspace(10)*, !dbg !38
  store double %1, double addrspace(10)* %9, align 8, !dbg !38, !tbaa !19
  %10 = call cc37 nonnull {} addrspace(10)* bitcast ({} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)**, i32)* @jl_apply_generic to {} addrspace(10)* ({} addrspace(10)*, {} addrspace(10)*)*)({} addrspace(10)* nonnull %5, {} addrspace(10)* nonnull %8) #7, !dbg !38
  %11 = bitcast {} addrspace(10)* %10 to double addrspace(10)*
  %12 = load double, double addrspace(10)* %11, align 8, !tbaa !19
  br label %L8, !dbg !38

L8:                                               ; preds = %L6, %top
  %value_phi = phi double [ %12, %L6 ], [ %1, %top ]
  ret double %value_phi, !dbg !38
}

; Function Attrs: willreturn mustprogress
define internal fastcc { double } @diffejulia_fn_1457([1 x {} addrspace(10)*] addrspace(11)* nocapture noundef nonnull readonly align 32 dereferenceable(8) %0, double %1, double %differeturn) unnamed_addr #6 !dbg !57 {
top:
  %"value_phi'de" = alloca double, align 8
  store double 0.000000e+00, double* %"value_phi'de", align 8
  %2 = alloca [3 x {} addrspace(10)*], align 8
  %3 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %2, i64 0, i64 0
  store {} addrspace(10)* null, {} addrspace(10)** %3, align 8
  %4 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %2, i64 0, i64 1
  store {} addrspace(10)* null, {} addrspace(10)** %4, align 8
  %5 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %2, i64 0, i64 2
  store {} addrspace(10)* null, {} addrspace(10)** %5, align 8
  %6 = alloca [2 x {} addrspace(10)*], align 8
  %7 = alloca [2 x {} addrspace(10)*], align 8
  %8 = alloca [2 x i8], align 1
  %9 = alloca [2 x {} addrspace(10)*], align 8
  %10 = alloca [2 x {} addrspace(10)*], align 8
  %11 = alloca [2 x i8], align 1
  %"'de" = alloca double, align 8
  store double 0.000000e+00, double* %"'de", align 8
  %"'de3" = alloca double, align 8
  store double 0.000000e+00, double* %"'de3", align 8
  %"'ip_phi2_cache" = alloca {} addrspace(10)*, align 8
  store {} addrspace(10)* null, {} addrspace(10)** %"'ip_phi2_cache", align 8
  %12 = alloca [3 x {} addrspace(10)*], align 8
  %13 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %12, i64 0, i64 0
  store {} addrspace(10)* null, {} addrspace(10)** %13, align 8
  %14 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %12, i64 0, i64 1
  store {} addrspace(10)* null, {} addrspace(10)** %14, align 8
  %15 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %12, i64 0, i64 2
  store {} addrspace(10)* null, {} addrspace(10)** %15, align 8
  %16 = alloca [1 x {} addrspace(10)*], align 8
  %17 = alloca [1 x {} addrspace(10)*], align 8
  %18 = alloca [1 x i8], align 1
  %19 = alloca [1 x {} addrspace(10)*], align 8
  %20 = alloca [1 x {} addrspace(10)*], align 8
  %21 = alloca [1 x i8], align 1
  %_cache = alloca {} addrspace(10)*, align 8
  store {} addrspace(10)* null, {} addrspace(10)** %_cache, align 8
  %"'mi_cache" = alloca {} addrspace(10)*, align 8
  store {} addrspace(10)* null, {} addrspace(10)** %"'mi_cache", align 8
  %22 = call {}*** @julia.get_pgcstack() #11
  %23 = getelementptr inbounds [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(11)* addrspacecast ([1 x {} addrspace(10)*]* inttoptr (i64 140718397445472 to [1 x {} addrspace(10)*]*) to [1 x {} addrspace(10)*] addrspace(11)*), i64 0, i64 0, !dbg !58
  %24 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %23 unordered, align 32, !dbg !58, !tbaa !30, !invariant.load !4, !nonnull !4
  %25 = getelementptr inbounds [2 x {} addrspace(10)*], [2 x {} addrspace(10)*]* %6, i64 0, i64 0, !dbg !60
  store {} addrspace(10)* %24, {} addrspace(10)** %25, align 8, !dbg !60
  %26 = getelementptr inbounds [2 x i8], [2 x i8]* %8, i64 0, i64 0, !dbg !60
  store i8 0, i8* %26, align 1, !dbg !60
  %27 = getelementptr inbounds [2 x {} addrspace(10)*], [2 x {} addrspace(10)*]* %7, i64 0, i64 0, !dbg !60
  store {} addrspace(10)* null, {} addrspace(10)** %27, align 8, !dbg !60
  %28 = getelementptr inbounds [2 x {} addrspace(10)*], [2 x {} addrspace(10)*]* %6, i64 0, i64 1, !dbg !60
  store {} addrspace(10)* addrspacecast ({}* inttoptr (i64 140720128163936 to {}*) to {} addrspace(10)*), {} addrspace(10)** %28, align 8, !dbg !60
  %29 = getelementptr inbounds [2 x i8], [2 x i8]* %8, i64 0, i64 1, !dbg !60
  store i8 0, i8* %29, align 1, !dbg !60
  %30 = getelementptr inbounds [2 x {} addrspace(10)*], [2 x {} addrspace(10)*]* %7, i64 0, i64 1, !dbg !60
  store {} addrspace(10)* null, {} addrspace(10)** %30, align 8, !dbg !60
  %31 = call token (...) @llvm.julia.gc_preserve_begin({} addrspace(10)* %24, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 140720128163936 to {}*) to {} addrspace(10)*)), !dbg !60
  %32 = ptrtoint [2 x {} addrspace(10)*]* %6 to i64, !dbg !60
  %33 = ptrtoint [2 x {} addrspace(10)*]* %7 to i64, !dbg !60
  %34 = ptrtoint [2 x i8]* %8 to i64, !dbg !60
  call void @julia_runtime_generic_augfwd_2075([3 x {} addrspace(10)*]* %2, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 140719969849312 to {}*) to {} addrspace(10)*), i64 %32, i64 %33, i64 %34, i32 2), !dbg !60
  %35 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %2, i64 0, i64 1, !dbg !60
  %36 = load {} addrspace(10)*, {} addrspace(10)** %35, align 8, !dbg !60
  %37 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %2, i64 0, i64 0, !dbg !60
  %38 = load {} addrspace(10)*, {} addrspace(10)** %37, align 8, !dbg !60
  %39 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %2, i64 0, i64 2, !dbg !60
  %40 = load {} addrspace(10)*, {} addrspace(10)** %39, align 8, !dbg !60
  call void @llvm.julia.gc_preserve_end(token %31), !dbg !60
  %41 = call {} addrspace(10)* @julia.typeof({} addrspace(10)* nonnull %38) #12, !dbg !60
  %.not = icmp eq {} addrspace(10)* %41, addrspacecast ({}* inttoptr (i64 140720121507808 to {}*) to {} addrspace(10)*), !dbg !60
  br i1 %.not, label %L8, label %L6, !dbg !60

L6:                                               ; preds = %top
  %ptls_field3 = getelementptr inbounds {}**, {}*** %22, i64 2305843009213693954, !dbg !60
  %42 = bitcast {}*** %ptls_field3 to i8**, !dbg !60
  %ptls_load45 = load i8*, i8** %42, align 8, !dbg !60, !tbaa !35
  %43 = call noalias nonnull {} addrspace(10)* @julia.gc_alloc_obj(i8* %ptls_load45, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 140719907280288 to {}*) to {} addrspace(10)*)) #13, !dbg !60
  %"'mi" = call noalias nonnull {} addrspace(10)* @julia.gc_alloc_obj(i8* %ptls_load45, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 140719907280288 to {}*) to {} addrspace(10)*)) #13, !dbg !60
  call void ({} addrspace(10)*, ...) @julia.write_barrier({} addrspace(10)* %"'mi"), !dbg !60
  %44 = bitcast {} addrspace(10)* %"'mi" to i8 addrspace(10)*, !dbg !60
  call void @llvm.memset.p10i8.i64(i8 addrspace(10)* nonnull dereferenceable(8) dereferenceable_or_null(8) %44, i8 0, i64 8, i1 false), !dbg !60
  %"'ipc6" = bitcast {} addrspace(10)* %"'mi" to double addrspace(10)*, !dbg !60
  %45 = bitcast {} addrspace(10)* %43 to double addrspace(10)*, !dbg !60
  store double %1, double addrspace(10)* %45, align 8, !dbg !60, !tbaa !37
  store {} addrspace(10)* %"'mi", {} addrspace(10)** %"'mi_cache", align 8, !dbg !60, !invariant.group !61
  store {} addrspace(10)* %43, {} addrspace(10)** %_cache, align 8, !dbg !60, !invariant.group !62
  %46 = getelementptr inbounds [1 x {} addrspace(10)*], [1 x {} addrspace(10)*]* %16, i64 0, i64 0, !dbg !60
  store {} addrspace(10)* %43, {} addrspace(10)** %46, align 8, !dbg !60
  %47 = getelementptr inbounds [1 x i8], [1 x i8]* %18, i64 0, i64 0, !dbg !60
  store i8 1, i8* %47, align 1, !dbg !60
  %48 = getelementptr inbounds [1 x {} addrspace(10)*], [1 x {} addrspace(10)*]* %17, i64 0, i64 0, !dbg !60
  store {} addrspace(10)* %"'mi", {} addrspace(10)** %48, align 8, !dbg !60
  %49 = call token (...) @llvm.julia.gc_preserve_begin({} addrspace(10)* %43, {} addrspace(10)* %"'mi"), !dbg !60
  %50 = ptrtoint [1 x {} addrspace(10)*]* %16 to i64, !dbg !60
  %51 = ptrtoint [1 x {} addrspace(10)*]* %17 to i64, !dbg !60
  %52 = ptrtoint [1 x i8]* %18 to i64, !dbg !60
  call void @julia_runtime_generic_augfwd_2257([3 x {} addrspace(10)*]* %12, {} addrspace(10)* %38, i64 %50, i64 %51, i64 %52, i32 1), !dbg !60
  %53 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %12, i64 0, i64 1, !dbg !60
  %54 = load {} addrspace(10)*, {} addrspace(10)** %53, align 8, !dbg !60
  %55 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %12, i64 0, i64 0, !dbg !60
  %56 = load {} addrspace(10)*, {} addrspace(10)** %55, align 8, !dbg !60
  %57 = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %12, i64 0, i64 2, !dbg !60
  %58 = load {} addrspace(10)*, {} addrspace(10)** %57, align 8, !dbg !60, !invariant.group !63
  call void @llvm.julia.gc_preserve_end(token %49), !dbg !60
  store {} addrspace(10)* %54, {} addrspace(10)** %"'ip_phi2_cache", align 8, !invariant.group !64
  %"'ipc" = bitcast {} addrspace(10)* %54 to double addrspace(10)*
  br label %L8, !dbg !60

L8:                                               ; preds = %L6, %top
  br label %invertL8, !dbg !60

inverttop:                                        ; preds = %invertL8, %invertL6
  %59 = getelementptr inbounds [2 x {} addrspace(10)*], [2 x {} addrspace(10)*]* %9, i64 0, i64 0
  store {} addrspace(10)* %24, {} addrspace(10)** %59, align 8
  %60 = getelementptr inbounds [2 x i8], [2 x i8]* %11, i64 0, i64 0
  store i8 0, i8* %60, align 1
  %61 = getelementptr inbounds [2 x {} addrspace(10)*], [2 x {} addrspace(10)*]* %10, i64 0, i64 0
  store {} addrspace(10)* null, {} addrspace(10)** %61, align 8
  %62 = getelementptr inbounds [2 x {} addrspace(10)*], [2 x {} addrspace(10)*]* %9, i64 0, i64 1
  store {} addrspace(10)* addrspacecast ({}* inttoptr (i64 140720128163936 to {}*) to {} addrspace(10)*), {} addrspace(10)** %62, align 8
  %63 = getelementptr inbounds [2 x i8], [2 x i8]* %11, i64 0, i64 1
  store i8 0, i8* %63, align 1
  %64 = getelementptr inbounds [2 x {} addrspace(10)*], [2 x {} addrspace(10)*]* %10, i64 0, i64 1
  store {} addrspace(10)* null, {} addrspace(10)** %64, align 8
  %65 = call token (...) @llvm.julia.gc_preserve_begin({} addrspace(10)* %24, {} addrspace(10)* addrspacecast ({}* inttoptr (i64 140720128163936 to {}*) to {} addrspace(10)*), {} addrspace(10)* %40)
  %66 = ptrtoint [2 x {} addrspace(10)*]* %9 to i64
  %67 = ptrtoint [2 x {} addrspace(10)*]* %10 to i64
  %68 = ptrtoint [2 x i8]* %11 to i64
  call void @julia_runtime_generic_rev_2211({} addrspace(10)* addrspacecast ({}* inttoptr (i64 140719969849312 to {}*) to {} addrspace(10)*), i64 %66, i64 %67, i64 %68, i32 2, {} addrspace(10)* %40), !dbg !60
  call void @llvm.julia.gc_preserve_end(token %65)
  %69 = load double, double* %"'de", align 8
  %70 = insertvalue { double } undef, double %69, 0
  ret { double } %70

invertL6:                                         ; preds = %invertL8
  %71 = load double, double* %"'de3", align 8
  store double 0.000000e+00, double* %"'de3", align 8
  %72 = load {} addrspace(10)*, {} addrspace(10)** %"'ip_phi2_cache", align 8, !invariant.group !64
  %"'ipc_unwrap" = bitcast {} addrspace(10)* %72 to double addrspace(10)*
  %73 = atomicrmw fadd double addrspace(10)* %"'ipc_unwrap", double %71 monotonic
  %_unwrap = getelementptr inbounds [3 x {} addrspace(10)*], [3 x {} addrspace(10)*]* %12, i64 0, i64 2
  %_unwrap5 = load {} addrspace(10)*, {} addrspace(10)** %_unwrap, align 8, !dbg !60, !invariant.group !63
  %74 = load {} addrspace(10)*, {} addrspace(10)** %_cache, align 8, !invariant.group !62
  %75 = getelementptr inbounds [1 x {} addrspace(10)*], [1 x {} addrspace(10)*]* %19, i64 0, i64 0
  store {} addrspace(10)* %74, {} addrspace(10)** %75, align 8
  %76 = getelementptr inbounds [1 x i8], [1 x i8]* %21, i64 0, i64 0
  store i8 1, i8* %76, align 1
  %77 = load {} addrspace(10)*, {} addrspace(10)** %"'mi_cache", align 8, !invariant.group !61
  %78 = getelementptr inbounds [1 x {} addrspace(10)*], [1 x {} addrspace(10)*]* %20, i64 0, i64 0
  store {} addrspace(10)* %77, {} addrspace(10)** %78, align 8
  %79 = call token (...) @llvm.julia.gc_preserve_begin({} addrspace(10)* %74, {} addrspace(10)* %77, {} addrspace(10)* %_unwrap5)
  %80 = ptrtoint [1 x {} addrspace(10)*]* %19 to i64
  %81 = ptrtoint [1 x {} addrspace(10)*]* %20 to i64
  %82 = ptrtoint [1 x i8]* %21 to i64
  call void @julia_runtime_generic_rev_2287({} addrspace(10)* %38, i64 %80, i64 %81, i64 %82, i32 1, {} addrspace(10)* %_unwrap5), !dbg !60
  call void @llvm.julia.gc_preserve_end(token %79)
  %"'ipc6_unwrap" = bitcast {} addrspace(10)* %77 to double addrspace(10)*
  %83 = load double, double addrspace(10)* %"'ipc6_unwrap", align 8
  store double 0.000000e+00, double addrspace(10)* %"'ipc6_unwrap", align 8
  %84 = load double, double* %"'de", align 8
  %85 = fadd fast double %84, %83
  store double %85, double* %"'de", align 8
  br label %inverttop

invertL8:                                         ; preds = %L8
  store double %differeturn, double* %"value_phi'de", align 8
  %86 = load double, double* %"value_phi'de", align 8
  store double 0.000000e+00, double* %"value_phi'de", align 8
  %87 = xor i1 %.not, true
  %88 = select fast i1 %87, double %86, double 0.000000e+00
  %89 = load double, double* %"'de3", align 8
  %90 = fadd fast double %89, %86
  %91 = select fast i1 %.not, double %89, double %90
  store double %91, double* %"'de3", align 8
  %92 = select fast i1 %.not, double %86, double 0.000000e+00
  %93 = load double, double* %"'de", align 8
  %94 = fadd fast double %93, %86
  %95 = select fast i1 %.not, double %94, double %93
  store double %95, double* %"'de", align 8
  br i1 %.not, label %inverttop, label %invertL6
}

; Function Attrs: willreturn mustprogress
define internal { double } @diffejulia_dxdt_pred_1454_inner.1(double %0, double %differeturn) local_unnamed_addr #6 !dbg !49 {
entry:
  %"'de" = alloca double, align 8
  store double 0.000000e+00, double* %"'de", align 8
  %"'de1" = alloca double, align 8
  store double 0.000000e+00, double* %"'de1", align 8
  br label %invertentry, !dbg !50

invertentry:                                      ; preds = %entry
  store double %differeturn, double* %"'de", align 8
  %1 = load double, double* %"'de", align 8
  %2 = call fastcc { double } @diffejulia_fn_1457([1 x {} addrspace(10)*] addrspace(11)* addrspacecast ([1 x {} addrspace(10)*]* inttoptr (i64 140718397445472 to [1 x {} addrspace(10)*]*) to [1 x {} addrspace(10)*] addrspace(11)*), double %0, double %1), !dbg !51
  %3 = extractvalue { double } %2, 0
  %4 = load double, double* %"'de1", align 8
  %5 = fadd fast double %4, %3
  store double %5, double* %"'de1", align 8
  store double 0.000000e+00, double* %"'de", align 8
  %6 = load double, double* %"'de1", align 8
  %7 = insertvalue { double } undef, double %6, 0
  ret { double } %7
}

after simplification :
; Function Attrs: willreturn mustprogress
define {} addrspace(10)* @preprocess_julia_getindex_2619_inner.1({} addrspace(10)* nonnull align 16 dereferenceable(40) %0, i64 signext %1) local_unnamed_addr #5 !dbg !21 {
entry:
  %2 = call {}*** @julia.get_pgcstack() #6
  %3 = add i64 %1, -1, !dbg !22
  %4 = bitcast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !22
  %5 = addrspacecast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %4 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !22
  %6 = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %5, i64 0, i32 1, !dbg !22
  %7 = load i64, i64 addrspace(11)* %6, align 8, !dbg !22, !tbaa !9, !range !14
  %8 = icmp ult i64 %3, %7, !dbg !22
  br i1 %8, label %idxend.i, label %oob.i, !dbg !22

oob.i:                                            ; preds = %entry
  %malloccall = tail call noalias nonnull dereferenceable(8) dereferenceable_or_null(8) i8* @malloc(i64 8), !enzyme_fromstack !24
  %9 = bitcast i8* %malloccall to i64*
  store i64 %1, i64* %9, align 8, !dbg !22
  %10 = addrspacecast {} addrspace(10)* %0 to {} addrspace(12)*, !dbg !22
  call void @jl_bounds_error_ints({} addrspace(12)* %10, i64* noundef nonnull align 8 %9, i64 noundef 1) #7, !dbg !22
  unreachable, !dbg !22

idxend.i:                                         ; preds = %entry
  %11 = bitcast {} addrspace(10)* %0 to {} addrspace(10)* addrspace(13)* addrspace(10)*, !dbg !22
  %12 = addrspacecast {} addrspace(10)* addrspace(13)* addrspace(10)* %11 to {} addrspace(10)* addrspace(13)* addrspace(11)*, !dbg !22
  %13 = load {} addrspace(10)* addrspace(13)*, {} addrspace(10)* addrspace(13)* addrspace(11)* %12, align 16, !dbg !22, !tbaa !15, !nonnull !4
  %14 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %13, i64 %3, !dbg !22
  %15 = load {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %14, align 8, !dbg !22, !tbaa !17
  %.not = icmp eq {} addrspace(10)* %15, null, !dbg !22
  br i1 %.not, label %fail.i, label %julia_getindex_2619_inner.exit, !dbg !22

fail.i:                                           ; preds = %idxend.i
  call void @jl_throw({} addrspace(12)* noundef addrspacecast ({}* inttoptr (i64 140719959473728 to {}*) to {} addrspace(12)*)) #7, !dbg !22
  unreachable, !dbg !22

julia_getindex_2619_inner.exit:                   ; preds = %idxend.i
  ret {} addrspace(10)* %15, !dbg !25
}

; Function Attrs: willreturn mustprogress
define internal { i8*, {} addrspace(10)* } @augmented_julia_getindex_2619_inner.1({} addrspace(10)* nonnull align 16 dereferenceable(40) %0, i64 signext %1) local_unnamed_addr #5 !dbg !26 {
entry:
  %2 = alloca { i8*, {} addrspace(10)* }, align 8, !dbg !27
  store { i8*, {} addrspace(10)* } zeroinitializer, { i8*, {} addrspace(10)* }* %2, align 8, !dbg !27
  %3 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* }* %2, i32 0, i32 0, !dbg !27
  store i8* null, i8** %3, align 8, !dbg !27
  %4 = add i64 %1, -1, !dbg !27
  %5 = bitcast {} addrspace(10)* %0 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)*, !dbg !27
  %6 = addrspacecast { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(10)* %5 to { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)*, !dbg !27
  %7 = getelementptr inbounds { i8 addrspace(13)*, i64, i16, i16, i32 }, { i8 addrspace(13)*, i64, i16, i16, i32 } addrspace(11)* %6, i64 0, i32 1, !dbg !27
  %8 = load i64, i64 addrspace(11)* %7, align 8, !dbg !27, !tbaa !9, !range !14
  %9 = icmp ult i64 %4, %8, !dbg !27
  br i1 %9, label %idxend.i, label %oob.i, !dbg !27

oob.i:                                            ; preds = %entry
  %malloccall = tail call noalias nonnull dereferenceable(8) dereferenceable_or_null(8) i8* @malloc(i64 8), !enzyme_fromstack !24
  %10 = bitcast i8* %malloccall to i64*
  store i64 %1, i64* %10, align 8, !dbg !27
  %11 = addrspacecast {} addrspace(10)* %0 to {} addrspace(12)*, !dbg !27
  call void @jl_bounds_error_ints({} addrspace(12)* %11, i64* noundef nonnull align 8 %10, i64 noundef 1) #6, !dbg !27
  unreachable, !dbg !27

idxend.i:                                         ; preds = %entry
  %12 = bitcast {} addrspace(10)* %0 to {} addrspace(10)* addrspace(13)* addrspace(10)*, !dbg !27
  %13 = addrspacecast {} addrspace(10)* addrspace(13)* addrspace(10)* %12 to {} addrspace(10)* addrspace(13)* addrspace(11)*, !dbg !27
  %14 = load {} addrspace(10)* addrspace(13)*, {} addrspace(10)* addrspace(13)* addrspace(11)* %13, align 16, !dbg !27, !tbaa !15, !nonnull !4
  %15 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %14, i64 %4, !dbg !27
  %16 = load {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %15, align 8, !dbg !27, !tbaa !17
  %.not = icmp eq {} addrspace(10)* %16, null, !dbg !27
  br i1 %.not, label %fail.i, label %julia_getindex_2619_inner.exit, !dbg !27

fail.i:                                           ; preds = %idxend.i
  call void @jl_throw({} addrspace(12)* noundef addrspacecast ({}* inttoptr (i64 140719959473728 to {}*) to {} addrspace(12)*)) #6, !dbg !27
  unreachable, !dbg !27

julia_getindex_2619_inner.exit:                   ; preds = %idxend.i
  %17 = insertvalue { i8*, {} addrspace(10)* } undef, {} addrspace(10)* %16, 1, !dbg !29
  %18 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* }* %2, i32 0, i32 1, !dbg !29
  store {} addrspace(10)* %16, {} addrspace(10)** %18, align 8, !dbg !29
  %19 = load { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* }* %2, align 8, !dbg !29
  ret { i8*, {} addrspace(10)* } %19, !dbg !29
}

; Function Attrs: willreturn mustprogress
define internal void @diffejulia_getindex_2619_inner.1({} addrspace(10)* nonnull align 16 dereferenceable(40) %0, i64 signext %1, i8* %tapeArg) local_unnamed_addr #5 !dbg !30 {
entry:
  tail call void @free(i8* nonnull %tapeArg)
  br i1 true, label %idxend.i, label %oob.i, !dbg !31

oob.i:                                            ; preds = %entry
  unreachable

idxend.i:                                         ; preds = %entry
  br i1 false, label %fail.i, label %julia_getindex_2619_inner.exit, !dbg !31

fail.i:                                           ; preds = %idxend.i
  unreachable

julia_getindex_2619_inner.exit:                   ; preds = %idxend.i
  br label %invertjulia_getindex_2619_inner.exit, !dbg !33

invertentry:                                      ; preds = %invertidxend.i
  ret void

invertidxend.i:                                   ; preds = %invertjulia_getindex_2619_inner.exit
  br label %invertentry

invertjulia_getindex_2619_inner.exit:             ; preds = %julia_getindex_2619_inner.exit
  br label %invertidxend.i
}

Unreachable reached at 0x7ffb8218890b

signal (4): Illegal instruction
in expression starting at /mnt/Data/git/Enzyme.jl/sciml2.jl:32
macro expansion at /mnt/Data/git/Enzyme.jl/src/compiler.jl:4484 [inlined]
enzyme_call at /mnt/Data/git/Enzyme.jl/src/compiler.jl:4278
unknown function (ip: 0x7ffb82188977)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
do_apply at /buildworker/worker/package_linux64/build/src/builtins.c:713
AugmentedForwardThunk at /mnt/Data/git/Enzyme.jl/src/compiler.jl:4269
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
do_apply at /buildworker/worker/package_linux64/build/src/builtins.c:713
unknown function (ip: 0x7ffbe677dcbd)
unknown function (ip: 0x7ffbe677d390)
Allocations: 39502973 (Pool: 39485359; Big: 17614); GC: 42
Illegal instruction (core dumped)

@ChrisRackauckas
Author

There might be some dynamism, yes. The main issue here isn't that it doesn't work (I didn't expect it to); it's that it crashed Julia. In the real use case this was done in a try/catch to check Enzyme compatibility, but the way the error is currently raised makes it unrecoverable.
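
A minimal sketch of that kind of compatibility probe, for illustration only. The helper name `probe_succeeds` and the stand-in closures are hypothetical and not part of Enzyme or SciML. The point it shows is that `try`/`catch` can only intercept Julia exceptions, so an illegal-instruction signal like the one in the log above ends the process before any `catch` block runs.

```julia
# Hypothetical helper (not part of Enzyme or SciML): run a zero-argument
# closure and report whether it finished without throwing.
function probe_succeeds(probe)
    try
        probe()
        return true
    catch
        # Only Julia exceptions arrive here. A fatal signal such as the
        # "Illegal instruction" in this issue kills the process instead.
        return false
    end
end

# Stand-in probes; in the issue's setting the closure would wrap the
# `Zygote.gradient(loss_adjoint, θ)` call that uses `EnzymeVJP()`.
ok  = probe_succeeds(() -> sum(abs2, [1.0, 2.0]))
bad = probe_succeeds(() -> error("unsupported"))
```

With a failure that surfaces as a thrown error, `bad` is simply `false` and the caller can fall back to another VJP choice such as `ReverseDiffVJP(true)`.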

@wsmoses
Member

wsmoses commented Jun 26, 2022

I haven't fully reduced or investigated this yet, but my guess is that the dynamism causes split mode to be used internally, which triggers a bad interaction with the garbage collector.

If that is indeed the case (I still need to investigate to be sure), then for you to be able to check in advance, we need one of the following:

  1. Make Enzyme and the Julia GC fully work together (this has been on the task list, but we haven't gotten to it).
  2. Throw an error for scenarios that fall outside Enzyme's present GC support, i.e. that store Julia GC objects in Enzyme's cache (or vice versa) in unsupported ways.
  3. Error early when in a regime (such as type-unstable code) that is likely to lead to not-yet-supported Enzyme+GC interactions.
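
As a rough illustration of what an early check for type-unstable code could look like, here is a sketch using Julia's inference results. The helper name `infers_concretely` is hypothetical, and nothing in this thread says Enzyme performs its check this way. It is only a heuristic: a concrete return type does not rule out dynamic dispatch inside the function body.

```julia
# Hypothetical pre-check (an assumption for illustration, not Enzyme's API):
# report whether inference finds a single concrete return type.
function infers_concretely(f, argtypes::Type{<:Tuple})
    rts = Base.return_types(f, argtypes)
    return length(rts) == 1 && isconcretetype(only(rts))
end

stable(x) = x + 1.0
unstable(x) = x > 0 ? 1 : 1.0   # returns an Int or a Float64 depending on the value

stable_ok   = infers_concretely(stable, Tuple{Float64})
unstable_ok = infers_concretely(unstable, Tuple{Float64})
```

A caller could throw an ordinary Julia error when such a check fails, which keeps the failure catchable instead of reaching generated code.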

@wsmoses
Member

wsmoses commented Jun 26, 2022

Never mind, seems to be something in the calling convention.

@vchuravy
Member

OK, 0.10.3 will return an error here instead of crashing.

@vchuravy
Member

@wsmoses

  • Enzyme.jl/test/runtests.jl

    Lines 1019 to 1025 in 9f456aa

    # @test J_r_1(A, x) == [
    # 1.0 1.0 0.0 0.0 0.0 0.0;
    # 1.0 0.0 1.0 0.0 0.0 0.0;
    # 1.0 0.0 0.0 1.0 0.0 0.0;
    # 1.0 0.0 0.0 0.0 1.0 0.0;
    # 1.0 0.0 0.0 0.0 0.0 1.0;
    # ]
  • @test_throws Core.UndefRefError Enzyme.autodiff(Reverse, invsin, Active, Duplicated(x, dx))

Both of these also hit this case, where they hit a null pointer, but our tests indicated it was correct.

@vchuravy vchuravy changed the title Segfault: Unreachable reached at 00000000023fcb2b in DiffEqFlux Box{Int}() calling convention Jun 27, 2022
@vchuravy vchuravy changed the title Box{Int}() calling convention Box{AbstractFloat}() calling convention Jun 27, 2022
@vchuravy
Member

Box is dead, long live sret
