Potential gradient issues with Flux chains when changing parameter type #533

Closed
ChrisRackauckas opened this issue Jun 23, 2022 · 21 comments

ChrisRackauckas commented Jun 23, 2022

MWE:

using DiffEqFlux, Flux, Zygote, NeuralPDE, ModelingToolkit, DomainSets, Optimization, OptimizationFlux, Test

@parameters x y
@variables u(..)
Dxx = Differential(x)^2
Dyy = Differential(y)^2

# 2D PDE
eq  = Dxx(u(x,y)) + Dyy(u(x,y)) ~ -sin(pi*x)*sin(pi*y)

# Initial and boundary conditions
bcs = [u(0,y) ~ 0.0, u(1,y) ~ -sin(pi*1)*sin(pi*y),
       u(x,0) ~ 0.0, u(x,1) ~ -sin(pi*x)*sin(pi*1)]
# Space and time domains
domains = [x ∈ Interval(0.0,1.0),
           y ∈ Interval(0.0,1.0)]

@named pde_system = PDESystem(eq,bcs,domains,[x,y],[u(x, y)])

fastchain = FastChain(FastDense(2,12,Flux.σ),FastDense(12,12,Flux.σ),FastDense(12,1))
fluxchain = Chain(Dense(2,12,Flux.σ),Dense(12,12,Flux.σ),Dense(12,1))
initθ = Float64.(DiffEqFlux.initial_params(fastchain))
grid_strategy = NeuralPDE.GridTraining(0.1)

p,re = Flux.destructure(fluxchain)

discretization1 = NeuralPDE.PhysicsInformedNN(fastchain,
                                             grid_strategy;
                                             init_params = initθ)

discretization2 = NeuralPDE.PhysicsInformedNN(fluxchain,
                                             grid_strategy;
                                             init_params = initθ)


prob1 = NeuralPDE.discretize(pde_system,discretization1)
prob2 = NeuralPDE.discretize(pde_system,discretization2)
sym_prob = NeuralPDE.symbolic_discretize(pde_system,discretization1)

Zygote.gradient((x)->prob1.f(x,nothing),initθ)
Zygote.gradient((x)->prob2.f(x,nothing),initθ) # Very very different???

function callback(p,l)
    @show l
    false
end
res = Optimization.solve(prob1, ADAM(0.1); callback=callback,maxiters=1000)
phi = discretization1.phi

xs,ys = [infimum(d.domain):0.01:supremum(d.domain) for d in domains]
analytic_sol_func(x,y) = (sin(pi*x)*sin(pi*y))/(2pi^2)

u_predict = reshape([first(phi([x,y],res.minimizer)) for x in xs for y in ys],(length(xs),length(ys)))
u_real = reshape([analytic_sol_func(x,y) for x in xs for y in ys], (length(xs),length(ys)))
diff_u = abs.(u_predict .- u_real)

@show maximum(abs2,u_predict - u_real)
@test u_predict ≈ u_real atol = 2.0

res = Optimization.solve(prob2, ADAM(0.1); callback=callback,maxiters=1000)
phi = discretization2.phi

xs,ys = [infimum(d.domain):0.01:supremum(d.domain) for d in domains]
analytic_sol_func(x,y) = (sin(pi*x)*sin(pi*y))/(2pi^2)

u_predict = reshape([first(phi([x,y],res.minimizer)) for x in xs for y in ys],(length(xs),length(ys)))
u_real = reshape([analytic_sol_func(x,y) for x in xs for y in ys], (length(xs),length(ys)))
diff_u = abs.(u_predict .- u_real)

@show maximum(abs2,u_predict - u_real)
@test u_predict ≈ u_real atol = 2.0

Note that the fluxchain case fails and its gradient is off.
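
For reference, here is a minimal sketch of quantifying that difference, assuming prob1, prob2, and initθ from the MWE above (the tolerance is arbitrary):

g1 = Zygote.gradient(x -> prob1.f(x, nothing), initθ)[1]
g2 = Zygote.gradient(x -> prob2.f(x, nothing), initθ)[1]
@show maximum(abs, g1 .- g2)          # large: the Chain + Float64 gradient is far off
@show isapprox(g1, g2; rtol = 1e-3)   # false in the failing case; tolerance is arbitrary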

ChrisRackauckas commented Jun 23, 2022

Here's a simplification:

using DiffEqFlux, Flux, Zygote, NeuralPDE, ModelingToolkit, DomainSets

@parameters x y
@variables u(..)
Dxx = Differential(x)^2
Dyy = Differential(y)^2

# 2D PDE
eq  = Dxx(u(x,y)) + Dyy(u(x,y)) ~ -sin(pi*x)*sin(pi*y)

# Initial and boundary conditions
bcs = [u(0,y) ~ 0.0, u(1,y) ~ -sin(pi*1)*sin(pi*y),
       u(x,0) ~ 0.0, u(x,1) ~ -sin(pi*x)*sin(pi*1)]
# Space and time domains
domains = [x ∈ Interval(0.0,1.0),
           y ∈ Interval(0.0,1.0)]

@named pde_system = PDESystem(eq,bcs,domains,[x,y],[u(x, y)])

fastchain = FastChain(FastDense(2,12,Flux.σ),FastDense(12,12,Flux.σ),FastDense(12,1))
fluxchain = Chain(Dense(2,12,Flux.σ),Dense(12,12,Flux.σ),Dense(12,1))
initθ = Float64.(DiffEqFlux.initial_params(fastchain))
grid_strategy = NeuralPDE.GridTraining(0.1)

discretization1 = NeuralPDE.PhysicsInformedNN(fastchain,
                                             grid_strategy;
                                             init_params = initθ)

discretization2 = NeuralPDE.PhysicsInformedNN(fluxchain,
                                             grid_strategy;
                                             init_params = initθ)


prob1 = NeuralPDE.discretize(pde_system,discretization1)
prob2 = NeuralPDE.discretize(pde_system,discretization2)
sym_prob = NeuralPDE.symbolic_discretize(pde_system,discretization1)

Zygote.gradient((x)->prob1.f(x,nothing),initθ)
Zygote.gradient((x)->prob2.f(x,nothing),initθ) # Very very different???

## Fixed

p, re = Flux.destructure(fluxchain)  # re is needed for the custom phi below
initθ = DiffEqFlux.initial_params(fastchain)
grid_strategy = NeuralPDE.GridTraining(0.1)

discretization1 = NeuralPDE.PhysicsInformedNN(fastchain,
                                             grid_strategy;
                                             init_params = initθ)

discretization2 = NeuralPDE.PhysicsInformedNN(fluxchain,
                                             grid_strategy;
                                             init_params = initθ,
                                             phi = (x,p)->re(p)(x))


prob1 = NeuralPDE.discretize(pde_system,discretization1)
prob2 = NeuralPDE.discretize(pde_system,discretization2)
sym_prob = NeuralPDE.symbolic_discretize(pde_system,discretization1)

Zygote.gradient((x)->prob1.f(x,nothing),initθ)
Zygote.gradient((x)->prob2.f(x,nothing),initθ) # it's fine now!

Notice that the gradients are incorrect with initθ = Float64.(DiffEqFlux.initial_params(fastchain)): one case is very different from the other three. FastChain with Float32, FastChain with Float64, and Chain with Float32 all match; Chain with Float64 is very different and is the only one that doesn't train correctly.
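
A minimal sketch of that four-way comparison, assuming the pde_system, fastchain, fluxchain, and grid_strategy defined above (the tolerance is arbitrary):

initθ32 = DiffEqFlux.initial_params(fastchain)    # Float32
initθ64 = Float64.(initθ32)

grads = map([(fastchain, initθ32), (fastchain, initθ64),
             (fluxchain, initθ32), (fluxchain, initθ64)]) do (chain, θ)
    disc = NeuralPDE.PhysicsInformedNN(chain, grid_strategy; init_params = θ)
    prob = NeuralPDE.discretize(pde_system, disc)
    Zygote.gradient(x -> prob.f(x, nothing), θ)[1]
end

# The first three agree up to Float32 precision; only (Chain, Float64) is far off.
[isapprox(g, grads[1]; rtol = 1e-3) for g in grads]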

ChrisRackauckas commented:

I thought it would be a simple destructure/restructure bug with floating-point types, but isolating it failed:

using DiffEqFlux, Flux, Adapt, Zygote

fastchain = FastChain(FastDense(2,12,Flux.σ),FastDense(12,12,Flux.σ),FastDense(12,1))
fluxchain = Chain(Dense(2,12,Flux.σ),Dense(12,12,Flux.σ),Dense(12,1))
initθ = DiffEqFlux.initial_params(fastchain)

p,re = Flux.destructure(fluxchain)
x = Float32[1.5,0.5]
dx1,dp1 = Zygote.gradient((x,p)->sum(fastchain(adapt(Array,x),p)),x,initθ)
dx2,dp2 = Zygote.gradient((x,p)->sum(re(p)(adapt(Array,x))),x,initθ)

dx1 ≈ dx2 # true
dp1 ≈ dp2 # true

initθ = Float64.(DiffEqFlux.initial_params(fastchain))
x = Float64[1.5,0.5]
dx3,dp3 = Zygote.gradient((x,p)->sum(fastchain(x,p)),x,initθ)
dx4,dp4 = Zygote.gradient((x,p)->sum(re(p)(x)),x,initθ)

dx3 ≈ dx1 # true
dx4 ≈ dx1 # true
dp3 ≈ dp1 # true
dp4 ≈ dp1 # true

ChrisRackauckas commented:

But it goes away if I apply f64 to the fluxchain:

using DiffEqFlux, Flux, Zygote, NeuralPDE, ModelingToolkit, DomainSets

@parameters x y
@variables u(..)
Dxx = Differential(x)^2
Dyy = Differential(y)^2

# 2D PDE
eq  = Dxx(u(x,y)) + Dyy(u(x,y)) ~ -sin(pi*x)*sin(pi*y)

# Initial and boundary conditions
bcs = [u(0,y) ~ 0.0, u(1,y) ~ -sin(pi*1)*sin(pi*y),
       u(x,0) ~ 0.0, u(x,1) ~ -sin(pi*x)*sin(pi*1)]
# Space and time domains
domains = [x ∈ Interval(0.0,1.0),
           y ∈ Interval(0.0,1.0)]

@named pde_system = PDESystem(eq,bcs,domains,[x,y],[u(x, y)])

fastchain = FastChain(FastDense(2,12,Flux.σ),FastDense(12,12,Flux.σ),FastDense(12,1))
fluxchain = Chain(Dense(2,12,Flux.σ),Dense(12,12,Flux.σ),Dense(12,1)) |> f64
initθ = Float64.(DiffEqFlux.initial_params(fastchain))
grid_strategy = NeuralPDE.GridTraining(0.1)

discretization1 = NeuralPDE.PhysicsInformedNN(fastchain,
                                             grid_strategy;
                                             init_params = initθ)

discretization2 = NeuralPDE.PhysicsInformedNN(fluxchain,
                                             grid_strategy;
                                             init_params = initθ)


prob1 = NeuralPDE.discretize(pde_system,discretization1)
prob2 = NeuralPDE.discretize(pde_system,discretization2)

Zygote.gradient((x)->prob1.f(x,nothing),initθ)
Zygote.gradient((x)->prob2.f(x,nothing),initθ) # it's fine now!

ChrisRackauckas commented:

We worked around it here by changing the number type ourselves, so NeuralPDE is safe, but @CarloLucibello @mcabbott this is a pretty dangerous bug to have lurking around. Have you considered merging @DhairyaLGandhi's branch https://github.com/FluxML/Optimisers.jl/tree/dg/noproject and adding a test to catch this in the future?
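
For context, the workaround amounts to making the chain's parameter eltype match the flat vector before destructuring. A minimal sketch of that idea (not necessarily the exact code used in NeuralPDE):

# Promote the Flux chain to Float64 so the restructured parameters keep the
# same eltype as a Float64 flat vector instead of being projected back to Float32.
fluxchain64 = fluxchain |> f64
p64, re64 = Flux.destructure(fluxchain64)
eltype(p64)   # Float64, so re64(θ64) no longer changes the parameter eltype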

mcabbott commented:

I have not tried to reproduce this, but this change, FluxML/Optimisers.jl@9c61c8a, looks like it ought to let you make an MWE, or at least to figure out what types are actually involved here. It does not look safe to merge.
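
For anyone following along, the eltype behaviour in question is the ProjectTo call inside Optimisers._getat. A minimal standalone sketch of just that projection behaviour (an illustration, not the Optimisers code itself):

using ChainRulesCore

y    = rand(Float32, 3)    # stands in for a Float32 weight array in the model
flat = rand(Float64, 3)    # stands in for a slice of a Float64 flat vector

proj = ProjectTo(y)
proj(flat)                 # returns a Float32 array: the projection restores y's eltype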

ChrisRackauckas commented:

My attempts at an MWE failed, but maybe @DhairyaLGandhi found a nicer one. I think you need a map right after the restructure or something, but 🤷 my isolations all worked, so it's something rather specific.
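
A hedged sketch of the kind of isolation that suggests, assuming p and re from the earlier Flux.destructure(fluxchain) call (this particular attempt is not known to reproduce the bug):

θ64 = Float64.(p)
xs  = [Float32[0.1, 0.2], Float32[0.3, 0.4]]

Zygote.gradient(θ64) do θ
    m = re(θ)                         # restructure from a Float64 flat vector
    sum(sum, map(x -> m(x), xs))      # map applied right after the restructure
end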

mcabbott commented Jun 29, 2022

Can you tell me what this prints, and also post the stacktrace somewhere?

julia> using Optimisers

julia> @eval Optimisers begin
       function _getat(y::AbstractArray, o::Int, flat::AbstractVector)
          res = ProjectTo(y)(reshape(flat[o .+ (1:length(y))], axes(y)))
          if eltype(res) != eltype(y)
            @info "found one" summary(y) summary(flat) summary(res)
          end
          res
       end
       end
_getat (generic function with 2 methods)

julia> using DiffEqFlux, Flux, NeuralPDE, ModelingToolkit, DomainSets

So far my attempt to install everything is a bit stuck... failed on 1.7 and 1.9.

Master:

Precompiling project...
  ✗ Cassette
  ✓ Functors
  ✗ ArrayInterfaceOffsetArrays
  ✓ Optimisers
  ✗ SciMLSensitivity
  ✗ DiffEqSensitivity
  ✗ DiffEqFlux
  ✗ NeuralPDE
  160 dependencies successfully precompiled in 616 seconds. 91 already precompiled.
  2 dependencies precompiled but different versions are currently loaded. Restart julia to access the new versions
  6 dependencies errored. To see a full report either run `import Pkg; Pkg.precompile()` or load the packages
[ Info: Precompiling DiffEqFlux [aae7a2af-3d4f-5e19-a356-7da93b79d9d0]
Internal error: encountered unexpected error in runtime:
AssertionError(msg="argextype only works on argument-position values")
argextype at ./compiler/optimize.jl:371
argextype at ./compiler/optimize.jl:353 [inlined]
argextype at ./compiler/optimize.jl:353 [inlined]
stmt_effect_flags at ./compiler/optimize.jl:220
finish at ./compiler/optimize.jl:433
optimize at ./compiler/optimize.jl:523 [inlined]
_typeinf at ./compiler/typeinfer.jl:257
typeinf at ./compiler/typeinfer.jl:213
typeinf_edge at ./compiler/typeinfer.jl:914
abstract_call_method at ./compiler/abstractinterpretation.jl:605
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:166
abstract_call_known at ./compiler/abstractinterpretation.jl:1744
abstract_call at ./compiler/abstractinterpretation.jl:1813
abstract_call at ./compiler/abstractinterpretation.jl:1792
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1934
abstract_eval_basic_statement at ./compiler/abstractinterpretation.jl:2284
typeinf_local at ./compiler/abstractinterpretation.jl:2461
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2559
_typeinf at ./compiler/typeinfer.jl:230
typeinf at ./compiler/typeinfer.jl:213
typeinf_ext at ./compiler/typeinfer.jl:1038
typeinf_ext_toplevel at ./compiler/typeinfer.jl:1071
typeinf_ext_toplevel at ./compiler/typeinfer.jl:1067
jfptr_typeinf_ext_toplevel_11266 at /Users/me/.julia/dev/julia/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/me/.julia/dev/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/me/.julia/dev/julia/src/gf.c:2575
jl_apply at /Users/me/.julia/dev/julia/src/./julia.h:1845 [inlined]
jl_type_infer at /Users/me/.julia/dev/julia/src/gf.c:317
jl_generate_fptr_impl at /Users/me/.julia/dev/julia/src/jitlayers.cpp:345
jl_compile_method_internal at /Users/me/.julia/dev/julia/src/gf.c:2107
_jl_invoke at /Users/me/.julia/dev/julia/src/gf.c:2385 [inlined]
ijl_apply_generic at /Users/me/.julia/dev/julia/src/gf.c:2575
jl_apply at /Users/me/.julia/dev/julia/src/./julia.h:1845 [inlined]
jl_module_run_initializer at /Users/me/.julia/dev/julia/src/toplevel.c:75
ijl_init_restored_modules at /Users/me/.julia/dev/julia/src/dump.c:2597
_include_from_serialized at ./loading.jl:875
_require_search_from_serialized at ./loading.jl:990
_require_search_from_serialized at ./loading.jl:956 [inlined]
_require at ./loading.jl:1274
_require_prelocked at ./loading.jl:1170
macro expansion at ./loading.jl:1150 [inlined]
macro expansion at ./lock.jl:267 [inlined]
require at ./loading.jl:1114
jfptr_require_33158 at /Users/me/.julia/dev/julia/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/me/.julia/dev/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/me/.julia/dev/julia/src/gf.c:2575
jl_apply at /Users/me/.julia/dev/julia/src/./julia.h:1845 [inlined]
call_require at /Users/me/.julia/dev/julia/src/toplevel.c:466 [inlined]
eval_import_path at /Users/me/.julia/dev/julia/src/toplevel.c:503
jl_toplevel_eval_flex at /Users/me/.julia/dev/julia/src/toplevel.c:784
jl_eval_module_expr at /Users/me/.julia/dev/julia/src/toplevel.c:203 [inlined]
jl_toplevel_eval_flex at /Users/me/.julia/dev/julia/src/toplevel.c:715
jl_toplevel_eval_flex at /Users/me/.julia/dev/julia/src/toplevel.c:856
ijl_toplevel_eval at /Users/me/.julia/dev/julia/src/toplevel.c:921 [inlined]
ijl_toplevel_eval_in at /Users/me/.julia/dev/julia/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1375
_jl_invoke at /Users/me/.julia/dev/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/me/.julia/dev/julia/src/gf.c:2575
_include at ./loading.jl:1435
include at ./Base.jl:418 [inlined]
include_package_for_output at ./loading.jl:1501
jfptr_include_package_for_output_48100 at /Users/me/.julia/dev/julia/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/me/.julia/dev/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/me/.julia/dev/julia/src/gf.c:2575
jl_apply at /Users/me/.julia/dev/julia/src/./julia.h:1845 [inlined]
do_call at /Users/me/.julia/dev/julia/src/interpreter.c:126
eval_body at /Users/me/.julia/dev/julia/src/interpreter.c:0
jl_interpret_toplevel_thunk at /Users/me/.julia/dev/julia/src/interpreter.c:750
top-level scope at stdin:1
jl_toplevel_eval_flex at /Users/me/.julia/dev/julia/src/toplevel.c:912
jl_toplevel_eval_flex at /Users/me/.julia/dev/julia/src/toplevel.c:856
ijl_toplevel_eval at /Users/me/.julia/dev/julia/src/toplevel.c:921 [inlined]
ijl_toplevel_eval_in at /Users/me/.julia/dev/julia/src/toplevel.c:971
jlplt_ijl_toplevel_eval_in_16886 at /Users/me/.julia/dev/julia/usr/lib/julia/sys.dylib (unknown line)
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1375
include_string at ./loading.jl:1385
_jl_invoke at /Users/me/.julia/dev/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/me/.julia/dev/julia/src/gf.c:2575
exec_options at ./client.jl:297
_start at ./client.jl:516
jfptr__start_34472 at /Users/me/.julia/dev/julia/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/me/.julia/dev/julia/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/me/.julia/dev/julia/src/gf.c:2575
jl_apply at /Users/me/.julia/dev/julia/src/./julia.h:1845 [inlined]
true_main at /Users/me/.julia/dev/julia/src/jlapi.c:567
jl_repl_entrypoint at /Users/me/.julia/dev/julia/src/jlapi.c:711
ERROR: LoadError: MethodError: no method matching Core.LineInfoNode(::Module, ::Symbol, ::Symbol, ::Int32, ::Int64)

Closest candidates are:
  Core.LineInfoNode(::Module, ::Any, ::Symbol, ::Int32, ::Int32)
   @ Core boot.jl:413

Stacktrace:
  [1] verbose_lineinfo!(ci::Core.CodeInfo, sig::Type{<:Tuple})
    @ Cassette ~/.julia/packages/Cassette/34vIw/src/overdub.jl:61
  [2] reflect(sigtypes::Tuple, world::UInt64)
    @ Cassette ~/.julia/packages/Cassette/34vIw/src/overdub.jl:122
  [3] reflect(sigtypes::Tuple)
    @ Cassette ~/.julia/packages/Cassette/34vIw/src/overdub.jl:87
  [4] top-level scope
    @ ~/.julia/packages/Cassette/34vIw/src/overdub.jl:589
  [5] include(mod::Module, _path::String)
    @ Base ./Base.jl:418
  [6] include(x::String)
    @ Cassette ~/.julia/packages/Cassette/34vIw/src/Cassette.jl:1
  [7] top-level scope
    @ ~/.julia/packages/Cassette/34vIw/src/Cassette.jl:8
  [8] include
    @ ./Base.jl:418 [inlined]
  [9] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::String)
    @ Base ./loading.jl:1501
 [10] top-level scope
    @ stdin:1
in expression starting at /Users/me/.julia/packages/Cassette/34vIw/src/overdub.jl:588
in expression starting at /Users/me/.julia/packages/Cassette/34vIw/src/Cassette.jl:1
in expression starting at stdin:1
ERROR: LoadError: Failed to precompile Cassette [7057c7e9-c182-5462-911a-8362d720325c] to /Users/me/.julia/compiled/v1.9/Cassette/jl_fsE6zS.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, ignore_loaded_modules::Bool)
    @ Base ./loading.jl:1652
  [3] compilecache
    @ ./loading.jl:1596 [inlined]
  [4] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1297

1.7:

Precompiling project...
  ✓ Parsers
  ✓ Compat
  ✓ ChainRulesCore
  ✓ JSON
  ✓ Optimisers
  164 dependencies successfully precompiled in 476 seconds (92 already precompiled)
  5 dependencies precompiled but different versions are currently loaded. Restart julia to access the new versions
[ Info: Precompiling DiffEqFlux [aae7a2af-3d4f-5e19-a356-7da93b79d9d0]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing DiffEqFlux [aae7a2af-3d4f-5e19-a356-7da93b79d9d0].
[ Info: Precompiling DataInterpolations [82cc6244-b520-54b8-b5a6-8a565e85f1d0]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing DataInterpolations [82cc6244-b520-54b8-b5a6-8a565e85f1d0].
[ Info: Precompiling RecursiveArrayTools [731186ca-8d62-57ce-b412-fbd966d074cd]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing RecursiveArrayTools [731186ca-8d62-57ce-b412-fbd966d074cd].
[ Info: Precompiling Optim [429524aa-4258-5aef-a3af-852621145aeb]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing Optim [429524aa-4258-5aef-a3af-852621145aeb].
[ Info: Precompiling NLSolversBase [d41bc354-129a-5804-8e4c-c37616107c6c]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing NLSolversBase [d41bc354-129a-5804-8e4c-c37616107c6c].
[ Info: Precompiling ForwardDiff [f6369f11-7733-5829-9624-2563aa707210]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing ForwardDiff [f6369f11-7733-5829-9624-2563aa707210].
[ Info: Precompiling SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b].
[ Info: Precompiling LogExpFunctions [2ab3a3ac-af41-5b50-aa03-7779005ae688]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing LogExpFunctions [2ab3a3ac-af41-5b50-aa03-7779005ae688].
[ Info: Precompiling ChangesOfVariables [9e997f8a-9a97-42d5-a9f1-ce6bfc15e2c0]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing ChangesOfVariables [9e997f8a-9a97-42d5-a9f1-ce6bfc15e2c0].
[ Info: Precompiling LineSearches [d3d80556-e9d4-5f37-9878-2ab0fcc64255]
┌ Warning: Module NLSolversBase with build ID 3770152153823577 is missing from the cache.
│ This may mean NLSolversBase [d41bc354-129a-5804-8e4c-c37616107c6c] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing LineSearches [d3d80556-e9d4-5f37-9878-2ab0fcc64255].
[ Info: Precompiling StatsBase [2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91]
┌ Warning: Module LogExpFunctions with build ID 3770155251217172 is missing from the cache.
│ This may mean LogExpFunctions [2ab3a3ac-af41-5b50-aa03-7779005ae688] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing StatsBase [2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91].
[ Info: Precompiling Symbolics [0c5d862f-8b57-4792-8d23-62f2024744c7]
┌ Warning: Module SpecialFunctions with build ID 3770154908074587 is missing from the cache.
│ This may mean SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module SpecialFunctions with build ID 3770154908074587 is missing from the cache.
│ This may mean SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing Symbolics [0c5d862f-8b57-4792-8d23-62f2024744c7].
[ Info: Precompiling SymbolicUtils [d1185830-fcd6-423d-90d6-eec64667417b]
┌ Warning: Module SpecialFunctions with build ID 3770154908074587 is missing from the cache.
│ This may mean SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing SymbolicUtils [d1185830-fcd6-423d-90d6-eec64667417b].
[ Info: Precompiling MultivariatePolynomials [102ac46a-7ee4-5c85-9060-abc95bfdeaa3]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing MultivariatePolynomials [102ac46a-7ee4-5c85-9060-abc95bfdeaa3].
[ Info: Precompiling DynamicPolynomials [7c1d4256-1411-5781-91ec-d7bc3513ac07]
┌ Warning: Module MultivariatePolynomials with build ID 3770195189133000 is missing from the cache.
│ This may mean MultivariatePolynomials [102ac46a-7ee4-5c85-9060-abc95bfdeaa3] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing DynamicPolynomials [7c1d4256-1411-5781-91ec-d7bc3513ac07].
[ Info: Precompiling LabelledArrays [2ee39098-c373-598a-b85f-a56591580800]
┌ Warning: Module RecursiveArrayTools with build ID 3770138868277356 is missing from the cache.
│ This may mean RecursiveArrayTools [731186ca-8d62-57ce-b412-fbd966d074cd] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing LabelledArrays [2ee39098-c373-598a-b85f-a56591580800].
[ Info: Precompiling PreallocationTools [d236fae5-4411-538c-8e31-a6e3d9e00b46]
┌ Warning: Module ForwardDiff with build ID 3770154505727790 is missing from the cache.
│ This may mean ForwardDiff [f6369f11-7733-5829-9624-2563aa707210] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing PreallocationTools [d236fae5-4411-538c-8e31-a6e3d9e00b46].
[ Info: Precompiling SciMLBase [0bca4576-84f4-4d90-8ffe-ffa030f20462]
┌ Warning: Module RecursiveArrayTools with build ID 3770138868277356 is missing from the cache.
│ This may mean RecursiveArrayTools [731186ca-8d62-57ce-b412-fbd966d074cd] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing SciMLBase [0bca4576-84f4-4d90-8ffe-ffa030f20462].
[ Info: Precompiling Groebner [0b43b601-686d-58a3-8a1c-6623616c7cd4]
┌ Warning: Module MultivariatePolynomials with build ID 3770195189133000 is missing from the cache.
│ This may mean MultivariatePolynomials [102ac46a-7ee4-5c85-9060-abc95bfdeaa3] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing Groebner [0b43b601-686d-58a3-8a1c-6623616c7cd4].
[ Info: Precompiling Distributions [31c24e10-a181-5473-b8eb-7969acd0382f]
┌ Warning: Module StatsBase with build ID 3770167353669730 is missing from the cache.
│ This may mean StatsBase [2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing Distributions [31c24e10-a181-5473-b8eb-7969acd0382f].
[ Info: Precompiling StatsFuns [4c63d2b9-4356-54db-8cca-17b64c39e42c]
┌ Warning: Module SpecialFunctions with build ID 3770154908074587 is missing from the cache.
│ This may mean SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing StatsFuns [4c63d2b9-4356-54db-8cca-17b64c39e42c].
[ Info: Precompiling HypergeometricFunctions [34004b35-14d8-5ef3-9330-4cdb6864b03a]
┌ Warning: Module SpecialFunctions with build ID 3770154908074587 is missing from the cache.
│ This may mean SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module SpecialFunctions with build ID 3770154908074587 is missing from the cache.
│ This may mean SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing HypergeometricFunctions [34004b35-14d8-5ef3-9330-4cdb6864b03a].
[ Info: Precompiling DualNumbers [fa6b7ba4-c1ee-5f82-b5fc-ecf0adba8f74]
┌ Warning: Module SpecialFunctions with build ID 3770154908074587 is missing from the cache.
│ This may mean SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing DualNumbers [fa6b7ba4-c1ee-5f82-b5fc-ecf0adba8f74].
[ Info: Precompiling RegularizationTools [29dad682-9a27-4bc3-9c72-016788665182]
┌ Warning: Module Optim with build ID 3770149024957368 is missing from the cache.
│ This may mean Optim [429524aa-4258-5aef-a3af-852621145aeb] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing RegularizationTools [29dad682-9a27-4bc3-9c72-016788665182].
[ Info: Precompiling LeastSquaresOptim [0fc2ff8b-aaa3-5acd-a817-1944a5e08891]
┌ Warning: Module ForwardDiff with build ID 3770154505727790 is missing from the cache.
│ This may mean ForwardDiff [f6369f11-7733-5829-9624-2563aa707210] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing LeastSquaresOptim [0fc2ff8b-aaa3-5acd-a817-1944a5e08891].
[ Info: Precompiling DiffEqBase [2b5f629d-d688-5b77-993f-72d75c75574e]
┌ Warning: Module ForwardDiff with build ID 3770154505727790 is missing from the cache.
│ This may mean ForwardDiff [f6369f11-7733-5829-9624-2563aa707210] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ForwardDiff with build ID 3770154505727790 is missing from the cache.
│ This may mean ForwardDiff [f6369f11-7733-5829-9624-2563aa707210] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing DiffEqBase [2b5f629d-d688-5b77-993f-72d75c75574e].
[ Info: Precompiling NonlinearSolve [8913a72c-1f9b-4ce2-8d82-65094dcecaec]
┌ Warning: Module ForwardDiff with build ID 3770154505727790 is missing from the cache.
│ This may mean ForwardDiff [f6369f11-7733-5829-9624-2563aa707210] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing NonlinearSolve [8913a72c-1f9b-4ce2-8d82-65094dcecaec].
[ Info: Precompiling RecursiveFactorization [f2c3362d-daeb-58d1-803e-2bc74f2840b4]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing RecursiveFactorization [f2c3362d-daeb-58d1-803e-2bc74f2840b4].
[ Info: Precompiling LoopVectorization [bdcacae8-1622-11e9-2a5c-532679323890]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing LoopVectorization [bdcacae8-1622-11e9-2a5c-532679323890].
[ Info: Precompiling SIMDDualNumbers [3cdde19b-5bb0-4aaf-8931-af3e248e098b]
┌ Warning: Module ForwardDiff with build ID 3770154505727790 is missing from the cache.
│ This may mean ForwardDiff [f6369f11-7733-5829-9624-2563aa707210] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing SIMDDualNumbers [3cdde19b-5bb0-4aaf-8931-af3e248e098b].
[ Info: Precompiling TriangularSolve [d5829a12-d9aa-46ab-831f-fb7c9ab06edf]
┌ Warning: Module LoopVectorization with build ID 3770324638283087 is missing from the cache.
│ This may mean LoopVectorization [bdcacae8-1622-11e9-2a5c-532679323890] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing TriangularSolve [d5829a12-d9aa-46ab-831f-fb7c9ab06edf].
[ Info: Precompiling Polyester [f517fe37-dbe3-4b94-8317-1923a5111588]
[ Info: Precompiling FastBroadcast [7034ab61-46d4-4ed7-9d0f-46aef9175898]
[ Info: Precompiling SciMLSensitivity [1ed8b502-d754-442c-8d5d-10ac956f44a1]
┌ Warning: Module DiffEqBase with build ID 3770257209934706 is missing from the cache.
│ This may mean DiffEqBase [2b5f629d-d688-5b77-993f-72d75c75574e] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing SciMLSensitivity [1ed8b502-d754-442c-8d5d-10ac956f44a1].
[ Info: Precompiling Tracker [9f7883ad-71c0-57eb-9f7f-b5c9e6d3789c]
┌ Warning: Module LogExpFunctions with build ID 3770155251217172 is missing from the cache.
│ This may mean LogExpFunctions [2ab3a3ac-af41-5b50-aa03-7779005ae688] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing Tracker [9f7883ad-71c0-57eb-9f7f-b5c9e6d3789c].
[ Info: Precompiling NNlib [872c559c-99b0-510c-b3b7-b6c96a88d5cd]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing NNlib [872c559c-99b0-510c-b3b7-b6c96a88d5cd].
[ Info: Precompiling DiffEqCallbacks [459566f4-90b8-5000-8ac3-15dfb0a30def]
┌ Warning: Module DiffEqBase with build ID 3770257209934706 is missing from the cache.
│ This may mean DiffEqBase [2b5f629d-d688-5b77-993f-72d75c75574e] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing DiffEqCallbacks [459566f4-90b8-5000-8ac3-15dfb0a30def].
[ Info: Precompiling NLsolve [2774e3e8-f4cf-5e23-947b-6d7e65073b56]
┌ Warning: Module NLSolversBase with build ID 3770152153823577 is missing from the cache.
│ This may mean NLSolversBase [d41bc354-129a-5804-8e4c-c37616107c6c] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing NLsolve [2774e3e8-f4cf-5e23-947b-6d7e65073b56].
[ Info: Precompiling DiffEqOperators [9fdde737-9c7f-55bf-ade8-46b3f136cc48]
┌ Warning: Module DiffEqBase with build ID 3770257209934706 is missing from the cache.
│ This may mean DiffEqBase [2b5f629d-d688-5b77-993f-72d75c75574e] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing DiffEqOperators [9fdde737-9c7f-55bf-ade8-46b3f136cc48].
[ Info: Precompiling SparseDiffTools [47a9eef4-7e08-11e9-0b38-333d64bd3804]
┌ Warning: Module ForwardDiff with build ID 3770154505727790 is missing from the cache.
│ This may mean ForwardDiff [f6369f11-7733-5829-9624-2563aa707210] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing SparseDiffTools [47a9eef4-7e08-11e9-0b38-333d64bd3804].
[ Info: Precompiling Graphs [86223c79-3864-5bf0-83f7-82e725a168b6]
[ Info: Precompiling VertexSafeGraphs [19fa3120-7c27-5ec5-8db8-b0b0aa330d6f]
[ Info: Precompiling LinearSolve [7ed4a6bd-45f5-4d41-b270-4a48e9bafcae]
┌ Warning: Module RecursiveFactorization with build ID 3770309849301419 is missing from the cache.
│ This may mean RecursiveFactorization [f2c3362d-daeb-58d1-803e-2bc74f2840b4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing LinearSolve [7ed4a6bd-45f5-4d41-b270-4a48e9bafcae].
[ Info: Precompiling StochasticDiffEq [789caeaf-c7a9-5a7d-9973-96adeb23e2a0]
┌ Warning: Module DiffEqBase with build ID 3770257209934706 is missing from the cache.
│ This may mean DiffEqBase [2b5f629d-d688-5b77-993f-72d75c75574e] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing StochasticDiffEq [789caeaf-c7a9-5a7d-9973-96adeb23e2a0].
[ Info: Precompiling OrdinaryDiffEq [1dea7af3-3e70-54e6-95c3-0bf5283fa5ed]
┌ Warning: Module DiffEqBase with build ID 3770257209934706 is missing from the cache.
│ This may mean DiffEqBase [2b5f629d-d688-5b77-993f-72d75c75574e] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing OrdinaryDiffEq [1dea7af3-3e70-54e6-95c3-0bf5283fa5ed].
[ Info: Precompiling DiffEqNoiseProcess [77a26b50-5914-5dd7-bc55-306e6241c503]
┌ Warning: Module DiffEqBase with build ID 3770257209934706 is missing from the cache.
│ This may mean DiffEqBase [2b5f629d-d688-5b77-993f-72d75c75574e] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing DiffEqNoiseProcess [77a26b50-5914-5dd7-bc55-306e6241c503].
[ Info: Precompiling LevyArea [2d8b4e74-eb68-11e8-0fb9-d5eb67b50637]
┌ Warning: Module SpecialFunctions with build ID 3770154908074587 is missing from the cache.
│ This may mean SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing LevyArea [2d8b4e74-eb68-11e8-0fb9-d5eb67b50637].
[ Info: Precompiling DiffEqJump [c894b116-72e5-5b58-be3c-e6d8d4ac2b12]
┌ Warning: Module DiffEqBase with build ID 3770257209934706 is missing from the cache.
│ This may mean DiffEqBase [2b5f629d-d688-5b77-993f-72d75c75574e] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing DiffEqJump [c894b116-72e5-5b58-be3c-e6d8d4ac2b12].
[ Info: Precompiling Zygote [e88e6eb3-aa80-5325-afca-941959d7151f]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing Zygote [e88e6eb3-aa80-5325-afca-941959d7151f].
[ Info: Precompiling ChainRules [082447d4-558c-5d27-93f4-14fc19e9eca2]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing ChainRules [082447d4-558c-5d27-93f4-14fc19e9eca2].
[ Info: Precompiling AbstractFFTs [621f4979-c628-5d54-868e-fcf4e3e8185c]
┌ Warning: Module ChainRulesCore with build ID 3769598302557001 is missing from the cache.
│ This may mean ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing AbstractFFTs [621f4979-c628-5d54-868e-fcf4e3e8185c].
[ Info: Precompiling ReverseDiff [37e2e3b7-166d-5795-8a7a-e32c996b4267]
┌ Warning: Module SpecialFunctions with build ID 3770154908074587 is missing from the cache.
│ This may mean SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing ReverseDiff [37e2e3b7-166d-5795-8a7a-e32c996b4267].
[ Info: Precompiling ArrayInterfaceTracker [a2b0951a-f94f-4742-8780-617792921f9b]
┌ Warning: Module Tracker with build ID 3770379671654784 is missing from the cache.
│ This may mean Tracker [9f7883ad-71c0-57eb-9f7f-b5c9e6d3789c] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing ArrayInterfaceTracker [a2b0951a-f94f-4742-8780-617792921f9b].
[ Info: Precompiling EllipsisNotation [da5c29d0-fa7d-589e-88eb-ea29b0a81949]
[ Info: Precompiling DistributionsAD [ced4e74d-a319-5a8a-b0ac-84af2272839c]
┌ Warning: Module Distributions with build ID 3770231665151821 is missing from the cache.
│ This may mean Distributions [31c24e10-a181-5473-b8eb-7969acd0382f] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing DistributionsAD [ced4e74d-a319-5a8a-b0ac-84af2272839c].
[ Info: Precompiling Optimization [7f7a1694-90dd-40f0-9382-eb1efda571ba]
┌ Warning: Module SciMLBase with build ID 3770211295630514 is missing from the cache.
│ This may mean SciMLBase [0bca4576-84f4-4d90-8ffe-ffa030f20462] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing Optimization [7f7a1694-90dd-40f0-9382-eb1efda571ba].
[ Info: Precompiling OptimizationPolyalgorithms [500b13db-7e66-49ce-bda4-eed966be6282]
┌ Warning: Module Optimization with build ID 3770636058657594 is missing from the cache.
│ This may mean Optimization [7f7a1694-90dd-40f0-9382-eb1efda571ba] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing OptimizationPolyalgorithms [500b13db-7e66-49ce-bda4-eed966be6282].
[ Info: Precompiling OptimizationOptimJL [36348300-93cb-4f02-beb5-3c3902f8871e]
┌ Warning: Module Optimization with build ID 3770636058657594 is missing from the cache.
│ This may mean Optimization [7f7a1694-90dd-40f0-9382-eb1efda571ba] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing OptimizationOptimJL [36348300-93cb-4f02-beb5-3c3902f8871e].
[ Info: Precompiling OptimizationOptimisers [42dfb2eb-d2b4-4451-abcd-913932933ac1]
┌ Warning: Module Optimization with build ID 3770636058657594 is missing from the cache.
│ This may mean Optimization [7f7a1694-90dd-40f0-9382-eb1efda571ba] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing OptimizationOptimisers [42dfb2eb-d2b4-4451-abcd-913932933ac1].
[ Info: Precompiling OptimizationFlux [253f991c-a7b2-45f8-8852-8b9a9df78a86]
┌ Warning: Module Optimization with build ID 3770636058657594 is missing from the cache.
│ This may mean Optimization [7f7a1694-90dd-40f0-9382-eb1efda571ba] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing OptimizationFlux [253f991c-a7b2-45f8-8852-8b9a9df78a86].
[ Info: Precompiling Flux [587475ba-b771-5e3f-ad9e-33799f191a9c]
┌ Warning: Module SpecialFunctions with build ID 3770154908074587 is missing from the cache.
│ This may mean SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing Flux [587475ba-b771-5e3f-ad9e-33799f191a9c].
[ Info: Precompiling MLUtils [f1d291b0-491e-4a28-83b9-f70985020b54]
┌ Warning: Module StatsBase with build ID 3770167353669730 is missing from the cache.
│ This may mean StatsBase [2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing MLUtils [f1d291b0-491e-4a28-83b9-f70985020b54].
[ Info: Precompiling CUDA [052768ef-5323-5732-b1bb-66c8b64840ba]
┌ Warning: Module AbstractFFTs with build ID 3770601370522812 is missing from the cache.
│ This may mean AbstractFFTs [621f4979-c628-5d54-868e-fcf4e3e8185c] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1107
[ Info: Skipping precompilation since __precompile__(false). Importing CUDA [052768ef-5323-5732-b1bb-66c8b64840ba].
ERROR: LoadError: InitError: BoundsError: attempt to access 3-element Vector{Any} at index [4]
Stacktrace:
  [1] getindex
    @ ./array.jl:861 [inlined]
  [2] iterate(iter::JuliaInterpreter.ExprSplitter, state::Nothing)
    @ JuliaInterpreter ~/.julia/packages/JuliaInterpreter/Hrjsr/src/construct.jl:529
  [3] parse_source!(mod_exprs_sigs::OrderedCollections.OrderedDict{Module, OrderedCollections.OrderedDict{Revise.RelocatableExpr, Union{Nothing, Vector{Any}}}}, src::String, filename::String, mod::Module; mode::Symbol)
    @ Revise ~/.julia/packages/Revise/jHTGK/src/parsing.jl:72
  [4] parse_source!
    @ ~/.julia/packages/Revise/jHTGK/src/parsing.jl:39 [inlined]
  [5] parse_source!(mod_exprs_sigs::OrderedCollections.OrderedDict{Module, OrderedCollections.OrderedDict{Revise.RelocatableExpr, Union{Nothing, Vector{Any}}}}, filename::String, mod::Module; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Revise ~/.julia/packages/Revise/jHTGK/src/parsing.jl:27
  [6] parse_source!
    @ ~/.julia/packages/Revise/jHTGK/src/parsing.jl:23 [inlined]
  [7] #parse_source#11
    @ ~/.julia/packages/Revise/jHTGK/src/parsing.jl:10 [inlined]
  [8] parse_source
    @ ~/.julia/packages/Revise/jHTGK/src/parsing.jl:10 [inlined]
  [9] queue_includes!(pkgdata::Revise.PkgData, id::Base.PkgId)
    @ Revise ~/.julia/packages/Revise/jHTGK/src/pkgs.jl:51
 [10] #invokelatest#2
    @ ./essentials.jl:716 [inlined]
 [11] invokelatest
    @ ./essentials.jl:714 [inlined]
 [12] parse_pkg_files(id::Base.PkgId)
    @ Revise ~/.julia/packages/Revise/jHTGK/src/loading.jl:51
 [13] watch_package(id::Base.PkgId)
    @ Revise ~/.julia/packages/Revise/jHTGK/src/pkgs.jl:346
 [14] add_require(sourcefile::String, modcaller::Module, idmod::String, modname::String, expr::Expr)
    @ Revise ~/.julia/packages/Revise/jHTGK/src/pkgs.jl:188
 [15] withnotifications(::Any, ::Vararg{Any})
    @ Requires ~/.julia/packages/Requires/Z8rfN/src/require.jl:70
 [16] (::CUDA.var"#73#76")()
    @ CUDA ~/.julia/packages/Requires/Z8rfN/src/require.jl:106
 [17] listenpkg(f::Any, pkg::Base.PkgId)
    @ Requires ~/.julia/packages/Requires/Z8rfN/src/require.jl:20
 [18] macro expansion
    @ ~/.julia/packages/Requires/Z8rfN/src/require.jl:98 [inlined]
 [19] __init__()
    @ CUDA ~/.julia/packages/CUDA/tTK8Y/src/initialization.jl:35
 [20] include
    @ ./Base.jl:418 [inlined]
 [21] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1149
 [22] require(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1013
 [23] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:997
 [24] include
    @ ./Base.jl:418 [inlined]
 [25] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1149
 [26] require(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1013
 [27] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:997
 [28] include
    @ ./Base.jl:418 [inlined]
 [29] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1149
 [30] require(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1013
 [31] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:997
 [32] include
    @ ./Base.jl:418 [inlined]
 [33] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1149
 [34] require(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1013
 [35] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:997
during initialization of module CUDA
in expression starting at /Users/me/.julia/packages/CUDA/tTK8Y/src/CUDA.jl:1
in expression starting at /Users/me/.julia/packages/Flux/js6mP/src/Flux.jl:1
in expression starting at /Users/me/.julia/packages/OptimizationFlux/cpWyO/src/OptimizationFlux.jl:1
in expression starting at /Users/me/.julia/packages/DiffEqFlux/5e9D2/src/DiffEqFlux.jl:1

ChrisRackauckas commented:

That "Skipping precompilation" message shows up with any package on v1.7 if you update packages and reuse the session without restarting the REPL.

ChrisRackauckas commented Jun 30, 2022

@DhairyaLGandhi identified the right spot, but his fix is incorrect. Here's a deterministic example:

using DiffEqFlux, Flux, Zygote, NeuralPDE, ModelingToolkit, DomainSets

@parameters x y
@variables u(..)
Dxx = Differential(x)^2
Dyy = Differential(y)^2

# 2D PDE
eq  = Dxx(u(x,y)) + Dyy(u(x,y)) ~ -sin(pi*x)*sin(pi*y)

# Initial and boundary conditions
bcs = [u(0,y) ~ 0.0, u(1,y) ~ -sin(pi*1)*sin(pi*y),
       u(x,0) ~ 0.0, u(x,1) ~ -sin(pi*x)*sin(pi*1)]
# Space and time domains
domains = [x ∈ Interval(0.0,1.0),
           y ∈ Interval(0.0,1.0)]

@named pde_system = PDESystem(eq,bcs,domains,[x,y],[u(x, y)])

fastchain = FastChain(FastDense(2,12,Flux.σ),FastDense(12,12,Flux.σ),FastDense(12,1))
fluxchain = Chain(Dense(2,12,Flux.σ),Dense(12,12,Flux.σ),Dense(12,1))
initθ = range(0,1,length=205)
grid_strategy = NeuralPDE.GridTraining(0.1)

discretization1 = NeuralPDE.PhysicsInformedNN(fastchain,
                                             grid_strategy;
                                             init_params = initθ)

discretization2 = NeuralPDE.PhysicsInformedNN(fluxchain,
                                             grid_strategy;
                                             init_params = initθ)


prob1 = NeuralPDE.discretize(pde_system,discretization1)
prob2 = NeuralPDE.discretize(pde_system,discretization2)

Zygote.gradient((x)->prob1.f(x,nothing),initθ)
# ([0.34135572161301464, 0.4405596388580093, 0.5395470482221245, 0.6382976197739982, 0.7367915738790711, 0.8350087259178556, 0.9329294568383262, 1.030534260497729, 1.1278035963200221, 1.2247180249089877  …  98.49443032452626, 98.54203213497166, 98.58792677663844, 98.63218175029297, 98.6748692924491, 98.71603782132827, 98.75571771403575, 98.7939947935279, 98.8308816355171, 99.81405020016672],)
Zygote.gradient((x)->prob2.f(x,nothing),initθ)
# ([0.34197112172842026, -3.558639347553253, 4.5405376851558685, 0.6394826173782349, 0.7381760030984879, -3.1634015142917633, -7.0652690678834915, 17.032554239034653, -6.869950324296951, -14.772801548242569  …  -413.50527000427246, 98.54232025146484, 610.5882167816162, 98.63247489929199, 98.6751537322998, 98.71630668640137, 610.7559909820557, 610.7942523956299, 98.83114624023438, 99.81404876708984],)

## Fixed

initθ = Float32.(range(0,1,length=205))
grid_strategy = NeuralPDE.GridTraining(0.1)

discretization1 = NeuralPDE.PhysicsInformedNN(fastchain,
                                             grid_strategy;
                                             init_params = initθ)

discretization2 = NeuralPDE.PhysicsInformedNN(fluxchain,
                                             grid_strategy;
                                             init_params = initθ)


prob1 = NeuralPDE.discretize(pde_system,discretization1)
prob2 = NeuralPDE.discretize(pde_system,discretization2)

Zygote.gradient((x)->prob1.f(x,nothing),initθ)
# (Float32[0.34133404, 0.440552, 0.5395764, 0.63829243, 0.73679507, 0.83497345, 0.9328923, 1.0304716, 1.1278142, 1.2247037  …  98.49864, 98.546234, 98.59311, 98.63931, 98.6693, 98.71631, 98.76674, 98.80206, 98.82138, 99.81404],)
Zygote.gradient((x)->prob2.f(x,nothing),initθ)
# (Float32[0.3413074, 0.44055194, 0.5395802, 0.6383001, 0.7367189, 0.83488965, 0.9327855, 1.0305097, 1.127738, 1.2246275  …  98.49864, 98.546234, 98.59311, 98.63931, 98.6693, 98.71631, 98.76674, 98.80206, 98.82138, 99.81404],)

# Doesn't do anything:

using Optimisers
@eval Optimisers begin
    function _getat(y::AbstractArray, o::Int, flat::AbstractVector)
        res = ProjectTo(y)(reshape(flat[o .+ (1:length(y))], axes(y)))
        if eltype(res) != eltype(y)
            @info "found one" summary(y) summary(flat) summary(res)
        end
        res
    end
end

initθ = range(0,1,length=205)
grid_strategy = NeuralPDE.GridTraining(0.1)

discretization1 = NeuralPDE.PhysicsInformedNN(fastchain,
                                             grid_strategy;
                                             init_params = initθ)

discretization2 = NeuralPDE.PhysicsInformedNN(fluxchain,
                                             grid_strategy;
                                             init_params = initθ)


prob1 = NeuralPDE.discretize(pde_system,discretization1)
prob2 = NeuralPDE.discretize(pde_system,discretization2)

Zygote.gradient((x)->prob1.f(x,nothing),initθ)
# ([0.34135572161301464, 0.4405596388580093, 0.5395470482221245, 0.6382976197739982, 0.7367915738790711, 0.8350087259178556, 0.9329294568383262, 1.030534260497729, 1.1278035963200221, 1.2247180249089877  …  98.49443032452626, 98.54203213497166, 98.58792677663844, 98.63218175029297, 98.6748692924491, 98.71603782132827, 98.75571771403575, 98.7939947935279, 98.8308816355171, 99.81405020016672],)
Zygote.gradient((x)->prob2.f(x,nothing),initθ)
# ([0.34197112172842026, -3.558639347553253, 4.5405376851558685, 0.6394826173782349, 0.7381760030984879, -3.1634015142917633, -7.0652690678834915, 17.032554239034653, -6.869950324296951, -14.772801548242569  …  -413.50527000427246, 98.54232025146484, 610.5882167816162, 98.63247489929199, 98.6751537322998, 98.71630668640137, 610.7559909820557, 610.7942523956299, 98.83114624023438, 99.81404876708984],)

# Doesn't do anything

using Optimisers
@eval Optimisers begin
    _getat(y::AbstractArray{T}, o::Int, flat::AbstractVector) where T =
        T.(reshape(flat[o .+ (1:length(y))], axes(y)))  # ProjectTo is just correcting eltypes
end

Zygote.gradient((x)->prob1.f(x,nothing),initθ)
# ([0.34135572161301464, 0.4405596388580093, 0.5395470482221245, 0.6382976197739982, 0.7367915738790711, 0.8350087259178556, 0.9329294568383262, 1.030534260497729, 1.1278035963200221, 1.2247180249089877  …  98.49443032452626, 98.54203213497166, 98.58792677663844, 98.63218175029297, 98.6748692924491, 98.71603782132827, 98.75571771403575, 98.7939947935279, 98.8308816355171, 99.81405020016672],)
Zygote.gradient((x)->prob2.f(x,nothing),initθ)
# ([0.34197112172842026, -3.558639347553253, 4.5405376851558685, 0.6394826173782349, 0.7381760030984879, -3.1634015142917633, -7.0652690678834915, 17.032554239034653, -6.869950324296951, -14.772801548242569  …  -413.50527000427246, 98.54232025146484, 610.5882167816162, 98.63247489929199, 98.6751537322998, 98.71630668640137, 610.7559909820557, 610.7942523956299, 98.83114624023438, 99.81404876708984],)

# Fixed!!! ?

using Optimisers
@eval Optimisers begin
    _getat(y::AbstractArray{T}, o::Int, flat::AbstractVector) where T =
        @show Float64.(reshape(flat[o .+ (1:length(y))], axes(y)))  # ProjectTo is just correcting eltypes
end

Zygote.gradient((x)->prob1.f(x,nothing),initθ)
# ([0.34135572161301464, 0.4405596388580093, 0.5395470482221245, 0.6382976197739982, 0.7367915738790711, 0.8350087259178556, 0.9329294568383262, 1.030534260497729, 1.1278035963200221, 1.2247180249089877  …  98.49443032452626, 98.54203213497166, 98.58792677663844, 98.63218175029297, 98.6748692924491, 98.71603782132827, 98.75571771403575, 98.7939947935279, 98.8308816355171, 99.81405020016672],)
Zygote.gradient((x)->prob2.f(x,nothing),initθ)
# ([0.34135578866824007, 0.44055960533039673, 0.539547040771544, 0.6382977315327074, 0.736791544076749, 0.8350087706213394, 0.932929464288907, 1.030534171090762, 1.127803417506088, 1.2247179504031824  …  98.49443127820058, 98.54203022762303, 98.58792582296412, 98.63217888927002, 98.67485975570594, 98.7160397286769, 98.75572534343029, 98.79398334943609, 98.83090070900344, 99.81405020016672],)

using Optimisers
@eval Optimisers begin
    _getat(y::AbstractArray{T}, o::Int, flat::AbstractVector) where T =
        reshape(flat[o .+ (1:length(y))], axes(y))
end

Zygote.gradient((x)->prob1.f(x,nothing),initθ)
# ([0.34135572161301464, 0.4405596388580093, 0.5395470482221245, 0.6382976197739982, 0.7367915738790711, 0.8350087259178556, 0.9329294568383262, 1.030534260497729, 1.1278035963200221, 1.2247180249089877  …  98.49443032452626, 98.54203213497166, 98.58792677663844, 98.63218175029297, 98.6748692924491, 98.71603782132827, 98.75571771403575, 98.7939947935279, 98.8308816355171, 99.81405020016672],)
Zygote.gradient((x)->prob2.f(x,nothing),initθ)
# ([0.34135578866824007, 0.44055960533039673, 0.539547040771544, 0.6382977315327074, 0.736791544076749, 0.8350087706213394, 0.932929464288907, 1.030534171090762, 1.127803417506088, 1.2247179504031824  …  98.49443127820058, 98.54203022762303, 98.58792582296412, 98.63217888927002, 98.67485975570594, 98.7160397286769, 98.75572534343029, 98.79398334943609, 98.83090070900344, 99.81405020016672],)

using Optimisers
@eval Optimisers begin
    function _getat(y::AbstractArray{T}, o::Int, flat::AbstractVector) where T
        @show eltype(y), eltype(flat)
        reshape(flat[o .+ (1:length(y))], axes(y))
    end
end

Zygote.gradient((x)->prob1.f(x,nothing),initθ)
# ([0.34135572161301464, 0.4405596388580093, 0.5395470482221245, 0.6382976197739982, 0.7367915738790711, 0.8350087259178556, 0.9329294568383262, 1.030534260497729, 1.1278035963200221, 1.2247180249089877  …  98.49443032452626, 98.54203213497166, 98.58792677663844, 98.63218175029297, 98.6748692924491, 98.71603782132827, 98.75571771403575, 98.7939947935279, 98.8308816355171, 99.81405020016672],)
Zygote.gradient((x)->prob2.f(x,nothing),initθ)
# (eltype(y), eltype(flat)) = (Float32, Float64)
# ([0.34135578866824007, 0.44055960533039673, 0.539547040771544, 0.6382977315327074, 0.736791544076749, 0.8350087706213394, 0.932929464288907, 1.030534171090762, 1.127803417506088, 1.2247179504031824  …  98.49443127820058, 98.54203022762303, 98.58792582296412, 98.63217888927002, 98.67485975570594, 98.7160397286769, 98.75572534343029, 98.79398334943609, 98.83090070900344, 99.81405020016672],)

Instead of going down to Float32, it needs to widen to Float64 to be correct.

What's the reason for that ProjectTo(y)? Maybe it needs to have a promote_type in there?
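
For concreteness, something like the following is what I mean — a sketch only, reusing the @eval Optimisers monkey-patch pattern from above and assuming _getat keeps this signature; untested:

using Optimisers
@eval Optimisers begin
    function _getat(y::AbstractArray, o::Int, flat::AbstractVector)
        T = promote_type(eltype(y), eltype(flat))  # widen rather than narrow to eltype(y)
        T.(reshape(flat[o .+ (1:length(y))], axes(y)))
    end
end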

@mcabbott

mcabbott commented Jul 1, 2022

# (eltype(y), eltype(flat)) = (Float32, Float64)

This is what's expected when the primal is Float32 for this variable y, but Float64 for others. The flat vector has to widen, but ProjectTo(y) ensures that the reconstructed gradient for y is correctly made Float32.

If something else wants a Float64 gradient for a Float32 variable, then maybe that's the problem.
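
For reference, here is the projection in isolation (assuming ChainRulesCore's documented behaviour for array primals):

  using ChainRulesCore
  y      = Float32[1.0, 2.0]       # primal stored in the model
  dy_raw = [0.25, 0.5]             # Float64 slice of the widened flat gradient
  dy     = ProjectTo(y)(dy_raw)    # reconstructed gradient, narrowed back to Float32
  eltype(dy)                       # Float32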

@ChrisRackauckas
Member Author

I understand why it exists now, but I don't understand why the type of y is considered the end-all be-all here. The parameters inside the neural network are never used in this example; the NN is only used for its structure and is restructured with new values. Yet the type of the values encoded in it will silently change the precision of the solution, even though the user tries to avoid those values ever existing. The only place those values creep in is one part of the backwards pass, where they end up driving a type conversion, so even though the values themselves are never used, you still have to be careful about their type. I can't be the only one who sees that as weird action at a distance?

Basically, why wouldn't re(p) take on the element type of p rather than preserving the types it previously had? This would also explain some of the issues with GPUs, because re(p) where the NN is CPU-based and p is GPU-based doesn't build a GPU-based version of the NN. It would also explain some of the issues with TrackedArrays, etc. This explains a lot of the bugs we've been seeing, but I just don't understand why that behaviour needs to exist.

@mcabbott

mcabbott commented Jul 1, 2022

Basically, why wouldn't re(p) take on the element type of p and would instead preserve the types it previously had?

Because its one job is reconstruction? It's explicitly designed to allow for mixed precisions, and not forget this. And not just precisions, the help example is:

  julia> v, re = destructure((x=[1.0, 2.0], y=(sin, [3 + 4im])))
  (ComplexF64[1.0 + 0.0im, 2.0 + 0.0im, 3.0 + 4.0im], Restructure(NamedTuple, ..., 3))
  
  julia> re([3, 5-im, 7+11im])
  (x = [3.0, 5.0], y = (sin, ComplexF64[7.0 + 11.0im]))

This also explains some of the issues with GPUs then, because re(p) where the NN is CPU-based and p is GPU-based

No, it does not. destructure on something containing a mix of GPU and CPU arrays is essentially undefined behaviour. (IIRC it depends on the order of arrays.) I'm not sure I follow what you think the behaviour should be, but it could be made to be something. Make an issue if this case is useful.

This also explains some of the issues with TrackedArrays. Etc.

Does it? What issues?

@ChrisRackauckas
Member Author

ChrisRackauckas commented Jul 1, 2022

Because its one job is reconstruction?

Well, that's why I'm confused about why it's doing more than just reconstruction. It's not just reconstructing the array p into the neural network architecture; it's also changing its values so they don't match p, so it's not a reconstruction of p into the form of the destructured thing. I expected

  julia> re([3, 5-im, 7+11im])
  (x = [3+0*im, 5-im], y = (sin, [7 + 11im]))

i.e. it would be "the same form" but with the values of p. In fact, your example there shows something very scary because it changed 5-im to 5.0: those aren't just different types, but different values because 5-im != 5.0! Also, it doesn't seem to be very consistent. With x it tries to convert everything back to Float64, even though the values are not representable as Float64s. But with y, it's perfectly happy with ComplexF64[7.0 + 11.0im] instead of changing it back to [7 + 11im] when the original values were in Complex{Int}? So there's no guarantee that the values are the same as p, and there's no guarantee that the types match those of the destructured thing, and there's no guarantee that it matches the types of p! ComplexF64 shows up as the input to none of the functions, but still shows up in the result. What is the rule then? I honestly would prefer to just get an error in this case then, as this is definitely not what I expected.

No, it does not. destructure on something containing a mix of GPU and CPU arrays is essentially undefined behaviour. (IIRC it depends on the order of arrays.) I'm not sure I follow what you think the behaviour should be, but it could be made to be something. Make an issue if this case is useful.

No, I'm saying I would've expected:

  julia> re(cu([3, 5-im, 7+11im]))
  (x = cu([3+0*im, 5-im]), y = (sin, cu([7 + 11im])))

"It's the same as the destructured thing, but with the values taken from p, x is just p[1:2] and y is just (sin,p[3:3])" is the rule. Simple, straightforward, and always matches the values of p. This also would make it support TrackedArray, since then

  julia> re(Tracked([3, 5-im, 7+11im]))
  (x = Tracked([3+0*im, 5-im]), y = (sin, Tracked([7 + 11im])))

But anyways, now I'm worried about that complex case: that should definitely be counted as a bug IMO, or throw an error.

@DhairyaLGandhi
Member

Right, p is the actual source of truth in the case of reconstruction, and my fix was only showing that we don't respect that right now. This is currently only doing basic eltype conversion which comes at the cost of extra copies. I didn't push that as a PR since we need to be certain that we want to let Julia figure out the types and reconstruct the model with p as the source of truth.

It is actually related to the issue of why we expect some custom leaf types to return structures as adjoints in the backwards pass, and therefore need to reconstruct the type as opposed to operating on the fields directly. Complex is a case of that behaviour since its interaction with Numbers is specialised. In this case, I would much rather get a MethodError saying that there is no accum(::MyType, ::NamedTuple{fieldnames(MyType)}), or better yet be given a method we can use to reconstruct the type from a primal and its gradient.

@ToucheSir

ToucheSir commented Jul 1, 2022

The fundamental issue here is that destructure has been pulled in two mutually incompatible directions by two disparate use cases:

  1. A template generator for hypernetworks. AIUI this was the original motivation.
  2. A workaround/interop mechanism when one wants to use Optim, ForwardDiff (either standalone or for nested AD), etc. My understanding is that this came later, but it's come to dominate while 1) has mostly faded into obscurity.

Chris has already described why no type conversion during reconstruction makes sense for 1). For 2), I recall we went down this path after many issues where users were expecting the following invariants to be upheld:
a. p, re = destructure(m); re(p) should be the identity operation.
b. gradient(m -> loss(m, ...), m) |> flatten == gradient(p -> loss(re(p), ...), p)

The complex example is actually a great one because it shows how these can be broken without type conversion. If you pass in a Dense(weight=Float[], bias=Complex[]), you get out a Dense(weight=Complex[], bias=Complex[]) and break a). This means your gradient will be a NamedTuple(weight=Complex[], bias=Complex[]). If the imaginary component of weight's gradient is non-zero, then whoops: the training trajectories diverge and b) no longer holds.
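
For an all-Float32 model, where the two use cases agree, invariant a) is easy to spot-check directly (a minimal sketch using the same Flux API as the MWE above):

  using Flux
  m = Chain(Dense(2,12,Flux.σ), Dense(12,1))
  p, re = Flux.destructure(m)
  re(p)[1].weight == m[1].weight   # true: the round trip reproduces the model exactly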

I think the only way to resolve this tension is to bifurcate the destructure interface. Perhaps it would make sense to still keep the same name for both functions, but I don't see a way to support both use cases using the same code path.

@mcabbott

mcabbott commented Jul 1, 2022

For use as 1, is it really too much to ask that you make the "template" model with the desired number type? That seems like a simpler, easier-to-understand API than having some special additional mode to know about, document, and test.

At present, re from a CPU model will work happily on a GPU v, or the reverse -- the location is not stored. (This isn't really by design; it just falls out of re-using ProjectTo for convenience, which never checks that the gradient has the same storage location.)

@ChrisRackauckas
Member Author

For use as 1, is it really too much to ask that you make the "template" model with the desired number type?

If that's the case it should probably throw an error instead of computing incorrect gradients. The complex number case would be a particularly nasty one to try and debug. Even finding this behavior took a long time.

@mcabbott

mcabbott commented Jul 1, 2022

But there are no incorrect gradients here. Like everything else using ChainRules, these are dy_final = ProjectTo(y)(dy_raw). Complex numbers were literally the original motivating case for introducing such projection operators. And allowing Float64 gradients for Float32 variables was, for a long time, the number 1 way to accidentally get awful performance from Flux.
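
For example, for a real primal the imaginary part of a raw cotangent is simply dropped (again assuming ChainRulesCore's documented projection behaviour):

  using ChainRulesCore
  ProjectTo(1.0)(2.0 + 3.0im)   # 2.0 — a real primal gets a real gradient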

Your complaint is, if I understand right, entirely that _, re = destructure(m); re(new_p) reconstructs a model with the same element types as m, rather than always following new_p. I'm sorry if this was very surprising, but it's now clearly documented: https://fluxml.ai/Optimisers.jl/dev/api/#Optimisers.destructure

@ChrisRackauckas
Member Author

At present it re from a CPU model will work happily on a GPU v, or the reverse -- the location is not stored. (This isn't really by design, it just falls out of re-using ProjectTo for convenience, which never checks that gradient has the same storage location.)

And that makes it surprising. re(::CuArray) gives a GPU-based version, yet re(::Array{Complex}) gives a Float32 version. It would at least be easier to understand if there were consistency here. If it either always obeyed p or always obeyed the functor, then either way it would be easy to guess what the restructured object does. I thought it was "always obey p" because of how it acted with GPUs.

Chris has already described why no type conversion during reconstruction makes sense for 1). For 2), I recall we went down this path after many issues where users were expecting the following invariants to be upheld:
a. p, re = destructure(m); re(p) should be the identity operation.
b. gradient(m -> loss(m, ...), m) |> flatten == gradient(p -> loss(re(p), ...), p)

This whole discussion has been about 2. The issue is that type conversion presents itself as incorrect gradients in the case of (2). A calculation which says "I want to use Complex{Float64}" will silently use Float32, returning zeros for the imaginary components and computing with incorrect precision on the real parts. The only way this is exposed to the user is if one checks the gradient calculation (something that isn't a user-level property anyways, so in practice it's just hidden as "it didn't train").

One issue here is that it only even shows itself if you inspect the forward pass in isolation. Here we do things like u0 .+ re(p)(u), so if you look in any diagnostic function you see complex in -> complex out and things look "fine", but only because you didn't check that re(p)(u) actually downconverted p from complex to real before the next operation upconverted it again. The way it would expose itself is, again, only in the adjoints, because you'd see + 0im everywhere.
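
As a toy picture of that masking effect (hypothetical stand-ins only, not the actual NeuralPDE code):

  downconvert(p) = Float32.(real.(p))        # stands in for what re(p) does to a Complex p here
  p  = ComplexF64[1.0 + 2.0im, 3.0 - 1.0im]
  u0 = ComplexF64[0.1 + 0.0im, 0.2 + 0.0im]
  u0 .+ downconvert(p)                       # eltype is ComplexF64, so "complex in -> complex out" looks fine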

Look back at FluxML/Flux.jl#1901 (comment). Now that I've finally isolated this 5 months later, I realize that this behavior change is what caused the downstream tests to fail intermittently, depending on which Optimisers version was resolved. The precision change raised the probability of test failure (the initializations were still random), so the tests actually caught it, but rerun enough times the last run came up green and it looked like a fluke. Almost imperceptibly the situation became "things are a little more janky these days, nobody really knows why", until I finally isolated it to the gradient precision differing from the precision that was specified. It might now be clearly documented, but this is very easy to hit accidentally and very hard to diagnose unless you already know it's possible. Multiple people took a look and no one realized that passing Float64's around isn't a good idea if you forget |> f64.

a. p, re = destructure(m); re(p) should be the identity operation.

I don't see why that should be the case. p is an array, so it will type-promote. If you then call re(p), you'll get the operations of whatever p is, which would be the promoted versions (or at least, that's how I thought it worked).

@mcabbott

mcabbott commented Jul 1, 2022

I thought it was "always obey p" because of how it acted with GPUs.

Great, well now that the documentation is clear, no need to guess.

type conversion presents itself as incorrect gradients

Again, no incorrect gradients have been exhibited here.

which says "I want to use Complex{Float64}"

The way you say this is by making the primal complex. Real numbers may not have complex gradients. The answer to "which way is uphill from the summit of this mountain?" cannot sensibly be a complex vector. Allowing that was a source of truly mysterious gradient bugs.

behavior change is what caused the downstream tests to fail

As you know, the old destructure was cobbled together in 5 minutes, had approximately zero tests for all of its years, and had many, many known bugs. It's inevitable that, sadly, some code relied on such bugs.

surprising. re(::CuArray) gives a GPU-based version

If you think this ought to be yet more strict, please make an issue.

@ChrisRackauckas
Member Author

Continuing this discussion upstream.
