Skip to content

Conversation

@utkarsh530
Copy link
Member

@utkarsh530 utkarsh530 commented Apr 6, 2022

Hi,

I have added Tim Besard's work here to continue our implementation.
@ChrisRackauckas

@utkarsh530
Copy link
Member Author

@maleadt I am trying to implement the adaptive version of vectorized_solve called vectorized_asolve on similar lines. However, I am new to CUDA programming and tried to follow this https://github.com/SciML/SimpleDiffEq.jl/blob/master/src/tsit5/gpuatsit5.jl#L64-L178 to have an adaptive version. However, I am getting this error:

julia> gpu_sol = vectorized_asolve(prob, ps, GPUSimpleTsit5(); dt, debug = false)
ERROR: InvalidIRError: compiling kernel atsit5_kernel(ODEProblem{SVector{3, Float32}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceMatrix{SVector{3, Float32}, 1}, CuDeviceMatrix{Float32, 1}, Float32, Float32, Float32, Val{false}, Val{true}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to >)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:247
Reason: unsupported dynamic function invocation (call to materialize)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:265
Reason: unsupported dynamic function invocation (call to *)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:265
Reason: unsupported dynamic function invocation (call to broadcasted)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:265
Reason: unsupported dynamic function invocation (call to min)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported dynamic function invocation (call to max)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:279
Reason: unsupported dynamic function invocation (call to getindex)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:295
Reason: unsupported dynamic function invocation (call to afoldl)
Stacktrace:
 [1] +
   @ ./operators.jl:655
 [2] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:258
Reason: unsupported dynamic function invocation (call to afoldl)
Stacktrace:
 [1] +
   @ ./operators.jl:655
 [2] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:260
Reason: unsupported dynamic function invocation (call to afoldl)
Stacktrace:
 [1] +
   @ ./operators.jl:655
 [2] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:263
Reason: unsupported dynamic function invocation (call to ODE_DEFAULT_NORM)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:266
Reason: unsupported dynamic function invocation (call to iszero)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:268
Reason: unsupported use of an undefined name (use of 'qmax')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:269
Reason: unsupported dynamic function invocation (call to inv)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:269
Reason: unsupported dynamic function invocation (call to >)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:275
Reason: unsupported use of an undefined name (use of 'qmin')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:276
Reason: unsupported dynamic function invocation (call to inv)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:276
Reason: unsupported use of an undefined name (use of 'qmax')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported dynamic function invocation (call to inv)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported use of an undefined name (use of 'qmin')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported use of an undefined name (use of 'gamma')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported dynamic function invocation (call to /)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported use of an undefined name (use of 'qoldinit')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:279
Reason: unsupported use of an undefined name (use of 'cur_t')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:295
Reason: unsupported dynamic function invocation (call to <=)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:295
Reason: unsupported dynamic function invocation (call to max)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported use of an undefined name (use of 'beta1')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:271
Reason: unsupported dynamic function invocation (call to ^)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:271
HINT: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code
Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(DiffEqGPU.atsit5_kernel), Tuple{ODEProblem{SVector{3, Float32}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceMatrix{SVector{3, Float32}, 1}, CuDeviceMatrix{Float32, 1}, Float32, Float32, Float32, Val{false}, Val{true}}}}, args::LLVM.Module)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/I9fZc/src/validation.jl:119
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/I9fZc/src/driver.jl:327 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/nDhDw/src/TimerOutput.jl:252 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GPUCompiler/I9fZc/src/driver.jl:325 [inlined]
  [5] emit_asm(job::GPUCompiler.CompilerJob, ir::LLVM.Module; strip::Bool, validate::Bool, format::LLVM.API.LLVMCodeGenFileType)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/I9fZc/src/utils.jl:64
  [6] cufunction_compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/packages/CUDA/5jdFl/src/compiler/execution.jl:326
  [7] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/I9fZc/src/cache.jl:90
  [8] cufunction(f::typeof(DiffEqGPU.atsit5_kernel), tt::Type{Tuple{ODEProblem{SVector{3, Float32}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceMatrix{SVector{3, Float32}, 1}, CuDeviceMatrix{Float32, 1}, Float32, Float32, Float32, Val{false}, Val{true}}}; name::Nothing, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA ~/.julia/packages/CUDA/5jdFl/src/compiler/execution.jl:297
  [9] cufunction(f::typeof(DiffEqGPU.atsit5_kernel), tt::Type{Tuple{ODEProblem{SVector{3, Float32}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceMatrix{SVector{3, Float32}, 1}, CuDeviceMatrix{Float32, 1}, Float32, Float32, Float32, Val{false}, Val{true}}})
    @ CUDA ~/.julia/packages/CUDA/5jdFl/src/compiler/execution.jl:291
 [10] macro expansion
    @ ~/.julia/packages/CUDA/5jdFl/src/compiler/execution.jl:102 [inlined]
 [11] vectorized_asolve(prob::ODEProblem{SVector{3, Float32}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, ps::CuArray{SVector{3, Float32}, 1, CUDA.Mem.DeviceBuffer}, alg::GPUSimpleTsit5; dt::Float32, saveat::Nothing, save_everystep::Bool, abstol::Float32, reltol::Float32, debug::Bool)
    @ DiffEqGPU ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:91
 [12] top-level scope
    @ REPL[13]:1

MWE:

using DiffEqGPU, SimpleDiffEq, StaticArrays, CUDA
#using Plots

function loop(u, p, t)
    σ = p[1]; ρ = p[2]; β = p[3]
    du1 = σ*(u[2]-u[1])
    du2 = u[1]*(ρ-u[3]) - u[2]
    du3 = u[1]*u[2] - β*u[3]
    return SVector{3}(du1, du2, du3)
end
func = ODEFunction(loop)
u0 = 10ones(Float32,3)
su0= SVector{3}(u0)
dt = 1f-1
tspan = (0.0f0, 10.0f0)

prob = ODEProblem{false}(loop, SVector{3}(u0), (0.0f0, 10.0f0),  SA_F32[10, 28, 8/3])
# sol2 = solve(prob,GPUSimpleTsit5())

@info "GPU version"
N = 1000
ps = CuArray([@SVector [10f0,28f0,8/3f0] for i in 1:N])
gpu_sol = vectorized_solve(prob, ps, GPUSimpleTsit5(); dt, debug = false)

gpu_sol = vectorized_asolve(prob, ps, GPUSimpleTsit5(); dt, debug = false)

Could you give me some pointers (documentation or some example code?) on how to write an adaptive version and fix the above errors? Thank you!

@maleadt
Copy link
Contributor

maleadt commented Apr 7, 2022

There's probably a type instability (which can also be caused by a typo) in your implementation, so try the suggestion in the error to run under Cthulhu. Alternatively, prefix the invocation with @device_code_warntype interactive=true.

@utkarsh530 utkarsh530 changed the title [WIP] New GPU Tsit 5 solvers New GPU Tsit 5 solvers Apr 12, 2022
@utkarsh530
Copy link
Member Author

The saveat functionality is working in the adaptive version.
Log:

julia> using DiffEqGPU, SimpleDiffEq, StaticArrays, CUDA

julia> #using Plots

       function loop(u, p, t)
           σ = p[1]; ρ = p[2]; β = p[3]
           du1 = σ*(u[2]-u[1])
           du2 = u[1]*(ρ-u[3]) - u[2]
           du3 = u[1]*u[2] - β*u[3]
           return SVector{3}(du1, du2, du3)
       end
loop (generic function with 1 method)

julia> func = ODEFunction(loop)
(::ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}) (generic function with 7 methods)

julia> u0 = 10ones(Float64,3)
3-element Vector{Float64}:
 10.0
 10.0
 10.0

julia> su0= SVector{3}(u0)
3-element SVector{3, Float64} with indices SOneTo(3):
 10.0
 10.0
 10.0

julia> dt = 1f-1
0.1f0

julia> tspan = (0.0f0, 10.0f0)
(0.0f0, 10.0f0)

julia> prob = ODEProblem{false}(loop, SVector{3}(u0), (0.0f0, 10.0f0),  SA_F32[10, 28, 8/3])
ODEProblem with uType SVector{3, Float64} and tType Float32. In-place: false
timespan: (0.0f0, 10.0f0)
u0: 3-element SVector{3, Float64} with indices SOneTo(3):
 10.0
 10.0
 10.0

julia> # sol2 = solve(prob,GPUSimpleTsit5())

       @info "GPU version"
[ Info: GPU version

julia> N = 10
10

julia> ps = CuArray([@SVector [10.f0,28.f0,8/3f0] for i in 1:10])
10-element CuArray{SVector{3, Float32}, 1, CUDA.Mem.DeviceBuffer}:
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]

julia> gpu_asol = vectorized_asolve(prob, ps, GPUSimpleATsit5(); dt, saveat = 0.0:0.1:1.0, debug = false)
10-element Vector{ODESolution}:
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]), false, 0, nothing, :Default)

julia> gpu_asol[1]
retcode: Default
Interpolation: ┌ Warning: Performing scalar indexing on task Task (runnable) @0x00007f39bf1482f0.
│ Invocation of getindex resulted in scalar indexing of a GPU array.
│ This is typically caused by calling an iterating implementation of a method.
│ Such implementations *do not* execute on the GPU, but very slowly on the CPU,
│ and therefore are only permitted from the REPL for prototyping purposes.
│ If you did intend to index this array, annotate the caller with @allowscalar.
└ @ GPUArrays ~/.julia/packages/GPUArrays/Zecv7/src/host/indexing.jl:56
1st order linear
t: 0.0:0.1:1.0
u: 11-element CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}:
 [10.0, 10.0, 10.0]
 [14.939179399612234, 21.21584054774498, 26.228830924283788]
 [14.51451398040364, 6.097818481131051, 41.68494141740392]
 [4.573042134088737, -3.6573951745458366, 32.07245155323134]
 [-0.6628893344605427, -3.5903993329192176, 24.102225759179902]
 [-2.7273511916210076, -4.468727764250234, 19.097119415749088]
 [-4.902333774628836, -7.78738353404349, 16.6526400102735]
 [-8.836125860495535, -13.763465687428345, 19.33237322377555]
 [-13.305871172982945, -15.76759644231623, 30.61905129256846]
 [-11.434115073464367, -5.804090592244281, 36.18234142425265]
 [-5.555995501670298, -0.7991161813121691, 29.668759375139512]

@utkarsh530
Copy link
Member Author

utkarsh530 commented Apr 12, 2022

@maleadt, is there a reason you've wrapped this https://gist.github.com/maleadt/3a848e5743e6001116666f4c3c68a061#file-gpu_ode-jl-L239-L242 instead of directly calling vectorized_solve?
I am directly calling vectorized_asolve for solving the ODE. Could you also verify the saveat code that I've written is alright? https://github.com/SciML/DiffEqGPU.jl/pull/148/files#diff-930c7186b2b6826d7462c29f83229942810af75edbc47f5b5e69976c6656db79R316-R325

src/DiffEqGPU.jl Outdated
elseif ensemblealg isa EnsembleGPUArray
batch_solve_up(ensembleprob, probs, alg, ensemblealg, I, u0, p; kwargs...)
end
[ensembleprob.output_func(SciMLBase.build_solution(probs[i],alg,sol.t,solus[i],destats=sol.destats,retcode=sol.retcode),i)[1] for i in 1:length(probs)]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code performs scalar indexing on GPU Arrays when it solves from EnsembleGPUArrayAutonomous. What to do in this case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is scalar indexing on a GPU array here? I don't see where that would happen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

solus[i] is the scalar indexing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but that's an array of solutions?

@utkarsh530
Copy link
Member Author

MWE:

using DiffEqGPU, OrdinaryDiffEq, StaticArrays
function lorenz(u, p, t)
    σ = p[1]
    ρ = p[2]
    β = p[3]
    du1 = σ * (u[2] - u[1])
    du2 = u[1] * (ρ - u[3]) - u[2]
    du3 = u[1] * u[2] - β * u[3]
    return SVector{3}(du1, du2, du3)
end

u0 = @SVector [1.f0;0.f0;0.f0]
tspan = (0.0f0,100.0f0)
p = @SVector [10.0f0,28.0f0,8/3f0]
prob = ODEProblem{false}(lorenz, u0, tspan, p)
prob_func = (prob,i,repeat) -> remake(prob,p=rand(Float32,3).*p)
monteprob = EnsembleProblem(prob, prob_func = prob_func, safetycopy=false)
sol = solve(monteprob,Tsit5(),EnsembleGPUArrayAutonomous(),trajectories=10,saveat = 0.f0:1.f0:10.f0)

@utkarsh530
Copy link
Member Author

utkarsh530 commented May 11, 2022

@ChrisRackauckas any comments on the name EnsembleGPUArrayAutonomous?

@ChrisRackauckas
Copy link
Member

EnsembleGPUAutonomous, there's no array-ification here.

@@ -0,0 +1,121 @@
using StaticArrays, SimpleDiffEq, BenchmarkTools, DiffEqGPU, CUDA
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed and the MWE usage can be done in the current test files.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't done.

Copy link
Member Author

@utkarsh530 utkarsh530 May 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@utkarsh530
Copy link
Member Author

@ChrisRackauckas LGTM. We can merge it and make new PRs for improvements.

@ChrisRackauckas
Copy link
Member

Can you get tests passing first? Might need @vchuravy to help out. Something broke in KernelAbstractions and I don't know when/

@utkarsh530
Copy link
Member Author

@vchuravy
Copy link
Contributor

vchuravy commented Jun 8, 2022

Something broke in KernelAbstractions and I don't know when

I would need a small MWE to dig in, I don't currently have the time to dive into everything here

@ChrisRackauckas
Copy link
Member

Same failure in a different part of the code. Looks like CUDA.jl or KernelAbstractions.jl has issues with the gpuci drivers or something? We can follow up in another issue on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants