Performance of ODE solver on GPU in the presence of parameters #620

Closed
ghost opened this issue Jun 14, 2020 · 1 comment · Fixed by SciML/DiffEqBase.jl#535

Comments

ghost commented Jun 14, 2020

using CUDA, CUDA.CUSPARSE
using DifferentialEquations
using LinearAlgebra, MKLSparse, SparseArrays
using BenchmarkTools

N = 2^8

a = sprand(Float64, N, N, 0.2)
x = rand(N)

cu_a = CUSPARSE.CuSparseMatrixCSC(a)
cu_x = CuArray(x)

# In-place RHS: du = transpose(p) * u, with the sparse matrix passed in as the parameter `p`.
function func(du, u, p, t)
    CUSPARSE.mv!('T', 1.0, p, u, 0.0, du, 'O')
end

tspan = (0.0, 1.0)
prob = ODEProblem(func, cu_x, tspan, cu_a)

@btime sol = solve(prob, Tsit5(), save_everystep=false, save_start=false)

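# Same matrix, but reached through a global variable instead of the parameter slot.
global cu_b = cu_a

# In-place RHS: du = transpose(cu_b) * u; `p` is unused here.
function func_global(du, u, p, t)
    CUSPARSE.mv!('T', 1.0, cu_b, u, 0.0, du, 'O')
end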

prob_global = ODEProblem(func_global, cu_x, tspan)

@btime sol_global = solve(prob_global, Tsit5(), save_everystep=false, save_start=false)

I am benchmarking the ODE solver on a GPU with Julia 1.4.2. The code above reproduces the behavior: the solve is significantly faster when the matrix needed by the ODE is declared as a global variable than when it is passed as the problem's parameter. On a GTX 1060, solving `prob` (matrix passed as `p`) takes 6.363 s (4197131 allocations: 154.08 MiB), whereas solving `prob_global` takes 14.548 ms (30185 allocations: 1.21 MiB).
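One way to narrow down a gap like this is to time the right-hand-side function on its own, outside `solve`. The following is only a minimal CPU sketch under stated assumptions: it swaps `CUSPARSE.mv!` for the standard-library `mul!` on a `SparseMatrixCSC`, and the names `rhs_param!`, `rhs_global!`, and `a_global` are hypothetical stand-ins for `func`, `func_global`, and `cu_b`. If both forms of the RHS cost about the same in isolation, the extra time is more likely spent in the solver's handling of the parameter than in the matrix-vector product itself.

```julia
using LinearAlgebra, SparseArrays

N = 2^8
a = sprand(Float64, N, N, 0.2)
x = rand(N)

# Parameter form: the matrix arrives through `p` (CPU stand-in for `func`).
rhs_param!(du, u, p, t) = mul!(du, transpose(p), u)

# Global form: non-const global, mirroring `cu_b` in the report
# (CPU stand-in for `func_global`).
a_global = a
rhs_global!(du, u, p, t) = mul!(du, transpose(a_global), u)

du_param = similar(x)
du_global = similar(x)

# Warm-up calls so that compilation is not included in the timings.
rhs_param!(du_param, x, a, 0.0)
rhs_global!(du_global, x, nothing, 0.0)

# Time a single RHS evaluation in each form.
t_param = @elapsed rhs_param!(du_param, x, a, 0.0)
t_global = @elapsed rhs_global!(du_global, x, nothing, 0.0)

println("parameter form: ", t_param, " s")
println("global form:    ", t_global, " s")
```

A single `@elapsed` call is noisy; `@btime` from BenchmarkTools, as used in the report, gives steadier numbers if that package is available.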

@ChrisRackauckas
Member

Thanks for the report! Fixed in SciML/DiffEqBase.jl#535. Quite an edge case!
