Fully automatic GPU offloading of linear solves #273
Conversation
In my experience, hard-coded defaults don't work well over time. I personally prefer either making it easy for the user to switch (the simple option, with a predictable performance model) or having some kind of auto-tuning that can adapt to a variety of GPUs.
Playing with:

```julia
using OrdinaryDiffEq
using Plots   # provides gr()
using Random
Random.seed!(123)
gr()

# 2D Linear ODE
function f(du,u,p,t)
    @inbounds for i in eachindex(u)
        du[i] = 1.01*u[i]
    end
end
function f_analytic(u₀,p,t)
    u₀*exp(1.01*t)
end
tspan = (0.0,10.0)
prob = ODEProblem(ODEFunction(f,analytic=f_analytic),rand(3000,1),tspan)

abstols = 1.0 ./ 10.0 .^ (3:13)
reltols = 1.0 ./ 10.0 .^ (0:10);

@time solve(prob,Rodas5())
using LinearAlgebra
@time solve(prob,Rodas5(linsolve = LinSolveFactorize(lu!)))
```

Seems highly architecture dependent in a way that can't be understood by querying memory. You really do need to know the number of CUDA cores to do this right. I think I might spawn this off to a package which type pirates.
```julia
# Piracy, should get upstreamed
function Base.ldiv!(x::CuArrays.CuArray,_qr::CuArrays.CUSOLVER.CuQR,b::CuArrays.CuArray)
    _x = UpperTriangular(_qr.R) \ (_qr.Q' * reshape(b,length(b),1))
```
Is it possible to do it in-place?

```julia
ldiv!(UpperTriangular(_qr.R), mul!(x, _qr.Q', reshape(b,length(b),1)))
```
Does that work? If it does, open up a separate PR on that.
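For reference, the in-place pattern can be checked against the standard-library QR on the CPU (a sketch only; the `CuQR` types above may differ in exactly which in-place methods they support):

```julia
using LinearAlgebra

# CPU analogue of the suggested in-place solve: x = R \ (Q' * b).
A = [4.0 1.0; 1.0 3.0]
b = [1.0, 2.0]
F = qr(A)

x = similar(b)
mul!(x, F.Q', b)                # x .= Q' * b, no temporary
ldiv!(UpperTriangular(F.R), x)  # x .= R \ x, in place

@assert x ≈ A \ b
```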
@maleadt is there a way to query for the number of CUDA cores?
It's not readily available, but you can compute it from other attributes.
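As a sketch of that computation: total cores = SM count × cores per SM, where cores per SM depends on the compute capability. The table below covers only a few well-known architectures and would need to be kept current; how you query the SM count and compute capability from the driver is not shown here.

```julia
# FP32 cores per streaming multiprocessor, keyed on compute capability.
function cores_per_sm(major::Int, minor::Int)
    major == 3 && return 192                      # Kepler
    major == 5 && return 128                      # Maxwell
    major == 6 && return (minor == 0 ? 64 : 128)  # Pascal (P100 vs. consumer)
    major == 7 && return 64                       # Volta / Turing
    error("unknown compute capability $major.$minor")
end

total_cuda_cores(sm_count::Int, major::Int, minor::Int) =
    sm_count * cores_per_sm(major, minor)

# e.g. a GTX 1080 reports 20 SMs at compute capability 6.1:
total_cuda_cores(20, 6, 1)  # 2560
```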
Currently throws a warning:
I think this may warrant some discussion. In DiffEq, we like to have our defaults be as good as possible. This follows the idea of the Xeon Phi, which auto-offloaded large matrix computations when it knew they would give a speedup. We would like to do the same. This PR is not tuned yet, and we might want a stricter GPU memory restriction (and to make sure that the matrix will fit), but those are details.
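The kind of overridable default described above could be sketched as a size-based dispatch. The threshold here is a placeholder, not a tuned value, and the GPU branch is shown only as a comment so the sketch stays CPU-runnable:

```julia
using LinearAlgebra

const GPU_THRESHOLD = 1000   # placeholder crossover dimension, not tuned

should_offload(A::AbstractMatrix) = size(A, 1) >= GPU_THRESHOLD

function autosolve(A::AbstractMatrix, b::AbstractVector)
    if should_offload(A)
        # In the PR, this branch would move A and b to the GPU and use
        # CUSOLVER's factorization, e.g. Array(qr(CuArray(A)) \ CuArray(b)).
    end
    return A \ b   # default: dense factorization on the CPU
end

autosolve([2.0 0.0; 0.0 4.0], [2.0, 8.0])  # [1.0, 2.0]
```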
The question isn't whether we want to do something like this (we definitely want it as an overridable default), but whether this is a good or correct way to do it. Using Requires.jl would require that the user does

```julia
using CuArrays
```

which defeats the whole purpose, since it then depends on the user remembering to add a `using` statement for a package they don't otherwise use. I am curious if @vchuravy and @StefanKarpinski have comments.
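For concreteness, the Requires.jl pattern under discussion looks roughly like the following; the UUID should be checked against CuArrays' Project.toml in the General registry:

```julia
# Sketch: GPU code paths load only if the user also does `using CuArrays`.
using Requires

function __init__()
    @require CuArrays="3a865a2d-5b23-5a0f-bc46-62713ec82fae" begin
        # GPU-specific linsolve defaults would be defined here,
        # e.g. the ldiv! method from the diff above.
    end
end
```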