-
-
Notifications
You must be signed in to change notification settings - Fork 51
/
solvers.md
161 lines (127 loc) · 7.4 KB
/
solvers.md
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
# [Linear System Solvers](@id nonlinearsystemsolvers)
`solve(prob::LinearProlem,alg;kwargs)`
Solves for ``Au=b`` in the problem defined by `prob` using the algorithm
`alg`. If no algorithm is given, a default algorithm will be chosen.
This page is solely focused on the methods for nonlinear systems.
## Recommended Methods
The default algorithm `nothing` is good for choosing an algorithm that will work,
but one may need to change this to receive more performance or precision. If
more precision is necessary, `QRFactorization()` and `SVDFactorization()` are
the best choices, with SVD being the slowest but most precise.
For efficiency, `RFLUFactorization` is the fastest for dense LU-factorizations.
For sparse LU-factorizations, `KLUFactorization` if there is less structure
to the sparsity pattern and `UMFPACKFactorization` if there is more structure.
Pardiso.jl's methods are also known to be very efficient sparse linear solvers.
As sparse matrices get larger, iterative solvers tend to get more efficient than
factorization methods if a lower tolerance of the solution is required.
IterativeSolvers.jl uses a low-rank Q update in its GMRES so it tends to be
faster than Krylov.jl for CPU-based arrays, but it's only compatible with
CPU-based arrays whilc Krylov.jl is more general and will support accelerators
like CUDA.
## Full List of Methods
### RecursiveFactorization.jl
- `RFLUFactorization()`: a fast pure Julia LU-factorization implementation
using RecursiveFactorization.jl. This is by far the fastest LU-factorization
implementation, usually outperforming OpenBLAS and MKL, but generally optimized
only for Base `Array` with `Float32`, `Float64`, `ComplexF32`, and `ComplexF64`.
### Base.LinearAlgebra
These overloads tend to work for many array types, such as `CuArrays` for GPU-accelerated
solving, using the overloads provided by the respective packages. Given that this can be
customized per-package, details given below describe a subset of important arrays
(`Matrix`, `SparseMatrixCSC`, `CuMatrix`, etc.)
- `LUFactorization(pivot=LinearAlgebra.RowMaximum())`: Julia's built in `lu`.
- On dense matrices this uses the current BLAS implementation of the user's computer
which by default is OpenBLAS but will use MKL if the user does `using MKL` in their
system.
- On sparse matrices this will use UMFPACK from SuiteSparse. Note that this will not
cache the symbolic factorization.
- On CuMatrix it will use a CUDA-accelerated LU from CuSolver.
- On BandedMatrix and BlockBandedMatrix it will use a banded LU.
- `QRFactorization(pivot=LinearAlgebra.NoPivot(),blocksize=16)`: Julia's built in `qr`.
- On dense matrices this uses the current BLAS implementation of the user's computer
which by default is OpenBLAS but will use MKL if the user does `using MKL` in their
system.
- On sparse matrices this will use SPQR from SuiteSparse
- On CuMatrix it will use a CUDA-accelerated QR from CuSolver.
- On BandedMatrix and BlockBandedMatrix it will use a banded QR.
- `SVDFactorization(full=false,alg=LinearAlgebra.DivideAndConquer())`: Julia's built in `svd`.
- On dense matrices this uses the current BLAS implementation of the user's computer
which by default is OpenBLAS but will use MKL if the user does `using MKL` in their
system.
- `GenericFactorization(fact_alg)`: Constructs a linear solver from a generic
factorization algorithm `fact_alg` which complies with the Base.LinearAlgebra
factorization API. Quoting from Base:
- If `A` is upper or lower triangular (or diagonal), no factorization of `A` is
required and the system is solved with either forward or backward substitution.
For non-triangular square matrices, an LU factorization is used.
For rectangular `A` the result is the minimum-norm least squares solution computed by a
pivoted QR factorization of `A` and a rank estimate of `A` based on the R factor.
When `A` is sparse, a similar polyalgorithm is used. For indefinite matrices, the `LDLt`
factorization does not use pivoting during the numerical factorization and therefore the
procedure can fail even for invertible matrices.
### SuiteSparse.jl
By default, the SuiteSparse.jl are implemented for efficiency by caching the
symbolic factorization. I.e. if `set_A` is used, it is expected that the new
`A` has the same sparsity pattern as the previous `A`. If this algorithm is to
be used in a context where that assumption does not hold, set `reuse_symbolic=false`.
- `KLUFactorization(;reuse_symbolic=true)`: A fast sparse LU-factorization which
specializes on sparsity patterns with "less structure".
- `UMFPACKFactorization(;reuse_symbolic=true)`: A fast sparse multithreaded
LU-factorization which specializes on sparsity patterns that are more
structured.
### Pardiso.jl
This package is not loaded by default. Thus in order to use this package you
must first `using Pardiso`. The following algorithms are pre-specified:
- `MKLPardisoFactorize(;kwargs...)`: A sparse factorization method.
- `MKLPardisoIterate(;kwargs...)`: A mixed factorization+iterative method.
Those algorithms are defined via:
```julia
MKLPardisoFactorize(;kwargs...) = PardisoJL(;fact_phase=Pardiso.NUM_FACT,
solve_phase=Pardiso.SOLVE_ITERATIVE_REFINE,
kwargs...)
MKLPardisoIterate(;kwargs...) = PardisoJL(;solve_phase=Pardiso.NUM_FACT_SOLVE_REFINE,
kwargs...)
```
The full set of keyword arguments for `PardisoJL` are:
```julia
Base.@kwdef struct PardisoJL <: SciMLLinearSolveAlgorithm
nprocs::Union{Int, Nothing} = nothing
solver_type::Union{Int, Pardiso.Solver, Nothing} = nothing
matrix_type::Union{Int, Pardiso.MatrixType, Nothing} = nothing
fact_phase::Union{Int, Pardiso.Phase, Nothing} = nothing
solve_phase::Union{Int, Pardiso.Phase, Nothing} = nothing
release_phase::Union{Int, Nothing} = nothing
iparm::Union{Vector{Tuple{Int,Int}}, Nothing} = nothing
dparm::Union{Vector{Tuple{Int,Int}}, Nothing} = nothing
end
```
### CUDA.jl
Note that `CuArrays` are supported by `GenericFactorization` in the "normal" way.
The following are non-standard GPU factorization routines.
- `GPUOffloadFactorization()`: An offloading technique used to GPU-accelerate CPU-based
computations. Requires a sufficiently large `A` to overcome the data transfer
costs.
### IterativeSolvers.jl
- `IterativeSolversJL_CG(args...;kwargs...)`: A generic CG implementation
- `IterativeSolversJL_GMRES(args...;kwargs...)`: A generic GMRES implementation
- `IterativeSolversJL_BICGSTAB(args...;kwargs...)`: A generic BICGSTAB implementation
- `IterativeSolversJL_MINRES(args...;kwargs...)`: A generic MINRES implementation
The general algorithm is:
```julia
IterativeSolversJL(args...;
generate_iterator = IterativeSolvers.gmres_iterable!,
Pl=nothing, Pr=nothing,
gmres_restart=0, kwargs...)
```
### Krylov.jl
- `KrylovJL_CG(args...;kwargs...)`: A generic CG implementation
- `KrylovJL_GMRES(args...;kwargs...)`: A generic GMRES implementation
- `KrylovJL_BICGSTAB(args...;kwargs...)`: A generic BICGSTAB implementation
- `KrylovJL_MINRES(args...;kwargs...)`: A generic MINRES implementation
The general algorithm is:
```julia
KrylovJL(args...; KrylovAlg = Krylov.gmres!,
Pl=nothing, Pr=nothing,
gmres_restart=0, window=0,
kwargs...)
```