Additional functionalities for exp utility: error estimate and IOP #372

Merged
merged 5 commits on Jun 4, 2018

Conversation

MSeeker1340 (Contributor)

The aim of this PR is to address the second point of #243 (comment).
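
(For context: the corrector reuses the last Arnoldi term as a correction instead of discarding it, along the lines of the classical corrected Krylov scheme (cf. Saad 1992). Schematically, with β = ‖b‖ and (V, H) the extended Arnoldi decomposition, it is presumably of the form

    φ_p(tA) b ≈ β V_m φ_p(tH_m) e_1 + β t h_{m+1,m} [e_mᵀ φ_{p+1}(tH_m) e_1] v_{m+1}

where the added term is exactly the quantity that can double as an a posteriori error estimate.)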

Some numerical results for the corrector:

using OrdinaryDiffEq: phi, phiv
n = 100
K = 4
A = diagm(-2 * ones(n)) + diagm(ones(n - 1), -1) + diagm(ones(n - 1), 1)
b = randn(n);
function error_comparison(K, t, m)
    # Compute exact result using dense phi
    w_exact = zeros(n, K)
    P = phi(t*A, K)
    for i = 1:K
        w_exact[:, i] = P[i] * b
    end
    # Krylov with and without correction
    w = phiv(t, A, b, K; m=m, correct=false)[:, 1:K]
    w_correct = phiv(t, A, b, K; m=m, correct=true)[:, 1:K]
    err = [norm(w[:,i] - w_exact[:,i]) / norm(w_exact[:,i]) for i in 1:K]
    err_correct = [norm(w_correct[:,i] - w_exact[:,i]) / norm(w_exact[:,i]) for i in 1:K]
    for i = 1:K
        println("p = $(i-1), before: $(err[i]), after: $(err_correct[i])")
    end
end;
error_comparison(4, 5.0, 20)
p = 0, before: 1.851580144556005e-8, after: 1.1866137192062755e-8
p = 1, before: 1.1604243216282569e-9, after: 6.930957387640787e-10
p = 2, before: 1.4084870014088947e-10, after: 7.894351673381101e-11
p = 3, before: 2.417853000686249e-11, after: 1.2786553243680304e-11
error_comparison(4, 5.0, 15)
p = 0, before: 1.670064384606024e-5, after: 1.3529002061306636e-5
p = 1, before: 1.523366736059646e-6, after: 1.1210370888360611e-6
p = 2, before: 2.577756909429866e-7, after: 1.7459774282018524e-7
p = 3, before: 5.971387392772451e-8, after: 3.7583355289957666e-8
error_comparison(4, 5.0, 10)
p = 0, before: 0.0039036186519269406, after: 0.004777657784517128
p = 1, before: 0.0005983342152897502, after: 0.0006318794323758182
p = 2, before: 0.00015660771392520995, after: 0.0001468087636175307
p = 3, before: 5.3046898782640144e-5, after: 4.495612453983268e-5
error_comparison(4, 10.0, 20)
p = 0, before: 2.7920782891613483e-5, after: 3.5001350862247234e-5
p = 1, before: 2.3777873965132265e-6, after: 2.740946017107914e-6
p = 2, before: 3.8466149667992783e-7, after: 4.123593087931721e-7
p = 3, before: 8.52677289464572e-8, after: 8.57052414404906e-8
error_comparison(4, 10.0, 15)
p = 0, before: 0.0016615112563970835, after: 0.0025777546011967675
p = 1, before: 0.00021408895792534243, after: 0.00029448496471261217
p = 2, before: 4.9115234177339934e-5, after: 6.131688135215691e-5
p = 3, before: 1.477590119497644e-5, after: 1.6994996067541433e-5
error_comparison(4, 10.0, 10)
p = 0, before: 0.041799641994719756, after: 0.08964522220170576
p = 1, before: 0.008921966901360257, after: 0.016377920187105897
p = 2, before: 0.0030763723943405454, after: 0.005008332379204382
p = 3, before: 0.0013105118114586309, after: 0.0019318170942597493

So the corrector works best when the error is already small, but it actually amplifies the error when the error is large. IMO this makes it not very practical in real use... on the other hand, using the correction term purely as an error estimate should work well.

@MSeeker1340 (Contributor Author)

I'm not very sure about the interface for the error estimate at the moment. Do we just add an output argument err to phiv!? That seems to overcomplicate the interface a bit, and we don't always want the error estimate. How can I add a non-allocating output argument that is optional, without having to maintain two separate versions of the function?
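
One pattern that avoids maintaining two versions (a hypothetical sketch with illustrative names, not necessarily what this PR ends up with) is an opt-in keyword that changes only the return value:

# Hypothetical sketch: compute and return the estimate only when asked for.
# The stand-in bodies mark where the real Krylov computation would go.
function phiv_demo!(w, t, A, b, K; m=30, errest=false)
    fill!(w, 0)                   # stand-in for the in-place Krylov computation
    if errest
        err = norm(b) * eps()     # stand-in for the last-Arnoldi-term estimate
        return w, err
    end
    return w
end

Since the estimate is a scalar it needs no output buffer anyway, so the non-allocating concern only really arises if a per-phi-order estimate is wanted.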

@coveralls

coveralls commented May 30, 2018

Coverage increased (+5.2%) to 83.615% when pulling 3004f66 on MSeeker1340:krylov-error-estimate into 57f1eae on JuliaDiffEq:master.

@codecov

codecov bot commented May 31, 2018

Codecov Report

Merging #372 into master will decrease coverage by 0.02%.
The diff coverage is 65.51%.

@@            Coverage Diff             @@
##           master     #372      +/-   ##
==========================================
- Coverage   83.64%   83.61%   -0.03%     
==========================================
  Files          75       75              
  Lines       21639    21652      +13     
==========================================
+ Hits        18099    18104       +5     
- Misses       3540     3548       +8
Impacted Files Coverage Δ
src/exponential_utils.jl 83.03% <65.51%> (-3.82%) ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 57f1eae...3004f66.

@MSeeker1340 (Contributor Author)

So I probably overcomplicated things yesterday regarding error estimates... In both phipm and KIOPS, the only error estimate actually needed is the one for the highest-order phi (in the case of KIOPS it's the exp). So there's no need to return a separate error estimate for all the outputs of phiv(!), at least for now.
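
For reference, the classical estimate of this kind (Saad 1992) reads off the last Arnoldi term; a minimal sketch for expv in 0.6-style Julia, assuming H is the extended (m+1)×m Hessenberg matrix and that the estimate for the highest-order phi is analogous:

# Saad-style a posteriori estimate for exp(t*A)*b (sketch; the PR's exact
# formula may differ). beta = norm(b), H = extended Hessenberg from Arnoldi.
function expv_errest(t, H, m, beta)
    expH = expm(t * H[1:m, 1:m])                # small m×m matrix exponential
    return beta * H[m+1, m] * abs(expH[m, 1])   # last-term magnitude as proxy
end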

Error estimates in action:

using OrdinaryDiffEq: phi, phiv
n = 100
m = 15
K = 4
A = diagm(-2 * ones(n)) + diagm(ones(n - 1), -1) + diagm(ones(n - 1), 1)
b = randn(n)
t = 5.0;
P = phi(t*A, K)
w_exact = P[K] * b
w, errest = phiv(t, A, b, K; m=m, errest=true)
err = norm(w[:,K] - w_exact)
println("Error: $err, error estimate: $errest")
Error: 4.9003876156387905e-8, error estimate: 7.513596116146876e-8

@MSeeker1340 MSeeker1340 changed the title [WIP] Error estimate and correction for Krylov phiv Error estimate and correction for Krylov phiv May 31, 2018
@MSeeker1340 MSeeker1340 changed the title Error estimate and correction for Krylov phiv Additional functionalities for exp utility: error estimate and IOP Jun 1, 2018
@MSeeker1340 (Contributor Author)

IOP is easier to implement than I thought, so I'll just append the commit here.

Because I'm not using a dedicated data structure for H (is there one, btw? H here is an upper Hessenberg matrix with a length-p upper band), all I need to modify is adding a bound to the inner arnoldi loop.

It turns out expv and phiv stay just the same, as does the code for the error estimate and correction. IOP does change how adaptation is performed, but that is not the concern of this PR.
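
To make the "bound on the inner loop" concrete, here is a minimal standalone sketch (illustrative names, not the PR's actual arnoldi!; happy breakdown not handled):

# Arnoldi with incomplete orthogonalization (IOP): orthogonalize the new
# Krylov vector against only the last `iop` basis vectors instead of all
# previous ones. Full Arnoldi corresponds to iop >= m.
function arnoldi_iop(A, b, m, iop)
    n = length(b)
    V = zeros(n, m + 1)        # orthonormal(ish) basis
    H = zeros(m + 1, m)        # upper Hessenberg with a length-iop upper band
    V[:, 1] = b / norm(b)
    for j = 1:m
        w = A * V[:, j]
        for i = max(1, j - iop + 1):j   # <-- the IOP bound; full Arnoldi uses 1:j
            H[i, j] = dot(V[:, i], w)
            w -= H[i, j] * V[:, i]
        end
        H[j + 1, j] = norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    end
    return V, H
end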

As a reference, some papers about using IOP to approximate matrix exponential-vector products:

https://www.researchgate.net/profile/Antti_Koskela/publication/273697107_Approximating_the_Matrix_Exponential_of_an_Advection-Diffusion_Operator_Using_the_Incomplete_Orthogonalization_Method/links/550954880cf27e990e0e5045/Approximating-the-Matrix-Exponential-of-an-Advection-Diffusion-Operator-Using-the-Incomplete-Orthogonalization-Method.pdf

https://onlinelibrary.wiley.com/doi/pdf/10.1002/nla.2090 (Algorithm 1 gives a clear definition of IOP).

@MSeeker1340 (Contributor Author)

Sanity checks:

using OrdinaryDiffEq: arnoldi, getH
srand(0)
n = 100
m = 15
A = diagm(-2 * ones(n)) + diagm(ones(n - 1), -1) + diagm(ones(n - 1), 1)
Aperm = A + 1e-5 * randn(n, n)
b = randn(n);

For symmetric matrices, IOP(2) should be almost the same as Lanczos:

Ks = arnoldi(A, b; m=m) # Lanczos
Ksperm = arnoldi(Aperm, b; m=m) # full Arnoldi
Ksperm2 = arnoldi(Aperm, b; m=m, iop=2); # IOP(2)
getH(Ks)
16×15 SubArray{Float64,2,Array{Float64,2},Tuple{UnitRange{Int64},UnitRange{Int64}},false}:
 -2.21741   1.2582    0.0       0.0       …   0.0        0.0        0.0    
  1.2582   -1.84706   1.1293    0.0           0.0        0.0        0.0    
  0.0       1.1293   -2.05088   1.07224       0.0        0.0        0.0    
  0.0       0.0       1.07224  -2.03328       0.0        0.0        0.0    
  0.0       0.0       0.0       0.844917      0.0        0.0        0.0    
  0.0       0.0       0.0       0.0       …   0.0        0.0        0.0    
  0.0       0.0       0.0       0.0           0.0        0.0        0.0    
  0.0       0.0       0.0       0.0           0.0        0.0        0.0    
  0.0       0.0       0.0       0.0           0.0        0.0        0.0    
  0.0       0.0       0.0       0.0           0.0        0.0        0.0    
  0.0       0.0       0.0       0.0       …   0.0        0.0        0.0    
  0.0       0.0       0.0       0.0           0.984855   0.0        0.0    
  0.0       0.0       0.0       0.0          -1.87424    0.946788   0.0    
  0.0       0.0       0.0       0.0           0.946788  -1.89749    0.98828
  0.0       0.0       0.0       0.0           0.0        0.98828   -2.16322
  0.0       0.0       0.0       0.0       …   0.0        0.0        1.00082
getH(Ksperm)
16×15 SubArray{Float64,2,Array{Float64,2},Tuple{UnitRange{Int64},UnitRange{Int64}},false}:
 -2.2174   1.25819  -1.20995e-6   4.74473e-6  …  -1.56411e-5  -1.09724e-5
  1.2582  -1.84709   1.1293      -9.36381e-6     -1.28886e-5   3.87006e-6
  0.0      1.1293   -2.05088      1.07224        -8.84203e-6   1.21384e-5
  0.0      0.0       1.07224     -2.03327         1.99436e-5   9.47316e-6
  0.0      0.0       0.0          0.844914        1.74995e-5   4.67657e-6
  0.0      0.0       0.0          0.0         …   4.02996e-7   1.5533e-5 
  0.0      0.0       0.0          0.0            -1.25157e-5   2.853e-7  
  0.0      0.0       0.0          0.0             2.05337e-5   1.3109e-6 
  0.0      0.0       0.0          0.0            -1.6915e-5    6.61577e-7
  0.0      0.0       0.0          0.0            -4.99333e-6   1.18318e-5
  0.0      0.0       0.0          0.0         …  -2.15769e-6  -7.03456e-6
  0.0      0.0       0.0          0.0             1.90411e-5   1.22223e-6
  0.0      0.0       0.0          0.0             0.946821     7.6984e-6 
  0.0      0.0       0.0          0.0            -1.89743      0.988291  
  0.0      0.0       0.0          0.0             0.988298    -2.16308   
  0.0      0.0       0.0          0.0         …   0.0          1.00084   
getH(Ksperm2)
16×15 SubArray{Float64,2,Array{Float64,2},Tuple{UnitRange{Int64},UnitRange{Int64}},false}:
 -2.2174   1.25819   0.0       0.0       …   0.0        0.0        0.0     
  1.2582  -1.84709   1.1293    0.0           0.0        0.0        0.0     
  0.0      1.1293   -2.05088   1.07224       0.0        0.0        0.0     
  0.0      0.0       1.07224  -2.03327       0.0        0.0        0.0     
  0.0      0.0       0.0       0.844914      0.0        0.0        0.0     
  0.0      0.0       0.0       0.0       …   0.0        0.0        0.0     
  0.0      0.0       0.0       0.0           0.0        0.0        0.0     
  0.0      0.0       0.0       0.0           0.0        0.0        0.0     
  0.0      0.0       0.0       0.0           0.0        0.0        0.0     
  0.0      0.0       0.0       0.0           0.0        0.0        0.0     
  0.0      0.0       0.0       0.0       …   0.0        0.0        0.0     
  0.0      0.0       0.0       0.0           0.984791   0.0        0.0     
  0.0      0.0       0.0       0.0          -1.87428    0.946821   0.0     
  0.0      0.0       0.0       0.0           0.946813  -1.89743    0.988291
  0.0      0.0       0.0       0.0           0.0        0.988298  -2.16308 
  0.0      0.0       0.0       0.0       …   0.0        0.0        1.00084 

I'll do a detailed performance-cost analysis for IOP in the afternoon, with large sparse A in mind.

@MSeeker1340 (Contributor Author)

Some test results for IOP:

The test case is an n = 1000 sparse 1D Laplacian with reflecting boundary conditions. The reason for the reflecting BC is that it makes the Laplacian non-symmetric, so Lanczos does not apply. But since it is "almost symmetric", we should expect good performance from IOP(2).
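
For concreteness, one way to build such an operator (a sketch using the ghost-point discretization; the attached test script may construct it differently):

# Sparse 1D Laplacian with reflecting (Neumann) BC via ghost points
# u_0 = u_2 and u_{n+1} = u_{n-1}. The doubled boundary entries break
# symmetry only slightly, so Lanczos doesn't apply but IOP(2) should do well.
n = 1000
Asparse = spdiagm((ones(n - 1), -2 * ones(n), ones(n - 1)), (-1, 0, 1))
Asparse[1, 2] = 2.0          # boundary row: -2*u_1 + 2*u_2
Asparse[n, n - 1] = 2.0      # boundary row: 2*u_{n-1} - 2*u_n
Adense = full(Asparse)       # dense version for the comparison below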

As a comparison, the Laplacian is also converted to a dense matrix. The two versions differ a lot in their A_mul_B! cost:

@time A_mul_B!(w, Asparse, b);
  0.000009 seconds (4 allocations: 160 bytes)
@time A_mul_B!(w, Adense, b);
  0.001130 seconds (4 allocations: 160 bytes)

This means that arnoldi will be dominated by dot and axpy! in the sparse case and by A_mul_B! in the dense case. Indeed, this is confirmed by the profiler:

[Profiler screenshots omitted: sparse A (dominated by dot/axpy!) and dense A (dominated by A_mul_B!).]

Because IOP reduces the number of dot products and axpy! calls, we expect to see a large performance increase in the sparse case but not in the dense case:

@time profile_arnoldi(Asparse, 0);
  0.000901 seconds (2.61 k allocations: 130.250 KiB)
@time profile_arnoldi(Asparse, 2);
  0.000390 seconds (257 allocations: 20.000 KiB)
@time profile_arnoldi(Adense, 0);
  0.039249 seconds (2.61 k allocations: 138.250 KiB)
@time profile_arnoldi(Adense, 2);
  0.040866 seconds (260 allocations: 28.000 KiB)

(Here an IOP value of 0 indicates full Arnoldi).
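
(For reference, profile_arnoldi above is presumably a thin wrapper along these lines; the real helper is in the attached script:)

# Hypothetical stand-in for the timed helper: one arnoldi call with the
# given IOP length (0 = full Arnoldi), reusing a fixed b and m = 15.
profile_arnoldi(A, iop) = arnoldi(A, b; m=15, iop=iop)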

Finally, the work-precision diagrams for the sparse case (note that I didn't use many trials, so the results jiggle a bit):

[Work-precision diagrams omitted: h5, h20, h100.]

It is safe to say that in this case using IOP is a strictly better choice.

If A is less symmetric, the results may not be as clear-cut and we may need longer IOP lengths. But since most differential operators arising from PDEs fall into this "almost symmetric" category, I believe we'll be fine using IOP(2).

Full test script:

Incomplete Orthogonalization Procedure.zip

@ChrisRackauckas (Member)

Looks great!

 end
 # TODO: switch to overload `getproperty` in v0.7
-getH(Ks::KrylovSubspace) = @view(Ks.H[1:Ks.m, 1:Ks.m])
 getV(Ks::KrylovSubspace) = @view(Ks.V[:, 1:Ks.m])
+getH(Ks::KrylovSubspace) = @view(Ks.H[1:Ks.m + 1, 1:Ks.m])
Member

why is everything +1?

Contributor Author
This is to get the "extended" H and V matrices from Arnoldi. Originally I left out the last terms to make the outputs of getH and getV square, for easier coding.

Since the error estimate & correction now need the extra terms, I added them back and also changed the "happy-breakdown" termination logic of the Arnoldi loop so that they are always calculated.
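
In other words, the +1 restores the extended Arnoldi relation that the estimate and correction read off. A quick illustrative check (reusing A, b, arnoldi, and getH from the sanity-check snippet above, and assuming getV likewise returns the extended n×(m+1) basis):

# Extended Arnoldi relation: A * V[:, 1:m] ≈ V * H, with V of size n×(m+1)
# and H of size (m+1)×m.
Ks = arnoldi(A, b; m=15)
V, H = getV(Ks), getH(Ks)
norm(A * V[:, 1:15] - V * H)   # should be ≈ 0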

@@ -228,6 +238,9 @@ function arnoldi!(Ks::KrylovSubspace{B, T}, A, b::AbstractVector{T}; tol=1e-7,
     end
     V, H = getV(Ks), getH(Ks)
     vtol = tol * norm(A, Inf)
+    if iop == 0
+        iop = m
+    end
Member

In the PR you only show p = 2? What does higher look like?

Contributor Author

For the output of iop=p, the Hessenberg matrix H will have a length-p upper band (i.e. only p non-zero diagonals in the upper part). I checked this before running the performance tests and it's working as intended; see the check below.

Performance- and accuracy-wise I haven't really tested yet, but the idea is that increasing p buys better accuracy at worse performance. For the almost-symmetric operators in the tests above, IOP(2) is already good enough, so there was no need to check higher p. For general operators I would expect a more visible performance-accuracy tradeoff when adjusting p.
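
A quick sanity check of the band structure (illustrative, reusing Aperm and b from the earlier sanity checks):

# For iop = p, the truncated orthogonalization never touches H[i, j]
# with i <= j - p, so those entries should be exactly zero.
p = 3
Ks = arnoldi(Aperm, b; m=15, iop=p)
H = getH(Ks)
all(H[i, j] == 0 for j = 1:15 for i = 1:j-p)   # expect true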

@MSeeker1340 (Contributor Author)

One thing I forgot to mention: from the profiling results it is clear that arnoldi is compute-bound in both the dense and sparse cases, so I don't think there's a need for a dedicated banded matrix type for H, as I originally thought.

At best it would save some space when the KrylovSubspace is first constructed. But that happens only once per integrator, so the difference would be negligible.

OK to merge this PR now?
