Exponential utilities - phi functions #355

Merged
merged 7 commits into JuliaDiffEq:master on May 22, 2018

MSeeker1340 (Contributor) commented May 21, 2018

Includes what I've been working on for the past week (scalar/dense matrix phi functions using the Sidje formula, as well as a simple one-step Krylov phimv).

I have written the crucial parts to be non-allocating whenever possible. @ChrisRackauckas, can you help take a look at it, especially the code regarding the caches? This is my first time writing performance-intensive code, so I'm not sure I got everything right.

BTW, since phimv is currently being imported from Expokit, I renamed phimv and phimv! to _phimv and _phimv! to avoid a name clash. The aim is, of course, to eventually drop the Expokit dependency and just use our more generic, arbitrary-order phimv.
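For context, the phi functions satisfy the recurrence phi_0(z) = exp(z), phi_{k+1}(z) = (phi_k(z) - 1/k!)/z. The sketch below is a hypothetical scalar helper (the name phi_recurrence and its signature are made up, not the PR's code); it shows the naive recurrence, which is exactly the numerically fragile approach the stability discussion in this thread is about:

```julia
# Hypothetical sketch of the scalar phi recurrence (not the code in this PR):
#   phi_0(z) = exp(z),   phi_{k+1}(z) = (phi_k(z) - 1/k!) / z
# Note: this naive recurrence suffers catastrophic cancellation for small z.
function phi_recurrence(z, K)
    out = zeros(typeof(exp(z)), K + 1)
    out[1] = exp(z)                      # phi_0
    for k in 0:K-1
        out[k + 2] = (out[k + 1] - 1 / factorial(k)) / z
    end
    return out
end
```

For example, phi_recurrence(1.0, 2) returns approximately [e, e - 1, e - 2], since phi_1(1) = e - 1 and phi_2(1) = e - 2.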

coveralls commented May 21, 2018

Coverage Status

Coverage decreased (-29.2%) to 55.546% when pulling 9582c11 on MSeeker1340:exponential_utils into 5b58be8 on JuliaDiffEq:master.

codecov bot commented May 21, 2018

Codecov Report

❗️ No coverage uploaded for pull request base (master@f9e1178). Click here to learn what that means.
The diff coverage is 89.28%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #355   +/-   ##
=========================================
  Coverage          ?   55.54%           
=========================================
  Files             ?       75           
  Lines             ?    20781           
  Branches          ?        0           
=========================================
  Hits              ?    11543           
  Misses            ?     9238           
  Partials          ?        0
Impacted Files Coverage Δ
src/exponential_utils.jl 89.28% <89.28%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f9e1178...507a8e9. Read the comment docs.


MSeeker1340 (Contributor) commented May 21, 2018

Changed all allocating array slicing to views/@inbounds loops.

I have some reservations about phim, though. phimv (and, underneath it, phimv_dense! and arnoldi!) is called repeatedly by the integrators, so we should aim for maximal performance there. phim, on the other hand, is mainly used by small systems to precompute the expRK coefficients just once, so its performance doesn't really matter (the dimensions involved are usually small, too). Would it be better to use simpler notation that makes the code easier to read?


ChrisRackauckas (Member) commented May 21, 2018

Maybe I've been doing this too long, but IMO in-place code just looks more like proper package code, while code that allocates a lot looks like a script. It doesn't matter all that much (these allocations would definitely not change the timings significantly), but to make it all match I would prefer to see it use views and in-place operations.


MSeeker1340 (Contributor) commented May 22, 2018

From slack:

I've been thinking about Expokit's approach of decomposing exp(tA)v into smaller time intervals. While the idea is appealing, I think we should probably stick to plain, one-step Krylov, for two reasons.

First, while exp has the nice semigroup property, the same isn't true for the phis. While it is possible to also express phi_1(tA)v as a series of steps (Expokit paper, eq. (17)), I don't think this generalizes easily to higher-order phis.

Second, there's the structure of the expRK algorithms. In many of the algorithms we need to evaluate phi_k(tL)v for different t but the same L and v. For the one-step Krylov method, we can simply do one Arnoldi iteration and reuse the results for different t. If we were to use a method similar to Expokit's, however, this is not enough, because the intermediate stages are not known in advance and depend on t.

The advantage of Expokit's method is that we can use a large t... but since finer time steps are used under the hood, this is most likely an illusion. Since we are doing ODE integration anyway, why not just use a smaller time step to begin with? The whole accuracy-control side of Expokit can probably be delegated to the integrator.

Nevertheless, the error-estimation part of Expokit is IMO still useful, especially for the adaptive expRK methods we eventually aim to build.
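The reuse argument can be sketched as follows (hypothetical helper names, written in current Julia syntax rather than this PR's in-place arnoldi!/phimv): after one Arnoldi decomposition A*V ≈ V*H with beta = norm(b), we have exp(t*A)*b ≈ beta * V * exp(t*H) * e1 for any t, so changing t only costs a small m-by-m matrix exponential.

```julia
using LinearAlgebra

# Hypothetical sketch (not the PR's in-place arnoldi!): m steps of Arnoldi
# on (A, b) give A*V ≈ V*H, with orthonormal columns in V and beta = norm(b).
function arnoldi_sketch(A, b, m)
    n = length(b)
    V = zeros(n, m + 1)
    H = zeros(m + 1, m)
    beta = norm(b)
    V[:, 1] = b / beta
    for j in 1:m
        w = A * V[:, j]
        for i in 1:j                       # modified Gram-Schmidt
            H[i, j] = dot(V[:, i], w)
            w -= H[i, j] * V[:, i]
        end
        H[j + 1, j] = norm(w)
        H[j + 1, j] > 0 || break           # happy breakdown: invariant subspace
        V[:, j + 1] = w / H[j + 1, j]
    end
    return V, H, beta
end

# The decomposition can be reused for every t: only the small m-by-m matrix
# exponential depends on t, not the Arnoldi iteration itself.
expv_from_arnoldi(t, V, H, beta, m) = beta * V[:, 1:m] * exp(t * H[1:m, 1:m])[:, 1]
```

The same V, H, beta serve all the t values an expRK stage needs; an Expokit-style substepping scheme loses this property because its intermediate steps depend on t.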


MSeeker1340 (Contributor) commented May 22, 2018

I changed the caching versions of all the existing exponential integrators to use phim and removed the functions that calculate the caching operators using mixed precision.

This is more of a sanity check for phim, so the computation isn't exactly optimal. On the other hand, we need to think through the interface of expRK and related methods in detail in the future, so I wouldn't worry about optimization now.


ChrisRackauckas (Member) commented May 22, 2018

Out of curiosity, what's the timing difference between the old and new dense constructions?


@ChrisRackauckas ChrisRackauckas merged commit 9a41a37 into JuliaDiffEq:master May 22, 2018

2 of 3 checks passed:

continuous-integration/appveyor/pr: Waiting for AppVeyor build to complete
continuous-integration/travis-ci/pr: The Travis CI build passed
coverage/coveralls: First build on master at 55.704%
ChrisRackauckas (Member) commented May 22, 2018

Thanks! Looks great. I'm still curious about the timing difference, though. In reality it doesn't make much of a difference since it's only the startup stage, so it's really the numerical stability that matters. But it would be good to have it documented.


MSeeker1340 (Contributor) commented May 22, 2018

Comparison between the old (mixed precision, master) and the new (Sidje phim, exponential_utils) approaches:

using OrdinaryDiffEq
using DiffEqOperators
Pkg.status("OrdinaryDiffEq")
 - OrdinaryDiffEq                3.12.0+            exponential_utils
srand(0)
n = 20
f1 = DiffEqArrayOperator(randn(n,n))
f2 = (u,p,t) -> u
u0 = randn(n)
dt = 0.01; tspan = (0.0, 1.0)
prob = SplitODEProblem(f1, f2, u0, tspan);
for method in [NorsettEuler, ETD2, ETDRK4]
    println(method)
    @time init(prob, NorsettEuler(); dt=dt)
end
OrdinaryDiffEq.NorsettEuler
  0.005885 seconds (976 allocations: 2.033 MiB)
OrdinaryDiffEq.ETD2
  0.003104 seconds (976 allocations: 2.033 MiB)
OrdinaryDiffEq.ETDRK4
  0.005806 seconds (976 allocations: 2.033 MiB)
using OrdinaryDiffEq
using DiffEqOperators
Pkg.status("OrdinaryDiffEq")
 - OrdinaryDiffEq                3.12.0+            master
srand(0)
n = 20
f1 = DiffEqArrayOperator(randn(n,n))
f2 = (u,p,t) -> u
u0 = randn(n)
dt = 0.01; tspan = (0.0, 1.0)
prob = SplitODEProblem(f1, f2, u0, tspan);
for method in [NorsettEuler, ETD2, ETDRK4]
    println(method)
    @time init(prob, NorsettEuler(); dt=dt)
end
OrdinaryDiffEq.NorsettEuler
  0.006184 seconds (46.44 k allocations: 2.043 MiB)
OrdinaryDiffEq.ETD2
  0.007714 seconds (46.44 k allocations: 2.043 MiB)
OrdinaryDiffEq.ETDRK4
  0.007179 seconds (46.44 k allocations: 2.043 MiB)

@MSeeker1340 MSeeker1340 deleted the MSeeker1340:exponential_utils branch May 22, 2018

MSeeker1340 (Contributor) commented May 22, 2018

Archiving last week's numerical stability analysis for future reference:

The straightforward way to calculate the phi functions is to either compute them directly or use the recurrence formula. Either way, the computation is unstable for very small arguments. Using mixed-precision calculations (BigFloat) can help, but also becomes unstable for higher-order phis.

[figure: phi_scalar]

[figure: phi_matrix]

Since the existing methods use at most phi_3, the mixed-precision approach works, but as can be seen above, as soon as we go to phi_4 it also breaks.

The alternative method, using a formula from Sidje (Sidje, R. B. (1998). Expokit: a software package for computing matrix exponentials. ACM Transactions on Mathematical Software (TOMS), 24(1), 130-156; Theorem 1), is unconditionally stable as long as expm is stable (that is, the matrix has no very large positive real eigenvalues). This is confirmed by numerical experiments:

[figure: phi_stability_alt]

[figure: stability_phim_sidje]

Finally, a notebook for the error analysis of expm, which should more or less indicate how phim using the Sidje formula would perform. The error grows with the spectral radius of the matrix, which could be problematic in some cases.

Error analysis of expm.pdf
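As a concrete illustration of the Sidje formula (Theorem 1), here is a hypothetical dense sketch (the helper name phiv_augmented is made up, it computes phi_k(A)*b rather than the full phim matrices, and it uses current Julia syntax with exp instead of the old expm): the matrix exponential of the augmented matrix [A B; 0 J], with B = [b 0 ... 0] and J the nilpotent shift, carries phi_1(A)*b, ..., phi_p(A)*b in the columns of its top-right block.

```julia
using LinearAlgebra

# Hypothetical sketch of the augmented-matrix formula (Sidje 1998, Theorem 1).
# For the nilpotent shift J, the top-right block of exp([A B; 0 J]) with
# B = [b 0 ... 0] holds phi_1(A)*b, ..., phi_p(A)*b in its columns.
function phiv_augmented(A, b, p)
    n = length(b)
    Atilde = zeros(n + p, n + p)
    Atilde[1:n, 1:n] = A
    Atilde[1:n, n + 1] = b                # B has b in its first column only
    for k in 1:p-1
        Atilde[n + k, n + k + 1] = 1.0    # J: ones on the superdiagonal
    end
    E = exp(Atilde)                       # one matrix exponential, all phis
    return [E[1:n, n + k] for k in 1:p]   # k-th entry is phi_k(A)*b
end
```

With this formulation the stability of all the phis reduces to the stability of a single expm call, which is why the error analysis of expm above is the relevant one.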

