
Warm start 2.0 #8

Open

andreadelprete (Owner) opened this issue Mar 8, 2020 · 0 comments

The previous warm-starting algorithm hasn't given good results, so I am investigating a new way of warm-starting the matrix exponential (all still very preliminary).

The key idea is that we have already computed e^A and now have to compute e^(A+B), where |B| << |A| (the norm being the 1-norm). The scaling-and-squaring algorithm relies on Padé approximants, which approximate the exponential as the ratio between two polynomials. Since the matrices A we are dealing with have a large norm (greater than 5.4), the polynomials used are always of order 13.

e^A = q(A)^-1 * p(A)
p(A) = sum_{i=0}^13 b_i A^i
q(A) = p(-A)

Computing e^(A+B) thus boils down to computing p(A+B) and q(A+B). Can we exploit the previous computations of p(A) and q(A) to do that?
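For concreteness, here is a minimal Python/NumPy sketch of the cold-start computation described above. The evaluation of p is deliberately naive (power by power, no 6-MMM trick) and the scaling/squaring steps are omitted; the coefficients are the standard order-13 Padé coefficients from the scaling-and-squaring literature.

```python
import numpy as np

# Standard order-13 Pade coefficients b_0..b_13 (as used in the
# scaling-and-squaring literature); listed here for self-containment.
B13 = [64764752532480000., 32382376266240000., 7771770303897600.,
       1187353796428800., 129060195264000., 10559470521600.,
       670442572800., 33522128640., 1323241920., 40840800.,
       960960., 16380., 182., 1.]

def pade_p(A, b=B13):
    """Naive evaluation of p(A) = sum_{i=0}^{13} b_i A^i (no 6-MMM scheme)."""
    P = np.zeros_like(A)
    Ai = np.eye(A.shape[0])      # A^0
    for bi in b:
        P = P + bi * Ai
        Ai = Ai @ A              # next power of A
    return P

def expm_pade13(A):
    """Cold start: e^A ~= q(A)^-1 p(A), with q(A) = p(-A).
    Scaling and squaring are omitted for brevity."""
    return np.linalg.solve(pade_p(-A), pade_p(A))
```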

Option 1

We could approximate the high-order (>k) terms of p(A+B) with those of p(A), thus evaluating only the low-order terms of p(A+B). Since |B| << |A| the high-order terms should be well approximated even if we replace A+B with A:

p(A+B) = sum_{i=0}^13 b_i (A+B)^i ≈ sum_{i=0}^k b_i (A+B)^i + sum_{i=k+1}^13 b_i A^i
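A rough Python/NumPy sketch of what Option 1 could look like (names and the split index k are illustrative; b is the coefficient list from the sketch above, and in a real warm start the high-order tail of p(A) would be cached from the previous solve rather than recomputed here for self-containment):

```python
import numpy as np

def p_option1(A, B, b, k):
    """Option 1 sketch: keep the terms of order > k from p(A) and only
    re-evaluate the terms of order <= k at A + B."""
    # High-order tail sum_{i=k+1}^{13} b_i A^i; in a warm start this would
    # already be available from the previous computation of p(A).
    tail = np.zeros_like(A)
    Ai = np.linalg.matrix_power(A, k + 1)
    for i in range(k + 1, len(b)):
        tail += b[i] * Ai
        Ai = Ai @ A

    # Low-order head sum_{i=0}^{k} b_i (A+B)^i, evaluated at the new matrix.
    C = A + B
    head = np.zeros_like(A)
    Ci = np.eye(A.shape[0])
    for i in range(k + 1):
        head += b[i] * Ci
        Ci = Ci @ C

    return head + tail
```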

Option 2

We could use a truncated Taylor expansion of p around A to evaluate p(A+B). For instance, we could stop at the first derivative:

p(A+B) ≈ p(A) + p'(A)*B

where the derivative:

p'(A) = sum_{i=1}^13 i * b_i A^(i-1)

can be easily computed in terms of the powers of A, which have already been evaluated when computing p(A). However, in the computation of p(A) not all the powers of A are explicitly formed, but only the powers 2, 4 and 6, thanks to an efficient scheme that evaluates p(A) with only 6 matrix-matrix multiplications (MMM) rather than 13. If we used the same scheme to evaluate p'(A), exploiting the fact that the first three multiplications have already been performed (they are the ones needed to compute the three powers of A mentioned above), we could get p'(A) with at most 3 MMM. Adding one more MMM to compute p'(A)*B gives a total of 4 MMM, rather than the usual 6 MMM we would need to evaluate p(A+B) from scratch.
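A sketch of Option 2 along these lines (Python/NumPy, with hypothetical helper names). Here p and p' are evaluated naively instead of through the shared 6-MMM scheme, and the same first-order expansion is applied at -A to update q(A+B) = p(-(A+B)); note that this naive version also spends one product on the q side, which is not included in the MMM count above, so it only illustrates the algebra, not the cost.

```python
import numpy as np

def pade_p_and_dp(A, b):
    """Return p(A) and p'(A) = sum_{i=1}^{13} i*b_i A^(i-1), naively;
    the real scheme would reuse the A^2, A^4, A^6 products instead."""
    P = np.zeros_like(A)
    dP = np.zeros_like(A)
    Ai = np.eye(A.shape[0])                 # holds A^i
    for i, bi in enumerate(b):
        P += bi * Ai
        if i + 1 < len(b):
            dP += (i + 1) * b[i + 1] * Ai   # term (i+1)*b_{i+1}*A^i of p'(A)
        Ai = Ai @ A
    return P, dP

def warm_expm_option2(A, B, b):
    """Warm start: e^(A+B) ~= q^-1 p with p(A+B) ~= p(A) + p'(A) B
    and q(A+B) = p(-(A+B)) ~= p(-A) - p'(-A) B."""
    pA, dpA = pade_p_and_dp(A, b)        # would be cached from the cold start
    qA, dqA = pade_p_and_dp(-A, b)       # idem for the -A side
    p_new = pA + dpA @ B                 # 1 extra MMM once p'(A) is cached
    q_new = qA - dqA @ B                 # 1 extra MMM for the q side
    return np.linalg.solve(q_new, p_new)
```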

This second option seems more promising to me because the error would be O(B^2), so it should work well for small |B|. The main issue I see is that the error would accumulate over the iterations, so we could only use this for a few iterations before needing to do a standard computation. On the bright side, once we have computed p'(A) we can reuse it several times, as long as the input matrix does not drift too far from A. Therefore, after the first warm iteration, which costs 4 MMM, the following ones would cost only 1 MMM, making this method extremely advantageous!
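As a purely hypothetical usage pattern, building on the sketch above: cache p, p' (and the -A counterparts) at a reference matrix and reuse them while the inputs stay close to it, refreshing with a full evaluation every few steps so the accumulated error stays bounded. `matrix_sequence` and the refresh period are made up for illustration.

```python
REFRESH = 5                                   # assumed refresh period
for it, A_new in enumerate(matrix_sequence):  # matrix_sequence: assumed input
    if it % REFRESH == 0:
        A_ref = A_new                         # cold restart at the new point
        pA, dpA = pade_p_and_dp(A_ref, B13)   # ~6 MMM with the real scheme
        qA, dqA = pade_p_and_dp(-A_ref, B13)
    Bd = A_new - A_ref                        # small perturbation w.r.t. A_ref
    E = np.linalg.solve(qA - dqA @ Bd,        # warm e^(A_new), ~1 MMM per call
                        pA + dpA @ Bd)
```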
