
Warm start 2.0 #8

Open

andreadelprete (Owner) opened this issue Mar 8, 2020 · 0 comments

The previous warm-starting algorithm hasn't given good results, so I am investigating a new way of warm-starting the matrix exponential (all still very preliminary).

The key idea is that we have already computed e^A and now have to compute e^(A+B), where |B| << |A| (the norm being the 1-norm). The scaling-and-squaring algorithm relies on Padé approximants, which approximate the exponential as the ratio between two polynomials. Since the matrices A we are dealing with have a large norm (greater than 5.4), the polynomials used are always of order 13.

e^A = q(A)^-1 * p(A)
p(A) = sum_{i=0}^13 b_i A^i
q(A) = p(-A)

Computing e^(A+B) thus boils down to computing p(A+B) and q(A+B). Can we exploit the previous computations of p(A) and q(A) to do that?
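For concreteness, here is a minimal Python/NumPy sketch of the cold-start computation described above. The evaluation of p is deliberately naive (power by power, no 6-MMM trick) and the scaling/squaring steps are omitted; the coefficients are the standard order-13 Padé coefficients from the scaling-and-squaring literature.

```python
import numpy as np

# Standard order-13 Pade coefficients b_0..b_13 (as used in the
# scaling-and-squaring literature); listed here for self-containment.
B13 = [64764752532480000., 32382376266240000., 7771770303897600.,
       1187353796428800., 129060195264000., 10559470521600.,
       670442572800., 33522128640., 1323241920., 40840800.,
       960960., 16380., 182., 1.]

def pade_p(A, b=B13):
    """Naive evaluation of p(A) = sum_{i=0}^{13} b_i A^i (no 6-MMM scheme)."""
    P = np.zeros_like(A)
    Ai = np.eye(A.shape[0])      # A^0
    for bi in b:
        P = P + bi * Ai
        Ai = Ai @ A              # next power of A
    return P

def expm_pade13(A):
    """Cold start: e^A ~= q(A)^-1 p(A), with q(A) = p(-A).
    Scaling and squaring are omitted for brevity."""
    return np.linalg.solve(pade_p(-A), pade_p(A))
```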

Option 1

We could approximate the high-order (>k) terms of p(A+B) with those of p(A), thus evaluating only the low-order terms of p(A+B). Since |B| << |A| the high-order terms should be well approximated even if we replace A+B with A:

p(A+B) = sum_{i=0}^13 b_i (A+B)^i ≈ sum_{i=0}^k b_i (A+B)^i + sum_{i=k+1}^13 b_i A^i
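A rough Python/NumPy sketch of what Option 1 could look like (names and the split index k are illustrative; b is the coefficient list from the sketch above, and in a real warm start the high-order tail of p(A) would be cached from the previous solve rather than recomputed here for self-containment):

```python
import numpy as np

def p_option1(A, B, b, k):
    """Option 1 sketch: keep the terms of order > k from p(A) and only
    re-evaluate the terms of order <= k at A + B."""
    # High-order tail sum_{i=k+1}^{13} b_i A^i; in a warm start this would
    # already be available from the previous computation of p(A).
    tail = np.zeros_like(A)
    Ai = np.linalg.matrix_power(A, k + 1)
    for i in range(k + 1, len(b)):
        tail += b[i] * Ai
        Ai = Ai @ A

    # Low-order head sum_{i=0}^{k} b_i (A+B)^i, evaluated at the new matrix.
    C = A + B
    head = np.zeros_like(A)
    Ci = np.eye(A.shape[0])
    for i in range(k + 1):
        head += b[i] * Ci
        Ci = Ci @ C

    return head + tail
```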

Option 2

We could use a truncated Taylor expansion of p around A to evaluate p(A+B). For instance, we could stop at the first derivative:

p(A+B) ≈ p(A) + p'(A)*B

where the derivative:

p'(A) = sum_{i=1}^13 i * b_i A^(i-1)

can be easily computed in terms of the powers of A, which have already been evaluated when computing p(A). However, in the computation of p(A) not all the powers of A are explicitly formed, but only the powers 2, 4 and 6, thanks to an efficient scheme that evaluates p(A) with only 6 matrix-matrix multiplications (MMM) rather than 13. If we used the same scheme to evaluate p'(A), exploiting the fact that the first three multiplications have already been performed (they are the ones needed to compute the three powers of A mentioned above), we could get p'(A) with at most 3 MMM. Adding one more MMM to compute p'(A)*B gives a total of 4 MMM, rather than the usual 6 MMM we would need to evaluate p(A+B) from scratch.
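A sketch of Option 2 along these lines (Python/NumPy, with hypothetical helper names). Here p and p' are evaluated naively instead of through the shared 6-MMM scheme, and the same first-order expansion is applied at -A to update q(A+B) = p(-(A+B)); note that this naive version also spends one product on the q side, which is not included in the MMM count above, so it only illustrates the algebra, not the cost.

```python
import numpy as np

def pade_p_and_dp(A, b):
    """Return p(A) and p'(A) = sum_{i=1}^{13} i*b_i A^(i-1), naively;
    the real scheme would reuse the A^2, A^4, A^6 products instead."""
    P = np.zeros_like(A)
    dP = np.zeros_like(A)
    Ai = np.eye(A.shape[0])                 # holds A^i
    for i, bi in enumerate(b):
        P += bi * Ai
        if i + 1 < len(b):
            dP += (i + 1) * b[i + 1] * Ai   # term (i+1)*b_{i+1}*A^i of p'(A)
        Ai = Ai @ A
    return P, dP

def warm_expm_option2(A, B, b):
    """Warm start: e^(A+B) ~= q^-1 p with p(A+B) ~= p(A) + p'(A) B
    and q(A+B) = p(-(A+B)) ~= p(-A) - p'(-A) B."""
    pA, dpA = pade_p_and_dp(A, b)        # would be cached from the cold start
    qA, dqA = pade_p_and_dp(-A, b)       # idem for the -A side
    p_new = pA + dpA @ B                 # 1 extra MMM once p'(A) is cached
    q_new = qA - dqA @ B                 # 1 extra MMM for the q side
    return np.linalg.solve(q_new, p_new)
```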

This second option seems more promising to me because the error would be O(B^2), so it should work well for small |B|. The main issue I see is that the error would accumulate over the iterations, so we could only use this for a few iterations before needing to do a standard computation. On the bright side, once we have computed p'(A) we can reuse it several times, as long as the input matrix does not drift too far from A. Therefore, after the first warm iteration, which costs 4 MMM, the following ones would cost only 1 MMM, making this method extremely advantageous!
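As a purely hypothetical usage pattern, building on the sketch above: cache p, p' (and the -A counterparts) at a reference matrix and reuse them while the inputs stay close to it, refreshing with a full evaluation every few steps so the accumulated error stays bounded. `matrix_sequence` and the refresh period are made up for illustration.

```python
REFRESH = 5                                   # assumed refresh period
for it, A_new in enumerate(matrix_sequence):  # matrix_sequence: assumed input
    if it % REFRESH == 0:
        A_ref = A_new                         # cold restart at the new point
        pA, dpA = pade_p_and_dp(A_ref, B13)   # ~6 MMM with the real scheme
        qA, dqA = pade_p_and_dp(-A_ref, B13)
    Bd = A_new - A_ref                        # small perturbation w.r.t. A_ref
    E = np.linalg.solve(qA - dqA @ Bd,        # warm e^(A_new), ~1 MMM per call
                        pA + dpA @ Bd)
```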
