Nail down stability of IBM with very high orders. #404
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #404      +/-   ##
==========================================
- Coverage   83.67%   83.67%    -0.01%
==========================================
  Files          99       99
  Lines        4957     4955        -2
  Branches      670      669        -1
==========================================
- Hits         4148     4146        -2
  Misses        598      598
  Partials      211      211
src/probnum/statespace/integrator.py
Outdated
process_noise_cov_cholesky = np.linalg.cholesky(
    self._proc_noise_cov_mat
    + self._process_noise_damping * np.eye(len(self._proc_noise_cov_mat))
)
Is this nugget necessary for numerical stability?
only for orders > 11 :)
Should we then actually add this argument? I guess it will be needed quite rarely, if not only for the test you added in this PR 😄 Correct me if this is wrong.
Can be ignored, I read your answer below
no, not really. I was playing around with it and thought it was cool, but I see your concerns. Shall I remove it?
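For context, here is a minimal, hypothetical illustration (not probnum's actual code) of what the damping nugget buys: a rank-deficient covariance has a zero pivot, so a plain Cholesky factorization fails, while a tiny diagonal shift makes it go through.

```python
import numpy as np

# Rank-1 matrix: positive *semi*-definite, so Cholesky hits a zero pivot.
cov = np.ones((3, 3))

try:
    np.linalg.cholesky(cov)
    factorized = True
except np.linalg.LinAlgError:
    factorized = False

# A tiny "nugget" on the diagonal pushes every eigenvalue above zero,
# mirroring the `_process_noise_damping` term discussed above.
damping = 1e-14
damped = cov + damping * np.eye(len(cov))
chol = np.linalg.cholesky(damped)

print(factorized)                          # False
print(np.allclose(chol @ chol.T, damped))  # True
```

At high IBM orders the process noise covariance becomes numerically singular in the same way, which is why even a 1e-14 nugget makes a difference.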
state_trans_mat_fun=lambda t: state_trans_mat,
shift_vec_fun=lambda t: shift_vec,
proc_noise_cov_mat_fun=lambda t: proc_noise_cov_mat,
proc_noise_cov_cholesky_fun=lambda t: proc_noise_cov_cholesky,
👍 Thanks! I always get a bit twitchy when more than 2 arguments are passed positionally
# The Cholesky factor of the process noise covariance matrix of the IBM
# always exists, even for non-square root implementations.
proc_noise_cov_cholesky = (
    self.precon(dt)
    @ self.equivalent_discretisation_preconditioned.proc_noise_cov_cholesky
)
Do we get any benefit from computing this Cholesky factor when not using the sqrt implementation? Otherwise, we might make this conditional.
Hmmm, my thinking here was that this Cholesky factor is created for the equivalent discretisation (preconditioned), so it felt very odd not to translate it here, too. What do you think?
Could we make the computation of the Cholesky factor, both in equivalent_discretisation_preconditioned and in discretise, conditional on self.forward_implementation/backward_implementation?
Yes, we could. Though it only happens once, so the impact on runtime should be almost negligible. Do you think it is worth it nevertheless?
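To make the algebra behind "translating" the factor explicit, here is a small sketch with hypothetical stand-ins (a random SPD matrix for the preconditioned process noise and a diagonal matrix for the preconditioner; the real probnum objects differ, but the identity is the same): if L is a Cholesky factor of Q, then P @ L is a valid, generally non-triangular square root of P @ Q @ P.T, so no re-factorization is needed.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for the preconditioned process-noise covariance (SPD by construction).
A = rng.standard_normal((4, 4))
Q_pre = A @ A.T + np.eye(4)
L_pre = np.linalg.cholesky(Q_pre)  # Q_pre == L_pre @ L_pre.T

# Stand-in for precon(dt): a diagonal rescaling.
P = np.diag([1.0, 0.5, 0.25, 0.125])

# Pushing the factor through the preconditioner gives a square root
# of the pushed-forward covariance -- exactly what `discretise`
# returns as precon(dt) @ ..._cholesky above.
sqrt_Q = P @ L_pre
print(np.allclose(sqrt_Q @ sqrt_Q.T, P @ Q_pre @ P.T))  # True
```

Note that sqrt_Q is only a matrix square root, not necessarily lower-triangular; that is fine for square-root filtering, which re-triangularizes via QR anyway.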
step=1e-12,
forward_implementation="sqrt",
backward_implementation="sqrt",
_process_noise_damping=1e-14,
I am surprised that a nugget of 1e-14 does anything at all.
Do you think it necessary to establish an entirely new kwarg, solely for the case that somebody wants to use a 14-times integrated model? 😄
Well, I don't want to stretch the word "necessary" too much here... :) I just thought it was cool to get these high orders! ;)
And it is a "hidden" kwarg that defaults to 0, so I thought it doesn't hurt too much. Though if you find it annoying, I am open to talking about it :)
Well, maybe I am biased, but I quickly get confused when a method offers me to tweak every part of the inner workings, especially if the documentation takes half an hour to read (not that this would be the case here!). So I often prefer clarity over control, given that the kwargs are often only used in very special cases. Do hidden kwargs appear in the docs?
"""With damping, the filter achieves really high orders (we test 25, but 50 was
possible too, for some reason)."""
regression_problem, statespace_components = filtsmooth_zoo.car_tracking(
    model_ordint=25,
What is happening?
Just for the jokes, the test passes for order=75 as well (no typo!), but I thought 25 was sufficiently extreme :D
All just by removing the matrix inversion, right?
and by using this mini damping factor that makes the process noise always positive definite.
Just checking: Your PR description is outdated, right?
This should be 25 by now, right? I don't want to be petty here, just checking that the order-25 thing is not some kind of special case. 😆
25 is only the filter; filtsmooth starts complaining at 15.
I see. Still, that's quite remarkable.
As I said in one of the comments as well: 75 worked, too, but I thought no one would believe me, so I used 25 for the test ;)
Crazy stuff! Thanks for working that out :)
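A rough illustration of why high orders are remarkable at all: the preconditioned IBM process noise covariance is of Hilbert-matrix type (this sketch uses a plain Hilbert-flavored matrix, not probnum's exact formula), and its condition number explodes with the order, so naive inversion or plain Cholesky breaks down quickly.

```python
import numpy as np

def hilbert_like(q):
    """Hilbert-type matrix of size (q+1, q+1) resembling the
    preconditioned IBM process noise; purely illustrative."""
    i, j = np.mgrid[0 : q + 1, 0 : q + 1]
    return 1.0 / (2 * q + 1 - i - j)

# Conditioning deteriorates dramatically as the order grows, which is
# why reaching order 25 (let alone 75) without blow-up is notable.
conds = {q: np.linalg.cond(hilbert_like(q)) for q in (2, 5, 11)}
for q, c in conds.items():
    print(f"order {q:2d}: cond ~ {c:.2e}")
```

Already around order 11 the matrix is singular to double precision, which matches the "only for orders > 11" observation about the damping nugget earlier in this thread.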
When playing with the code, I noticed that the changes in #386 were absolutely crucial for stability of high order IBMs.
I thought it would be good to have a test that nails this down. The changes in this PR are: