
Implement Iterated EKF and Iterated EKS #135

Merged (23 commits, Aug 15, 2022)

Conversation

@petergchang (Collaborator) commented Aug 9, 2022

Description

  • Implemented IEKF as iterated_extended_kalman_filter() in ekf/inference.py. Note that the parameter num_iter is the number of re-linearizations around the posterior, which means that the regular EKF would correspond to num_iter=0.
  • Changed extended_kalman_smoother() to accept an optional filtered_posterior argument, which the smoother uses as the filtered posterior when carrying out smoothing. This is useful for IEKS.
  • As the current draft of the book does not contain pseudocode for the IEKS, I generated a listing, shown in the figure below.
  • Implemented IEKS as iterated_extended_kalman_smoother() in ekf/inference.py; a rough sketch of the outer loop is included right after this list.
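The outer loop can be summarized roughly as follows. This is only an illustrative sketch, not the PR's actual API: extended_kalman_filter / extended_kalman_smoother are the existing functions in ekf/inference.py, but the linearization_points keyword and the smoothed_means attribute are hypothetical stand-ins for however the implementation passes the current linearization trajectory around; only num_iter and filtered_posterior come from this PR.

```python
# Illustrative sketch of the IEKS outer loop; names marked "hypothetical" are
# not the PR's actual API.
def iterated_extended_kalman_smoother_sketch(params, emissions, num_iter=2):
    # Initial pass: ordinary EKF + EKS (num_iter=0 reduces to the plain smoother).
    smoothed = extended_kalman_smoother(params, emissions)
    for _ in range(num_iter):
        # Re-linearize the dynamics and emission models around the current
        # smoothed means and re-run the filter
        # (`linearization_points` is a hypothetical keyword).
        filtered = extended_kalman_filter(
            params, emissions, linearization_points=smoothed.smoothed_means)
        # Reuse that filtered posterior in the smoothing pass via the new
        # optional `filtered_posterior` argument.
        smoothed = extended_kalman_smoother(
            params, emissions, filtered_posterior=filtered)
    return smoothed
```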

Figure

[Screenshot: pseudocode for the IEKS, generated for this PR]

Issue

#139

@AdrienCorenflos (Collaborator)

Hi,

It's worth noting that your implementation will incur very large memory consumption when taking gradients of the log-likelihood; see https://arxiv.org/abs/2207.00426, Section VII.

This can be avoided by noting that the result of the posterior linearisation satisfies a fixed-point equation.
I have implemented this carefully for the IEKS case; see https://github.com/EEA-sensors/sqrt-parallel-smoothers/blob/16d9bbd5aa021be5f88277d3c87242b2c9a7bd53/parsmooth/methods.py#L71

This extends directly to the IEKF, where it is not the full trajectory but simply the marginal states that obey a fixed point.
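For concreteness, here is a minimal sketch of the general trick, assuming the iterate is a single array. It is essentially the fixed_point example from the JAX autodiff cookbook (implicit function theorem via jax.custom_vjp), not the parsmooth implementation linked above; in the IEKS case, f would be one full re-linearization pass over the trajectory. Because the backward pass solves its own adjoint fixed-point equation, reverse-mode AD does not need to store every forward iteration.

```python
from functools import partial
import jax
import jax.numpy as jnp

@partial(jax.custom_vjp, nondiff_argnums=(0,))
def fixed_point(f, theta, x_init):
    """Iterate x <- f(theta, x) to convergence (x is assumed to be an array)."""
    def cond(carry):
        x_prev, x = carry
        return jnp.max(jnp.abs(x - x_prev)) > 1e-6
    def body(carry):
        _, x = carry
        return x, f(theta, x)
    _, x_star = jax.lax.while_loop(cond, body, (x_init, f(theta, x_init)))
    return x_star

def fixed_point_fwd(f, theta, x_init):
    x_star = fixed_point(f, theta, x_init)
    return x_star, (theta, x_star)

def fixed_point_rev(f, res, x_star_bar):
    theta, x_star = res
    # Adjoint fixed point u = x_star_bar + (df/dx)^T u, also solved by iteration.
    def rev_iter(packed, u):
        theta_, x_star_, x_bar_ = packed
        _, vjp_x = jax.vjp(lambda x: f(theta_, x), x_star_)
        return x_bar_ + vjp_x(u)[0]
    u_star = fixed_point(rev_iter, (theta, x_star, x_star_bar), x_star_bar)
    # Pull the adjoint back through the parameters.
    _, vjp_theta = jax.vjp(lambda t: f(t, x_star), theta)
    theta_bar, = vjp_theta(u_star)
    return theta_bar, jnp.zeros_like(x_star)

fixed_point.defvjp(fixed_point_fwd, fixed_point_rev)

# Tiny check: differentiate through Newton's iteration for sqrt(a).
sqrt_fp = lambda a: fixed_point(lambda a_, x: 0.5 * (x + a_ / x), a, a)
print(jax.grad(sqrt_fp)(2.0))  # ~0.3536 = 1 / (2 * sqrt(2))
```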

@slinderman (Collaborator)

Thanks @AdrienCorenflos! Out of curiosity, have you encountered numerical instability with the custom vjp for fixed_point derived from the implicit function theorem? I played with this in some related work where we were trying to take gradients through CAVI algorithms and ran into all sorts of trouble. Maybe my issues were unique to my CAVI setting though.

@AdrienCorenflos (Collaborator)

Well, not really, no. I think the gradient fixed point has exactly the same convergence properties as the forward one (because they have the same Lipschitz constants).

Technically, I think one should probably use line search or similar, just as you would for the IEKS, but I don't think there are empirical or theoretical studies on the topic.

@slinderman (Collaborator)

Interesting, that’s good to hear. I’ll have to dig up that old CAVI example. Maybe it just needed a smaller convergence threshold, or maybe it was something more fundamental about that setup. I know IFT requires certain properties of the fixed point, and it could be that they weren’t satisfied in that problem. Anyway, separate issue!

@AdrienCorenflos (Collaborator) commented Aug 9, 2022

Although, having said this: for parameter estimation the algorithm may diverge fairly easily if your initial trajectory is not "good enough". We had this problem with @Fatemeh-Yaghoobi in our paper. We fixed it by using the initial trajectory obtained by simply inverting the observation model, ignoring the observation noise and the dynamics, and that made everything much more robust. I can't remember if we mentioned this implementation detail in the paper.

https://github.com/EEA-sensors/sqrt-parallel-smoothers/blob/16d9bbd5aa021be5f88277d3c87242b2c9a7bd53/notebooks/experiment_bearing_only_parameter_estimation.ipynb

See positions = inverse_bearings(ys, s1, s2)
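For intuition, a hypothetical sketch of that kind of initialization (not the notebook's inverse_bearings, just the underlying idea): triangulate each position by intersecting the two bearing rays, ignoring observation noise and dynamics, and use the result as the initial trajectory.

```python
import jax
import jax.numpy as jnp

def triangulate(bearing1, bearing2, s1, s2):
    """Intersect rays s1 + t1*d1 and s2 + t2*d2, with d_i = (cos b_i, sin b_i)."""
    d1 = jnp.array([jnp.cos(bearing1), jnp.sin(bearing1)])
    d2 = jnp.array([jnp.cos(bearing2), jnp.sin(bearing2)])
    # Solve [d1 | -d2] @ [t1, t2] = s2 - s1 for the ray parameters.
    A = jnp.stack([d1, -d2], axis=1)
    t = jnp.linalg.solve(A, s2 - s1)
    return s1 + t[0] * d1

# Applied per time step, e.g. for bearings ys of shape (T, 2) from sensors s1, s2:
# positions = jax.vmap(triangulate, in_axes=(0, 0, None, None))(ys[:, 0], ys[:, 1], s1, s2)
```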

@slinderman (Collaborator)

Ugh, initialization was the bane of my existence with SLDS models. I ended up doing the same thing to initialize continuous state estimates. We’ve ignored that issue so far in this repo, but we’ll need to confront it sooner or later.

@AdrienCorenflos (Collaborator)

Also, some recent work that we didn't cite (because I read it after we uploaded the paper) describes very generally how one should implement gradients of fixed points, if you are into this kind of thing: https://arxiv.org/abs/2105.15183

@murphyk (Member) commented Aug 10, 2022

We might want to keep Peter's current straightforward implementation as a readable reference, before adding your fixed point version.

@petergchang marked this pull request as ready for review, August 12, 2022 12:50
@murphyk merged commit 7a7eaa0 into probml:main, Aug 15, 2022