
[wpimath] Fix UnscentedKalmanFilter and improve math docs #7850

Open
wants to merge 30 commits into main

Conversation

PascalSkylake

@PascalSkylake PascalSkylake commented Mar 3, 2025

Throughout the code, the state square-root covariance S and the innovation covariance Sy are maintained as upper-triangular Cholesky factors of those covariance matrices. The original paper defines P = S*S', so S should be lower triangular, and the functions in the paper reflect this. In the code implementation, the square-root covariance matrices are upper triangular, but the algorithm expects them to be lower triangular.

This bug was likely missed because the incorrect version of the filter is able to converge for some systems where all the states are observed, and the test case is set up such that all states are observed.

To fix the bug, a few things needed to be changed:

1. All instances of rankUpdate() needed to use the lower-triangular Cholesky factor.
2. In the unscented transform, when S is found via QR decomposition, we need to take the transpose because R is upper triangular.
3. The P() and SetP() functions need to use P = S*S' instead of P = S'*S, and P.llt().matrixL() instead of P.llt().matrixU(), respectively.
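For intuition, here's a minimal numpy sketch (illustrative only, not the wpimath code) of why the factor orientation matters: the lower-triangular factor L satisfies P = L*L', while plugging its transpose into the same formula reconstructs a different matrix.

```python
import numpy as np

# Illustrative only: shows why the Cholesky factor orientation matters.
P = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# numpy returns the lower-triangular factor L with P = L @ L.T,
# matching the paper's convention P = S S'.
L = np.linalg.cholesky(P)
assert np.allclose(L @ L.T, P)

# The upper-triangular factor U = L.T only reconstructs P as U.T @ U.
# Plugging U into formulas that expect P = S S' gives the wrong matrix.
U = L.T
assert np.allclose(U.T @ U, P)
assert not np.allclose(U @ U.T, P)
```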

Each part of the algorithm has also had the comments changed to clarify exactly which equation from the paper it implements.

@PascalSkylake PascalSkylake requested a review from a team as a code owner March 3, 2025 17:33
@github-actions github-actions bot added the component: wpimath Math library label Mar 3, 2025
Member

@calcmogul calcmogul left a comment


The fixes check out.

See https://github.com/wpilibsuite/allwpilib/blob/main/CONTRIBUTING.md#math-documentation for a tool that'll let you write formatted math in inline comments. It can do Greek letters as well as superscripts and subscripts for most things. For example, \chi_{k,k-1} becomes χₖ,ₖ₋₁. For the stuff it can't do, we just write out _ or ^, or take up multiple lines with essentially ASCII art.

When writing equations out, please format them like so:

// Description here.
//
//   equation here indented 2 spaces
//
// More description. Reference equation (1) like that. Reference equations (2) and (3) like that.

If it's just a one-line equation with no description, you can skip the indentation.

By the way, you can write transposes with ^T → ᵀ.

@calcmogul
Member

calcmogul commented Mar 3, 2025

We should probably add a second model to the tests that breaks with the old impl and works with the new impl. The current test only has a differential drive model that measures the full state, which doesn't adequately exercise the filter's behavior. If you need help with any implementation work, let me know.

@PascalSkylake
Author

PascalSkylake commented Mar 4, 2025

Comments have been updated. I don't really want to include the _{k} or _{k,k-1} on everything; I know the paper has them, but it's much more readable without them, and it's still clear which part of the paper each equation maps to.

Member

@calcmogul calcmogul left a comment


Just some minor grammar stuff.

@calcmogul
Member

calcmogul commented Mar 4, 2025

By the way, the wpiformat errors are for line wrapping and some trailing whitespace at the ends of lines. The workflow summary has a patch file you can apply.

@calcmogul
Member

Implementation and comments look good. Now it just needs a unit test with a model that broke the old filter to sanity check the new filter.

@PascalSkylake
Author

I didn't mean to close this. Could we use the system I mentioned on Discord and check convergence of kV and kA? It was confirmed to diverge with the old implementation.
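As a rough sketch of the kind of model being discussed (illustrative Python with made-up state layout, function names, and values; the real test lives in C++ and Java): the filter state could carry kV and kA as hidden constants alongside the measured velocity, so only a partially-observed filter can converge on them.

```python
import math

# Hypothetical motor model sketch (not the actual test code):
# state x = [velocity, kV, kA]; only velocity is measured, so kV and kA
# are hidden states the filter must infer. dt and gains are made up.
def f(x, u, dt):
    """Euler-integrate dv/dt = (u - kV*v) / kA; kV and kA are constant."""
    v, kV, kA = x
    return [v + dt * (u - kV * v) / kA, kV, kA]

def h(x, u):
    """Measurement model: we only observe velocity."""
    return [x[0]]

# Quick sanity check: with u = kV * v, the velocity should stay fixed.
x = [2.0, 3.0, 0.2]
x_next = f(x, 3.0 * 2.0, 0.01)
assert math.isclose(x_next[0], 2.0)
```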

@PascalSkylake PascalSkylake reopened this Mar 4, 2025
@calcmogul
Member

That'll work.

Co-authored-by: Tyler Veness <calcmogul@gmail.com>
@auscompgeek auscompgeek requested a review from calcmogul March 4, 2025 08:49
@PascalSkylake
Author

The test that failed in CI succeeded on my machine. Not sure why; I'll investigate later today.

@calcmogul
Member

 D:\a\allwpilib\allwpilib\wpimath\src\test\native\cpp\estimator\UnscentedKalmanFilterTest.cpp(268): error: The difference between true_kV and observer.Xhat(2) is 0.060949554391734218, which exceeds 0.05, where
true_kV evaluates to 3,
observer.Xhat(2) evaluates to 3.0609495543917342, and
0.05 evaluates to 0.050000000000000003.

By the way, our CI doesn't seed randomness the same every time. We should probably fix that since that can cause spurious failures.

@PascalSkylake
Author

PascalSkylake commented Mar 4, 2025

That shouldn't matter. I can't really run the full ./gradlew :wpimath:testDesktopCpp because it either makes my computer unusable for 15 minutes or just crashes it entirely, but running an identical test in a separate program, the result has been well within 0.05 (more like within 0.005) consistently, even with differently seeded RNG, so I don't really know why it failed here. This is a bad solution, but we could just increase the tolerance for now since the value did at least converge, which was the goal of the test.

By "identical test" I mean the same functions f and h, the same stddevs, the same initial state and covariance, the same dt, and the same step count.

@calcmogul
Member

calcmogul commented Mar 4, 2025

You can limit the number of CPU cores used for compilation with --max-workers N, where N is the number of cores. That should keep RAM usage under control (some files need 2-4 GB of RAM each to compile).

The CMake build is much faster than the Gradle one because it builds for just one platform at a time instead of all of them.

@calcmogul
Member

calcmogul commented Mar 4, 2025

Rerunning the tests passed, which tells me it's the random number generator (the only source of nondeterminism here) causing the issues. We may want to clamp the random values we get since technically, the random number generator can generate values as large as it likes (they're just unlikely).
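Clamping could be sketched like so (illustrative Python; the 3-sigma bound and the stddev are arbitrary choices, not values from the test):

```python
import random

# Sketch of clamping samples from an unbounded Gaussian so a single
# extreme draw can't blow up a test. The 3-sigma bound is arbitrary.
def clamped_gauss(rng, stddev, bound=3.0):
    sample = rng.gauss(0.0, stddev)
    return max(-bound * stddev, min(bound * stddev, sample))

rng = random.Random(0)
samples = [clamped_gauss(rng, 0.5) for _ in range(10000)]

# Every sample is guaranteed to lie within the clamp bounds.
assert all(abs(s) <= 1.5 for s in samples)
```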

@calcmogul
Member

calcmogul commented Mar 6, 2025

The Java test results are all over the place on my Linux x86-64 machine.

Run 1:

UnscentedKalmanFilterTest > testMotorConvergence() FAILED
    org.opentest4j.AssertionFailedError: expected: <0.2> but was: <2.627480574235188>
        at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:86)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:81)
        at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1014)
        at app//edu.wpi.first.math.estimator.UnscentedKalmanFilterTest.testMotorConvergence(UnscentedKalmanFilterTest.java:374)

Run 2:

UnscentedKalmanFilterTest > testMotorConvergence() FAILED
    org.opentest4j.AssertionFailedError: expected: <3.0> but was: <2.9465465894423537>
        at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:86)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:81)
        at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1014)
        at app//edu.wpi.first.math.estimator.UnscentedKalmanFilterTest.testMotorConvergence(UnscentedKalmanFilterTest.java:373)

Run 3:

UnscentedKalmanFilterTest > testMotorConvergence() FAILED
    org.opentest4j.AssertionFailedError: expected: <3.0> but was: <2.805124031420146>
        at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:86)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:81)
        at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1014)
        at app//edu.wpi.first.math.estimator.UnscentedKalmanFilterTest.testMotorConvergence(UnscentedKalmanFilterTest.java:373)

@PascalSkylake
Author

PascalSkylake commented Mar 6, 2025

Yeah, when I tried to port it I got really inconsistent results as well. I never ended up pushing because I've been busy the last couple of days; I'm going to double-check all the algorithm changes in Java soon when I get a minute.

Though I'm not sure something being wrong in the UKF would cause these issues rather than a total divergence; my gut feeling is that it's a test issue.

@calcmogul
Member

calcmogul commented Mar 6, 2025

We should also confirm experimentally that the white noise being generated has the correct standard deviations, and check how close it is to zero-mean. The latter specifically establishes the accuracy lower bound.
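That check might look something like this sketch (illustrative Python with a made-up stddev; the real tests generate noise in C++ and Java):

```python
import numpy as np

# Sketch of the proposed check: generate white noise the same way the
# test does (the stddev here is made up) and verify its sample statistics.
rng = np.random.default_rng(0)
stddev = 0.5
noise = rng.normal(0.0, stddev, size=100_000)

# Sample mean should be near zero, and sample stddev should be near
# the requested value; large deviations here would explain test drift.
assert abs(noise.mean()) < 0.01
assert abs(noise.std() - stddev) < 0.01
```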

@calcmogul
Member

Noise means look OK.

Java:

    noise mean =
    5.711066564515542E-4
    0.004906610900215656
    -0.002073389170414037

C++:

    noise mean =
    0.0009678891847917698
    -8.400388067173148e-05
    -0.00518208773993269

@calcmogul
Member

calcmogul commented Mar 7, 2025

I got it... mostly working? I fixed two issues:

  1. The exact discrete model wasn't being used for ground truth.
  2. The 1000 scaling factor on the motorControlInput() argument made the voltage not change fast enough, so the dynamics weren't sufficiently sampled to estimate kₐ.

I'm still seeing occasional values outside the tolerances. I'm not sure if there's an off-by-one error in which values are passed to the measurement model, the input function not sampling well enough, or the dt being too big for the filter to accurately estimate the dynamics (the time constant is kₐ/kᵥ).

If we can't figure it out, I'd be OK with just increasing the tolerance since we're mainly interested in the filter converging at all.
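As a back-of-the-envelope check on the dt concern (illustrative Python; kV = 3 and kA = 0.2 are inferred from the test expectations quoted earlier in this thread, and dt = 10 ms from the discussion):

```python
import math

# Rough check on dt vs the motor time constant tau = kA/kV.
# kV = 3 and kA = 0.2 are inferred from the test output above;
# dt = 10 ms comes from the discussion. These are assumptions.
kV, kA, dt = 3.0, 0.2, 0.01
tau = kA / kV  # about 0.067 s

# Fraction of the transient that decays in one step: 1 - e^(-dt/tau).
decay_per_step = 1.0 - math.exp(-dt / tau)

# Several samples fit within one time constant, so dt isn't obviously
# too coarse to observe the dynamics.
assert tau > dt
```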

@PascalSkylake
Author

PascalSkylake commented Mar 7, 2025

Are the Java and C++ tests at least consistent with each other now? The C++ test failed 3/20 runs on my machine. At this point I might be in favor of just upping the tolerance rather than trying to tune the test more, since what we care about is that it doesn't diverge to some insane number.

I wonder about dt being too large, though. I just went with 10 ms because that's how fast VEX motors output data, so it's what I needed.

I don't know if there's a good way to run the Java tests repeatedly like you can with CMake. I really don't know much about developing for WPILib, so I haven't been able to do much for Java.

@calcmogul
Member

Yeah, the C++ and Java tests seem to have a similar failure rate now.

@PascalSkylake
Author

FWIW, when I said I was getting the really tiny tolerance, I was not discretizing anything anywhere: not when creating the true states, and not for the process noise in the UKF. When I made the C++ test the first time, I had to retune it to discretize in the filter, but evidently I forgot to do so for the true states, which is honestly probably why I had to tune it again in the first place (the stddevs I used were just chosen to work well, not based on anything in particular).
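One common convention for the kind of noise discretization being described, sketched in Python (whether this matches the test's exact convention is an assumption):

```python
import math

# Sketch of one common convention for discretizing continuous-time white
# noise: over a step dt, the discrete stddev scales as sigma_c * sqrt(dt)
# (Euler-Maruyama). Matching the test's actual convention is an assumption.
def discretize_stddev(sigma_c, dt):
    return sigma_c * math.sqrt(dt)

# With dt = 10 ms, a continuous stddev of 1.0 shrinks to 0.1 per step.
assert math.isclose(discretize_stddev(1.0, 0.01), 0.1)
```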

@PascalSkylake
Author

Unrelated: reading through van der Merwe's dissertation (https://www.researchgate.net/publication/2478046), he mentions regenerating sigma points for the measurement update in a clarification under the UKF formulation (Ctrl+F for "In (3.174)" to find it), so seemingly this is where that came from originally.

@calcmogul calcmogul changed the title [wpimath] Fix UnscentedKalmanFilter and Improve Comments [wpimath] Fix UnscentedKalmanFilter and improve math docs Mar 7, 2025