This repository has been archived by the owner on May 1, 2024. It is now read-only.

EKF: replacement of autocode with sympy generated output #868

Closed
wants to merge 19 commits

Conversation

priseborough
Collaborator

@priseborough priseborough commented Jul 16, 2020

This is a work in progress. Derivations and conversion to C++ code are complete; however, test benches and SITL/flight testing require further work.

This is built on proof of concept work by @RomanBapst - see https://github.com/RomanBapst/ecl/tree/ekf_python

Test comparison programs have been written that compare the output from the sympy generated code to the legacy code in master. The largest fraction difference, defined as (new_value - old_value)/old_value, is found and printed. For example, the covariance prediction gives:

max_diff_fraction = 2.074566e-06 , old = -1.490614e+17 , new = -1.490610e+17 , location index = 4,7
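The metric can be sketched as follows (a minimal, self-contained sketch with illustrative names, not the actual test-bench code from the linked derivation directory):

```cpp
#include <cassert>
#include <cmath>

// Sketch of the comparison metric: scan two covariance matrices and record
// the largest fractional difference (new - old) / old and its location.
struct MaxDiff {
    float fraction;
    int row;
    int col;
};

MaxDiff maxDiffFraction(const float oldP[][3], const float newP[][3], int n)
{
    MaxDiff result{0.0f, 0, 0};
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (oldP[i][j] != 0.0f) {
                const float frac = std::fabs((newP[i][j] - oldP[i][j]) / oldP[i][j]);
                if (frac > result.fraction) {
                    result = {frac, i, j};
                }
            }
        }
    }
    return result;
}
```

Zero `old` entries are skipped here to avoid division by zero; the real test programs may handle that case differently.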

Comparing the observation Jacobians for the various yaw fusion methods gives:

321 option A max_diff_fraction = 1.322706e-07 , old = 9.012533e-01 , new = 9.012532e-01 , location index = 3
321 option B max_diff_fraction = 3.555763e-07 , old = 1.005770e+00 , new = 1.005769e+00 , location index = 2
312 option A max_diff_fraction = 1.696653e-07 , old = -2.810457e+00 , new = -2.810457e+00 , location index = 2
312 option B max_diff_fraction = 5.844245e-07 , old = -4.589488e-01 , new = -4.589485e-01 , location index = 1

Differences so far have been confined to the last significant figure.

Testing

Comparison test programs for the covariance prediction and observation methods. See https://github.com/PX4/ecl/blob/pr-ekfSymPyDerivationConversion/EKF/python/ekf_derivation/run_test_programs.sh

SITL testing that exercises all observation methods. See the following:
SITL gazebo test https://review.px4.io/plot_app?log=a08ff241-6840-456b-b509-491235c4267d
SITL gazebo_iris_opt_flow test https://logs.px4.io/plot_app?log=9c29150e-ab23-4245-869c-bb2a978cebc6
SITL gazebo plane test https://logs.px4.io/plot_app?log=a568008f-92f8-48c8-93ed-694b7a94dd19

TODO: SITL testing for MC body frame specific force fusion and GPS yaw fusion.

@priseborough priseborough self-assigned this Jul 16, 2020
@priseborough priseborough changed the title EKF: preliminary replacement of autocode with sympy generated output EKF: replacement of autocode with sympy generated output Jul 16, 2020
Contributor

@kamilritz kamilritz left a comment


Very cool. I am super happy that this is finally happening. I appreciate that you paid attention to const correctness and used const references to grab the state's values.

I would like to raise a discussion about the size of the observation Jacobian container. I saw that it is not handled consistently throughout this PR. At some points it is a container of size 24 with a lot of entries being 0; sometimes it is a container of size << 24 containing only the non-zero elements. I am not sure how to trade off efficiency and readability here. Maybe @jkflying can help us.
Anyway, it would be great if we could decide on a single approach, as that will help us simplify the sparse computation of KHP into a single function.

Additionally, would it be possible to swap in the autogenerated code in multiple steps/PRs? I think this would make it much simpler to review, and to rebase on any change in master.

Write comparison test programs for the rest of the observation methods.
SITL and/or flight testing that exercises all observation methods. I can do the MC and FW tests, which will exercise most of them, but am unsure how to exercise the external vision/VIO fusion methods.

I am happy to help here. Can you be a bit more specific about the nature of these comparison test programs? Do you want to compare state outputs similar to the "change output predictor", or write unit tests or general SITL tests?


if (update_wind_only) {
// If we are getting aiding from other sources, then don't allow the airspeed measurements to affect the non-windspeed states
for (unsigned row = 0; row <= 21; row++) {
Contributor

As Kfusion is a matrix::Vector it will be already initialized with zeros. No need to set it to zero again.

Collaborator Author

Will fix


for (uint8_t i = 0; i < _k_num_states; i++) { H_TAS[i] = 0.0f; }
// Intermediate variables
const float HK0 = vn - vwn;
Contributor

The naming HK* for the intermediate variables could be a bit confusing as there are also variables with the name KHP or KH.

Collaborator Author

I don't see how it makes it more confusing. They are clearly intermediate variables.

Contributor

There is indeed a comment saying that it is an intermediate variable. Nevertheless, if someone sees HK0, they might think it is the 0th element of the result of an operation between two vector/matrix quantities H and K; that is how I read it at first.
The former t0, t1, ... notation was, at least for me, clearer at first glance. I understand that it is probably less safe when copying generated code output. Maybe there is a good option in between, something like t_hk_0 or tmp_hk_0.

SPP[8] = 2*q0*q3 + 2*q1*q2;
SPP[9] = 2*q0*q2 + 2*q1*q3;
SPP[10] = SF[16];
const float PS0 = powf(q1, 2);
Contributor

Should this powf be replaced with sq?

Collaborator Author

The compiler should optimise out powf(x,2) and doing a find and replace on it has risks.
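For context, the `sq` helper suggested above is just a single multiply; a minimal sketch (assuming a free inline function, which may differ from the helper actually used in ecl):

```cpp
#include <cassert>

// Minimal sketch of an sq() helper: guarantees a single multiply,
// independent of whether the compiler optimises powf(x, 2.0f).
static inline float sq(float x) { return x * x; }
```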

SPP[9] = 2*q0*q2 + 2*q1*q3;
SPP[10] = SF[16];
const float PS0 = powf(q1, 2);
const float PS1 = 0.25F*daxVar;
Contributor

In most situations we used a small f for specifying the type of the literal. Is there a specific reason why a capital F is used here? It would be nice to have consistency.

Collaborator Author

Yes, but that's what Python outputs.

const float PS45 = P(0,13) - P(1,13)*PS11 + P(10,13)*PS6 + P(11,13)*PS7 + P(12,13)*PS9 - P(2,13)*PS12 - P(3,13)*PS13;
const float PS46 = PS37 + PS38;
const float PS47 = P(0,15) - P(1,15)*PS11 + P(10,15)*PS6 + P(11,15)*PS7 + P(12,15)*PS9 - P(2,15)*PS12 - P(3,15)*PS13;
const float PS48 = 2*PS47;
Contributor

Would it make a difference using 2.f instead of 2?

Collaborator Author

The difference testing says no.

Contributor

Cool. I assume the test checks the numerical output. What about computational speed? I guess the compiler is forcing a float multiplication.

const float HK29 = -HK12*P(1,23) + HK12*P(1,5) - HK13*P(1,6) + HK14*P(1,1) + HK15*P(0,1) - HK16*P(1,2) + HK17*P(1,3) - HK7*P(1,22) + HK7*P(1,4);
const float HK30 = -HK12*P(2,23) + HK12*P(2,5) - HK13*P(2,6) + HK14*P(1,2) + HK15*P(0,2) - HK16*P(2,2) + HK17*P(2,3) - HK7*P(2,22) + HK7*P(2,4);
const float HK31 = -HK12*P(3,23) + HK12*P(3,5) - HK13*P(3,6) + HK14*P(1,3) + HK15*P(0,3) - HK16*P(2,3) + HK17*P(3,3) - HK7*P(3,22) + HK7*P(3,4);
float HK32;
Contributor

No need to declare the variable already here. Can be declared const on initialization.

Collaborator Author

I don't see how that is possible. It needs to be declared in the larger scope and only updated if the inverse is numerically OK.

const float HK29 = 4/powf(HK17, 2);
const float HK30 = -HK16*P(0,2) - HK24*P(1,2) - HK25*P(2,2) + HK26*P(2,3);
const float HK31 = -HK16*P(0,3) - HK24*P(1,3) - HK25*P(2,3) + HK26*P(3,3);
float HK32;
Contributor

No need to declare the variable already here. Can be declared const on initialization.

Collaborator Author

I don't see how that is possible. It needs to be declared in the larger scope and only updated if the inverse is numerically OK.

Contributor

As we directly return inside the if statement, we could declare the variable afterwards.

float HK32;
if(bla) {
  HK32 = ...;
  status_ = A;
} else {
   // Do stuff not related to HK32
  return;
}

could be

if(!bla) {
   // Do stuff not related to HK32
  return;
}
const float HK32 = ...;
status_ = A;

}
} else {
for (uint8_t i = 0; i < 16; i++) {
Kfusion(i) = 0.0f;
Contributor

No need to set 0 to 0. Kfusion is initialized to 0.

Collaborator Author

Will fix

Vector24f gain;
// calculate observation Jacobians and Kalman gains
if (obs_index == 0) {
// Observation Jacobians - axis 0
Contributor

Very nice, I really like that we can now compute K and H after computing the innovation variance of both axes. This is how it has been done everywhere else.

}
} else {
for (int i = 16; i <= 21; i++) {
Kfusion(i) = 0.0f;
Contributor

no need to set 0 to 0.

Collaborator Author

Will fix

@dagar
Member

dagar commented Jul 18, 2020

The flash savings here are looking great. What this likely means is that we'll be able to comfortably afford to increase the optimization level from -Os -> -O2 on the embedded side, which can make a significant difference for Matrix library usage.

Screenshot from 2020-07-18 12-51-11

For reference, the entire estimator (PX4 ekf2 frontend + ecl/EKF backend) is about 140 KB of flash on px4_fmu-v5.

@priseborough priseborough force-pushed the pr-ekfSymPyDerivationConversion branch from dba97cc to a93349c Compare July 20, 2020 06:45
@priseborough
Collaborator Author

I've added test programs to compare the before and after autocode and rebased.

@priseborough
Collaborator Author

@kamilritz I did some simple test programs for the covariance prediction and the observation Jacobians and Kalman gains that compare them before and after. Differences are within what is expected for single-precision calculations.

9136a0f

@kamilritz
Contributor

I would like to raise a discussion about the size of the observation Jacobian container. I saw that it is not handled consistently throughout this PR, or in the codebase in general. At some points it is a container of size 24 with a lot of entries being 0; sometimes it is a container of size << 24 containing only the non-zero elements. I am not sure about the trade-off between efficiency and readability here. Maybe @jkflying can help us.
Anyway, it would be great if we could decide on a single approach, as that will help us simplify the sparse computation of KHP into a single function.

@priseborough @jkflying

@jkflying
Contributor

This is a pretty big PR, and I'm not able to pull out the patterns in the time I have. However, it sounds like the 'gather-scatter' pattern would work well here, allowing you to work with small dense matrices rather than large sparse matrices.

How this works is you create a permutation matrix which selects which rows/cols you keep (and can reorder them too, if you want). So rather than having a large matrix Q (dimensions mxm), we have a smaller D (nxn) with the non-zero values, and permutation P (mxn) with a single 1 per column, such that PDP' = Q. We would have to add support for permutation matrix multiplications to Matrix, which would enable this kind of design work. To be efficient, and because we don't have the same compiler type optimization Eigen does, we'd need something like an operator+=(PermutationMatrix<M,Q>, Matrix<Q,R>, PermutationMatrix<R,N>); in Matrix, to expand and fill (aka 'scatter') the small dense matrix into the large matrix.
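The gather/scatter idea can be illustrated with plain arrays (a sketch only; `scatter` and the compact index representation of P are hypothetical, not Matrix library API):

```cpp
#include <cassert>

// Represent the permutation/selection matrix P (M x N, one 1 per column)
// compactly by idx[j] = row index of the 1 in column j. Expanding the
// small dense D (N x N) into the large sparse Q (M x M) via Q = P*D*P'
// then reduces to scattering each D(i,j) into Q(idx[i], idx[j]).
template<int M, int N>
void scatter(const float D[N][N], const int idx[N], float Q[M][M])
{
    for (int i = 0; i < M; i++) {
        for (int j = 0; j < M; j++) {
            Q[i][j] = 0.0f;
        }
    }
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            Q[idx[i]][idx[j]] = D[i][j];
        }
    }
}
```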

@kamilritz
Contributor

kamilritz commented Jul 20, 2020

How this works is you create a permutation matrix which selects which rows/cols you keep (and can reorder them too, if you want). So rather than having a large matrix Q (dimensions mxm), we have a smaller D (nxn) with the non-zero values, and permutation P (mxn) with a single 1 per column, such that PDP' = Q. We would have to add support for permutation matrix multiplications to Matrix, which would enable this kind of design work. To be efficient, and because we don't have the same compiler type optimization Eigen does, we'd need something like an operator+=(PermutationMatrix<M,Q>, Matrix<Q,R>, PermutationMatrix<R,N>); in Matrix, to expand and fill (aka 'scatter') the small dense matrix into the large matrix.

@jkflying That is a cool concept. I am not sure if we need to go that far. The observation jacobian H is almost the only consistently sparse vector/matrix quantity. I think it would be enough to have a sparse KHP computation, that takes benefit of knowing which are the non-zero element of H.

The question is: should we store H as a Vector<float, 24>, initialize everything to 0, set the non-zero elements, and pass it to a function computeKHP(..., Vector<float, 24> H, Vector<size_t, numberOfNonZeroElements> indices), or
should we store H as a Vector<float, numberOfNonZeroElements>, set all elements, and pass it to a function computeKHP(..., Vector<float, numberOfNonZeroElements> H, Vector<size_t, numberOfNonZeroElements> indices)?
In the former version the entries of H can be easily interpreted; in the latter version one needs to know the mapping to the states. It should also be easier to autogenerate the code for the first version. But in the first version we always need to initialize the rest to 0, so it would be less efficient.
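The second option could look something like this (a sketch; `SparseH` and `dotSparse` are hypothetical names, not a proposed API):

```cpp
#include <cassert>

// Store only the non-zero Jacobian entries together with the state
// indices (0..23) they map to.
template<int NZ>
struct SparseH {
    float val[NZ]; // non-zero Jacobian entries
    int idx[NZ];   // corresponding state indices
};

// Example use: a dot product H * x computed over the non-zero terms only.
template<int NZ, int NS>
float dotSparse(const SparseH<NZ> &H, const float state[NS])
{
    float sum = 0.0f;
    for (int i = 0; i < NZ; i++) {
        sum += H.val[i] * state[H.idx[i]];
    }
    return sum;
}
```

A sparse KHP computation would iterate over the same `idx` list.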

@jkflying
Contributor

Yes, my usage of this pattern was in a very different situation, with matrices of 5000+ x 5000+ but with lots of sparsity, which could be reduced down to a ~100x100 dense matrix with a permutation matrix. Particularly on larger processors (and GPUs) there is a lot of efficiency (~20x) to be gained by not spreading computations over a large matrix but operating on adjacent elements instead. I'm not sure how much benefit it will have here, considering the small sizes of our vectors and that we don't have lots of tiered memory caches (i.e. L1/L2/L3/RAM/swap).

I don't see the (mathematical) need to initialize the other elements of the matrix to 0, since we can ignore them if we know which indices we need. The initialization is just because it is in a Vector, right? You could also make a 'reverse index' which maps indices in the sparse matrix to the dense matrix. Then the autogenerated code could use the 24-based indexing, which gets translated to the dense indexing. But this might be going too far; with the small amount of computation we're doing, I don't think it would be worth the benefit.

@priseborough
Collaborator Author

I've added a script file to run the test programs.

@priseborough priseborough force-pushed the pr-ekfSymPyDerivationConversion branch from 757de17 to 7d74568 Compare July 21, 2020 11:59
@priseborough
Collaborator Author

With respect to the use of the powf(x, y) function: although the compiler appears to be optimising it to x*x, this leaves the code at the discretion of the compiler, which is not ideal. At yesterday's estimation call we discussed the options. One suggestion was to define an inlined ecl::powf(x, y) function to be used for integer values of y. I have implemented that in 0a4f5cc

The other option is to modify the Python script to do a find and replace of all powf(x, 2) instances with sq(x). The total number of instances of powf(x, y) is not large, so an editor-guided manual replacement is possible, but not ideal.

@dagar what is your view on relying on the compiler optimisations for this function? Can compilers be relied on to interpret powf(x, 2) as x*x? I don't have access to the ISO/IEC 9899:2011 document.
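One possible shape for the inlined helper (a sketch only; the exponent is taken as an int, and a deleted float overload forces a compile error for non-integer exponents — this is not necessarily how 0a4f5cc implements it):

```cpp
#include <cassert>

namespace ecl {

// Integer-exponent power: compiles to a handful of multiplies and never
// falls back to the libm powf.
inline float powf(float x, int exp)
{
    float result = 1.0f;
    const bool invert = exp < 0;
    for (int n = invert ? -exp : exp; n > 0; n--) {
        result *= x;
    }
    return invert ? 1.0f / result : result;
}

// Reject non-integer exponents such as powf(x, 0.5f) at compile time.
float powf(float, float) = delete;

} // namespace ecl
```

With this, any generated call with a fractional exponent fails to build and can be rewritten by hand (e.g. as a sqrtf).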

@jkflying
Contributor

GCC at least has been able to optimize powf(x, 2) for a very long time. I'm more concerned about cases where we have something like powf(x, 1/2.), which will be orders of magnitude slower (and less accurate) than sqrt(x) and doesn't get optimized even on current compilers. The ecl::powf solution will at least give a compiler error here, which can be fixed manually. At some point, if we really decide we hate the manual fixes, we could even make an ecl::powf(float, float) version which errors on everything except some obvious use-cases like 0.5.

Also maybe I missed it, but the ecl::powf could easily be added to the generator with something like

sed 's/\(ecl::\)\{0,1\}powf/ecl::powf/' generated/*.cpp

to make it happen with all files automatically.

@dagar
Member

dagar commented Jul 22, 2020

@dagar what is your view on relying on the compiler optimisations for this function? Can compilers be relied on to interpret powf(x, 2) as x*x? I don't have access to the ISO/IEC 9899:2011 document.

Is the concern performance or correctness?

@priseborough
Collaborator Author

priseborough commented Jul 23, 2020

@dagar The number of usages is not large, so the concern is firstly correctness and secondly performance.

Edit: What I don't want is a future change in compiler producing a slightly different numerical result.

@priseborough
Collaborator Author

@jkflying There is one place in the airspeed fusion where the code generator used powf(x, -1.0F/2.0F), but that can be reworked by hand. As you point out, ecl::powf will generate a compiler error for instances where non-integer exponents are used, so your proposal for a global find and replace should work OK and we can handle the odd exception manually.

@priseborough
Collaborator Author

The feedback to date is that the overall approach looks OK, so I will now break this down into a number of smaller PRs, starting with the covariance prediction and a separate PR for each fusion type.

@kamilritz
Contributor

@priseborough Regarding the observation jacobian H, it would be great if all of them could be of type Vector<float, numberOfNonZeroElements>

@priseborough
Collaborator Author

@priseborough Regarding the observation jacobian H, it would be great if all of them could be of type Vector<float, numberOfNonZeroElements>

Doesn't work in all instances because sometimes we only need a few entries and optimise accordingly.

@kamilritz
Contributor

@priseborough Regarding the observation jacobian H, it would be great if all of them could be of type Vector<float, numberOfNonZeroElements>

Doesn't work in all instances because sometimes we only need a few entries and optimise accordingly.

The numberOfNonZeroElements could vary for each measurement, e.g. for yaw Vector<float, 4> H_YAW, for airspeed Vector<float, 5> H_TAS, and so on. Would it be doable to optimise the zero entries away in each single case? Or is this too much manual work?
It would be great if we could be consistent across all measurements, because then we can have a single function doing the sparse KHP computation.

@priseborough
Collaborator Author

@kamilritz There are sparse matrix techniques that could be applied, but we would need to provide an occupancy mask. This would end up trading memory use for code commonality. Let's have the discussion when the PR for the first observation method goes in.

If you are able to look at the options for doing this efficiently, I will in the meantime prepare the covariance prediction PR so we don't have to make a decision right now.

@dagar
Member

dagar commented Jul 23, 2020

@dagar The number of usages is not large, so the concern is firstly correctness and secondly performance.

Edit: What I don't want is a future change in compiler producing a slightly different numerical result.

It's not an unreasonable concern, although in this particular case I really don't think it's going to be an issue.

https://godbolt.org/z/7d1q87 - With any optimization at all (-O1) you can see powf(n, 2.f) is the same as n * n (__aeabi_fmul).

Screenshot from 2020-07-23 11-44-56

In general for this kind of thing it's another reason why I want to make sure we're continuously updating PX4/Firmware with each PX4/ecl change. On the PX4/Firmware side we can fairly easily monitor changes in execution time, memory usage, and flash usage over time across a range of boards. We lose a lot of detail when each PX4/ecl bump is large.
