
Constrained MPC: Differences to MATLAB and Numeric Issues for different OS #929

Closed
JMK593 opened this issue Sep 6, 2023 · 11 comments · Fixed by #930
JMK593 commented Sep 6, 2023

In the context of a masters course on constrained optimal control, we were looking to implement the classic cart-pendulum example, with a hard state constraint on the angle of the pendulum. Our implementation followed the constrained MPC aircraft model example.

During our implementation we observed that the output control sequences from “optimal.solve_ocp” differed from those of MATLAB’s “mpcActiveSetSolver” for the same model, stage cost, and prediction horizon. The following are the control sequences at time 0:
solve_ocp: 8.085e-01 1.464e-02 4.912e-01 7.621e-01 5.672e-01 5.080e-01 4.151e-01 2.398e-01 0.000e+00 0.000e+00
MATLAB: 0.8750 -0.1281 0.6964 0.7551 0.7363 0.7937 0.6934 0.6644 0.6041 0.4130

We also observed that the output from “input_output_response” simulation differed based on platform (Windows vs Linux/MacOS). On Linux/MacOS the optimiser would fail to meet the state constraints, while this behaviour does not occur on Windows.

I have included an implementation of our code as both a Python script and a Jupyter notebook, environment package lists, plots of the constrained state, and a comparison version in MATLAB: MPC_PythonControl_Analysis.zip

Are there any plans for a classic MPC implementation in which the cost function directly computes the state predictions using the discrete-time linear dynamics, rather than integrating numerically with the trapezoid rule (lines 311-322 in optimal.py)?

Are you able to provide any advice or guidance on why the numeric simulation result is different between OS versions?

murrayrm self-assigned this Sep 6, 2023

murrayrm commented Sep 6, 2023

Thanks for your comments!

The control.optimal module is just a wrapper around scipy.optimize.minimize, so the differences in platforms probably trace back to SciPy (nothing in python-control is OS dependent, since it is pure Python).

During our implementation we observed that the output control sequences from the “optimal.solve_ocp” were different than from MATLAB’s “mpcActiveSetSolver” for the same model, stage cost, and prediction horizon.

It's odd that the results seem to be very different from MATLAB, since they should be solving the same underlying optimization problem (though with different solvers). One thing to try might be using different solvers (using the minimize_method keyword) to see if that changes the results.
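As an illustration of trying different solvers (using SciPy directly rather than python-control; the quadratic, constraint, and starting point below are made up for this sketch), different minimize methods can be compared on the same constrained problem:

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint

# Toy constrained quadratic program: minimize 0.5 u' H u + f' u
# subject to u_0 <= 0.5, u_1 <= 10. The unconstrained optimum is (1, 2),
# so the constrained optimum should sit at the boundary, near (0.5, 2).
H = np.array([[2.0, 0.0], [0.0, 2.0]])
f = np.array([-2.0, -4.0])
cost = lambda u: 0.5 * u @ H @ u + f @ u
con = LinearConstraint(np.eye(2), lb=-np.inf, ub=[0.5, 10.0])

results = {}
for method in ("SLSQP", "trust-constr", "COBYLA"):
    res = minimize(cost, np.zeros(2), method=method, constraints=[con])
    results[method] = res.x   # each method should land near (0.5, 2.0)
```

Small per-method differences in the returned point are normal; large differences (as reported here against MATLAB) suggest a different underlying problem rather than solver tolerance.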

We also observed that the output from “input_output_response” simulation differed based on platform (Windows vs Linux/MacOS). On Linux/MacOS the optimiser would fail to meet the state constraints, while this behaviour does not occur on Windows.

We have seen in some unit testing that different builds can sometimes have convergence issues (see, for example, #838, #918). These seem to come from different choices of BLAS and other low-level routines. Could you run np.show_config() on the two platforms and post the results here, so that we can see whether that explains the differences?

Are there any plans for classic MPC implementation for the cost function to directly compute the state predictions using the discrete-time linear dynamics rather than numerically computed using the trapezoid rule (line 311-322 in optimal.py)?

I think the only way to do this right now is to explicitly create a discrete-time nonlinear I/O system. If someone wanted to add an integral_cost_method parameter that allowed different choices, that would be another approach.


JMK593 commented Sep 7, 2023

Thank you Professor Murray for your quick response.

It's odd that the results seem to be very different from MATLAB, since they should be solving the same underlying optimization problem (though with different solvers).

I have tried all solvers available to scipy.optimize.minimize on the Mac (ARM) and Linux systems. Results here: Scipy_optimization_solver.zip

No method produced results matching MATLAB. Several solver types returned "Method %s cannot handle constraints.", or returned an error requiring the Jacobian function directly as an input. I am intrigued to see that the optimization results change based on OS/architecture; see, for example, the 'TNC' or 'Powell' methods.

This result indicates to me that the underlying optimization problem is not the same. Perhaps this difference comes from the use of trapezoidal integration of the cost function over the prediction horizon in optimal.py. In the case of my problem (deterministic, discrete-time, LTI) the Hessian can be computed directly at initialization of the controller. I had expected 'create_mpc_iosystem' or 'solve_ocp' to perform this step, if available, based on the dynamics.
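For a deterministic discrete-time LTI problem, the prediction matrices (Lambda, Phi in the notation used later in this thread) and hence the condensed Hessian 2 * (Phi.T @ Qbar @ Phi + Rbar) can indeed be built once at initialization. A minimal sketch, with illustrative A and B matrices (not the cart-pendulum model):

```python
import numpy as np

# Condense x_{k+1} = A x_k + B u_k over horizon N into X = Lambda x_0 + Phi U,
# where X = [x_1; ...; x_N] and U = [u_0; ...; u_{N-1}] are stacked vectors.
def prediction_matrices(A, B, N):
    n, m = B.shape
    Lambda = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(N)])
    Phi = np.zeros((N * n, N * m))
    for i in range(N):            # block row: prediction step i+1
        for j in range(i + 1):    # block col: input step j
            Phi[i*n:(i+1)*n, j*m:(j+1)*m] = \
                np.linalg.matrix_power(A, i - j) @ B
    return Lambda, Phi

# Consistency check against a direct simulation of the dynamics
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
N = 4
Lam, Phi = prediction_matrices(A, B, N)
x0 = np.array([[1.0], [0.0]])
U = np.arange(1.0, N + 1).reshape(-1, 1)

X_condensed = Lam @ x0 + Phi @ U          # stacked [x_1; ...; x_N]
x, X_loop = x0, []
for k in range(N):
    x = A @ x + B * U[k]
    X_loop.append(x)
X_loop = np.vstack(X_loop)
assert np.allclose(X_condensed, X_loop)
```

With these matrices, the quadratic cost in U has a constant Hessian, which is what MATLAB's mpcActiveSetSolver exploits.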

Can you use np.show_config() on the two platforms and post the results

Please see numpy.show_config() on four platforms (Windows-intel, Mac-intel, Mac-arm, Linux): numpy_configs.zip

The control.optimal module is just a wrapper around scipy.optimize.minimize, so the differences in platforms probably trace back to SciPy (nothing in python-control is OS dependent, since it is pure Python).

different builds can sometimes have convergence issues (see, for example, #838, #918).

I am concerned that scipy.optimize.minimize and numpy appear to return different numeric results depending on OS or system architecture! Additionally, for teaching purposes we intended to use JupyterLab hosted on our university's HPC. Our HPC runs openSUSE OS and has both Intel and non-Intel CPU nodes, thus making it an 'exotic' platform in the sense of #838.

murrayrm commented Sep 9, 2023

I took a look at this example. A couple of comments:

  • At a high level, all of the methods are giving roughly the same response for the MPC controller. There are small differences, as would be expected from using different algorithms with different convergence criteria. Some of the optimizers are struggling because the optimal controller pushes up against the angle constraint.
  • The default solver is SciPy-version (and perhaps OS) dependent. On my Mac, it defaulted to 'SLSQP', which was not able to converge on at least a few of the MPC iterations (which led to a slight constraint violation), but when I changed the method to 'trust-constr', it worked OK (but took a long time).
  • One issue that I uncovered is that there is no way to set the minimize_method when setting up an MPC problem (via create_mpc_iosystem). I'll fix that in an upcoming PR (and I used that local change to switch the solver).

I don't have MATLAB on my computer, so I can't yet sort out why the MATLAB solution to the fixed-horizon problem is so different from the python-control version. Since this is a discrete-time system, the trapezoidal rule is not being used (the code lines mentioned above only get used if ct.isctime(self.system) evaluates to True, which is not the case here).

I'll keep looking at this, but wanted to post some initial observations.

@murrayrm

I did a bit more looking into this, trying to understand the difference in the results from MATLAB versus those computed using solve_ocp.

I used the following code to compute the cost of the trajectory associated with the inputs returned by MATLAB (as reported above):

matlab_inputs = [
    0.8750, -0.1281, 0.6964, 0.7551, 0.7363, 
    0.7937, 0.6934, 0.6644, 0.6041, 0.4130]
matlab_resp = ct.input_output_response(system, np.arange(0, finiteHorizon) * Ts, matlab_inputs, x_0)
matlab_cost = sum(map(
                cost, np.transpose(matlab_resp.states[:, :-1]),
                np.transpose(matlab_resp.inputs[:, :-1])))

This returns a cost of 569.2907752964345.

The same computation applied to the optimal inputs returned by obc.solve_ocp gives 568.2949584258572. This would imply that the MATLAB cost is not in fact optimal, at least according to the cost function used here.

A possibility is that MATLAB is including the terminal point of the trajectory in the cost calculation (whereas solve_ocp uses the terminal_cost parameter to set the value for that). Replacing the solve_ocp call with

res_terminal = obc.solve_ocp(
    model, np.arange(0, finiteHorizon) * Ts, x_0.flatten(), cost, constraints,
    terminal_cost=cost)

yields

res_terminal.inputs = [
    0.83727566 -0.04713427  0.58002926  0.86883298  0.62362542  
    0.59600354  0.54358056  0.41263552  0.21386074  0.        ]

and a cost of 628.2607745091476. The MATLAB optimal trajectory with the added terminal cost gives a total cost of 628.9253400095968, so it is still not optimal.

Conclusion based on these data: MATLAB seems to be solving a different problem than solve_ocp. This could be a bug in solve_ocp or just a difference in the way that problems are set up.

@JMK593: If you have any thoughts/insights, please post here.


JMK593 commented Sep 13, 2023

Thank you @murrayrm for your continued investigation.

I have done some further analysis.
First, I found an error in the computation of my Hessian, my apologies: I left out a bracket.
Previous: 2 * Phi.'*Qbar*Phi + Rbar
Correct: 2 * (Phi.'*Qbar*Phi + Rbar)
This changes the control action penalty and thus the output control sequence from the optimizer. However, the underlying issue still remains.

Second, following your most recent post I investigated ct.input_output_response. I believe the bug may be here, as I assume it is used in the creation of the optimizer problem.

For clarity I have outlined my dynamics and finite horizon cost function here: Python_Control_Testing_Issue_929.pdf

Note:

  • For simplicity I use the same state penalty in the stage cost as for the terminal state penalty.
  • Also, there are two ways of writing the cost function: a full form and a simplified form. The simplified version removes the constant component; the optimization problems are equivalent.

Python and updated MATLAB Code here: Code.zip

In MATLAB I implemented the simplified cost function (equation (4)) using the mpcActiveSetSolver subject to the inequality constraints (equation (5)).
Following my fix to the Hessian (noted above) the control sequence is: 0.851223168355364 -0.077048264865308 0.623036904869950 0.826831770859904 0.665257404304965 0.685763761198126 0.575030284310556 0.495404310892419 0.363276926486268 0.180415598218110
Cost Simplified: -16.904383936469159
Cost Full: 6.870956160635309e+02

To provide a fair comparison, I implemented this same simplified cost function in scipy.optimize.minimize directly, subject to the linear inequality constraints. This bypasses obc.solve_ocp. Subject to slight differences from the different optimization routines, one would expect the result to be the same, within floating-point error tolerance.

costFunc = lambda u: 0.5 * u.transpose() @ H @ u + fT.transpose() @ u
output_directHessianLambda = scipy.optimize.minimize(costFunc, guess0.flatten(), constraints=constrainFunc)

Control Sequence: [ 8.512e-01 -7.706e-02 6.231e-01 8.268e-01 6.653e-01 6.858e-01 5.750e-01 4.954e-01 3.632e-01 1.804e-01]
Cost (simplified): -16.904383924727043

The result is almost the same as MATLAB! (The difference in cost appears to be smaller than single-precision floating-point tolerance.)

Now consider a direct implementation of the cost function using the dynamics. This is an implementation of equation (3).

def costFuncFULLWithState(useq):
    state = x_0
    cost = 0
    err = state - xstar
    cost += err.transpose() @ Q @ err
    for uk in useq:
        state = A@state + B * uk
        err = state - xstar
        cost += err.transpose() @ Q @ err + uk * R * uk
    return cost

Control Sequence: [8.512e-01 -7.706e-02 6.231e-01 8.268e-01 6.653e-01 6.857e-01 5.750e-01 4.954e-01 3.632e-01 1.804e-01]
Cost (full): 687.0956160759122

The result is almost the same sequence as above, and is almost the same as MATLAB (both sequence and cost)!

Now let us try with the ct.input_output_response function to compute the states over the finite horizon as a function of the input control sequence. Then using the predicted states, we compute the cost.

def costFuncWithState_io_resp(useq):
    io_state = ct.input_output_response(
        system, np.arange(0, finiteHorizon) * Ts, useq, x_0.flatten())
    cost = 0
    for k, state in enumerate(io_state.states.transpose()):
        err = np.reshape(state, (4, 1)) - xstar
        cost += err.transpose() @ Q @ err + useq[k] * R * useq[k]
    return cost

Control sequence: [ 8.373e-01 -4.719e-02 5.801e-01 8.687e-01 6.237e-01 5.959e-01 5.436e-01 4.126e-01 2.140e-01 2.518e-04]
Cost (full): 628.2607745678216

Intriguingly, the result is now different, but is similar to the output in your recent response. As you noted the cost is also smaller than the output from MATLAB.

Let us inspect whether ct.input_output_response for a given control sequence of length 10 (i.e. u_0, u_1, ..., u_9) returns a state sequence of length 11 (i.e. x_0, x_1, ..., x_10).

Terminal output (state dynamics directly): terminaloutput_direct.txt

Terminal output (ct.input_output_response): terminaloutput_input_output_response.txt

Confirmed it does not! Given the control sequence u_0, u_1, ..., u_9 (10 values), the function ct.input_output_response only returned x_0, x_1, ..., x_9 (10 values). The terminal state value (x_10 = A x_9 + B u_9) was not returned and the final value of the control sequence was not utilized! Thus the terminal state and the final control action were not used in the cost function! This perhaps explains why there is a non-zero number for the last control action while obc.solve_ocp returns a 0 for the final control action.
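As a cross-check outside python-control, SciPy's scipy.signal.dlsim follows the same convention: N input samples yield N state samples (x_0 through x_{N-1}), so x_N is never returned and the last input is never used. A minimal sketch with an illustrative system (not the cart-pendulum model):

```python
import numpy as np
from scipy import signal

# Simple discrete-time double integrator, dt = 1 (illustrative only)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
C = np.eye(2)
D = np.zeros((2, 1))
sys_d = (A, B, C, D, 1.0)

u = np.ones((10, 1))                       # u_0 ... u_9
tout, yout, xout = signal.dlsim(sys_d, u, x0=np.zeros(2))

# xout has 10 rows: x_0 ... x_9. The terminal state x_10 is not returned
# and u_9 is never consumed, so it must be computed by hand:
x_N = A @ xout[-1] + B @ u[-1]
```

This matches the N-in/N-out behavior observed for ct.input_output_response above.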

Now including the final state value x_10 directly. Inserting the following code in the cost function computation after the for loop:

finalstate = A@np.reshape(state,(4,1)) + B * useq[-1]
err = finalstate - xstar
cost += err.transpose() @ Q @ err

Control sequence: [8.512e-01 -7.706e-02 6.231e-01 8.268e-01 6.653e-01 6.858e-01 5.750e-01 4.954e-01 3.632e-01 1.804e-01]
Cost (full): 687.0956160758178
This is now approximately the same sequence and cost as MATLAB, and the earlier implementations.

This confirms the original suspicion that the obc.solve_ocp was not solving the same underlying optimization problem as the MATLAB implementation. The ct.input_output_response function does not return the terminal state as a function of the final control action.

I have not found anything yet to suggest that platform (OS/CPU architecture) influenced the output from scipy.optimize.minimize. Given the same functional problem, scipy.optimize.minimize returned the same output as MATLAB (subject to tolerances). If it is of interest, I can run this script on other machines and report back, although I do not anticipate any differences.

As a final note, in the above I used the linear constraints computed in the form Ain u <= Bin. This was consistent across all implementations of scipy.optimize.minimize and my MATLAB code.

Ain = np.concatenate((-Phi,Phi,-np.eye(finiteHorizonINT),np.eye(finiteHorizonINT)),axis=0)
Bin = np.concatenate((Lambda@x_0 - xmin, xmax - Lambda@x_0, -umin, umax),axis=0) # note: function of the state
constrainFunc = scipy.optimize.LinearConstraint(Ain, lb=-np.inf, ub=Bin.flatten(), keep_feasible=False)

If the constraints in obc.solve_ocp are computed using ct.input_output_response, then I suspect that a fix to ct.input_output_response to include the final state may change the constraints given to the optimizer through obc.solve_ocp.

@murrayrm

Thanks for all of this extra information @JMK593. I think it is now clear what is going on: indeed, the problem solved in MATLAB is different from the one solved by solve_ocp. However, I don't think the issue is a bug in input_output_response; rather, a change needs to be made to solve_ocp. Here's why...

As with MATLAB, input_output_response (lsim in MATLAB) returns a simulation with the same number of time points as the length of the time and input vectors. So if you pass time and input arrays of length $N$ (so $[t_0, t_1, ..., t_{N-1}]$ and $[u_0, u_1, ..., u_{N-1}]$), you get back a response of length $N$ (so $[x_0, x_1, ..., x_{N-1}]$). The "final" state $x_N$ is not computed and hence the final input $u_{N-1}$ is not used. This matches the documented behavior for both input_output_response and the MATLAB lsim command ("y = lsim(sys,u,t) returns the system response y, sampled at the same times t as the input. For single-output systems, y is a vector of the same length as t").

The optimal control computation in solve_ocp uses input_output_response internally. However the "trajectory cost" (cost parameter) does not currently include the last point in the simulation (so time point $t_{N-1}$). You could include that point by using the terminal_cost parameter, but I think this is still not quite the behavior we want. What I think most people would expect is that the terminal point for a discrete time system would correspond to the state $x_N$, which is the state that you get after applying the final input $u_{N-1}$.

With the current code, if you want to get the same optimal control problem that you passed to “mpcActiveSetSolver” in MATLAB, then the function call needs to look like this:

res = obc.solve_ocp(
    model, np.arange(0, finiteHorizon + 1) * Ts, x_0.flatten(), cost, constraints,
    terminal_cost=cost)

This returns the input

[ 0.8512555  -0.07711761  0.62313661  0.8267344   0.66535392  0.68563258
  0.57505296  0.4952667   0.36327124  0.18020503  0.        ]

which is the same one you reported above for the manually constructed optimization.

However, I believe that the proper implementation of solve_ocp on discrete time systems would be to have the trajectory cost include everything up to time $t_{N-1}$ and then have the terminal cost be applied to the state at time $t_N$. In particular, for a finite-time, linear quadratic optimal control problem, the cost should be

$J = \sum_{k=0}^{N-1} \left( x_k^T Q x_k + u_k^T R u_k \right) + x_N^T Q_N x_N$
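As a concrete reading of this cost, here is a minimal numeric sketch (the function and the scalar example are illustrative, not python-control API):

```python
import numpy as np

# Finite-horizon LQ cost for x_{k+1} = A x_k + B u_k:
#   J = sum_{k=0}^{N-1} (x_k' Q x_k + u_k' R u_k) + x_N' Q_N x_N
def lq_cost(A, B, Q, R, QN, x0, u_seq):
    J, x = 0.0, x0
    for u in u_seq:
        J += x @ Q @ x + u @ R @ u   # stage cost at (x_k, u_k)
        x = A @ x + B @ u            # advance to x_{k+1}
    return float(J + x @ QN @ x)     # terminal cost applied at x_N

# Scalar example: A = 0.5, B = 1, Q = R = Q_N = 1, x_0 = 1, single input u_0 = 0
J = lq_cost(np.array([[0.5]]), np.array([[1.0]]),
            np.array([[1.0]]), np.array([[1.0]]), np.array([[1.0]]),
            np.array([1.0]), [np.array([0.0])])
# J = 1^2 + 0 + 0.5^2 = 1.25
```

The key point is that the terminal term is evaluated at x_N, the state reached after applying the final input u_{N-1}.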

With that change (not yet implemented), then the original function call

res = obc.solve_ocp(
    model, np.arange(0, finiteHorizon) * Ts, x_0.flatten(), cost, constraints)

should work just fine.

Making this change in the code will cause inconsistencies for anyone relying on the current behavior for solve_ocp applied to discrete time systems (and for MPC controllers created using create_mpc_iosystem). However, since we are moving from version 0.9.4 to version 0.10.0, this would probably be the right time to make this change. I'll include it in PR #930 and that should fix things so that your original call returns an output similar to what you see in MATLAB (and the hand constructed optimal control problem you passed to SciPy).


JMK593 commented Sep 14, 2023

Thank you @murrayrm. I now see the underlying issue. I look forward to testing version 0.10.0 with the revised solve_ocp and create_mpc_iosystem.

Regarding MATLAB's lsim, I think it is misleading to take a sequence of control actions of length N and return a state sequence of length N while completely ignoring the final control action.

As an improvement of python-control over this misleading MATLAB usage, in the case of a discrete-time system, if given an initial condition (x_0) with associated time (t_0) and a control sequence of length N, is it necessary to also require the time points as input? Would it not be easy to compute the output times associated with the state sequence? Then, given an initial state and a control sequence of length N, the output sequence would have the appropriate length of N+1.

@murrayrm

I played around a bit more with implementing the change I described above and decided that the way we have it is actually the most clear (and of course maintains backward compatibility). The main reason for sticking with the current implementation is that all sorts of things get complicated when you start returning inputs and states that are a different length than the list of time points that you specified (which I found out by seeing how many unit tests I started having to modify, as well as the number of examples that broke).

So for now the current behavior is as follows: for a discrete time system, if you give a list of time points $[t_0, \dots, t_N]$, corresponding to a horizon of length $N$ (and an array of length $N+1$), then the trajectory cost is computed at the time points $[t_0, \dots, t_{N-1}]$ and the terminal cost is computed at the time point $t_N$. This makes sense (to me) since you are explicitly specifying the points in time and the last point corresponds to the final time. I've added some additional comments in the documentation to make this more clear.

Following that documentation (see PR #930), if you want to have the stage cost applied to all points including the last one, you should use a call of the form

res = obc.solve_ocp(
    model, np.arange(0, finiteHorizon + 1) * Ts, x_0, cost, constraints,
    terminal_cost=cost)

To see why this makes sense, suppose we represent the value of finiteHorizon as $N$ and assume Ts is 1 (to simplify the discussion). This will set the timepts vector to $[0, 1, ..., N-1, N]$ and the final point in the optimization is at time $N$. If you want to include the cost of the state at time $N$ in the overall cost, you need to set terminal_cost=cost. All of that gives the same optimization problem as the one assembled manually in MATLAB or SciPy in the discussion above.

This may all seem odd, but to me it makes more sense when you think about the fact that np.arange returns a list of points that does not include the stop value. So if you have a horizon of length $N=1$, you get the associated time points ($[0, 1]$) by using the command np.arange(0, 2), and your initial time is $k = 0$ and time at the end of the horizon is $k = 1$.
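Since the discussion hinges on the off-by-one behavior of np.arange, a quick concrete check (the value of N here is illustrative):

```python
import numpy as np

# np.arange excludes its stop value, so a horizon of N steps needs stop = N + 1
# to produce the N + 1 time points [0, 1, ..., N].
N = 10
timepts = np.arange(0, N + 1)
assert len(timepts) == N + 1
assert timepts[0] == 0 and timepts[-1] == N
```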

murrayrm linked a pull request Sep 16, 2023 that will close this issue

JMK593 commented Sep 18, 2023

Thank you @murrayrm for considering my suggestion.

I further explored the current workaround for solve_ocp in the context of simulating the system. In the following I explore the difference between the control sequences obtained from solve_ocp and from scipy.optimize.minimize with the cost function constructed directly.
Files here: Code_V2.zip

The current workaround to achieve the same cost function as a finite horizon of length N is to add 1 to the horizon, which returns a control sequence of length N+1 whose final value is 0. We would expect the final value of the returned control sequence to stay at 0 for all time, since this control action does not appear in the cost function.

For optimal control problem

obc.solve_ocp(system, np.arange(0, finiteHorizon+1) * Ts, xin, cost, constraints, terminal_cost=cost, squeeze=True)

and dynamics simulated by

x[:,ind+1] = (A@xcurrent + B*uapplied[ind]).flatten()

intriguingly the final value (N+1, the extra output from solve_ocp) does not remain at 0.
[figure: 1_CONTROLOPTIMALSOLVEOCP_final_sequence_value]
This difference impacts the first value of the control sequence (the applied control action).
[figure: 2_CONTROLOPTIMALSOLVEOCP_diff_applied_control]
Note: this is just the direct numeric difference (a - b).

I further explored by removing the terminal_cost=cost from solve_ocp.

obc.solve_ocp(system, np.arange(0, finiteHorizon+1) * Ts, xin, cost, constraints, squeeze=True)

The final value of the sequence does remain at zero, but the applied control is considerably different.
[figure: 3_CONTROLOPTIMALSOLVEOCP_NOTERMCOST_final_sequence_value]
[figure: 4_CONTROLOPTIMALSOLVEOCP_NOTERMCOST_diff_applied_control]

Finally, I created a new terminal cost with R=0.

termcost = obc.quadratic_cost(system, Q, 0, x0=xstar.flatten(), u0=0)
obc.solve_ocp(system, np.arange(0, finiteHorizon+1) * Ts, xin, cost, constraints, terminal_cost=termcost, squeeze=True)

The final value of the control sequence remains at zero, as desired. The applied control action is still different.
[figure: 5_CONTROLOPTIMALSOLVEOCP_SPECTERMCOST_final_sequence_value]
[figure: 6_CONTROLOPTIMALSOLVEOCP_SPECTERMCOST_diff_applied_control]

While the workaround and the potential solution posed in the earlier answers are sufficient to produce the same optimization problem (and the same optimizer output) for the initial condition, they do not work for subsequent state values. This includes different choices for the terminal cost. The issue remains and cannot be closed.

Additionally, in my initial post I raised a second issue: the use of create_mpc_iosystem, feedback, and simulation via input_output_response produces different numeric results on different platforms.
Creating the MPC object with the proposed workaround

ctrl = obc.create_mpc_iosystem(system, np.arange(0, finiteHorizon +1) * Ts, cost, constraints, terminal_cost=cost)

and comparing the control actions to a simulation using the optimal control problem at each time

res = obc.solve_ocp(system, np.arange(0, finiteHorizon+1) * Ts, xcurrent.flatten(), cost, constraints, terminal_cost=cost, squeeze=True)

with dynamics computing the next state at each time

ct.input_output_response(system, np.arange(0, 2)*Ts, useq[ind,0:2], xcurrent.flatten())

By my limited understanding of how create_mpc_iosystem and feedback operate, I would expect the underlying optimization problem and the propagation of the dynamics to be the same. However, I found that the applied control action is different.
[figure: 9_mpcio_loop_diff_applied_control]
I have assumed that the outputs from input_output_response operated on a feedback loop are in the form [states, control sequence]. I additionally noted that the first value (t=0) of all of the control sequences is 0, and that the second value (t_1) matches the optimization output from the earlier discussions.

Finally, for Linux and Mac platforms the resulting simulation still violates the constraints. In my analysis (see the comments above), when scipy.optimize.minimize is given the optimization problem directly, there appear to be no or minimal numeric issues. The numeric issue of loop simulation with constraint-violating differences between platforms has not yet been addressed.

@murrayrm

It will take me a while to look through all of that, but one quick comment:

I have assumed that the outputs from input_output_response operated on a feedback loop are in the form of [states, control sequence].

This is not correct. The output of the feedback function is whatever the outputs are for the system given in the first argument. The actual input from the controller is not available (though it could be recomputed from the internal state of the controller).

If you want to create a system that has both the outputs from the system and the inputs from the controller, you need to use interconnect and explicitly describe what is connected to what. Alternatively, in PR #930 you can use create_statefbk_iosystem:

loop = ct.create_statefbk_iosystem(system, ctrl)

which will have output given by [system states, control output]. (I added this functionality while debugging because having to use interconnect was a pain.)


JMK593 commented Sep 18, 2023

Thank you for the quick response.

I stated incorrectly that I used the outputs from res = ct.input_output_response, implying that I used res.outputs. I actually used res.states.

I found that res.states has size (number of system states + length of the finite prediction horizon). I naively assumed that, after the system states (the first set of values), the rest of res.states would be the optimizer's output values in the closed loop. I also noted that in this second set of states the value at discrete-time index 1 had the same values as the optimizer output from earlier. Observe in the last figure above that the difference in control action between my two tests remains at 0 for several steps. (The control actions in both simulations are definitely non-zero initially.) Hence my naive assumption that this was the control action.

As suggested I have re-run the loop simulation using the updated functions

clctrl, clsys = ct.create_statefbk_iosystem(system, ctrl)

statefbk_results = ct.input_output_response(clsys, timeVec, 0, x_0.flatten())

Please see the difference between the applied control action extracted from statefbk_results.outputs[:,:-1] and the control action computed from obc.solve_ocp with the state propagated via ct.input_output_response at every time step (i.e., operating the time loop manually). The shape of the curve appears similar to the earlier figure.
[figure: 8_diffinappliedcontrolfromcreate_statefbk_iosystem]

The difference between what I assumed was the control action and the output from ct.input_output_response is nonzero, although I do note that the first few values of this difference are zero.
[figure: 9_appliedcontroldiff_mpcio_loop_to_create_statefbk]

Also, for clarity, here is the difference in control action between using the loop from ct.create_statefbk_iosystem and directly computing the control action from scipy.optimize.minimize with the state propagated manually. Again, the first few values of the difference are zero.
[figure: 10_appliedcontroldiff_create_statefbk_TO_originalsumofstate]

Updated code here: Code_V3.zip
