# MPC for switched linear systems with random disturbances


In this live script, the goal is to control a switched linear system of the 
following form

$$x_{k+1}=A_{\sigma_{k}} x_{k}+w_{k}$$

where $x_{k} \in \mathbb{R}^{n}$ is the state at time $k$, $\sigma_{k} \in\{1, 
\ldots, m\}$ is a discrete input, and $\left\{w_{k} \mid k \in \mathbb{N}_{0}\right\}$ 
is a sequence of stochastic disturbances with zero mean and covariance $W=\mathbb{E}\left[w_{k} 
w_{k}^{\top}\right]$ for every $k$. We consider the following average cost

$$\lim _{h \rightarrow \infty} \frac{1}{h} \mathbb{E}\left[\sum_{\ell=0}^{h-1} 
x_{\ell}^{\top} Q x_{\ell}\right]$$
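
As a rough illustration of this setup, the sketch below simulates the switched system under a given switching rule and estimates the average cost over a finite number of steps. Gaussian disturbances are an assumption made here (the text only fixes the mean and covariance), and the name `average_cost`, the two mode matrices, and the constant policy are hypothetical choices for illustration.

```python
import numpy as np

def average_cost(A, Q, W, policy, x0, steps, rng):
    """Empirical (1/h) * sum_{l<h} x_l^T Q x_l along one simulated trajectory."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    total = 0.0
    for _ in range(steps):
        total += x @ Q @ x
        sigma = policy(x)                            # 1-based mode index
        w = rng.multivariate_normal(np.zeros(n), W)  # assumed Gaussian, zero mean, covariance W
        x = A[sigma - 1] @ x + w
    return total / steps

# Hypothetical stable modes, chosen only for illustration
A = [np.array([[0.5, 0.1], [0.0, 0.4]]), np.array([[0.3, 0.0], [0.2, 0.6]])]
Q = np.eye(2)
W = 0.01 * np.eye(2)
rng = np.random.default_rng(0)

# Always apply mode 1 and estimate the average cost from 2000 steps
cost_mode1 = average_cost(A, Q, W, lambda x: 1, np.ones(2), 2000, rng)
print(cost_mode1)
```

A single finite trajectory only approximates the limit in the cost definition; longer runs or averaging over several seeds would give a better estimate.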

The goal is to provide an MPC-inspired policy for this problem with horizon 
$H$, denoted by

$$\sigma _k \ = \ \mu (x_k),$$

which takes the following form

$$\mu (x_k)\ = \ \Sigma _0 (x_k),$$

where

$$\left(\Sigma_{0}\left(x_{k}\right), \Sigma_{1}\left(x_{k}\right), 
\ldots, \Sigma_{H-1}\left(x_{k}\right)\right)=\underset{\bar{\sigma}_{k}, \bar{\sigma}_{k+1}, 
\ldots, \bar{\sigma}_{k+H-1}}{\arg\min}\ \mathbb{E}\left[\left.\sum_{\ell=k}^{k+H-1} 
x_{\ell}^{\top} Q x_{\ell}\ \right| x_k\right]$$

$$\text{s.t. } x_{\ell+1}=A_{\bar{\sigma}_{\ell}} x_{\ell}+w_{\ell}$$

The values $\bar{\sigma}_k, \ldots, \bar{\sigma}_{k+H-1}$ in the optimization 
are constants chosen at time $k$: they do not depend on the future states 
$x_{k+1}, \ldots, x_{k+H-1}$.
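
Since the candidate sequences are fixed tuples of modes, the minimization is over a finite set that can be listed explicitly. The following sketch enumerates them with `itertools.product`; the values of `m` and `H` are arbitrary illustrative choices.

```python
from itertools import product

m, H = 2, 3  # illustrative number of modes and horizon

# Every candidate sequence (sigma_k, ..., sigma_{k+H-1}) with 1-based modes
sequences = list(product(range(1, m + 1), repeat=H))

print(len(sequences))  # number of candidates, m**H
print(sequences[:3])   # first few candidates
```

Because $x_{k+H-1}$ depends only on the modes up to $\bar{\sigma}_{k+H-2}$, the last entry of each tuple does not influence the cost, so enumerating the first $H-1$ entries is enough.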

We can obtain an explicit expression for this policy. To simplify notation, 
shift time so that the current state is $x_0$ and the horizon covers 
$\ell = 0, \ldots, H-1$. For a fixed sequence $\bar{\sigma}_0, \ldots, \bar{\sigma}_{H-1}$, 
the cost-to-go 

$$\mathbb{E}\left[\left.\sum_{\ell=k}^{H-1} x_{\ell}^{\top} Q x_{\ell}\ \right| x_k\right]$$

can be written as $x_k^{\top} P_k x_k+\alpha_k$ for $k \in \{0,1,2,\dots,H-1\}$. 
This holds for $k=H-1$, since 

$$\mathbb{E}\left[\left.\sum_{\ell=H-1}^{H-1} x_{\ell}^{\top} Q x_{\ell}\ \right| x_{H-1}\right]=x_{H-1}^{\top} 
\underbrace{Q}_{P_{H-1}}x_{H-1}+\underbrace{0}_{\alpha_{H-1}}$$

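and, assuming it holds for a given $k+1$, i.e., that $\mathbb{E}\left[\left.\sum_{\ell=k+1}^{H-1} 
x_{\ell}^{\top} Q x_{\ell}\ \right| x_{k+1}\right] = x_{k+1}^{\top} P_{k+1}x_{k+1}+\alpha_{k+1}$, 
it also holds for $k$. Indeed, assuming (as is implicit above) that the disturbances 
are independent across time, so that $w_k$ is independent of $x_k$, the tower 
property of conditional expectation gives

$$\begin{aligned}
\mathbb{E}\left[\left.\sum_{\ell=k}^{H-1} x_{\ell}^{\top} Q x_{\ell}\ \right| x_k\right] 
&= x_{k}^{\top} Q x_{k}+ \mathbb{E}\left[\left.\sum_{\ell=k+1}^{H-1} x_{\ell}^{\top} Q x_{\ell}\ \right| x_k\right] \\
&= x_{k}^{\top} Q x_{k}+ \mathbb{E}\left[\left. x_{k+1}^{\top} P_{k+1} x_{k+1}+\alpha_{k+1}\ \right| x_k\right], 
\qquad x_{k+1} = A_{\bar{\sigma}_k}x_k+w_k \\
&= x_k^{\top}\underbrace{\left(Q + A_{\bar{\sigma}_k}^{\top} P_{k+1}A_{\bar{\sigma}_k}\right)}_{P_k}x_k
+\underbrace{\mathbb{E}[w_k^{\top}]}_{=0} P_{k+1}A_{\bar{\sigma}_k}x_k
+x_k^{\top}A_{\bar{\sigma}_k}^{\top}P_{k+1}\underbrace{\mathbb{E}[w_k]}_{=0}
+\underbrace{\alpha_{k+1}+\mathbb{E}[w_k^{\top} P_{k+1}w_k]}_{\alpha_k}
\end{aligned}$$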

Note that $\mathbb{E} [w_k^{\top} P_{k+1}w_k]=\text{trace}(P_{k+1}W)$. The matrices 
$P_k$ and the scalars $\alpha_k$ depend on the chosen switching sequence. We can 
then pick the sequence that minimizes $x^{\top} P_0 x+\alpha_0$, where $x$ is the 
current state (the first state of the horizon).
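
The backward recursion can be sketched for a single fixed sequence as follows, with the stage cost $x_k^{\top} Q x_k$ included in $P_k$ and the terminal values $P_{H-1}=Q$, $\alpha_{H-1}=0$. The helper name `sequence_cost` is hypothetical, and `seq` is assumed to hold the $H-1$ modes (1-based) that actually affect the cost.

```python
import numpy as np

def sequence_cost(A, Q, W, seq, x):
    """Expected cost x^T P_0 x + alpha_0 for a fixed 1-based mode sequence."""
    P, alpha = Q.copy(), 0.0                # P_{H-1} = Q, alpha_{H-1} = 0
    for sigma in reversed(seq):
        Ak = A[sigma - 1]
        alpha += np.trace(P @ W)            # alpha_k = alpha_{k+1} + trace(P_{k+1} W)
        P = Q + Ak.T @ P @ Ak               # P_k = Q + A^T P_{k+1} A
    return x @ P @ x + alpha

A = [np.array([[1.0, 1.0], [3.0, -2.0]]), np.array([[0.0, 1.0], [3.0, -2.0]])]
Q, W = np.eye(2), np.eye(2)
x = np.array([1.0, 0.0])

# Horizon H = 2: one switching decision, compare both modes
print(sequence_cost(A, Q, W, (1,), x))
print(sequence_cost(A, Q, W, (2,), x))
```

For $H=2$ the cost reduces to $x^{\top} Q x + \|A_{\sigma} x\|^2 + \text{trace}(W)$ when $Q = I$, which can be checked by hand against the printed values.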

The function `stochmpcsls` below implements this policy. It takes as input 
the list `A` of mode matrices (`A[i-1]` holds $A_i$), the disturbance covariance 
`W`, the cost matrix `Q`, the horizon `H`, and the state `x` at which $\mu (x)$ 
is evaluated. It returns $\sigma = \mu (x)$, $\sigma \in \{1,\ldots,m\}$, for 
the given state $x$. An example is provided next.

In [None]:
import numpy as np

In [None]:
def stochmpcsls(A, W, Q, H, x):
    """Return the first mode (1-based) of the best fixed switching sequence."""
    p = len(A)         # number of modes m
    n = A[0].shape[1]  # state dimension
    x = np.asarray(x, dtype=float).reshape(n)  # accept column or flat vectors
    # Enumerate all p**(H-1) combinations of the H-1 switching decisions that
    # affect the cost; column k holds the (1-based) mode applied at step k
    sigmaopt = np.zeros((p**(H-1), H-1), dtype=int)
    for k in range(H-1):
        sigmaopt[:, [k]] = np.kron(np.ones((p**(H-k-2), 1)), np.kron(np.arange(p).reshape(p, 1) + 1, np.ones((p**k, 1))))

    # compute matrices and scalars and corresponding cost of each switching
    # option
    alpha = np.zeros((p**(H-1), H))
    cost = np.zeros(p**(H-1))
    P = [[np.zeros((n, n)) for idx in range(H)] for kdx in range(p**(H-1))]
    for ind in range(p**(H-1)):
        P[ind][H-1] = Q  # terminal condition P_{H-1} = Q, alpha_{H-1} = 0
        for k in range(H-2, -1, -1):
            Ak = A[sigmaopt[ind, k] - 1]
            P[ind][k] = Ak.T @ P[ind][k+1] @ Ak + Q
            alpha[ind, k] = np.trace(P[ind][k+1] @ W) + alpha[ind, k+1]
        cost[ind] = alpha[ind, 0] + x @ P[ind][0] @ x

    # obtain the best switching option
    indmin = np.argmin(cost)
    # return the first switching of the optimal switching sequence
    return int(sigmaopt[indmin, 0])

In [None]:
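# Define the input variables
A = [np.array([[1, 1], [3, -2]]), np.array([[0, 1], [3, -2]])]  # mode matrices A_1, A_2
W = np.eye(2)  # disturbance covariance
Q = np.eye(2)  # state cost matrix
gamma = np.arange(0, 2*np.pi, 0.1)
xvec  = np.array([np.cos(gamma), np.sin(gamma)])  # test states on the unit circle, one per column
H = 5  # prediction horizon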

sigma = np.zeros(xvec.shape[1])

for i in range(xvec.shape[1]):
    sigma[i]  = stochmpcsls(A,W,Q,H,xvec[:,[i]])

sigma