# Control of switched linear systems


In this live script we will learn techniques to control a switched linear 
system using approximate dynamic programming and apply these techniques to a 
resource-aware control problem.

Consider a switched linear system

\begin{equation*}
x_{k+1} = A_{\sigma_k} x_k, k \in \mathbb{N}_0
\label{eq:sw_lin_sys} \tag{1}
\end{equation*}

where $x_k \in \mathbb{R}^n$ is the state, $\sigma_k \in \{1,2,\dots,m\}$ is the switching input, at time $k \in \mathbb{N}_0$, and $A_i \in \mathbb{R}^{n\times n}$, $i\in \{1,2,\dots,m\}$. 

Our goal is to compute the control switching input $\sigma_k$, $k \in\{0,1,\dots,h-1\}$, that minimizes a finite horizon cost

\begin{equation*}
\sum_{k=0}^{h-1}x_k^TQ_{\sigma_k} x_k +x_h^T\bar{Q}_h x_h
\label{eq:fin_hor_cost} \tag{2}
\end{equation*}

where $Q_i \in \mathbb{R}^{n\times n}$, $i\in \{1,2,\dots,m\}$, and $\bar{Q}_h \in \mathbb{R}^{n\times n}$, for a given initial condition $x_0$. 
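To make the cost $\eqref{eq:fin_hor_cost}$ concrete, the following minimal sketch simulates $\eqref{eq:sw_lin_sys}$ for a given switching sequence and accumulates the stage and terminal costs. The helper name `switched_cost` and the toy matrices are hypothetical and only serve as illustration; the sketch assumes that `A` and `Q` are arrays of shape $(m,n,n)$ and that the sequence uses zero-based mode indices.

```python
import numpy as np

def switched_cost(A, Q, Qh, x0, sigma):
    """Simulate x_{k+1} = A[sigma_k] x_k and return the cost (2) and the trajectory.

    Assumption: A and Q have shape (m, n, n) and sigma holds zero-based
    mode indices, i.e. sigma[k] = sigma_k - 1.
    """
    x = np.asarray(x0, dtype=float).ravel()
    traj = [x]
    cost = 0.0
    for s in sigma:
        cost += x @ Q[s] @ x   # stage cost x_k^T Q_{sigma_k} x_k
        x = A[s] @ x           # state update
        traj.append(x)
    cost += x @ Qh @ x         # terminal cost x_h^T Qh x_h
    return cost, np.array(traj)

# Toy data (hypothetical): mode 0 contracts the first state, mode 1 the second
A = np.array([[[0.5, 0.0], [0.0, 1.0]],
              [[1.0, 0.0], [0.0, 0.5]]])
Q = np.array([np.eye(2), np.eye(2)])
Qh = np.eye(2)
x0 = np.array([1.0, 1.0])
cost, traj = switched_cost(A, Q, Qh, x0, sigma=[0, 1])
```

Every method discussed in this script can be viewed as a different way of choosing the sequence passed to such an evaluation.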

For small $h$, for example $h<15$, the optimal solution can be computed by exhaustive enumeration of all $m^h$ switching sequences, by calling the following function
```
sigmastar, cost = slsoptimal(A,Q,Qh,h,x0)
```
 
whose input arguments are:
 
* an array `A` of shape $(m,n,n)$ such that `A[i-1]` $= A_i$, $i \in \{1,\dots,m\}$. 
* an array `Q` of shape $(m,n,n)$ such that `Q[i-1]` $= Q_i$, $i \in \{1,\dots,m\}$. 
* matrix $\bar{Q}_h$ characterizing the terminal cost.
* number of decision stages $h$.
* initial condition $x_0$, given as an $n \times 1$ column array.
 
and the outputs are:
 
* array such that `sigmastar[i]` $= \sigma^*_i - 1$ (the code returns zero-based mode indices), $i \in \{0,1,\dots,h-1\}$, where $(\sigma^*_0,\sigma^*_1,\dots,\sigma^*_{h-1})$ are the optimal switching inputs for the initial condition $x_0$.
* value of the cost for the optimal switching input for the given initial condition.
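The exhaustive search can be sketched compactly with `itertools.product`, which makes the $m^h$ growth of the number of candidates explicit (for $m=2$ and $h=15$ this is already $32768$ sequences). This is an illustrative sketch with a hypothetical helper name and toy data, not the implementation used later in this script.

```python
import itertools
import numpy as np

def brute_force_schedule(A, Q, Qh, x0, h):
    """Evaluate the cost (2) for all m**h sequences and keep the best one.

    Assumption: A and Q have shape (m, n, n); the returned sequence uses
    zero-based mode indices.
    """
    m = A.shape[0]
    x0 = np.asarray(x0, dtype=float).ravel()
    best_cost, best_seq = np.inf, None
    for seq in itertools.product(range(m), repeat=h):   # m**h candidates
        x, cost = x0, 0.0
        for s in seq:
            cost += x @ Q[s] @ x
            x = A[s] @ x
        cost += x @ Qh @ x
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return np.array(best_seq), best_cost

# Toy data (hypothetical): mode 0 contracts the first state, mode 1 the second
A = np.array([[[0.5, 0.0], [0.0, 1.0]],
              [[1.0, 0.0], [0.0, 0.5]]])
Q = np.array([np.eye(2), np.eye(2)])
Qh = np.eye(2)
sigmastar, cost = brute_force_schedule(A, Q, Qh, np.array([2.0, 1.0]), h=2)
n_candidates = A.shape[0] ** 2   # number of sequences evaluated for h = 2
```

Because the larger state component is the first one, the best sequences start by selecting the mode that contracts it.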
 
Motivated by the need to find approximate solutions when the horizon increases, 
we first consider a model predictive control (MPC) policy. This can be tested 
by calling
```
sigma, cost = slsmpc(A,Q,Qh,h,x0,H)
```
 
The input and output arguments of this function are the same as for the function `slsoptimal`, 
except for the additional input argument $H$, which is the prediction horizon, an integer 
greater than or equal to $2$. In particular, `sigma` is an array such that 
`sigma[i]` $= \sigma^*_i - 1$, $i \in \{0,1,\dots,h-1\}$, where $(\sigma^*_0,\sigma^*_1,\dots,\sigma^*_{h-1})$ 
are the switching inputs obtained with the MPC policy for the initial condition 
$x_0$, and `cost` is the corresponding value of the original cost $\eqref{eq:fin_hor_cost}$. 

Alternatively to model predictive control, we can consider a rollout policy 
with a prediction horizon $H \geq 1$ and a base policy characterized as follows. 
Given a vector $v = (v_0,v_1,\dots,v_{q-1})$ of switching inputs $v_i \in \{1,2,\dots,m\}$, the base policy, denoted by $\bar{\sigma}_\ell$, $\ell \geq 0$, is obtained by periodically repeating the inputs in $v$, i.e., 

$$\bar{\sigma}_\ell = v_{\ell \text{ mod } q}, \ell \geq 0$$

where $\ell \text{ mod }q$ is the remainder after division of $\ell$ by $q$. The function

```
sigma, cost = slsrollout(A,Q,Qh,h,x0,H,v)
```

computes the control switching input $\sigma_k$, $k \in\{0,1,\dots,h-1\}$, 
obtained from this rollout policy for the original cost $\eqref{eq:fin_hor_cost}$. The input and 
output arguments of this function are the same as for `slsmpc`, except for the additional input argument `v`, which is an array characterizing the base policy such that `v[i]` $=v_i$, $i\in \{0,1,\dots,q-1\}$, with entries in $\{1,2,\dots,m\}$.
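The structure of a rollout candidate can be sketched as follows: $H$ free decisions followed by the periodic base policy. The helper `base_policy` is a hypothetical name used only for illustration, and the sketch assumes that the base policy restarts from $v_0$ right after the free decisions.

```python
import numpy as np

def base_policy(v, length, start=0):
    """Return (v[start mod q], v[(start+1) mod q], ...) with the given length."""
    v = np.asarray(v)
    q = v.shape[0]
    return v[(start + np.arange(length)) % q]

tail = base_policy([1, 2], 5)              # -> array([1, 2, 1, 2, 1])
head = np.array([2, 2, 1])                 # one candidate for the first H = 3 decisions
candidate = np.concatenate((head, tail))   # sequence whose cost the rollout evaluates
```

The rollout policy evaluates the cost of every such candidate (one per choice of `head`) and applies only the first decision of the best one.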

Finally, the function

```
sigma, cost = slsrolloutasymptotic(A,Q,x0,H,v,Niter)
```

computes the control switching input $\sigma_k$, $k\in \mathbb{N}_0$, 
obtained from a rollout policy with prediction horizon $H \geq 1$ and a periodic base 
policy characterized as above, but now considering the infinite horizon cost

\begin{equation*}
\sum_{k=0}^{\infty}x_k^TQ_{\sigma_k} x_k
\label{eq:inf_hor_cost} \tag{3}
\end{equation*}

Assume that the cost is bounded when the base policy is applied. The input 
arguments are as for the previous functions, except for `Niter`, which denotes a (large) 
number of iterations of interest $N_{\text{iter}}$. The output arguments are: 
(i) an array `sigma` such that `sigma[k]` $=\sigma^*_k - 1$, where $(\sigma^*_0,\sigma^*_1,\dots,\sigma^*_{N_{\text{iter}}})$ are the switching inputs obtained by the rollout algorithm in the interval $k \in \{0,\dots,N_{\text{iter}}\}$; and (ii) a positive number `cost` defined 
as an approximation of the infinite horizon cost:

$$\sum_{k=0}^{N_{\text{iter}}}x_k^TQ_{\sigma^*_k} x_k$$
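For the periodic base policy the infinite horizon cost $\eqref{eq:inf_hor_cost}$ is a quadratic form $x_0^TPx_0$. If $\Phi$ denotes the state transition matrix over one period and $C$ the cost matrix accumulated over one period, then $P = \Phi^TP\Phi + C$, which is a discrete Lyapunov equation. The sketch below solves it with `scipy.linalg.solve_discrete_lyapunov`; the helper name and the toy data are hypothetical, zero-based mode indices are assumed, and $\Phi$ is assumed to be Schur stable so that the cost is bounded.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def periodic_cost_matrix(A, Q, v):
    """Return P such that x0^T P x0 is the infinite horizon cost of repeating v.

    Assumption: v holds zero-based mode indices and the one-period
    transition matrix Phi has all eigenvalues inside the unit circle.
    """
    n = A.shape[1]
    Phi = np.eye(n)
    C = np.zeros((n, n))
    for s in v:
        C += Phi.T @ Q[s] @ Phi   # cost accumulated over one period
        Phi = A[s] @ Phi          # state transition over one period
    # solve_discrete_lyapunov(a, q) solves a X a^H - X + q = 0,
    # so a = Phi^T gives X = Phi^T X Phi + C
    return solve_discrete_lyapunov(Phi.T, C)

# Toy data (hypothetical)
A = np.array([[[0.5, 0.0], [0.0, 1.0]],
              [[1.0, 0.0], [0.0, 0.5]]])
Q = np.array([np.eye(2), np.eye(2)])
x0 = np.array([1.0, 1.0])
P = periodic_cost_matrix(A, Q, v=[0, 1])
cost_base = x0 @ P @ x0
```

Solving the Lyapunov equation avoids simulating the base policy over a long horizon every time its cost is needed.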

Let us apply these policies to the simultaneous control of two systems where 
the actuation of only one subsystem can be updated at each time step, scheduled 
by the switching input. Assume that the dynamics of each subsystem $i$ are given 
by 

$$x^i_{k+1}=A_dx^i_k+B_du^i_k, i\in\{1,2\}$$

When one system's control input is not updated it is assumed to be zero and 
the system evolves in open loop according to the following dynamics

$$x^i_{k+1}=A_dx^i_k, \quad i\in\{1,2\}.$$

When instead the control input is updated, it follows a linear state feedback 
policy $u^i_k = K_dx_k^i$ and the system's state is updated according to 

$$x^i_{k+1}=(A_d+B_dK_d)x^i_k, i\in\{1,2\}$$

The cost of each subsystem is assumed to take the form

$$\sum_{k=0}^h \left( (x^i_k)^T Q_dx^i_k+(u^i_k)^T R_du^i_k \right), \quad i\in\{1,2\}$$

Taking $x_k = \left[ \matrix{x_k^1 \cr x_k^2}  \right]$, considering the overall 
cost to be the sum of the costs of the two subsystems, and assuming that $\sigma_k=i$ 
means that the control input of subsystem $i$ is updated, we can formulate this problem in the framework 
of switched linear systems $\eqref{eq:sw_lin_sys}$ with cost either $\eqref{eq:fin_hor_cost}$ or $\eqref{eq:inf_hor_cost}$, with

$$A_1 = \left[ \matrix{A_d+B_dK_d & 0 \cr 0 & A_d } \right], A_2 = 
\left[ \matrix{A_d & 0 \cr 0 & A_d+B_dK_d } \right]$$

and

$$Q_1 = \left[ \matrix{Q_d+K_d^TR_dK_d & 0 \cr 0 & Q_d } \right], Q_2 
= \left[ \matrix{Q_d & 0 \cr 0 & Q_d+K_d^TR_dK_d } \right].$$
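As a short check of the $Q_1$ block (a sketch of the reasoning under the assumptions stated above): when $\sigma_k = 1$, subsystem 1 applies $u_k^1 = K_dx_k^1$ and subsystem 2 applies $u_k^2 = 0$, so the sum of the two stage costs is

$$\begin{aligned}
&(x_k^1)^T Q_d x_k^1 + (K_d x_k^1)^T R_d (K_d x_k^1) + (x_k^2)^T Q_d x_k^2 \\
&\quad = (x_k^1)^T \left(Q_d + K_d^T R_d K_d\right) x_k^1 + (x_k^2)^T Q_d x_k^2 = x_k^T Q_1 x_k.
\end{aligned}$$

The same argument with the roles of the two subsystems exchanged gives $Q_2$.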

This formulation is carried out with the following instructions, where `optx0`
can be used to select different initial conditions.

In [None]:
import numpy as np
from numpy import array
from scipy import signal
import matplotlib.pyplot as plt
import control
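# use_numpy_matrix is not available in all python-control releases; call it only if present
if hasattr(control, "use_numpy_matrix"):
    control.use_numpy_matrix(False)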
from course_functions import dlqr
%matplotlib ipympl

In [None]:
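def slsoptimal(A, Q, Qh, h, x0):
    """Optimal schedule by exhaustive search over all m**h switching sequences.

    A, Q: arrays of shape (m, n, n); x0: (n, 1) column array.
    Returns the zero-based optimal mode indices and the optimal cost.
    """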
    n = A.shape[1]
    m = A.shape[0]
    x = np.zeros((n, h+1))
    cost_ = np.zeros(m**h)
    sigma = np.zeros((m**h, h), int)
    x[:,[0]] = x0

    # these instructions provide all the possible switching combinations
    # with h switchings
    for k in range(h):
        sigma[:, [k]] = np.kron(np.ones((m**(h-k-1), 1)), np.kron((np.arange(m)+1).reshape((m, 1)), np.ones((m**k, 1))))

    sigma -= 1

    # compute the cost of each switching possibility
    for ind in range(m**h):
        for l in range(h):
            x[:, l+1] = A[sigma[ind, l]] @ x[:, l]
            cost_[ind] += x[:, l].T @ Q[sigma[ind, l]]@x[:, l]
        cost_[ind] += x[:, h].T @ Qh @ x[:, h]

    # select the minimum cost and corresponding sequence
    indopt = np.argmin(cost_)
    cost = cost_[indopt]
    sigmastar = sigma[indopt,:]
    
    return sigmastar, cost

Function to compute the MPC approximate solution

In [None]:
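def slsmpc(A, Q, Qh, h, x0, H):
    """MPC schedule: at each step, optimize over the next H decisions and apply the first.

    A, Q: arrays of shape (m, n, n); x0: (n, 1) column array.
    Returns the zero-based mode indices and the cost (2) of the closed loop.
    """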
    n = A.shape[1]
    m = A.shape[0]
    x = np.zeros((n, h+1))
    sigma = np.zeros(h, int)
    sigmaopt = np.zeros((m**H, H), int)
    cost = 0.0
    x[:,[0]] = x0

    # these instructions provide all the possible switching combinations
    # with H switchings
    for k in range(H):
        sigmaopt[:, [k]] = np.kron(np.ones((m**(H-k-1), 1)), np.kron((np.arange(m)+1).reshape((m, 1)), np.ones((m**k, 1))))

    sigmaopt -= 1

    for k in range(h):
        # compute the cost of all switching options within the horizon
        x_ = np.zeros((n, min(h-k+1, H+1)))
        cost_ = np.zeros(min(m**(h-k), m**H))
        x_[:, 0] = x[:, k]
        for ind in range(min(m**(h-k), m**H)):
            for l in range(min(h-k, H)):
                x_[:, l+1] = A[sigmaopt[ind, l]] @ x_[:, l]
                cost_[ind] += x_[:, l].T @ Q[sigmaopt[ind, l]] @ x_[:, l]
            if h-k <= H:
                cost_[ind] += x_[:, l+1].T @ Qh @ x_[:, l+1]

        # compute the switching sequence which leads to the minimum value
        indopt = np.argmin(cost_)
        # take the first switching of the best switching sequence
        sigma[k] = sigmaopt[indopt, 0]
        # apply it to the system and get the cost at this iteration
        x[:, k+1] = A[sigma[k]] @ x[:, k]
        cost += x[:,k].T @ Q[sigma[k]] @ x[:, k]
    
    cost += x[:, h].T @ Qh @ x[:, h]

    return sigma, cost

Function to compute the Rollout approximate solution

In [None]:
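def slsrollout(A, Q, Qh, h, x0, H, v):
    """Rollout schedule: H free decisions followed by the periodic base policy v.

    A, Q: arrays of shape (m, n, n); x0: (n, 1) column array;
    v: 1-D array of mode labels in {1, ..., m}.
    Returns the zero-based mode indices and the cost (2) of the closed loop.
    """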
    n = A.shape[1]
    m = A.shape[0]
    x = np.zeros((n, h+1))
    sigma = np.zeros(h, int)
    sigmaopt = np.zeros((m**H, H), int)
    cost = 0.0
    x[:,[0]] = x0

    # these instructions provide all the possible switching combinations
    # with H switchings and with a periodic sequence that repeats v after this
    # horizon
    for k in range(H):
        sigmaopt[:, [k]] = np.kron(np.ones((m**(H-k-1), 1)), np.kron((np.arange(m)+1).reshape((m, 1)), np.ones((m**k, 1))))
    seqper = np.kron(np.ones((1, round(h/v.shape[0]) + 1), int),  v)
    sigmaopt = np.hstack((sigmaopt, np.ones((m**H, 1), int) * seqper[0, range(h-H)]))

    sigmaopt -= 1

    for k in range(h):
        # compute the cost of all switching options
        x_ = np.zeros((n, h-k+1))
        cost_ = np.zeros(min(m**(h-k), m**H))
        x_[:, 0] = x[:,k]
        for ind in range(min(m**(h-k), m**H)):
            for l in range(h-k):
                x_[:, l+1] = A[sigmaopt[ind, l]] @ x_[:, l]
                cost_[ind] += x_[:,l].T @ Q[sigmaopt[ind, l]] @ x_[:,l]
            cost_[ind] += x_[:, l+1].T @ Qh @ x_[:, l+1]
        # compute the best switching option
        indopt = np.argmin(cost_)
        # take the first switching of the best switching sequence
        sigma[k] = sigmaopt[indopt, 0]
        # apply it to the system and get the cost at this iteration
        x[:, k+1] = A[sigma[k]] @ x[:, k]
        cost += x[:, k].T @ Q[sigma[k]] @ x[:, k]
    cost += x[:,h].T @ Qh @ x[:,h]

    return sigma, cost

Function to compute the Rollout policy for infinite horizon problems 

In [None]:
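def slsrolloutasymptotic(A, Q, x0, H, v, Niter):
    """Rollout schedule for the infinite horizon cost (3) with periodic base policy v.

    A, Q: arrays of shape (m, n, n); x0: (n, 1) column array;
    v: 1-D array of mode labels in {1, ..., m}.
    Returns Niter+1 zero-based mode indices and the cost summed over k = 0..Niter.
    """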
    n = A.shape[1]
    m = A.shape[0]
    x = np.zeros((n, Niter+2))
    sigma = np.zeros(Niter+1, int)
    sigmaopt = np.zeros((m**H, H), int)
    cost = 0.0
    x[:,[0]] = x0
    nv = v.shape[0]

    # these instructions provide all the possible switching combinations
    # with H switchings 
    for k in range(H):
        sigmaopt[:, [k]] = np.kron(np.ones((m**(H-k-1), 1)), np.kron((np.arange(m)+1).reshape((m, 1)), np.ones((m**k, 1))))

    sigmaopt -= 1
    v = v - 1  # shift to zero-based indices without modifying the caller's array

    # compute a matrix P such that the cost after the horizon is xH'PxH
    optbase = 2

    if optbase == 1:
        eps = 1e-3
        ii = 0
        P_ = [np.eye(n)]
        vt = v[::-1]
        while(True):
            P_.append(A[vt[ii%nv]].T @ P_[ii] @ A[vt[ii%nv]] + Q[vt[ii%nv]])
            if ii+1 > nv:
                if (ii+1)%nv == 0 and np.linalg.norm(P_[ii]-P_[ii-nv]) < eps:
                    P = P_[ii]
                    break
            ii += 1

    elif optbase == 2:
        Phi = np.eye(n)
        C = np.zeros((n,n))
        for ii in range(nv):
            C += Phi.T @ Q[v[ii]] @ Phi
            Phi = A[v[ii]] @ Phi
        P = control.dlyap(Phi.T, C)

    print('cost periodic')
    print(x0.T @ P @ x0)

    # compute the cost of all possible switching sequences explicitly and a
    # priori; these costs are given by xk' Prol xk, where Prol depends on the
    # switching sequence
    Prol = np.zeros((n, n, m**H))
    for ind in range(m**H):
        # backward recursion over the horizon, starting from the base-policy cost P
        Pk = P
        for k in range(H)[::-1]:
            Pk = A[sigmaopt[ind, k]].T @ Pk @ A[sigmaopt[ind, k]] + Q[sigmaopt[ind, k]]
        Prol[:,:,ind] = Pk
    
    for k in range(Niter+1):
        cost_ = np.zeros(m**H)
        # obtain the optimal switching sequence by evaluating all the cost
        # corresponding to switching sequences
        for ind in range(m**H):
            cost_[ind] = x[:,k].T @ Prol[:,:,ind] @ x[:,k]
        # take the best switching sequence corresponding to the optimal cost
        indopt = np.argmin(cost_)
        sigma[k] = sigmaopt[indopt, 0]
        # take the first switching of the optimal switching sequence and apply
        # it to the system
        x[:, k+1] = A[sigma[k]] @ x[:, k]
        # take the cost at this iteration
        cost += x[:,k].T @ Q[sigma[k]] @ x[:,k]

    return sigma, cost

In [None]:
# define resource aware control problem

Ac = array([[0, 1],[1, 0]])
Bc = array([[0], [1]])
Cc = np.eye(2)
Dc = np.zeros((2,1))
n = Ac.shape[0]
tau = 0.2
Ad, Bd = signal.cont2discrete((Ac, Bc, Cc, Dc), tau)[:2]
Kd = dlqr(Ad, Bd, np.eye(*Ad.shape), 1)[0]
Kd = -Kd
Qd = np.eye(n)
Rd = 0.01

A = np.zeros((2, 2*n, 2*n))
A[0] = np.block([[Ad + Bd@Kd, np.zeros((n,n))], [np.zeros((n,n)), Ad]])
A[1] = np.block([[Ad, np.zeros((n,n))], [np.zeros((n,n)), Ad + Bd@Kd]])
Q = np.zeros((2, 2*n, 2*n))
Q[0] = np.block([[Qd + Rd*Kd.T@Kd, np.zeros((n,n))], [np.zeros((n,n)), Qd]])
Q[1] = np.block([[Qd, np.zeros((n,n))], [np.zeros((n,n)), Qd + Rd*Kd.T@Kd]])

optx0 = 1 # two possible initial conditions; more might be added

if optx0 == 1:
    x0 = np.array([[2, 0.5, 0.5, -0.25]]).T
elif optx0 == 2:
    x0 = np.array([[0, 0.5, 1, 1]]).T

Qh = 10*np.eye(4)

The four methods to obtain the scheduling sequence mentioned before can be tested with the following script (change `opt` to call each method).

In [None]:
opt = 1
if opt == 1:
    h = 10
    L = h
    sigmastar, cost = slsoptimal(A, Q, Qh, h, x0)
    sigma = sigmastar
elif opt == 2:
    h = 25
    H = 3
    L = h
    sigma, cost = slsmpc(A, Q, Qh, h, x0, H)
elif opt == 3:
    h = 25
    H = 3
    v = array([1, 2])
    L = h
    sigma, cost = slsrollout(A, Q, Qh, h, x0, H, v)
elif opt == 4:
    H = 3
    v = array([1, 2])
    Niter = 150
    L = Niter
    sigma, cost = slsrolloutasymptotic(A, Q, x0, H, v, Niter)

# iterate the system and compute the costs (must be the same)
t = np.arange(L+1) * tau
x = np.zeros((A.shape[1], L+1))
x[:,[0]] = x0
cost_ = 0
for k in range(L):
    x[:,k+1] = A[sigma[k]] @ x[:,k]
    cost_ += x[:,k].T @ Q[sigma[k]] @ x[:,k]

if opt != 4:
    cost_ += x[:,L].T @ Qh @ x[:,L]
else:
    cost_ += x[:,L].T @ Q[sigma[L]] @ x[:,L]

# compare the two costs

In [None]:
cost

In [None]:
cost_

In [None]:
# plots
fig = plt.figure()
ax1, ax2 = fig.subplots(1,2)
ax1.plot(t, x[0, :])
ax1.set_xlabel('time')
ax1.set_ylabel('x1')
ax2.plot(t, x[2, :])
ax2.set_xlabel('time')
ax2.set_ylabel('x3');