<a href="https://colab.research.google.com/github/YousefSoltanian/MAE598_Design_Optimization/blob/main/Project1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Inverted Double Pendulum on a Cart System

## System Description

Consider a double pendulum mounted on a cart, where the cart has mass $m$ and the pendulum has two rods with lengths $l_1$ and $l_2$. Point masses $m_1$ and $m_2$ are attached to the rods. The rods are assumed to be massless.

## Generalized Coordinates

Let $\theta_1$ and $\theta_2$ be the deviations of the rods from the upright position, and $q$ be the horizontal position of the cart. The system has three generalized coordinates: $q$, $\theta_1$, and $\theta_2$.

## Derivatives

The derivatives with respect to time are denoted as $\dot{q}$, $\dot{\theta}_1$, and $\dot{\theta}_2$.

## Control Input

The control input is denoted as $u(t)$, representing the force applied to the cart.

## External Disturbances

External disturbances $w_1$, $w_2$, $w_3$ act as generalized forces on the coordinates $q$, $\theta_1$, and $\theta_2$, respectively.

## Damping and Friction

Damping coefficients $d_1$, $d_2$, $d_3$ model friction and damping. The friction/damping force of the cart is $-d_1\dot{q}$, and the friction/damping forces in the joints are $-d_2\dot{\theta}_1$ and $-d_3\dot{\theta}_2$.

# System Dynamics

## Kinetic Energy K and Potential Energy P

The kinetic energy of the system is given by:
\begin{equation}
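K = \frac{1}{2} \left[ m\dot{q}^2 + m_1\left((\dot{q} + l_1\dot{\theta}_1\cos\theta_1)^2 + (l_1\dot{\theta}_1\sin\theta_1)^2\right) + m_2\left((\dot{q} + l_1\dot{\theta}_1\cos\theta_1 + l_2\dot{\theta}_2\cos\theta_2)^2 + (l_1\dot{\theta}_1\sin\theta_1 + l_2\dot{\theta}_2\sin\theta_2)^2\right) \right]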
\end{equation}

The potential energy of the system is given by:

\begin{equation}
P = g \left[ m_1l_1\cos\theta_1 + m_2(l_1\cos\theta_1 + l_2\cos\theta_2) \right]
\end{equation}

## Lagrangian Mechanics

The Lagrangian $L$ is defined as the difference between kinetic and potential energy:

\begin{equation}
 L = K - P
\end{equation}

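The equations of motion of the system follow from applying the Euler-Lagrange equations to each generalized coordinate. Collecting the coordinates in $y = [q, \theta_1, \theta_2]^T$ and the disturbances in $w = [w_1, w_2, w_3]^T$, they take the matrix form:

\begin{equation}
 M(y)\ddot{y} = f(y, \dot{y}, u, w)
\end{equation}

where $M(y)$ is the symmetric mass matrix, which is invertible.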
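
As a sketch of that derivation (the generalized-force symbol $Q_i$ is introduced here for illustration and is not part of the original notation), the Euler-Lagrange equation for each coordinate $y_i \in \{q, \theta_1, \theta_2\}$ reads:

\begin{equation}
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{y}_i}\right) - \frac{\partial L}{\partial y_i} = Q_i, \qquad Q = \begin{bmatrix} u - d_1\dot{q} + w_1 \\ -d_2\dot{\theta}_1 + w_2 \\ -d_3\dot{\theta}_2 + w_3 \end{bmatrix}
\end{equation}

The non-conservative terms $Q_i$ collect the control force, the damping, and the disturbances defined earlier, while gravity enters through $P$ inside $L$.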

## State-Space Equation

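Defining the state vector $x = [q, \dot{q}, \theta_1, \dot{\theta}_1, \theta_2, \dot{\theta}_2]$, the state-space form of the dynamics is obtained by solving the following system for the accelerations $\ddot{q}$, $\ddot{\theta}_1$, and $\ddot{\theta}_2$: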

\begin{equation}
\begin{aligned}
&\begin{bmatrix} m + m_1 + m_2 & l_1(m_1 + m_2) \cos \theta_1 & m_2l_2 \cos \theta_2 \\ l_1(m_1 + m_2) \cos \theta_1 & l_1^2 (m_1 + m_2) & l_1 l_2 m_2 \cos (\theta_1 - \theta_2) \\ l_2 m_2 \cos \theta_2 & l_1 l_2 m_2 \cos (\theta_1 - \theta_2) & l_2^2 m_2 \end{bmatrix}\begin{bmatrix} \ddot{q} \\ \ddot{\theta}_1 \\ \ddot{\theta}_2 \end{bmatrix} \\
&\quad = \begin{bmatrix}
l_1(m_1 + m_2)\dot{\theta}_1^2\sin \theta_1 + m_2l_2\dot{\theta}_2^2\sin \theta_2 \\
-l_1l_2m_2\dot{\theta}_2^2\sin(\theta_1 - \theta_2) + gl_1(m_1 + m_2)\sin \theta_1 \\
l_1l_2m_2\dot{\theta}_1^2\sin(\theta_1 - \theta_2) + g l_2 m_2 \sin \theta_2
\end{bmatrix}
-\begin{bmatrix} d_1 \dot{q} \\ d_2 \dot{\theta}_1 \\ d_3 \dot{\theta}_2 \end{bmatrix} + \begin{bmatrix} u \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}
\end{aligned}
\end{equation}

This system of equations describes the motion of an inverted double pendulum on a cart subject to control input and external disturbances.
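
One way to sanity-check the mass matrix is to confirm that the kinetic energy equals the quadratic form $\frac{1}{2}\dot{y}^T M(y)\dot{y}$ with $\dot{y} = [\dot{q}, \dot{\theta}_1, \dot{\theta}_2]$. The sketch below does this numerically in plain Python. The helper names (`mass_matrix`, `kinetic_energy`, `quadratic_form`) and the test values are assumptions chosen for illustration; the kinetic energy is computed from the horizontal and vertical velocities of the cart and the two point masses.

```python
import math

def mass_matrix(theta1, theta2, m=1.0, m1=1.0, m2=1.0, l1=1.0, l2=1.0):
    """Mass matrix M(y) as written in the equation of motion above."""
    c1, c2, c12 = math.cos(theta1), math.cos(theta2), math.cos(theta1 - theta2)
    return [
        [m + m1 + m2, l1 * (m1 + m2) * c1, m2 * l2 * c2],
        [l1 * (m1 + m2) * c1, l1**2 * (m1 + m2), l1 * l2 * m2 * c12],
        [m2 * l2 * c2, l1 * l2 * m2 * c12, l2**2 * m2],
    ]

def kinetic_energy(q_dot, theta1, theta1_dot, theta2, theta2_dot,
                   m=1.0, m1=1.0, m2=1.0, l1=1.0, l2=1.0):
    """Kinetic energy from the Cartesian velocities of the cart and both masses."""
    # Mass 1 sits at (q + l1*sin(theta1), l1*cos(theta1)); differentiate in time.
    vx1 = q_dot + l1 * theta1_dot * math.cos(theta1)
    vy1 = -l1 * theta1_dot * math.sin(theta1)
    # Mass 2 is offset from mass 1 by (l2*sin(theta2), l2*cos(theta2)).
    vx2 = vx1 + l2 * theta2_dot * math.cos(theta2)
    vy2 = vy1 - l2 * theta2_dot * math.sin(theta2)
    return 0.5 * (m * q_dot**2 + m1 * (vx1**2 + vy1**2) + m2 * (vx2**2 + vy2**2))

def quadratic_form(M, v):
    """Evaluate 0.5 * v^T M v for a 3x3 matrix M and a length-3 vector v."""
    return 0.5 * sum(v[i] * M[i][j] * v[j] for i in range(3) for j in range(3))

# Arbitrary test configuration (illustrative values only)
theta1, theta2 = 0.3, -0.2
velocities = [0.5, -1.2, 0.7]  # q_dot, theta1_dot, theta2_dot

K_direct = kinetic_energy(velocities[0], theta1, velocities[1], theta2, velocities[2])
K_matrix = quadratic_form(mass_matrix(theta1, theta2), velocities)
print("difference:", abs(K_direct - K_matrix))
```

The two values should agree up to floating-point rounding, which confirms that the entries of $M(y)$ account for both the horizontal and the vertical motion of the point masses.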


In [8]:
import logging
import math
import random
import numpy as np
import time
import torch
import torch.nn as nn
from torch import optim
from torch.nn import utils
import matplotlib.pyplot as plt

!pip install ipywidgets
from ipywidgets import IntProgress
from IPython.display import display
from matplotlib import pyplot as plt, rc
from matplotlib.animation import FuncAnimation, PillowWriter
rc('animation', html='jshtml')
!pip install jupyterthemes
from jupyterthemes import jtplot
jtplot.style(theme='grade3', context='notebook', ticks=True, grid=False)

logger = logging.getLogger(__name__)




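## Modeling the Dynamics

The `IDPDynamics` module below models the dynamics of the inverted double pendulum. It takes the current state and the action and returns the next state. The accelerations are obtained by solving the linear system $M(y)\ddot{y} = f$ with `torch.linalg.solve`, and the state is then advanced with a semi-implicit Euler step of size $0.01$: the velocities are updated first, and the updated velocities are then used to update the positions.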

In [9]:
import torch
import torch.nn as nn

device = "cpu" if not torch.cuda.is_available() else "cuda:0"
num_cells = 256  # number of cells in each layer i.e. output dim.
lr = 15e-4
max_grad_norm = 1.0

class IDPDynamics(nn.Module):
    def __init__(self, m=1.0, m1=1.0, m2=1.0, l1=1.0, l2=1.0, d1=0.1, d2=0.1, d3=0.1):
        super(IDPDynamics, self).__init__()
        self.m = m
        self.m1 = m1
        self.m2 = m2
        self.l1 = l1
        self.l2 = l2
        self.d1 = d1
        self.d2 = d2
        self.d3 = d3
        self.g = 9.8  # gravitational acceleration

    def forward(self, state, action):
        """
        state[0] = q
        state[1] = q_dot
        state[2] = theta1
        state[3] = theta1_dot
        state[4] = theta2
        state[5] = theta2_dot

        action[0] = u (control input)

        External disturbances w1, w2, w3 are added as small random noises.
        """

        # Unpack state and action
        q, q_dot, theta1, theta1_dot, theta2, theta2_dot = state.squeeze()
        u = action.squeeze()

        # External disturbances (small random noises)
        w1 = torch.randn(1) * 0.01
        w2 = torch.randn(1) * 0.01
        w3 = torch.randn(1) * 0.01

        # Equations of motion
        M = torch.tensor([
            [self.m + self.m1 + self.m2, self.l1 * (self.m1 + self.m2) * torch.cos(theta1), self.m2 * self.l2 * torch.cos(theta2)],
            [self.l1 * (self.m1 + self.m2) * torch.cos(theta1), self.l1**2 * (self.m1 + self.m2), self.l1 * self.l2 * self.m2 * torch.cos(theta1 - theta2)],
            [self.m2 * self.l2 * torch.cos(theta2), self.l1 * self.l2 * self.m2 * torch.cos(theta1 - theta2), self.l2**2 * self.m2]
        ], device=device)

        # damping matrices
        D = torch.tensor([
            [self.d1, 0, 0],
            [0, self.d2, 0],
            [0, 0, self.d3]
        ], device=device)

        F = torch.tensor([
        [self.l1 * (self.m1 + self.m2) * theta1_dot**2 * torch.sin(theta1) + self.m2 * self.l2 * theta2_dot**2 * torch.sin(theta2)],
        [-self.l1 * self.l2 * self.m2 * theta2_dot**2 * torch.sin(theta1 - theta2) + self.g * self.l1 * (self.m1 + self.m2) * torch.sin(theta1)],
        [self.l1 * self.l2 * self.m2 * theta1_dot**2 * torch.sin(theta1 - theta2) + self.g * self.l2 * self.m2 * torch.sin(theta2)]
        ],device=device)

        state_dot = torch.tensor([
            [q_dot],
            [theta1_dot],
            [theta2_dot]
        ],device=device)



        # Right-hand side f of M(y) @ y_ddot = f: nonlinear terms, damping, control input, disturbances
        rhs = F - D@state_dot + torch.tensor([[u],[0],[0]],device=device) + torch.tensor([[w1],[w2],[w3]],device=device)

        # Solve the linear system for the accelerations using torch.linalg.solve
        q_double_dot, theta1_double_dot, theta2_double_dot = torch.linalg.solve(M, rhs)

        # Update state
        q_dot = q_dot + q_double_dot * 0.01  # Assuming a small time step
        theta1_dot = theta1_dot + theta1_double_dot * 0.01
        theta2_dot = theta2_dot + theta2_double_dot * 0.01

        q = q + q_dot * 0.01
        theta1 = theta1 + theta1_dot * 0.01
        theta2 = theta2 + theta2_dot * 0.01

        next_state = torch.tensor([q, q_dot, theta1, theta1_dot, theta2, theta2_dot],device=device)

        return next_state.unsqueeze(0)


# Example usage:
dynamics_model = IDPDynamics()

# Example initial state and action
initial_state = torch.tensor([[0.0, 0.0, 0.1, 0.0, 0.1, 0.0]])
action = torch.tensor([[1.0]])

# Forward pass
next_state = dynamics_model(initial_state, action)
print("Next State:", next_state)


Next State: tensor([[-9.1103e-05, -9.1103e-03,  1.0019e-01,  1.8654e-02,  1.0000e-01,
          2.1720e-04]], device='cuda:0')
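
To look at the update rule in isolation, the sketch below reimplements one step in plain Python with the disturbances $w_1$, $w_2$, $w_3$ set to zero, so the result is deterministic. The function names `solve3` and `step` and the `dt` argument are hypothetical and not part of the notebook; the default parameters are copied from `IDPDynamics`.

```python
import math

def solve3(A, b):
    """Solve a 3x3 linear system A x = b by Gaussian elimination with partial pivoting."""
    n = 3
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def step(state, u, dt=0.01, m=1.0, m1=1.0, m2=1.0, l1=1.0, l2=1.0,
         d1=0.1, d2=0.1, d3=0.1, g=9.8):
    """One semi-implicit Euler step of the disturbance-free dynamics."""
    q, q_dot, th1, th1_dot, th2, th2_dot = state
    c1, c2, c12 = math.cos(th1), math.cos(th2), math.cos(th1 - th2)
    s1, s2, s12 = math.sin(th1), math.sin(th2), math.sin(th1 - th2)
    M = [
        [m + m1 + m2, l1 * (m1 + m2) * c1, m2 * l2 * c2],
        [l1 * (m1 + m2) * c1, l1**2 * (m1 + m2), l1 * l2 * m2 * c12],
        [m2 * l2 * c2, l1 * l2 * m2 * c12, l2**2 * m2],
    ]
    # Right-hand side: nonlinear terms, damping, and control (no disturbances)
    rhs = [
        l1 * (m1 + m2) * th1_dot**2 * s1 + m2 * l2 * th2_dot**2 * s2 - d1 * q_dot + u,
        -l1 * l2 * m2 * th2_dot**2 * s12 + g * l1 * (m1 + m2) * s1 - d2 * th1_dot,
        l1 * l2 * m2 * th1_dot**2 * s12 + g * l2 * m2 * s2 - d3 * th2_dot,
    ]
    q_dd, th1_dd, th2_dd = solve3(M, rhs)
    # Velocities first, then positions with the updated velocities
    q_dot += q_dd * dt
    th1_dot += th1_dd * dt
    th2_dot += th2_dd * dt
    q += q_dot * dt
    th1 += th1_dot * dt
    th2 += th2_dot * dt
    return [q, q_dot, th1, th1_dot, th2, th2_dot]

# Uncontrolled rollout from a small tilt: the upright position is unstable
state = [0.0, 0.0, 0.1, 0.0, 0.1, 0.0]
for _ in range(20):
    state = step(state, u=0.0)
print("theta1 after 20 steps:", state[2])
```

With zero control the first rod tilts further away from upright, while the exactly upright state at rest stays where it is. This is the behavior the controller has to counteract.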


## Probabilistic Controller

Although this problem has only one action, we model it with a probabilistic actor: the network outputs two numbers, a mean and a standard deviation, and the final action is sampled from the distribution they define. To keep the final action in the range $(-1,1)$, we use a $\tanh$-transformed normal distribution. In addition, a $\tanh$ activation keeps the mean output within $(-1,1)$, and a $softplus$ activation on the standard deviation output ensures that it is always positive.

In [25]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Normal, TransformedDistribution, TanhTransform

class Controller(nn.Module):
    def __init__(self, dim_input=6, dim_hidden=256, dim_output=2):
        """
        dim_input: number of system states
        dim_output: number of network outputs (mean and standard deviation of the single action)
        dim_hidden: width of each hidden layer
        """
        super(Controller, self).__init__()

        self.network = nn.Sequential(
            nn.Linear(dim_input, dim_hidden, device=device),
            nn.Tanh(),
            nn.Linear(dim_hidden, dim_hidden, device=device),
            nn.Tanh(),
            nn.Linear(dim_hidden, dim_hidden, device=device),
            nn.Tanh(),
            nn.Linear(dim_hidden, dim_output, device=device),
        )

    def forward(self, state):
        output = self.network(state)
        mean, std = torch.chunk(output, 2, dim=1)
        mean = torch.tanh(mean)
        std = F.softplus(std) + 1e-5  # Add a small constant for numerical stability
        base_distribution = Normal(mean, std)
        distribution = TransformedDistribution(base_distribution, [TanhTransform()])
        action = distribution.sample()

        return action
# Quick test: instantiate the controller and sample one action


controller = Controller()

# Sample an output
input_data = torch.randn(1, 6,device=device)  # Replace with your actual input data
action = controller(input_data)

print("action", action)

action tensor([[-0.6865]], device='cuda:0')
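
The sampling step can also be written out by hand. The following is a minimal sketch in plain Python, assuming the same activations as in `Controller`: a $\tanh$ on the raw mean, a softplus plus `1e-5` on the raw standard deviation, and a $\tanh$ squashing of the normal sample. The names `policy_head` and `sample_tanh_normal` and the example inputs are hypothetical. The log-probability includes the change-of-variables term for the $\tanh$ squashing, with a small constant added inside the logarithm as a numerical safeguard.

```python
import math
import random

def policy_head(raw_mean, raw_std):
    """Map two raw network outputs to a bounded mean and a positive std."""
    mean = math.tanh(raw_mean)
    std = math.log1p(math.exp(raw_std)) + 1e-5  # softplus plus a small constant
    return mean, std

def sample_tanh_normal(mean, std, rng=random):
    """Draw z ~ Normal(mean, std), squash with tanh, return (action, log_prob)."""
    z = rng.gauss(mean, std)
    action = math.tanh(z)
    # Log-density of the base normal distribution at z
    log_prob_z = (-0.5 * ((z - mean) / std) ** 2
                  - math.log(std) - 0.5 * math.log(2.0 * math.pi))
    # Change of variables: log|d tanh(z)/dz| = log(1 - tanh(z)^2)
    log_det = math.log(1.0 - action ** 2 + 1e-12)
    return action, log_prob_z - log_det

rng = random.Random(0)                  # seeded generator for reproducibility
mean, std = policy_head(0.4, -1.0)      # illustrative raw outputs
action, log_prob = sample_tanh_normal(mean, std, rng)
print("action:", action, "log_prob:", log_prob)
```

Because the sample passes through $\tanh$, every action lies in $(-1,1)$ regardless of how large the standard deviation is.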
