In [1]:
try:
    from openmdao.utils.notebook_utils import notebook_mode  # noqa: F401
except ImportError:
    !python -m pip install openmdao[notebooks]

# Computing Post-Optimality Sensitivities of a Constrained Optimization Problem

Let's consider a problem that has an active bound and an equality constraint.

\begin{align*}
\min_{\theta_0,\, \theta_1} \quad & f(\theta_0, \theta_1; \mathbf{p}) = (\theta_0 - p_0)^2 + \theta_0 \theta_1 + (\theta_1 + p_1)^2 - p_2 \\
\text{where} \quad \mathbf{p} &= \begin{bmatrix} 3 \\ 4 \\ 3 \end{bmatrix} \in \mathbb{R}^3 \\
\text{bounds:} \quad \theta_0 &\le 6 \\
\text{equality constraints:} \quad \theta_1 &= -\theta_0
\end{align*}

Then, if we want to know the sensitivity of the optimization to the value of $\theta_0^{ub}$, as if it were another parameter of the problem, we can treat $\theta_0^{ub}$ as just another element of our parameter vector:

\begin{align*}
    \bar{p} &= \begin{bmatrix} p_0 \\ p_1 \\ p_2 \\ \theta_0^{ub} \end{bmatrix}
\end{align*}

If the bound on $\theta_0$ is active, we can treat it as just another equality constraint, with $p_3 = \theta_0^{ub}$:

\begin{align*}
  \bar{\mathcal{G}}(\bar{\theta}, \bar{p}) &= \begin{bmatrix}
                                   \theta_0 + \theta_1 \\
                                   \theta_0 - p_3
                                \end{bmatrix} = \bar 0
\end{align*}
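
As a quick hand check of this setup (a sketch, separate from the derivation that follows), substituting the equality constraint $\theta_1 = -\theta_0$ into the objective with $\mathbf{p} = (3, 4, 3)$ leaves a one-dimensional problem:

\begin{align*}
f(\theta_0, -\theta_0) &= (\theta_0 - 3)^2 - \theta_0^2 + (4 - \theta_0)^2 - 3 \\
                       &= \theta_0^2 - 14\,\theta_0 + 22
\end{align*}

The unconstrained minimizer of this quadratic is $\theta_0 = 7$, which violates $\theta_0 \le 6$. The bound is therefore active at the optimum, giving $\theta^* = (6, -6)$ and $f^* = 36 - 84 + 22 = -26$.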

**How will my system design ($\bar{\theta}^*$) respond to changes in my assumptions and system inputs ($\bar{p}$)?**

### The Universal Derivatives Equation

The UDE is:

\begin{align*}
  \left[ \frac{\partial \mathcal{R}}{\partial u} \right] \left[ \frac{d u}{d \mathcal{R}} \right]
  &=
  \left[ I \right]
  =
  \left[ \frac{\partial \mathcal{R}}{\partial u} \right]^T \left[ \frac{d u}{d \mathcal{R}} \right]^T
\end{align*}
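
To make the two forms of the UDE concrete, here is a minimal sketch on a hypothetical two-variable system that is not part of the problem above: an input $x$ with residual $x - x_{in}$ and an explicit output $y = x^2$ with residual $y - x^2$. Plain NumPy is imported as `onp` only so that it does not clash with the `np` alias this notebook uses for `jax.numpy`.

```python
import numpy as onp

# Hypothetical system (an assumption for illustration only):
#   R_x = x - x_in    (implicit form of the input, x_in = 2)
#   R_y = y - x**2    (implicit form of the explicit calculation y = x**2)
x = 2.0
dR_du = onp.array([[1.0,      0.0],
                   [-2.0 * x, 1.0]])

# Forward form: [dR/du] [du/dR] = I, one linear solve per column of I
du_dR_fwd = onp.linalg.solve(dR_du, onp.eye(2))

# Reverse form: [dR/du]^T [du/dR]^T = I, one linear solve per row of du/dR
du_dR_rev = onp.linalg.solve(dR_du.T, onp.eye(2)).T

# The entry in the y row and x column is the total derivative dy/dx = 2x
print(du_dR_fwd[1, 0])
```

Both forms recover the same matrix; which one is cheaper depends on whether there are fewer inputs (columns) or fewer outputs of interest (rows).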

Here, the residuals are the primal and dual residuals of the optimization process, given above.

There are a lot of nomenclature collisions, so let's define the following going forward.

- Post-optimization, the unknowns ($\bar{u}$) in our system are the resulting design variable values ($\bar{\theta}$).

\begin{align*}
  \bar{u} &= \bar{\theta}
\end{align*}

- The independent variables ($\bar{x}$) of this system are those inputs for which we want to determine the sensitivity of the outputs: the bounding values of the active constraints as well as any other parameters that are inputs to the model but not design variables controlled by the optimizer.

\begin{align*}
  \bar{x} &= \bar{p}
\end{align*}

## Applying the UDE to solve for post-optimality sensitivities

In our case, the unknowns vector consists of
- the optimization parameters, which include the bounding values of active constraints ($\bar{p}$)
- the design variables of the optimization ($\bar{\theta}$)
- the Lagrange multipliers of the optimization ($\bar{\lambda}$)
- the objective value **as well as** any other outputs for which we want the sensitivities ($f$)

The total size of the unknowns vector is $N_p + N_{\theta} + N_{\lambda} + N_{f}$.

\begin{align*}
  \hat{u} &=
  \begin{bmatrix}
    \bar{p} \\
    \bar{\theta} \\
    \bar{\lambda} \\
    \bar{f}
  \end{bmatrix}
\end{align*}

Under the UDE, the corresponding residual equations for these unknowns are
- the implicit form of the independent variable values
- the stationarity condition
- the active constraints
- the implicit form of the explicit calculations of $f$ (and of any other outputs of interest)

\begin{align}
\bar{\mathcal{R}}
&=
\begin{bmatrix}
\bar{\mathcal{R}}_p \\
\bar{\mathcal{R}}_{\theta} \\
\bar{\mathcal{R}}_{g} \\
\bar{\mathcal{R}}_{f}
\end{bmatrix}
&=
\begin{bmatrix}
  \bar{p} - \check{p} \\[1.1ex]
  \hline \\
  \bar{r}_{\theta} - \left[ -\nabla_{\bar{\theta}} \check{f} (\bar{\theta}, \bar{p}) + \nabla_{\bar{\theta}} \check{g}_{ab} (\bar{\theta}, \bar{p})^T \bar{\lambda} \right] \\[1.1ex]
  \hline \\
  \bar{r}_{\lambda} - \check{g}_{ab} \left( \bar{\theta}, \bar{p} \right) \\[1.1ex]
  \hline \\
  f - \check{f}\left(\bar{\theta}, \bar{p} \right) 
\end{bmatrix}
&=
\begin{bmatrix}
  p_0 - \check{p}_0 \\[1.1ex]
  p_1 - \check{p}_1 \\[1.1ex]
  p_2 - \check{p}_2 \\[1.1ex]
  p_3 - \check{p}_3 \\[1.1ex]
  \hline \\
  \bar{r}_{\theta} - \left[ -\nabla_{\bar{\theta}} \check{f} (\bar{\theta}, \bar{p}) + \nabla_{\bar{\theta}} \check{g}_{ab} (\bar{\theta}, \bar{p})^T \bar{\lambda} \right] \\[1.1ex]
  \hline \\
  r_{\lambda_0} - \left[ \theta_0 + \theta_1 \right] \\[1.1ex]
  r_{\lambda_1} - \left[\theta_0 - p_3 \right] \\[1.1ex]
  \hline \\
  f - \left[ (\theta_0 - p_0)^2 + \theta_0 \theta_1 + (\theta_1 + p_1)^2 - p_2 \right]
\end{bmatrix}
&= \bar 0
\end{align}
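
As a sanity check, the residual vector can be evaluated at the optimum of the example. The sketch below is a hand-written version under stated assumptions: the residual values $\bar{r}_{\theta}$ and $\bar{r}_{\lambda}$ are taken as zero, and the multipliers $\bar{\lambda} = (2, -2)$, ordered as (equality constraint, bound), are the values that satisfy the stationarity condition as written above. The function name `residuals` is hypothetical.

```python
import numpy as onp  # plain NumPy, aliased so it does not shadow jax.numpy as np

def residuals(p, theta, lam, f, p_check):
    """Post-optimality residuals [R_p, R_theta, R_g, R_f] for the example problem."""
    grad_f = onp.array([2.0 * (theta[0] - p[0]) + theta[1],
                        theta[0] + 2.0 * (theta[1] + p[1])])
    jac_g = onp.array([[1.0, 1.0],     # d(theta_0 + theta_1) / d theta
                       [1.0, 0.0]])    # d(theta_0 - p_3) / d theta
    g = onp.array([theta[0] + theta[1], theta[0] - p[3]])
    f_check = (theta[0] - p[0])**2 + theta[0] * theta[1] + (theta[1] + p[1])**2 - p[2]

    R_p = p - p_check                        # implicit form of the parameters
    R_theta = -(-grad_f + jac_g.T @ lam)     # stationarity, with r_theta = 0
    R_g = -g                                 # active constraints, with r_lambda = 0
    R_f = onp.array([f - f_check])           # implicit form of the objective
    return onp.concatenate([R_p, R_theta, R_g, R_f])

p = onp.array([3.0, 4.0, 3.0, 6.0])
R = residuals(p, theta=onp.array([6.0, -6.0]), lam=onp.array([2.0, -2.0]),
              f=-26.0, p_check=p)
print(R)
```

All nine entries vanish at $\bar{\theta}^* = (6, -6)$, which is what it means for the optimizer to have converged the post-optimality system.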

In order to find the total derivatives that we seek ($\frac{d f^*}{d \bar{p}}$ and $\frac{d \bar{\theta}^*}{d \bar{p}}$), we need $\frac{\partial \bar{\mathcal{R}}}{\partial \bar{u}}$.

The optimizer has served as the nonlinear solver in this case, computing the values in the unknowns vector ($\bar{\theta}$, $\bar{\lambda}$, and $f$) such that the residuals are satisfied.

\begin{align}
\frac{\partial \bar{\mathcal{R}}}{\partial \bar{u}}
&=
\begin{bmatrix}
\frac{\partial \bar{\mathcal{R}_p}}{\partial \bar{p}} & 0 & 0 & 0 \\[1.1ex]
\frac{\partial \bar{\mathcal{R}_{\theta}}}{\partial \bar{p}} & \frac{\partial \bar{\mathcal{R}_{\theta}}}{\partial \bar{\theta}} & \frac{\partial \bar{\mathcal{R}_{\theta}}}{\partial \bar{\lambda}} & 0 \\[1.1ex]
\frac{\partial \bar{\mathcal{R}_g}}{\partial \bar{p}} & \frac{\partial \bar{\mathcal{R}_g}}{\partial \bar{\theta}} & 0 & 0 \\[1.1ex]
\frac{\partial \bar{\mathcal{R}_f}}{\partial \bar{p}} & \frac{\partial \bar{\mathcal{R}_f}}{\partial \bar{\theta}} & 0 & \frac{\partial \bar{\mathcal{R}_f}}{\partial f}
\end{bmatrix}
&=
\begin{bmatrix}
    \left[ I_p \right] & 0 & 0 & 0 \\[1.1ex]
    \frac{d \nabla \check{\mathcal{L}}}{d \bar{p}} & \frac{d \nabla \check{\mathcal{L}}}{d \bar{\theta}} & \frac{d \nabla \check{\mathcal{L}}}{d \bar{\lambda}} & 0 \\[1.1ex]
    \frac{d \check g}{d \bar{p}} & \frac{d \check g}{d \bar{\theta}} & 0 & 0 \\[1.1ex]
    -\frac{d \check f}{d \bar{p}} & -\frac{d \check f}{d \bar{\theta}} & 0 & \left[ I_f \right]
\end{bmatrix}
&=
\begin{bmatrix}
    \left[ I_p \right] & 0 & 0 & 0 \\[1.1ex]
    \frac{d \nabla \check{\mathcal{L}}}{d \bar{p}} & \nabla^2 \check{\mathcal{L}} & \nabla \check g ^T & 0 \\[1.1ex]
    \frac{d \check g}{d \bar{p}} & \nabla \check g & 0 & 0 \\[1.1ex]
    -\frac{d \check f}{d \bar{p}} & -\frac{d \check f}{d \bar{\theta}} & 0 & \left[ I_f \right]
\end{bmatrix}
\end{align}

This nomenclature can be a bit confusing.

**The _partial_ derivatives of the post-optimality residuals are the _total_ derivatives of the analysis.**

In the case of the stationarity residuals $\bar{\mathcal{R}}_{\theta}$, which already include _total_ derivatives of the analysis for the objective and constraint gradients, second derivatives of the analysis are required.
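
For the example problem these second derivatives are simple to write out by hand. Taking the Lagrangian implied by the stationarity residual, $\check{\mathcal{L}} = -\check{f} + \check{g}_{ab}^T \bar{\lambda}$ (a sign convention inferred from the residual as written above), and noting that both active constraints are linear in $\bar{\theta}$ so that their second derivatives vanish:

\begin{align*}
\nabla^2 \check{\mathcal{L}} &= -\nabla^2 \check{f} = -\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \\
\frac{d \nabla \check{\mathcal{L}}}{d \bar{p}} &= -\frac{d \nabla \check{f}}{d \bar{p}} = \begin{bmatrix} 2 & 0 & 0 & 0 \\ 0 & -2 & 0 & 0 \end{bmatrix}
\end{align*}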

The corresponding total derivatives which we need to solve for are:

\begin{align}
\frac{d \bar{u}}{d \bar{\mathcal{R}}}
&=
\begin{bmatrix}
\frac{d \bar{p}}{d \bar{\mathcal{R}_p}} & \frac{d \bar{p}}{d \bar{\mathcal{R}_{\theta}}} & \frac{d \bar{p}}{d \bar{\mathcal{R}_{\lambda}}} & \frac{d \bar{p}}{d \bar{\mathcal{R}_f}} \\[1.1ex]
\frac{d \bar{\theta}}{d \bar{\mathcal{R}_p}} & \frac{d \bar{\theta}}{d \bar{\mathcal{R}_{\theta}}} & \frac{d \bar{\theta}}{d \bar{\mathcal{R}_{\lambda}}} & \frac{d \bar{\theta}}{d \bar{\mathcal{R}_f}} \\[1.1ex]
\frac{d \bar{\lambda}}{d \bar{\mathcal{R}_p}} & \frac{d \bar{\lambda}}{d \bar{\mathcal{R}_{\theta}}} & \frac{d \bar{\lambda}}{d \bar{\mathcal{R}_{\lambda}}} & \frac{d \bar{\lambda}}{d \bar{\mathcal{R}_f}} \\[1.1ex]
\frac{d f}{d \bar{\mathcal{R}_p}} & \frac{d f}{d \bar{\mathcal{R}_{\theta}}} & \frac{d f}{d \bar{\mathcal{R}_{\lambda}}} & \frac{d f}{d \bar{\mathcal{R}_f}}
\end{bmatrix}
&=
\begin{bmatrix}
\frac{d \bar{p}}{d \bar{p}} & \frac{d \bar{p}}{d \bar{\theta}} & \frac{d \bar{p}}{d \bar{\lambda}} & \frac{d \bar{p}}{d \bar{f}} \\[1.1ex]
\mathbf{\frac{d \bar{\theta}}{d \bar{p}}} & \frac{d \bar{\theta}}{d \bar{\theta}} & \frac{d \bar{\theta}}{d \bar{\lambda}} & \frac{d \bar{\theta}}{d \bar{f}} \\[1.1ex]
\frac{d \bar{\lambda}}{d \bar{p}} & \frac{d \bar{\lambda}}{d \bar{\theta}} & \frac{d \bar{\lambda}}{d \bar{\lambda}} & \frac{d \bar{\lambda}}{d \bar{f}} \\[1.1ex]
\mathbf{\frac{d f}{d \bar{p}}} & \frac{d f}{d \bar{\theta}} & \frac{d f}{d \bar{\lambda}} & \frac{d \bar{f}}{d \bar{f}}
\end{bmatrix}
&=
\begin{bmatrix}
\left[ I_p \right] & 0 & 0 & 0 \\[1.1ex]
\mathbf{\frac{d \bar{\theta}}{d \bar{p}}} & \frac{d \bar{\theta}}{d \bar{\mathcal{R}_{\theta}}} & \frac{d \bar{\theta}}{d \bar{\mathcal{R}_{\lambda}}} & \frac{d \bar{\theta}}{d \bar{\mathcal{R}_f}} \\[1.1ex]
\frac{d \bar{\lambda}}{d \bar{\mathcal{R}_p}} & \frac{d \bar{\lambda}}{d \bar{\mathcal{R}_{\theta}}} & \frac{d \bar{\lambda}}{d \bar{\mathcal{R}_{\lambda}}} & \frac{d \bar{\lambda}}{d \bar{\mathcal{R}_f}} \\[1.1ex]
\mathbf{\frac{d f}{d \bar{p}}} & \frac{d f}{d \bar{\mathcal{R}_{\theta}}} & \frac{d f}{d \bar{\mathcal{R}_{\lambda}}} & \left[ I_f \right]
\end{bmatrix}
&=
\begin{bmatrix}
\left[ I_p \right] & 0 & 0 & 0 \\[1.1ex]
\mathbf{\frac{d \bar{\theta}}{d \bar{p}}} & \frac{d \bar{\theta}}{d \bar{\theta}} & \frac{d \bar{\theta}}{d \bar{\lambda}} & 0 \\[1.1ex]
\frac{d \bar{\lambda}}{d \bar{p}} & \frac{d \bar{\lambda}}{d \bar{\theta}} & \frac{d \bar{\lambda}}{d \bar{\lambda}} & 0 \\[1.1ex]
\mathbf{\frac{d f}{d \bar{p}}} & \frac{d f}{d \bar{\theta}} & \frac{d f}{d \bar{\lambda}} & \left[ I_f \right]
\end{bmatrix}
\end{align}

The sensitivities of the objective and the design variable values with respect to the parameters of the optimization are highlighted.

In this case, we can solve them with four linear solves of the forward system, or three solves of the reverse system.

TODO: Need to explain how du/dRf becomes du/df.

The UDE for this problem, in forward form, is

\begin{align}
\begin{bmatrix}
    \left[ I_p \right] & 0 & 0 & 0 \\[1.1ex]
    \frac{d \nabla \check{\mathcal{L}}}{d \bar{p}} & \nabla^2 \check{\mathcal{L}} & \nabla \check g ^T & 0 \\[1.1ex]
    \frac{d \check g}{d \bar{p}} & \nabla \check g & 0 & 0 \\[1.1ex]
    -\frac{d \check f}{d \bar{p}} & -\frac{d \check f}{d \bar{\theta}} & 0 & \left[ I_f \right]
\end{bmatrix}
\begin{bmatrix}
\frac{d \bar{p}}{d \bar{p}} & \frac{d \bar{p}}{d \bar{\theta}} & \frac{d \bar{p}}{d \bar{\lambda}} & \frac{d \bar{p}}{d \bar{f}} \\[1.1ex]
\mathbf{\frac{d \bar{\theta}}{d \bar{p}}} & \frac{d \bar{\theta}}{d \bar{\theta}} & \frac{d \bar{\theta}}{d \bar{\lambda}} & \frac{d \bar{\theta}}{d \bar{f}} \\[1.1ex]
\frac{d \bar{\lambda}}{d \bar{p}} & \frac{d \bar{\lambda}}{d \bar{\theta}} & \frac{d \bar{\lambda}}{d \bar{\lambda}} & \frac{d \bar{\lambda}}{d \bar{f}} \\[1.1ex]
\mathbf{\frac{d f}{d \bar{p}}} & \frac{d f}{d \bar{\theta}} & \frac{d f}{d \bar{\lambda}} & \frac{d \bar{f}}{d \bar{f}}
\end{bmatrix}
&=
\begin{bmatrix}
    \left[ I_p \right] & 0 & 0 & 0 \\[1.1ex]
    0 & \left[ I_\theta \right] & 0 & 0 \\[1.1ex]
    0 & 0 & \left[ I_\lambda \right] & 0 \\[1.1ex]
    0 & 0 & 0 & \left[ I_f \right]
\end{bmatrix}
\end{align}

The sensitivities of the objective and the design variable values with respect to the parameters of the optimization are highlighted.

In this case, we have four parameters and thus four columns for which we need to solve the system.
Alternatively, there are three rows of interest in this system: two for the design variables $\theta_0$ and $\theta_1$ and one for the objective $f$. Transposing the system and solving it in reverse form would require three linear solves.
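
The trade-off between the two forms can be sketched with dense NumPy on the numeric Jacobian of this example. The entries below are written by hand from the derivatives of $f$ and the active constraints at $\bar{\theta}^* = (6, -6)$; the $\frac{d \nabla \mathcal{L}}{d \bar{p}}$ block is left at zero, which is an assumption of this sketch and matches the placeholder used in the assembled matrix later in this notebook.

```python
import numpy as onp  # plain NumPy, aliased so it does not shadow jax.numpy as np

# dR/du at the optimum, ordered [p (4), theta (2), lambda (2), f (1)].
dR_du = onp.array([
    [1, 0, 0,  0,  0,  0, 0, 0, 0],
    [0, 1, 0,  0,  0,  0, 0, 0, 0],
    [0, 0, 1,  0,  0,  0, 0, 0, 0],
    [0, 0, 0,  1,  0,  0, 0, 0, 0],
    [0, 0, 0,  0, -2, -1, 1, 1, 0],   # stationarity rows (d grad L/dp assumed zero)
    [0, 0, 0,  0, -1, -2, 1, 0, 0],
    [0, 0, 0,  0,  1,  1, 0, 0, 0],   # active constraint rows
    [0, 0, 0, -1,  1,  0, 0, 0, 0],
    [6, 4, 1,  0,  0, -2, 0, 0, 1],   # objective row: [-df/dp, -df/dtheta, 0, I_f]
], dtype=float)

identity = onp.eye(9)
rows_of_interest = [4, 5, 8]   # theta_0, theta_1, f

# Forward form: one solve per parameter (four right-hand sides).
fwd = onp.linalg.solve(dR_du, identity[:, :4])[rows_of_interest, :]

# Reverse form: one solve per output of interest (three right-hand sides).
rev = onp.linalg.solve(dR_du.T, identity[:, rows_of_interest])[:4, :].T

print(fwd)
```

Both `fwd` and `rev` are the same 3-by-4 block of sensitivities; only the number of linear solves needed to obtain it differs.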

## Working through the example

First, let's use OpenMDAO to find the solution.

In [52]:
import jax.numpy as np
import openmdao.api as om


class ObjComp(om.JaxExplicitComponent):

    def setup(self):
        self.add_input('Θ', shape=(2,))
        self.add_input('p', shape=(4,))
        self.add_output('f', shape=(1,))

    def compute_primal(self, Θ, p):
        f = (Θ[0] - p[0])**2 + Θ[0] * Θ[1] + (Θ[1] + p[1])**2 - p[2]
        return np.array([f])

class ConComp(om.JaxExplicitComponent):

    def setup(self):
        self.add_input('Θ', shape=(2,))
        self.add_input('p', shape=(4,))
        self.add_output('g', shape=(1,))

    def compute_primal(self, Θ, p):
        g = Θ[0] + Θ[1]
        return np.array([g])


prob = om.Problem()
prob.model.add_subsystem('f_comp', ObjComp(), promotes_inputs=['*'], promotes_outputs=['*'])
prob.model.add_subsystem('g_comp', ConComp(), promotes_inputs=['*'], promotes_outputs=['*'])

prob.model.add_design_var('Θ', upper=[6., None])
prob.model.add_constraint('g', equals=0.)
prob.model.add_objective('f')

prob.driver = om.ScipyOptimizeDriver()

prob.setup()

prob.set_val('p', [3, 4, 3, 6])

prob.run_driver()


Optimization terminated successfully    (Exit mode 0)
            Current function value: -25.999999999999993
            Iterations: 2
            Function evaluations: 2
            Gradient evaluations: 2
Optimization Complete
-----------------------------------


Problem: problem14
Driver:  ScipyOptimizeDriver
  success     : True
  iterations  : 3
  runtime     : 1.2067E-01 s
  model_evals : 3
  model_time  : 2.2514E-02 s
  deriv_evals : 2
  deriv_time  : 9.4140E-02 s
  exit_status : SUCCESS

In [53]:
prob.model.list_vars(print_arrays=True);

6 Variables(s) in 'model'

varname  val                  io      prom_name
-------  -------------------  ------  ---------
f_comp
  Θ      |8.48528137|         input   Θ        
         val:
         array([ 6., -6.])
  p      |8.36660027|         input   p        
         val:
         array([3., 4., 3., 6.])
  f      [-26.]               output  f        
g_comp
  Θ      |8.48528137|         input   Θ        
         val:
         array([ 6., -6.])
  p      |8.36660027|         input   p        
         val:
         array([3., 4., 3., 6.])
  g      [1.77635684e-15]     output  g        




In [None]:
active_dvs, active_cons = prob.driver.compute_lagrange_multipliers()

Now let's form the UDE system and compute the sensitivities, outside of OpenMDAO first.

In [153]:
# Convert OpenMDAO values to jax arrays

f_opt = np.array(prob.get_val('f'))
Θ_opt = np.array(prob.get_val('Θ'))
p = np.array(prob.get_val('p'))

# The Lagrange multipliers of the active constraints are
λ_opt = np.array([active_dvs['Θ']['multipliers'][active_dvs['Θ']['indices']],
              active_cons['g']['multipliers'][active_cons['g']['indices']]])

Define our objective and active constraint (and bounds) functions.

In [154]:
def f(Θ, p):
    f = (Θ[0] - p[0])**2 + Θ[0] * Θ[1] + (Θ[1] + p[1])**2 - p[2]
    return np.array([f])

def g_active(Θ, p):
    return np.array([Θ[0] + Θ[1],
                     Θ[0] - p[3]])

# def df_dΘ(Θ, p):
#     df_dΘ = [[2 * (Θ[0] - p[0]) + Θ[1]],
#              [Θ[0] + 2 * (Θ[1] + p[1])]]
#     return np.array(df_dΘ)

# def df_dp(Θ, p):
#     df_dp = [[-2 * (Θ[0] - p[0])],
#              [2 * (Θ[1] + p[1])],
#              [-1],
#              [0]]
#     return np.array(df_dp)

In [333]:
import jax

# The design vars, from OpenMDAO
print('\nΘ*:')
print(Θ_opt)

# The Lagrange multipliers, from OpenMDAO
print("\nλ*:")
print(λ_opt)

# Jacobian of f with respect to Θ
df_dΘ = jax.jacobian(f, argnums=0)(Θ_opt, p)
print("\n∂f/∂Θ:")
print(df_dΘ)

# Jacobian of f with respect to p
df_dp = jax.jacobian(f, argnums=1)(Θ_opt, p)
print("\n∂f/∂p:")
print(df_dp)

# Jacobian of g_active with respect to Θ
dg_dΘ = jax.jacobian(g_active, argnums=0)(Θ_opt, p)
print("\n∂g/∂Θ:")
print(dg_dΘ)

# Jacobian of g_active with respect to p
dg_dp = jax.jacobian(g_active, argnums=1)(Θ_opt, p)
print("\n∂g/∂p:")
print(dg_dp)

# Gradient of the Lagrangian (the stationarity condition; should be ~0 at the optimum)
dL_dΘ = -df_dΘ.T + np.matmul(dg_dΘ.T, λ_opt)
print("\n∇L:")
print(dL_dΘ)

# Hessian of the objective
d2f_dΘ2 = jax.jacobian(jax.jacobian(f, argnums=0), argnums=0)(Θ_opt, p)
print("\n∇²f (via jacobian of jacobian):")
print(d2f_dΘ2)

d2f_dΘdp = jax.jacobian(jax.jacobian(f, argnums=0), argnums=1)(Θ_opt, p)
print("\n∂∇f/∂p (via jacobian of jacobian):")
print(d2f_dΘdp)

# Hessian of the constraints
d2g_dΘ2 = jax.jacobian(jax.jacobian(g_active, argnums=0), argnums=0)(Θ_opt, p)
print("\n∇²g (via jacobian of jacobian):")
print(d2g_dΘ2)

d2g_dΘdp = jax.jacobian(jax.jacobian(g_active, argnums=0), argnums=1)(Θ_opt, p)
print("\n∂∇g/∂p (via jacobian of jacobian):")
print(d2g_dΘdp)

# Hessian of the Lagrangian with respect to Θ
d2L_dΘ2 = -d2f_dΘ2.T + np.dot(d2g_dΘ2.T, λ_opt)
print("\n∇²L:")
print(d2L_dΘ2)

# Mixed second derivative of the Lagrangian with respect to Θ and p (∂∇L/∂p)
d2L_dΘdp = -d2f_dΘdp.T + np.dot(d2g_dΘdp.T, λ_opt)

I_p = np.eye(4)
print("\nI_p:")
print(I_p)

I_f = np.eye(1)
print("\nI_f:")
print(I_f)


Θ*:
[ 6. -6.]

λ*:
[[ 2.]
 [-2.]]

∂f/∂Θ:
[[1.77635684e-15 2.00000000e+00]]

∂f/∂p:
[[-6. -4. -1.  0.]]

∂g/∂Θ:
[[1. 1.]
 [1. 0.]]

∂g/∂p:
[[ 0.  0.  0. -0.]
 [ 0.  0.  0. -1.]]

∇L:
[[-3.10862447e-15]
 [-1.77635684e-15]]

∇²f (via jacobian of jacobian):
[[[2. 1.]
  [1. 2.]]]

∂∇f/∂p (via jacobian of jacobian):
[[[-2.  0.  0.  0.]
  [ 0.  2.  0.  0.]]]

∇²g (via jacobian of jacobian):
[[[0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]]]

∂∇g/∂p (via jacobian of jacobian):
[[[0. 0. 0. 0.]
  [0. 0. 0. 0.]]

 [[0. 0. 0. 0.]
  [0. 0. 0. 0.]]]

∇²L:
[[[-2.]
  [-1.]]

 [[-1.]
  [-2.]]]

I_p:
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]

I_f:
[[1.]]


In [334]:
import numpy as onp  # plain NumPy; `np` is jax.numpy in this notebook
import scipy.sparse as sp

def assemble_ude_matrix_scipy_sparse(nabla2_L, nabla_g, dg_dp, df_dtheta, df_dp, Np, Nx, Ng):
    """
    Assemble the UDE matrix using SciPy sparse matrices.
    This is more memory efficient for large problems.
    """

    # Convert JAX arrays to NumPy arrays for SciPy
    nabla2_L_np = onp.asarray(nabla2_L).reshape((Nx, Nx))
    nabla_g_np = onp.asarray(nabla_g)
    dg_dp_np = onp.asarray(dg_dp)
    df_dtheta_np = onp.asarray(df_dtheta)
    df_dp_np = onp.asarray(df_dp)

    # Create sparse blocks

    # Row 1: [I_p, 0, 0, 0]
    row1 = [
        sp.eye(Np),                                    # I_p
        sp.csr_matrix((Np, Nx)),                      # 0
        sp.csr_matrix((Np, Ng)),                      # 0
        sp.csr_matrix((Np, 1))                        # 0
    ]

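    # Row 2: [∂∇L/∂p, ∇²L, ∇g^T, 0]
    # NOTE: ∂∇L/∂p is left as a zero placeholder. This does not change df*/dp or
    # dΘ*/dp in this example, because the two active constraints fully determine Θ*.
    row2 = [
        sp.csr_matrix((Nx, Np)),                      # ∂∇L/∂p (zero placeholder)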
        sp.csr_matrix(nabla2_L_np),                   # ∇²L
        sp.csr_matrix(nabla_g_np.T),                  # ∇g^T
        sp.csr_matrix((Nx, 1))                        # 0
    ]

    # Row 3: [∂g/∂p, ∇g, 0, 0]
    row3 = [
        sp.csr_matrix(dg_dp_np),                      # ∂g/∂p
        sp.csr_matrix(nabla_g_np),                    # ∇g
        sp.csr_matrix((Ng, Ng)),                      # 0
        sp.csr_matrix((Ng, 1))                        # 0
    ]

    # Row 4: [-∂f/∂p, -∂f/∂θ, 0, I_f]
    row4 = [
        sp.csr_matrix(-df_dp_np),                     # -∂f/∂p
        sp.csr_matrix(-df_dtheta_np),                 # -∂f/∂θ
        sp.csr_matrix((1, Ng)),                       # 0
        sp.eye(1)                                     # I_f
    ]

    # Assemble using bmat
    partial_R_partial_u = sp.bmat([row1, row2, row3, row4], format='csr')

    return partial_R_partial_u

In [335]:
partial_R_partial_u = assemble_ude_matrix_scipy_sparse(d2L_dΘ2, dg_dΘ, dg_dp, df_dΘ, df_dp, Np=4, Nx=2, Ng=2)

In [336]:
with np.printoptions(linewidth=1024, precision=1):
    # Round before casting so that a value such as 3.9999999 is not truncated to 3
    print(np.asarray(np.rint(partial_R_partial_u.toarray()), dtype=int))

[[ 1  0  0  0  0  0  0  0  0]
 [ 0  1  0  0  0  0  0  0  0]
 [ 0  0  1  0  0  0  0  0  0]
 [ 0  0  0  1  0  0  0  0  0]
 [ 0  0  0  0 -2 -1  1  1  0]
 [ 0  0  0  0 -1 -2  1  0  0]
 [ 0  0  0  0  1  1  0  0  0]
 [ 0  0  0 -1  1  0  0  0  0]
 [ 6  4  1  0  0 -2  0  0  1]]
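
The assembly function above leaves the $\frac{\partial \nabla \mathcal{L}}{\partial p}$ block as a zero placeholder. The following sketch, written with dense NumPy and hand-entered blocks for this example, compares a forward solve with and without that block. The nonzero block $-\frac{\partial \nabla f}{\partial p}$ follows the sign convention of the matrix above, and the helper name `solve_forward` is hypothetical.

```python
import numpy as onp  # plain NumPy, aliased so it does not shadow jax.numpy as np

hess_L = onp.array([[-2.0, -1.0], [-1.0, -2.0]])    # ∇²L = -∇²f (constraints are linear)
jac_g = onp.array([[1.0, 1.0], [1.0, 0.0]])         # ∇g of the active constraints
dg_dp = onp.array([[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, -1.0]])
df_dp = onp.array([[-6.0, -4.0, -1.0, 0.0]])
df_dtheta = onp.array([[0.0, 2.0]])
dgradL_dp = onp.array([[2.0, 0.0, 0.0, 0.0], [0.0, -2.0, 0.0, 0.0]])   # -∂∇f/∂p

def solve_forward(block):
    """Solve the forward UDE for the four parameter columns, given a ∂∇L/∂p block."""
    A = onp.block([
        [onp.eye(4), onp.zeros((4, 2)), onp.zeros((4, 2)), onp.zeros((4, 1))],
        [block,      hess_L,            jac_g.T,           onp.zeros((2, 1))],
        [dg_dp,      jac_g,             onp.zeros((2, 2)), onp.zeros((2, 1))],
        [-df_dp,     -df_dtheta,        onp.zeros((1, 2)), onp.eye(1)],
    ])
    return onp.linalg.solve(A, onp.eye(9)[:, :4])

du_dp_placeholder = solve_forward(onp.zeros((2, 4)))
du_dp_full = solve_forward(dgradL_dp)

print(du_dp_full[6:8])   # dλ/dp, the only rows that depend on the ∂∇L/∂p block
```

In this example the two active constraints determine $\theta^*$ on their own, so the $\theta$ and $f$ rows agree in both solves and only the multiplier sensitivities $\frac{d \lambda}{d p}$ differ. In a problem with fewer active constraints than design variables, the placeholder would need to be replaced by the actual block.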


To obtain the sensitivities of the objective with respect to the parameters, the final row of $\frac{d u}{d \mathcal{R}}$, we transpose $\frac{\partial \mathcal{R}}{\partial u}$ and seed the right-hand side with a 1 in the last row.

\begin{align}
  \begin{bmatrix}
    1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\[1.5ex]
    0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\[1.5ex]
    0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\[1.5ex]
    0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\[1.5ex]
    0 & 0 & 0 & 0 &-2 &-1 & 1 & 1 & 0 \\[1.5ex]
    0 & 0 & 0 & 0 &-1 &-2 & 1 & 0 & 0 \\[1.5ex]
    0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\[1.5ex]
    0 & 0 & 0 &-1 & 1 & 0 & 0 & 0 & 0 \\[1.5ex]
    6 & 4 & 1 & 0 & 0 &-2 & 0 & 0 & 1
  \end{bmatrix}
  ^T
  \begin{bmatrix}
    \frac{d f^*}{d p_0} \\[1.3ex]
    \frac{d f^*}{d p_1} \\[1.3ex]
    \frac{d f^*}{d p_2} \\[1.3ex]
    \frac{d f^*}{d p_3} \\[1.3ex]
    \frac{d f^*}{d r_{\theta_0}} \\[1.3ex]
    \frac{d f^*}{d r_{\theta_1}} \\[1.3ex]
    \frac{d f^*}{d r_{\lambda_0}} \\[1.3ex]
    \frac{d f^*}{d r_{\lambda_1}} \\[1.3ex]
    \frac{d f^*}{d f^*}
  \end{bmatrix}
  &=
  \begin{bmatrix}
    0 \\[1.5ex]
    0 \\[1.5ex]
    0 \\[1.5ex]
    0 \\[1.5ex]
    0 \\[1.5ex]
    0 \\[1.5ex]
    0 \\[1.5ex]
    0 \\[1.5ex]
    1
  \end{bmatrix}  
\end{align}

In [337]:
from scipy.sparse.linalg import spsolve
rhs = sp.csr_matrix([[0, 0, 0, 0, 0, 0, 0, 0, 1]]).T
dfstar_du = spsolve(partial_R_partial_u.T, rhs)

In [338]:
dfstar_du

array([-6., -4., -1., -2., -0., -0.,  2., -2.,  1.])

In [339]:
dfstar_dp = dfstar_du[:4]

In [340]:
dfstar_dp

array([-6., -4., -1., -2.])

## Checking the results

Recall that the optimal objective value $f^*$ was

In [341]:
f_opt.item()

-25.999999999999993

Let's check the sensitivities by perturbing each element of p and re-optimizing.

In [342]:
def check_f_sensitivity(h=1.0E-8):
    prob.driver.options['disp'] = False

    print(f'Sensitivity               {"UDE Result":20s}       {"FD Result":20s}          {"Error":20s}')

    for p_idx in range(4):
        p_nom = np.array([3, 4, 3, 6])
        ub_nom = np.array([6., 1.0E16])

        dp = np.zeros(4)
        dp = dp.at[p_idx].set(h)

        dub = np.zeros(2)
        if p_idx == 3:
            # To test p3 we need to change the upper bound on θ
            dub = dub.at[0].set(h)

        prob.set_val('p', p_nom + dp)
        prob.set_val('Θ', [1, 1]) # Start away from the optimum

        prob.model.set_design_var_options('Θ', upper=ub_nom + dub)

        prob.run_driver()
        prob.set_val('p', p_nom)
        prob.model.set_design_var_options('Θ', upper=ub_nom)

        dfstar_dpi_fd = (prob.get_val('f') - f_opt) / h

        print(f'   df*/dp_{p_idx}     {dfstar_dp[p_idx]:20.12f}      {dfstar_dpi_fd[0]:20.12f}      {dfstar_dp[p_idx]-dfstar_dpi_fd[0]:20.12f}')

In [343]:
check_f_sensitivity()

Sensitivity               UDE Result                 FD Result                     Error               
   df*/dp_0          -6.000000000000           -6.000000496442            0.000000496442
   df*/dp_1          -4.000000000000           -3.999999975690           -0.000000024310
   df*/dp_2          -1.000000000000           -1.000000082740            0.000000082740
   df*/dp_3          -2.000000000000           -2.000000876023            0.000000876023
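
An independent check that does not need the optimizer is also possible for this small problem. Assuming both constraints stay active under small perturbations, the optimum is $\theta^* = (p_3, -p_3)$, so $f^*$ can be written as an explicit function of $\bar{p}$ and differentiated by central differences. The function name `f_star` and the step size are choices made for this sketch.

```python
import numpy as onp  # plain NumPy, aliased so it does not shadow jax.numpy as np

def f_star(p):
    """Optimal objective, assuming both constraints stay active: theta* = (p3, -p3)."""
    theta0, theta1 = p[3], -p[3]
    return (theta0 - p[0])**2 + theta0 * theta1 + (theta1 + p[1])**2 - p[2]

p_nom = onp.array([3.0, 4.0, 3.0, 6.0])
h = 1.0e-6

# Central difference with respect to each parameter in turn
dfstar_dp_cd = onp.array([
    (f_star(p_nom + h * e) - f_star(p_nom - h * e)) / (2.0 * h)
    for e in onp.eye(4)
])

print(f_star(p_nom))
print(dfstar_dp_cd)
```

Because `f_star` is quadratic in the parameters, the central differences agree with the UDE result up to round-off.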


Similarly, for the sensitivities of $\theta^*$, we seed the right-hand side with a 1 in the row of $\theta_0$ and then in the row of $\theta_1$:

In [344]:
rhs = sp.csr_matrix([[0, 0, 0, 0, 1, 0, 0, 0, 0]]).T
dthetastar0_du = spsolve(partial_R_partial_u.T, rhs)
rhs = sp.csr_matrix([[0, 0, 0, 0, 0, 1, 0, 0, 0]]).T
dthetastar1_du = spsolve(partial_R_partial_u.T, rhs)
dthetastar_du = np.vstack((dthetastar0_du, dthetastar1_du))

In [345]:
dthetastar_dp = dthetastar_du[:, :4]

In [346]:
dthetastar_dp

Array([[ 0.,  0.,  0.,  1.],
       [ 0.,  0.,  0., -1.]], dtype=float64)

That is, at the optimum $\theta_0$ is on its upper bound, so modifying this upper bound will necessarily change $\theta_0$ since we assume the bound remains active.

Since $\theta_1$ is constrained to be equal and opposite to $\theta_0$, the increase in $\theta_0$ will result in an equal and opposite change in $\theta_1$.
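
The same result follows directly from the active constraints. With two active constraints and two design variables, differentiating $\check{g}_{ab}(\bar{\theta}^*, \bar{p}) = 0$ with respect to $\bar{p}$ gives $\nabla \check{g} \, \frac{d \bar{\theta}^*}{d \bar{p}} = -\frac{d \check{g}}{d \bar{p}}$. A minimal NumPy sketch, with the constraint Jacobians entered by hand for this example:

```python
import numpy as onp  # plain NumPy, aliased so it does not shadow jax.numpy as np

# Jacobians of the active constraints [theta_0 + theta_1, theta_0 - p_3]
dg_dtheta = onp.array([[1.0, 1.0],
                       [1.0, 0.0]])
dg_dp = onp.array([[0.0, 0.0, 0.0,  0.0],
                   [0.0, 0.0, 0.0, -1.0]])

# Implicit differentiation of g(theta*, p) = 0
dtheta_dp = -onp.linalg.solve(dg_dtheta, dg_dp)

print(dtheta_dp)
```

This shortcut only works here because the active constraint Jacobian is square and invertible; in general the stationarity rows of the UDE are needed as well.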

In [347]:
def check_Θ_sensitivity(h=1.0E-8):
    prob.driver.options['disp'] = False

    print(f'Sensitivity                                       '
          f'{"UDE Result":20s}                      '
          f'{"FD Result":20s}                         '
          f'{"Error":20s}')

    for p_idx in range(4):
        p_nom = np.array([3, 4, 3, 6])
        ub_nom = np.array([6., 1.0E16])

        dp = np.zeros(4)
        dp = dp.at[p_idx].set(h)

        dub = np.zeros(2)
        if p_idx == 3:
            # To test p3 we need to change the upper bound on θ
            dub = dub.at[0].set(h)

        prob.set_val('p', p_nom + dp)
        prob.set_val('Θ', [1, 1]) # Start away from the optimum

        prob.model.set_design_var_options('Θ', upper=ub_nom + dub)

        prob.run_driver()
        prob.set_val('p', p_nom)
        prob.model.set_design_var_options('Θ', upper=ub_nom)

        dthetastar_dpi_fd = (prob.get_val('Θ') - Θ_opt) / h

        # print(prob.get_val('Θ'), Θ_opt, dthetastar_dpi_fd)

        with np.printoptions(precision=4, formatter={'all':lambda x: f'{x:16.12f}'}):
            print(f'   dΘ*/dp_{p_idx}              {dthetastar_dp[:, p_idx]}      {dthetastar_dpi_fd}      {dthetastar_dp[:, p_idx]-dthetastar_dpi_fd}')

In [348]:
check_Θ_sensitivity()

Sensitivity                                       UDE Result                                FD Result                                    Error               
   dΘ*/dp_0              [  0.000000000000   0.000000000000]      [  0.000000000000  -0.000000177636]      [  0.000000000000   0.000000177636]
   dΘ*/dp_1              [  0.000000000000   0.000000000000]      [  0.000000000000   0.000000000000]      [  0.000000000000   0.000000000000]
   dΘ*/dp_2              [  0.000000000000   0.000000000000]      [  0.000000000000   0.000000000000]      [  0.000000000000   0.000000000000]
   dΘ*/dp_3              [  1.000000000000  -1.000000000000]      [  0.999999993923  -1.000000171558]      [  0.000000006077   0.000000171558]
