In [None]:
import pandas as pd
import numpy as np
import cvxpy as cp

# Modeling Trees as Truncated Cones using Convex Optimization

The purpose of this notebook is to fit the truncated cone model introduced in the previous notebook using convex optimization. 

## Model Formulation

$$
V = \frac{1}{3}\pi (r_1^2+r_1r_2+r_2^2)h
$$

where $V$ is the observed volume of the tree, $h$ is the observed height, and $r_1$ and $r_2$ are the radii of the base and top of the cone, respectively.

We model the upper radius as a fraction of the lower radius by introducing a parameter $\alpha$ such that $r_2 = \alpha r_1$ for $\alpha \in (0,1)$. We will use a grid search to find an $\alpha$ that minimizes the fitted sum of squares. Writing $r$ for the lower radius $r_1$ and substituting $r_2 = \alpha r$, the volume becomes:

$$
V = \frac{1}{3}\cdot h\cdot\pi\cdot r^2(1+\alpha+\alpha^2), \;\alpha\in(0,1)
$$
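
As a quick numerical check of the substitution above, the following sketch evaluates both forms of the volume for a single tree. The function names and the sample values are illustrative assumptions and are not taken from the dataset.

```python
import numpy as np

def frustum_volume(r1, r2, h):
    """Volume of a truncated cone with base radius r1, top radius r2, height h."""
    return np.pi * h * (r1**2 + r1 * r2 + r2**2) / 3

def frustum_volume_alpha(r, alpha, h):
    """Same volume with the top radius written as alpha * r."""
    return np.pi * h * r**2 * (1 + alpha + alpha**2) / 3

# Hypothetical tree: base radius 0.5, height 20, top radius 40% of the base
r, h, alpha = 0.5, 20.0, 0.4
v_direct = frustum_volume(r, alpha * r, h)
v_alpha = frustum_volume_alpha(r, alpha, h)
print(v_direct, v_alpha)
```

The two limiting cases are a useful sanity check: $\alpha = 1$ gives the cylinder volume $\pi r^2 h$, and $\alpha = 0$ gives the full cone volume $\frac{1}{3}\pi r^2 h$.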

Formulating this as an optimization problem, we seek the $\alpha$ that minimizes the sum of squared residuals. The observed constants are the lower radius ($R_i$), height ($h_i$), and volume ($V_i$) of each tree, for $i=1, \dots, N$, so the problem is:

$$
\begin{align}
    \min_{\alpha} &\sum_{i=1}^N \left(V_i - \frac{h_i}{3}\pi R_i^2(1+\alpha+\alpha^2)\right)^2\\
    \text{s.t. }&\alpha\in[0,1]
\end{align}
$$

To represent this in matrix notation, let $V=\begin{pmatrix}V_1 \\ V_2 \\ \vdots \\ V_N\end{pmatrix}$ and $hR^2 = \begin{pmatrix}h_1R_1^2 \\ h_2R_2^2 \\ \vdots \\ h_NR_N^2\end{pmatrix}$, where $R_i$ is the observed lower radius of tree $i$.

Putting this all together, we have:

$$
\begin{align}
    \min_{\alpha} &\left\|V - \frac{\pi}{3}(1+\alpha+\alpha^2)\,hR^2\right\|_2^2 \\
    \text{s.t. }&0\leq\alpha\leq 1
\end{align}
$$

The solution is a scalar $\alpha\in \mathbb{R}$ with $\alpha\in [0,1]$.
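
The vector form maps directly onto NumPy arrays. The sketch below evaluates the objective for candidate values of $\alpha$ on synthetic data. The arrays, the true value 0.6, and the helper name `sse` are assumptions made for illustration and are not the notebook's data.

```python
import numpy as np

def sse(alpha, V, h, R):
    """Sum of squared residuals ||V - (pi/3)(1 + alpha + alpha^2) h R^2||_2^2."""
    fitted = np.pi / 3 * (1 + alpha + alpha**2) * h * R**2
    return np.sum((V - fitted) ** 2)

# Synthetic, noise-free trees generated with a known alpha (illustrative only)
rng = np.random.default_rng(0)
R = rng.uniform(0.1, 0.5, size=50)
h = rng.uniform(5.0, 30.0, size=50)
alpha_true = 0.6
V = np.pi / 3 * (1 + alpha_true + alpha_true**2) * h * R**2

# Evaluate the objective on a coarse grid over [0, 1] and keep the best value
grid = np.linspace(0.0, 1.0, 101)
losses = np.array([sse(a, V, h, R) for a in grid])
alpha_best = grid[np.argmin(losses)]
print(alpha_best)
```

Because $1+\alpha+\alpha^2$ is increasing on $[0,1]$, the objective has a single minimum there, so a grid like this one can only miss the optimum by the grid spacing.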

## Read Data and Set Parameters

In [None]:
trees = pd.read_csv("trees.csv")
trees.head()

In [None]:
r = (trees.d/2).to_numpy() # Radius: DBH (diameter) divided by 2
h = trees.h.to_numpy()
V = trees.v.to_numpy()

In [None]:
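b = cp.Variable(1) # b stands in for 1 + alpha + alpha**2

## Optimization

In [None]:
np.random.seed(6596) # NLP CU Denver course number used as seed

# Keeping the residual affine lets cp.sum_squares pass CVXPY's DCP check, so we
# solve for b = 1 + alpha + alpha**2, which rises from 1 to 3 as alpha goes from 0 to 1.
objective = cp.sum_squares(V - h/3*np.pi*r**2*b) # Objective function
constraints = [b >= 1, b <= 3]
prob = cp.Problem(cp.Minimize(objective), constraints)
prob.solve()

alpha = (-1 + np.sqrt(4*b.value - 3)) / 2 # Recover alpha, the target of interest
alpha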
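
Because the model is linear in $\beta = 1+\alpha+\alpha^2$, the one-parameter problem also has a closed form that can serve as a cross-check on the solver output. The sketch below assumes that substitution, runs on synthetic data, and introduces the hypothetical helper `fit_alpha`; it is a sanity check under those assumptions and not the notebook's method.

```python
import numpy as np

def fit_alpha(V, h, R):
    """Least-squares alpha in [0, 1] via the substitution beta = 1 + alpha + alpha^2."""
    x = np.pi / 3 * h * R**2                  # model is V ~ beta * x
    beta = np.dot(x, V) / np.dot(x, x)        # unconstrained 1-D least squares
    beta = np.clip(beta, 1.0, 3.0)            # alpha in [0, 1] <=> beta in [1, 3]
    alpha = (-1 + np.sqrt(4 * beta - 3)) / 2  # positive root of alpha^2 + alpha + 1 = beta
    return alpha

# Synthetic trees with a known alpha plus small multiplicative noise (illustrative only)
rng = np.random.default_rng(1)
R = rng.uniform(0.1, 0.5, size=200)
h = rng.uniform(5.0, 30.0, size=200)
alpha_true = 0.35
V_clean = np.pi / 3 * (1 + alpha_true + alpha_true**2) * h * R**2
V = V_clean * (1 + rng.normal(0, 0.01, size=200))

alpha_hat = fit_alpha(V, h, R)
print(alpha_hat)
```

Clipping is valid here because the objective is a convex quadratic in the single variable $\beta$, so the constrained minimizer is the unconstrained one projected onto $[1,3]$.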

## Adding Species

To add species to the model, we first one-hot encode the species type, so that each species gets its own 0/1 indicator column.

In [None]:
# Example data

species = ["cherry", "cherry", "teak"]
df = pd.DataFrame(species, columns=["Species"])
dummies = pd.get_dummies(df, columns=["Species"], dtype=int) # one indicator column per species
df.join(dummies).rename(columns={"Species_cherry": "ind1", "Species_teak": "ind2"})

Here is a WRONG way to represent an additive effect per species:

$$
\begin{align}
    \min &\sum_{i=1}^N \left(V_i - \frac{h_i}{3}\pi R_i^2(1+\alpha+\alpha^2)-\beta_i S_i\right)^2\\
    \text{s.t. }&\alpha\in[0,1]
\end{align}
$$