### Problem 1 (50 points) 

Vapor-liquid equilibria data are correlated using two adjustable parameters $A_{12}$ and $A_{21}$ per binary
mixture. For low pressures, the equilibrium relation can be formulated as:

$$
\begin{aligned}
p = & x_1\exp\left(A_{12}\left(\frac{A_{21}x_2}{A_{12}x_1+A_{21}x_2}\right)^2\right)p_{water}^{sat}\\
& + x_2\exp\left(A_{21}\left(\frac{A_{12}x_1}{A_{12}x_1+A_{21}x_2}\right)^2\right)p_{1,4 dioxane}^{sat}.
\end{aligned}
$$

Here the saturation pressures are given by the Antoine equation

$$
\log_{10}(p^{sat}) = a_1 - \frac{a_2}{T + a_3},
$$

where $T = 20\,^{\circ}{\rm C}$ and the coefficients $a_1$, $a_2$, $a_3$ for water and 1,4 dioxane
are given below.

|             | $a_1$     | $a_2$      | $a_3$     |
|:------------|:--------|:---------|:--------|
| Water       | 8.07131 | 1730.63  | 233.426 |
| 1,4 dioxane | 7.43155 | 1554.679 | 240.337 |
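
As a quick illustration of the Antoine equation, the sketch below evaluates both saturation pressures at $T = 20$ with the tabulated coefficients. The helper name `antoine_psat` is made up for this example. The results come out at roughly 17.5 for water and 28.8 for 1,4 dioxane, close to the measured pressures at the pure-component ends of the data table.

```python
def antoine_psat(a1, a2, a3, T):
    """Saturation pressure from log10(p_sat) = a1 - a2 / (T + a3)."""
    return 10.0 ** (a1 - a2 / (T + a3))

T = 20.0  # temperature in degrees Celsius, as stated in the problem

# Coefficients copied from the table above.
p_water_sat = antoine_psat(8.07131, 1730.63, 233.426, T)
p_dioxane_sat = antoine_psat(7.43155, 1554.679, 240.337, T)

print(p_water_sat, p_dioxane_sat)
```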


The following table lists the measured data. Recall that in a binary system $x_1 + x_2 = 1$.

|$x_1$ | 0.0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0 |
|:-----|:--------|:---------|:--------|:-----|:-----|:-----|:-----|:-----|:-----|:-----|:-----|
|$p$| 28.1 | 34.4 | 36.7 | 36.9 | 36.8 | 36.7 | 36.5 | 35.4 | 32.9 | 27.7 | 17.5 |

Estimate $A_{12}$ and $A_{21}$ using data from the above table: 

1. Formulate the least-squares problem; 
2. Since the model is nonlinear, the problem does not have an analytical solution. Therefore, solve it using the gradient descent or Newton's method implemented in HW1; 
3. Compare your optimized model with the data. Does your model fit well with the data?

---

### 1
The objective function is:<br />
$\min_{A_{12},A_{21}}\sum_{i=1}^{n}\left(p(x^{(i)}, A_{12}, A_{21})-p^{(i)}\right)^2$ <br />
where $n = 11$, $x_2^{(i)}=1-x_1^{(i)}$, and<br />
$p(x^{(i)}, A_{12},A_{21})=x_1^{(i)}\exp\left(A_{12}\left(\frac{A_{21}x_2^{(i)}}{A_{12}x^{(i)}_1+A_{21}x_2^{(i)}}\right)^2\right)p_{water}^{sat}+x_2^{(i)}\exp\left(A_{21}\left(\frac{A_{12}x_1^{(i)}}{A_{12}x^{(i)}_1+A_{21}x_2^{(i)}}\right)^2\right)p_{1,4 dioxane}^{sat}$

### 2
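
The code below applies gradient descent with a backtracking line search. Written out, with $f$ the sum of squared residuals from part 1 and $a = (A_{12}, A_{21})$, each iteration is

$$
a_{k+1} = a_k - s_k \nabla f(a_k),
$$

where $s_k$ is the first step in the sequence $0.1, 0.05, 0.025, \dots$ that satisfies the sufficient-decrease condition

$$
f\big(a_k - s_k \nabla f(a_k)\big) \le f(a_k) - \tfrac{1}{2}\, s_k \,\|\nabla f(a_k)\|^2 .
$$

The iteration starts from $a_0 = (1, 1)$ and ends after the first update for which $\|\nabla f(a_k)\| \le 0.1$.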

In [1]:
import torch as t
from torch.autograd import Variable
import numpy as np

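# Saturation pressures at T = 20 from the Antoine equation.
# Note: the coefficients used here are rounded relative to the table in the problem statement.
pwater_sat = np.power(10, (8.071 - 1730.63/(20 + 233.426)))
p14_sat = np.power(10, (7.432 - 1554.68/(20 + 240.337)))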
x = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
p = [28.1, 34.4, 36.7, 36.9, 36.8, 36.7, 36.5, 35.4, 32.9, 27.7, 17.5]

def pressure(a, xi):
    # Model pressure at water mole fraction xi for parameters a = (A12, A21).
    x1 = xi
    x2 = 1 - x1
    A12 = a[0]
    A21 = a[1]
    denom = A12*x1 + A21*x2
    p_n = (x1 * t.exp(A12 * ((A21*x2)/denom)**2) * pwater_sat
           + x2 * t.exp(A21 * ((A12*x1)/denom)**2) * p14_sat)
    return p_n

def object_fun(a):

    loss_ = 0
    n = len(x)

    for i in range(n):
        xi = x[i]
        p_n = pressure(a, xi)
        loss_ = loss_ + (p_n - p[i])**2  # Sum the squared difference.

    return loss_

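def linesearch(a):
    # Backtracking line search: halve the step until the sufficient-decrease condition holds.
    step = 0.1
    with t.no_grad():  # the search itself does not need to be differentiated
        g = a.grad
        f0 = object_fun(a)
        while object_fun(a - step*g) > f0 - step*(0.5)*t.dot(g, g):
            step = 0.5 * step

    return step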

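a = Variable(t.tensor([1.0, 1.0]), requires_grad=True)  # initial guess for (A12, A21)
diff = 100  # gradient norm, initialized above the tolerance

# Gradient descent: iterate until the gradient norm is at most 0.1.
while diff > 0.1:

    obj = object_fun(a)
    obj.backward()  # populates a.grad
    step = linesearch(a)
    diff = t.linalg.norm(a.grad)

    with t.no_grad():
        a -= step * a.grad
        a.grad.zero_()  # clear the gradient so it does not accumulate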

print('A12 and A21 are {}, respectively, with loss {}'.format(a.data.numpy(), obj.data.numpy()))

A12 and A21 are [1.9545084 1.6900573], respectively, with loss 0.7166754603385925


### 3

In [4]:
a_ = a.data
print('True Pressures   Model Pressures')
y_pred = np.zeros([len(x), 1])
for i in range(len(x)):
    y_pred[i] = pressure(a_, x[i])
    print('The true pressure is {}, and the model pressure is {}'.format(p[i], pressure(a, x[i])))

from sklearn.metrics import mean_squared_error
err = mean_squared_error(p, y_pred)
print('The mean squared error of the predicted data w.r.t true values is {}'.format(err))

True Pressures   Model Pressures
The true pressure is 28.1, and the model pressure is 28.85372543334961
The true pressure is 34.4, and the model pressure is 34.645721435546875
The true pressure is 36.7, and the model pressure is 36.45185852050781
The true pressure is 36.9, and the model pressure is 36.866844177246094
The true pressure is 36.8, and the model pressure is 36.873008728027344
The true pressure is 36.7, and the model pressure is 36.747642517089844
The true pressure is 36.5, and the model pressure is 36.38789367675781
The true pressure is 35.4, and the model pressure is 35.38384246826172
The true pressure is 32.9, and the model pressure is 32.949913024902344
The true pressure is 27.7, and the model pressure is 27.73257064819336
The true pressure is 17.5, and the model pressure is 17.460784912109375
The mean squared error of the predicted data w.r.t true values is 0.06515213753043846


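The mean squared error is about 0.065 (a root-mean-square error of about 0.26) against measured pressures between 17.5 and 36.9, so the model-predicted pressures fit the data very well. The largest deviation, about 0.75, occurs at $x_1 = 0$, where the prediction equals the saturation pressure of 1,4 dioxane and does not depend on $A_{12}$ or $A_{21}$.
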
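
The error figures above can be cross-checked without PyTorch. The sketch below re-evaluates the model at the fitted parameters reported in part 2, using the same rounded Antoine coefficients as the solution code; the names `model_pressure`, `sse`, `mse`, and `worst` are introduced here for illustration only. Because it runs in double precision rather than float32, the last digits may differ slightly from the printed output.

```python
import math

# Rounded Antoine coefficients, copied from the solution code in part 2.
p_water_sat = 10 ** (8.071 - 1730.63 / (20 + 233.426))
p_dioxane_sat = 10 ** (7.432 - 1554.68 / (20 + 240.337))

x1_data = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
p_data = [28.1, 34.4, 36.7, 36.9, 36.8, 36.7, 36.5, 35.4, 32.9, 27.7, 17.5]

A12, A21 = 1.9545084, 1.6900573  # fitted values reported in part 2

def model_pressure(x1):
    """Model pressure for water mole fraction x1 at the fitted A12, A21."""
    x2 = 1.0 - x1
    denom = A12 * x1 + A21 * x2
    gamma1 = math.exp(A12 * (A21 * x2 / denom) ** 2)
    gamma2 = math.exp(A21 * (A12 * x1 / denom) ** 2)
    return x1 * gamma1 * p_water_sat + x2 * gamma2 * p_dioxane_sat

residuals = [model_pressure(x1) - p for x1, p in zip(x1_data, p_data)]
sse = sum(r * r for r in residuals)   # sum of squared errors (the loss)
mse = sse / len(residuals)            # mean squared error
rmse = math.sqrt(mse)
# Largest absolute residual, paired with the x1 where it occurs.
worst = max(zip((abs(r) for r in residuals), x1_data))

print(sse, mse, rmse, worst)
```

Dividing the loss from part 2 by the 11 data points gives the mean squared error reported in part 3, which is a useful consistency check between the two cells.
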
### Problem 2 (50 points)
Solve the following problem using Bayesian Optimization:
$$
    \min_{x_1, x_2} \quad \left(4-2.1x_1^2 + \frac{x_1^4}{3}\right)x_1^2 + x_1x_2 + \left(-4 + 4x_2^2\right)x_2^2,
$$
for $x_1 \in [-3,3]$ and $x_2 \in [-2,2]$. A tutorial on Bayesian Optimization can be found [here](https://thuijskens.github.io/2016/12/29/bayesian-optimisation/).
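
Before running Bayesian Optimization it helps to have the objective in code and a rough reference value to compare against. The sketch below is not Bayesian Optimization: it is a brute-force grid search over the stated box, with an assumed spacing of 0.05, and the function name `camel` is chosen here for illustration (the objective is commonly known as the six-hump camel function).

```python
def camel(x1, x2):
    """Objective from Problem 2."""
    return ((4 - 2.1 * x1**2 + x1**4 / 3) * x1**2
            + x1 * x2
            + (-4 + 4 * x2**2) * x2**2)

# Coarse grid over x1 in [-3, 3] and x2 in [-2, 2]; the 0.05 spacing is an assumption.
step = 0.05
grid = [(-3 + step * i, -2 + step * j) for i in range(121) for j in range(81)]

best_point = min(grid, key=lambda pt: camel(*pt))
best_value = camel(*best_point)

print(best_point, best_value)
```

The function is symmetric under $(x_1, x_2) \to (-x_1, -x_2)$, so it has two global minimizers, near $(0.0898, -0.7126)$ and $(-0.0898, 0.7126)$, with a value of about $-1.0316$. A Bayesian Optimization run should approach that value with far fewer evaluations than the grid uses.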

In [2]:
# A simple example of using PyTorch for gradient descent

import torch as t
from torch.autograd import Variable

# Define a variable, make sure requires_grad=True so that PyTorch can take gradient with respect to this variable
x = Variable(t.tensor([1.0, 0.0]), requires_grad=True)

# Define a loss
loss = (x[0] - 1)**2 + (x[1] - 2)**2

# Take gradient
loss.backward()

# Check the gradient. numpy() turns the variable from a PyTorch tensor to a numpy array.
x.grad.numpy()

array([ 0., -4.], dtype=float32)

In [3]:
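# Let's evaluate the loss at a different x. Note that x.grad was not zeroed after the previous cell,
# so backward() adds the new gradient [2, -2] to the old one [0, -4]; the output below is that sum.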
x.data = t.tensor([2.0, 1.0])
loss = (x[0] - 1)**2 + (x[1] - 2)**2
loss.backward()
x.grad.numpy()

array([ 2., -6.], dtype=float32)

In [4]:
# Here is an example of gradient descent without line search

import torch as t
from torch.autograd import Variable

x = Variable(t.tensor([1.0, 0.0]), requires_grad=True)

# Fix the step size
a = 0.01

# Start gradient descent
for i in range(1000):  # TODO: change the termination criterion
    loss = (x[0] - 1)**2 + (x[1] - 2)**2
    loss.backward()
    
    # no_grad() specifies that the operations within this context are not part of the computational graph, i.e., we don't need the gradient descent algorithm itself to be differentiable with respect to x
    with t.no_grad():
        x -= a * x.grad
        
        # need to clear the gradient at every step, or otherwise it will accumulate...
        x.grad.zero_()
        
print(x.data.numpy())
print(loss.data.numpy())

[1.        1.9999971]
8.185452e-12
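
The fixed step size and fixed iteration count above can be replaced by a backtracking line search and a gradient-norm stopping test, which is one way to address the TODO on the termination criterion. The following is a minimal pure-Python sketch on the same quadratic loss, using its analytic gradient instead of autograd; the function names, the initial step of 1.0, and the tolerance `tol` are assumptions made for this example.

```python
import math

def loss(x):
    return (x[0] - 1) ** 2 + (x[1] - 2) ** 2

def grad(x):
    # Analytic gradient of the quadratic loss above.
    return [2 * (x[0] - 1), 2 * (x[1] - 2)]

def backtracking_step(x, g, step=1.0, shrink=0.5):
    """Halve the step until the sufficient-decrease condition holds."""
    g_sq = sum(gi * gi for gi in g)
    while loss([xi - step * gi for xi, gi in zip(x, g)]) > loss(x) - 0.5 * step * g_sq:
        step *= shrink
    return step

x = [1.0, 0.0]
tol = 1e-6  # stop when the gradient norm drops below this (assumed tolerance)

for iteration in range(1000):  # safety cap on the number of iterations
    g = grad(x)
    if math.sqrt(sum(gi * gi for gi in g)) < tol:
        break
    s = backtracking_step(x, g)
    x = [xi - s * gi for xi, gi in zip(x, g)]

print(x, loss(x))
```

On this particular quadratic the line search accepts a step of 0.5, which lands on the minimizer in a single update; on less regular losses the same loop simply takes more iterations.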
