# Question.1

## (i)
Using lagrange multiplier,
$$
min - \sum \log x_i, \quad \text{s.t.} \Sigma x_i = 1
$$
Denote 
$$
L = - \sum \log x_i + \lambda(\Sigma x_i - 1)
$$
Apply F.O.C

$$
\frac{\partial L}{\partial x_i} = 0 \Rightarrow -\frac{1}{x_i} + \lambda = 0 \Rightarrow x_i = \frac 1\lambda
$$
$$
\frac{\partial L}{\partial \lambda} = 0 \Rightarrow \Sigma x_i = 1 \Rightarrow x_i = \frac 1n, \lambda = n
$$

## (ii)
Similar to (i), denote that 
$$
L = -\sum\log x_i + (Ax - b)^T\lambda
$$
Apply F.O.C,

Here, $A_{ji}$ is the jth row, ith col element in A.
$$
\frac{\partial L}{\partial x_i} = 0 \Rightarrow  -\frac{1}{x_i} + \lambda\sum_j A_{ji} = 0
$$
$$
\frac{\partial L}{\partial\lambda} = 0 \Rightarrow Ax = b.
$$
From the first equaiton, we find that 
$$x_i = \frac{1}{\lambda\sum_jA_{ji}} = \frac{1}{(A^T\lambda)_i}$$.

Also we have,
$$Ax = 
A\left(\begin{array}{c}\frac{1}{\lambda \sum_j A_{j1}}\\
\vdots\\
\frac{1}{\lambda \sum_j A_{jn}}\\
\end{array}\right) = 
\left(
\begin{array}{c}\sum_i\frac{ A_{1i}}{\lambda \sum_j A_{ji}}\\
\vdots\\
\sum_i\frac{ A_{ni}}{\lambda \sum_j A_{ji}}\\
\end{array}\right)
$$
Thus we have
$$
x^TA^T \lambda = \left(
\begin{array}{c}\sum_i\frac{ A_{1i}}{\lambda \sum_j A_{ji}}&
\cdots&
\sum_i\frac{ A_{ni}}{\lambda \sum_j A_{ji}}\\
\end{array}\right)
\left(\begin{array}{cc}\lambda\\ 
\vdots\\
\lambda \end{array}\right) = n
$$
Thus we get the dual larange function
$$
q(\lambda) = \inf_{\lambda}( -\sum\log x_i + (Ax - b)^T\lambda) = -b^T\lambda - \sum_{i = 1}^{n} \frac{1}{(A^T\lambda)_i} + n
$$
i.e.
$$
\max q(\lambda) = -b^T\lambda +\sum_{i = 1}^{n} {(A^T\lambda)_i} + n\qquad \text{where } \lambda > 0.
$$

This is the dual problem.


## (iii)
Already solved in (ii)

## (iv)
By newton's method, we need to calculate the matrix equation
$$
\begin{bmatrix}
\nabla^2 f &A^T\\
A& 0
\end{bmatrix}
\begin{bmatrix}
-\Delta x\\
\lambda
\end{bmatrix}
=
\begin{bmatrix}
-\nabla f(x)\\
0
\end{bmatrix}
$$
It's easy to see that 

$$
\nabla f = (-\frac{1}{x_1},\cdots, -\frac{1}{x_n})^T
$$

$$
\nabla^2 f = diag(\frac{1}{x_1^2},\cdots, \frac{1}{x_n^2})
$$

In [166]:
import numpy as np
from numpy.linalg import inv
def NewtonMethod_acp(A, x0, alpha, beta, tol):
    def decrement(dfx, H):
        # define the function to calculate lambda(x), which is the Newton decrement
        return np.sqrt(np.dot(dfx.T,np.dot(np.diag(1/np.diag(H)), dfx)))
    (row, col) = A.shape
    x = x0
    H = np.diag(x) # H is hessian of f(x)
    upper = np.column_stack((H, A.T)) # [Hessian, A^T]
    lower = np.column_stack((A, np.zeros((row, row)))) # [A, 0]
    KKT = np.row_stack((upper, lower)) # this the the matrix on the left
    dfx = 1 / x # this is - nabla f(x)
    RHS = np.append(dfx, np.zeros(row)) # right hand side
    epsilon =  decrement(dfx, H) # get the Newton decrement
    count = 0
    while epsilon > tol and count < 50:
        count = count + 1
        LHS = np.squeeze(np.array(inv(KKT).dot(RHS)))
        dx = np.array(LHS)[:col]
        t = 1
        print (x)
        # using backtracking line search
        while -np.log(x + t * dx).sum() >= -np.log(x).sum() - alpha * t * dfx.T.dot(dx):
            t = t * beta
        print (t)
        # update the parameters
        x = x + t * dx
        H = np.diag(x)
        upper = np.column_stack((H, A.T))
        lower = np.column_stack((A, np.zeros((row, row))))
        KKT = np.row_stack((upper, lower))
        dfx = 1 / x
        RHS = np.append(dfx, np.zeros(row))
        epsilon =  decrement(dfx, H)
    return x

In [43]:
A = np.matrix([1,1,1])

In [164]:
x0 = np.array([0.5,0.9,1])

In [50]:
a = 0.1
b = 0.5
tol = 0.00001

In [168]:
res

array([-0.51337591,  2.02675183, -0.51337591])

# Question.2

## (a)

First, show the equivelent between these 4 conditions.
___
* Condition 1 is equivalent to Condition 2

Condition 1 $\rightarrow$ Condition 2

Suppose $x \in \mathcal N(A) \cap \mathcal N(H)$ and $x \neq 0$. In condition 2, we have  $Ax = 0\, x \neq 0$. Since $x \in \mathcal N(A) \cap \mathcal N(H)$, we have $x^THx = 0$, this contradict to to second statement.

Condition 2 $\rightarrow$ Condition 1

If there is an $x$ s.t. $Ax = 0\, x \neq 0, x^THX = 0$. Because $H$ is semipositive definate, we must have $Hx = 0$, i.e.   $x \in \mathcal N(A) \cap \mathcal N(H)$
___
* Condition 2 is equivalent to Condition 3

If $Ax = 0, x \neq 0$, then because $\mathcal R(F) = \mathcal N (A)$, we have $x = Fz$ and $z \neq 0$. Then because of 2 we have $x^T Hx = z^TF^THFZ > 0$

___
* Condition 2 is equivalent to Condition 4
If we have the condition 2. then 

$$
x^T(H + A^TA) = x^THx + \vert\vert A^Tx\vert\vert^2_2 > 0
$$
for all nonzero x, sot the condition 4 holds for $Q = I$.
If the fourth condition hols with general semipositive definate form of $Q$.
$$
x^T(H + A^TQA) = x^THx + x^TA^TQAx > 0
$$
for all nonzero $x$, Therefore if $Ax = 0\, x \neq 0$, we must have $x\neq 0$
___
Second show these four conditions are equivalent to nonsigularity of KKT matrix.

Suppose $x\neq 0$, s.t.$Ax = 0, Hx = 0$
$$
\begin{bmatrix}
H & A^T\\
A & 0
\end{bmatrix}
\begin{bmatrix}
x\\
0
\end{bmatrix} = 0$$

If the KKT matrix is singular, $x,z$ are not both zero, and we have
$$
\begin{bmatrix}
H & A^T\\
A & 0
\end{bmatrix}
\begin{bmatrix}
x\\
z
\end{bmatrix} = 0$$

This leads to $Hx + A^Tz = 0$, and $Ax = 0$. 
from the first equation, we have 
$$
x^THx + x^TA^Tz = 0.
$$
plug in the second equation.
we have $x^THx = 0$, this contradicts second condition unless $x= 0$.

## (b)
From previous question, we have $H + A^TA$ is positive definate matrix. Then we must hvae $R \in \mathbb R^{n\times n}$, s.t.
$$
R^T(H + A^TA)R = I
$$
Apply SVD to $AR$, $AR = U\Sigma V_1^T$, where $\Sigma = diag(\sigma_1,\cdots, \sigma_p)$. Denote $V_2 \in \mathbb R^{n\times(n-p)}$, s.t.
$$
V = \left[V_1\quad V_2\right]
$$
$V_2$ is orthogonal, and let 
$$
S = \left[\Sigma\quad 0\right] \in \mathbb R^{p\times n}
$$
We have $AR = USV^T$, then 
$$
V^TR^T(H + A^TA)RV = V^TR^THRV + S^TS = I
$$
Becasue $V^TR^THRV = I - S^TS$ is diagonal matrix, then denote it as $D$
$$
D = V^TR^THRV = diag(1-\sigma_1^2, \cdots, 1-\sigma_p^2, 1, \cdots, 1)
$$
Apply the transform to KKT matrix, we have

$$
\begin{bmatrix}
V^TR^T & 0\\
0 & U
\end{bmatrix}
\begin{bmatrix}
H & A^T\\
A & 0
\end{bmatrix}
\begin{bmatrix}
RV & 0\\
0 & U
\end{bmatrix}
=
\begin{bmatrix}
D & S^T\\
S & 0
\end{bmatrix}
$$
Apply the permutation to hte matrix on the right gives a block diaonal matrix with n diagonal blocks.
$$
\begin{bmatrix}
\lambda_i & \sigma_i\\
\sigma_i & 0
\end{bmatrix}
$$
where $i = 1,2,\cdots, p$, 

and $\lambda_i = 1$ for $i = p +1, \cdots, n$

This matrix have eiggenvalues of $\frac{\lambda_i \pm \sqrt{\lambda_i^2 + 4\sigma_i^2}}{2}$, i.e. one eigen value is positive, another is negative. 

In total, there are n positive evalues and p negative evalues.