In [7]:
import numpy as np

Consider the function $f({\bf x}) = 5x_{0}^{2} + x_{1}^{2} + 4x_{0}x_{1} - 14x_{0} - 6x_{1} +20$.

Define
$$
{\bf x}=
\left[\begin{array}{cc}
x_{0}\\
x_{1}
\end{array}\right],
Q=
\left[\begin{array}{cc}
10&4\\
4&2
\end{array}\right]
\hbox{ and }
{\bf q}=
\left[\begin{array}{cc}
-14\\
-6
\end{array}\right].
$$

Then $f({\bf x})=\displaystyle \frac{1}{2}{\bf x}^tQ{\bf x}+{\bf q}^t {\bf x}+20$. 
<hr>

In [8]:
def functionValue(x):
    y=5*x[0,0]**2+x[1,0]**2+4*x[0,0]*x[1,0]-14*x[0,0]-6*x[1,0]+20
    return y

def functionGradient(x):
    d=[[10*x[0,0]+4*x[1,0]-14],[2*x[1,0]+4*x[0,0]-6]]
    d=np.array(d)
    return d

In [9]:
x = np.zeros((2,1))
x[0,0]=1
x[1,0]=1
functionValue(x)
functionGradient(x)

array([[0.],
       [0.]])

Since
$$
\frac{\partial f}{\partial x_0}=10x_0+4x_1-14
$$
and 
$$
\frac{\partial f}{\partial x_1}=2x_1+4x_0-6,
$$
the gradient $\nabla f(x)$ of $f(x)$ is
$$
\nabla f(x)=\left[\begin{array}{c} 10x_0+4x_1-14 \\ 2x_1+4x_0-6 \end{array}\right].
$$
<hr>

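From the definition of the gradient, $f$ is differentiable at ${\bf\bar{x}}$ if
$$f({\bf x}) = f({\bf\bar{x}}) + \nabla f({\bf\bar{x}})^{t}({\bf x}-{\bf\bar{x}}) + \| {\bf x}-{\bf\bar{x}} \| \alpha ({\bf\bar{x}},{\bf x}-{\bf\bar{x}}),$$
where $\alpha ({\bf\bar{x}},{\bf x}-{\bf\bar{x}}) \to 0$ as ${\bf x} \to {\bf\bar{x}}$. For the quadratic $f$ above, this expansion can be computed explicitly: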
$$f({\bf x}) - f({\bf\bar{x}}) = \frac{1}{2}{\bf x}^tQ{\bf x}+{\bf q}^t{\bf x} - \left(\frac{1}{2}{\bf\bar{x}}^tQ{\bf\bar{x}}+{\bf q}^t{\bf\bar{x}}\right)$$
$$
= \frac{1}{2}{\bf x}^tQ{\bf x}+{\bf q}^t({\bf x}-{\bf\bar{x}}) -{\bf\bar{x}}^tQ{\bf\bar{x}} +\frac{1}{2}{\bf\bar{x}}^tQ{\bf\bar{x}}+ {\bf\bar{x}}^tQ{\bf x}-{\bf\bar{x}}^tQ{\bf x}
$$
$$
= {\bf\bar{x}}^t Q({\bf x}-{\bf\bar{x}}) +{\bf q}^t({\bf x}-{\bf\bar{x}})+\frac{1}{2}{\bf x}^tQ{\bf x}  +\frac{1}{2}{\bf\bar{x}}^tQ{\bf\bar{x}}-{\bf\bar{x}}^tQ{\bf x}
$$
$$
= (Q{\bf\bar{x}})^t({\bf x}-{\bf\bar{x}}) +{\bf q}^t({\bf x}-{\bf\bar{x}})+\frac{1}{2}{\bf x}^tQ({\bf x}-{\bf\bar{x}})  -\frac{1}{2}{\bf\bar{x}}^tQ({\bf x} -{\bf\bar{x}})
$$
$$
= (Q{\bf\bar{x}}+{\bf q})^t({\bf x}-{\bf\bar{x}}) +\frac{1}{2}({\bf x}-{\bf\bar{x}})^tQ({\bf x}-{\bf\bar{x}})
$$
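
The identity just derived can be checked numerically. The following is a minimal sketch, assuming the $Q$ and ${\bf q}$ defined earlier; the helper name `f` and the randomly drawn test points `x` and `xbar` are illustrative choices, not part of the original notebook.

```python
import numpy as np

# Q and q as defined in the text
Q = np.array([[10.0, 4.0], [4.0, 2.0]])
q = np.array([[-14.0], [-6.0]])

def f(x):
    # f(x) = (1/2) x^t Q x + q^t x + 20, returned as a Python float
    return (0.5 * x.T @ Q @ x + q.T @ x).item() + 20.0

# Two arbitrary test points (assumption: any pair of points would do)
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 1))
xbar = rng.normal(size=(2, 1))

step = x - xbar
lhs = f(x) - f(xbar)
# (Q xbar + q)^t (x - xbar) + (1/2) (x - xbar)^t Q (x - xbar)
rhs = ((Q @ xbar + q).T @ step + 0.5 * step.T @ Q @ step).item()

print(lhs, rhs)
```

The two printed numbers should agree up to floating-point rounding, which is consistent with reading $Q{\bf\bar{x}}+{\bf q}$ as the gradient at ${\bf\bar{x}}$.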

Since
$$
\frac{\partial f}{\partial x_0}=10x_0+4x_1-14
$$
and 
$$
\frac{\partial f}{\partial x_1}=2x_1+4x_0-6,
$$
the gradient $\nabla f(x)$ of $f(x)$ is
$$
\nabla f(x_0,x_1)=\left[\begin{array}{c} 10x_0+4x_1-14 \\ 2x_1+4x_0-6 \end{array}\right].
$$
Moreover, 
$$
\nabla f(x_0,x_1)=Q\left[\begin{array}{c}x_0 \\ x_1 \end{array}\right]+q
$$
where 
$$
Q=\left[\begin{array}{cc}10 & 4\\4 & 2\end{array}\right], 
q=\left[\begin{array}{c}-14\\-6\end{array}\right].
$$
At the same time, the function $f(x_0,x_1) = 5x_{0}^{2} + x_{1}^{2} + 4x_{0}x_{1} - 14x_{0} - 6x_{1} +20$ can be rewritten as 
$$
f(x_0,x_1)=\frac{1}{2}\left[\begin{array}{cc}x_0 & x_1 \end{array}\right]
Q\left[\begin{array}{c}x_0 \\ x_1 \end{array}\right]+q^t\left[\begin{array}{c}x_0 \\ x_1 \end{array}\right]+20
$$
i.e.
$$
f({\bf x})=\frac{1}{2}{\bf x}^t Q{\bf x}+q^t{\bf x}+20\quad \text{and}\quad \nabla f({\bf x})=Q{\bf x}+q 
$$
$$
where ${\bf x}=\left[\begin{array}{c}x_0 \\ x_1 \end{array}\right]$.
<hr>
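
As a sanity check on $\nabla f({\bf x})=Q{\bf x}+q$, the analytic gradient can be compared with a central finite-difference approximation. This is only a sketch: the helper `numerical_grad` and the step size `h` are assumptions introduced here for illustration.

```python
import numpy as np

# Q and q as defined in the text
Q = np.array([[10.0, 4.0], [4.0, 2.0]])
q = np.array([[-14.0], [-6.0]])

def f(x):
    # f(x) = (1/2) x^t Q x + q^t x + 20, returned as a Python float
    return (0.5 * x.T @ Q @ x + q.T @ x).item() + 20.0

def grad(x):
    # Analytic gradient Q x + q as a column vector
    return Q @ x + q

def numerical_grad(x, h=1e-6):
    # Central differences, one coordinate at a time (hypothetical helper)
    g = np.zeros_like(x)
    for i in range(x.shape[0]):
        e = np.zeros_like(x)
        e[i, 0] = h
        g[i, 0] = (f(x + e) - f(x - e)) / (2 * h)
    return g

x = np.array([[2.0], [2.0]])
print(grad(x).ravel())
print(numerical_grad(x).ravel())
```

Both lines should print approximately the same vector; small differences come from rounding in the finite-difference quotient.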

In [12]:
Q=np.zeros((2,2))
Q[0,0]=10
Q[0,1]=4
Q[1,0]=4
Q[1,1]=2
print(Q)

[[10.  4.]
 [ 4.  2.]]


In [13]:
q=np.zeros((2,1))
q[0,0]=-14
q[1,0]=-6
print(q)

[[-14.]
 [ -6.]]


In [15]:
def functionValue(x,Q,q,f0):
    # Quadratic (1/2) x^t Q x + q^t x + f0; the result is a 1x1 array
    y=(1/2)*x.transpose().dot(Q).dot(x)+q.transpose().dot(x)+f0
    return y
x = np.zeros((2,1))
x[0,0]=1
x[1,0]=1
f0=20

print(functionValue(x,Q,q,f0))
print(functionValue(x,Q,q,f0)[0,0])


[[10.]]
10.0


In [17]:
def functionGradient(x,Q,q):
    # Gradient of the quadratic, Q x + q, returned as a column vector
    d=Q.dot(x)+q
    return d

In [19]:
x = np.zeros((2,1))
x[0,0]=2
x[1,0]=2
f0=20
print(functionValue(x,Q,q,f0))
print(functionValue(x,Q,q,f0)[0,0])
print(functionGradient(x,Q,q))
print(functionGradient(x,Q,q)[0,0])
print(functionGradient(x,Q,q)[1,0])


[[20.]]
20.0
[[14.]
 [ 6.]]
14.0
6.0


Unconstrained Problem:
\begin{align*}
    (\text{P}) \quad \min \quad & f({\bf x})  \\
    \text{s.t. } \quad  & {\bf x} \in X,
\end{align*}
where ${\bf x}=(x_{1},\cdots,x_{n}) \in \mathbb{R}^{n}$, $f:\mathbb{R}^{n} \rightarrow \mathbb{R}$, and $X$ is an open set (usually $X = \mathbb{R}^{n}$).
<hr>
Definition:

A direction ${\bf d}$ is called a descent direction of $f({\bf x})$ at ${\bf x} = \bar{\bf x}$ if there is an $\varepsilon>0$ such that for all $\lambda\in(0,\varepsilon)$
$$
f(\bar{\bf x}+\lambda {\bf d}) < f(\bar{\bf x}).
$$
<hr>

A *necessary condition* for local optimality is a statement of the form: "if $\bar{\bf x}$ is a local minimum of (P), then $\bar{\bf x}$ must satisfy $\ldots$". Such a condition helps us identify all candidates for local optima.
<hr>
Theorem:

Suppose that $f({\bf x})$ is differentiable at $\bar{\bf x}$. If there is a vector ${\bf d}$ such that $\nabla f(\bar{\bf x})^{t}{\bf d} < 0$,
then there is an $\varepsilon>0$ such that for all $\lambda\in(0,\varepsilon)$, $f(\bar{\bf x}+ \lambda {\bf d}) < f(\bar{\bf x})$, and hence ${\bf d}$ is a descent direction of $f({\bf x})$ at $\bar{\bf x}$.
<hr>
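
To illustrate the theorem, one can pick a direction with negative inner product against the gradient and watch $f$ decrease for a few small step lengths. This is a sketch only: the point `xbar`, the direction `d`, and the list `lams` are hypothetical choices made for the example.

```python
import numpy as np

# Q and q as defined in the text
Q = np.array([[10.0, 4.0], [4.0, 2.0]])
q = np.array([[-14.0], [-6.0]])

def f(x):
    # f(x) = (1/2) x^t Q x + q^t x + 20, returned as a Python float
    return (0.5 * x.T @ Q @ x + q.T @ x).item() + 20.0

def grad(x):
    # Gradient Q x + q as a column vector
    return Q @ x + q

xbar = np.array([[0.0], [10.0]])
d = np.array([[-1.0], [0.0]])      # illustrative direction, not the steepest one

# Sign of grad f(xbar)^t d decides whether the theorem applies
slope = (grad(xbar).T @ d).item()

# Compare f(xbar + lam * d) with f(xbar) for a few small step lengths
lams = [1e-1, 1e-2, 1e-3]
decreases = [f(xbar + lam * d) < f(xbar) for lam in lams]

print(slope, decreases)
```

A negative `slope` together with `True` for every tested step length is what the theorem predicts; the check covers only the sampled values of $\lambda$, not the whole interval $(0,\varepsilon)$.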


Theorem:

Suppose that $f({\bf x})$ is twice differentiable at $\bar{\bf x}$, with Hessian $H(\bar{\bf x})$. If $\nabla f(\bar{\bf x}) = 0$ and $H(\bar{\bf x})$ is positive definite, then $\bar{\bf x}$ is a (strict) local minimum.
<hr>
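
For the quadratic example, the Hessian is the constant matrix $Q$, so both conditions of the theorem can be checked directly. The sketch below assumes the $Q$ and $q$ defined earlier; the names `x_star` and `eigvals` are introduced here for illustration.

```python
import numpy as np

# Q and q as defined in the text
Q = np.array([[10.0, 4.0], [4.0, 2.0]])
q = np.array([[-14.0], [-6.0]])

# Stationary point: solve grad f(x) = Q x + q = 0
x_star = np.linalg.solve(Q, -q)

# Q is symmetric, so eigvalsh applies; all eigenvalues positive means positive definite
eigvals = np.linalg.eigvalsh(Q)

print(x_star.ravel())
print(eigvals)
```

If every eigenvalue is positive, the stationary point satisfies the sufficient condition above and is a strict local minimum.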

In [7]:
barx=[np.array([[0],[10]])]
print(barx[0].shape)
print(barx[0])

(2, 1)
[[ 0]
 [10]]


In [4]:
# evaluate f at barx[0] with the one-argument functionValue(x) defined first
functionValue(barx[0])

60

In [7]:
# steepest-descent direction at barx[0], using the one-argument functionGradient(x)
dk=[-functionGradient(barx[0])]
print(dk)

[array([[-26],
       [-14]])]


A natural consequence of this is the following algorithm, called the steepest
descent algorithm.
    
Step 0: Given ${\bf x}^{0}$, set $k:=0$

Step 1: ${\bf d}^{k}:= -\nabla f({\bf x}^{k})$. If ${\bf d}^{k}=0$, then stop.

Step 2: Solve $\displaystyle \min_{\alpha >0} f({\bf x}^{k} + \alpha {\bf d}^{k})$ for the step size $\alpha^{k}$, perhaps chosen by an exact or inexact line search.

Step 3: Set ${\bf x}^{k+1} \leftarrow {\bf x}^{k} + \alpha^{k} {\bf d}^{k}$, $k \leftarrow k+1$. Go to Step 1.

Note that, by Step 2 and the fact that ${\bf d}^{k} = - \nabla f({\bf x}^{k})$
is a descent direction whenever $\nabla f({\bf x}^{k}) \neq 0$, it follows that $f({\bf x}^{k+1}) < f({\bf x}^{k}).$
<hr>
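
The four steps above can be sketched for the quadratic example as follows. This is a minimal illustration rather than a definitive implementation: the function name `steepest_descent`, the stopping tolerance `tol` (used in place of the exact test ${\bf d}^{k}=0$), and the cap `max_iter` are assumptions, and the step size uses the exact line search available for quadratics, $\alpha^{k} = ({\bf d}^{k})^{t}{\bf d}^{k} / ({\bf d}^{k})^{t}Q{\bf d}^{k}$.

```python
import numpy as np

# Q and q as defined in the text
Q = np.array([[10.0, 4.0], [4.0, 2.0]])
q = np.array([[-14.0], [-6.0]])

def f(x):
    # f(x) = (1/2) x^t Q x + q^t x + 20, returned as a Python float
    return (0.5 * x.T @ Q @ x + q.T @ x).item() + 20.0

def grad(x):
    # Gradient Q x + q as a column vector
    return Q @ x + q

def steepest_descent(x0, tol=1e-8, max_iter=10000):
    # Step 0: start from x0
    x = np.asarray(x0, dtype=float)
    iterations = 0
    while iterations < max_iter:
        # Step 1: steepest-descent direction; stop when it is (numerically) zero
        d = -grad(x)
        if np.linalg.norm(d) <= tol:
            break
        # Step 2: exact line search for a quadratic with d = -grad f(x)
        alpha = (d.T @ d).item() / (d.T @ Q @ d).item()
        # Step 3: update the iterate and the counter
        x = x + alpha * d
        iterations += 1
    return x, iterations

x_min, n_iter = steepest_descent(np.array([[0.0], [10.0]]))
print(x_min.ravel(), n_iter, f(x_min))
```

Starting from the point $(0,10)$ used in the cells above, the iterates should approach the stationary point of $f$; the number of iterations depends on the tolerance and on the conditioning of $Q$.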

<hr>
Step 0: Given ${\bf x}^{0}$, set $k:=0$

<hr>
Step 1: ${\bf d}^{k}:= -\nabla f({\bf x}^{k})$.

<hr>
Step 2: Solve $\displaystyle \min_{\alpha >0} f({\bf x}^{k} + \alpha {\bf d}^{k})$ for the step size $\alpha^{k}$, perhaps chosen by an exact or inexact line search.

$$
\begin{array}{rcl}
f({\bf x}+\alpha {\bf d}) &= &5(x_{0}+\alpha d_{0})^{2} + (x_{1}+\alpha d_{1})^{2} + 4(x_{0}+\alpha d_{0})(x_{1}+\alpha d_{1}) - 14(x_{0}+\alpha d_{0}) - 6(x_{1}+\alpha d_{1}) +20\\
& = & (5d_{0}^{2}+d_{1}^{2}+4d_{0}d_{1}) \alpha^2 +(10x_{0}d_{0}+2x_{1}d_{1}+4x_{1}d_{0}+4x_{0}d_{1}-14d_{0}-6d_{1})\alpha +5x_{0}^{2} + x_{1}^{2} + 4x_{0}x_{1} - 14x_{0} - 6x_{1} +20\\
\end{array}
$$
Hence 
$$
\frac{d}{d \alpha} f({\bf x}+\alpha {\bf d})=2(5d_{0}^{2}+d_{1}^{2}+4d_{0}d_{1}) \alpha +10x_{0}d_{0}+2x_{1}d_{1}+4x_{1}d_{0}+4x_{0}d_{1}-14d_{0}-6d_{1}
$$
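and, since the coefficient of $\alpha^2$ equals $\frac{1}{2}{\bf d}^tQ{\bf d}$ and is positive for ${\bf d}\neq 0$ ($Q$ is positive definite), setting this derivative to zero gives the minimizer of $\displaystyle \min_{\alpha >0} f({\bf x}^{k} + \alpha {\bf d}^{k})$: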
$$
\alpha^{*}=-\frac{10x_{0}d_{0}+2x_{1}d_{1}+4x_{1}d_{0}+4x_{0}d_{1}-14d_{0}-6d_{1}}{2(5d_{0}^{2}+d_{1}^{2}+4d_{0}d_{1})}
$$
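
As a worked instance of this formula, the sketch below evaluates $\alpha^{*}$ at the point $(0,10)$ with ${\bf d}=-\nabla f({\bf x})$ and compares it with the equivalent matrix expression $\alpha^{*} = -\nabla f({\bf x})^{t}{\bf d} / ({\bf d}^{t}Q{\bf d})$. The variable names (`alpha_star`, `alpha_matrix`, `slope_next`) are illustrative assumptions. For ${\bf d}=-\nabla f({\bf x})\neq 0$ the numerator of the fraction is negative, so $\alpha^{*}>0$ as required.

```python
import numpy as np

# Q and q as defined in the text
Q = np.array([[10.0, 4.0], [4.0, 2.0]])
q = np.array([[-14.0], [-6.0]])

x = np.array([[0.0], [10.0]])
d = -(Q @ x + q)                 # steepest-descent direction at x

x0, x1 = x[:, 0]
d0, d1 = d[:, 0]

# Component-wise formula from the text
num = 10*x0*d0 + 2*x1*d1 + 4*x1*d0 + 4*x0*d1 - 14*d0 - 6*d1
den = 2*(5*d0**2 + d1**2 + 4*d0*d1)
alpha_star = -num / den

# Equivalent matrix form: -grad f(x)^t d / (d^t Q d)
alpha_matrix = -((Q @ x + q).T @ d).item() / (d.T @ Q @ d).item()

# At the exact minimizer along d, the directional derivative vanishes
x_next = x + alpha_star * d
slope_next = ((Q @ x_next + q).T @ d).item()

print(alpha_star, alpha_matrix, slope_next)
```

The two step sizes should coincide, and `slope_next` should be zero up to rounding, confirming that the new gradient is orthogonal to the search direction.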