# Optimization
## Subgradient method
A **subgradient** for a *convex* function $f:R^n\rightarrow R$ at point $x_0$ is a vector $g\in R^n$ such that:
<br> $f(x)\ge f(x_0)+g^T(x-x_0)$ for all $x\in R^n$
<br> The set of all subgradients of $f$ at point $x_0$ is called the **subdifferential** of $f$ at $x_0$, which is denoted by $\partial f(x_0)$.
<br> **Hint 1:** If $f$ is **differentiable**, then the subdifferential of $f$ at point $x_0$ will be $\{\nabla f(x_0)\}$.
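
<br> As an illustrative sketch (not part of the original notebook code), the inequality above can be checked numerically for $f(x)=|x|$ at $x_0=0$, where the subdifferential is the interval $[-1,1]$. The helper name `is_subgradient`, the sample grid, and the tolerance are assumptions made for this example. A check on a finite grid can only refute a candidate; it cannot prove that the inequality holds for all $x$.

```python
import numpy as np

def f(x):
    # f(x) = |x| is convex but not differentiable at x = 0
    return np.abs(x)

def is_subgradient(g, x0, xs, tol=1e-12):
    # test f(x) >= f(x0) + g*(x - x0) on every sample point in xs
    return bool(np.all(f(xs) >= f(x0) + g * (xs - x0) - tol))

x0 = 0.0
xs = np.linspace(-2.0, 2.0, 401)  # sample grid (an assumption of this sketch)

# candidates inside [-1, 1] pass the test, candidates outside fail it
inside = [is_subgradient(g, x0, xs) for g in (-1.0, -0.3, 0.0, 0.7, 1.0)]
outside = [is_subgradient(g, x0, xs) for g in (-1.5, 1.2)]
print('inside [-1,1]:', inside)
print('outside [-1,1]:', outside)
```
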
<br>The **subgradient method** extends *gradient descent* to convex functions that are not necessarily differentiable:
<br>
<br> $x_0\leftarrow$ some initial guess
<br> for $k$ in range(iter):
<br>$\;\;x_{k+1}=x_k-\eta_k g_k$
<br>
<br> where $g_k$ is any subgradient of $f$ at point $x_k$.
<br>Here, $\eta_k\ge0$ is the **step size** (learning rate). It can be kept constant, or it can be decreased as the iteration $k$ grows. Alternatively, a fixed step *length* can be used by setting $\eta_k=\frac{\eta_0}{\|g_k\|_2}$, where $\eta_0$ is a positive number; every update then moves $x_k$ by a distance of exactly $\eta_0$ (this requires $g_k\ne0$).
<br> **Hint 2:** We should keep track of the best solution found so far, since a step along a negative subgradient is not always a descent direction and the objective value may increase from one iteration to the next.
<br>**Reminder:** In the examples here, we use the fixed-length step size for the subgradient method, because we found it more stable in our experiments.
<br>The Python code is available at: https://github.com/ostad-ai/Optimization
<br> Explanation: https://www.pinterest.com/HamedShahHosseini/Optimization 
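
<br> The three step-size choices described above (constant, decreasing with $k$, and fixed length) can be sketched as a small helper. The function name `step_size`, the rule names, and the particular decreasing schedule $\eta_0/(k+1)$ are assumptions chosen for illustration; other decreasing schedules are possible.

```python
import numpy as np

def step_size(rule, k, gk, eta0=0.01):
    """Return eta_k for iteration k (0-based) and subgradient gk."""
    if rule == "constant":
        return eta0                       # same step size at every iteration
    if rule == "diminishing":
        return eta0 / (k + 1)             # one possible decreasing schedule
    if rule == "fixed_length":
        return eta0 / np.linalg.norm(gk)  # assumes gk is not the zero vector
    raise ValueError(f"unknown rule: {rule}")

gk = np.array([3.0, -2.0])  # an example subgradient
for rule in ("constant", "diminishing", "fixed_length"):
    eta = step_size(rule, k=4, gk=gk)
    # the length of the update eta*gk equals eta0 only for the fixed-length rule
    print(f"{rule:>12}: eta={eta:.6f}, step length={np.linalg.norm(eta * gk):.6f}")
```

With the fixed-length rule the size of the update does not depend on the magnitude of $g_k$, which is one plausible reason it behaves more predictably on functions whose subgradients do not shrink near the minimum.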

In [1]:
# importing required modules
import numpy as np

In [2]:
# the function f(x,y)=|3*x-2|+2*|y-5|, whose minimum we seek
def func(z):
    x,y=z.flatten()
    return np.abs(3*x-2)+2*np.abs(y-5)

# this function returns one subgradient of the function defined above
# numpy.sign(0) returns 0, which is a valid subgradient of |t| at t=0
def subg_func(z):
    x,y=z.flatten()
    return np.array([3*np.sign(3*x-2),
            2*np.sign(y-5)]).reshape(-1,1)

iter=1000  # number of iterations
etta0=.01  # step length parameter (eta_0)
x=np.array([1,1]).reshape(-1,1)  # initial guess
# keep best solution found so far
xbest=x.copy(); fbest=func(xbest)

# subgradient loop
for _ in range(iter):
    gk=subg_func(x)
    etta=etta0/np.sqrt(np.sum(gk**2))
    x=x-etta*gk
    fnow=func(x)
    if fnow<fbest:
        xbest=x.copy()
        fbest=fnow
        
print('The optimum solution is: [2/3,5] with objective value=0')   
print(50*'-')
print(f'The best solution by subgradient method: {xbest.flatten()},'+\
      f'\nwhich makes the objective function {fbest}')

The optimum solution is: [2/3,5] with objective value=0
--------------------------------------------------
The best solution by subgradient method: [0.66717988 5.00493542],
which makes the objective function 0.01141048028687841
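
To illustrate Hint 2, the following sketch repeats the first example with the same function, starting point, and step length parameter, but records the objective value at every iteration and counts how often it went up. The names `history` and `n_increases`, the use of flat arrays instead of column vectors, and the guard for a zero subgradient are additions made for this illustration.

```python
import numpy as np

def func(z):
    x, y = z
    return np.abs(3*x - 2) + 2*np.abs(y - 5)

def subg_func(z):
    x, y = z
    return np.array([3*np.sign(3*x - 2), 2*np.sign(y - 5)])

eta0 = 0.01                 # step length parameter
x = np.array([1.0, 1.0])    # initial guess
history = [func(x)]         # objective value at every iterate

for _ in range(1000):
    gk = subg_func(x)
    norm = np.sqrt(np.sum(gk**2))
    if norm == 0:           # a zero subgradient means x is already a minimizer
        break
    x = x - (eta0/norm)*gk
    history.append(func(x))

history = np.array(history)
# count the iterations in which the objective increased
n_increases = int(np.sum(np.diff(history) > 0))
fbest = history.min()
print(f'iterations with an increase in the objective: {n_increases}')
print(f'best objective value: {fbest}, last objective value: {history[-1]}')
```

Because the iterate overshoots the kink at $x=2/3$ and then oscillates around it, the objective is not monotone, which is why the best value is tracked separately from the last one.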


<hr>
In the second example of this notebook, we use the subgradient method to find the minimum of the following function:
<br> $\|Ax-b\|_1$
<br>where $A$ is a matrix, $b$ is a column vector, and $x$ is the unknown vector. The norm is the sum of the absolute values of the entries of $Ax-b$.
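
<br> A short derivation, added here as a sketch, of why $A^T\,\mathrm{sign}(Ax-b)$ (the vector returned by `subg_func2` in the code) is a subgradient of the sum of absolute residuals. The symbols $a_i^T$ (the $i$-th row of $A$), $r_i=a_i^Tx-b_i$, and $s_i=\mathrm{sign}(r_i)$ are notation introduced only for this derivation.
<br> Since $s_i$ is a subgradient of $|\cdot|$ at $r_i$ (including $s_i=0$ when $r_i=0$), for every $y$ we have:

$$|a_i^Ty-b_i|\;\ge\;|r_i|+s_i\,\big((a_i^Ty-b_i)-r_i\big)\;=\;|a_i^Tx-b_i|+s_i\,a_i^T(y-x)$$

<br> Summing over all rows $i$ gives:

$$\sum_i|a_i^Ty-b_i|\;\ge\;\sum_i|a_i^Tx-b_i|+\big(A^Ts\big)^T(y-x)$$

<br> which is exactly the subgradient inequality with $g=A^Ts=A^T\,\mathrm{sign}(Ax-b)$.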

In [3]:
# example of defining matrix A and vector b
A=np.array([[1,2],[3,2]])
b=np.array([6,4]).reshape(-1,1)
# A is square and invertible here, so the minimizer is the exact solution of Ax=b
xstar=np.linalg.inv(A)@b

In [4]:
# objective function: sum of the absolute values of the entries of Ax-b
def func2(x,A,b):
    return np.sum(np.abs(A@x-b))

# one subgradient of the objective above: A^T sign(Ax-b)
def subg_func2(x,A,b):
    return A.T@np.sign(A@x-b)

# minimizing the sum of absolute residuals of Ax-b by the subgradient method
iter=1000 # number of iterations
x=np.array([1,1]).reshape(-1,1) # initial guess
etta0=.01  # step length parameter
# best solution found so far
xbest=x.copy();fbest=func2(xbest,A,b)
# subgradient loop
for k in range(iter):
    # alternative (disabled): step size decreasing with k
    #etta=etta0/(k+1)
    gk=subg_func2(x,A,b)
    etta=etta0/np.sqrt(np.sum(gk**2))
    x=x-etta*gk
    fnow=func2(x,A,b)
    if fnow<fbest:
        xbest=x.copy(); fbest=fnow

print(f'The optimum solution is {xstar.flatten()} with objective value={func2(xstar,A,b)}')
print(50*'-')
print(f'The solution found by subgradient method is {xbest.flatten()}'+\
      f'\nwhich provides the objective value={fbest}')

The optimum solution is [-1.   3.5] with objective value=8.881784197001252e-16
--------------------------------------------------
The solution found by subgradient method is [-0.99391306  3.49608694]
which provides the objective value=0.012173875177023064
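
As a sanity check on the subgradient used in this example, the sketch below tests the subgradient inequality $f(y)\ge f(x)+g^T(y-x)$ at randomly drawn pairs of points. The random seed, the sampling scale, the number of trials, and the tolerance are assumptions of this sketch; passing the test on random samples is supporting evidence only, not a proof.

```python
import numpy as np

A = np.array([[1, 2], [3, 2]])
b = np.array([6, 4]).reshape(-1, 1)

def func2(x, A, b):
    # sum of the absolute values of the entries of Ax-b
    return np.sum(np.abs(A @ x - b))

def subg_func2(x, A, b):
    # one subgradient of func2 at x
    return A.T @ np.sign(A @ x - b)

rng = np.random.default_rng(0)
violations = 0
for _ in range(1000):
    x = rng.normal(scale=5.0, size=(2, 1))
    y = rng.normal(scale=5.0, size=(2, 1))
    g = subg_func2(x, A, b)
    # right-hand side of the subgradient inequality at the pair (x, y)
    lower_bound = func2(x, A, b) + (g.T @ (y - x)).item()
    if func2(y, A, b) < lower_bound - 1e-9:
        violations += 1

print(f'violations of the subgradient inequality: {violations}')
```
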
