<center><img src="Fig/Ensimag.png" width="30%" height="30%"></center>
<center><h3>Ensimag 2A</h3></center>
<hr>
<center><h1>Optimisation Numérique</h1></center>
<center><h2>TP3: Proximal Algorithms (2x1.5h)</h2></center>

# Structure of an optimization program

An optimization program can be practically divided into three parts:
* the *run* environment, in which you test, run your program, and display results.
* the *problem* part, which contains the function oracles, problem constraints, etc.
* the *algorithmic* part, where the algorithms are coded.

The main advantage of this division is that the parts are interchangeable: for instance, the algorithms of the third part can be reused on a variety of problems. This is why such a decomposition is widely used.
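
As a minimal sketch of this separation, the toy example below keeps the three parts apart. The names `quadratic`, `quadratic_grad`, and `gradient_descent` are hypothetical and do not come from the lab files; the point is only that the algorithm sees the problem through its oracles, so the same routine could be reused on another problem.

```python
import numpy as np

# --- "problem" part: oracles of a toy quadratic f(x) = 0.5 * ||x - c||^2 ---
c = np.array([1.0, -2.0])


def quadratic(x):
    """Function value oracle (hypothetical toy problem)."""
    return 0.5 * np.sum((x - c) ** 2)


def quadratic_grad(x):
    """Gradient oracle of the toy problem."""
    return x - c


# --- "algorithmic" part: only uses the oracles it is given ---
def gradient_descent(f_grad, x0, step, n_iter):
    """Plain fixed-step gradient descent."""
    x = np.copy(x0)
    for _ in range(n_iter):
        x = x - step * f_grad(x)
    return x


# --- "run" part: choose a problem, an algorithm, and display the result ---
x_final = gradient_descent(quadratic_grad, np.zeros(2), step=0.5, n_iter=50)
print(x_final, quadratic(x_final))
```

Swapping `quadratic_grad` for the gradient oracle of another problem would leave `gradient_descent` unchanged.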

In the present lab, you will use this division:
* `TP3_Proximal_algorithms.ipynb` will be the *run* environment
* `logistic_regression_2.ipynb` will be the considered *logistic regression problem* for this lab
* `algoProx.ipynb` will contain the proximal *algorithms* studied in this lab

---

The following script allows you to import *notebooks* as if they were *Python files*. It must be executed each time you launch Jupyter.

In [None]:
import start
from importlib import reload   # the `imp` module is deprecated and removed in Python 3.12

---

# Composite minimization for machine learning.

In this lab, we investigate the minimization of composite functions, made of a smooth part and a non-smooth part, using the proximal gradient algorithm on a practical machine learning problem: binary classification with logistic regression.

> Read the file `logistic_regression_2.ipynb` containing the problem explanation and simulators. 

> Implement the proximal operator of the $\ell_1$ norm used in the regularization. 
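
For reference, the proximal operator of a function $g$ with step $\gamma$ is $\mathrm{prox}_{\gamma g}(x) = \arg\min_u \{ g(u) + \frac{1}{2\gamma}\|u - x\|^2 \}$. For the $\ell_1$ norm it has a well-known closed form, soft-thresholding: each coordinate is shrunk toward zero by the threshold and set exactly to zero when its magnitude is below it. The sketch below assumes the regularizer is a pure $\ell_1$ term; the name `soft_threshold` and its signature are hypothetical, and the `g_prox` expected in `logistic_regression_2.ipynb` may differ.

```python
import numpy as np


def soft_threshold(x, threshold):
    """Proximal operator of threshold * ||.||_1 (hypothetical helper)."""
    return np.sign(x) * np.maximum(np.abs(x) - threshold, 0.0)


x = np.array([3.0, -0.5, 0.2, -2.0])
print(soft_threshold(x, 1.0))
```

With a threshold of 1, the two coordinates of magnitude below 1 are mapped to zero and the other two move toward zero by 1.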

> Implement the proximal gradient algorithm in the file `algoProx.ipynb` and test your algorithm below.
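
The proximal gradient iteration is $x_{k+1} = \mathrm{prox}_{\gamma g}(x_k - \gamma \nabla f(x_k))$. The sketch below runs it on a toy problem rather than on the lab's problem. Everything in it is an assumption made for illustration: the name `prox_grad_sketch`, the `g_prox(x, gamma)` signature, and the stopping test on the distance between successive iterates are not taken from `algoProx.ipynb`.

```python
import numpy as np


def prox_grad_sketch(f_grad, g_prox, x0, step, prec, ite_max):
    """Iterate x <- prox_{step*g}(x - step * grad f(x)) (hypothetical helper)."""
    x = np.copy(x0)
    x_tab = [np.copy(x)]
    for _ in range(ite_max):
        x_new = g_prox(x - step * f_grad(x), step)
        x_tab.append(np.copy(x_new))
        # Assumed stopping test: the iterates no longer move
        stalled = np.linalg.norm(x_new - x) <= prec
        x = x_new
        if stalled:
            break
    return x, np.array(x_tab)


# Toy problem: f(x) = 0.5 * ||x - b||^2 (smooth), g(x) = lam * ||x||_1
b = np.array([3.0, -0.5, 0.2])
lam = 1.0


def toy_f_grad(x):
    return x - b


def toy_g_prox(x, gamma):
    # Soft-thresholding with threshold gamma * lam
    return np.sign(x) * np.maximum(np.abs(x) - gamma * lam, 0.0)


x_toy, x_toy_tab = prox_grad_sketch(toy_f_grad, toy_g_prox,
                                    np.zeros(3), 1.0, 1e-8, 100)
print(x_toy)
```

On this toy problem the gradient of $f$ is 1-Lipschitz, so a step of 1 is admissible, and the two small coordinates of `b` end up exactly at zero.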


In [None]:
import algoProx             # load our algoProx module (from notebook)
reload(algoProx)            # reload the module if changed (and saved)
from algoProx import *      # import all methods of the module into the current environment

import numpy as np
import logistic_regression_2 as pb
reload(pb)

#### Parameters passed to the algorithm (see algoGradient.ipynb)
PREC    = 1e-5                     # Sought precision
ITE_MAX = 1000                      # Max number of iterations
x0      = np.zeros(pb.n)              # Initial point
step    = 1.0/pb.L

##### proximal gradient algorithm
x,x_tab = proximal_gradient_algorithm(pb.F , pb.f_grad , pb.g_prox , x0 , step , PREC, ITE_MAX , True)



> Investigate the decrease of the algorithm.

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

F = []
for i in range(x_tab.shape[0]):
    F.append(pb.F(x_tab[i])) 

plt.figure()
plt.plot(F, color="black", linewidth=1.0, linestyle="-")
plt.grid(True)
plt.show()

> Plot, with the following command, the support of the vector $x_k$ (i.e. one point for every non-null coordinate of $x_k$) versus the iterations. 

> What do you notice? Was it expected?

In [None]:
plt.figure()

for i in np.arange(0, x_tab.shape[0], max(1, x_tab.shape[0] // 40)):   # step >= 1 even with fewer than 40 iterates
    for j in range(pb.n):
        if np.abs(x_tab[i,j])>1e-14:
            plt.plot( i , j  , 'ko')

plt.grid(True)
plt.ylabel('Non-null Coordinates')
plt.xlabel('Nb. Iterations')
plt.ylim(-1,pb.d+1)
plt.yticks(np.arange(0,pb.d+1))
plt.show()

---

# Regularization path.


We saw above that the algorithm *selected* some coordinates while the others went to zero. For our machine learning task (see `logistic_regression_2.ipynb`), this means that the algorithm selects a subset of the features that will be used for the prediction step.  

> Change the parameter $\lambda_1$ of the problem (`pb.lam1`) in the code above and investigate how it influences the number of selected features.

In order to quantify the influence of this feature selection, let us consider the *regularization path*, that is, the support of the final points obtained by our minimization method as a function of $\lambda_1$.

> For $\lambda_1 = 2^{-12}, 2^{-11}, \ldots, 2^{1}$, run the proximal gradient algorithm on the obtained problem and store the support of the final point, the prediction performance on the *training set* (`pb.prediction_train`) and on the *testing set* (`pb.prediction_test`).
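
The grid of values above can be built from its exponents, and the support of a point can be read off with a small tolerance. This is only a sketch: the names `log2_lam`, `lam_grid`, and `x_example` are illustrative, the vector is made up, and the tolerance `1e-14` is an assumption borrowed from the support plot earlier in this notebook.

```python
import numpy as np

log2_lam = np.arange(-12, 2)          # exponents -12, -11, ..., 1
lam_grid = np.power(2.0, log2_lam)    # 2^-12, ..., 2^1

# Support of a point: indices of its (numerically) non-null coordinates
x_example = np.array([0.0, 0.7, 0.0, -1.2])   # made-up vector
support = np.flatnonzero(np.abs(x_example) > 1e-14)
print(len(lam_grid), lam_grid[0], lam_grid[-1], support)
```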

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

import algoProx             # load our algoProx module (from notebook)
reload(algoProx)            # reload the module if changed (and saved)
from algoProx import *      # import all methods of the module into the current environment

import numpy as np
import logistic_regression_2 as pb
reload(pb)

#### Parameters passed to the algorithm (see algoGradient.ipynb)
PREC    = 1e-5                     # Sought precision
ITE_MAX = 500                      # Max number of iterations
x0      = np.zeros(pb.n)              # Initial point
step    = 1.0/pb.L

# FILL HERE ########
reg_l1_tab = ...
pb.lam2 = 1e-1

train_perf = np.zeros(len(reg_l1_tab))
test_perf = np.zeros_like(train_perf)
x_tab = np.zeros((len(reg_l1_tab), pb.n))

for i, lam1 in enumerate(reg_l1_tab):
    pb.lam1 = lam1
    # FILL HERE #########
    # ###################


> Plot the *regularization path* and look at the meaning of each feature (file `student.txt` or `logistic_regression_2.ipynb`) to see which are the most important features of the dataset.

> (Bonus: you can do some text manipulation to put the labels on the plot as well).

In [None]:
plt.figure()
for i, feat in enumerate(x_tab.T):
    nonzeros = np.flatnonzero(feat)
    plt.scatter(nonzeros - 12, (i+1)*np.ones_like(nonzeros), color='k', marker='o')

plt.ylabel('Non-null Coordinates')

plt.ylim(-1,pb.d+1)
plt.yticks(np.arange(0,pb.d+1))
plt.xlabel("log2(lam1)")
plt.show()

> Plot the *training* and *testing* accuracies versus the value of $\lambda_1$.

In [None]:
log_lam = np.arange(-12, 2)   # exponents of lam1 = 2^-12, ..., 2^1
# FILL HERE ####
# ##############
plt.legend()
plt.show()

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

import algoProx             # load our algoProx module (from notebook)
reload(algoProx)            # reload the module if changed (and saved)
from algoProx import *      # import all methods of the module into the current environment

import numpy as np
import logistic_regression_2 as pb
reload(pb)

#### Parameters passed to the algorithm (see algoGradient.ipynb)
PREC    = 1e-3                     # Sought precision
ITE_MAX = 5000                     # Max number of iterations
x0      = np.zeros(pb.n)              # Initial point
step    = 1.0/pb.L

log_lam = np.arange(-7, 3)
reg_l2_tab = np.power( 2.0, log_lam )
pb.lam1 = 1e-5

train_perf = np.zeros(len(reg_l2_tab))
test_perf = np.zeros_like(train_perf)
x_tab = np.zeros((len(reg_l2_tab), pb.n))
x_tab_lst = []

for i, lam2 in enumerate(reg_l2_tab):
    pb.lam2 = lam2
    x, x_tab_alg = proximal_gradient_algorithm(pb.F , pb.f_grad , pb.g_prox , x0 , step , PREC, ITE_MAX, True)
    x_tab[i] = x
    x_tab_lst.append(x_tab_alg)
    _, train_perf[i] = pb.prediction_train(x, False)
    _, test_perf[i] = pb.prediction_test(x, False)

> Explore the proximal algorithm or propose ideas (cite your sources if you use pieces of literature) to change it or compare it to something else. Send the results to your favorite TA by zipping/tarballing/... your work and either:
> * sending it directly via email.
> * sending an email with an invitation to a PRIVATE repository (GitHub, GitLab, Bitbucket, etc.) containing your work.
>
> ### Guidelines:
> Write your own code; do not try to throw LLM nonsense at your TA.
> 
> Be original.
> 
> Write every idea you have in the notebook, as verbosely as possible.
>
> Write clean code, e.g. have a look at Python's PEP 8.