### Optimizer description: ZSCG
**Name:** Zeroth-order Stochastic Conditional Gradient<br>
**Class:** zeroOptim.ClassicZSCG <br>
**Paper:** *Zeroth-order Nonconvex Stochastic Optimization: Handling Constraints, High-Dimensionality and Saddle-Points* (Krishnakumar Balasubramanian and Saeed Ghadimi) <br>


**Description:** <br>
At each iteration *k*, the Zeroth-order Stochastic Conditional Gradient method tries to minimize *F(z)* with these 3 main steps:

1. Estimate the gradient as follows:
$$G_{v}^{k} \equiv G_{v}(z_{k-1}, \xi_{k}, u_{k}) = \frac{1}{m_{k}} \sum_{j=1}^{m_{k}} \frac{F(z_{k-1} + vu_{k,j}, \xi_{k,j}) - F(z_{k-1}, \xi_{k,j})}{v}u_{k,j}$$
2. Solve this linear programming problem:
$$x_{k} = \arg\min_{u\in\chi}\langle G_{v}^{k}, u\rangle$$
3. Update z:
$$z_{k} = (1-\alpha_{k})z_{k-1} + \alpha_{k} x_{k}$$

where: <br>
$z_{k}$ is the optimization variable <br>
$\chi$ is the feasible (constraint) set <br>
$\xi_{k}$ is a sample drawn from our distribution <br>
$u_{k,j} \sim N(0, I_{d})$ <br>
$m_{k}$ is the number of Gaussian vectors to generate <br>
$\alpha_{k}$ is the momentum at step k <br>
$v$ is the Gaussian smoothing parameter <br>
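
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the `zeroOptim.ClassicZSCG` implementation: it uses only the standard library instead of torch, it uses a deterministic toy loss (so the sample $\xi$ is dropped), and it assumes the feasible set $\chi$ is an $L_\infty$ ball of radius `eps` around a starting point `x0`, clipped to the bounds `[lo, hi]`. The names `estimate_gradient`, `linear_minimizer`, `zscg` and `toy_loss` are hypothetical.

```python
import random


def estimate_gradient(f, z, v, m, rng):
    """Step 1: Gaussian-smoothing estimate of the gradient of f at z."""
    d = len(z)
    f_z = f(z)
    grad = [0.0] * d
    for _ in range(m):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]      # u ~ N(0, I_d)
        shifted = [zi + v * ui for zi, ui in zip(z, u)]
        scale = (f(shifted) - f_z) / v                   # finite difference along u
        for i in range(d):
            grad[i] += scale * u[i] / m
    return grad


def linear_minimizer(grad, x0, eps, lo, hi):
    """Step 2: argmin of <grad, u> over the box |u - x0|_inf <= eps, lo <= u <= hi."""
    x = []
    for g, c in zip(grad, x0):
        lower, upper = max(c - eps, lo), min(c + eps, hi)
        # A linear function over a box is minimized coordinate-wise at a bound.
        x.append(lower if g > 0 else upper)
    return x


def zscg(f, x0, v, m, alpha, eps, lo=0.0, hi=1.0, steps=20, seed=42):
    """Run a fixed number of ZSCG iterations starting from x0."""
    rng = random.Random(seed)
    z = list(x0)
    for _ in range(steps):
        grad = estimate_gradient(f, z, v, m, rng)
        x = linear_minimizer(grad, x0, eps, lo, hi)
        # Step 3: a convex combination keeps z inside the feasible set.
        z = [(1 - alpha) * zi + alpha * xi for zi, xi in zip(z, x)]
    return z


def toy_loss(z):
    # Toy objective (assumption): squared distance to the all-ones point.
    return sum((zi - 1.0) ** 2 for zi in z)


x0 = [0.5, 0.5, 0.5, 0.5]
z_final = zscg(toy_loss, x0, v=1e-3, m=500, alpha=0.2, eps=0.1)
print(toy_loss(x0), toy_loss(z_final))
```

For an $L_\infty$ constraint the linear problem in step 2 has this closed-form solution, so no LP solver is needed; other norms would need a different `linear_minimizer`.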

**Args:**

        Name            Type                Description
        x               (torch.tensor)      The variable of our optimization problem. Should be a 3D tensor (img)
        v               (float)             The Gaussian smoothing parameter
        n_gradient      (list)              Number of normal vectors to generate at every step
        ak              (list)              Momentum at every step
        epsilon         (float)             The upper bound of the norm
        L_type          (int)               Either -1 for L_infinity or x for Lx. Default is -1
        batch_size      (int)               Maximum parallelization during the gradient estimation. Default is -1 (=n_grad)
        C               (tuple)             The boundaries of the pixel values. Default is (0, 1)
        max_steps       (int)               The maximum number of steps. Default is 100
        verbose         (int)               Display information or not. Default is 0
        additional_out  (bool)              Also return all the x. Default is False
        tqdm_disable    (bool)              Disable the tqdm bar. Default is False


     
     
**Suggested values:** <br>
$v = \sqrt{\frac{2B_{L_{\sigma}}}{N(d+3)^3}}$, 
$\alpha_{k} =\frac{1}{\sqrt{N}}$,
$m_{k} = 2B_{L_{\sigma}}(d + 5)N$,
$\forall k \geq 1$

where:<br>
- *N* is the number of steps <br>
- *d* is the dimension of *x* <br>
- $\sigma^2$ bounds the variance of the stochastic gradient <br>
- *L* is the Lipschitz constant of the gradient <br>
- $B \geq ||\nabla f(x)||, \forall x \in \chi$ <br>
- $B_{L_{\sigma}} \geq \max\bigg\{\sqrt{\frac{B^2 + \sigma^2}{L}}, 1\bigg\}$

**Empirical values:** <br>
For MNIST we can set $N = 100$ and $B_{L_{\sigma}} = 1$; each image is 28 × 28 ($d = 784$), so:<br>
- $v \approx 6.4 \times 10^{-6}$ (on the order of $10^{-5}$)
- $\alpha_{k} = 0.1$
- $m_{k} = 157800$
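
The arithmetic can be checked by evaluating the suggested-value formulas directly. The helper below is a hypothetical convenience function, not part of `zeroOptim`; it assumes nothing beyond the three formulas given under *Suggested values*.

```python
import math


def suggested_values(n_steps, d, b_l_sigma=1.0):
    """Evaluate the suggested v, alpha_k and m_k for N steps in dimension d."""
    v = math.sqrt(2 * b_l_sigma / (n_steps * (d + 3) ** 3))
    alpha = 1 / math.sqrt(n_steps)
    m = 2 * b_l_sigma * (d + 5) * n_steps
    return v, alpha, m


# MNIST: 28 x 28 images, 100 steps, B_{L_sigma} = 1
v, alpha, m = suggested_values(n_steps=100, d=28 * 28)
print(f"v = {v:.2e}, alpha = {alpha}, m = {m:.0f}")
```

Note how quickly $m_{k}$ grows: it scales linearly with both the dimension and the number of steps, which is why much smaller values are worth trying in practice.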

**N.B** <br>
In practice, $\alpha_{k}$ can be set higher (e.g. 0.2) and $m_{k}$ much lower (e.g. 8000), and $m_{k}$ does not need to depend on the number of steps *N*.



### Results

The success rate turned out to be fairly independent of the Gaussian smoothing parameter *v* and of the momentum *alpha*; good values are 0.001 and 0.2 respectively. For this reason the results below vary only *epsilon* (the upper bound of the norm) and *n_gradient* (the number of function evaluations per step).

The maximum number of steps was set to 100.

**N.B**
1. All the results were obtained in the *Google Colab environment* using the available *Tesla K80* GPU. <br>
2. All the results were obtained with the torch random seed set to *42*. <br>
3. The results were obtained using 100 random samples. For targeted attacks with a high number of model calls per gradient evaluation (*n_gradient*), 20 random examples were used instead. Results obtained with only 20 examples are marked with '*'.
4. In the *target* case, every sampled image is used in a targeted attack against each of the other categories. For this reason, a sample of 20 images on *MNIST* means performing 180 attacks (20 images × 9 other classes).
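
The attack count in point 4 follows from pairing every sampled image with every class other than its own. A minimal sketch, where `targeted_pairs` is a hypothetical helper and the labels are made-up placeholders rather than the samples used in these experiments:

```python
def targeted_pairs(labels, n_classes):
    """Return (sample index, target class) for every class other than the true one."""
    return [
        (i, target)
        for i, label in enumerate(labels)
        for target in range(n_classes)
        if target != label
    ]


# 20 placeholder labels for a 10-class problem such as MNIST
labels = [i % 10 for i in range(20)]
pairs = targeted_pairs(labels, n_classes=10)
print(len(pairs))  # 20 samples x 9 other classes = 180
```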


**1. MNIST**
    
    1.a) Untargeted
        
         1.a.i)  Infinity norm:
         
                 
                 Epsilon           Success rate                       Avg time                          n_gradient          
                 _________________________________________________________________________________________________
                 
                 0.25               0.94                               0.21                                500
                                    0.98                               0.21                               1000
                                    0.99                               0.26                               2000
                                    1.00                               0.39                               4000
                                    1.00                               0.51                               8000
                                    1.00                               0.83                              16000
                                   ________________________________________________________________________________ 
                 
                 0.20               0.78                               0.48                                500
                                    0.83                               0.65                               1000
                                    0.90                               0.76                               2000
                                    0.97                               0.86                               4000
                                    0.97                               1.09                               8000
                                    0.97                               1.92                              16000
                                   ________________________________________________________________________________
                           
                 0.15               0.58                               0.83                                500
                                    0.61                               1.26                               1000
                                    0.69                               1.72                               2000
                                    0.74                               2.79                               4000
                                    0.77                               3.63                               8000
                                    0.82                               5.36                              16000                 
                                   _______________________________________________________________________________
                 
                 0.10               0.33                               1.23                                500
                                    0.40                               1.86                               1000
                                    0.43                               2.82                               2000
                                    0.48                               4.95                               4000
                                    0.51                               6.84                               8000
                                    0.53                              11.22                              16000
                                   _______________________________________________________________________________
                           
                 0.05               0.11                               1.58                                500
                                    0.13                               2.58                               1000
                                    0.15                               4.01                               2000
                                    0.17                               7.56                               4000
                                    0.18                              10.87                               8000
                                    0.19                              18.29                              16000

                                    
                                    
                                    
         1.a.ii) L2 norm:
         
                 Epsilon           Success rate      Avg time      n_gradient      alpha       max_epochs     v        
                 _________________________________________________________________________________________________
                 
                 0.2
                 
                 
                 
    1.b) Targeted
    
        1.b.i)  Infinity norm:


                 Epsilon           Success rate                       Avg time                          n_gradient     
                 ____________________________________________________________________________________________________           
                                   
                 0.5               0.97                                 0.17                              500
                                   0.99                                 0.21                             1000          
                                   0.99                                 0.28                             2000          
                                   0.99                                 0.51                             4000          
                                   0.99                                 0.70                             8000  
                                   ___________________________________________________________________________________      
                 
                 
                 0.4               0.89                                 0.34                              500
                                   0.94                                 0.40                             1000          
                                   0.96                                 0.49                             2000          
                                   0.97                                 0.84                             4000          
                                   0.97                                 1.14                             8000  
                                   ___________________________________________________________________________________
                                   
                 0.3               0.64                                 0.75                              500
                                   0.75                                 0.96                             1000          
                                   0.82                                 1.21                             2000          
                                   0.85                                 2.04                             4000          
                                   0.89*                                2.51                             8000
                                   0.88*                                4.21                            16000
                                   ___________________________________________________________________________________
                                   
                 0.2               0.29*                                1.31                              500
                                   0.35*                                2.02                             1000          
                                   0.42*                                2.91                             2000          
                                   0.49*                                5.03                             4000          
                                   0.56*                                6.63                             8000
                                   ----*                                ----                            16000

                           