# LAB 3: NUMBA 
### SGD Optimisation with Numba
### AI and Machine Learning // Suchkova Natalia М8О-114М-22
09.11.2022 @ MAI IT-Center

## Table of Contents

1. [**Without Numba**](#gen1)

    - [Classical GD](#gen11) 
        + [Himmelblau](#gen111)
        + [Rastrigin](#gen112)
        + [Rosenbrock](#gen113)

    - [AdaGrad](#gen12)
        + [Himmelblau](#gen121)
        + [Rastrigin](#gen122)
        + [Rosenbrock](#gen123)


2. [**With Numba**](#gen2)

    - [Classical GD](#gen21) 
        + [Himmelblau](#gen211)
        + [Rastrigin](#gen212)
        + [Rosenbrock](#gen213)

    - [AdaGrad](#gen22)
        + [Himmelblau](#gen221)
        + [Rastrigin](#gen222)
        + [Rosenbrock](#gen223)



3. [**Result Comparison Table**](#gen3)



In [190]:
import numpy as np
import pandas as pd
from numba import njit

from typing import Tuple, Mapping

import matplotlib.pyplot as plt
from IPython import display

In [191]:
res_df = pd.DataFrame(columns=['Problem', 'Method', 'Time w/o Numba', 'Numba Time']) # DataFrame for collecting the timing results

<a id='gen1'></a>

## Without Numba

In [192]:
class Himmelblau:
    
    def function(x: np.ndarray) -> np.float64:
        return (x[0]**2 + x[1] - 11)**2 + (x[0] + x[1]**2 - 7)**2
    
    def gradient(x: np.ndarray) -> np.array:
        return np.array([4 * x[0] * (x[0]**2 + x[1] - 11) \
                         + 2 * (x[0] + x[1]**2 - 7),
                         2 * (x[0]**2 + x[1] - 11) \
                         + 4 * x[1] * (x[0] + x[1]**2 - 7)])

class Rosenbrock: 
    
    def function(x: np.ndarray, b: int = 100) -> np.float64:
        return b * (x[1] - x[0]**2)**2 + (x[0] - 1)**2

    def gradient(x: np.ndarray, b: int = 100) -> np.array:

        return np.array([2 * (x[0] - 1) \
                         - 4 * b * x[0] * (x[1] - x[0]**2),
                         2 * b * (x[1] - x[0]**2)])
    
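class Rastrigin:
    
    def function(x: np.ndarray, A: int = 10) -> np.float64:
        # NOTE: returns only the first coordinate's term and omits the
        # constant offset, so the value at x = 0 is -A rather than 0.
        return list(x**2 - A * np.cos(2 * np.pi * x))[0] # + A
    
    def gradient(x: np.ndarray, A: int = 10) -> np.ndarray:
        # Element-wise derivative of x**2 - A * cos(2 * pi * x)
        return 2 * x + A * 2 * np.pi * np.sin(2 * np.pi * x)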
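The analytic gradients above can be sanity-checked against central finite differences. The sketch below is an illustration only: `numerical_gradient` and `rastrigin_full` are hypothetical helper names, and `rastrigin_full` uses the textbook n-dimensional form $f(x) = A n + \sum_i (x_i^2 - A\cos 2\pi x_i)$ rather than the one-coordinate version used in this notebook.

```python
import numpy as np

def himmelblau(x):
    return (x[0]**2 + x[1] - 11)**2 + (x[0] + x[1]**2 - 7)**2

def himmelblau_grad(x):
    return np.array([4 * x[0] * (x[0]**2 + x[1] - 11) + 2 * (x[0] + x[1]**2 - 7),
                     2 * (x[0]**2 + x[1] - 11) + 4 * x[1] * (x[0] + x[1]**2 - 7)])

def rastrigin_full(x, A=10.0):
    # Textbook form: sums over all coordinates and includes the A * n offset
    return A * x.size + np.sum(x**2 - A * np.cos(2 * np.pi * x))

def rastrigin_grad(x, A=10.0):
    return 2 * x + 2 * np.pi * A * np.sin(2 * np.pi * x)

def numerical_gradient(f, x, h=1e-6):
    # Central differences, one coordinate at a time
    grad = np.zeros_like(x, dtype=np.float64)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=np.float64)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2 * h)
    return grad

point = np.array([0.3, -1.2])
himmelblau_error = np.max(np.abs(himmelblau_grad(point)
                                 - numerical_gradient(himmelblau, point)))
rastrigin_error = np.max(np.abs(rastrigin_grad(point)
                                - numerical_gradient(rastrigin_full, point)))
print(himmelblau_error, rastrigin_error)
```

Both errors should be tiny compared with the gradient magnitudes; a large value would point to a mistake in the analytic formula.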

Let's simplify the previous version of the code a little further.

In [193]:
# def classic_GD (
#                 function: Mapping, gradient_of_function: Mapping,
#                 start: np.ndarray, learning_rate: float = 0.01, 
#                 n_iter: int = 100, tolerance: float = 1e-5
#                 ) -> Tuple [np.ndarray, np.float32]:
    
#     """ 
#     Args:
#         function (Mapping): function to minimise
#         gradient_of_function (Mapping): gradient of the function being minimised
#         start (np.ndarray): random starting point
#         learning_rate (float): step size
#         n_iter (int): number of gradient descent iterations
#         tolerance (float): smallest allowed change
#                     in the value being minimised
#     Return:
#         tuple with found minimum coordinates, found minimum function value, 
#         and plotting data list with multiple np.ndarray aka dots for plotting
    
#    """ 
        
#     current_point = start.copy()
#     current_point = current_point.astype('float64')

#     for iter in range(n_iter):
#         diff = learning_rate * -gradient_of_function(current_point)

#         if np.all(np.abs(diff) <= tolerance):
#             print(f'\033[1mEarly stopping!!\033[0m')
#             break

#         iter_count = iter + 1    
#         current_point += diff

#     print(f' Finished in {iter_count} iterations\n', \
#           f'Minimum point coordinates \033[1m{current_point}\033[0m,', \
#           f'with function value = \033[1m{round(function(current_point), 4)}\033[0m')


#     return current_point, function(current_point)

<a id='gen11'></a>

In [194]:
def classic_GD (
                function: Mapping, gradient_of_function: Mapping,
                start: np.ndarray, learning_rate: float = 0.01, 
                n_iter: int = 100,
                ) -> Tuple [np.ndarray, np.float32]:
    
    """ 
    Args:
        function (Mapping): function to minimise
        gradient_of_function (Mapping): gradient of the function being minimised
        start (np.ndarray): random starting point
        learning_rate (float): step size
        n_iter (int): number of gradient descent iterations

    Return:
        tuple with the coordinates of the point reached and the function
        value at that point
    
   """    
    current_point = start.copy()
    current_point = current_point.astype('float64')

    for iter in range(n_iter):   
        current_point = current_point - learning_rate * gradient_of_function(current_point)

    return current_point, function(current_point)
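The simplified `classic_GD` always runs the full `n_iter` iterations. For comparison, here is a minimal sketch of the tolerance-based early stopping that the commented-out version describes. The name `gd_with_tolerance`, the step counter it returns, and the `sphere` test function are illustrative assumptions, not part of the original lab.

```python
import numpy as np

def gd_with_tolerance(function, gradient, start,
                      learning_rate=0.01, n_iter=100, tolerance=1e-5):
    point = np.asarray(start, dtype=np.float64).copy()
    steps_taken = 0  # initialised up front so it is defined even on an immediate stop
    for _ in range(n_iter):
        step = -learning_rate * gradient(point)
        # Stop once every component of the update is within tolerance
        if np.all(np.abs(step) <= tolerance):
            break
        point += step
        steps_taken += 1
    return point, function(point), steps_taken

def sphere(x):
    return float(np.sum(x**2))

def sphere_grad(x):
    return 2 * x

minimum, value, steps_taken = gd_with_tolerance(
    sphere, sphere_grad, np.array([1.0, -2.0]),
    learning_rate=0.1, n_iter=1000)
print(minimum, value, steps_taken)
```

On this quadratic the loop stops well before the iteration limit, which is the behaviour the `tolerance` argument was meant to provide.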

In [195]:
# wall time

<a id='gen111'></a>
#### Himmelblau

In [196]:
%%time 
GD_Him = classic_GD(Himmelblau.function, Himmelblau.gradient, np.array([-100, -100]), learning_rate=0.00001, n_iter=7000)

Wall time: 53.9 ms


In [197]:
print(f'Minimum point coordinates \033[1m{GD_Him[0]}\033[0m,', \
      f'with function value = \033[1m{round(GD_Him[1], 4)}\033[0m')

Minimum point coordinates [-3.78698073 -3.29532809], with function value = 0.0073


In [198]:
# mean wall-clock time per loop (%timeit default)
time = %timeit -o classic_GD(Himmelblau.function, Himmelblau.gradient, np.array([-100, -100]), learning_rate=0.00001, n_iter=7000)

39.5 ms ± 1.01 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [199]:
data = ['Himmelblau', 'Classical GD', str(time)[:25], '']
res_df.loc[len(res_df)] = data

In [200]:
res_df

| | Problem | Method | Time w/o Numba | Numba Time |
|---|---|---|---|---|
| 0 | Himmelblau | Classical GD | 39.5 ms ± 1.01 ms per loo | |


<a id='gen112'></a>
#### Rastrigin

In [201]:
%%time
GD_Rast = classic_GD(Rastrigin.function, Rastrigin.gradient, np.array([100, 100]), learning_rate=0.1, n_iter=9000)

Wall time: 43.9 ms


In [202]:
print(f'Minimum point coordinates \033[1m{GD_Rast[0]}\033[0m,', \
      f'with function value = \033[1m{round(GD_Rast[1], 4)}\033[0m')

Minimum point coordinates [0.76548588 0.76548588], with function value = -0.3855


In [203]:
# mean wall-clock time per loop (%timeit default)
time = %timeit -o classic_GD(Rastrigin.function, Rastrigin.gradient, np.array([100, 100]), learning_rate=0.1, n_iter=9000)

40.2 ms ± 977 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [204]:
data = ['Rastrigin', 'Classical GD', str(time)[:25], '']
res_df.loc[len(res_df)] = data

<a id='gen113'></a>
#### Rosenbrock

In [205]:
%%time
GD_Ros = classic_GD(Rosenbrock.function, Rosenbrock.gradient, np.array([10, 10]), 
                    learning_rate=0.00001, n_iter=200000)

Wall time: 798 ms


In [206]:
print(f'Minimum point coordinates \033[1m{GD_Ros[0]}\033[0m,', \
      f'with function value = \033[1m{round(GD_Ros[1], 4)}\033[0m')

Minimum point coordinates [3.02969959 9.18234049], with function value = 4.1207


In [207]:
# mean wall-clock time per loop (%timeit default)
time = %timeit -o classic_GD(Rosenbrock.function, Rosenbrock.gradient, np.array([10, 10]), learning_rate=0.00001, n_iter=200000)

794 ms ± 9.33 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [208]:
data = ['Rosenbrock', 'Classical GD', str(time)[:25], '']
res_df.loc[len(res_df)] = data

<a id='gen12'></a>

In [253]:
def GD_AdaGrad(
            function: Mapping, gradient: Mapping, start: np.ndarray,
            gamma: np.float64 = 0.1, n_iter: int = 100, 
            learning_rate: float = 0.01, tolerance=1e-06, 
            dtype="float64", rand_state: int = 12
            ) -> Tuple [np.ndarray, np.ndarray]:
    """ 
        function (Mapping): function to minimise
        gradient (Mapping): gradient of the function above
        start (np.ndarray): random starting point(s)
        gamma (float): decay rate of the running sum of squared gradients
        n_iter (int): number of gradient descent iterations
        learning_rate (float): step size
        tolerance (float): smallest allowed change
                           in the value being minimised
        dtype (str): data type of the accumulator G
        rand_state (int): random seed (not used in the body)
        
    """
    dtype_ = np.dtype(dtype)

    cur_point = start.copy()
    cur_point = cur_point.astype('float64')

    G = np.zeros(cur_point.shape, dtype=dtype_) 
   
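    for iter in range(n_iter):
        loss = gradient(cur_point)  # despite the name, this is the gradient
        G = gamma * G + loss ** 2   # decayed running sum of squared gradients

        # stop once every component of the normalised step is within tolerance
        if np.all(np.abs(loss/np.sqrt(G)) <= tolerance):
            print(f'\033[1mEarly stopping!!\033[0m')
            break

        cur_point -= (loss/np.sqrt(G)) * learning_rate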

    return cur_point, function(cur_point)
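The accumulator above multiplies `G` by `gamma` on every iteration, which is closer in spirit to an RMSProp-style decayed sum than to textbook AdaGrad, where squared gradients are summed without decay (`gamma = 1`). The sketch below contrasts the two on a simple quadratic. `adaptive_gd`, `sphere_grad` and the small `eps` added to the denominator are assumptions introduced for illustration; the notebook's function has no `eps`, so a zero gradient would evaluate 0/0 there.

```python
import numpy as np

def adaptive_gd(gradient, start, learning_rate=0.1, n_iter=2000,
                gamma=1.0, eps=1e-8):
    point = np.asarray(start, dtype=np.float64).copy()
    G = np.zeros_like(point)
    for _ in range(n_iter):
        g = gradient(point)
        G = gamma * G + g**2                      # gamma = 1.0 gives plain AdaGrad
        point -= learning_rate * g / (np.sqrt(G) + eps)
    return point

def sphere_grad(x):
    return 2.0 * x

start = np.array([1.0, -2.0])
adagrad_point = adaptive_gd(sphere_grad, start, gamma=1.0)   # no decay
decayed_point = adaptive_gd(sphere_grad, start, gamma=0.8)   # decayed sum, as in this notebook
zero_grad_point = adaptive_gd(sphere_grad, np.zeros(2), n_iter=5)  # eps avoids 0/0
print(adagrad_point, decayed_point, zero_grad_point)
```

With the decayed sum the normalised step never exceeds `learning_rate` in magnitude, so the iterate ends up within one step length of the minimum rather than converging arbitrarily close to it.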

<a id='gen121'></a>
#### Himmelblau

In [210]:
%%time
Him_Ada = GD_AdaGrad(Himmelblau.function, Himmelblau.gradient, np.array([100, 100]),
                     gamma=0.8, learning_rate=0.01, n_iter=30000)

Wall time: 511 ms


In [211]:
print(f'Minimum point coordinates \033[1m{Him_Ada[0]}\033[0m,', 
      f'with function value = \033[1m{round(Him_Ada[1], 6)}\033[0m')

Minimum point coordinates [3.00223411 2.00223281], with function value = 0.000369


In [212]:
# mean wall-clock time per loop (%timeit default)
time = %timeit -o GD_AdaGrad(Himmelblau.function, Himmelblau.gradient, np.array([100, 100]), gamma=0.8, learning_rate=0.01, n_iter=30000)

495 ms ± 3.26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [213]:
data = ['Himmelblau', 'AdaGrad', str(time)[:25], '']
res_df.loc[len(res_df)] = data

<a id='gen122'></a>
#### Rastrigin

In [214]:
%%time
Rast_Ada = GD_AdaGrad(Rastrigin.function, Rastrigin.gradient, np.array([-10, 100]), 
                      tolerance=1e-4, gamma=0.2, learning_rate=0.51, n_iter=2000000)

Wall time: 29.3 s


In [215]:
print(f'Minimum point coordinates \033[1m{Rast_Ada[0]}\033[0m,', 
      f'with function value = \033[1m{round(Rast_Ada[1], 6)}\033[0m')

Minimum point coordinates [-5.02130232  5.0292553 ], with function value = 15.302918


In [216]:
# mean wall-clock time per loop (%timeit default)
time = %timeit -o GD_AdaGrad(Rastrigin.function, Rastrigin.gradient, np.array([-10, 100]), tolerance=1e-4, gamma=0.2, learning_rate=0.51, n_iter=2000000)

29.4 s ± 125 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [217]:
data = ['Rastrigin', 'AdaGrad', str(time)[:25], '']
res_df.loc[len(res_df)] = data

<a id='gen123'></a>
#### Rosenbrock

In [218]:
%%time
Ros_Ada = GD_AdaGrad(Rosenbrock.function, Rosenbrock.gradient, np.array([-10, -10]),
                     gamma=0.9, tolerance = 1e-3, learning_rate=0.01, n_iter=30000)

Wall time: 459 ms


In [219]:
print(f'Minimum point coordinates \033[1m{Ros_Ada[0]}\033[0m,', 
      f'with function value = \033[1m{round(Ros_Ada[1], 6)}\033[0m')

Minimum point coordinates [1.00008264 0.9954266 ], with function value = 0.002246


In [220]:
# mean wall-clock time per loop (%timeit default)
time = %timeit -o GD_AdaGrad(Rosenbrock.function, Rosenbrock.gradient, np.array([-10, -10]), gamma=0.9, tolerance = 1e-3, learning_rate=0.01, n_iter=30000)

451 ms ± 1.78 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [221]:
data = ['Rosenbrock', 'AdaGrad', str(time)[:25], '']
res_df.loc[len(res_df)] = data

In [222]:
res_df.set_index('Problem', drop=True, inplace=True)

In [223]:
res_df

| Problem | Method | Time w/o Numba | Numba Time |
|---|---|---|---|
| Himmelblau | Classical GD | 39.5 ms ± 1.01 ms per loo | |
| Rastrigin | Classical GD | 40.2 ms ± 977 µs per loop | |
| Rosenbrock | Classical GD | 794 ms ± 9.33 ms per loop | |
| Himmelblau | AdaGrad | 495 ms ± 3.26 ms per loop | |
| Rastrigin | AdaGrad | 29.4 s ± 125 ms per loop | |
| Rosenbrock | AdaGrad | 451 ms ± 1.78 ms per loop | |


<a id='gen2'></a>

## With Numba

In [224]:
# NOTE: these functions reuse the names of the classes defined earlier and replace them.
@njit(fastmath=True)
def Himmelblau(x: np.ndarray) -> np.float64:
    return (x[0]**2 + x[1] - 11)**2 + (x[0] + x[1]**2 - 7)**2

@njit(fastmath=True)
def Him_grad(x: np.ndarray) -> np.ndarray:
    return np.array([4 * x[0] * (x[0]**2 + x[1] - 11) \
                     + 2 * (x[0] + x[1]**2 - 7),
                     2 * (x[0]**2 + x[1] - 11) \
                     + 4 * x[1] * (x[0] + x[1]**2 - 7)])

@njit(fastmath=True)
def Rosenbrock(x: np.ndarray, b: int = 100) -> np.float64:
    return b * (x[1] - x[0]**2)**2 + (x[0] - 1)**2

@njit(fastmath=True)
def Rosen_grad(x: np.ndarray, b: int = 100) -> np.ndarray:
    return np.array([2 * (x[0] - 1) \
                     - 4 * b * x[0] * (x[1] - x[0]**2),
                     2 * b * (x[1] - x[0]**2)])

@njit(fastmath=True)
def Rastrigin(x: np.ndarray, A: int = 10) -> np.float64:
    # As before: only the first coordinate's term, without the constant offset
    return list(x**2 - A * np.cos(2 * np.pi * x))[0] # + A

@njit(fastmath=True)
def Rast_grad(x: np.ndarray, A: int = 10) -> np.ndarray:
    return 2 * x + A * 2 * np.pi * np.sin(2 * np.pi * x)

<a id='gen21'></a>

In [225]:
@njit
def classic_GD (
                function: Mapping, gradient_of_function: Mapping,
                start: np.ndarray, learning_rate: float = 0.01, 
                n_iter: int = 100
                ) -> Tuple [np.ndarray, np.float32]:
    
    """ 
    Args:
        function (Mapping): function to minimise
        gradient_of_function (Mapping): gradient of the function being minimised
        start (np.ndarray): random starting point
        learning_rate (float): step size
        n_iter (int): number of gradient descent iterations
    Return:
        tuple with the coordinates of the point reached and the function
        value at that point
    
   """   
    current_point = start.copy()
    current_point = current_point.astype('float64')
    
    for iter in range(n_iter):   
        current_point = current_point - learning_rate * gradient_of_function(current_point)

    return current_point, function(current_point)

<a id='gen211'></a>
#### Himmelblau

In [226]:
%%time
GD_Him_GPU = classic_GD(Himmelblau, Him_grad, np.array([-100, -100]), learning_rate=0.00001, n_iter=9500)

Wall time: 470 ms


In [227]:
print(f'Minimum point coordinates \033[1m{GD_Him_GPU[0]}\033[0m,')
print(f'with function value = \033[1m{round(GD_Him_GPU[1], 4)}\033[0m')

Minimum point coordinates [-3.78059997 -3.28525589],
with function value = 0.0002


In [228]:
# mean wall-clock time per loop (%timeit default)
time = %timeit -o classic_GD(Himmelblau, Him_grad, np.array([-100, -100]), learning_rate=0.00001, n_iter=9500)

1.63 ms ± 4.28 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
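The single `%%time` run above took 470 ms while `%timeit` reports about 1.63 ms per loop. The most likely reason is that the first call to an `@njit` function includes JIT compilation, while later calls reuse the compiled code. The sketch below illustrates this warm-up effect on a toy function. `sum_of_squares` is a hypothetical example, and the plain-Python fallback for `njit` is only there so the snippet still runs where Numba is not installed.

```python
import time
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for portability: without Numba the function runs as plain Python
    def njit(func):
        return func

@njit
def sum_of_squares(values):
    total = 0.0
    for v in values:
        total += v * v
    return total

data = np.arange(1000, dtype=np.float64)

t0 = time.perf_counter()
first_result = sum_of_squares(data)    # includes compilation when Numba is present
first_call = time.perf_counter() - t0

t0 = time.perf_counter()
second_result = sum_of_squares(data)   # reuses the already compiled code
second_call = time.perf_counter() - t0

print(f'first call: {first_call:.6f} s, second call: {second_call:.6f} s')
```

For benchmarking, it is therefore common to call the function once before timing it, or to rely on `%timeit`, which averages over many calls.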


In [229]:
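# 'Problem' is a non-unique index, so this writes to both Himmelblau rows;
# the AdaGrad row is overwritten later via .iat
res_df.at['Himmelblau', 'Numba Time'] = str(time)[:25]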

In [230]:
res_df

| Problem | Method | Time w/o Numba | Numba Time |
|---|---|---|---|
| Himmelblau | Classical GD | 39.5 ms ± 1.01 ms per loo | 1.63 ms ± 4.28 µs per loo |
| Rastrigin | Classical GD | 40.2 ms ± 977 µs per loop | |
| Rosenbrock | Classical GD | 794 ms ± 9.33 ms per loop | |
| Himmelblau | AdaGrad | 495 ms ± 3.26 ms per loop | 1.63 ms ± 4.28 µs per loo |
| Rastrigin | AdaGrad | 29.4 s ± 125 ms per loop | |
| Rosenbrock | AdaGrad | 451 ms ± 1.78 ms per loop | |
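In the table above the Classical GD timing also appears in the Himmelblau AdaGrad row, because the `Problem` index is not unique. One way to write to a single row is a boolean mask over both the index and the `Method` column. This is a sketch on a small stand-in frame (`demo` is a hypothetical name), not a change to the notebook's `res_df`.

```python
import pandas as pd

# Stand-in for res_df: two rows sharing the same 'Problem' index label
demo = pd.DataFrame(
    {'Method': ['Classical GD', 'AdaGrad'], 'Numba Time': ['', '']},
    index=pd.Index(['Himmelblau', 'Himmelblau'], name='Problem'),
)

# Select exactly one row: matching method AND matching problem
row = (demo['Method'] == 'Classical GD') & (demo.index == 'Himmelblau')
demo.loc[row, 'Numba Time'] = '1.63 ms'

print(demo)
```

Only the Classical GD row receives the value; the AdaGrad row stays empty until its own timing is available.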


<a id='gen212'></a>
#### Rastrigin

In [231]:
%%time
GD_Rast_GPU = classic_GD(Rastrigin, Rast_grad, np.array([10, 10]), learning_rate=0.1, n_iter=200)

Wall time: 597 ms


In [232]:
print(f'Minimum point coordinates \033[1m{GD_Rast_GPU[0]}\033[0m,')
print(f'with function value = \033[1m{round(GD_Rast_GPU[1], 4)}\033[0m')

Minimum point coordinates [-1.78314385 -1.78314385],
with function value = 1.1121


In [233]:
# mean wall-clock time per loop (%timeit default)
time = %timeit -o classic_GD(Rastrigin, Rast_grad, np.array([10, 10]), learning_rate=0.1, n_iter=200)

67.2 µs ± 1.12 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [234]:
# non-unique index: this writes to both Rastrigin rows (AdaGrad row is overwritten later)
res_df.at['Rastrigin', 'Numba Time'] = str(time)[:25]

<a id='gen213'></a>
#### Rosenbrock

In [235]:
%%time
GD_Ros_GPU = classic_GD(Rosenbrock, Rosen_grad, np.array([10, 10]), learning_rate=0.00001, n_iter=200000)

Wall time: 427 ms


In [236]:
print(f'Minimum point coordinates \033[1m{GD_Ros_GPU[0]}\033[0m,')
print(f'with function value = \033[1m{round(GD_Ros_GPU[1], 4)}\033[0m')

Minimum point coordinates [3.02969959 9.18234049],
with function value = 4.1207


In [237]:
# mean wall-clock time per loop (%timeit default)
time = %timeit -o classic_GD(Rosenbrock, Rosen_grad, np.array([10, 10]), learning_rate=0.00001, n_iter=200000)

34.2 ms ± 629 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [238]:
# non-unique index: this writes to both Rosenbrock rows (AdaGrad row is overwritten later)
res_df.at['Rosenbrock', 'Numba Time'] = str(time)[:25]

<a id='gen22'></a>

In [239]:
@njit
def GD_AdaGrad (
                function: Mapping, gradient: Mapping,
                start: np.ndarray, learning_rate: float = 0.01, 
                n_iter: int = 100, tolerance: float = 1e-5,
                gamma: np.float64 = 0.1, dtype="float64", 
                rand_state: int = 12
                ) -> Tuple [np.ndarray, np.float64]:
    
    """ 
    Args:
        function (Mapping): function to minimise
        gradient (Mapping): gradient of the function above
        start (np.ndarray): random starting point(s)
        gamma (float): decay rate of the running sum of squared gradients
        n_iter (int): number of gradient descent iterations
        learning_rate (float): step size
        tolerance (float): smallest allowed change
                           in the value being minimised
        dtype (str): data type (not used in the body)
        rand_state (int): random seed (not used in the body)
    Return: 
        tuple with the coordinates of the point reached and the 
        function value at this point
        
   """  
    cur_point = start.copy()
    cur_point = cur_point.astype('float64')
    
    G = np.zeros(cur_point.shape, dtype=np.float64)
    
    for iter in range(n_iter):
        loss = gradient(cur_point)
        G = gamma * G + loss ** 2

        if np.all(np.abs(loss/np.sqrt(G)) <= tolerance):
            print(f'\033[1mEarly stopping!!\033[0m')
            break

        cur_point -= (loss/np.sqrt(G)) * learning_rate

    return cur_point, function(cur_point)

<a id='gen221'></a>
#### Himmelblau

In [240]:
%%time
Him_Ada = GD_AdaGrad(Himmelblau, Him_grad, np.array([100, 100]),
                     gamma=0.8, learning_rate=0.01, n_iter=30000)

Wall time: 461 ms


In [241]:
print(f'Minimum point coordinates \033[1m{Him_Ada[0]}\033[0m,', 
      f'with function value = \033[1m{round(Him_Ada[1], 6)}\033[0m')

Minimum point coordinates [3.00223411 2.00223281], with function value = 0.000369


In [242]:
# mean wall-clock time per loop (%timeit default)
time = %timeit -o GD_AdaGrad(Himmelblau, Him_grad, np.array([100, 100]), gamma=0.8, learning_rate=0.01, n_iter=30000)

10.9 ms ± 61.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [243]:
res_df.iat[3, 2] = str(time)[:25]

<a id='gen222'></a>
#### Rastrigin

In [244]:
%%time
Rast_Ada = GD_AdaGrad(Rastrigin, Rast_grad, np.array([-10, 100]), 
                      tolerance=1e-4, gamma=0.2, learning_rate=0.51, n_iter=2000000)

Wall time: 1.37 s


In [245]:
print(f'Minimum point coordinates \033[1m{Rast_Ada[0]}\033[0m,', 
      f'with function value = \033[1m{round(Rast_Ada[1], 6)}\033[0m')

Minimum point coordinates [-4.7446343  19.65106439], with function value = 22.848628


In [246]:
# mean wall-clock time per loop (%timeit default)
time = %timeit -o GD_AdaGrad(Rastrigin, Rast_grad, np.array([-10, 100]), tolerance=1e-4, gamma=0.2, learning_rate=0.51, n_iter=2000000)

795 ms ± 2.95 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [247]:
res_df.iat[4, 2] = str(time)[:25]

<a id='gen223'></a>
#### Rosenbrock

In [248]:
%%time
Ros_Ada = GD_AdaGrad(Rosenbrock, Rosen_grad, np.array([-10, -10]),
                     gamma=0.9, tolerance = 1e-3, learning_rate=0.01, n_iter=30000)

Wall time: 405 ms


In [249]:
print(f'Minimum point coordinates \033[1m{Ros_Ada[0]}\033[0m,', 
      f'with function value = \033[1m{round(Ros_Ada[1], 6)}\033[0m')

Minimum point coordinates [1.00008264 0.9954266 ], with function value = 0.002246


In [250]:
# mean wall-clock time per loop (%timeit default)
time = %timeit -o GD_AdaGrad(Rosenbrock, Rosen_grad, np.array([-10, -10]), gamma=0.9, tolerance = 1e-3, learning_rate=0.01, n_iter=30000)

11 ms ± 30.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [251]:
res_df.iat[5, 2] = str(time)[:25]

<a id='gen3'></a>
## Results

In [252]:
res_df

| Problem | Method | Time w/o Numba | Numba Time |
|---|---|---|---|
| Himmelblau | Classical GD | 39.5 ms ± 1.01 ms per loo | 1.63 ms ± 4.28 µs per loo |
| Rastrigin | Classical GD | 40.2 ms ± 977 µs per loop | 67.2 µs ± 1.12 µs per loo |
| Rosenbrock | Classical GD | 794 ms ± 9.33 ms per loop | 34.2 ms ± 629 µs per loop |
| Himmelblau | AdaGrad | 495 ms ± 3.26 ms per loop | 10.9 ms ± 61.4 µs per loo |
| Rastrigin | AdaGrad | 29.4 s ± 125 ms per loop | 795 ms ± 2.95 ms per loop |
| Rosenbrock | AdaGrad | 451 ms ± 1.78 ms per loop | 11 ms ± 30.5 µs per loop ( |


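Numba speeds up every run considerably. Let's compute the exact speed-up factors. Note that the Classical GD runs for Himmelblau (`n_iter` 7000 without Numba, 9500 with it) and Rastrigin (start `[100, 100]`, `n_iter` 9000 without Numba; start `[10, 10]`, `n_iter` 200 with it) used different settings in the two sections, so those two ratios are not like-for-like comparisons.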

In [274]:
wo_N = []
w_N = []
for i in res_df['Time w/o Numba']:
    cont = i.split()[:2]  # leading [value, unit] of the timing string
    wo_N.append(cont)
for t in wo_N:
    t[0] = float(t[0])
    if t[1] == 's':  # convert everything to microseconds
        t[0] = t[0] * 1e+6
    elif t[1] == 'ms':
        t[0] = t[0] * 1e+3
        
for i in res_df['Numba Time']:
    cont = i.split()[:2]
    w_N.append(cont)
for t in w_N:
    t[0] = float(t[0])
    if t[1] == 's':  # convert everything to microseconds
        t[0] = t[0] * 1e+6
    elif t[1] == 'ms':
        t[0] = t[0] * 1e+3

# compute how many times faster the Numba version is in each case
boost = [round(wo[0] / w[0], 2) for wo, w in zip(wo_N, w_N)]

[[39500.0, 'ms'], [40200.0, 'ms'], [794000.0, 'ms'], [495000.0, 'ms'], [29400000.0, 's'], [451000.0, 'ms']]
[[1630.0, 'ms'], [67.2, 'µs'], [34200.0, 'ms'], [10900.0, 'ms'], [795000.0, 'ms'], [11000.0, 'ms']]
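The unit conversion above can also be sketched as one small helper with a lookup table. `UNIT_TO_SECONDS`, `mean_seconds` and `speedup` are hypothetical names, and the input strings are copied from the results table. As far as I know, the object returned by `%timeit -o` also exposes the mean directly as `.average`, which would avoid string parsing altogether.

```python
# Conversion factors for the units that %timeit prints
UNIT_TO_SECONDS = {'ns': 1e-9, 'µs': 1e-6, 'us': 1e-6, 'ms': 1e-3, 's': 1.0}

def mean_seconds(timing: str) -> float:
    """Parse the leading '<value> <unit>' of a %timeit string into seconds."""
    value, unit = timing.split()[:2]
    return float(value) * UNIT_TO_SECONDS[unit]

def speedup(baseline: str, accelerated: str) -> float:
    """Ratio of mean times, rounded to two decimals."""
    return round(mean_seconds(baseline) / mean_seconds(accelerated), 2)

himmelblau_gd = speedup('39.5 ms ± 1.01 ms per loo', '1.63 ms ± 4.28 µs per loo')
rastrigin_adagrad = speedup('29.4 s ± 125 ms per loop', '795 ms ± 2.95 ms per loop')
print(himmelblau_gd, rastrigin_adagrad)
```

Because only the first two tokens are read, the truncated tails of the stored strings ("per loo") do not affect the result.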


In [280]:
res_df['Time Boost (times)'] = boost

In [281]:
res_df

| Problem | Method | Time w/o Numba | Numba Time | Time Boost (times) |
|---|---|---|---|---|
| Himmelblau | Classical GD | 39.5 ms ± 1.01 ms per loo | 1.63 ms ± 4.28 µs per loo | 24.23 |
| Rastrigin | Classical GD | 40.2 ms ± 977 µs per loop | 67.2 µs ± 1.12 µs per loo | 598.21 |
| Rosenbrock | Classical GD | 794 ms ± 9.33 ms per loop | 34.2 ms ± 629 µs per loop | 23.22 |
| Himmelblau | AdaGrad | 495 ms ± 3.26 ms per loop | 10.9 ms ± 61.4 µs per loo | 45.41 |
| Rastrigin | AdaGrad | 29.4 s ± 125 ms per loop | 795 ms ± 2.95 ms per loop | 36.98 |
| Rosenbrock | AdaGrad | 451 ms ± 1.78 ms per loop | 11 ms ± 30.5 µs per loop ( | 41.0 |
