# Gradient checks

It is best practice to do gradient checks before and after gradient-based optimization.
Here we show how to use the gradient check methods implemented in pyPESTO, which use the finite differences (FD) method as a reference.
The FD step size involves a trade-off: large steps give a poor approximation of the derivative, while small steps amplify numerical noise, so it is recommended to try different FD step sizes.  
Most importantly, test your gradients using the settings you will use later on.
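
The step-size trade-off can be illustrated without pyPESTO. The following is a minimal sketch on a toy function (`sin`, whose derivative is known analytically); the helper `central_difference` and the chosen step sizes are illustrative assumptions, not part of the pyPESTO API.

```python
import numpy as np


def central_difference(f, x, eps):
    """Central finite-difference approximation of df/dx at a scalar x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


x0 = 1.0
exact = np.cos(x0)  # analytical derivative of sin at x0

# Scan from large steps (truncation error dominates)
# to tiny steps (floating-point round-off dominates).
step_sizes = [1e-1, 1e-3, 1e-5, 1e-7, 1e-9, 1e-11, 1e-13]
errors = {
    eps: abs(central_difference(np.sin, x0, eps) - exact)
    for eps in step_sizes
}

for eps, err in errors.items():
    print(f"eps={eps:.0e}  abs. error={err:.3e}")

best_eps = min(errors, key=errors.get)
print("step size with the smallest error:", best_eps)
```

The error first shrinks and then grows again as the step size decreases, which is why a single fixed step size can be misleading.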

In [1]:
import benchmark_models_petab as models
import numpy as np
import seaborn as sns

import pypesto.optimize as optimize
import pypesto.petab

# Fix the random seed for reproducibility
np.random.seed(2)

### Set up an example problem

Create the pyPESTO problem; random parameter vectors are drawn in the next step.

In [2]:
%%capture

model_name = "Boehm_JProteomeRes2014"
petab_problem = models.get_problem(model_name)

importer = pypesto.petab.PetabImporter(petab_problem)
pypesto_problem = importer.create_problem(verbose=False)

### Gradient check before optimization
Here we use the startpoint sampling method to generate random parameter vectors.

In [3]:
startpoints = pypesto_problem.get_startpoints(n_starts=2)

Perform a gradient check at the location of one of the parameter vectors. `check_grad` compares the objective gradient with the gradient approximated by the finite differences (FD) method. You can modify the FD step size via the argument `eps`.

In [4]:
pypesto_problem.objective.check_grad(
    x=startpoints[0],
    eps=1e-5,  # default
    verbosity=0,
)

| parameter | grad | fd_f | fd_b | fd_c | fd_err | abs_err | rel_err |
|---|---|---|---|---|---|---|---|
| Epo_degradation_BaF3 | 28998050000.0 | 28983490000.0 | 28995160000.0 | 28989330000.0 | 11660550.0 | 8729236.0 | 0.000301119 |
| k_exp_hetero | -1822.477 | 82479900.0 | -118583600.0 | -18051850.0 | 201063500.0 | 18050030.0 | 0.999899 |
| k_exp_homo | 1940159.0 | -7634094.0 | 26205600.0 | 9285754.0 | 33839700.0 | 7345596.0 | 0.7910607 |
| k_imp_hetero | 1324222000.0 | 963610900.0 | 1401833000.0 | 1182722000.0 | 438221600.0 | 141500400.0 | 0.1196396 |
| k_imp_homo | 2759777000.0 | 2689595000.0 | 2810697000.0 | 2750146000.0 | 121102100.0 | 9630702.0 | 0.003501887 |
| k_phos | -31838940000.0 | -31897610000.0 | -31812510000.0 | -31855060000.0 | 85099380.0 | 16115090.0 | 0.0005058878 |
| sd_pSTAT5A_rel | -4106435000000.0 | -4106388000000.0 | -4106482000000.0 | -4106435000000.0 | 93725500.0 | 666.3599 | 1.622721e-10 |
| sd_pSTAT5B_rel | -246766500000.0 | -246808500000.0 | -246724500000.0 | -246766500000.0 | 84018740.0 | 33.62799 | 1.362745e-10 |
| sd_rSTAT5A_rel | 36.84015 | -47691350.0 | 47691490.0 | 73.24219 | 95382840.0 | 36.40203 | 0.497009 |


Explanation of the gradient check result columns:
- `grad`: Objective gradient
- `fd_f`: FD forward difference
- `fd_b`: FD backward difference
- `fd_c`: FD central difference, approximated as the mean of `fd_f` and `fd_b`
- `fd_err`: Absolute deviation between the forward and backward differences `fd_f` and `fd_b`
- `abs_err`: Absolute error between `grad` and the central FD gradient `fd_c`
- `rel_err`: Relative error between `grad` and the central FD gradient `fd_c`
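
To make the column definitions concrete, here is a minimal sketch that recomputes them for one parameter of a toy function. The names `gradient_check_columns`, `toy_fun`, and `toy_grad` are hypothetical stand-ins, and the `eps` term in the denominator of `rel_err` is an assumption that is consistent with the values in the tables here, not a confirmed formula.

```python
import numpy as np


def gradient_check_columns(fun, grad_fun, x, ix, eps=1e-5):
    """Recompute the gradient-check columns for the parameter at index ix.

    fun and grad_fun are hypothetical stand-ins for an objective
    function and its gradient.
    """
    x = np.asarray(x, dtype=float)
    step = np.zeros_like(x)
    step[ix] = eps

    f0 = fun(x)
    grad = grad_fun(x)[ix]
    fd_f = (fun(x + step) - f0) / eps  # forward difference
    fd_b = (f0 - fun(x - step)) / eps  # backward difference
    fd_c = (fd_f + fd_b) / 2  # central difference, reusing fd_f and fd_b
    fd_err = abs(fd_f - fd_b)
    abs_err = abs(grad - fd_c)
    # Assumption: the step size regularizes the denominator.
    rel_err = abs(abs_err / (fd_c + eps))
    return {
        "grad": grad, "fd_f": fd_f, "fd_b": fd_b, "fd_c": fd_c,
        "fd_err": fd_err, "abs_err": abs_err, "rel_err": rel_err,
    }


def toy_fun(x):
    return x[0] ** 2 + 3 * x[0] * x[1]


def toy_grad(x):
    return np.array([2 * x[0] + 3 * x[1], 3 * x[0]])


cols = gradient_check_columns(toy_fun, toy_grad, x=[1.0, 2.0], ix=0)
for name, value in cols.items():
    print(f"{name:>8}: {value:.6g}")
```

A large `fd_err` signals that the forward and backward differences disagree, so the FD reference itself is unreliable at that step size, independently of whether `grad` is correct.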

If your parameter vector contains fixed parameters, the gradient check might raise an error due to a dimension mismatch. Use the helper method `Problem.get_reduced_vector` to obtain the reduced vector that contains only the free (estimated) parameters:

In [5]:
pypesto_problem.objective.check_grad(
    x=pypesto_problem.get_reduced_vector(startpoints[0]),
    eps=1e-5,  # default
    verbosity=0,
)

| parameter | grad | fd_f | fd_b | fd_c | fd_err | abs_err | rel_err |
|---|---|---|---|---|---|---|---|
| Epo_degradation_BaF3 | 28998050000.0 | 28983490000.0 | 28995160000.0 | 28989330000.0 | 11660550.0 | 8729236.0 | 0.000301119 |
| k_exp_hetero | -1822.477 | 82479900.0 | -118583600.0 | -18051850.0 | 201063500.0 | 18050030.0 | 0.999899 |
| k_exp_homo | 1940159.0 | -7634094.0 | 26205600.0 | 9285754.0 | 33839700.0 | 7345596.0 | 0.7910607 |
| k_imp_hetero | 1324222000.0 | 963610900.0 | 1401833000.0 | 1182722000.0 | 438221600.0 | 141500400.0 | 0.1196396 |
| k_imp_homo | 2759777000.0 | 2689595000.0 | 2810697000.0 | 2750146000.0 | 121102100.0 | 9630702.0 | 0.003501887 |
| k_phos | -31838940000.0 | -31897610000.0 | -31812510000.0 | -31855060000.0 | 85099380.0 | 16115090.0 | 0.0005058878 |
| sd_pSTAT5A_rel | -4106435000000.0 | -4106388000000.0 | -4106482000000.0 | -4106435000000.0 | 93725500.0 | 666.3599 | 1.622721e-10 |
| sd_pSTAT5B_rel | -246766500000.0 | -246808500000.0 | -246724500000.0 | -246766500000.0 | 84018740.0 | 33.62799 | 1.362745e-10 |
| sd_rSTAT5A_rel | 36.84015 | -47691350.0 | 47691490.0 | 73.24219 | 95382840.0 | 36.40203 | 0.497009 |


Next, we do optimization and perform a gradient check at a local optimum.  
The method `check_grad_multi_eps` calls the `check_grad` method above once per FD step size and reports, for each parameter, the step size that results in the smallest error. 
You can supply a list of FD step sizes to be tested via the `multi_eps` argument (or use the default ones), and use the `label` argument to select which error measure is minimized: the FD error (`fd_err`), the absolute error (`abs_err`), or the relative error (`rel_err`).
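
The selection logic described above can be sketched as follows. This is an illustration on a toy function, not pyPESTO's implementation: `rel_err_for_eps`, `best_eps_per_parameter`, the toy objective, and the step sizes are all assumptions made for the example.

```python
import numpy as np


def rel_err_for_eps(fun, grad_fun, x, eps):
    """Per-parameter relative error of grad vs. central FD for one step size."""
    x = np.asarray(x, dtype=float)
    grad = grad_fun(x)
    fd_c = np.empty_like(x)
    for ix in range(x.size):
        step = np.zeros_like(x)
        step[ix] = eps
        fd_c[ix] = (fun(x + step) - fun(x - step)) / (2 * eps)
    # Assumption: the step size regularizes the denominator.
    return np.abs((grad - fd_c) / (fd_c + eps))


def best_eps_per_parameter(fun, grad_fun, x, multi_eps):
    """For every parameter, pick the step size with the smallest relative error."""
    errs = np.array(
        [rel_err_for_eps(fun, grad_fun, x, eps) for eps in multi_eps]
    )
    best = errs.argmin(axis=0)  # index of the best step size per parameter
    return np.asarray(multi_eps)[best], errs[best, np.arange(errs.shape[1])]


def toy_fun(x):
    return np.sum(np.sin(x))


def toy_grad(x):
    return np.cos(x)


multi_eps = [1e-1, 1e-3, 1e-5, 1e-7]
best_eps, best_err = best_eps_per_parameter(
    toy_fun, toy_grad, x=[0.5, 1.0, 2.0], multi_eps=multi_eps
)
for ix, (eps, err) in enumerate(zip(best_eps, best_err)):
    print(f"parameter {ix}: best eps={eps:.0e}, rel_err={err:.2e}")
```

Because the best step size is chosen per parameter, the reported table can contain a different `eps` in every row.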

In [6]:
%%capture

result = optimize.minimize(
    problem=pypesto_problem, 
    optimizer=optimize.ScipyOptimizer(), 
    n_starts=4,
)

In [7]:
gc = pypesto_problem.objective.check_grad_multi_eps(
    x=pypesto_problem.get_reduced_vector(result.optimize_result[0].x),
    verbosity=0,
    label='rel_err',  # default
)
gc

| parameter | grad | fd_f | fd_b | fd_c | fd_err | abs_err | rel_err | eps |
|---|---|---|---|---|---|---|---|---|
| Epo_degradation_BaF3 | -0.001558 | 1.100138 | -1.104286 | -0.002074 | 2.204424 | 0.000515 | 0.479959 | 0.001 |
| k_exp_hetero | 0.055354 | 0.062354 | 0.04937 | 0.055862 | 0.012984 | 0.000508 | 0.003259 | 0.1 |
| k_exp_homo | 0.000609 | 0.033696 | -0.032193 | 0.000751 | 0.065888 | 0.000142 | 0.081331 | 0.001 |
| k_imp_hetero | -0.001165 | 197.943505 | -181.738627 | 8.102439 | 379.682131 | 8.103604 | 0.98795 | 0.1 |
| k_imp_homo | -4.4e-05 | -3.3e-05 | -7.4e-05 | -5.3e-05 | 4.1e-05 | 9e-06 | 9.3e-05 | 0.1 |
| k_phos | -0.001786 | 0.477147 | -0.480739 | -0.001796 | 0.957886 | 1e-05 | 0.012976 | 0.001 |
| sd_pSTAT5A_rel | 0.000435 | 0.187172 | -0.186303 | 0.000434 | 0.373475 | 1e-06 | 0.002282 | 1e-05 |
| sd_pSTAT5B_rel | 0.00022 | 0.186953 | -0.186522 | 0.000216 | 0.373475 | 4e-06 | 0.019104 | 1e-05 |
| sd_rSTAT5A_rel | -0.000259 | 18.588643 | -18.589168 | -0.000262 | 37.177811 | 3e-06 | 0.011911 | 1e-07 |


Use the pandas style methods to visualise the results of the gradient check, e.g.:

In [8]:
def highlight_value_above_threshold(x, threshold=10):
    return ['color: darkorange' if xi > threshold else None for xi in x]

gc.style.apply(
    highlight_value_above_threshold, subset=["fd_err"],
).background_gradient(
    cmap=sns.light_palette("purple", as_cmap=True), subset=["abs_err"],
).background_gradient(
    cmap=sns.light_palette("red", as_cmap=True), subset=["rel_err"],
)

| parameter | grad | fd_f | fd_b | fd_c | fd_err | abs_err | rel_err | eps |
|---|---|---|---|---|---|---|---|---|
| Epo_degradation_BaF3 | -0.001558 | 1.100138 | -1.104286 | -0.002074 | 2.204424 | 0.000515 | 0.479959 | 0.001 |
| k_exp_hetero | 0.055354 | 0.062354 | 0.04937 | 0.055862 | 0.012984 | 0.000508 | 0.003259 | 0.1 |
| k_exp_homo | 0.000609 | 0.033696 | -0.032193 | 0.000751 | 0.065888 | 0.000142 | 0.081331 | 0.001 |
| k_imp_hetero | -0.001165 | 197.943505 | -181.738627 | 8.102439 | 379.682131 | 8.103604 | 0.98795 | 0.1 |
| k_imp_homo | -4.4e-05 | -3.3e-05 | -7.4e-05 | -5.3e-05 | 4.1e-05 | 9e-06 | 9.3e-05 | 0.1 |
| k_phos | -0.001786 | 0.477147 | -0.480739 | -0.001796 | 0.957886 | 1e-05 | 0.012976 | 0.001 |
| sd_pSTAT5A_rel | 0.000435 | 0.187172 | -0.186303 | 0.000434 | 0.373475 | 1e-06 | 0.002282 | 1e-05 |
| sd_pSTAT5B_rel | 0.00022 | 0.186953 | -0.186522 | 0.000216 | 0.373475 | 4e-06 | 0.019104 | 1e-05 |
| sd_rSTAT5A_rel | -0.000259 | 18.588643 | -18.589168 | -0.000262 | 37.177811 | 3e-06 | 0.011911 | 0.0 |


### How to "fix" my gradients?

- Find suitable simulation tolerances.
- Consider switching from adjoint to forward sensitivities, which tend to be more robust.
- Check the simulation logs for warnings and errors.
- Ensure that the model is correct.
- Be aware that, depending on the tolerances, results may differ when sensitivities are computed alongside the objective value: `Objective(x, sensi_orders=(0,))` might return a different function value than `Objective(x, sensi_orders=(0,1))`.
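
The last point can be checked directly by calling the objective both ways and comparing the function values. The sketch below uses a hypothetical `toy_objective` that only mimics the call signature quoted above; the helper `fval_gap` and the assumption that requesting orders `(0, 1)` returns the pair `(fval, grad)` are illustrative and should be verified against the objective you actually use.

```python
import numpy as np


def fval_gap(objective, x):
    """Difference in function value between a call without and a call
    with gradient computation."""
    fval_only = objective(x, sensi_orders=(0,))
    # Assumption: requesting orders (0, 1) returns the pair (fval, grad).
    fval_with_grad, _grad = objective(x, sensi_orders=(0, 1))
    return abs(fval_only - fval_with_grad)


def toy_objective(x, sensi_orders=(0,)):
    """Hypothetical stand-in that mimics the call signature above."""
    x = np.asarray(x, dtype=float)
    fval = float(np.sum(x**2))
    if sensi_orders == (0,):
        return fval
    # Artificially perturb the value to mimic the effect of integrating
    # the sensitivity equations together with the model.
    return fval * (1 + 1e-9), 2 * x


gap = fval_gap(toy_objective, [1.0, 2.0])
print(f"function values differ by {gap:.2e}")
```

If the gap is of the same order as the function value differences used in the FD approximation, tightening the simulation tolerances is a reasonable first step.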