<font color='grey' size='1.5'> Created by Parisa Hosseinzadeh, 04/25/2021, updated on 02/07/2022

# Nelder-Mead optimization

The Nelder-Mead method is a popular optimization algorithm for finding the minimum or maximum of an **objective** function in a **multidimensional** space.

Nelder-Mead is categorized as a *direct search* method because it decides the next step by comparing function values alone. It can be applied to functions whose derivative we don't know, which makes it a suitable method for optimizing the parameters of a score function by optimizing its performance on multiple tasks simultaneously.

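This algorithm works by evaluating the function at the vertices of a simplex (n+1 points for n parameters) built around the starting parameters, and then calculating the centroid (center of mass) of all vertices except the worst one. Based on the calculated values, it chooses one of several actions: *reflect, expand, contract inwards or outwards, shrink*. It stops when it converges, that is, when the vertices and their function values no longer change by more than a set tolerance.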
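
The loop described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the implementation used by scipy: the function name `nelder_mead_sketch`, the way the initial simplex is built, the coefficients (1 for reflection, 2 for expansion, 0.5 for contraction and shrinking), and the stopping rule are all assumptions chosen for readability.

```python
import numpy as np

def nelder_mead_sketch(f, x0, step=0.5, max_iter=500, tol=1e-8):
    """Simplified Nelder-Mead minimizer (illustrative, not scipy's code)."""
    n = len(x0)
    # Initial simplex: x0 plus one point offset along each axis (assumption)
    simplex = [np.array(x0, dtype=float)]
    for i in range(n):
        p = np.array(x0, dtype=float)
        p[i] += step
        simplex.append(p)
    values = [f(p) for p in simplex]

    for _ in range(max_iter):
        # Order vertices from best (lowest value) to worst (highest value)
        order = np.argsort(values)
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]

        # Stop when the simplex is tiny and its values are nearly equal
        size = max(np.max(np.abs(p - simplex[0])) for p in simplex[1:])
        if size < tol and abs(values[-1] - values[0]) < tol:
            break

        worst = simplex[-1]
        centroid = np.mean(simplex[:-1], axis=0)  # excludes the worst vertex

        # Reflect the worst vertex through the centroid
        xr = centroid + (centroid - worst)
        fr = f(xr)
        if values[0] <= fr < values[-2]:
            simplex[-1], values[-1] = xr, fr
        elif fr < values[0]:
            # Expand further along the same direction
            xe = centroid + 2.0 * (centroid - worst)
            fe = f(xe)
            if fe < fr:
                simplex[-1], values[-1] = xe, fe
            else:
                simplex[-1], values[-1] = xr, fr
        else:
            # Contract outwards (toward xr) or inwards (toward worst)
            if fr < values[-1]:
                xc = centroid + 0.5 * (xr - centroid)
            else:
                xc = centroid + 0.5 * (worst - centroid)
            fc = f(xc)
            if fc < min(fr, values[-1]):
                simplex[-1], values[-1] = xc, fc
            else:
                # Shrink every vertex toward the best one
                best = simplex[0]
                simplex = [best + 0.5 * (p - best) for p in simplex]
                values = [f(p) for p in simplex]

    best_idx = int(np.argmin(values))
    return simplex[best_idx], values[best_idx]

# Toy objective with its minimum at (1, -2)
def bowl(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

x_best, f_best = nelder_mead_sketch(bowl, [0.0, 0.0])
print(x_best, f_best)
```

Note that only function values are compared; no derivative is ever computed, which is what makes this a direct search method.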

In this activity we employ Nelder-Mead optimization to train the parameters of a simple score function. To learn more about the Nelder-Mead method, check [here](http://www.scholarpedia.org/article/Nelder-Mead_algorithm) or [here](https://codesachin.wordpress.com/2016/01/16/nelder-mead-optimization/). To see how Nelder-Mead is used to train a real score function, look at [Park et al, J Chem Theory Comput 2016](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5515585/).

In [1]:
#@title Step 1. Importing necessary libraries
#@markdown Run this cell to import necessary modules.

import pandas as pd
import numpy as np
from scipy.optimize import minimize

In [2]:
#@title Step 2. Defining our optimization

#@markdown **2.a. Defining score function**

#@markdown We're defining a very simplistic score function 
#@markdown that only takes into account electrostatic (elec), 
#@markdown van der Waals (vdw), and rotamer probability (rot) energy terms. 
#@markdown w1, w2, and w3 are the parameters or weights we're optimizing.
#@markdown Check out this example.

w1 = 0.1
w2 = 0.3
w3 = 0.2
elec = -20
vdw = -13
rot = -2
score = w1*elec + w2*vdw + w3*rot
print('initial weights are:')
print('w1 (elec) = ', w1)
print('w2 (vdw)= ', w2)
print('w3 (rot)= ', w3)
print ('score = w1*elec + w2*vdw + w3*rot')
print ('score value is:', score)

initial weights are:
w1 (elec) =  0.1
w2 (vdw)=  0.3
w3 (rot)=  0.2
score = w1*elec + w2*vdw + w3*rot
score value is: -6.300000000000001


In [3]:
#@title Optional: Vectorizing the score
#@markdown This will make the calculations faster
#@markdown for bigger systems.

# To make things faster, we can use vector operations
# W = [w1, w2, w3]
W = np.array([0.1, 0.3, 0.2])
# E = [elec, vdw, rot]
E = np.array([-20, -13, -2])
# Vector operation for multiplication
score = np.dot(W,E)
print ('score value is:', score)

score value is: -6.3
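
The scalar version above printed `-6.300000000000001` while the vectorized version printed `-6.3`. The difference is most likely floating-point rounding (0.1, 0.3 and 0.2 have no exact binary representation, and the two versions may add the terms up differently). The sketch below, using the same numbers, compares the two results with a tolerance instead of exact equality; the variable names `score_scalar` and `score_vector` are introduced here only for illustration.

```python
import math
import numpy as np

w1, w2, w3 = 0.1, 0.3, 0.2
elec, vdw, rot = -20, -13, -2

# Term-by-term sum, as in the first example
score_scalar = w1 * elec + w2 * vdw + w3 * rot
# Dot product, as in the vectorized example
score_vector = np.dot(np.array([w1, w2, w3]), np.array([elec, vdw, rot]))

# The printed values may differ in the last digit
print(score_scalar, score_vector)
# Compare with a relative tolerance rather than ==
print(math.isclose(score_scalar, score_vector, rel_tol=1e-12))
```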


In [4]:
#@markdown **Defining score calculation function**

#@markdown Run this cell to prepare for calculations.

# Now we can define a function that performs the calculation we just tried
def score_calc(W, E):
  '''
     Calculates the score as the dot product of the weight
     array -W- and the energy array -E-.
  '''
  return np.dot(W,E) 

### 2.b. Training set

For our training set, we have 10 protein systems. For each protein, I have the total energy as well as a breakdown of the energy into van der Waals, electrostatic, and rotamer probability terms.


In [5]:
#@markdown **2.b. Training set**
#@markdown For our training set, we have 10 protein systems. 
#@markdown For each protein, I have the total energy as well as 
#@markdown a breakdown of the energy into van der Waals, 
#@markdown electrostatic, and rotamer probability terms. Together,
#@markdown these make up our training set.

#@markdown Our protein sets are:

#@markdown `p0 : exp = -28, E =[-10,-25,-3]`

#@markdown `p1 : exp = -11, E =[-5,-17,+1]`

#@markdown `p2 : exp = -22, E =[-12,-21,-0.5]`

#@markdown `p3 : exp = -32, E =[-20,-20, -1]`

#@markdown `p4 : exp = -9, E =[-5,-10, +0.5]`

#@markdown `p5 : exp = -25, E =[-12,-18, -2]`

#@markdown `p6 : exp = -25, E =[-11,-26, -1]`

#@markdown `p7 : exp = -12, E =[-5,+2, -4]`

#@markdown `p8 : exp = -21, E =[-16,-16, +2]`

#@markdown `p9 : exp = -30, E =[-13,-20, -4]`

# To make calculations easier, we write it as a matrix
# (rows follow the order p1 to p9 listed above)
p_individual_e = np.array (
    [
    [-5,-17, +1],
    [-12,-21, -0.5],
    [-20,-20, -1],
    [-5,-10, +0.5],
    [-12,-18, -2],
    [-11,-26, -1],
    [-5,+2, -4],
    [-16,-16, +2],
    [-13,-20, -4],
    ]
)
p_experimental_e = np.array (
    [
    -11.5,
    -23.5,
    -32,
    -9,
    -25,
    -26,
    -12,
    -20,
    -31,
    ]
)

In [6]:
#@markdown **2.c. Defining the loss function**

#@markdown One of the most important steps in any optimization 
#@markdown is the definition of a loss function. 
#@markdown This *loss function* defines the problem we are trying to solve. 
#@markdown For example, in this case, we want to make sure that our score 
#@markdown function generates energies that are as close to the experimental energy as possible.
#@markdown To do so, the sum of the absolute differences between the calculated scores 
#@markdown and the experimental energies should be minimal.

# Now let's put everything into a function
def calc_loss (W, E, exp):
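  '''
     Calculates the loss given the weights of the score
     function -W-, the energy matrix -E- (one row per protein),
     and the experimental energies -exp-.
  '''
  # score_calc is applied to each row of E; np.dot gives the same
  # result whichever of the row and W is passed first
  calculated_score = np.apply_along_axis(score_calc, 1, E, W)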
  score_diff = np.subtract(calculated_score, exp)
  loss = np.sum(np.absolute(score_diff))

  return loss
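
As a small worked example of this loss, the sketch below evaluates it by hand for two of the systems listed earlier (p3 and p7), with all weights set to 1. The matrix-vector form `E @ W` is an assumption of this sketch; it should give the same scores as applying `score_calc` row by row, but it is not how `calc_loss` is written above.

```python
import numpy as np

# Two systems taken from the protein list above (p3 and p7)
E = np.array([[-20.0, -20.0, -1.0],
              [-5.0, 2.0, -4.0]])
exp = np.array([-32.0, -12.0])
W = np.array([1.0, 1.0, 1.0])  # equal weights, chosen only for illustration

scores = E @ W                    # one score per system: [-41., -7.]
residuals = scores - exp          # [-9., 5.]
loss = np.sum(np.abs(residuals))  # |-9| + |5|
print(loss)  # 14.0
```

A loss of 0 would mean that every calculated score matches its experimental energy exactly.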

In [7]:
#@title Step 3. Optimization
#@markdown At this step, we will perform optimization.

#@markdown **3.a. Set up initial weights**

#@markdown It's very important to set some initial parameter values. 
#@markdown In general, it is suggested to avoid all-zero vectors. 
#@markdown For the purpose of this example, 
#@markdown we assume all score terms contribute equally to energy, 
#@markdown so each initial weight will be 1:
#@markdown `W_init = np.array([1, 1, 1])`

w1 =  1#@param {type:"number"}
w2 =  1#@param {type:"number"}
w3 =  1#@param {type:"number"}

W_init = np.array([w1, w2, w3])

### 3.b. Optimization

This is where the optimization happens! We will use `scipy.optimize` for the optimization. Since you have your loss function and your score calculation ready, you can write it down. Check out the [scipy documentation](https://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html#nelder-mead-simplex-algorithm-method-nelder-mead) and see if you can write up the optimization.
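
As a warm-up, here is a minimal sketch of the general call pattern on a toy problem that is unrelated to the score function. The objective `toy_loss` and its `target` argument are hypothetical names introduced only for this example. The point to notice is that the parameters being optimized come first in the objective's signature, and any extra inputs are passed through `args`.

```python
import numpy as np
from scipy.optimize import minimize

def toy_loss(params, target):
    """Squared distance between params and a target vector (toy objective)."""
    return np.sum((params - target) ** 2)

target = np.array([3.0, -1.0])
start = np.array([0.0, 0.0])

result = minimize(toy_loss, start,
                  args=(target,),        # extra arguments passed after params
                  method='nelder-mead',
                  options={'xatol': 1e-8})
print(result.x)    # should be close to the target
print(result.fun)  # value of the objective at result.x
```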

In [None]:
#@markdown **3.b. Optimization**

#@markdown This is where the optimization happens! 
#@markdown We will use `scipy.optimize` for the optimization. 
#@markdown As you run this cell, you will see an output.
#@markdown Each line shows a running counter, followed by *new w1, new w2, new w3* and *loss*.

def callbackF(Xi):
    global Nfeval
    print ('{0:4d}   {1: 3.6f}   {2: 3.6f}   {3: 3.6f}   {4: 3.6f}'.format(
        Nfeval, Xi[0], Xi[1], Xi[2], calc_loss(Xi,
                                               p_individual_e,
                                               p_experimental_e)))
    Nfeval += 1  # the callback is called once per iteration

Nfeval = 0
# Optimization
score_opt = minimize(calc_loss, W_init, 
                     args=(p_individual_e, p_experimental_e),
                     method='nelder-mead',
                     callback=callbackF,
                     options={'xatol': 1e-8, 
                              'disp': True,},
                     )
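
According to the scipy documentation, the `callback` passed to `minimize` is called after each iteration, which is not the same as the number of function evaluations (one Nelder-Mead iteration can evaluate the function several times). The sketch below illustrates the difference on a toy objective; `toy_loss`, `record`, and `history` are hypothetical names used only here.

```python
import numpy as np
from scipy.optimize import minimize

def toy_loss(params):
    """Toy objective with its minimum at (2, -3)."""
    return (params[0] - 2.0) ** 2 + (params[1] + 3.0) ** 2

history = []  # best point reported at each callback invocation

def record(xk):
    history.append(np.copy(xk))

result = minimize(toy_loss, np.array([0.0, 0.0]),
                  method='nelder-mead', callback=record)

print('callback calls    :', len(history))
print('iterations (nit)  :', result.nit)
print('evaluations (nfev):', result.nfev)
```

The number of callback calls should be close to `result.nit`, while `result.nfev` is larger because it counts every evaluation of the objective.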

In [None]:
#@title Observing the results
#@markdown Let's check out our results
solution = score_opt['x']
print ('new w1 is:', solution[0])
print ('new w2 is:', solution[1])
print ('new w3 is:', solution[2])

In [None]:
#@markdown Let's check one or two of our proteins:

#@markdown `p3 : exp = -32, E =[-20,-20, -1]`

#@markdown `p7 : exp = -12, E =[-5,+2, -4]`

print('predicted value for p3 energy is:',
      solution[0]*-20 + solution[1]*-20 + solution[2]*-1)

print('predicted value for p7 energy is:',
      solution[0]*-5 + solution[1]*2 + solution[2]*-4)