# Example of API usage for optimization

To install the package, run the following from the base directory:

```console
$ python3 setup.py install
```

To import the package:

In [1]:
import deminf_data

## Get an objective function by data set name

An objective function can be obtained from the name of a data set. Calling it with parameter values returns the log-likelihood for that data set, or its negative if `negate=True` is passed. It can be used for optimization benchmarks.

(Loading the third objective may take a while.)

In [2]:
objective_1 = deminf_data.Objective.from_name("1_Bot_4_Sim")

In [3]:
objective_2 = deminf_data.Objective.from_name("2_ExpDivNoMig_5_Sim", negate=True)

In [4]:
objective_3 = deminf_data.Objective.from_name("4_DivMig_11_Sim", negate=True, type_of_transform="logarithm")

## Number of parameters and their bounds

Objectives provide information about their parameters. Note that the bounds are already in transformed units (for more information, see the next section).

In [5]:
def get_information_from_objective(objective):
    print("Objective with name:", objective.name)
    print("Number of parameters:", objective.n_params)
    print("Lower bounds of parameters:", objective.lower_bound)
    print("Upper bounds of parameters:", objective.upper_bound)

In [6]:
get_information_from_objective(objective_1)

Objective with name: 1_Bot_4_Sim
Number of parameters: 4
Lower bounds of parameters: [0.001 0.001 0.    0.   ]
Upper bounds of parameters: [100 100   5   5]


In [7]:
get_information_from_objective(objective_2)

Objective with name: 2_ExpDivNoMig_5_Sim
Number of parameters: 5
Lower bounds of parameters: [1.e-02 1.e-02 1.e-02 1.e-15 1.e-15]
Upper bounds of parameters: [100 100 100   5   5]


In [8]:
get_information_from_objective(objective_3)

Objective with name: 4_DivMig_11_Sim
Number of parameters: 11
Lower bounds of parameters: [ -4.60517019  -4.60517019  -4.60517019  -4.60517019  -4.60517019
  -4.60517019 -34.53877639 -34.53877639 -34.53877639 -34.53877639
 -34.53877639]
Upper bounds of parameters: [4.60517019 4.60517019 4.60517019 4.60517019 4.60517019 4.60517019
 2.30258509 2.30258509 1.60943791 1.60943791 1.60943791]
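One typical use of these attributes is drawing a random point inside the search box, for instance as a starting point for an optimizer. The sketch below uses only the standard library and hard-codes the bounds printed above for `1_Bot_4_Sim`; the helper `random_point` is a hypothetical name for illustration and is not part of `deminf_data`.

```python
import random

# Bounds copied from the output above for 1_Bot_4_Sim (no transform).
lower_bound = [0.001, 0.001, 0.0, 0.0]
upper_bound = [100, 100, 5, 5]


def random_point(lower, upper, seed=None):
    """Draw one point uniformly from the box [lower, upper] (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]


point = random_point(lower_bound, upper_bound, seed=0)
inside = all(lo <= x <= hi for lo, x, hi in zip(lower_bound, point, upper_bound))
print(len(point), inside)
```

With a real objective, the two lists would come from `objective.lower_bound` and `objective.upper_bound` instead of being typed in by hand.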


## Transformation of parameter space
The parameter space of an objective can be transformed. For example, if `type_of_transform="logarithm"` is given, the logarithm is applied to the parameters. This requires no extra work from the user of the objective, because the bounds are already transformed. The other type of transform is `type_of_transform="suctom_logarithm"`, which applies the logarithm only to non-migration parameters.

To make this clearer, consider an example:

In [9]:
objective_1 = deminf_data.Objective.from_name("2_ExpDivNoMig_5_Sim") # without any transform
objective_2 = deminf_data.Objective.from_name("2_ExpDivNoMig_5_Sim", type_of_transform="logarithm")

We can observe that the bounds of `objective_2` are the logarithms of the bounds of `objective_1`:

In [10]:
get_information_from_objective(objective_1)
get_information_from_objective(objective_2)

Objective with name: 2_ExpDivNoMig_5_Sim
Number of parameters: 5
Lower bounds of parameters: [1.e-02 1.e-02 1.e-02 1.e-15 1.e-15]
Upper bounds of parameters: [100 100 100   5   5]
Objective with name: 2_ExpDivNoMig_5_Sim
Number of parameters: 5
Lower bounds of parameters: [ -4.60517019  -4.60517019  -4.60517019 -34.53877639 -34.53877639]
Upper bounds of parameters: [4.60517019 4.60517019 4.60517019 1.60943791 1.60943791]
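The second pair of bounds can be reproduced by hand, since each value is the natural logarithm of the corresponding untransformed bound. A quick check using only the standard library, with the bound values copied from the output above:

```python
import math

# Untransformed bounds of 2_ExpDivNoMig_5_Sim, copied from the output above.
lower_bound = [1e-2, 1e-2, 1e-2, 1e-15, 1e-15]
upper_bound = [100, 100, 100, 5, 5]

# Applying the natural logarithm elementwise gives the transformed bounds.
log_lower = [round(math.log(v), 8) for v in lower_bound]
log_upper = [round(math.log(v), 8) for v in upper_bound]

print(log_lower)  # [-4.60517019, -4.60517019, -4.60517019, -34.53877639, -34.53877639]
print(log_upper)  # [4.60517019, 4.60517019, 4.60517019, 1.60943791, 1.60943791]
```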


If we then draw random parameters for `objective_1`, the logarithm of these parameters gives the same objective value for `objective_2`:

In [11]:
import numpy as np
params = np.random.uniform(objective_1.lower_bound, objective_1.upper_bound)
obj_1 = objective_1(params)
print("Value of objective_1 on non-transformed params:", obj_1)
transformed_params = np.log(params)
obj_2 = objective_2(transformed_params)
print("Value of objective_2 on transformed params:", obj_2)
print("Values are close:", np.allclose(obj_1, obj_2))

Value of objective_1 on non-transformed params: -408503.7009439486
Value of objective_2 on transformed params: -408503.7009439487
Values are close: True
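Conceptually, a log-transformed objective behaves like the original objective composed with the exponential function. The following is a minimal sketch of that idea on a toy function. The names `toy_objective` and `with_log_transform` are hypothetical and used only for illustration; this is not a description of how `deminf_data` is implemented internally.

```python
import math


def toy_objective(params):
    """A stand-in objective on positive parameters (not a real likelihood)."""
    return -sum((p - 2.0) ** 2 for p in params)


def with_log_transform(objective):
    """Return a function that accepts log-parameters and evaluates `objective`."""
    def transformed(log_params):
        # Undo the logarithm before calling the original objective.
        return objective([math.exp(v) for v in log_params])
    return transformed


params = [0.5, 3.0, 10.0]
log_params = [math.log(p) for p in params]

value = toy_objective(params)
value_transformed = with_log_transform(toy_objective)(log_params)
print(math.isclose(value, value_transformed))
```

The two values agree only up to floating-point rounding, which is why the comparison uses a tolerance rather than exact equality.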


## Optimization

Here is an example of usage in optimization with SciPy (L-BFGS-B). Since this is a minimization, we need to negate our function.

In [12]:
import scipy.optimize

fun = deminf_data.Objective.from_name("1_Bot_4_Sim", negate=True, type_of_transform="logarithm")
# Starting point: zeros in the transformed space, a rough initial guess
x0 = [0 for _ in fun.lower_bound]
# One (lower, upper) pair per parameter, as scipy.optimize.minimize expects
bounds = list(zip(fun.lower_bound, fun.upper_bound))

def callback(x):
    # Print the objective value and the current point after each iteration
    y = fun(x)
    print(y, x)

result = scipy.optimize.minimize(fun, x0, method="L-BFGS-B", bounds=bounds, callback=callback)

1072.7961158459184 [ 5.14578337e-01 -7.71867505e-01  6.10014528e-05  1.83004358e-04]
574.1414181134596 [ 0.71206506 -0.83812539  0.04186397 -0.32872396]
471.2172996666159 [ 1.00135671 -1.05948898  0.1628776  -0.79537748]
415.1424603475971 [ 0.89984493 -0.98076323  0.11065138 -0.62358731]
413.0735965744825 [ 0.89529677 -0.99786753  0.11593928 -0.61483971]
405.1502271871741 [ 0.88505984 -1.14083072  0.16530048 -0.58889489]
401.873590993594 [ 0.8978786  -1.19748875  0.18851423 -0.60956615]
398.36641526509993 [ 0.91930106 -1.3339613   0.24301603 -0.64231311]
396.62302371025726 [ 0.93427502 -1.43480335  0.29308715 -0.68071958]
395.94692393647983 [ 0.94366241 -1.519452    0.33205211 -0.70320677]
395.134599429015 [ 0.96080123 -1.63067165  0.38808767 -0.7474449 ]
394.43497860975094 [ 0.97525054 -1.72352881  0.44334171 -0.79517248]
394.10885383235836 [ 0.99011525 -1.82627089  0.49901245 -0.83913412]
393.91572274009195 [ 1.00335397 -1.91343711  0.54947693 -0.88155527]
393.6054588042789 [ 1.01732
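The need for `negate=True` follows from the fact that maximizing a function is the same as minimizing its negative: both problems share the same optimal point. A small illustration with a grid search on a toy function (the function and the grid are made up for this example and are unrelated to the data set above):

```python
def toy_log_likelihood(x):
    """Toy function with a single maximum at x = 1.5."""
    return -(x - 1.5) ** 2


def negated(x):
    """The function a minimizer should be given instead of the one above."""
    return -toy_log_likelihood(x)


grid = [i / 100 for i in range(0, 301)]  # 0.00, 0.01, ..., 3.00

argmax = max(grid, key=toy_log_likelihood)  # maximizer of the original function
argmin = min(grid, key=negated)             # minimizer of the negated function
print(argmax, argmin)
```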

We can now compare the parameters found with the best known values for this data set. Note that for some data sets the best known value of the function is only an estimate of the optimum, so it is not guaranteed that values obtained by optimization are always worse than the best known value.

In [13]:
# 1. Get the object with all information about the data set
data_set = deminf_data.DemInfData.from_name("1_Bot_4_Sim")

# 2. Get the best known value of the fitness function (log-likelihood).
#    It must be negated, because `fun` was created with negate=True.
best_known_obj = -data_set.max_ll
print("Best known value of our objective:", best_known_obj)

# We can ask the data set whether this is the true optimum
print("Best known value of objective is optimum:", data_set.is_maximum_likelihood_exact())

# 3. Get the best known parameters and transform them
best_known_params = fun.transform(data_set.popt)
print("Parameters that correspond to it:", best_known_params)

# We can check that these parameters correspond to the best value of the objective
print("Parameters correspond indeed:", np.isclose(best_known_obj, fun(best_known_params)))

# 4. Compare the best known value of the objective with the one from optimization
print("Parameters found by optimization are better than known values:", fun(result.x) < best_known_obj)

Best known value of our objective: 88.56032595987472
Best known value of objective is optimum: True
Parameters that correspond to it: [-4.60517019  0.         -5.29831737 -2.99573227]
Parameters correspond indeed: True
Parameters found by optimization are better than known values: False
