# Parameter Estimation

Users can:
    
    - Set which parameters to estimate
    - Set boundaries of optimization problem
    - Choose algorithm and set algorithm specific parameters
    - Configure report options (which parameters to include, name, append etc.)
    - Map arbitrary number of data files to fit at once
    - Choose between time course or steady state
      

To do:

    - Support for affected experiments section
    - Support for validation experiments



## Imports and Getting Test Model

In [1]:
%matplotlib inline
import site
site.addsitedir('//home/b3053674/Documents/pycotools')
from pycotools import model, tasks, viz, misc
from pycotools.Tests import test_models
import os
import pandas
from lxml import etree


import logging
logging.basicConfig(format = '%(levelname)s:%(message)s')
LOG=logging.getLogger()

## get string model from test_models
zi_model_string = test_models.TestModels().zi_model()

## get a working directory. Change this to wherever you like
directory = r'/home/b3053674/Documents/pycotools/ZiModel'

## choose path to zi model
zi_path = os.path.join(directory, 'zi2012.cps')

##write model to file
with open(zi_path, 'w') as f:
    f.write(zi_model_string)
    
## check file exists
if not os.path.isfile(zi_path):
    raise IOError('Model file was not written: {}'.format(zi_path))

zi = model.Model(zi_path)

root:INFO:27:    Initializing pycotools
root:INFO:28:    Initializing logging System
root:INFO:29:    logging config file at: //home/b3053674/Documents/pycotools/pycotools/logging_config.conf


## Simulate Synthetic Data for Demonstration

In [2]:
report= 'parameter_estimation_synthetic_data.txt'
TC=tasks.TimeCourse(
    zi, start=0, end=1000, intervals=1000, step_size=1, report_name=report
)

## check that it worked
pandas.read_csv(TC.report_name,sep='\t').head()

Unnamed: 0,Time,[Smad3n],[Smad3c],[Smad4n],[Smad4c],[T1R_Surf],[T2R_Cave],[T2R_Surf],[Smads_Complex_n],[T1R_EE],...,Values[Kcd],Values[kr_Cave],Values[ki_Cave],Values[Kexp_Smad4n],Values[Kdiss_Smads_Complex_n],Values[Kimp_Smad2c],Values[Kimp_Smads_Complex_c],Values[Kimp_Smad4c],Values[Kdeg_T2R_EE],Values[k_Smads_Complex_c]
0,0,236.45,492.61,551.72,1149.4,0.237,1.778,0.202,0.0,2.06,...,0.005,0.03742,0.33,0.5,0.1174,0.16,0.16,0.08,0.025,6.9e-05
1,1,236.324,491.559,551.647,1148.33,0.049448,1.71974,0.014197,0.158774,2.0016,...,0.005,0.03742,0.33,0.5,0.1174,0.16,0.16,0.08,0.025,6.9e-05
2,2,235.634,488.437,551.232,1145.12,0.049259,1.66112,0.013911,1.23787,1.94295,...,0.005,0.03742,0.33,0.5,0.1174,0.16,0.16,0.08,0.025,6.9e-05
3,3,234.251,483.497,550.352,1140.01,0.04908,1.60458,0.013739,3.91869,1.88642,...,0.005,0.03742,0.33,0.5,0.1174,0.16,0.16,0.08,0.025,6.9e-05
4,4,232.245,476.99,549.029,1133.28,0.048849,1.55008,0.013592,8.61316,1.83194,...,0.005,0.03742,0.33,0.5,0.1174,0.16,0.16,0.08,0.025,6.9e-05
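The `TimeCourse` call above passes `intervals=1000` and `step_size=1` for a simulation running from 0 to 1000, so the two arguments describe the same evenly spaced grid. The sketch below checks that arithmetic. It assumes the relation `step_size = (end - start) / intervals`, which holds for both `TimeCourse` calls in this notebook but is not stated as a rule by PyCoTools; `step_size_for` is a hypothetical helper, not part of the library.

```python
def step_size_for(start, end, intervals):
    """Return the spacing between output points on an evenly spaced grid."""
    if intervals <= 0:
        raise ValueError('intervals must be a positive integer')
    return (end - start) / float(intervals)


# Settings used for the first synthetic data set
print(step_size_for(0, 1000, 1000))  # 1.0

# Settings used for the second synthetic data set later in the notebook
print(step_size_for(0, 1000, 10))  # 100.0
```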


## Format Synthetic Data
Give the synthetic data a meaningful name.

In [3]:
data1 = TC.report_name

In [None]:
misc.correct_copasi_timecourse_headers(data1)
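The report printed earlier uses COPASI-style headers such as `[Smad3n]` and `Values[Kcd]`, whereas data files for parameter estimation need column headings that match the model names exactly. The snippet below is only a rough illustration of that kind of header clean-up. It is not the implementation of `misc.correct_copasi_timecourse_headers`, and `strip_copasi_brackets` is a hypothetical helper written for this example.

```python
import re


def strip_copasi_brackets(header):
    """Turn '[Smad3n]' into 'Smad3n' and 'Values[Kcd]' into 'Kcd'.

    Headers without the bracket pattern (e.g. 'Time') are returned unchanged.
    """
    match = re.match(r'^(?:Values)?\[(.+)\]$', header)
    return match.group(1) if match else header


# A few headers taken from the time course report shown earlier
raw_headers = ['Time', '[Smad3n]', '[T1R_Surf]', 'Values[Kcd]']
cleaned = [strip_copasi_brackets(h) for h in raw_headers]
print(cleaned)  # ['Time', 'Smad3n', 'T1R_Surf', 'Kcd']
```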

## Parameter Estimation Data Files
Rules to follow:

    - Column headings must match model variables exactly in order to be mapped. 
    - Independent variables can be mapped by appending the suffix `_indep` to the variable.
        - For example, if you vary the concentration of A between 0 and 10ng/mL in an experiment:
        
                Time A_indep  B   C
                t=1   0       -   - 
                t=2   0           -
                t=3  10           -
                t=4  10       -   -
    - Do not use non-ASCII strings (e.g. Greek alpha or beta) in parameter definitions. They are minimally supported but this is unstable. 
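
The rules above can be checked before a data file is handed to PyCoTools. The following is a minimal sketch, not library code: `check_headers` is a hypothetical helper, and skipping the `Time` column is an assumption made for this example. It strips the `_indep` suffix, flags non-ASCII names, and reports headings that have no exact (case-sensitive) match among the model variable names.

```python
def check_headers(headers, model_names):
    """Return a list of (header, reason) pairs for headings that break the rules."""
    suffix = '_indep'
    problems = []
    for header in headers:
        if header == 'Time':  # assumption: the time column is not a model variable
            continue
        # independent variables are named '<variable>_indep'
        name = header[:-len(suffix)] if header.endswith(suffix) else header
        try:
            name.encode('ascii')
        except UnicodeEncodeError:
            problems.append((header, 'non-ASCII characters'))
            continue
        if name not in model_names:
            problems.append((header, 'no matching model variable'))
    return problems


# Hypothetical model with three variables; 'c' differs from 'C' only by case
model_names = {'A', 'B', 'C'}
print(check_headers(['Time', 'A_indep', 'B', 'c', '\u03b1'], model_names))
```

An empty list means every heading can be mapped under these assumptions.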
    
 

# Setup and run single parameter estimation 

Method outline:

    1) Instantiate a `ParameterEstimation` instance with the model and data filenames
        - Specify any optional arguments when instantiating the `ParameterEstimation` class
    2) Use the `write_config_file()` method to write a parameter estimation config file, then configure it (the older `PyCoTools.pycopi` cells later in this notebook call this `write_config_template()`)
    3) Use the `setup()` method
    4) Use the `run()` method
    5) Use the `format_results()` method (to give the output results meaningful headers)

## Use HookeJeeves algorithm

In [4]:
PE=tasks.ParameterEstimation(zi, data1, method='hooke_jeeves')

## Use particle swarm algorithm

In [5]:
PE=tasks.ParameterEstimation(zi,data1, method='particle_swarm')

## Use genetic algorithm

In [6]:
## Set genetic algorithm parameters to low values for speed of demonstration
PE=tasks.ParameterEstimation(zi, data1, method='genetic_algorithm', 
                             population_size=15, number_of_generations=10)

## Write and configure item template

In [7]:
PE.write_config_file()
os.path.isfile(PE.config_filename)

True

### Configure manually
Now you can go to where the file was saved and modify it how you like.

    - Delete the rows for parameters you do not want included in the estimation
    - Change starting values, lower bounds and upper bounds
        - If you set start values, remember to set the randomize_start_values variable to False as it defaults to True

This is a useful way of running parameter estimations as it provides a record of how you've configured parameter estimations. 
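
The same edits can be scripted, which keeps the record reproducible. The sketch below assumes the config file is a CSV with columns named `name`, `start_value`, `lower_bound` and `upper_bound`. Those column names and the bound values are hypothetical, chosen for illustration, so inspect the file PyCoTools actually writes before adapting this. The parameter names and start values are taken from the time course report shown earlier. `edit_config` is not part of PyCoTools.

```python
import csv
import io

# Stand-in for the contents of a config file (hypothetical layout)
config_text = (
    'name,start_value,lower_bound,upper_bound\n'
    'Kcd,0.005,1e-06,1000000\n'
    'kr_Cave,0.03742,1e-06,1000000\n'
    'ki_Cave,0.33,1e-06,1000000\n'
)


def edit_config(text, drop=(), lower=None, upper=None):
    """Remove the rows named in `drop` and optionally overwrite the bounds."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator='\n')
    writer.writeheader()
    for row in reader:
        if row['name'] in drop:
            continue  # excluded from the estimation
        if lower is not None:
            row['lower_bound'] = str(lower)
        if upper is not None:
            row['upper_bound'] = str(upper)
        writer.writerow(row)
    return out.getvalue()


# Exclude Kcd and tighten the bounds on the remaining parameters
print(edit_config(config_text, drop={'Kcd'}, lower=0.1, upper=100))
```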

### Configure with API
The `metabolites`, `global_quantities` and `local_parameters` arguments tell PyCoTools which parameters to include in the config file. By default all model parameters are included, but this can be overridden by passing your own lists. 

For example, here we set `metabolites` and `global_quantities` to empty lists so that they are excluded from the estimation. We can also use the `upper_bound` and `lower_bound` arguments to constrain the estimation problem. A value can also be given to `start_value`, but parameter estimation is usually better performed multiple times with randomized start values (or with Latin hypercube sampling, but this is not supported here).

In [9]:
config_file = os.path.join(os.path.dirname(zi_path), 'Zi2002PEConfig.csv')
PE = tasks.ParameterEstimation(
    zi, data1, method='genetic_algorithm', population_size=15,
    number_of_generations=10, metabolites=[], global_quantities=[],
    lower_bound = 0.1, upper_bound = 100, config_filename = config_file)

Here we've defined a config file with only kinetic parameters estimated between the boundaries of 0.1 and 100. 
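
For readers unfamiliar with the Latin hypercube sampling mentioned above, here is a small standalone illustration of the idea using the same bounds of 0.1 and 100. It is not part of PyCoTools and is not wired into `ParameterEstimation`. The function name `latin_hypercube_starts` is hypothetical, and sampling on a log10 scale is a choice made for this example because the bounds span three orders of magnitude.

```python
import math
import random


def latin_hypercube_starts(n_runs, n_params, lower=0.1, upper=100.0, seed=0):
    """Return n_runs lists of n_params start values between lower and upper.

    For each parameter the log10 range is cut into n_runs equal strata and
    exactly one value is drawn from each stratum; the strata are then
    shuffled so that they are paired randomly across parameters.
    """
    rng = random.Random(seed)
    log_lo, log_hi = math.log10(lower), math.log10(upper)
    width = (log_hi - log_lo) / n_runs
    columns = []
    for _ in range(n_params):
        points = [log_lo + (i + rng.random()) * width for i in range(n_runs)]
        rng.shuffle(points)
        columns.append(points)
    # transpose so that each row holds the start values for one run
    return [[10 ** columns[p][r] for p in range(n_params)] for r in range(n_runs)]


starts = latin_hypercube_starts(n_runs=5, n_params=3)
for run in starts:
    print(['{:.3g}'.format(v) for v in run])
```

Compared with independent random draws, each parameter's range is covered evenly even when only a handful of estimation runs are performed.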

In [12]:
PE.write_config_file()

## check that the config filename exists
print (os.path.isfile(PE.config_filename))
PE.config_filename

True


'/home/b3053674/Documents/pycotools/ZiModel/Zi2002PEConfig.csv'

## Setup and run parameter estimation

In [13]:
PE.setup()



SomethingWentHorriblyWrongError: Kdeg_T1R_EE not a metabolite, local_parameter or global_quantity

It can sometimes be helpful to open the GUI and manually check that you've set all the options correctly. Run the next cell to open the model:

In [None]:
import os
os.system('CopasiUI {}'.format(zi_path))
##  remember to close the model before continuing

## Run with CopasiSE

In [None]:
PE.run() # will take some time depending on arguments to ParameterEstimation

In [None]:
## let's take a look at the parameter estimation results
pe_data= pandas.read_csv(PE.kwargs['report_name'],sep='\t') 

LOG.info( 'These are your estimated parameters:\n\n{}'.format(pe_data.transpose())  )

## Multiple Data Files

### Change a parameter value  
PyCoTools can easily handle multiple data files: give `ParameterEstimation` a list of data file paths. Let's simulate some more data.

Let's first change a model parameter so that the two sets of simulated data are not identical:

In [None]:
import PyCoTools  ## older package name, needed for the PyCoTools.pycopi calls

## Use InsertParameters to set V1 of the 'MAPKKK activation' reaction to 45
I=PyCoTools.pycopi.InsertParameters(kholodenko_model,parameter_dict={'(MAPKKK activation).V1':45})

## Set to True for manual check
manual_check = False
if manual_check:
    os.system('CopasiUI {}'.format(kholodenko_model))

### Simulate second time course

In [None]:
report= 'parameter_estimation_synthetic_data2.txt'
TC=PyCoTools.pycopi.TimeCourse(kholodenko_model,
                               end=1000,
                               intervals=10,
                               step_size=100,
                               report_name = report,
                               plot=True)


## Give fake data a meaningful name
data2 = TC['report_name']

## Setup parameter estimation with two data files
Pass a list of properly configured COPASI data files rather than a single string. 

In [None]:
report = 'parameter_estimation_data2.txt'
config_file = os.path.join(os.path.dirname(kholodenko_model), 'kholodenko_config_file2.xlsx')
PE=PyCoTools.pycopi.ParameterEstimation(kholodenko_model,[data1,data2],method='GeneticAlgorithm',
                       population_size = 5,number_of_generations= 20,report_name = report,
                       metabolites=[], global_quantities=[], lower_bound=0.1, 
                      upper_bound=100, config_filename=config_file)
PE.write_config_template()
PE.setup()
PE.run()
PE.format_results()

### Check the parameter estimation data:

In [None]:
df = pandas.read_csv(PE['report_name'],sep='\t').transpose()
LOG.info('Estimated parameters are:\n{}'.format(df)) 

# Visualization 
Use the `plot` keyword to visualize the data

In [None]:
from PyCoTools.pycopi import ParameterEstimation 

report = 'parameter_estimation_data2.txt'
config_file = os.path.join(os.path.dirname(kholodenko_model), 'kholodenko_config_file3.xlsx')
PE=ParameterEstimation(kholodenko_model,[data1,data2],method='GeneticAlgorithm',
                       population_size = 50,number_of_generations= 300,report_name = report,
                       metabolites=[], global_quantities=[], lower_bound=0.1, 
                       upper_bound=100, config_filename=config_file, plot=True)
PE.write_config_template()
PE.setup()
PE.run()
PE.format_results()

## Insert the parameters and simulate
Insert the estimated parameters into the model for further analysis

In [None]:
pe_data = pandas.read_csv(report,sep='\t')

## pass the path to the results file, not a DataFrame column
PyCoTools.pycopi.InsertParameters(kholodenko_model,parameter_path = PE['report_name'],
                                  index = 0)

os.system('CopasiUI {}'.format(kholodenko_model))

