# Example: Vapour pressures and enthalpies of vaporisation of alkyl formamides

## Introduction

This example discusses the evaluation of the vapor pressure of pure N-butyl-formamide (Figure 1) as a function of temperature.


**Figure 1: N-butyl-formamide. Source: [1]**

<img src="Imagens/formamida.png" alt="drawing" width="150"/>


The experimental data are presented in Table 1.

**Table 1. Experimental data. Source: [2].**

Temperature / K | Vapor pressure / Pa | Vapor pressure uncertainty / Pa
:--------------:|:---------------------:|:-------------------------------:
        297.1   |         2.93          |        0.08 
        298.2   |         3.21          |        0.09
        299.3   |         3.49          |        0.09
        301.2   |         4.22          |        0.11
        304.2   |         5.60          |        0.17
        307.2   |         7.31          |        0.21
          .     |           .           |          .
          .     |           .           |          .
          .     |           .           |          .
        334.2   |          63.36        |        1.61
        337.1   |          78.93        |        2.00
        340.2   |          93.65        |        2.37
        343.2   |          115.11       |        2.90
        346.2   |          140.27       |        3.53
        349.1   |          171.89       |        4.32
        352.2   |          208.00       |        5.23
        
        
The vapor pressure model for this problem is given by [1]:

$y = \exp\left(A-\frac{B}{T+C}\right)$ (1),

where $y$ is the vapor pressure of the substance, $T$ is the temperature, and $A$, $B$ and $C$ are the parameters to be estimated.
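As a quick illustration of equation (1), the sketch below evaluates the model in plain Python. The function name `vapor_pressure` and the parameter values are hypothetical, chosen only to show the shape of the curve; they are not the estimates obtained later in this example.

```python
import math


def vapor_pressure(T, A, B, C):
    """Evaluate equation (1): y = exp(A - B / (T + C))."""
    return math.exp(A - B / (T + C))


# Hypothetical parameter values, for illustration only
A, B, C = 28.0, 8000.0, 0.0

# Lowest and highest temperatures (K) listed in Table 1
y_low = vapor_pressure(297.1, A, B, C)
y_high = vapor_pressure(352.2, A, B, C)

# For B > 0 the predicted vapor pressure increases with temperature
print(y_low < y_high)
```

Taking the logarithm of both sides gives $\ln y = A - B/(T+C)$, which is why the model is nonlinear in the parameters only through $C$.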

The optimization problem to be solved uses the objective function of least squares weighted by the inverse of the variance, according to [2]:

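$\min_{A, B, C} \sum_{i=1}^{NE} \frac{\left(y^{exp}_i-y_i(A,B,C)\right)^2}{u^2_{y_i}}$ (2),

subject to (1), where $NE$ is the number of experimental observations and $u_{y_i}$ is the uncertainty of the $i$-th measured vapor pressure.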
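A minimal sketch of a least-squares objective weighted by the inverse of the variance is given below, in plain Python and independent of MT_PEU. The names `model` and `objective` and the trial parameter values are assumptions made for illustration; only the three data rows come from Table 1.

```python
import math


def model(T, A, B, C):
    """Equation (1): vapor pressure as a function of temperature."""
    return math.exp(A - B / (T + C))


def objective(params, T_data, y_data, u_data):
    """Sum of squared residuals, each weighted by 1 / u**2."""
    A, B, C = params
    return sum(
        ((y_exp - model(T, A, B, C)) / u) ** 2
        for T, y_exp, u in zip(T_data, y_data, u_data)
    )


# First rows of Table 1
T_data = [297.1, 298.2, 299.3]
y_data = [2.93, 3.21, 3.49]
u_data = [0.08, 0.09, 0.09]

# Hypothetical trial parameters, for illustration only
trial = (28.0, 8000.0, 0.0)
print(objective(trial, T_data, y_data, u_data))
```

Dividing each residual by its uncertainty before squaring means that precise measurements pull the fit harder than imprecise ones.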

The following symbols will be used to solve this problem in MT_PEU:

* Symbols of the independent quantities (temperature): T
* Symbols of the dependent quantities (vapor pressure): y
* Symbols of the parameters: A, B and C

## Importing packages

The libraries (packages) needed to run the code are imported below.

* **MT_PEU**: library that contains the main functionalities of the tool

    * Import the class **EstimacaoNaoLinear**, which will be used in this non-linear estimation example.

* **casadi**: library for symbolic computation

    * Only the function **exp** (exponential) is required to build the model.

In [2]:
from MT_PEU import EstimacaoNaoLinear
from casadi import exp

## Model creation

Model (1) describes the behavior of the dependent quantity; its parameters $A$, $B$ and $C$ will be estimated.

This model is defined as a Python function (**def**):

In [3]:
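def Model(param, x, args):
    # Column 0 of x holds the independent quantity (temperature)
    T = x[:, 0]
    # Parameters to be estimated
    A, B, C = param[0], param[1], param[2]

    # Vapor pressure from equation (1), vectorized over all temperatures
    return exp(A - (B / (T + C)))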

## Class initialization

The first step in performing the estimation is to configure the class **EstimacaoNaoLinear** with the following basic information:

* The model;
* List of symbols of the independent quantities *(T)*;
* List of symbols of the dependent quantities *(y)*;
* List of symbols of the parameters *(param)*;
* The project name, that is, the name of the folder where the results will be saved.

In [4]:
Estimation = EstimacaoNaoLinear(Model, symbols_x=[r'T'], symbols_y=[r'y'], 
                            symbols_param=['A','B','C'],  Folder='Exemple3' )

## Data inclusion

The experimental data from Table 1 for the dependent quantity (y) and the independent quantity (T) are presented below in the form of lists:

In [5]:
#Vapor pressure
y = [2.93,3.21,3.49,4.22,5.60,7.31,9.12,13.07,14.98,17.63,18.02,22.08,26.95,34.61,
       40.93,50.17,63.36,78.93,93.65,115.11,140.27,171.89,208.00]

#Temperature
temperature = [297.1,298.2,299.3,301.2,304.2,307.2,310.2,314.1,316.2,317.8,318.2,320.2,
               323.1,326.2,329.1,331.2,334.2,337.1,340.2,343.2,346.2,349.1,352.2]

MT_PEU **requires** the **uncertainties of the experimental data** (utemperature, uy) to be provided.

In [6]:
utemperature = [0.1]*23
uy = [0.08,0.09,0.09,0.11,0.17,0.21,0.25,0.35,0.40,0.47,0.48,0.58,
      0.70,0.89,1.05,1.28,1.61,2.00,2.37,2.90,3.53,4.32,5.23]
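Before the data are passed to MT_PEU, it can be worth checking that each list of values has one uncertainty per observation. The sketch below is plain Python and does not use MT_PEU; the name `relative_uy` is introduced here for illustration, and the lists are repeated so that the check runs on its own.

```python
# Data repeated from the cells above so the check is self-contained
y = [2.93, 3.21, 3.49, 4.22, 5.60, 7.31, 9.12, 13.07, 14.98, 17.63, 18.02, 22.08,
     26.95, 34.61, 40.93, 50.17, 63.36, 78.93, 93.65, 115.11, 140.27, 171.89, 208.00]
uy = [0.08, 0.09, 0.09, 0.11, 0.17, 0.21, 0.25, 0.35, 0.40, 0.47, 0.48, 0.58,
      0.70, 0.89, 1.05, 1.28, 1.61, 2.00, 2.37, 2.90, 3.53, 4.32, 5.23]
temperature = [297.1, 298.2, 299.3, 301.2, 304.2, 307.2, 310.2, 314.1, 316.2, 317.8,
               318.2, 320.2, 323.1, 326.2, 329.1, 331.2, 334.2, 337.1, 340.2, 343.2,
               346.2, 349.1, 352.2]
utemperature = [0.1] * 23

# Each quantity needs one uncertainty per observation
assert len(temperature) == len(utemperature) == len(y) == len(uy)

# Relative uncertainty of each vapor pressure measurement
relative_uy = [u / value for u, value in zip(uy, y)]
print(f"{min(relative_uy):.3f} to {max(relative_uy):.3f}")
```

For these data the relative uncertainty of the vapor pressure stays between roughly 2.5 % and 3 % over the whole temperature range.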

**Entering the experimental data in MT_PEU:**

The *setDados* method is used to include the data for the dependent and independent quantities. Syntax:

* First argument: identifies the quantity as independent (0) or dependent (1);
* Second argument: the experimental data and their uncertainties, entered in this order as a tuple.

In [7]:
Estimation.setDados(0,(temperature,utemperature))
Estimation.setDados(1,(y,uy))

The *setConjunto* method specifies that the experimental data entered previously will be used as the data set for estimating the parameters:

In [8]:
Estimation.setConjunto(dataType='estimacao')

## Optimization

In this example, the user can choose the optimization algorithm. The available options are 'ipopt', 'bonmin' and 'sqpmethod'.

If no optimization method is chosen, the default algorithm (bonmin) is used, with the initial estimate equal to [1, 1.5, 0.009].

**algorithm:** Specifies the optimization algorithm to be used. Each algorithm has its own keywords;

**optimizationReport:** Specifies whether the optimization report should be created (True or False);

**parametersReport:** Specifies whether the parameters report should be created (True or False).

In [9]:
Estimation.optimize(initial_estimative = [1, 1.5, 0.009],algorithm='bonmin', optimizationReport = True, parametersReport = False)

Cbc3007W No integer variables - nothing to do

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit http://projects.coin-or.org/Ipopt
******************************************************************************

NLP0012I 
              Num      Status      Obj             It       time                 Location
NLP0014I             1         OPT 11.520671      103 0.073
Cbc3007W No integer variables - nothing to do
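The run above starts from the initial estimate [1, 1.5, 0.009]. As an alternative, a data-driven starting point could be obtained by fixing $C = 0$, so that equation (1) becomes the straight line $\ln y = A - B/T$, and fitting it by ordinary least squares. The sketch below is an assumption-laden illustration in plain Python: it is not part of MT_PEU, the names `A0`, `B0` and `C0` are hypothetical, and whether such a starting point improves convergence here was not tested.

```python
import math

# Data repeated from Table 1 so the sketch is self-contained
temperature = [297.1, 298.2, 299.3, 301.2, 304.2, 307.2, 310.2, 314.1, 316.2, 317.8,
               318.2, 320.2, 323.1, 326.2, 329.1, 331.2, 334.2, 337.1, 340.2, 343.2,
               346.2, 349.1, 352.2]
y = [2.93, 3.21, 3.49, 4.22, 5.60, 7.31, 9.12, 13.07, 14.98, 17.63, 18.02, 22.08,
     26.95, 34.61, 40.93, 50.17, 63.36, 78.93, 93.65, 115.11, 140.27, 171.89, 208.00]

# With C = 0, equation (1) is linear in the variables (1 / T, ln y)
inv_T = [1.0 / T for T in temperature]
ln_y = [math.log(value) for value in y]

n = len(inv_T)
mean_x = sum(inv_T) / n
mean_z = sum(ln_y) / n

# Ordinary (unweighted) least-squares slope and intercept
s_xz = sum((x - mean_x) * (z - mean_z) for x, z in zip(inv_T, ln_y))
s_xx = sum((x - mean_x) ** 2 for x in inv_T)
slope = s_xz / s_xx

A0 = mean_z - slope * mean_x  # intercept of the line
B0 = -slope                   # the line has slope -B
C0 = 0.0                      # fixed by assumption

print(A0, B0, C0)
```

The resulting list `[A0, B0, C0]` could then be passed through the `initial_estimative` argument of `optimize`.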




## Parameter uncertainty


In this example, the user can choose the method used to evaluate the uncertainty. 
The available options are 2InvHessiana, Geral and SensibilidadeModelo. 
By default, the likelihood region filling is enabled ('True'); this option can be changed if necessary.

**uncertaintyMethod:** method for calculating the covariance matrix of the parameters.

**parametersReport:** Specifies whether the parameters report should be created (True or False).

**objectiveFunctionMapping:** Specifies whether the objective function should be mapped (True or False).

**iterations:** Number of iterations used to map the objective function. A higher number gives a better mapping but increases the execution time.

If the user is **not** interested in **evaluating the uncertainty of the parameters**, simply **do not** call **Estimation.parametersUncertainty**.

In [10]:
Estimation.parametersUncertainty(uncertaintyMethod='SensibilidadeModelo', objectiveFunctionMapping=True, parametersReport = False, iterations=5000)

## Prediction and residual analysis

The prediction method evaluates the dependent quantity based on the estimated parameters, the model provided by the user, and data of the dependent quantities. If validation data are provided (through the setConjunto method, defining dataType='predicao'), these are used to perform the prediction. This method also evaluates the covariance matrix of the prediction, provided that the parametersUncertainty method has been run.

Residual analysis provides indicators of the quality of the estimation. Aspects such as the mean, homoscedasticity (which allows us to infer possible dependency and/or tendency relationships between the variables), normality, $R^2$ and autocorrelation are evaluated. The optimal point of the objective function is also evaluated. The residual analysis is performed primarily with the validation data.

**export_y:** Exports the calculated values of y, their uncertainties, and degrees of freedom to a comma-separated txt file (True or False);

**export_y_xls:** Exports the calculated values of y, their uncertainties, and degrees of freedom to an xls file (True or False);

**export_cov_y:** Exports the covariance matrix of y (True or False);

**export_x:** Exports the calculated values of x, their uncertainties, and degrees of freedom to a comma-separated txt file (True or False).

In [11]:
Estimation.prediction(export_y=True,export_y_xls=True, export_cov_y=True, export_x=True)
Estimation.residualAnalysis()
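Two of the indicators mentioned above, the mean of the residuals and $R^2$, can be sketched in plain Python as follows. The function `residual_summary` and the predicted values are hypothetical and serve only as an illustration; the `residualAnalysis` method of MT_PEU may compute these quantities differently.

```python
def residual_summary(observed, predicted):
    """Return the mean residual and the coefficient of determination R^2."""
    residuals = [obs - pred for obs, pred in zip(observed, predicted)]
    mean_residual = sum(residuals) / len(residuals)

    mean_observed = sum(observed) / len(observed)
    ss_res = sum(r ** 2 for r in residuals)                        # unexplained variation
    ss_tot = sum((obs - mean_observed) ** 2 for obs in observed)   # total variation
    return mean_residual, 1.0 - ss_res / ss_tot


# First observations from Table 1
observed = [2.93, 3.21, 3.49, 4.22, 5.60, 7.31]
# Hypothetical model predictions, for illustration only
predicted = [2.90, 3.25, 3.55, 4.20, 5.50, 7.40]

mean_residual, r2 = residual_summary(observed, predicted)
print(mean_residual, r2)
```

A mean residual close to zero and an $R^2$ close to one are consistent with a good fit, although neither rules out systematic trends in the residuals.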


## Plots and reports

At this stage, the results obtained by the program are exported: reports and graphs. 
The graphs are generated according to the steps that have been performed. In the reports, information about the statistical tests, objective function value, covariance matrix of the parameters, optimization status, among others, are printed.

In [12]:
Estimation.plots()
Estimation.reports()


## References

[1] SANJARI, Ehsan. A new simple method for accurate calculation of saturated vapor pressure. Elsevier, p. 12-16, Mar. 2013.

[2] ZAITSEVA, Ksenia V.; ZAITSAU, Dzmitry H.; VARFOLOMEEV, Mikhail A. Vapour pressures and enthalpies of vaporisation of alkyl formamides. Elsevier, Germany, p. 228-238, May 2019.

[3] INMETRO. Avaliação de dados de medição — Guia para a expressão de incerteza de medição. Rio de Janeiro: JCGM, 2008.