# Parameter estimation

In [1]:
using Pkg
Pkg.activate("..")
using Clapeyron, BlackBoxOptim

[32m[1m  Activating[22m[39m project at `~/Library/CloudStorage/OneDrive-ImperialCollegeLondon/University/UROP/SAFT_codes/Clapeyron`


In this notebook, we illustrate how to perform parameter estimation using `Clapeyron.jl`. To give the user as much flexibility as possible, we have left the choice of optimizer up to them. For all examples considered, we will be using `BlackBoxOptim.jl`.

## Pure-component parameter estimation in SAFT equations

As a first example, we will fit the pure-component PC-SAFT parameters for methane in `Clapeyron.jl`. Although we use the PC-SAFT equation of state in this example, this procedure could be repeated using any other pure-component equation of state available in `Clapeyron.jl`.

First we generate the model:

In [2]:
model = PCSAFT(["methane"]);

One can imagine that this model is our 'initial guess' for the parameters of methane. If the user wishes to develop parameters for a species not available in `Clapeyron.jl`, they can introduce their parameters using the `userlocations` optional argument for the model. The next step is to define which parameters need to be fitted, along with their bounds and initial guesses:

In [3]:
toestimate = [
    Dict(
        :param => :epsilon,
        :lower => 100.,
        :upper => 300.,
        :guess => 250.
    ),
    Dict(
        :param => :sigma,
        :factor => 1e-10,
        :lower => 3.2,
        :upper => 4.0,
        :guess => 3.7
    ),
    Dict(
        :param => :segment,
        :lower => 0.9,
        :upper => 1.1,
        :guess => 1.
    )
];
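
Each entry is an ordinary Julia `Dict`, so the bounds and guesses can be checked before anything is handed to `Clapeyron.jl`. The sketch below copies two of the entries under a hypothetical name (`toestimate_sketch`) and assumes that `:factor` is a plain multiplier converting the fitted value (here $\sigma$ in Å) into the units stored in the model (metres). That reading of `:factor` is an assumption rather than something stated in this notebook.

```julia
# Hypothetical copy of two entries from the list above, for illustration only.
toestimate_sketch = [
    Dict(:param => :epsilon, :lower => 100., :upper => 300., :guess => 250.),
    Dict(:param => :sigma, :factor => 1e-10, :lower => 3.2, :upper => 4.0, :guess => 3.7),
]

# Collect bounds and guesses in the order the parameters were declared.
lower_sketch = [d[:lower] for d in toestimate_sketch]
upper_sketch = [d[:upper] for d in toestimate_sketch]
guess_sketch = [d[:guess] for d in toestimate_sketch]

# Assumption: :factor multiplies the fitted value; entries without it use 1.0.
scaled_guess = [d[:guess] * get(d, :factor, 1.0) for d in toestimate_sketch]

# Every initial guess should lie inside its own bounds.
all_in_bounds = all(lower_sketch .<= guess_sketch) && all(guess_sketch .<= upper_sketch)
```

A check like `all_in_bounds` is cheap and catches a mistyped bound before a long optimisation run starts.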

The next step is to define the properties we wish to fit to. While there are many property estimation methods available in `Clapeyron.jl`, they may not always output the desired values. For example, the `saturation_pressure` method outputs the saturation pressure, liquid volume and vapour volume. In most cases for SAFT-type parameters, we will want to fit to the saturation pressure and liquid density. As such, we can define two new functions:

In [4]:
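# saturation_pressure returns (saturation pressure, liquid volume, vapour volume)
function saturation_p(model::EoSModel,T)
    sat = saturation_pressure(model,T)
    return sat[1]    # saturation pressure
end

function saturation_rhol(model::EoSModel,T)
    sat = saturation_pressure(model,T)
    return 1/sat[2]  # liquid density = 1 / liquid volume
end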

saturation_rhol (generic function with 1 method)

The last step is to provide the experimental data. Within `Clapeyron.jl`, we accept our inputs as .csv files with the following format:


Note that the function to evaluate is named in the second cell, and that its inputs and outputs are distinguished by the prefix `output_` on the latter.

Now that each part of the parameter estimation problem has been defined, we can compile it all together:

In [5]:
estimator,objective,initial,upper,lower = Estimation(model,toestimate,["saturation_pressure.csv","saturation_liquid_density.csv"]);

The `estimator` object contains all of the information relevant to the parameter estimation problem, and `objective` takes in guesses for the parameters and returns the value of the objective function (we use the root-mean-squared relative error). `initial`, `upper` and `lower` hold the initial guesses and the upper and lower bounds declared in `toestimate`. We can then use our global optimiser to solve for the optimal parameters given a set of experimental data:

In [6]:
nparams = length(initial)
bounds  = [(lower[i],upper[i]) for i in 1:nparams]

result = BlackBoxOptim.bboptimize(objective; 
        SearchRange = bounds, 
        NumDimensions = nparams,
        MaxSteps=10000,
        PopulationSize = 1000,
        TraceMode=:silent)

params = BlackBoxOptim.best_candidate(result);
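
The objective minimised above is described as a root-mean-squared relative error. As a minimal sketch of what such a measure looks like, the helper below takes the relative deviation of each prediction from its measurement, squares it, averages, and takes the square root. The name `rms_relative_error` and the numbers are hypothetical, and the exact expression used inside `Clapeyron.jl` may differ.

```julia
# Hypothetical helper: root-mean-squared relative error between model
# predictions and measurements. Clapeyron.jl's internal definition may differ.
function rms_relative_error(predicted::AbstractVector, measured::AbstractVector)
    length(predicted) == length(measured) ||
        throw(ArgumentError("predicted and measured must have the same length"))
    relative = (predicted .- measured) ./ measured
    return sqrt(sum(abs2, relative) / length(relative))
end

# Made-up numbers, not experimental data: both points are off by 10 %.
err = rms_relative_error([1.1, 1.8], [1.0, 2.0])
```

Because each deviation is divided by the measured value, properties with very different magnitudes (a pressure in Pa and a density in mol/m³, say) contribute on a comparable scale.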

Once the optimal parameters have been found, we can build our new, optimised model as:

In [8]:
model = return_model(estimator,model,params)

PCSAFT{BasicIdeal} with 1 component:
 "methane"
Contains parameters: Mw, segment, sigma, epsilon, epsilon_assoc, bondvol
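
To see what `bboptimize` returns without needing any experimental data, the following sketch minimises a toy quadratic in place of the estimator's `objective`. The names `toy_objective`, `toy_bounds`, `toy_result`, `toy_params` and `toy_fitness` are hypothetical. Only `bboptimize`, `best_candidate` and `best_fitness` come from `BlackBoxOptim.jl`.

```julia
using BlackBoxOptim

# Toy objective standing in for the estimator's objective: a shifted quadratic
# whose minimum (value 0) sits at (1.0, -2.0).
toy_objective(x) = (x[1] - 1.0)^2 + (x[2] + 2.0)^2

# One (lower, upper) tuple per dimension, as in the cells above.
toy_bounds = [(-5.0, 5.0), (-5.0, 5.0)]

toy_result = BlackBoxOptim.bboptimize(toy_objective;
        SearchRange = toy_bounds,
        NumDimensions = length(toy_bounds),
        MaxSteps = 10000,
        TraceMode = :silent)

toy_params  = BlackBoxOptim.best_candidate(toy_result)  # best parameter vector found
toy_fitness = BlackBoxOptim.best_fitness(toy_result)    # objective value at that vector
```

Applied to the real problem, `best_fitness(result)` gives the final value of the objective, which is a quick way to compare fits obtained with different bounds or weights.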

If the user wishes to weight the various properties being fit to differently, this can be achieved by adding the weights when we build the estimator:

In [9]:
estimator,objective,initial,upper,lower = Estimation(model,toestimate,[(2.,"saturation_pressure.csv"),(1.,"saturation_liquid_density.csv")]);

We can then re-optimise the parameters:

In [10]:
nparams = length(initial)
bounds  = [(lower[i],upper[i]) for i in 1:nparams]

result = BlackBoxOptim.bboptimize(objective; 
        SearchRange = bounds, 
        NumDimensions = nparams,
        MaxSteps=10000,
        PopulationSize = 1000,
        TraceMode=:silent)

params = BlackBoxOptim.best_candidate(result);

Note that evaluating the saturation pressure and the saturated liquid density separately, as above, is not the most efficient approach, since it involves two calls to the `saturation_pressure` function at every temperature. If we instead define a new function which outputs both properties, we can combine the CSV files into one:

In [12]:
function saturation_p_rhol(model::EoSModel,T)
    sat = saturation_pressure(model,T)
    return sat[1], 1/sat[2]
end

saturation_p_rhol (generic function with 1 method)

Re-building the estimator:

In [13]:
estimator,objective,initial,upper,lower = Estimation(model,toestimate,["saturation_pressure_liquid_density.csv"])

nparams = length(initial)
bounds  = [(lower[i],upper[i]) for i in 1:nparams]

result = BlackBoxOptim.bboptimize(objective; 
        SearchRange = bounds, 
        NumDimensions = nparams,
        MaxSteps=10000,
        PopulationSize = 1000,
        TraceMode=:silent)

params = BlackBoxOptim.best_candidate(result);

## Mixture system parameter estimation in Activity Coefficient Models

Consider a water+ethanol system modeled using NRTL where we need to fit the binary interaction parameters ($a_{ij}$). Again, as a first step, we construct the initial model:

In [15]:
model = NRTL(["water","ethanol"])

NRTL{PR{BasicIdeal, PRAlpha, NoTranslation, vdW1fRule}} with 2 components:
 "water"
 "ethanol"
Contains parameters: a, b, c, Mw

For the sake of simplicity, we are only going to re-fit $a_{12}$, $a_{21}$ and $c_{12}$. As before, we can define the set of parameters we wish to fit:

In [40]:
toestimate = [
    Dict(
        :param => :a,
        :indices => (1,2),
        :symmetric => false,
        :lower => 2.,
        :upper => 5.,
        :guess => 3.
    ),
    Dict(
        :param => :a,
        :indices => (2,1),
        :symmetric => false,
        :lower => -2.,
        :upper => 0.,
        :guess => -1.
    ),
    Dict(
        :param => :c,
        :indices => (1,2),
        :lower => 0.2,
        :upper => 0.5,
        :guess => 0.3
    )
];

One might notice some slight differences in the above example. For one, we have now specified the indices of the parameters we wish to fit. If one isn't sure of the indices of the parameters one wants to fit, one can look at the `model.params` object. 

Furthermore, in the case of the `a` parameters, as they are asymmetric, an additional argument needs to be specified (`:symmetric=>false`) as `Clapeyron.jl` _assumes_ that all binary interaction parameters are symmetric. This is why the `:symmetric` argument for the `c` parameter did not need to be specified.
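
The difference between the two cases can be sketched with a plain matrix, assuming that a symmetric parameter is simply mirrored across the diagonal when it is written. The helper `set_pair!` and the arrays `a_demo` and `c_demo` are hypothetical and are not part of `Clapeyron.jl`.

```julia
# Hypothetical helper, not part of Clapeyron.jl: write a binary parameter into
# a square matrix, mirroring it across the diagonal only when symmetric.
function set_pair!(M::AbstractMatrix, i::Integer, j::Integer, value; symmetric::Bool = true)
    M[i, j] = value
    if symmetric
        M[j, i] = value
    end
    return M
end

a_demo = zeros(2, 2)
set_pair!(a_demo, 1, 2, 3.0; symmetric = false)   # only a[1,2] changes
set_pair!(a_demo, 2, 1, -1.0; symmetric = false)  # only a[2,1] changes

c_demo = zeros(2, 2)
set_pair!(c_demo, 1, 2, 0.3)                      # c[1,2] and c[2,1] both change
```

With `symmetric = false`, the two `a` entries are independent degrees of freedom, which is why they appear as two separate dictionaries in `toestimate`.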

Subsequently, we can define the properties we wish to estimate:

In [48]:
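function bubble_point(model::EoSModel,p,x)
    # bubble_temperature takes a pressure and a liquid composition and returns
    # (T, liquid volume, vapour volume, y); keep T and the vapour fraction of component 1
    bub = bubble_temperature(model,p,[x,1-x])
    return bub[1], bub[4][1]
end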

bubble_point (generic function with 1 method)

Building the estimator:

In [41]:
estimator,objective,initial,upper,lower = Estimation(model,toestimate,["bubble_point.csv"]);

And estimating:

In [46]:
nparams = length(initial)
bounds  = [(lower[i],upper[i]) for i in 1:nparams]

result = BlackBoxOptim.bboptimize(objective; 
        SearchRange = bounds, 
        NumDimensions = nparams,
        MaxSteps=10000,
        PopulationSize = 1000,
        TraceMode=:silent)

params = BlackBoxOptim.best_candidate(result);