# AK-MCS

## Theory

*AK-MCS* is an active learning reliability method combining *Kriging* and *Monte Carlo Simulation*.

```{todo}
Write this section.
```

## Syntax

```{eval-rst}
.. class:: flx.gpr.akmcs

   Represents/manages an instance of AK-MCS.

   .. method:: __init__(config)

      Defines the AK-MCS handler.
      
      :param config: Configuration dictionary. The structure of ``config`` is outlined in detail in the following.
      :type config: dict

      General properties:
         The following properties are required/allowed in *config*:

         - ``sampler`` (:class:`flx.sampler`): Sampler of a set of random variables.
         - ``lsf`` (:type:`flxPara`): Limit-state function of the structural reliability problem to investigate. The function should depend on the random variables associated with ``sampler``.
         - ``gp`` (:class:`flx.gpr.gp`, *optional*): Gaussian process for Kriging of the limit-state function.

             If not specified, a *zero*-mean Gaussian process with a *Gaussian* kernel structure is used.
             
             .. important::
            
                    It is strongly recommended to use only Gaussian processes with a ``mean_type`` of ``"zero"`` (see :func:`flx.gpr.gp.__init__`).
                    
         - ``seed`` (*int*, default: *0*): Seed value for the Pseudo-Random-Number-Generator. Number must be positive or *zero*.
         - ``N_RNG_init`` (*int*, default: *0*): Number of initial calls to the Pseudo-Random-Number-Generator. Number must be positive or *zero*.
         - ``N_reserve`` (*int*, default:  :math:`10^4`): Memory is allocated for *N_reserve* calls of the actual limit-state function (``lsf``). Number must be positive.
         - ``itermax`` (*int*, default: *500*): Maximum number of iterations used for optimizing the *process parameters* of the Gaussian process (see parameter ``itermax`` of :func:`flx.gpr.gp.optimize`).
         - ``NmaxSur`` (*int*, default: :math:`10^7`): Maximum number of samples to use for Monte Carlo sampling of the surrogate model. If set to ``0``, an upper limit is not taken into account. Value must not be negative.
         - ``Nsmpls`` (*int*, default: :math:`10^6`): Initial number of calls of the surrogate model (per simulation step). Value must be larger than *zero*.
         - ``err_thresh`` (*float*, default: *0.3*): Threshold :math:`\varepsilon_\mathrm{t}` for evaluating the :ref:`content:gpr:akmcs:stop_crit`. Value must be larger than *zero*.
         - ``tqi_val`` (*float*, default: *0.99*): Quantile value for evaluating the :ref:`content:gpr:akmcs:stop_crit`. Value should be within :math:`(0.5,1)`.
         - ``data_box`` (:class:`flx.dataBox`): A :class:`flx.dataBox` for post-processing (e.g., writing to a file) of performed limit-state evaluations. The dimension of the input vector of the :class:`flx.dataBox` must equal the number of standard Normal random variables in ``sampler``; the dimension of the output vector must equal *one*.

   .. py:method:: initialize_with_LHS(N=0)

      Generate an initial set of samples for AK-MCS, using ``N`` Latin-hypercube samples.

      :param N: Number of samples. Value must be *positive* or *zero*. If *N* is set to ``0``, the number of samples is set to two times the dimension of the input vector.
      :type N: int
      :rtype: None

   .. py:method:: simulate()

      Perform a single simulation step using the surrogate model.

      :rtype: flx.gpr.akmcs_status
      
   .. py:attribute:: res
       :type: dict

       A dictionary that contains information about the previous call to :func:`flx.gpr.akmcs.simulate`.

       The dictionary has the following structure:
           - ``pf_mle`` (type: *float*): Maximum likelihood estimate (MLE) of the probability of failure
           - ``mean_pf_bayesian`` (type: *float*): :math:`\operatorname{E}\left[p_\mathrm{f}\right]`, Bayesian posterior mean value of the probability of failure
           - ``Pr_q_tqi`` (type: *float*): :math:`p_\mathrm{tqi}` (see :ref:`content:gpr:akmcs:stop_crit`), the current value associated with the quantile ``tqi_val``.
           - ``err`` (type: *float*): the current error value (see :ref:`content:gpr:akmcs:stop_crit`)
           - ``r`` (type: *float*): Value of the ratio :math:`r` (see :ref:`content:gpr:akmcs:stop_crit`) that is associated with :math:`p_\mathrm{tqi}`.
           - ``r_increase_N_surrogate`` (type: *float*): Value of the ratio :math:`r` (see :ref:`content:gpr:akmcs:stop_crit`), if the MCS on the surrogate model is performed with twice as many samples.
           - ``r_no_Kriging_uncertainty`` (type: *float*): Value of the ratio :math:`r` (see :ref:`content:gpr:akmcs:stop_crit`), if the uncertainty from the Kriging model is ignored; i.e., only the sampling uncertainty from the surrogate model is considered.
           - ``af`` (type: *float*): ``af = (r_increase_N_surrogate-r_no_Kriging_uncertainty)/r``; if this value is positive, the actual limit-state function is evaluated in the next iteration; if the value is negative, the number of surrogate samples is increased in the next iteration.
           - ``propose_to_increase_N_smpls_surrogate`` (type: *int*): ``1``, if ``af`` is negative, otherwise ``0``.
           - ``N`` (type: *int*): Total number of samples used to perform a Monte Carlo Simulation on the surrogate model.
           - ``N_model_calls`` (type: *int*): Total number of observations of the actual limit-state function (i.e., number of data-points to train the surrogate model).
           - ``Uval_worst_point`` (type: *float*): For each surrogate sample, the probability associated with the sign of the limit-state function is quantified. ``Uval_worst_point`` is the standard Normal transformation of this probability for the most uncertain surrogate sample. Thus, the more ``Uval_worst_point`` differs from *zero*, the higher the confidence about the sign of the limit-state function at the most uncertain surrogate point.
      
   .. py:method:: get_GP()

      Retrieve a reference to the internal Gaussian process.

      .. note::

          Do not modify the properties of the returned object to avoid inconsistent behavior.

      :rtype: flx.gpr.gp
      
   .. py:method:: get_N_model_calls(only_from_current_run=True)

      Retrieve total number of calls of the actual limit-state function.

      :param only_from_current_run: 
           - ``True``: return number of limit-state function calls from the current instance. 
           - ``False``: return total number of available observations of the limit-state function.
      :type only_from_current_run: bool
      :rtype: int
      
```
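
The relation between ``af``, the three ratios and the flag ``propose_to_increase_N_smpls_surrogate`` in `res` can be sketched in plain Python. This is only an illustration of the formula stated above, not the Fesslix implementation; the function names `acquisition_factor` and `propose_increase_N_surrogate` are hypothetical and chosen for this sketch.

```python
def acquisition_factor(r, r_increase_N_surrogate, r_no_Kriging_uncertainty):
    """Evaluate af = (r_increase_N_surrogate - r_no_Kriging_uncertainty) / r.

    Hypothetical helper: mirrors the formula documented for res['af'].
    """
    return (r_increase_N_surrogate - r_no_Kriging_uncertainty) / r


def propose_increase_N_surrogate(af):
    """Return 1 if af is negative, otherwise 0.

    Mirrors the documented meaning of
    res['propose_to_increase_N_smpls_surrogate'].
    """
    return 1 if af < 0.0 else 0


# Ratios taken from a `res` dictionary (e.g. after a call to simulate()).
af = acquisition_factor(
    r=3.6772292779142517,
    r_increase_N_surrogate=3.674377511550564,
    r_no_Kriging_uncertainty=1.1582487875357854,
)
print(af, propose_increase_N_surrogate(af))
```

A positive ``af`` indicates that the Kriging uncertainty dominates, so a further evaluation of the actual limit-state function is favored; a negative value indicates that increasing the number of surrogate samples is favored.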

```{eval-rst}
.. py:class:: flx.gpr.akmcs_status

   An enumeration of the status of a :class:`flx.gpr.akmcs`.

   Members:
       - ``undefined``: initial state; the Gaussian process is currently not initialized
       - ``defined``: internal state, nothing to do
       - ``evalLSF``: requires a new call of the actual limit-state function of the model
       - ``increase_N_surrogate``: requires an increase of surrogate samples
       - ``decrease_N_surrogate``: a decrease of the number of surrogate samples is proposed
       - ``stop_success``: stop is recommended, as error is below the specified threshold
       - ``stop_iterLimit``: maximum number of surrogate samples is exceeded
       
```       
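
When driving AK-MCS in a loop, it is typically enough to distinguish the two stop states from all other states. The following sketch shows one way to do this. It uses a stand-in enumeration `AkmcsStatus` that only mirrors the member names listed above, so that the snippet runs without Fesslix; the stand-in and the helper `is_terminal` are assumptions made for illustration and are not part of the library.

```python
from enum import Enum, auto


class AkmcsStatus(Enum):
    """Stand-in that mirrors the member names of flx.gpr.akmcs_status.

    This is NOT the Fesslix class; it exists only to keep the sketch runnable.
    """
    undefined = auto()
    defined = auto()
    evalLSF = auto()
    increase_N_surrogate = auto()
    decrease_N_surrogate = auto()
    stop_success = auto()
    stop_iterLimit = auto()


# States after which no further simulation step should be requested.
TERMINAL_STATES = frozenset({AkmcsStatus.stop_success, AkmcsStatus.stop_iterLimit})


def is_terminal(status):
    """Return True if the status recommends stopping the iteration."""
    return status in TERMINAL_STATES


print([s.name for s in AkmcsStatus if not is_terminal(s)])
```

With the actual enumeration, the same check would compare the value returned by `simulate()` against `stop_success` and `stop_iterLimit`.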

(content:gpr:akmcs:stop_crit)=
## Stopping criterion

The value ``config['tqi_val']`` in {func}`flx.gpr.akmcs.__init__` defines a quantile of the distribution that represents the uncertainty about the value of the probability of failure estimated by {class}`flx.gpr.akmcs` (which includes both the uncertainty from the Kriging model and the sampling uncertainty from MCS on the surrogate model).
We refer to the value associated with this quantile as $p_\mathrm{tqi}$.
The ratio $r$ is defined as:

$$
r = \frac{p_\mathrm{tqi}}{\operatorname{E}\left[p_\mathrm{f}\right]}\;,
$$

where $\operatorname{E}\left[p_\mathrm{f}\right]$ is the Bayesian posterior mean value of the probability of failure.
Note that $r$ equals *one* if $p_\mathrm{tqi}$ equals $\operatorname{E}\left[p_\mathrm{f}\right]$.

The error value $\varepsilon$ is evaluated based on $r$ as:

$$
\varepsilon = r - 1
$$

The probability of failure estimated through AK-MCS is considered sufficiently accurate when $\varepsilon\le\varepsilon_\mathrm{t}$ (where $\varepsilon_\mathrm{t}$ is defined through the value ``config['err_thresh']`` in {func}`flx.gpr.akmcs.__init__`).
If this condition is met, AK-MCS stops the iteration.
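
The criterion can be reproduced from the entries ``Pr_q_tqi`` and ``mean_pf_bayesian`` of `res`. The following is a minimal sketch of the two formulas above in plain Python; the function name `stopping_criterion` is hypothetical, and the default threshold of 0.3 is taken from the documented default of ``err_thresh``.

```python
def stopping_criterion(p_tqi, mean_pf, err_thresh=0.3):
    """Evaluate r, the error value eps = r - 1, and the stop condition.

    p_tqi      : value associated with the quantile tqi_val (res['Pr_q_tqi'])
    mean_pf    : Bayesian posterior mean of p_f (res['mean_pf_bayesian'])
    err_thresh : threshold eps_t (config['err_thresh'])
    """
    if mean_pf <= 0.0:
        raise ValueError("mean_pf must be positive")
    if err_thresh <= 0.0:
        raise ValueError("err_thresh must be larger than zero")
    r = p_tqi / mean_pf
    eps = r - 1.0
    return r, eps, eps <= err_thresh


# Values taken from a `res` dictionary of a run with err_thresh = 0.05.
r, eps, stop = stopping_criterion(
    p_tqi=6.516335470127093e-05,
    mean_pf=6.236171102263967e-05,
    err_thresh=0.05,
)
print(r, eps, stop)
```

Because $p_\mathrm{tqi}$ is an upper quantile (``tqi_val`` is expected within $(0.5,1)$), $r$ measures how far that quantile lies above the posterior mean, and $\varepsilon$ is the corresponding relative excess.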

(content:gpr:akmcs:postproc:import)=
## Importing data into an instance of AK-MCS

```{eval-rst}
.. py:property:: akmcs

   A :class:`post-processor<flx.dataBox.postProc>` that imports any sample added to the corresponding :class:`flx.dataBox` to an instance of :class:`flx.gpr.akmcs`.

   Parametrization:
       Parameters of this post-processor can be specified as additional key-value pairs in an object of type :type:`dataBox_postProc_type`. 
       The following parameters are accepted:

         - ``akmcs`` (:class:`flx.gpr.akmcs`): The instance of :class:`flx.gpr.akmcs` to which to add samples inserted into the :class:`flx.dataBox`.

   States:
       When the function :func:`flx.dataBox.postProc.eval` is called on this post-processor, the following states are returned:

       - ``akmcs`` (:class:`flx.gpr.akmcs`): A reference to the :class:`flx.gpr.akmcs`-instance linked to this :class:`post-processor<flx.dataBox.postProc>`.

```

## Application Examples
### Example 1

In [1]:
import fesslix as flx
flx.load_engine()
import fesslix.gpr

Random Number Generator: MT19937 - initialized with rand()=1529143242;
Random Number Generator: MT19937 - initialized with 1000 initial calls.


In [2]:
## ==============================================
## Generate input model
## ==============================================
config_rv_R = { 'name':'R', 'type':'logn', 'mu':6., 'sd':1. }
config_rv_S = { 'name':'S', 'type':'normal', 'mu':1., 'sd':1.0 }
rv_set = flx.rv_set( {'name':'rv_set'}, [ config_rv_R, config_rv_S ] )
sampler = flx.sampler(['rv_set'])

In [3]:
## ==============================================
## Set up dataBox (for storing performed model calls)
## ==============================================
dBox_1 = flx.dataBox(2,1)
dBox_1.write2file( {
    'fname': "akmcs_samples.bin",
    'append': False,
    'binary': True,
    'cols': 'all'
    } )

In [4]:
## ==============================================
## Define the AK-MCS sampler
## ==============================================
config = {
        "sampler": sampler,
        "lsf": "rbrv(rv_set::R)-rbrv(rv_set::S)",
        "err_thresh": 0.05,
        "data_box": dBox_1
    }
ak_mcs = fesslix.gpr.akmcs(config)

In [5]:
## ==============================================
## Initialize with Latin-hypercube sampling
## ==============================================
ak_mcs.initialize_with_LHS(5)

In [6]:
## ==============================================
## Perform a single simulation step
## ==============================================
state = ak_mcs.simulate()
print(state, ak_mcs.res)

akmcs_status.evalLSF {'pf_mle': 0.00023317681449451868, 'mean_pf_bayesian': 0.00023417634614182639, 'err': 2.6772292779142517, 'af': 0.6842458095084958, 'r': 3.6772292779142517, 'r_increase_N_surrogate': 3.674377511550564, 'r_no_Kriging_uncertainty': 1.1582487875357854, 'r_no_Kriging_uncertainty_AND_N_half': 1.2321542359662232, 'propose_to_increase_N_smpls_surrogate': 0, 'N': 1000000, 'N_model_calls': 5, 'Uval_worst_point': 0.004629937881319396, 'Pr_q_tqi': 0.0008611201162277061}


In [7]:
## ==============================================
## Retrieve the current state of AK-MCS
## ==============================================
gp = ak_mcs.get_GP()
gp_info = gp.info()
print( gp_info['noise_log'] )
print( gp_info['kernel'] )
#print( gp_info['opt_log'] )

»» LSE-results  logl=-10.2334  »»  sd_obsv [8.561902 <- 4.759064] »»  sd_Z [8.561902 <- 4.759064] »»  sd_noise [8.561902e-4 <- 4.759064e-4] 

{'type': ['gauss', 'gauss'], 'para_vec': array([4.75906436, 7.69838173, 7.31099519]), 'n_vec': array([1.        , 1.9369855 , 1.98087669]), 'kernel_sd': 4.759064360890354}


In [8]:
for i in range(10):
    state = ak_mcs.simulate()
    print(state, ak_mcs.res)
    if state == fesslix.gpr.akmcs_status.stop_success or state == fesslix.gpr.akmcs_status.stop_iterLimit:
        break
dBox_1.close_file()
print(ak_mcs.get_N_model_calls())

akmcs_status.evalLSF {'pf_mle': 0.00010080173514854765, 'mean_pf_bayesian': 0.00010180153154548456, 'err': 0.6979391635653349, 'af': 0.24812366619591147, 'r': 1.697939163565335, 'r_increase_N_surrogate': 1.6661846572826973, 'r_no_Kriging_uncertainty': 1.244885767041247, 'r_no_Kriging_uncertainty_AND_N_half': 1.366028029669237, 'propose_to_increase_N_smpls_surrogate': 0, 'N': 1000000, 'N_model_calls': 6, 'Uval_worst_point': 0.022032125455684996, 'Pr_q_tqi': 0.0001728528073220101}
akmcs_status.increase_N_surrogate {'pf_mle': 5.069577894970781e-05, 'mean_pf_bayesian': 5.169567555835669e-05, 'err': 0.436545510071632, 'af': -0.012111499867191447, 'r': 1.436545510071632, 'r_increase_N_surrogate': 1.334268423173683, 'r_no_Kriging_uncertainty': 1.35166714392813, 'r_no_Kriging_uncertainty_AND_N_half': 1.5372428891184329, 'propose_to_increase_N_smpls_surrogate': 1, 'N': 1000000, 'N_model_calls': 7, 'Uval_worst_point': 0.11912529830116211, 'Pr_q_tqi': 7.426319061347711e-05}
akmcs_status.evalLSF {

### Example 2 - restart sampling

In [9]:
## ==============================================
## Set up dataBox (for storing performed model calls)
## ==============================================
dBox_2 = flx.dataBox(2,1)
dBox_2.write2file( {
    'fname': "akmcs_samples.bin",
    'append': True,
    'binary': True,
    'cols': 'all'
    } )

In [10]:
## ==============================================
## Define the AK-MCS sampler
## ==============================================
config = {
        "sampler": sampler,
        "lsf": "rbrv(rv_set::R)-rbrv(rv_set::S)",
        "NmaxSur": int(1e8),
        "err_thresh": 0.05,
        "data_box": dBox_2
    }
ak_mcs = fesslix.gpr.akmcs(config)

In [11]:
## ==============================================
## Initialize with past model calls
## ==============================================
dBox_3 = flx.dataBox(2,1)
dBox_3.register_post_processor({ 'type':'akmcs', 'akmcs':ak_mcs })
dBox_3.read_from_file({'fname':"akmcs_samples.bin", 'binary':True})
print(ak_mcs.get_N_model_calls(), ak_mcs.get_N_model_calls(False))

0 8


In [12]:
state = ak_mcs.simulate()
print(state, ak_mcs.res)

akmcs_status.increase_N_surrogate {'pf_mle': 5.265890252715555e-05, 'mean_pf_bayesian': 5.3658795209565135e-05, 'err': 0.353978630687539, 'af': -0.07696168530365746, 'r': 1.353978630687539, 'r_increase_N_surrogate': 1.2404662916034237, 'r_no_Kriging_uncertainty': 1.3446707688862751, 'r_no_Kriging_uncertainty_AND_N_half': 1.5258208462964489, 'propose_to_increase_N_smpls_surrogate': 1, 'N': 1000000, 'N_model_calls': 8, 'Uval_worst_point': 0.06823295770553107, 'Pr_q_tqi': 7.265286206219007e-05}


In [13]:
state = ak_mcs.simulate(100)
print(state, ak_mcs.res)
print(ak_mcs.get_N_model_calls(), ak_mcs.get_N_model_calls(False))

akmcs_status.stop_success {'pf_mle': 6.234608797144315e-05, 'mean_pf_bayesian': 6.236171102263967e-05, 'err': 0.04492570252946004, 'af': -0.0008631253197446732, 'r': 1.04492570252946, 'r_increase_N_surrogate': 1.0362885110371378, 'r_no_Kriging_uncertainty': 1.037190412868243, 'r_no_Kriging_uncertainty_AND_N_half': 1.053067452416105, 'propose_to_increase_N_smpls_surrogate': 1, 'N': 64000000, 'N_model_calls': 9, 'Uval_worst_point': 0.0004919087334855564, 'Pr_q_tqi': 6.516335470127093e-05}
1 9


In [14]:
dBox_2.close_file()