# Defining Parameter Samples in fmdtools

Parameter samples are used to evaluate the performance of a model over a set of input parameters. They can then be used to do things like:
- define/understand the operational envelope for different system parameters (i.e., what inputs can the system safely encounter)
- quantify failure probabilities given stochastic inputs (i.e., if the statistical distribution of inputs is known, what is the resulting probability of hazards given the design)

```
Copyright © 2024, United States Government, as represented by the Administrator of the National Aeronautics and Space Administration. All rights reserved.

The "Fault Model Design tools - fmdtools version 2" software is licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
```

In [1]:
import fmdtools.sim.propagate as prop
import inspect

The rover model (`rover_model.py`) from the examples module will be used to demonstrate this approach. The main task of the rover is to follow a given line from a starting location to an ending location. 

In [2]:
from examples.rover.rover_model import Rover

Additionally, this model has a corresponding `Parameter` class which generates the design parameters of the model given a reduced space of input parameters. 

In [3]:
from examples.rover.rover_model import RoverParam
print(inspect.getsource(RoverParam))

class RoverParam(Parameter):
    """Parameters for rover."""

    ground: GroundParam = GroundParam()
    correction: ResCorrection = ResCorrection()
    degradation: DegParam = DegParam()
    drive_modes: dict = {"mode_args": "set"}

    def __init__(self, *args, **kwargs):
        super().__init__(*args, strict_immutability=False, **kwargs)



This parameter is in turn made of other parameters:

In [4]:
# import from the same module path used for Rover and RoverParam above
from examples.rover.rover_model import GroundParam
print(inspect.getsource(GroundParam))

The plot below shows the performance of the rover during a (default) turn with a radius of 20 meters that begins at 20 meters. As shown, there is a slight drift from the centerline, but not enough for the rover to get lost (that would take 1 meter of drift).

In [5]:
p = RoverParam(ground=dict(linetype="turn"))
mdl = Rover(p=p)
results, mdlhist = prop.nominal(mdl)

In [6]:
fig, ax = mdlhist.plot_trajectories("flows.pos.s.x", "flows.pos.s.y")
mdl.flows['ground'].ga.show(fig=fig, ax=ax)

The plot below shows the performance of the model over a sine wave. As with the turn line type, the drift is small enough that the rover completes its mission within acceptable bounds.

In [7]:
p = RoverParam(ground=dict(linetype="sine"))
mdl = Rover(p=p)
results, mdlhist = prop.nominal(mdl)
fig, ax = mdlhist.plot_trajectories("flows.pos.s.x", "flows.pos.s.y")
mdl.flows['ground'].ga.show(fig=fig, ax=ax)

The performance of the rover in these situations is dependent on the parameters of the situation (e.g., the radius of the curve and the amplitude of the sine wave). Thus, it is important to define the operational envelope for the system. This can be done using a `ParameterDomain`, which can be used to define ranges of variables to simulate the system under, and a `ParameterSample`, which then samples these ranges.

In [8]:
from fmdtools.sim.sample import ParameterDomain, ParameterSample

In this approach we define a parameter domain for the sine wave scenario.

Here we specify the linetype as sine (as a constant) and then add the variables "amp" and "period", which we will then sample:

In [9]:
pd_sine = ParameterDomain(RoverParam)
pd_sine.add_constant("ground.linetype", "sine")
pd_sine.add_variables("ground.amp", "ground.period", lims={"ground.amp":(0, 8), "ground.period": (10, 50)})

pd_sine
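
To make the dotted variable names more concrete, below is a minimal sketch in plain Python (not fmdtools internals) of how a constant and one sampled point could be expanded into the nested keyword arguments that a parameter class such as `RoverParam` accepts. The helper `expand_dotted` is hypothetical and only introduced for illustration.

```python
def expand_dotted(flat):
    """Turn {'ground.amp': 2.0} into {'ground': {'amp': 2.0}} (hypothetical helper)."""
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        level = nested
        for name in parents:
            # walk (or create) the intermediate dictionaries
            level = level.setdefault(name, {})
        level[leaf] = value
    return nested


# constant and variable names mirror the domain defined above
constants = {"ground.linetype": "sine"}
sampled_point = {"ground.amp": 2.0, "ground.period": 20.0}

param_kwargs = expand_dotted({**constants, **sampled_point})
print(param_kwargs)
```

Under this assumption, `RoverParam(**param_kwargs)` would mirror the `RoverParam(ground=dict(linetype="sine"))` call used earlier.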

These can then be sampled using the `ParameterSample`. 

Below we specify that we will sample the given variables in combination at certain resolutions:

In [10]:
ps_sine = ParameterSample(pd_sine)
ps_sine.add_variable_ranges(comb_kwargs={'resolutions':{'ground.amp': 0.5, "ground.period": 10}})

ps_sine
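
As a rough illustration of what sampling "in combination at certain resolutions" means, the sketch below builds a full-factorial grid over the same limits and resolutions using only the standard library. It assumes both endpoints of each range are included, which may differ from what `add_variable_ranges` does, so the number of points here is not necessarily the number of scenarios in `ps_sine`. The helper `inclusive_range` is hypothetical.

```python
from itertools import product


def inclusive_range(start, stop, step):
    """Evenly spaced values from start to stop (assumes stop lies on the grid)."""
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]


lims = {"ground.amp": (0, 8), "ground.period": (10, 50)}
resolutions = {"ground.amp": 0.5, "ground.period": 10}

# one list of candidate values per variable
axes = {name: inclusive_range(*lims[name], resolutions[name]) for name in lims}

# every combination of values becomes one candidate scenario
grid = [dict(zip(axes, point)) for point in product(*axes.values())]

print(len(axes["ground.amp"]), len(axes["ground.period"]), len(grid))
print(grid[0], grid[-1])
```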

Notice that the properties of the scenarios are stored in their corresponding `ParameterScenario`s.

In [11]:
ps_sine.scenarios()

`ParameterSample`s are simulated using `prop.parameter_sample`:

In [12]:
res, hist = prop.parameter_sample(mdl, ps_sine)
res

To speed up execution over large numbers of scenarios, multiprocessing can also be used to run the scenarios in parallel by passing an execution pool. This is not done here because it would require the model to be in a different file, and because the gains on a light-weight model like this are not significant.

Now that the sample has been simulated, the operational envelope can be visualized using the `fmdtools.analyze.tabulate.NominalEnvelope` class, which plots the *classification* of the model (nominal or incomplete) over the given parameters in one, two, or three dimensions.

Note that this classification must be in the dictionary returned from the Model's `find_classification` function at the end of the model run under the key `classification` as is done in the rover model. This classification must also be encoded as a string.

In [13]:
from fmdtools.analyze.tabulate import NominalEnvelope

We can then use these results to visualize the operational envelope for the system over each case. In this case, the parameter ranges of the sine wave are plotted, showing that the rover can only handle a low ratio of amplitude to wavelength.

In [14]:
ne = NominalEnvelope(ps_sine, res, 'at_finish', 'p.ground.amp', 'p.ground.period', func=lambda x: x == True)
ne.as_plot()
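
The idea behind an operational envelope can be sketched without fmdtools: label each sampled point as nominal or incomplete with a function like the `func` argument above, then record the boundary of the nominal region. The records below are hypothetical, and the rule used to fill in `at_finish` is invented purely so that the example has data; it is not taken from the rover model.

```python
# Hypothetical per-scenario records: parameter values plus a boolean 'at_finish'.
# The amp/period threshold is a placeholder, not a property of the rover.
records = [
    {"amp": amp, "period": period, "at_finish": amp / period <= 0.1}
    for amp in (0.0, 2.0, 4.0, 8.0)
    for period in (10.0, 30.0, 50.0)
]


def classify(value, func=lambda x: x == True):
    """Map a result value to a string classification using func."""
    return "nominal" if func(value) else "incomplete"


labelled = [{**r, "class": classify(r["at_finish"])} for r in records]

# For each period, the largest amplitude still classified as nominal
envelope = {}
for r in labelled:
    if r["class"] == "nominal":
        envelope[r["period"]] = max(envelope.get(r["period"], r["amp"]), r["amp"])

print(envelope)
```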

While this is helpful for plotting string classifications, we also might want to compare numeric quantities (e.g., costs, hazard probabilities, etc.) over the set of factors. For this, we use `fmdtools.analyze.tabulate.Comparison`, which creates a comparison that can be visualized as a table or plot.

In [15]:
from fmdtools.analyze.tabulate import Comparison

comp = Comparison(res, ps_sine, metrics=['end_dist', 'tot_deviation'], factors=['p.ground.amp', 'p.ground.period'])
comp.sort_by_metric("end_dist")
comp.as_table()

In [16]:
comp.sort_by_factor("p.ground.period", reverse=True)
comp.sort_by_factor("p.ground.amp")
fig, ax = comp.as_plot("end_dist", color_factor = "p.ground.period", figsize=(20, 1))

This table can also be summarized on individual factors:

In [17]:
scomp = Comparison(res, ps_sine, metrics=['end_dist'], factors=['p.ground.amp'])
scomp.sort_by_factor('p.ground.amp')
scomp.as_table()
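
Conceptually, summarizing on a single factor amounts to grouping the scenarios by that factor and aggregating the metric over everything else. The sketch below assumes the aggregate is a mean (whether `Comparison` uses this by default is an assumption) and uses hypothetical placeholder rows rather than rover results.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (amp, period, end_dist) rows; the numbers are placeholders.
rows = [
    (0.0, 10.0, 0.25),
    (0.0, 20.0, 0.75),
    (2.0, 10.0, 1.5),
    (2.0, 20.0, 0.5),
]

# group the metric by the factor of interest, ignoring the other factor
by_amp = defaultdict(list)
for amp, _period, end_dist in rows:
    by_amp[amp].append(end_dist)

# one aggregated value per level of the factor
summary = {amp: mean(values) for amp, values in sorted(by_amp.items())}
print(summary)
```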

In [18]:
scomp.as_plot("end_dist")

## Quantifying probabilities

Because the model can be simulated over ranges of parameters, the results can also be used to quantify the probabilities of the different end-state classifications. `Result.state_probabilities()` can be used to quantify the probability of each of these classifications.

In [19]:
res.state_probabilities()
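
As a minimal sketch of the underlying calculation, the probability of each classification can be thought of as the sum of the probabilities of the scenarios that end in it. The scenario list, the equal weights, and the helper `classification_probabilities` below are all hypothetical, and this is not necessarily how `state_probabilities` is implemented.

```python
from collections import defaultdict

# Hypothetical scenarios: each carries a probability weight and a classification.
# Equal weights are assumed here, as for a uniform sample of four scenarios.
scenarios = [
    {"prob": 0.25, "classification": "nominal"},
    {"prob": 0.25, "classification": "nominal"},
    {"prob": 0.25, "classification": "incomplete"},
    {"prob": 0.25, "classification": "nominal"},
]


def classification_probabilities(scens):
    """Sum scenario probabilities per end-state classification."""
    totals = defaultdict(float)
    for scen in scens:
        totals[scen["classification"]] += scen["prob"]
    return dict(totals)


probs = classification_probabilities(scenarios)
print(probs)
```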

## Nested Scenario Sampling

Thus far, we have introduced two types of samples:
- `FaultSample`, which is used to evaluate the system resilience to a set of faults, and
- `ParameterSample`, which is used to evaluate system performance over a set of parameters.

These both have limitations when used alone. Simulating a `FaultSample` using `propagate.fault_sample` only evaluates fault-driven hazards in a single nominal set of parameters (which may not generalize), while simulating a `ParameterSample` using `propagate.parameter_sample` evaluates the system performance/resilience to external parameters (but not faults).

To resolve these limitations, one can use a *nested* scenario sampling approach where a `SampleApproach` is simulated at each parameter level of a `ParameterSample`, giving the resilience of the system to faults over a set of operational parameters. This is called using the `propagate.nested_sample` method.
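
The structure of a nested sample can be sketched as two loops: an outer loop over parameter scenarios and an inner loop over fault scenarios. Everything below is hypothetical (the scenario names, the `simulate` stand-in, and the numbers it returns) and is only meant to show the shape of the nested results, not how `nested_sample` works internally.

```python
def simulate(params, fault=None):
    """Hypothetical stand-in for a model run; returns a made-up end distance."""
    penalty = 1.0 if fault is not None else 0.0
    return params["amp"] / params["period"] + penalty


# outer level: operational (parameter) scenarios
param_scenarios = {
    "param_0": {"amp": 0.0, "period": 10.0},
    "param_1": {"amp": 2.0, "period": 20.0},
}
# inner level: fault scenarios applied at each parameter scenario
fault_scenarios = ["fault_a_t1", "fault_a_t2"]

nested = {}
for param_name, params in param_scenarios.items():
    runs = {"nominal": simulate(params)}
    for fault in fault_scenarios:
        runs[fault] = simulate(params, fault=fault)
    nested[param_name] = runs

print(list(nested.keys()))
print(nested["param_1"])
```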

Here we use the nominal approach generated earlier with a default sampling approach to quantify resilience.

In [20]:
from fmdtools.sim.sample import SampleApproach
sa = SampleApproach(mdl)
# adding fault domains
sa.add_faultdomain("power", "all_fxn_modes", "power")
sa.add_faultsample("power", "fault_times", "power", [1,2])
sa

In [21]:

nest_res, nest_hist, apps = prop.nested_sample(mdl, ps_sine,
                                               faultdomains = {"power": (('all_fxn_modes', 'power'), {})},
                                               faultsamples = {"power": (("fault_times", "power", [1,2]), {})})

The resulting endclass/mdlhist dictionary is in turn nested within operational scenarios.

In [22]:
nest_res.keys()

In [23]:
apps

We can now compare the performance of the system over faults using a `NestedComparison`:

In [24]:
import numpy as np
from fmdtools.analyze.tabulate import NestedComparison

n_comp = NestedComparison(nest_res, ps_sine, ['p.ground.amp', 'p.ground.period'], apps, [], #['fault'],
                          metrics=['end_dist'], default_stat=np.mean, ci_metrics=['end_dist'])

n_comp.as_table()

In [25]:
n_comp.sort_by_factor("p.ground.period", reverse=True)
n_comp.sort_by_factor("p.ground.amp")
n_comp.as_plot("end_dist", color_factor="p.ground.period", figsize=(20,3))

This is consistent with the nominal case, but with a higher average in all cases, since fault modes cause the trajectory to deviate.

Also note the wide error bars on the fault plot. This is again because individual fault modes have a substantial impact on the trajectory (and because we are only sampling a few scenarios for the bootstrap).

In [26]:
fig, ax = comp.as_plot("end_dist", color_factor = "p.ground.period", figsize=(20, 3))
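
To illustrate why a small number of scenarios leads to wide error bars, below is a minimal sketch of a percentile bootstrap for the mean using only the standard library. It is a generic version of the technique and not necessarily the procedure fmdtools uses. The sample values and the helper `bootstrap_ci` are hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(samples, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean (hypothetical helper)."""
    rng = random.Random(seed)
    # resample with replacement and record the mean of each resample
    means = sorted(
        mean(rng.choices(samples, k=len(samples))) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical end distances: one scenario deviates far more than the others.
few = [0.1, 0.2, 3.0, 0.15]
# The same values repeated, standing in for a larger sample with the same spread.
many = few * 10

few_lo, few_hi = bootstrap_ci(few)
many_lo, many_hi = bootstrap_ci(many)
print("few scenarios: ", few_lo, few_hi)
print("many scenarios:", many_lo, many_hi)
```

With only four values, a single large deviation dominates many of the resamples, so the interval is much wider than for the larger sample.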