Merge pull request #561 from SALib/update-misleading-docs-560
Update misleading docs
ConnectedSystems committed Apr 1, 2023
2 parents 50e4bde + f48aa45 commit 6a85248
Showing 3 changed files with 168 additions and 71 deletions.
10 changes: 10 additions & 0 deletions docs/developers_guide.md
@@ -35,3 +35,13 @@ In a command prompt
> cd docs
> sphinx-build . ./html
```

## Prior to submitting a PR

Run the following to catch any formatting issues.

```bash
# pre-commit install

pre-commit run --all-files
```
225 changes: 156 additions & 69 deletions docs/user_guide/wrappers.rst
@@ -1,5 +1,6 @@
==========================
Wrapping an existing model
--------------------------
==========================

SALib performs sensitivity analysis for any model that can be expressed in the form of :math:`f(X) = Y`,
where :math:`X` is a matrix of inputs (often referred to as the model's factors).
@@ -17,6 +18,7 @@ write a wrapper to allow use with SALib. This is illustrated here with a simple
"""Return y = a + b + x"""
return a + b + x
As SALib expects a (numpy) matrix of factors, we simply "wrap" the function above like so:

.. code:: python
@@ -29,12 +31,15 @@ As SALib expects a (numpy) matrix of factors, we simply "wrap" the function abov
# Then call the original model
return func(a, b, x)
.. note:: Wrapped function is an argument
.. note:: **Wrapped function is an argument**

Note here that the model being "wrapped" is also passed in as an argument.
This will be revisited further down below.


.. tip:: Interfacing with external models/programs
.. tip:: **Interfacing with external models/programs**

Here we showcase interacting with models written in Python.
If the model is an external program, this is where interfacing code
would be written.
@@ -57,25 +62,75 @@ Constants, which SALib should not consider, can be expressed by defining default
return func(a, b, x)
Note that the first argument to the wrapper function(s) is a numpy array of shape
:math:`N \times D`, where :math:`D` is the number of model factors (dimensions) and
:math:`N` is the number of their combinations. The argument name is, by convention,
denoted as :code:`X`. This is to maximize compatibility with all methods provided
in SALib as they expect the first argument to hold the model factor values.
Using :py:func:`functools.partial` from the `functools` package to create wrappers can be useful.
Note that the first argument to any function provided to SALib is assumed to be
a numpy array of shape :math:`N \times D`, where :math:`D` is the number of model
factors (dimensions) and :math:`N` is the number of their combinations. The
argument name is, by convention, denoted as :code:`X`. This is to maximize
compatibility with all methods provided in SALib as they expect the first
argument to hold the model factor values. Using :py:func:`functools.partial`
from the `functools` package to create wrappers can be useful.
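
As a minimal sketch of the :py:func:`functools.partial` approach mentioned above (not taken from the SALib documentation), the snippet below fixes a constant ahead of time so that the resulting callable still takes the factor matrix as its first argument. The two-column layout of ``X`` and the names ``wrapped_linear`` and ``wrapped_b2`` are assumptions made for illustration only.

```python
from functools import partial

import numpy as np


def linear(a, b, x):
    """Return y = a + b * x."""
    return a + b * x


def wrapped_linear(X, func=linear, b=1.0):
    """Hypothetical wrapper: the columns of X are (a, x); b is a constant."""
    a, x = X[:, 0], X[:, 1]
    return func(a, b, x)


# Fix the constant b up front; the factor matrix X remains the first argument.
wrapped_b2 = partial(wrapped_linear, b=2.0)

X = np.array([[0.0, 1.0], [1.0, 2.0]])  # N = 2 samples, D = 2 factors
Y = wrapped_b2(X)
```

The partial object can then be passed wherever a function taking ``X`` first is expected, without writing a second wrapper by hand.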

In this example, the model (:code:`linear()`) can be used with either scalar inputs or `numpy` arrays.
In cases where `a`, `b` or `x` is a vector of inputs, `numpy` will automatically vectorize the
calculation.

There are many cases where the model is not (or cannot be easily) expressed in a vectorizable form.
When using the core SALib functions directly in such cases, the user is expected to evaluate the
model in a `for` loop themselves.
In this example, the model (:code:`linear()`) can be used with either scalar
inputs or `numpy` arrays. In cases where `a`, `b` or `x` is a vector of
inputs, `numpy` will automatically vectorize the calculation. There are many
cases where the model is not (or cannot be easily) expressed in a vectorizable
form. In such cases, simply apply a :code:`for` loop as in the example below.

.. code:: python
from SALib.sample import saltelli
from SALib.analyze import sobol
import numpy as np
from SALib import ProblemSpec
def linear(a: float, b: float, x: float) -> float:
return a + b * x
def wrapped_linear(X: np.ndarray, func=linear) -> np.ndarray:
N, D = X.shape
results = np.empty(N)
for i in range(N):
a, b, x = X[i, :]
results[i] = func(a, b, x)
return results
sp = ProblemSpec({
'names': ['a', 'b', 'x'],
'bounds': [
[-1, 0],
[-1, 0],
[-1, 1],
],
})
(
sp.sample_sobol(2**6)
.evaluate(wrapped_linear)
.analyze_sobol()
)
sp.to_df()
# [ ST ST_conf
# a 0.173636 0.072142
# b 0.167933 0.059599
# x 0.654566 0.208328,
# S1 S1_conf
# a 0.182788 0.111548
# b 0.179003 0.145714
# x 0.664727 0.241977,
# S2 S2_conf
# (a, b) -0.022070 0.185510
# (a, x) -0.010781 0.186743
# (b, x) -0.014616 0.279925]
The equivalent of the previous example, using the core SALib functions
directly, is shown below:

.. code:: python
problem = {
'names': ['a', 'b', 'x'],
@@ -108,56 +163,85 @@ model in a `for` loop themselves.
# (a, x) -3.902439e-03 0.202343
# (b, x) -3.902439e-03 0.232957]
This highlights one usability aspect of using the SALib `ProblemSpec` Interface - it
automatically applies the model for each individual sample set in a `for` loop
(at the cost of computational efficiency).
Parallel evaluation and analysis
--------------------------------

Here we expand on some technical details that enable parallel evaluation and
analysis. We noted earlier that the model being "wrapped" is also passed in
as an argument. This is to facilitate parallel evaluation, as the arguments
to the wrapper are passed on to workers. The approach works by using Python's
`mutable default argument <https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments>`_
behavior.
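
The mechanism can be sketched independently of SALib: a default argument is evaluated once, when the ``def`` statement runs, and is stored on the function object. The wrapper therefore carries its own reference to the model. The names below are illustrative, and the snippet makes no claim about how SALib hands functions to its workers.

```python
def linear(a, b, x):
    return a + b * x


def wrapped(X, func=linear):
    # func comes from the defaults stored on `wrapped`,
    # not from a lookup of the global name `linear` at call time.
    return [func(a, b, x) for a, b, x in X]


# The default value lives on the function object itself.
stored_defaults = wrapped.__defaults__

original = linear
linear = None  # rebinding the global name does not affect the stored default
result = wrapped([(1.0, 2.0, 3.0)])
```

Because the reference is bound at definition time, the wrapper keeps working even if the module-level name is later changed or is not visible where the wrapper is called.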

A further consideration is that imported modules/packages are not made
available to workers in cases where functions are defined in the same file
SALib is used in. Running the previous example with :code:`.evaluate(wrapped_linear, nprocs=2)`
will fail with :code:`NameError: name 'np' is not defined`.

The quick fix is to re-import the required packages within the model function
itself:

.. code:: python
from SALib import ProblemSpec
def wrapped_linear(X: np.ndarray, func=linear) -> np.ndarray:
import numpy as np # re-import necessary packages
N, D = X.shape
results = np.empty(N)
for i in range(N):
a, b, x = X[i, :]
results[i] = func(a, b, x)
sp = ProblemSpec({
'names': ['a', 'b', 'x'],
'bounds': [
[-1, 0],
[-1, 0],
[-1, 1],
],
})
return results
(
sp.sample_sobol(2**6)
.evaluate(wrapped_linear)
.analyze_sobol()
)
sp.to_df()
This can, however, get unwieldy for complicated models. The recommended best
practice is to separate implementation (i.e., model definitions) from its use.
Simply moving the model functions into a separate file is enough for this
example, such that the project structure is something like:

# [ ST ST_conf
# a 0.173636 0.072142
# b 0.167933 0.059599
# x 0.654566 0.208328,
# S1 S1_conf
# a 0.182788 0.111548
# b 0.179003 0.145714
# x 0.664727 0.241977,
# S2 S2_conf
# (a, b) -0.022070 0.185510
# (a, x) -0.010781 0.186743
# (b, x) -0.014616 0.279925]
::

project_directory
├── model_definition.py
└── analysis.py


.. tip:: **Project structure**

The project structure shown above is for example purposes only.
It is highly recommended that a standardized directory structure,
such as https://github.com/drivendata/cookiecutter-data-science,
be adopted to improve usability and reproducibility.

We noted earlier that the model being "wrapped" is also passed in as an argument.
This is to facilitate parallel evaluation, as the arguments to the wrapper
are passed on to workers. The approach works by using Python's
`mutable default argument <https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments>`_
behavior.

Technical detail aside, defining the model this way allows the model to be evaluated in parallel:
Here, :code:`model_definition.py` holds the model definitions:

.. code:: python
from SALib import ProblemSpec
import numpy as np
def linear(a: float, b: float, x: float) -> float:
return a + b * x
def wrapped_linear(X: np.ndarray, func=linear) -> np.ndarray:
N, D = X.shape
results = np.empty(N)
for i in range(N):
a, b, x = X[i, :]
results[i] = func(a, b, x)
return results
and :code:`analysis.py` contains use of SALib:

.. code:: python
from model_definition import wrapped_linear
sp = ProblemSpec({
@@ -172,20 +256,23 @@ Technical detail aside, defining the model this way allows the model to be evalu
(
sp.sample_sobol(2**6)
.evaluate(wrapped_linear, nprocs=2)
.analyze_sobol()
.analyze_sobol(nprocs=2)
)
sp.to_df()
# [ ST ST_conf
# a 0.166372 0.064571
# b 0.164554 0.068605
# x 0.665150 0.191152,
# S1 S1_conf
# a 0.201450 0.152915
# b 0.165128 0.124578
# x 0.670300 0.254541,
# S2 S2_conf
# (a, b) -0.027733 0.178632
# (a, x) -0.068051 0.257325
# (b, x) 0.000958 0.257001]
.. note:: **Multi-processing**

Some interactive Python consoles, including earlier versions of
IPython, may appear to hang on Windows when utilizing parallel
evaluation and analysis. In such cases, the recommended workaround
is to wrap use of SALib with a :code:`__main__` check to ensure
it is only run in the top-level environment.

.. code:: python
if __name__ == "__main__":
(
sp.sample_sobol(2**6)
.evaluate(wrapped_linear, nprocs=2)
.analyze_sobol(nprocs=2)
)
4 changes: 2 additions & 2 deletions src/SALib/util/problem.py
@@ -173,8 +173,8 @@ def evaluate(self, func, *args, **kwargs):
-------
self : ProblemSpec object
"""
if "nprocs" in kwargs:
nprocs = kwargs.pop("nprocs")
nprocs = kwargs.pop("nprocs", 1)
if nprocs > 1:
return self.evaluate_parallel(func, *args, nprocs=nprocs, **kwargs)

self.results = func(self._samples, *args, **kwargs)
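
The effect of ``kwargs.pop("nprocs", 1)`` in the change above can be illustrated with a standalone sketch. The ``evaluate`` function below is a hypothetical stand-in, not SALib's implementation: it only shows that ``nprocs`` always has a value and is never forwarded to the model function.

```python
def evaluate(func, samples, **kwargs):
    """Hypothetical stand-in for a method that dispatches on nprocs."""
    # pop with a default: nprocs is always defined, and it is removed
    # from kwargs so it is not passed on to func.
    nprocs = kwargs.pop("nprocs", 1)
    if nprocs > 1:
        # A real implementation would hand off to a parallel path here.
        return ("parallel", nprocs, kwargs)
    return ("serial", func(samples, **kwargs))


serial = evaluate(sum, [1, 2, 3])
parallel = evaluate(sum, [1, 2, 3], nprocs=2)
```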