Merge pull request #561 from SALib/update-misleading-docs-560

Update misleading docs
SALib · Apr 1, 2023 · 6a85248
2 parents 50e4bde + f48aa45
commit 6a85248
Showing 3 changed files with 168 additions and 71 deletions.
diff --git a/docs/developers_guide.md b/docs/developers_guide.md
@@ -35,3 +35,13 @@ In a command prompt
 > cd docs
 > sphinx-build . ./html
 ```
+
+## Prior to submitting a PR
+
+Run the commands below to catch any formatting issues.
+
+```bash
+# One-time setup (installs the git hook): pre-commit install
+
+pre-commit run --all-files
+```
diff --git a/docs/user_guide/wrappers.rst b/docs/user_guide/wrappers.rst
@@ -1,5 +1,6 @@
+==========================
 Wrapping an existing model
---------------------------
+==========================
 
 SALib performs sensitivity analysis for any model that can be expressed in the form of :math:`f(X) = Y`,
 where :math:`X` is a matrix of inputs (often referred to as the model's factors).
@@ -17,6 +18,7 @@ write a wrapper to allow use with SALib. This is illustrated here with a simple
         """Return y = a + b + x"""
         return a + b + x
 
+
 As SALib expects a (numpy) matrix of factors, we simply "wrap" the function above like so:
 
 .. code:: python
@@ -29,12 +31,15 @@ As SALib expects a (numpy) matrix of factors, we simply "wrap" the function abov
         # Then call the original model
         return func(a, b, x)
 
-.. note:: Wrapped function is an argument
+
+.. note:: **Wrapped function is an argument**
+
     Note here that the model being "wrapped" is also passed in as an argument.
     This will be revisited further down below.
 
 
-.. tip:: Interfacing with external models/programs
+.. tip:: **Interfacing with external models/programs**
+
     Here we showcase interacting with models written in Python.
     If the model is an external program, this is where interfacing code
     would be written.
@@ -57,25 +62,75 @@ Constants, which SALib should not consider, can be expressed by defining default
         return func(a, b, x)
 
 
-Note that the first argument to the wrapper function(s) is a numpy array of shape
-:math:`N*D`, where :math:`D` is the number of model factors (dimensions) and
-:math:`N` is the number of their combinations. The argument name is, by convention,
-denoted as :code:`X`. This is to maximize compatibility with all methods provided
-in SALib as they expect the first argument to hold the model factor values.
-Using :py:func:`functools.partial` from the `functools` package to create wrappers can be useful.
+Note that the first argument to any function provided to SALib is assumed to be
+a numpy array of shape :math:`N \times D`, where :math:`D` is the number of model
+factors (dimensions) and :math:`N` is the number of sampled factor combinations.
+By convention, this argument is named :code:`X`. This maximizes compatibility
+with all methods provided in SALib, as they expect the first argument to hold
+the model factor values. :py:func:`functools.partial`, from the standard
+library's `functools` module, can be useful for creating wrappers.
 
-In this example, the model (:code:`linear()`) can be used with both scalar inputs or `numpy` arrays.
-In cases where `a`, `b` or `x` are a vector of inputs, `numpy` will automatically vectorize the
-calculation.
-
-There are many cases where the model is not (or cannot be easily) expressed in a vectorizable form.
-When using the core SALib functions directly in such cases, the user is expected to evaluate the
-model in a `for` loop themselves.
+In this example, the model (:code:`linear()`) can be used with either scalar
+inputs or `numpy` arrays. In cases where `a`, `b` or `x` is a vector of
+inputs, `numpy` will automatically vectorize the calculation. There are many
+cases where the model is not (or cannot easily be) expressed in a vectorizable
+form. In such cases, simply apply a :code:`for` loop as in the example below.
 
 .. code:: python
 
-    from SALib.sample import saltelli
-    from SALib.analyze import sobol
+    import numpy as np
+    from SALib import ProblemSpec
+
+
+    def linear(a: float, b: float, x: float) -> float:
+        return a + b * x
+
+
+    def wrapped_linear(X: np.ndarray, func=linear) -> np.ndarray:
+        N, D = X.shape
+        results = np.empty(N)
+        for i in range(N):
+            a, b, x = X[i, :]
+            results[i] = func(a, b, x)
+
+        return results
+
+
+    sp = ProblemSpec({
+        'names': ['a', 'b', 'x'],
+        'bounds': [
+            [-1, 0],
+            [-1, 0],
+            [-1, 1],
+        ],
+    })
+
+    (
+        sp.sample_sobol(2**6)
+        .evaluate(wrapped_linear)
+        .analyze_sobol()
+    )
+
+    sp.to_df()
+
+    # [         ST   ST_conf
+    #  a  0.173636  0.072142
+    #  b  0.167933  0.059599
+    #  x  0.654566  0.208328,
+    #           S1   S1_conf
+    #  a  0.182788  0.111548
+    #  b  0.179003  0.145714
+    #  x  0.664727  0.241977,
+    #                S2   S2_conf
+    #  (a, b) -0.022070  0.185510
+    #  (a, x) -0.010781  0.186743
+    #  (b, x) -0.014616  0.279925]
+
+
+The equivalent of the previous example using the core SALib functions is shown
+below:
+
+.. code:: python
 
     problem = {
         'names': ['a', 'b', 'x'],
@@ -108,56 +163,85 @@ model in a `for` loop themselves.
     #  (a, x) -3.902439e-03  0.202343
     #  (b, x) -3.902439e-03  0.232957]
 
-This highlights one usability aspect of using the SALib `ProblemSpec` Interface - it
-automatically applies the model for each individual sample set in a `for` loop
-(at the cost of computational efficiency).
+
+Parallel evaluation and analysis
+--------------------------------
+
+Here we expand on some technical details that enable parallel evaluation and
+analysis. We noted earlier that the model being "wrapped" is also passed in
+as an argument. This is to facilitate parallel evaluation, as the arguments
+to the wrapper are passed on to workers. The approach works by using Python's
+`mutable default argument <https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments>`_
+behavior.
+
+A further consideration is that imported modules/packages are not made
+available to workers in cases where functions are defined in the same file
+SALib is used in. Running the previous example with :code:`.evaluate(wrapped_linear, nprocs=2)`
+will fail with :code:`NameError: name 'np' is not defined`.
+
+The quick fix is to re-import the required packages within the model function
+itself:
 
 .. code:: python
 
-    from SALib import ProblemSpec
+    def wrapped_linear(X: np.ndarray, func=linear) -> np.ndarray:
+        import numpy as np  # re-import necessary packages
 
+        N, D = X.shape
+        results = np.empty(N)
+        for i in range(N):
+            a, b, x = X[i, :]
+            results[i] = func(a, b, x)
 
-    sp = ProblemSpec({
-        'names': ['a', 'b', 'x'],
-        'bounds': [
-            [-1, 0],
-            [-1, 0],
-            [-1, 1],
-        ],
-    })
+        return results
 
-    (
-        sp.sample_sobol(2**6)
-        .evaluate(wrapped_linear)
-        .analyze_sobol()
-    )
 
-    sp.to_df()
+This can, however, get unwieldy for complicated models. The recommended best
+practice is to separate implementation (i.e., model definitions) from its use.
+Simply moving the model functions into a separate file is enough for this
+example, such that the project structure is something like:
 
-    # [         ST   ST_conf
-    #  a  0.173636  0.072142
-    #  b  0.167933  0.059599
-    #  x  0.654566  0.208328,
-    #           S1   S1_conf
-    #  a  0.182788  0.111548
-    #  b  0.179003  0.145714
-    #  x  0.664727  0.241977,
-    #                S2   S2_conf
-    #  (a, b) -0.022070  0.185510
-    #  (a, x) -0.010781  0.186743
-    #  (b, x) -0.014616  0.279925]
+::
+
+    project_directory
+    ├── model_definition.py
+    └── analysis.py
+
+
+.. tip:: **Project structure**
+
+    The project structure shown above is for example purposes only.
+    It is highly recommended that a standardized directory structure,
+    such as https://github.com/drivendata/cookiecutter-data-science,
+    be adopted to improve usability and reproducibility.
 
-We also noted earlier that the model being "wrapped" is also passed in as an argument.
-This is to facilitate parallel evaluation, as the arguments to the wrapper
-are passed on to workers. The approach works be using Python's
-`mutable default argument <https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments>`_
-behavior.
 
-Technical detail aside, defining the model this way allows the model to be evaluated in parallel:
+Here, :code:`model_definition.py` holds the model definitions:
 
 .. code:: python
 
-    from SALib import ProblemSpec
+    import numpy as np
+
+
+    def linear(a: float, b: float, x: float) -> float:
+        return a + b * x
+
+
+    def wrapped_linear(X: np.ndarray, func=linear) -> np.ndarray:
+        N, D = X.shape
+        results = np.empty(N)
+        for i in range(N):
+            a, b, x = X[i, :]
+            results[i] = func(a, b, x)
+
+        return results
+
+
+and :code:`analysis.py` contains use of SALib:
+
+.. code:: python
+
+    from SALib import ProblemSpec
+    from model_definition import wrapped_linear
 
 
     sp = ProblemSpec({
@@ -172,20 +256,23 @@ Technical detail aside, defining the model this way allows the model to be evalu
     (
         sp.sample_sobol(2**6)
         .evaluate(wrapped_linear, nprocs=2)
-        .analyze_sobol()
+        .analyze_sobol(nprocs=2)
     )
 
-    sp.to_df()
 
-    # [         ST   ST_conf
-    #  a  0.166372  0.064571
-    #  b  0.164554  0.068605
-    #  x  0.665150  0.191152,
-    #           S1   S1_conf
-    #  a  0.201450  0.152915
-    #  b  0.165128  0.124578
-    #  x  0.670300  0.254541,
-    #                S2   S2_conf
-    #  (a, b) -0.027733  0.178632
-    #  (a, x) -0.068051  0.257325
-    #  (b, x)  0.000958  0.257001]
+.. note:: **Multi-processing**
+
+    Some interactive Python consoles, including earlier versions of
+    IPython, may appear to hang on Windows when utilizing parallel
+    evaluation and analysis. In such cases, the recommended workaround
+    is to wrap use of SALib with a :code:`__main__` check to ensure
+    it is only run in the top-level environment.
+
+    .. code:: python
+
+        if __name__ == "__main__":
+            (
+                sp.sample_sobol(2**6)
+                .evaluate(wrapped_linear, nprocs=2)
+                .analyze_sobol(nprocs=2)
+            )
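
The wrappers.rst text above mentions that `functools.partial` can be useful for creating wrappers but does not show it. The following is a minimal sketch of one way that might look, not code from this PR. The names `wrapped_fixed_b` and `model_b2` are hypothetical, and plain nested lists stand in for the numpy array that SALib would pass as `X`, so that the snippet has no dependencies.

```python
from functools import partial


def linear(a, b, x):
    """Return y = a + b * x"""
    return a + b * x


def wrapped_fixed_b(X, b, func=linear):
    # Hypothetical wrapper: `b` is treated as a constant, so each row of X
    # holds only the two varying factors (a, x).
    return [func(a, b, x) for a, x in X]


# partial() pins the constant, leaving a callable whose first argument is
# still X, which is the calling convention the docs describe.
model_b2 = partial(wrapped_fixed_b, b=2.0)

# Two sample rows of (a, x); in real use this would be an N x D numpy array.
X = [[0.0, 1.0], [1.0, 3.0]]
results = model_b2(X)
```

In real use the wrapper would presumably return a numpy array (as the documented `wrapped_linear` does) rather than a list.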
diff --git a/src/SALib/util/problem.py b/src/SALib/util/problem.py
@@ -173,8 +173,8 @@ def evaluate(self, func, *args, **kwargs):
         -------
         self : ProblemSpec object
         """
-        if "nprocs" in kwargs:
-            nprocs = kwargs.pop("nprocs")
+        nprocs = kwargs.pop("nprocs", 1)
+        if nprocs > 1:
             return self.evaluate_parallel(func, *args, nprocs=nprocs, **kwargs)
 
         self.results = func(self._samples, *args, **kwargs)
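
The change to `evaluate` above alters when the parallel path is taken: in the removed lines, any call that passed `nprocs` (including `nprocs=1`) was handed to `evaluate_parallel`, whereas the new lines do so only when `nprocs > 1`. The sketch below isolates that dispatch logic. `DispatchSketch`, its `path` attribute, and `total` are hypothetical stand-ins for illustration, and the `evaluate_parallel` here is a placeholder that does no actual multiprocessing.

```python
class DispatchSketch:
    """Hypothetical stand-in for ProblemSpec; only the nprocs dispatch is modelled."""

    def __init__(self, samples):
        self._samples = samples
        self.results = None
        self.path = None  # records which branch ran (illustration only)

    def evaluate_parallel(self, func, *args, nprocs=1, **kwargs):
        # Placeholder: a real implementation would split samples across workers.
        self.path = f"parallel({nprocs})"
        self.results = func(self._samples, *args, **kwargs)
        return self

    def evaluate(self, func, *args, **kwargs):
        # Same two lines as the new code in the diff: default to 1,
        # and only dispatch to the parallel path for more than one process.
        nprocs = kwargs.pop("nprocs", 1)
        if nprocs > 1:
            return self.evaluate_parallel(func, *args, nprocs=nprocs, **kwargs)

        self.path = "serial"
        self.results = func(self._samples, *args, **kwargs)
        return self


def total(samples, offset=0):
    # Toy model: row sums plus an optional constant.
    return [sum(row) + offset for row in samples]


spec = DispatchSketch([[1, 2], [3, 4]])
serial_default = spec.evaluate(total).path
serial_explicit = spec.evaluate(total, nprocs=1).path
parallel = spec.evaluate(total, nprocs=2, offset=10)
```

Because `nprocs` is always popped from `kwargs`, the model function never receives it, while other keyword arguments such as `offset` are passed through unchanged.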