Before Beginning
--------------------

Ensure that libEnsemble and (optionally) Matplotlib are installed. NumPy is pulled in as a dependency of libEnsemble:

    pip install libensemble
    pip install matplotlib
    
**Note for notebooks:** the multiprocessing start method should be set to `fork` (the default on Linux).
To run with `spawn` (the default on Windows and macOS), use the `multiprocess` library instead.
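
As a minimal sketch using only the standard library, the cell below lists the start methods the current platform offers and selects `fork` when it is available. Passing `force=True` is an assumption made for notebook use, where a start method may already have been fixed earlier in the session.

```python
import multiprocessing

# Start methods supported on this platform ("fork" is absent on Windows)
available = multiprocessing.get_all_start_methods()

if "fork" in available:
    # force=True overrides a start method that was already set in this session
    multiprocessing.set_start_method("fork", force=True)

current = multiprocessing.get_start_method()
print(f"available: {available}, current: {current}")
```

If `fork` is not listed, the start method is left unchanged and the `multiprocess` route described above applies.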

If running online (e.g., in Colab), you can install libEnsemble by running the following cell.

In [None]:
!pip install libensemble

Simple Sine Tutorial
--------------------

The foundation of writing libEnsemble routines is accounting for at least three components:

   1. A *generator function*, which produces values for simulations.
   2. A *simulator function*, which performs simulations based on values from the generator.
   3. A *calling script*, which defines settings, fields, and functions, then starts the run.
   
libEnsemble initializes a *manager* process and as many *worker* processes as the user requests.

For this tutorial, our generator will produce values sampled uniformly at random, and our simulator will calculate the sine of each.

All input and output values for each evaluation are stored in ``H``, the History array.
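
`H` is a NumPy structured array, so each field is addressed by name. The sketch below builds a small stand-in with only the `x` and `y` fields used in this tutorial. The name `H_demo` and the values are illustrative, and the real array also carries bookkeeping fields such as `sim_id`.

```python
import numpy as np

# Stand-in for a small slice of H: one generator output and one simulator output
H_demo = np.zeros(3, dtype=[("x", float, (1,)), ("y", float)])

# Each field is addressed by name and filled like an ordinary array
H_demo["x"] = [[0.0], [0.5], [1.0]]
H_demo["y"] = np.sin(H_demo["x"][:, 0])

print(H_demo.dtype.names)  # names of the fields
print(H_demo[["x", "y"]])  # select several fields at once
```

Note that `x` is declared with shape `(1,)`, so `H_demo["x"]` is a two-dimensional array with one column, while `H_demo["y"]` is one-dimensional.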


Generator function
----------------------

Let's begin by writing our generator function, or `gen_f`.

An available libEnsemble worker will call this generator function with the following parameters:

* `H`: A selection of the History array, passed to the generator function in case the user wants to generate new values based on simulation outputs. Since our generator produces random numbers, it’ll be ignored this time.

* `persis_info`: Dictionary with worker-specific information. In our case, this dictionary
  contains NumPy Random Streams for generating random numbers.

* `gen_specs`: Dictionary with user-defined parameters for the generator. Customizable parameters 
  such as boundaries and batch sizes are placed within the `gen_specs['user']` dictionary.

Later on, we'll populate `gen_specs` and `persis_info` when we initialize libEnsemble.
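
To make these inputs concrete, here is a hand-built sketch of what a generator might receive. The names `gen_specs_demo` and `persis_info_demo`, the plain-dictionary form, and the seed are assumptions for illustration. In an actual run, libEnsemble supplies these objects.

```python
import numpy as np

# Hypothetical stand-in for gen_specs: "out" describes the output fields,
# "user" holds the customizable parameters
gen_specs_demo = {
    "out": [("x", float, (1,))],
    "user": {
        "lower": np.array([-3]),
        "upper": np.array([3]),
        "gen_batch_size": 5,
    },
}

# Hypothetical stand-in for one worker's persis_info: a NumPy random stream
persis_info_demo = {"rand_stream": np.random.default_rng(1234)}  # arbitrary seed

# A generator reads its settings from the "user" sub-dictionary
batch_size = gen_specs_demo["user"]["gen_batch_size"]
lower = gen_specs_demo["user"]["lower"]
print(f"batch size: {batch_size}, dimensions: {len(lower)}")

# The random stream is an ordinary NumPy Generator
print(persis_info_demo["rand_stream"].random())
```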

The following code block is our simple generator function:

In [None]:
import numpy as np


def gen_random_sample(H, persis_info, gen_specs, _):
    # Pull out user parameters
    user_specs = gen_specs["user"]

    # Get lower and upper bounds from gen_specs
    lower = user_specs["lower"]
    upper = user_specs["upper"]

    # Determine how many values to generate
    num = len(lower)
    batch_size = user_specs["gen_batch_size"]

    # Create array of 'batch_size' zeros. Array dtype should match 'out' fields
    H_out = np.zeros(batch_size, dtype=gen_specs["out"])

    # Set the 'x' output field to contain random numbers, using random stream
    H_out["x"] = persis_info["rand_stream"].uniform(lower, upper, (batch_size, num))

    # Send back our output and persis_info
    return H_out, persis_info
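
The sampling line is the core of this generator. As a sketch outside libEnsemble, the cell below reproduces it with a locally created random stream (the seed and the name `rng` are arbitrary choices) to show why the requested shape is `(batch_size, num)`.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for illustration only
lower, upper = np.array([-3]), np.array([3])
batch_size, num = 5, len(lower)

# One row per generated point, one column per dimension of the bounds
samples = rng.uniform(lower, upper, (batch_size, num))

# This shape matches a field declared as ("x", float, (num,))
H_out = np.zeros(batch_size, dtype=[("x", float, (num,))])
H_out["x"] = samples

print(H_out["x"].shape)
```

Because the bounds are arrays, the same line would also handle multi-dimensional sampling if `lower` and `upper` held more than one entry.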

Simulator function
---------------------

Simulator functions, or `sim_f`s, perform calculations based on values from the generator function.
Here `H` holds the input values selected for this evaluation. The only new parameter is `sim_specs`, which serves a similar purpose to `gen_specs`.

In [None]:
def sim_find_sine(H, persis_info, sim_specs, _):
    # Create an output array of a single zero
    out = np.zeros(1, dtype=sim_specs["out"])

    # Set the zero to the sine of the input value stored in H
    out["y"] = np.sin(H["x"])

    # Send back our output and persis_info
    return out, persis_info
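
A simulator can be exercised on its own before it is handed to libEnsemble. The sketch below repeats the function so the cell runs independently, then calls it with a hypothetical one-row input. The names `H_row` and `sim_specs_demo`, and the use of a plain dictionary for `sim_specs`, are assumptions for illustration.

```python
import numpy as np


def sim_find_sine(H, persis_info, sim_specs, _):
    # Same body as the simulator above, repeated so this sketch runs on its own
    out = np.zeros(1, dtype=sim_specs["out"])
    out["y"] = np.sin(H["x"])
    return out, persis_info


# Hypothetical one-row input, standing in for the rows a worker receives
H_row = np.zeros(1, dtype=[("x", float, (1,))])
H_row["x"] = 0.5

sim_specs_demo = {"out": [("y", float)]}  # plain dict used as a stand-in

out, _ = sim_find_sine(H_row, {}, sim_specs_demo, None)
print(out["y"])
```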

Calling Script
--------------

Our calling script contains configuration for libEnsemble, the generator function, and the simulator function. It also performs the primary libEnsemble function call to run the ensemble.

In `libE_specs` we specify the number of workers and the type of manager/worker communication libEnsemble will use. The communication method `local` refers to Python's `multiprocessing` module.

We configure our `sim_f` and `gen_f` functions in `sim_specs` and `gen_specs`, which were previously passed to our user functions. These may be defined as objects or dictionaries, and they also tell libEnsemble which inputs and outputs to expect from those functions.

We specify the conditions under which libEnsemble should stop execution in `exit_criteria`.

Finally, we create the ensemble object, assign random streams, and run the ensemble.

In [None]:
import numpy as np
from pprint import pprint

# If importing sim and gen from files
# from sine_gen import gen_random_sample
# from sine_sim import sim_find_sine

from libensemble import Ensemble
from libensemble.specs import ExitCriteria, GenSpecs, LibeSpecs, SimSpecs

libE_specs = LibeSpecs(nworkers=4, comms="local")

gen_specs = GenSpecs(
    gen_f=gen_random_sample,  # Our generator function
    out=[("x", float, (1,))],  # gen_f output (name, type, size)
    user={
        "lower": np.array([-3]),  # lower boundary for random sampling
        "upper": np.array([3]),  # upper boundary for random sampling
        "gen_batch_size": 5,  # number of x's gen_f generates per call
    },
)

sim_specs = SimSpecs(
    sim_f=sim_find_sine,  # Our simulator function
    inputs=["x"],  # Input field names. "x" from gen_f output
    out=[("y", float)],  # sim_f output. "y" = sine("x")
)

exit_criteria = ExitCriteria(sim_max=80)  # Stop libEnsemble after 80 simulations

# Initialize and run the ensemble.
ensemble = Ensemble(sim_specs, gen_specs, exit_criteria, libE_specs)
ensemble.add_random_streams()  # set up the random streams unique to each worker
H, persis_info, flag = ensemble.run()  # start the ensemble. Blocks until completion.
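
The text above notes that the specs may also be defined as dictionaries. The following is a sketch of that form, not a replacement for the cell above. The `_dict` names and the two placeholder functions are hypothetical, and the key `in` is assumed to be the dictionary spelling of the `inputs` field, so check it against the documentation for your installed version.

```python
import numpy as np


def gen_placeholder(H, persis_info, gen_specs, _):
    """Placeholder standing in for gen_random_sample."""


def sim_placeholder(H, persis_info, sim_specs, _):
    """Placeholder standing in for sim_find_sine."""


libE_specs_dict = {"nworkers": 4, "comms": "local"}

gen_specs_dict = {
    "gen_f": gen_placeholder,
    "out": [("x", float, (1,))],
    "user": {
        "lower": np.array([-3]),
        "upper": np.array([3]),
        "gen_batch_size": 5,
    },
}

sim_specs_dict = {
    "sim_f": sim_placeholder,
    "in": ["x"],  # assumed dictionary key corresponding to `inputs`
    "out": [("y", float)],
}

exit_criteria_dict = {"sim_max": 80}
```

The object form used in the calling script has the advantage that field names and types can be validated when the specs are created.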

The next two cells inspect and plot the output data from this libEnsemble routine.

In [None]:
# See first 16 results
pprint(H[["sim_id", "x", "y"]][:16])

# To see all fields of H
# print([i for i in H.dtype.fields])
# print(H[:16])

In [None]:
import matplotlib.pyplot as plt

colors = ["b", "g", "r", "y", "m", "c", "k", "w"]

for i in range(1, ensemble.nworkers + 1):
    worker_xy = H[H["sim_worker"] == i]  # rows evaluated by worker i
    x = worker_xy["x"][:, 0]  # flatten the single-column "x" field
    y = worker_xy["y"]
    plt.scatter(x, y, label=f"Worker {i}", c=colors[i - 1])

plt.title("Sine calculations for a uniformly sampled random distribution")
plt.xlabel("x")
plt.ylabel("sine(x)")
plt.legend(loc="lower right")
plt.show()
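
The plot groups points by the `sim_worker` field. As a numerical companion, the sketch below counts evaluations per worker and checks the stored sine values. It uses a hypothetical stand-in array, `H_demo`, with made-up worker assignments so that it runs without a completed ensemble. With real results, `H` would take its place.

```python
import numpy as np

# Hypothetical stand-in for H with only the fields used here
H_demo = np.zeros(8, dtype=[("sim_worker", int), ("x", float, (1,)), ("y", float)])
H_demo["sim_worker"] = [1, 2, 3, 4, 1, 2, 3, 4]
H_demo["x"] = np.linspace(-3, 3, 8).reshape(8, 1)
H_demo["y"] = np.sin(H_demo["x"][:, 0])

# Count how many simulations each worker ran
workers, counts = np.unique(H_demo["sim_worker"], return_counts=True)
for w, c in zip(workers, counts):
    print(f"Worker {w}: {c} simulations")

# Confirm every stored y equals the sine of its x
values_match = np.allclose(H_demo["y"], np.sin(H_demo["x"][:, 0]))
print(f"All y values match sin(x): {values_match}")
```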