Before Beginning
--------------------

Ensure that libEnsemble and (optionally) Matplotlib are installed. NumPy is installed automatically as a dependency of libEnsemble:

    pip install libensemble
    pip install matplotlib

Simple Local Sine Tutorial
-------------------------------

This introductory tutorial demonstrates the capability to perform ensembles of
calculations in parallel using libEnsemble.

Writing a libEnsemble routine involves at least three components:

   1. A *generator function*, which produces values for simulations.
   2. A *simulator function*, which performs simulations based on values from the generator.
   3. A *calling script*, which defines settings, fields, and functions, then starts the run.

libEnsemble initializes a *manager* process and as many *worker* processes as the
user requests. The manager (via an *allocation function*) coordinates data transfer between
workers and assigns each worker units of work, each consisting of a function to run and
accompanying data. These functions perform their work in-line with Python and/or launch and
control user applications with libEnsemble's executors. Workers pass results back to the manager.

For this tutorial, we'll write our generator and simulator functions entirely in Python
without other applications. Our generator will produce uniformly sampled random
values, and our simulator will calculate the sine of each. We don't need to write
an allocation function, since libEnsemble's default is used when none is specified.
All generated and simulated values, alongside other parameters, are stored in ``H``,
the History array.

Generator function
----------------------

Let's begin by writing our generator function, or `gen_f`.

An available libEnsemble worker will call this generator function with the following parameters:

* `H`: The History array. A NumPy structured array
  for storing information about each point generated and processed in the ensemble.
  libEnsemble passes a selection of `H` to the generator function in case the user
  wants to generate new values based on previous data.

* `persis_info`: Dictionary with worker-specific
  information. In our case, this dictionary contains a NumPy random stream object
  for generating random numbers.

* `gen_specs`: Dictionary with user-defined fields and
  parameters for the generator. Customizable parameters such as boundaries and batch
  sizes are placed within the `gen_specs['user']` dictionary, while input/output fields
  and other specifications that libEnsemble depends on to operate the generator are
  placed outside `user`.

Later on, we'll populate ``gen_specs`` and ``persis_info`` when we initialize libEnsemble.
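
Since both `H` and the generator's output are NumPy structured arrays, a short sketch of how such an array behaves may help. The field description below mirrors the `gen_specs['out']` entry used later in this tutorial; the variable names are only illustrative.

```python
import numpy as np

# Same field description as the gen_specs["out"] entry used later in this tutorial
out_fields = [("x", float, (1,))]

# A structured array with 5 rows, each holding one 1-element 'x' value
out = np.zeros(5, dtype=out_fields)

print(out.dtype.names)  # names of the fields in each row
print(out["x"].shape)  # (5, 1)

# Fields are addressed by name, rows by index
out["x"][0] = 1.5
print(out[0])
```

Selecting a field by name returns all rows of that field at once, which is why the generator below can fill `out["x"]` in a single assignment.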

The following code-block is our simple generator function:

In [None]:
import numpy as np


def gen_random_sample(H, persis_info, gen_specs, _):
    # underscore parameter for advanced arguments

    # Pull out user parameters
    user_specs = gen_specs["user"]

    # Get lower and upper bounds from gen_specs
    lower = user_specs["lower"]
    upper = user_specs["upper"]

    # Determine how many values to generate
    num = len(lower)
    batch_size = user_specs["gen_batch_size"]

    # Create array of 'batch_size' zeros. Array dtype should match 'out' fields
    out = np.zeros(batch_size, dtype=gen_specs["out"])

    # Set the 'x' output field to contain random numbers, using random stream
    out["x"] = persis_info["rand_stream"].uniform(lower, upper, (batch_size, num))

    # Send back our output and persis_info
    return out, persis_info

Our function creates `batch_size` random numbers uniformly distributed
between the `lower` and `upper` bounds. A random stream
from `persis_info` is used to generate these values, which are then placed
into an output NumPy array that meets the specifications from `gen_specs['out']`.
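
The sampling step can be tried on its own, outside libEnsemble. The sketch below uses a locally created `numpy.random.default_rng` generator as a stand-in for `persis_info["rand_stream"]`, and two-dimensional bounds (a hypothetical choice; this tutorial uses one dimension) to show that each column is sampled within its own bounds.

```python
import numpy as np

# Stand-in for persis_info["rand_stream"]; the seed is arbitrary
rng = np.random.default_rng(seed=0)

# Hypothetical two-dimensional bounds, one entry per dimension
lower = np.array([-3.0, 0.0])
upper = np.array([3.0, 10.0])
batch_size = 5
num = len(lower)

# lower and upper (shape (num,)) broadcast against the requested (batch_size, num) shape
samples = rng.uniform(lower, upper, (batch_size, num))

print(samples.shape)  # (5, 2)
print(np.all((samples >= lower) & (samples <= upper)))
```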

### Exercise

Write a simple generator function that instead produces random integers, using
the `numpy.random.Generator.integers(low, high, size)` function.

In [None]:
import numpy as np


def gen_random_ints(H, persis_info, gen_specs, _):
    user_specs = gen_specs["user"]
    lower = user_specs["lower"]
    upper = user_specs["upper"]
    num = len(lower)
    batch_size = user_specs["gen_batch_size"]

    out = np.zeros(batch_size, dtype=gen_specs["out"])
    out["x"] = persis_info["rand_stream"].integers(lower, upper, (batch_size, num))

    return out, persis_info

Simulator function
---------------------

Simulator functions, or `sim_f`s, perform calculations based on values from the generator function.
The only new parameter here is `sim_specs`, which serves a similar purpose to `gen_specs`.

In [None]:
def sim_find_sine(H, persis_info, sim_specs, _):
    # Create an output array of a single zero
    out = np.zeros(1, dtype=sim_specs["out"])

    # Set the zero to the sine of the input value stored in H
    out["y"] = np.sin(H["x"])

    # Send back our output and persis_info
    return out, persis_info
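
A simulator like this can be checked by hand before running a full ensemble. The following is a minimal sketch that assumes the simulator receives a single row containing only the `x` field; the inputs are hand-built stand-ins, not what libEnsemble itself constructs, and the function body is repeated so the block runs on its own.

```python
import numpy as np


def sim_find_sine(H, persis_info, sim_specs, _):
    # Same body as the simulator above
    out = np.zeros(1, dtype=sim_specs["out"])
    out["y"] = np.sin(H["x"])
    return out, persis_info


# Hand-built stand-ins for the arguments (assumed shapes)
H_in = np.zeros(1, dtype=[("x", float, (1,))])
H_in["x"] = np.pi / 2
sim_specs = {"out": [("y", float)]}

out, _ = sim_find_sine(H_in, {}, sim_specs, None)
print(out["y"])  # sine of pi/2, so approximately 1.0
```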

Calling Script
--------------

Our calling script contains configuration for libEnsemble, the generator function, and the simulator function. It also performs the primary libEnsemble function call to initiate ensemble computation.

In a dictionary called `libE_specs` we specify the number of workers and the type of manager/worker communication libEnsemble will use. The communication method `local` refers to Python's `multiprocessing` module.

We configure the settings and specifications for our `gen_f` and `sim_f` functions in the `gen_specs` and
`sim_specs` dictionaries, which we saw previously being passed to our functions. These dictionaries also tell libEnsemble which inputs and outputs to expect from those functions.

Recall that each worker is assigned an entry in the `persis_info` dictionary that, in this tutorial, contains a NumPy random stream for uniform random sampling. We populate that dictionary here using the `add_unique_random_streams` utility from
the `tools` module. Finally, we specify the circumstances under which libEnsemble should stop execution in `exit_criteria`.
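
To make the per-worker layout concrete, here is a rough pure-NumPy sketch of what such a dictionary could look like. It is an approximation for illustration, not libEnsemble's implementation: the name `persis_info_sketch`, the `worker_num` key, and seeding each stream by its index are assumptions.

```python
import numpy as np

nworkers = 4

# Hypothetical stand-in: one entry per process (index 0 for the manager,
# 1..nworkers for the workers), each holding its own random stream
persis_info_sketch = {
    i: {"rand_stream": np.random.default_rng(i), "worker_num": i}
    for i in range(nworkers + 1)
}

# A generator running on worker 2 would draw only from that worker's stream
draw = persis_info_sketch[2]["rand_stream"].uniform(-3, 3, (5, 1))

print(sorted(persis_info_sketch))  # [0, 1, 2, 3, 4]
print(draw.shape)  # (5, 1)
```

Giving each worker a separate stream keeps the workers from producing identical sample sequences.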

In [None]:
from libensemble.libE import libE
from libensemble.tools import add_unique_random_streams
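# gen_random_sample and sim_find_sine are defined in the cells above.
# If running this as a standalone script instead, import them from
# the files where they were saved, for example:
# from tutorial_gen import gen_random_sample
# from tutorial_sim import sim_find_sine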

nworkers = 4
libE_specs = {"nworkers": nworkers, "comms": "local"}

gen_specs = {
    "gen_f": gen_random_sample,  # Our generator function
    "out": [("x", float, (1,))],  # gen_f output (name, type, size).
    "user": {
        "lower": np.array([-3]),  # random sampling lower bound
        "upper": np.array([3]),  # random sampling upper bound
        "gen_batch_size": 5,  # number of values gen_f will generate per call
    },
}

sim_specs = {
    "sim_f": sim_find_sine,  # Our simulator function
    "in": ["x"],  # Input field names. 'x' from gen_f output
    "out": [("y", float)],  # sim_f output. 'y' = sine('x')
}

persis_info = add_unique_random_streams({}, nworkers + 1)  # Initialize manager/workers random streams

exit_criteria = {"sim_max": 80}  # Stop libEnsemble after 80 simulations

With specification complete, libEnsemble can be initiated via the following function call:

In [None]:
# Primary libEnsemble call. Initiates manager and worker team, begins ensemble-calculations.

H, persis_info, flag = libE(sim_specs, gen_specs, exit_criteria, persis_info, libE_specs=libE_specs)

The following cells print and plot output data from this libEnsemble routine.

In [None]:
print([i for i in H.dtype.fields])
print(H[:16])
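
Beyond printing rows, the History array can be checked with ordinary NumPy operations. The sketch below runs on a small hand-built stand-in (`H_demo`, a hypothetical name) that has only the `x`, `y`, and `sim_worker` fields used in this tutorial; the real `H` contains more fields and rows.

```python
import numpy as np

# Small stand-in for the returned History array
H_demo = np.zeros(6, dtype=[("x", float, (1,)), ("y", float), ("sim_worker", int)])
H_demo["x"] = np.linspace(-3, 3, 6).reshape(6, 1)
H_demo["y"] = np.sin(H_demo["x"][:, 0])
H_demo["sim_worker"] = [1, 2, 3, 4, 1, 2]

# Every stored y should equal the sine of its x
all_match = np.allclose(H_demo["y"], np.sin(H_demo["x"][:, 0]))

# Count how many points each worker simulated
workers, counts = np.unique(H_demo["sim_worker"], return_counts=True)
per_worker = dict(zip(workers.tolist(), counts.tolist()))

print(all_match)
print(per_worker)
```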

In [None]:
import matplotlib.pyplot as plt

colors = ["b", "g", "r", "y", "m", "c", "k", "w"]

for i in range(1, nworkers + 1):
    worker_xy = np.extract(H["sim_worker"] == i, H)
    x = [entry.tolist()[0] for entry in worker_xy["x"]]
    y = [entry for entry in worker_xy["y"]]
    plt.scatter(x, y, label="Worker {}".format(i), c=colors[i - 1])

plt.title("Sine calculations for a uniformly sampled random distribution")
plt.xlabel("x")
plt.ylabel("sine(x)")
plt.legend(loc="lower right")
plt.savefig("tutorial_sines.png")