# Executor with Electrostatic Forces

This tutorial highlights the capability to run and monitor user
applications using a libEnsemble Executor.

Our calling script registers a compiled MPI program that simulates
electrostatic forces between a collection of particles. The simulator function
launches instances of this executable and reads the result from an output file.

libEnsemble's MPI Executor can automatically detect available MPI runners and 
resources, and by default, divides them equally among workers.

**Note that for notebooks** the multiprocessing start method should be set to
`fork` (default on Linux). To use with `spawn` (default on Windows and macOS),
use the `multiprocess` library.
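
As a minimal sketch of the note above, the standard library's `multiprocessing` module can report the start method in effect and switch to `fork` where the platform offers it. Only standard-library calls are used; whether forcing `fork` suits your environment is an assumption to verify.

```python
import multiprocessing

# Report the start method currently in effect for this interpreter.
current = multiprocessing.get_start_method()
print(f"Current start method: {current}")

# Switch to "fork" only where the platform offers it (it is not available on Windows).
if current != "fork" and "fork" in multiprocessing.get_all_start_methods():
    multiprocessing.set_start_method("fork", force=True)

print(f"Start method now: {multiprocessing.get_start_method()}")
```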

Let's make sure libEnsemble is installed.

In [None]:
pip install libensemble

## Getting Started

**An MPI distribution is required to use this notebook**.

We recommend `mpich`, but `Open MPI` can also be used. If running in a hosted
notebook with Open MPI, the following may be required.

In [None]:
import os

os.environ['OMPI_ALLOW_RUN_AS_ROOT'] = '1'
os.environ['OMPI_ALLOW_RUN_AS_ROOT_CONFIRM'] = '1'

The simulation source code ``forces.c`` can be obtained directly from the
libEnsemble repository.

Assuming MPI and its C compiler ``mpicc`` are available, obtain
``forces.c`` and compile it into an executable (``forces.x``) with:  

In [None]:
import subprocess
import requests

# This line is not necessary if forces.c is present in the current directory.
url = "https://raw.githubusercontent.com/Libensemble/libensemble/main/libensemble/tests/scaling_tests/forces/forces_app/forces.c"
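forces = requests.get(url)
forces.raise_for_status()  # Stop here if the download failed

with open("./forces.c", "wb") as f:
    f.write(forces.content)

# Compile the forces MPI executable; raise an error if compilation fails.
subprocess.run("mpicc -O3 -o forces.x forces.c -lm".split(), check=True)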
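
If the compile step fails, it may help to check whether the MPI tools are on the `PATH` at all. The following is a small sketch using only the standard library. The helper name `find_mpi_tools` is hypothetical, and the tool names checked (`mpicc`, `mpirun`, `mpiexec`) are common defaults that may differ on your system.

```python
import shutil


def find_mpi_tools(names=("mpicc", "mpirun", "mpiexec")):
    """Map each tool name to its location on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


for tool, path in find_mpi_tools().items():
    print(f"{tool}: {path or 'not found'}")
```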

### Clean Up

Before we start, let's make sure any output from previous runs is cleaned up.

In [None]:
# To rerun this notebook, we need to delete the ensemble directory.
import os
import shutil

output_files = ["ensemble.log", "libE_stats.txt"]
for file_path in output_files:
    try:
        os.remove(file_path)
    except FileNotFoundError:
        pass  # Nothing to clean up

# Remove the ensemble directory if it exists.
shutil.rmtree("ensemble", ignore_errors=True)

## Simulation Function

Our simulation function is where we'll use libEnsemble's executor to run our MPI application. We'll wait until the task has finished and then send the results back to the manager.

Running the following cell will load the function into memory. In a standard Python environment (outside of a notebook), this function would be in a separate file (e.g., ``forces_simf.py``) and imported in your calling script.

In [None]:
import numpy as np

# Optional status codes to display in libE_stats.txt for each gen or sim
from libensemble.message_numbers import WORKER_DONE, TASK_FAILED


def run_forces(H, persis_info, sim_specs, libE_info):
    calc_status = 0

    # Parse out num particles, from generator function
    particles = str(int(H["x"][0][0]))

    # num particles, timesteps, also using num particles as seed
    args = particles + " " + str(10) + " " + particles

    # Retrieve our MPI Executor
    exctr = libE_info["executor"]

    # Submit our forces app for execution.
    # Each task runs on one process here; omit num_procs to use all
    # cores available to the worker.
    task = exctr.submit(app_name="forces", app_args=args, num_procs=1)

    # Block until the task finishes
    task.wait(timeout=60)
    
    # Stat file to check for bad runs
    statfile = "forces.stat"

    # Try loading final energy reading, set the sim's status
    try:
        data = np.loadtxt(statfile)
        final_energy = data[-1]
        calc_status = WORKER_DONE
    except Exception:
        final_energy = np.nan
        calc_status = TASK_FAILED

    # Define our output array, populate with energy reading
    outspecs = sim_specs["out"]
    output = np.zeros(1, dtype=outspecs)
    output["energy"][0] = final_energy

    # Return final information to worker, for reporting to manager
    return output, persis_info, calc_status

The `run_forces` function retrieves the generated number of particles from ``H`` and
constructs an argument string for our launched application. The particle count doubles
up as a random number seed here.
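
To see why the input is read with two indices, here is a small standalone sketch. It builds a one-row stand-in for ``H`` with the same field layout this tutorial uses for the generator output, `("x", float, (1,))`. The sampled value is made up for illustration.

```python
import numpy as np

# One-row stand-in for the H slice passed to the sim function:
# a single field "x" holding a float vector of length 1.
H = np.zeros(1, dtype=[("x", float, (1,))])
H["x"][0][0] = 1742.6  # hypothetical sampled value

# First index selects the row, second selects the vector component.
particles = str(int(H["x"][0][0]))

# Particle count, 10 timesteps, and the particle count again as the seed.
args = particles + " " + str(10) + " " + particles
print(args)  # 1742 10 1742
```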

After submitting the "forces" app for execution, a `Task` object is returned.
This object can be polled, killed, and evaluated in a variety of helpful ways.
For now, we're satisfied with waiting for the task to complete via `task.wait()`.
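
The polling mentioned above could look roughly like the following sketch. The helper name `wait_with_polling` and its default values are hypothetical. It assumes the task object exposes `poll()`, `kill()`, `finished`, and `state`, as described in libEnsemble's executor documentation.

```python
import time


def wait_with_polling(task, timeout=60.0, delay=0.5):
    """Poll a task until it finishes, killing it if the timeout expires."""
    start = time.time()
    while not task.finished:
        if time.time() - start > timeout:
            task.kill()  # Stop the application
            break
        time.sleep(delay)
        task.poll()  # Refresh task.finished and task.state
    return task.state
```

Inside `run_forces`, such a helper could be called in place of `task.wait(timeout=60)`, for example `wait_with_polling(task, timeout=60)`. A loop like this also leaves room to inspect intermediate output between polls.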

Our application produces a `forces.stat` file that contains energy outputs from the simulation.

To complete our simulation function, we parse the last energy value from the output file into
a local output History array and, if successful, set the simulation function's exit
status ``calc_status`` to `WORKER_DONE`. Otherwise, we send back `NaN` and a `TASK_FAILED`
status.

`calc_status` is an optional return value and will be displayed in the `libE_stats.txt` log file.
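
The parsing step can be tried in isolation with the sketch below. It assumes the stat file holds one energy value per line. The sample values are made up, and the helper name `read_final_energy` is hypothetical.

```python
import os
import tempfile

import numpy as np


def read_final_energy(statfile):
    """Return the last value in statfile, or NaN if it cannot be read."""
    try:
        data = np.loadtxt(statfile)
        return float(data[-1])
    except Exception:
        return np.nan


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "forces.stat")
    with open(good, "w") as f:
        f.write("-3.2\n-2.9\n-2.5\n")  # Made-up energies, one per line

    # Same output layout as sim_specs["out"] in this tutorial.
    output = np.zeros(1, dtype=[("energy", float)])
    output["energy"][0] = read_final_energy(good)
    print(output["energy"][0])  # Last value written above

    # A missing file falls through to the NaN branch.
    missing = os.path.join(tmp, "no_such_file.stat")
    print(read_final_energy(missing))
```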

## Calling Script

Let's begin by writing our calling script to parameterize our simulation and
generation functions and run the ensemble.

In [None]:
#!/usr/bin/env python
import os
import numpy as np
# from forces_simf import run_forces  # Sim func from current dir

from libensemble import Ensemble
from libensemble.gen_funcs.sampling import uniform_random_sample
from libensemble.tools import add_unique_random_streams
from libensemble.executors import MPIExecutor

# Initialize MPI Executor
exctr = MPIExecutor()

# Register simulation executable with executor
sim_app = os.path.join(os.getcwd(), "forces.x")
exctr.register_app(full_path=sim_app, app_name="forces")

We instantiate our `MPIExecutor` in the calling script.

Registering an application is as easy as providing the full file path and giving
it a memorable name. This Executor will later be retrieved within our simulation
function to launch the registered app.

Next, define the `libE_specs`, `sim_specs`, and `gen_specs` data structures:

In [None]:
nworkers = 2

# Global settings - including creating directories for each simulation
libE_specs = {
    "nworkers": nworkers,
    "comms": "local",
    "sim_dirs_make": True,
}

# State the sim_f, inputs, outputs
sim_specs = {
    "sim_f": run_forces,
    "in": ["x"],  # Name of input for sim_f (defined in gen_specs["out"])
    "out": [("energy", float)],
}

# State the gen_f, inputs, outputs, additional parameters
gen_specs = {
    "gen_f": uniform_random_sample,
    "out": [("x", float, (1,))],
    "user": {
        "lb": np.array([1000]),  # min particles
        "ub": np.array([3000]),  # max particles
        "gen_batch_size": 8,
    },
}

After configuring ``persis_info`` and ``exit_criteria``, we initialize the ensemble: 


In [None]:
# Instruct libEnsemble to exit after this many simulations
exit_criteria = {"sim_max": 8}

# Seed random streams for each worker, particularly for gen_f
persis_info = add_unique_random_streams({}, nworkers + 1)

# Initialize ensemble object, passing executor.
ensemble = Ensemble(executor=exctr,
                    libE_specs=libE_specs,
                    gen_specs=gen_specs,
                    sim_specs=sim_specs,
                    exit_criteria=exit_criteria,
                    persis_info=persis_info,
                   )



Now we are ready to run. To re-run the ensemble later, first run the "Clean Up" cell again.

In [None]:
# Launch libEnsemble
H, persis_info, flag = ensemble.run()

# See results
print(H[["sim_id", "x", "energy"]])

## Output files

That's it! As can be seen, with libEnsemble, it's relatively easy to get started with launching applications. Behind the scenes, libEnsemble evaluates default MPI runners and available resources and divides them among the workers.

This completes our calling script and simulation function.

Output files for each simulation will appear under the `ensemble` directory. Overall workflow information should appear in `libE_stats.txt` and `ensemble.log` as usual.

In [None]:
# Print libE_stats.txt
with open("libE_stats.txt", "r") as file:
    print(file.read())

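The file `ensemble.log` contains the MPI run lines for each simulation.
Note that if `num_procs` is omitted from the `submit()` call, the cores on your system
should be divided equally between the two workers; with `num_procs=1`, as used above,
each run line requests a single process.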

In [None]:
# Show the MPI run lines
with open("ensemble.log", "r") as file:
    for line in file:
        if "Launching" in line:
            colon_index = line.index(":", line.index("Launching"))
            print(line[colon_index + 1:].strip())
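
The equal division of cores noted above can be pictured with simple integer arithmetic. This is only an illustration under simplifying assumptions (a single node, and at least one core per worker); libEnsemble's actual resource detection is more involved. The helper name `cores_per_worker` is hypothetical.

```python
import os


def cores_per_worker(total_cores, nworkers):
    """Evenly split cores among workers, giving each worker at least one."""
    return max(1, total_cores // nworkers)


total = os.cpu_count() or 1  # cpu_count() may return None
print(f"{total} cores / 2 workers -> {cores_per_worker(total, 2)} per worker")
```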

That concludes this tutorial.

Each of these example files can be found in the repository under ``examples/tutorials/forces_with_executor``.

For further experimentation, we recommend trying out this libEnsemble tutorial
workflow on a cluster or multi-node system, since libEnsemble can also manage
those resources and is developed to coordinate computations at huge scales.
Please feel free to contact us or open an issue on GitHub if this tutorial
workflow doesn't work properly on your cluster or other compute resource.