# SmartSim tutorial 1: getting started
In this notebook, we will walk through the most basic features of SmartSim: setting up an experiment that runs two models, launching it locally, and collecting its results. We will also look at how we can use the `Ensemble` API to run models collectively.

## 1.1 Running simple models 
The most common way of defining a workflow in SmartSim is through `Experiment`s. An experiment can start and stop a `Model` and check (`poll`) its status at any time. In section *1.2* we will also see how an `Experiment` can be used to run groups of models as `Ensemble`s.

We begin by importing the classes we need: `Experiment` and `RunSettings`. `RunSettings` is the object used in SmartSim to define what will be run by a given `Model`. It is the most basic way of defining execution parameters, and is well suited to programs launched locally, i.e. directly by the operating system, without a workload manager. We also import `os`, as we will need it to inspect the directory where the `Model`s place their output and error files.

In [2]:
import os
from smartsim import Experiment
from smartsim.settings import RunSettings

Throughout this notebook, we will incrementally build an `Experiment`. Let's start from the simplest case: a single-`Model` example. Our first `Model` will simply print `hello`, using the shell command `echo`.

In [3]:
exp = Experiment(name="tutorial-experiment", launcher="local")

settings_1 = RunSettings(exe="echo", exe_args="hello")
M1 = exp.create_model(name="tutorial-model-1", run_settings=settings_1)

Once the `Model` has been created by the `Experiment`, we can start it. By setting `summary=True`, we can see a summary of the experiment printed before it is actually launched. The summary will stay for 10 seconds, and it is useful as a last check. If we set `summary=False`, the experiment is launched immediately. We also explicitly set `block=True` (even though it is the default), so that `Experiment.start` waits until the last `Model` has finished before returning: it acts like a job monitor, letting us know whether processes run, complete, or fail.

In [4]:
exp.start(M1, block=True, summary=True)

=== LAUNCH SUMMARY ===
Experiment: tutorial-experiment
Experiment Path: /Users/arigazzi/Documents/DeepLearning/smartsim-dev/SmartSim/tutorials/01_getting_started/tutorial-experiment
Launching with: local
# of Ensembles: 0
# of Models: 1
Database: no

=== MODELS ===
tutorial-model-1
Model Parameters:
{}
Model Run Settings:
Executable: /bin/echo
Executable arguments: ['hello']




23:16:10 C02YR4ANLVCJ SmartSim[90058] INFO tutorial-model-1(90114): Completed


The model has completed. Let's look at the content of the current working directory.

In [5]:
os.listdir('.')

outputfile = './tutorial-model-1.out'
errorfile = './tutorial-model-1.err'

print("Content of tutorial-model-1.out:")
with open(outputfile, 'r') as fin:
    print(fin.read())
print("Content of tutorial-model-1.err:")
with open(errorfile, 'r') as fin:
    print(fin.read())

Content of tutorial-model-1.out:
hello

Content of tutorial-model-1.err:



We can see that two files, `tutorial-model-1.out` and `tutorial-model-1.err`, have been created. The `.out` file contains the output generated by `tutorial-model-1`, and the `.err` file would contain any error messages generated by it. Since there were no errors, the `.err` file is empty.
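
Conceptually, a locally launched `Model` behaves much like a child process whose standard output and standard error are redirected to two files. The following standard-library sketch illustrates that idea only; it is not SmartSim's implementation, and the helper `run_and_capture` and the name `sketch-model` are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_capture(name, command, workdir):
    """Run `command`, writing stdout to <name>.out and stderr to <name>.err.

    Hypothetical helper for illustration; not part of SmartSim.
    """
    out_path = Path(workdir) / f"{name}.out"
    err_path = Path(workdir) / f"{name}.err"
    with open(out_path, "w") as out, open(err_path, "w") as err:
        # subprocess.run waits until the child process exits,
        # which is comparable to starting with block=True.
        result = subprocess.run(command, stdout=out, stderr=err)
    return result.returncode, out_path, err_path


with tempfile.TemporaryDirectory() as workdir:
    # The current Python interpreter stands in for `echo` so the sketch is portable.
    command = [sys.executable, "-c", "print('hello')"]
    returncode, out_path, err_path = run_and_capture("sketch-model", command, workdir)
    out_text = out_path.read_text()
    err_text = err_path.read_text()

print("return code:", returncode)
print("stdout file:", repr(out_text))
print("stderr file:", repr(err_text))
```

A return code of zero together with an empty error file corresponds to the `Completed` status reported in the log above.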

Now let's run two different `Model` instances at the same time. This is just as easy as running one `Model`, and takes the same steps. This time, we will skip the summary. For each `Model`, we create a `RunSettings` object: it is recommended to always create separate `RunSettings` objects for each `Model`.

In [6]:
exp = Experiment(name="tutorial-experiment", launcher="local")

run_settings_1 = RunSettings("sleep", "3")
run_settings_2 = RunSettings("sleep", "5")
model_1 = exp.create_model("tutorial-model-1", run_settings_1)
model_2 = exp.create_model("tutorial-model-2", run_settings_2)
exp.start(model_1, model_2)

23:16:17 C02YR4ANLVCJ SmartSim[90058] INFO tutorial-model-1(90115): Completed
23:16:17 C02YR4ANLVCJ SmartSim[90058] INFO tutorial-model-2(90116): Running
23:16:18 C02YR4ANLVCJ SmartSim[90058] INFO tutorial-model-1(90115): Completed
23:16:18 C02YR4ANLVCJ SmartSim[90058] INFO tutorial-model-2(90116): Completed


Again, we can check the content of the output and error files.

In [7]:
outputfile = './tutorial-model-1.out'
errorfile = './tutorial-model-1.err'

print("Content of tutorial-model-1.out:")
with open(outputfile, 'r') as fin:
    print(fin.read())
print("Content of tutorial-model-1.err:")
with open(errorfile, 'r') as fin:
    print(fin.read())

outputfile = './tutorial-model-2.out'
errorfile = './tutorial-model-2.err'

print("Content of tutorial-model-2.out:")
with open(outputfile, 'r') as fin:
    print(fin.read())
print("Content of tutorial-model-2.err:")
with open(errorfile, 'r') as fin:
    print(fin.read())

Content of tutorial-model-1.out:

Content of tutorial-model-1.err:

Content of tutorial-model-2.out:

Content of tutorial-model-2.err:
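
To make the idea of starting several models at once and then waiting for all of them more concrete, here is a minimal standard-library sketch. It is not how SmartSim is implemented: the names `sketch-model-1` and `sketch-model-2` are hypothetical, and the sleep durations are shortened so that the example finishes quickly.

```python
import subprocess
import sys

# Two stand-in "models": each one only sleeps, like `sleep 3` and `sleep 5`
# above, but for a fraction of a second.
commands = {
    "sketch-model-1": [sys.executable, "-c", "import time; time.sleep(0.2)"],
    "sketch-model-2": [sys.executable, "-c", "import time; time.sleep(0.4)"],
}

# Popen returns immediately, so both processes run at the same time.
processes = {name: subprocess.Popen(cmd) for name, cmd in commands.items()}

# Wait for every process to finish, as a blocking start would.
statuses = {}
for name, proc in processes.items():
    returncode = proc.wait()
    statuses[name] = "Completed" if returncode == 0 else "Failed"

for name, status in statuses.items():
    print(f"{name}: {status}")
```

Because the processes overlap, the total wall time is close to the longest sleep rather than the sum of the two.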



In many cases, launching a program directly is not enough, and we want to start it through a run command instead. For example, if `mpirun` is installed on the system, we can run a model through it by specifying it as `run_command` in `RunSettings`, while still using the `local` launcher. Since `mpirun` takes arguments (e.g. to define how many processes will be run), we pass them by defining `run_args` in `RunSettings`.

In [8]:
exp = Experiment("tutorial", launcher="local")
run_settings = RunSettings("echo",
                           "hello world!",
                           run_command="mpirun",
                           run_args={"-np": 2})  # note: for the base RunSettings, run_args are passed literally
                      
model = exp.create_model("tutorial-model-mpirun", run_settings)
exp.start(model, summary=True)

=== LAUNCH SUMMARY ===
Experiment: tutorial
Experiment Path: /Users/arigazzi/Documents/DeepLearning/smartsim-dev/SmartSim/tutorials/01_getting_started/tutorial
Launching with: local
# of Ensembles: 0
# of Models: 1
Database: no

=== MODELS ===
tutorial-model-mpirun
Model Parameters:
{}
Model Run Settings:
Executable: /bin/echo
Executable arguments: ['hello', 'world!']
Run Command: mpirun
Run arguments: {'-np': 2}




23:16:36 C02YR4ANLVCJ SmartSim[90058] INFO tutorial-model-mpirun(90120): Completed
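
The run settings shown in the summary can be read as the pieces of a single command line: the run command, its arguments, the executable, and the executable's arguments. The sketch below assembles them in that order. The helper `build_command` is hypothetical and is not how SmartSim builds its commands internally; for instance, it keeps `echo` as given instead of resolving it to `/bin/echo`.

```python
def build_command(exe, exe_args, run_command=None, run_args=None):
    """Assemble an argument list: run command, its arguments, then the program.

    Hypothetical helper for illustration; not part of SmartSim.
    """
    command = []
    if run_command is not None:
        command.append(run_command)
        for flag, value in (run_args or {}).items():
            # Flags and values are appended literally, in dictionary order.
            command.append(flag)
            if value is not None:
                command.append(str(value))
    command.append(exe)
    # Split the argument string on whitespace, matching the list
    # ['hello', 'world!'] shown in the summary.
    command.extend(exe_args.split())
    return command


command = build_command(
    "echo", "hello world!", run_command="mpirun", run_args={"-np": 2}
)
print(" ".join(command))
```

Under these assumptions, the model above corresponds to running `mpirun -np 2 echo hello world!` in a shell.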


This time, since we passed `-np 2` to `mpirun`, two processes are started, so each launch should write the line `hello world!` to the output file twice.

In [9]:
outputfile = './tutorial-model-mpirun.out'
errorfile = './tutorial-model-mpirun.err'

print("Content of tutorial-model-mpirun.out:")
with open(outputfile, 'r') as fin:
    print(fin.read())
print("Content of tutorial-model-mpirun.err:")
with open(errorfile, 'r') as fin:
    print(fin.read())

Content of tutorial-model-mpirun.out:
hello world!
hello world!
hello world!
hello world!
hello world!
hello world!
hello world!
hello world!

Content of tutorial-model-mpirun.err:



## 1.2 Creating and running replicas and ensembles of models
In the two-model example above, the `Model` instances were created separately. There are more convenient ways of doing this, through `Ensemble`s. The first way we are going to see concerns running the exact same model several times. We first set up the example the standard way.

In [None]:
exp = Experiment(name="tutorial-experiment", launcher="local")
settings_1 = RunSettings(exe="echo", exe_args="hello")


Then, instead of creating a single `Model` as we did before, we use `create_ensemble`. Suppose we want to run the same model four times: we pass the `replicas=4` argument and simply start the `Ensemble`.

In [38]:
ensemble = exp.create_ensemble("ensemble-replica", replicas=4, run_settings=settings_1)
exp.start(ensemble, summary=True)

23:54:31 C02YR4ANLVCJ SmartSim[90058] INFO ensemble-replica_0(91216): Completed
23:54:31 C02YR4ANLVCJ SmartSim[90058] INFO ensemble-replica_1(91217): Completed
23:54:31 C02YR4ANLVCJ SmartSim[90058] INFO ensemble-replica_2(91218): Completed
23:54:31 C02YR4ANLVCJ SmartSim[90058] INFO ensemble-replica_3(91219): Completed
23:54:32 C02YR4ANLVCJ SmartSim[90058] INFO ensemble-replica_0(91216): Completed
23:54:32 C02YR4ANLVCJ SmartSim[90058] INFO ensemble-replica_1(91217): Completed
23:54:32 C02YR4ANLVCJ SmartSim[90058] INFO ensemble-replica_2(91218): Completed
23:54:35 C02YR4ANLVCJ SmartSim[90058] INFO ensemble-replica_3(91219): Completed


From the output, we see that four copies of our `Model`, named `ensemble-replica_0`, `ensemble-replica_1`, ... were run. In each output file, we will see that the same output was generated.

Now let's imagine that we don't want to run the *same* model four times, but we want to run variations of it. One way of doing this would be to define four models and start them through the `Experiment`.
For a few simple `Model`s, this would be OK, but what if we needed to run a large number of models that differ only in some parameter? Defining and adding each one separately would be tedious. For such cases, we will rely on a parameterized `Ensemble` of models.

Our goal is to run `python output_my_parameter.py`, changing some internal parameters of `output_my_parameter.py`. Clearly, we could pass the parameters as arguments, but in some cases this might not be possible (e.g. if the parameters were stored in a file and the executable did not accept them from the command line). We begin by defining the `Experiment` in the standard way.

In [28]:
exp = Experiment("tutorial-ensemble", launcher="local")
rs = RunSettings(exe="python", exe_args="output_my_parameter.py")

Then, we define the parameters we are going to set: `tutorial_name` and `tutorial_parameter`. In the original file `output_my_parameter.py`, which acts as a template, they occur as `;tutorial_name;` and `;tutorial_parameter;`. The semicolons delimit the placeholders, which are replaced with the desired values through a regular-expression substitution. We pass the parameters, stored in the dict `params`, to `create_ensemble`, along with the argument `perm_strategy="all_perm"`, which requests all possible combinations of the given parameter values. We have two options for each parameter, so our ensemble will run four `Model`s, each using a different copy of `output_my_parameter.py`. We attach the template file to the `Ensemble` instance, generate the configured copies with `exp.generate`, and run the experiment.

In [34]:
params = {"tutorial_name": ["Ellie", "John"], "tutorial_parameter": [2, 11]}
ensemble = exp.create_ensemble("ensemble", params=params, run_settings=rs, perm_strategy="all_perm")
config_file = "./output_my_parameter.py"
ensemble.attach_generator_files(to_configure=config_file)

exp.generate(ensemble, overwrite=True)
exp.start(ensemble)

23:43:35 C02YR4ANLVCJ SmartSim[90058] INFO Working in previously created experiment
23:43:41 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_0(91006): Completed
23:43:41 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_1(91007): Completed
23:43:41 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_2(91008): Completed
23:43:41 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_3(91009): Completed
23:43:42 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_0(91006): Completed
23:43:42 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_1(91007): Completed
23:43:42 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_2(91008): Completed
23:43:45 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_3(91009): Completed


We can see from the output that four `Model`s were run, each one named after the `Ensemble`, with a numeric suffix at the end: `ensemble_0`, `ensemble_1`, and so on. Each ensemble member generated its own output files, which are stored in `tutorial-ensemble/ensemble/ensemble_0`, `tutorial-ensemble/ensemble/ensemble_1`, and so on.

In [35]:
for ensemble_id in range(4):
    outputfile = f"tutorial-ensemble/ensemble/ensemble_{ensemble_id}/ensemble_{ensemble_id}.out"

    print(f"Content of {outputfile}:")
    with open(outputfile, 'r') as fin:
        print(fin.read())


Content of tutorial-ensemble/ensemble/ensemble_0/ensemble_0.out:
Hello, my name is Ellie and my parameter is 2

Content of tutorial-ensemble/ensemble/ensemble_1/ensemble_1.out:
Hello, my name is Ellie and my parameter is 11

Content of tutorial-ensemble/ensemble/ensemble_2/ensemble_2.out:
Hello, my name is John and my parameter is 2

Content of tutorial-ensemble/ensemble/ensemble_3/ensemble_3.out:
Hello, my name is John and my parameter is 11
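
The four outputs above correspond to the Cartesian product of the two value lists. As a rough illustration of that expansion, and not SmartSim's own code, the following sketch builds the same four combinations with `itertools.product`; pairing each combination with the label `ensemble_<index>` is done here only for readability.

```python
from itertools import product

params = {"tutorial_name": ["Ellie", "John"], "tutorial_parameter": [2, 11]}

names = list(params)
# product(*values) yields every combination, varying the last parameter fastest.
combinations = [dict(zip(names, values)) for values in product(*params.values())]

for index, combo in enumerate(combinations):
    print(f"ensemble_{index}: {combo}")
```

With two values for each of the two parameters, the product contains 2 x 2 = 4 combinations, in the same order as the ensemble members listed above.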



That's it! All possible combinations of the input parameters were used to execute the experiment. Sometimes, the parameter space is too large to be explored exhaustively. In that case, we can use a different permutation strategy, such as `random`. For example, to use only two random combinations from our parameter space, we can run the following code, where we specify `n_models=2` and `perm_strategy="random"`.

In [40]:
params = {"tutorial_name": ["Ellie", "John"], "tutorial_parameter": [2, 11]}
ensemble = exp.create_ensemble("ensemble", params=params, run_settings=rs, perm_strategy="random", n_models=2)
config_file = "./output_my_parameter.py"
ensemble.attach_generator_files(to_configure=config_file)

exp.generate(ensemble, overwrite=True)
exp.start(ensemble)

00:24:15 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_0(91912): Completed
00:24:15 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_1(91913): Completed
00:24:16 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_0(91912): Completed
00:24:16 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_1(91913): Completed
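
One way to picture a random strategy is as drawing `n_models` entries from the full list of combinations. The sketch below does this with the standard library. It assumes sampling without replacement, which may differ from what SmartSim does internally, and the fixed seed is there only to make the example reproducible.

```python
import random
from itertools import product

params = {"tutorial_name": ["Ellie", "John"], "tutorial_parameter": [2, 11]}
n_models = 2

names = list(params)
all_combinations = [dict(zip(names, values)) for values in product(*params.values())]

# A seeded generator keeps the sketch reproducible; the seed value is arbitrary.
rng = random.Random(0)
# sample() draws without replacement, so no combination is picked twice.
selected = rng.sample(all_combinations, k=min(n_models, len(all_combinations)))

for index, combo in enumerate(selected):
    print(f"ensemble_{index}: {combo}")
```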


Another available permutation strategy is `step`. It is also possible to pass a function that generates the combinations of parameters from the dictionary. Please refer to the documentation to learn more about this.
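
As an illustration of what a user-supplied strategy could look like, the sketch below pairs the i-th value of every parameter, which is one natural reading of a stepped strategy. The function name `paired_strategy` and its signature are assumptions made for this example; the documentation describes the signature SmartSim actually expects.

```python
def paired_strategy(param_names, param_values):
    """Pair the i-th value of each parameter (hypothetical signature)."""
    # zip stops at the shortest list, so every combination is complete.
    return [dict(zip(param_names, step)) for step in zip(*param_values)]


params = {"tutorial_name": ["Ellie", "John"], "tutorial_parameter": [2, 11]}
combinations = paired_strategy(list(params), list(params.values()))

for combo in combinations:
    print(combo)
```

Compared with the exhaustive expansion, this yields two combinations instead of four, because values are matched position by position rather than crossed.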


It is also possible to use a different delimiter for the parameter placeholders. For example, if we want to use `@` instead of `;`, we can pass it as the `tag` argument of `generate`. We have to use a different version of the parameterized file, one named `output_my_parameter_new_tag.py`.

In [42]:
exp = Experiment("tutorial-ensemble-new-tag", launcher="local")
rs = RunSettings(exe="python", exe_args="output_my_parameter_new_tag.py")
params = {"tutorial_name": ["Ellie", "John"], "tutorial_parameter": [2, 11]}
ensemble = exp.create_ensemble("ensemble", params=params, run_settings=rs, perm_strategy="all_perm")
config_file = "./output_my_parameter_new_tag.py"
ensemble.attach_generator_files(to_configure=config_file)

exp.generate(ensemble, overwrite=True, tag='@')
exp.start(ensemble)

00:29:22 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_0(92230): Completed
00:29:22 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_1(92232): Completed
00:29:23 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_0(92230): Completed
00:29:23 C02YR4ANLVCJ SmartSim[90058] INFO ensemble_1(92232): Completed
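
The tag-based substitution used throughout this section can be pictured as a regular-expression replacement over the template text. The sketch below is a hedged illustration, not SmartSim's generator: the helper `fill_template` is hypothetical, and the template string is a stand-in, since the actual content of `output_my_parameter.py` is not shown in this notebook.

```python
import re


def fill_template(text, values, tag=";"):
    """Replace every <tag>name<tag> placeholder with values[name].

    Hypothetical helper for illustration; not part of SmartSim.
    """
    escaped = re.escape(tag)
    # A placeholder is a tag, an identifier-like name, and a closing tag.
    pattern = re.compile(f"{escaped}(\\w+){escaped}")

    def replace(match):
        name = match.group(1)
        # Leave unknown placeholders untouched instead of failing.
        return str(values.get(name, match.group(0)))

    return pattern.sub(replace, text)


# Stand-in template; the real file may look different.
template = 'print("Hello, my name is ;tutorial_name; and my parameter is ;tutorial_parameter;")'
values = {"tutorial_name": "Ellie", "tutorial_parameter": 2}

filled = fill_template(template, values)
print(filled)

# The same helper with `@` as the delimiter.
template_at = template.replace(";", "@")
filled_at = fill_template(template_at, values, tag="@")
print(filled_at)
```

Restricting placeholder names to word characters keeps ordinary semicolons in the template, such as statement separators, from being mistaken for tags.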
