# Tutorial 1: Basic usage
This tutorial demonstrates the basic functionality of the *PRISM* package and serves as a minimal example of how to use it.
All functionality discussed here is described in more detail in the [online documentation](https://prism-tool.readthedocs.io/en/latest).
To make sure that everything is working properly, let's first check that *PRISM* can be imported correctly:

In [None]:
import prism
prism.get_info()

If executing the cell above results in no exceptions and shows a nice overview of the current configuration and requirements, then we are good to go.
It might be preferable to remove all *PRISM* working directories from the current directory as well, as this tutorial assumes they do not exist.

## Imports & initialization
In order to use the *PRISM* pipeline, one requires two things: the `Pipeline` class and a `ModelLink` subclass.
The `Pipeline` class is the main user class of the *PRISM* package and provides a user-friendly interface/environment that gives access to all operations within the pipeline.
It can be seen as the "conductor" of the *PRISM* package, as it governs all other objects and orchestrates their communications and method calls.
It is linked to a model by a user-written `ModelLink` subclass object, which allows the `Pipeline` to extract all necessary model information and call the model.
For this tutorial, we will use one of *PRISM*'s basic `ModelLink` subclasses (the `GaussianLink` class).
See [ModelLink subclasses](./2_modellink_subclasses.ipynb) for more information on the `ModelLink` class and how to make a subclass.

So, let's import the `Pipeline` and `GaussianLink` classes:

In [None]:
from prism import Pipeline
from prism.modellink import GaussianLink

Now that we have imported these two classes, we should initialize our `ModelLink` subclass, the `GaussianLink` class in this case.
In addition to user-defined arguments, every `ModelLink` subclass takes two optional arguments, *model_parameters* and *model_data*.
The use of either one will add the provided parameters/data to the default parameters/data defined in the class.
Since the `GaussianLink` class does not define any default data, we must supply it with some data constraints during initialization (using an array, a dict or an external file).

So, let's define some data and initialize the `GaussianLink` class:

In [None]:
model_data = {3: [3.0, 0.1],    # f(3) = 3.0 +- 0.1
              5: [5.0, 0.1],    # f(5) = 5.0 +- 0.1
              7: [3.0, 0.1]}    # f(7) = 3.0 +- 0.1
modellink_obj = GaussianLink(model_data=model_data)
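
To make the layout of the data dict explicit, here is a minimal sketch in plain Python that does not use *PRISM* at all. It assumes only what the comments above state: each key is a data point identifier and each value is a `[value, error]` pair. The helper name `summarize_data` is hypothetical.

```python
# Same data as above: {identifier: [data value, data error]}
model_data = {3: [3.0, 0.1],
              5: [5.0, 0.1],
              7: [3.0, 0.1]}


def summarize_data(data):
    """Return one human-readable line per data point (hypothetical helper)."""
    lines = []
    for idx, (value, error) in sorted(data.items()):
        lines.append("f(%s) = %s +- %s" % (idx, value, error))
    return lines


for line in summarize_data(model_data):
    print(line)
```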

We can check the representation of the created `GaussianLink` object to see what exactly was used to initialize it:

In [None]:
print("ModelLink object representation:")
print('-'*32)
print(repr(modellink_obj))

Here, we can see that the created `ModelLink` object used the `GaussianLink` class and that it was initialized using the same data values as we provided, as expected.
It also shows the default parameter values that were used.

The `Pipeline` class takes a mandatory `ModelLink` object and several optional arguments, which are mostly paths and the type of `Emulator` class that must be used.
As we already have our `ModelLink` object and we want to use the standard paths and `Emulator`, we can initialize the `Pipeline`:

In [None]:
pipe = Pipeline(modellink_obj)
print("Pipeline object representation:")
print('-'*31)
print(repr(pipe))

Here, we can see that the `Pipeline` class was initialized using our `ModelLink` object.
Also, as we did not provide a working directory and none existed yet, *PRISM* has created one to store the emulator in (`./prism_0`).
If a working directory had already existed (due to previous runs), *PRISM* would have automatically attempted to load the most recently created one whose name starts with the given *prefix* (optional argument).
If no errors are raised during the initialization of the `Pipeline` class, *PRISM* is ready to start emulating.

## Basic user-methods
A `Pipeline` object has 6 user-methods:
- `analyze()`: Analyzes the current emulator iteration;
- `construct()`: Constructs the specified emulator iteration;
- `details()`: Prints a detailed overview of specified iteration;
- `evaluate(sam_set)`: Evaluates *sam_set* in specified iteration;
- `project()`: Creates projections for specified iteration;
- `run()`: Runs a full cycle of the pipeline for specified iteration (can also be done using `pipe()`).

All user-methods besides `evaluate()` solely take optional input arguments.
In case an optional argument is not provided, the `Pipeline` assumes the most logical value.
The most important argument, the emulator iteration (*emul_i*), is assumed to be the last (`analyze()`, `evaluate()`, `project()`), next (`construct()`, `run()`) or current (`details()`) iteration.
'last' here refers to the latest iteration that has been fully constructed, while 'current' refers to the latest iteration (regardless of status).
Note that `analyze()` does not take *emul_i* as a valid input.
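
The default choice of *emul_i* described above can be summarized in a small stand-alone sketch. This is not *PRISM* code: the function `default_emul_i` and its arguments are hypothetical, and "next" is assumed to mean the iteration directly after the last fully constructed one.

```python
def default_emul_i(method, last, current):
    """Return the iteration a method would use if emul_i is not given.

    last    : latest fully constructed iteration
    current : latest iteration, regardless of its status
    """
    if method in ("analyze", "evaluate", "project"):
        return last
    if method in ("construct", "run"):
        # Assumption: 'next' is the iteration following the last complete one
        return last + 1
    if method == "details":
        return current
    raise ValueError("Unknown user-method: %r" % (method,))


# Iteration 1 is finished, iteration 2 was interrupted during construction
for name in ("analyze", "construct", "details"):
    print(name, default_emul_i(name, last=1, current=2))
```

In this scenario, `construct()` and `details()` both point at the unfinished second iteration, while `analyze()` still works on the first.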

Below, we discuss the basic functionalities of these 6 user-methods.

### construct() & details()
The most important user-method in the `Pipeline` is the `construct()`-method.
The `construct()`-method allows for the specified emulator iteration to be constructed, or the next iteration if we do not specify it.
When constructing an emulator iteration, the `Pipeline` class will call the model wrapped in the provided `ModelLink` subclass (which is a Gaussian model in this case) with a pre-determined number of evaluations.
This number is either the initial number (if first iteration) or the number of plausible samples in the previous iteration (if not first iteration).
As we currently have not constructed anything yet and did not modify any of the default values, the initial number should be the default number:

In [None]:
pipe.n_sam_init

After evaluating this number of samples in the model, the `Pipeline` will instruct the `Emulator` (which was automatically initialized by the `Pipeline` class) to construct the first iteration of the emulator.
This involves many different calculations and operations, but in summary, it determines all active parameters, obtains the best-fitting polynomial terms, calculates the Gaussian variances and marks the current iteration as completed.

Now that we know what this method does, let's construct the next iteration of the emulator, which should be the first one (skip this step if you have already done so):

In [None]:
pipe.construct()

If no emulator existed yet, then *PRISM* will have constructed the first iteration of the emulator.
We can check that this is the case by attempting to construct the first iteration again:

In [None]:
pipe.construct(1)

Here, we asked for the first iteration to be constructed, but as it is already finished, *PRISM* will simply state this and immediately return.

After the construction is completed, the `Pipeline` automatically calls the `details()`-method to provide an overview of the current status of the iteration.
This is done after all user-methods besides `evaluate()` and after initializing the `Pipeline` object (if a working directory already existed when it was initialized above, then we would see it as well).

This overview provides us with many details about the specified iteration, including what emulator is currently loaded; its construction, analysis and projection statuses; and the parameter space.
If the specified iteration has not finished constructing yet (because it was interrupted for example), the overview instead lists what components are still missing before construction completion.
Finishing an interrupted construction process is done in exactly the same way as constructing an entirely new iteration: by calling the `construct()`-method.

### analyze()
By default, the emulator iteration is analyzed immediately after it has been constructed, which we could have disabled by passing *analyze=False* to `construct()` above.
Before we explain what analyzing does, let's make sure that the last iteration is analyzed by (re)analyzing it:

In [None]:
pipe.analyze()

When calling `analyze()`, the `Pipeline` will generate a large Latin-Hypercube design of proposed samples and evaluate them in the emulator at the last iteration.
Using the implausibility parameters that have been provided to the `Pipeline`, it will determine which samples are considered "plausible" and should be used for constructing the next emulator iteration.
If the number of plausible samples could be problematic, the `Pipeline` will either block the construction of the next iteration (if it is definitely too low) or raise a warning about it (if it is potentially too low), with the latter probably being shown for our case.
An emulator iteration must have been analyzed first before the next iteration can be constructed.
If we were to attempt to construct the next iteration without analyzing the last one first, the `Pipeline` would simply call `analyze()` before starting the construction.
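
For readers unfamiliar with Latin-Hypercube designs, the following is a minimal sketch of the idea using only the standard library. It is not the routine that *PRISM* uses, and the function name `latin_hypercube` is hypothetical. It draws samples in the unit hypercube such that, for every parameter, each of the `n_sam` equal-width bins contains exactly one sample.

```python
import random


def latin_hypercube(n_sam, n_par, rng=None):
    """Return n_sam samples in [0, 1)^n_par forming a Latin-Hypercube design."""
    rng = rng or random.Random()
    columns = []
    for _ in range(n_par):
        # Draw one value uniformly inside each of the n_sam equal-width bins
        column = [(i + rng.random()) / n_sam for i in range(n_sam)]
        # Shuffle so that the bins are paired randomly across parameters
        rng.shuffle(column)
        columns.append(column)
    # Transpose the per-parameter columns into one row per sample
    return [list(row) for row in zip(*columns)]


for sample in latin_hypercube(5, 3, random.Random(42)):
    print(sample)
```

Rescaling each column from `[0, 1)` to the range of the corresponding model parameter would give samples in parameter space.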

Analyzing an emulator iteration also has another effect: it sets the implausibility parameters for that iteration, which consist of the implausibility cut-offs and wildcards.
The implausibility cut-offs determine what the maximum implausibility values are that an evaluated sample is allowed to have, to be considered 'plausible'.
For example, if the implausibility cut-offs are $[4.0, 3.5, 3.2]$ and a sample evaluated in the emulator has implausibility values $[3.8, 3.1, 3.7]$, then this sample would be marked as 'implausible', as the second-highest implausibility value is higher than the second-highest implausibility cut-off ($3.7 > 3.5$).
However, if we had added an implausibility wildcard to the analysis, which ignores the highest implausibility value, then this sample would be marked as 'plausible' (as $3.7 \leq 4.0$ and $3.1 \leq 3.5$).
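
The rule in this example can be written down as a short stand-alone sketch. It is an illustration of the comparison described above, not *PRISM*'s internal implementation, and the function name `is_plausible` is hypothetical.

```python
def is_plausible(impl_values, impl_cut, n_wildcards=0):
    """Check a sample's implausibility values against the cut-offs."""
    # Sort from highest to lowest and ignore the n_wildcards highest values
    remaining = sorted(impl_values, reverse=True)[n_wildcards:]
    # The i-th highest remaining value may not exceed the i-th cut-off
    return all(value <= cut for value, cut in zip(remaining, impl_cut))


print(is_plausible([3.8, 3.1, 3.7], [4.0, 3.5, 3.2]))                 # False
print(is_plausible([3.8, 3.1, 3.7], [4.0, 3.5, 3.2], n_wildcards=1))  # True
```

Without a wildcard, the check fails at the second comparison ($3.7 > 3.5$). With one wildcard, the value $3.8$ is dropped and the remaining two values pass.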

Before the analysis, the implausibility parameters can be changed freely (which affects the `evaluate()` and `project()`-methods).
After the analysis, however, they can only be changed by reanalyzing the iteration itself with different parameters.
This ensures that the parameters used throughout an iteration are consistent.

We can check what the current implausibility parameters are:

In [None]:
print("Implausibility cut-offs: %s" % (pipe.impl_cut[1]))
print("# of cut-off wildcards: %i" % (pipe.cut_idx[1]))

Now, let's change the implausibility parameters to `[4, 3.5, 3.2]` with a single wildcard, by providing this information to the *impl_cut* argument of `analyze()`:

In [None]:
pipe.analyze(impl_cut=[0, 4, 3.5, 3.2])
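
How the list `[0, 4, 3.5, 3.2]` maps onto cut-offs and wildcards can be sketched as follows. This assumes, as the example above suggests, that every leading `0` stands for one wildcard and that the remaining values are the actual cut-offs. The helper `split_impl_cut` is hypothetical and not part of *PRISM*.

```python
def split_impl_cut(impl_cut):
    """Split an impl_cut list into (cut-offs, number of wildcards)."""
    n_wildcards = 0
    for value in impl_cut:
        # Assumption: only zeros at the start of the list count as wildcards
        if value != 0:
            break
        n_wildcards += 1
    cut_offs = [float(value) for value in impl_cut[n_wildcards:]]
    return cut_offs, n_wildcards


print(split_impl_cut([0, 4, 3.5, 3.2]))  # ([4.0, 3.5, 3.2], 1)
```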

We can check if the implausibility parameters have been updated:

In [None]:
print("Implausibility cut-offs: %s" % (pipe.impl_cut[1]))
print("# of cut-off wildcards: %i" % (pipe.cut_idx[1]))

This permanently changes the implausibility parameters for the first emulator iteration, and the new values are automatically used by the other methods that require them.
If this iteration had not been analyzed yet, we could have set the implausibility parameters in the same way, or by assigning them directly to the `impl_cut` property. As the iteration has been analyzed by now, the assignment below raises an error:

In [None]:
pipe.impl_cut = [0, 4, 3.5, 3.2]

### evaluate(sam_set)
Although the emulator is mostly evaluated internally, we can also evaluate specified sample sets in the emulator using the `evaluate()`-method.
Requested sample sets can be provided in either an array format or a dict.
For example, let's evaluate a sample with 1s for all parameters:

In [None]:
pipe.evaluate([1, 1, 1])                    # Using an array_like
print('\n')
pipe.evaluate({'A1': 1, 'B1': 1, 'C1': 1})  # Using a dict

We can see here that both calls give the same result.
When an array is used as the input, the parameter values are assumed to be ordered alphabetically by parameter name.
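
The relation between the two input formats can be illustrated with a small sketch in plain Python. The helper `dict_to_sample` is hypothetical, and it uses Python's default string ordering, which may differ in detail from the ordering *PRISM* applies internally.

```python
def dict_to_sample(sam_dict):
    """Convert {parameter name: value} into a list ordered by parameter name."""
    return [sam_dict[name] for name in sorted(sam_dict)]


# The order of the keys in the dict does not matter
print(dict_to_sample({'C1': 3, 'A1': 1, 'B1': 2}))  # [1, 2, 3]
```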

If we provide a 2D sample set to `evaluate()`, it will no longer print the result, but instead return it to us in a dict:

In [None]:
print(pipe.evaluate([[1, 1, 1]]))

This allows us to potentially pipe the results into a different processing pipeline if we wish.

### project()
In order to see how the emulator is doing, we need a method that allows us to visualize its performance.
This is done by creating so-called *projection figures* of all active parameters in the model.
Each projection figure consists of two subplots: the minimum implausibility and the line-of-sight depth.
These two subplots show us where the best and the most plausible samples can be found in parameter space, respectively.
As projection figures have many different properties and options, please see the [online documentation](https://prism-tool.readthedocs.io/en/latest/user/using_prism.html#projections) for a detailed description.

By default, *PRISM* will create projection figures at decent resolutions, but it can easily take a few minutes per figure to make them, as many calculations are required.
So, before creating the projection figures, we will set the resolution a bit lower (from 25x25x250 to 15x15x75):

In [None]:
pipe.proj_res = 15
pipe.proj_depth = 75

By default, *PRISM* will create 2D and 3D projection figures of all parameter combinations in the model for the last iteration, which can be made using:

In [None]:
pipe.project()

To make sure that *PRISM* can run on most systems, it never shows the plots it creates.
Instead, we can find the created plots in the corresponding working directory (which is listed in the `details()` overview shown above).

The `project()`-method takes many different keyword arguments, whose descriptions and effects can be found in the online documentation mentioned earlier.

### run()
The last user-method in the `Pipeline` class is the `run()`-method.
This method is a convenience method that allows us to run a full cycle of the `Pipeline`.
It can also be accessed by calling the `Pipeline` object directly (e.g., `pipe()` instead of `pipe.run()`).
Running a full cycle involves constructing an emulator iteration, analyzing it and making all projection figures for it.

So, let's say that we want to run a full cycle of the `Pipeline` for the second emulator iteration.
Then, instead of calling all methods separately like before, we can simply do:

In [None]:
pipe.run(2) # Or pipe(2)

Executing this is equivalent to:

In [None]:
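pipe.construct(2)   # Constructs iteration 2 and, by default, analyzes it
pipe.project(2)     # Creates the projection figures for iteration 2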

Due to the way the `construct()`-method is programmed, if the emulator iteration is already fully constructed, it will only call the `analyze()`-method if it has not been analyzed before and *analyze=True* is given.
This is probably more noticeable when using `run()` than when using `construct()`.
The reason for this is to make sure that using `run()` (or `construct()`) in an external pipeline cannot reanalyze an emulator iteration without being explicitly told to do so.
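
The behavior described above can be mimicked with a toy class that only keeps track of which steps have been carried out. It is a hypothetical stand-in, not the real `Pipeline`: the class `ToyPipeline`, its attributes and its simplified `project()` (which does not model whether existing figures are skipped) are all assumptions made for illustration.

```python
class ToyPipeline:
    """Hypothetical stand-in that records the order of executed steps."""

    def __init__(self):
        self.constructed = set()   # Fully constructed iterations
        self.analyzed = set()      # Analyzed iterations
        self.log = []              # Executed steps, in order

    def construct(self, emul_i, analyze=True):
        if emul_i not in self.constructed:
            self.constructed.add(emul_i)
            self.log.append(("construct", emul_i))
        # Analyze only if requested and not analyzed before
        if analyze and emul_i not in self.analyzed:
            self.analyzed.add(emul_i)
            self.log.append(("analyze", emul_i))

    def project(self, emul_i):
        self.log.append(("project", emul_i))

    def run(self, emul_i):
        # A full cycle: construct (and analyze), then project
        self.construct(emul_i)
        self.project(emul_i)


toy = ToyPipeline()
toy.run(1)
toy.construct(1)   # Already constructed and analyzed: nothing is added
print(toy.log)
```

The second `construct(1)` call leaves the log unchanged, which mirrors the guarantee that an iteration is not reanalyzed unless this is requested explicitly.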