<a href="https://pymt.readthedocs.io"><img style="float: left" src="../media/powered-by-logo-header.png"></a>

## Why *pymt*?

*pymt* provides a standard, easy-to-use interface to a wide range of models. *pymt* solves
several problems often encountered when a user wants to find, run, and/or couple
models to one another. Below I'll go through some of the problems a model user often
encounters when trying to find and try out a new model. This is certainly not an exhaustive
list.

Can you think of other issues a user may encounter in trying to use and/or couple
a model?

Problems:
* [Source code](#Getting-source-code)
* [Compiling the source code](#Compiling)
* [Documentation](#Documentation) (or lack thereof)
* [Running a model](#Running-a-model)
* [Debugging](#Debugging)
* [Model coupling](#Model-coupling)

### Getting source code

Even if a user doesn't *need* source code to run a model (i.e. they've been given
a binary program they can simply run), it's still important for them to be
able to have access to it: they may want to modify it, or have a closer look
under-the-hood to see what's going on.

#### Problem

This is certainly less of a problem nowadays. However, it can still be an issue.
A would-be model user has heard of some mysterious model and would like to try
it out but can't find the source code. Instead, they are left trying to
find the email (phone? fax?) for the *master-of-the-code* and then trying to convince that person
to send them a floppy disk of the source code. It's hard to believe, but this used to
be a thing.

#### Solution

All of the models within *pymt* are open source. For the most part, the source code for
*pymt* models is available in repositories on
[GitHub](https://github.com) but some may be housed on other publicly
accessible websites like [Bitbucket](https://bitbucket.com) and
[SourceForge](https://sourceforge.com). We don't enforce the use of *GitHub* but
we do use it extensively for our code and it seems to be the most widely used
code-hosting platform in our community. In any case, the source code for all of the
models is freely available and not located behind a gatekeeper.

CSDMS maintains a database of model metadata on its website:
* https://csdms.colorado.edu
This is not the source code but descriptions of models that have been contributed
by the community to CSDMS. Here you can query models by, for example, type, process,
or name.

The source code for many of the contributed models is on *GitHub* under the
*csdms-contrib* organization.
* https://github.com/csdms-contrib

### Compiling

You have the FORTRAN source code, it looks great (maybe you've even modified it), but now you need to
be able to run it.

#### Problem

Depending on the model, this step may not be much of a problem. However, oftentimes this
is the *biggest* problem a user encounters when trying to run a new model. At its worst,
this step is extremely painful and is often enough of a hurdle that this is where the
user stops.

After getting the source code, there are still several issues to solve:
* Do I have the necessary compilers installed on my target platform?
* Was the code ever intended to be built on my target platform?
* Do I have all of the necessary dependencies installed? If not, the complexity of
  this step grows with the number of dependencies that must be installed.


#### Solution

All of the models distributed in *pymt* come pre-compiled on a range of platforms. Although we
distribute *pymt* through *Anaconda*, which is primarily a Python package manager, models
written in C, C++, or FORTRAN are also available. We try to build all of the models
for Linux, Mac, and Windows, but not every model is available on all of those platforms yet.
We're trying though! Almost all of them are available on Linux and Mac.

We use [conda-forge](https://github.com/conda-forge) to build and distribute models that
can then be installed using *Anaconda*. One nice side effect of this is that we also provide
a recipe that describes how each piece of software is built so that you can do it yourself, if
you need to.

And, as always, we accept pull requests! If you've built a model on a platform that we
haven't, we would love to hear from you.

### Documentation

Perhaps the number one thing that keeps a user from experimenting with a new model is a
lack of documentation. A user has found some source code but there is no documentation.
Unfortunately, we are seldom paid to write documentation. Instead, we are funded to write
some code to solve a specific problem and that's it. Faced with a new model, if a
user isn't told how to build, run, or modify it, they'll most likely just move on.

#### Problem

A collection of source files without any documentation.

#### Solution

Both [*landlab*](https://landlab.readthedocs.io) and [*pymt*](https://pymt.readthedocs.io)
are well documented. As such, a user of a model from either of these
frameworks is able to tap into either of these documentation bases for information
about how to get and run a model.

### Running a model

Getting, compiling, and installing a model isn't all that useful if the model can't be run.

#### Problem

Every model has its own idiosyncratic way of running. For example, models generally have
model-specific input/output files, or command line arguments (or a GUI?). Or, even worse,
there isn't an input file: instead, input parameters are changed in the source code and
the code is recompiled.


#### Solution

All of the models in *pymt* have a uniform interface (based on the [Basic Model Interface (BMI)](https://bmi.readthedocs.io)). This means that if you know how to use one *pymt* model, you know how
to use all *pymt* models.
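
To make the idea of a uniform interface concrete, here is a minimal sketch built from two toy classes. Neither is a real *pymt* model: `ToyCounter` and `ToyDecay` are hypothetical stand-ins that only borrow the method names *initialize*, *update*, *get_value*, and *finalize*. The point is that the `run` driver (also made up for this sketch) never needs to know which model it was handed.

```python
class ToyCounter:
    """Hypothetical toy model: the state grows by one each step."""

    def initialize(self):
        self.state = 0.0

    def update(self):
        self.state += 1.0

    def get_value(self, name):
        # A real model would look up `name`; this toy has a single variable.
        return self.state

    def finalize(self):
        self.state = None


class ToyDecay:
    """Hypothetical toy model: the state halves each step."""

    def initialize(self):
        self.state = 8.0

    def update(self):
        self.state /= 2.0

    def get_value(self, name):
        return self.state

    def finalize(self):
        self.state = None


def run(model, n_steps, name="state"):
    """Drive any object that follows the shared interface."""
    model.initialize()
    for _ in range(n_steps):
        model.update()
    value = model.get_value(name)
    model.finalize()
    return value


# The same driver works for both models without modification.
counter_result = run(ToyCounter(), 3)
decay_result = run(ToyDecay(), 3)
print(counter_result, decay_result)
```

Real *pymt* models take more arguments (for example, a configuration file passed to *initialize*), but the driving pattern is the same.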

### Debugging

It looks like there may be a problem with a model but you're not sure. Or, you are sure,
but you don't know where exactly the problem lies.

#### Problem

Trying to track down bugs can be a difficult process, particularly in code written in compiled
languages like C or FORTRAN. Debugging oftentimes means inserting a bunch of print statements
into the code, recompiling, examining the output, and repeating.

#### Solution

Because *pymt* models are exposed as Python objects, a user can run a model interactively. A model can
be updated one time step at a time, its state examined (perhaps using Python tools like *numpy*
or plotted using *matplotlib*), or even changed, and then updated for another time step. This
ability to debug a model by playing with it in Python has proven to be a valuable way
not only to get a feel for how a model works, but also to see if it's working properly or as
expected.
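
As a rough illustration of this step-inspect-modify loop, the sketch below uses a hypothetical `ToyCooling` class in place of a real model. Its attributes (`temperature`, `rate`) are invented for the example. With a *pymt* model you would read state through the model's own variables instead.

```python
class ToyCooling:
    """Hypothetical stand-in for a model; not part of pymt."""

    def __init__(self, temperature=100.0, rate=0.1):
        self.temperature = temperature
        self.rate = rate
        self.time = 0

    def update(self):
        # Lose a fixed fraction of the current temperature each step.
        self.temperature -= self.rate * self.temperature
        self.time += 1


toy = ToyCooling()

# Advance one step at a time and record the state after every step.
history = []
for _ in range(3):
    toy.update()
    history.append(toy.temperature)
print(history)

# Something looks off? Change the state by hand and keep stepping.
toy.temperature = 50.0
toy.update()
after_reset = toy.temperature
print(after_reset)
```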

### Model coupling

A user would like to couple two models (or a model to a dataset). 

#### Problem

There are potentially many.

Some examples:
* Models are written in different languages
* Models don't provide a way for another model to access them

Can you think of others? Perhaps some that you've encountered? 

#### Solution

*pymt* brings models from different languages (currently C, C++, FORTRAN, and Python)
into a Python environment. Because of the BMI (*pymt* models all expose a BMI), we can
mostly automate this process. Through *pymt*, users are able to write Python scripts that
run single models or multiple models together.
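
A coupling script can be quite short once both models speak the same interface. The sketch below is only an illustration: `ToyRiver` and `ToyDelta` are hypothetical classes, and the variable names `discharge` and `volume` are made up. The method names follow the BMI-style `get_value`/`set_value` pair, which is an assumption about how you would pass data between two real models.

```python
class ToyRiver:
    """Hypothetical upstream model that produces a discharge."""

    def __init__(self):
        self.discharge = 10.0

    def update(self):
        self.discharge += 5.0

    def get_value(self, name):
        return getattr(self, name)


class ToyDelta:
    """Hypothetical downstream model that accumulates what it receives."""

    def __init__(self):
        self.discharge = 0.0
        self.volume = 0.0

    def set_value(self, name, value):
        setattr(self, name, value)

    def update(self):
        self.volume += self.discharge

    def get_value(self, name):
        return getattr(self, name)


river, delta = ToyRiver(), ToyDelta()
for _ in range(3):
    river.update()
    # One-way coupling: the river's output becomes the delta's input.
    delta.set_value("discharge", river.get_value("discharge"))
    delta.update()

total = delta.get_value("volume")
print(total)
```

Feeding a value from `delta` back into `river` inside the same loop would turn this into a two-way coupling.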

We'll show you how to do this today.

## The *pymt* model library

All of the models that are available through *pymt* are held in a Python module that you can import, `pymt.models`.

To have a look at what models are currently available, we'll import the library
and print the names of all of the models.

For more information you can look at the [pymt documentation](https://pymt.readthedocs.io).

In [None]:
import pymt.models
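
If you want the names as a Python list, one generic approach is to filter the public attributes of the module. The helper below is a sketch, not part of *pymt*: `public_names` is a made-up function, and `fake_models` is a stand-in module so that the example runs without *pymt* installed. It assumes that models are exposed as public attributes, as `pymt.models.Plume` suggests.

```python
import types


def public_names(module):
    """Return the sorted names in *module* that don't start with an underscore."""
    return sorted(name for name in dir(module) if not name.startswith("_"))


# Stand-in for a model library (hypothetical contents).
fake_models = types.ModuleType("fake_models")
fake_models.Plume = object
fake_models.SomeOtherModel = object

names = public_names(fake_models)
print(names)
```

With *pymt* installed you could try `public_names(pymt.models)`. The result may include helper names alongside the models.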

We'll now have a closer look at a model and see how a *pymt* model works. Remember,
*pymt* models all have the same interface, so if you know how to use one, you'll know
how to use all of them.

Let's begin by picking a model from the above list. Pick one that sounds interesting to you.

In [None]:
Model = pymt.models.Plume # <- type the name of the model you would like to use

`Model` is now the class of the model that you've chosen. You could create multiple instances
of `Model`, but until you have an instance you can't do too much with it, so let's
create one.

In [None]:
model = Model()

We can now examine the model a little more. For instance, we can use the `help` function
to get some information about the model. This will give us a brief summary of the model,
the author, a version number, license, references, etc.

In [None]:
help(model)

Scroll down a little in the help message and have a look at the *Parameters* section. These
are input parameters to the model. That is, things that are set at the *beginning* of
the model and cannot be changed thereafter. You can also get a view of them programmatically
using the *parameters* attribute.

In [None]:
for name, value in model.parameters:
    print(f"{name} [default = {value}]")

## The lifecycle of a model

Running a model in *pymt* involves four steps:
* [setup](#setup): prepare input files
  * *setup*
* [initialize](#initialize): read input files
  * *initialize*
  * *input_var_names*
  * *output_var_names*
  * *var*
  * *get_value*
* [update](#update): advance one time step
  * *update*
  * *start_time*
  * *time*
  * *end_time*
* [finalize](#finalize): shutdown
  * *finalize*

Below we'll briefly go through each of these steps.

### Setup

Before a model can be run, its input files must be prepared. If you haven't done this manually
(which you are definitely free to do), you can use the model's *setup* method to help with
this.

By default, *setup* will create a temporary folder with files containing default values. However,
for this example we'll specify a folder so that we can see what's going on. The following will create
a new folder, *_my_model* (you can call it whatever you like), containing the
model-specific input files. *setup* returns a tuple that gives the name of the main configuration
file and the full path name of the folder.

In [None]:
config_file, config_folder = model.setup("_my_model")
print(f"Input files are located here: {config_folder}")
print(f"The main configuration file is: {config_file}")

To double-check that something was actually done, you can use shell commands (hint: `ls` and `cat`)
to see what files were created and what their contents are. If you don't like the shell, you can
always use the Jupyter tree-view.
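
If you would rather stay in Python, the standard library's `pathlib` can do the same job. The `show_folder` helper below is a hypothetical convenience function written for this tutorial. It is demonstrated on a throwaway folder with a made-up file so that it runs on its own.

```python
import pathlib
import tempfile


def show_folder(folder):
    """Print every file under *folder* followed by its text contents."""
    found = []
    for path in sorted(pathlib.Path(folder).rglob("*")):
        if path.is_file():
            found.append(path.name)
            print(f"--- {path} ---")
            # errors="replace" keeps this from failing on non-text files.
            print(path.read_text(errors="replace"))
    return found


# Demonstrate on a temporary folder holding one made-up input file.
with tempfile.TemporaryDirectory() as tmp:
    (pathlib.Path(tmp) / "example.txt").write_text("river_mouth_velocity: 2.0\n")
    files = show_folder(tmp)
```

In this notebook you would call `show_folder(config_folder)` instead.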

The set of files that you just created depends completely on the model you chose. All of these files
are model-specific. However, note that we all used the exact same command to create them, regardless
of our chosen model.

Now let's change an input parameter. This is done by passing keywords to *setup*. The
keywords that you can use are specific to a model and can be found through *help*.

In [None]:
model.setup("_fast_river", river_mouth_velocity=2.0)

You'll now see a new set of input files. If you look closely, you should be able to see your
change.

A couple of notes about the *setup* method.

***Note***: It's not strictly required to run *setup* before *initialize* - it's just a
convenient way to get a set of input files. One pattern that is sometimes used is to use
*setup* to get a base set of input files and then edit some of the files by hand.

***Note***: It's not strictly required that you run *initialize* at all - sometimes
*setup* is the goal. For example, *setup* provides an easy way to programmatically
create a large number of input files for, say, a Monte Carlo simulation.

In [None]:
from itertools import product

velocity_samples = [0.5, 1.0, 1.5, 2.0, 2.5]
width_samples = [100.0, 200.0, 300.0, 400.0, 500.0]

for n, (velocity, width) in enumerate(product(velocity_samples, width_samples)):
    model.setup(f"_sim-{n}", river_mouth_width=width, river_mouth_velocity=velocity)

### Initialize

Now that we have a set of input files, we're ready to prepare the model for time stepping. This is
done through the *initialize* method. The model cannot be queried until
*initialize* has been run. This is important to understand.

To better understand this, consider the common pattern of a model reading the size or resolution
of its solution grid from one of its input files. In such a situation, we cannot ask
about the model's grid until it has read its input files, which is done in *initialize*.
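
The sketch below mimics that pattern with a hypothetical `ToyGridModel` that reads its grid shape from a JSON file. The file name, the `shape` key, and the `grid_shape` property are all invented for illustration. Real models use their own input formats.

```python
import json
import pathlib
import tempfile


class ToyGridModel:
    """Hypothetical model whose grid shape lives in its input file."""

    def __init__(self):
        self._shape = None  # unknown until initialize() reads the file

    def initialize(self, config_file, folder="."):
        config = json.loads((pathlib.Path(folder) / config_file).read_text())
        self._shape = tuple(config["shape"])

    @property
    def grid_shape(self):
        if self._shape is None:
            raise RuntimeError("call initialize() before querying the grid")
        return self._shape


with tempfile.TemporaryDirectory() as folder:
    (pathlib.Path(folder) / "toy.json").write_text(json.dumps({"shape": [4, 5]}))

    toy = ToyGridModel()

    # Before initialize, the model has nothing to report.
    try:
        toy.grid_shape
        queried_early = True
    except RuntimeError:
        queried_early = False

    # After initialize, the grid is known.
    toy.initialize("toy.json", folder)
    shape = toy.grid_shape

print(queried_early, shape)
```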

To run *initialize*, we must pass the name of a configuration file and a folder - both of
which we got from *setup*.

In [None]:
model.initialize(config_file, config_folder)

Now we can ask some questions about the model:
* what variables do you provide as output?
* what variables do you use as input?
* what is the grid like on which these variables sit (if there even is a grid)?

#### Input and output variables

Input and output variables are different from the parameters we talked about above (the ones described
in the *help* message, or the *model.parameters* attribute). Input and output variables are
able to ***dynamically change with time***.

To get a list of the available input and output variables, you can use the *input_var_names* and
*output_var_names* attributes. Depending on the model you chose, you may not have any
input variables. This means your model is configured once at the start but then can't be changed.
It could be part of a 1-way coupling but not a 2-way coupling with feedback. A *dataset* would
also be an example of a model without input variables.

In [None]:
print("Input variables:")
for name in model.input_var_names:
    print(f"- {name}")

print("Output variables:")
for name in model.output_var_names:
    print(f"- {name}")

These are just the names of the variables. We can get additional information as well. This can
be obtained several ways but the preferred method is using the `var` attribute. `var` is
a dictionary of variable names mapped to variable descriptions.

Pick a variable from the above list to find out more about it. We see attributes of the variable
such as its data type and units. This also gives us information about the grid that the
variable is defined on (i.e. the *grid* and *location* attributes). We'll get to grids later on.

In [None]:
variable = model.var["sea_bottom_sediment__deposition_rate"] # <- replace this string with a variable for your model

In [None]:
variable

You can get its values either with the `data` attribute or with the `get_value` method.

In [None]:
variable.data

In [None]:
model.get_value("sea_bottom_sediment__deposition_rate")

### Update

Our model is now initialized and ready to be advanced through time. The *update* method advances
the model's state by a single time step.

In [None]:
model.update()

That's it. There's not too much to it. You can see that it's done something either by seeing if
an output variable has changed or using the *time* attribute to see the current model time,
if there is one.

In [None]:
print(f"Start time: {model.start_time} {model.time_units}")
print(f"Current time: {model.time} {model.time_units}")
print(f"End time: {model.end_time} {model.time_units}")
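
These time attributes are what you would typically use to run a model to completion. The loop below is a sketch against a hypothetical `ToyClock` class that only tracks time. Its attribute names mirror *start_time*, *time*, and *end_time*, while `time_step` is an assumption made for the example.

```python
class ToyClock:
    """Hypothetical model that only keeps track of time."""

    def __init__(self, start_time=0.0, end_time=5.0, time_step=1.0):
        self.start_time = start_time
        self.end_time = end_time
        self.time_step = time_step
        self.time = start_time

    def update(self):
        self.time += self.time_step


toy = ToyClock()

# Keep stepping until the model reports that it has reached its end time.
n_steps = 0
while toy.time < toy.end_time:
    toy.update()
    n_steps += 1

print(n_steps, toy.time)
```

With fractional time steps, floating-point round-off can add or drop a final step, so a fixed step count is sometimes the safer choice.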

### Finalize

There's not much to this method, and often we don't even use it. This is where a model will free memory
or close files. If your model uses lots of memory and you notice you're running out, it may
help to call this method. Calling *finalize* will put your model in a state where it is no
longer usable.

In [None]:
model.finalize()
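
If you want to be sure that *finalize* is always called, even when a run fails part way through, one option is to wrap the model in a context manager. This is a sketch and not something *pymt* provides: `finalizing` and `ToyModel` are hypothetical names introduced here.

```python
from contextlib import contextmanager


@contextmanager
def finalizing(model):
    """Yield *model* and call its finalize() on exit, even after an error."""
    try:
        yield model
    finally:
        model.finalize()


class ToyModel:
    """Hypothetical stand-in with just enough behaviour for the demo."""

    def __init__(self):
        self.finalized = False
        self.steps = 0

    def update(self):
        self.steps += 1

    def finalize(self):
        self.finalized = True


toy = ToyModel()
with finalizing(toy) as running:
    running.update()
    running.update()

print(toy.steps, toy.finalized)
```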