# Run the FastScape landscape evolution model

In this notebook, we will see how to import a basic model, inspect it, create a simulation setup and run it.

FastScape is implemented using the ``xarray-simlab`` framework. For more info:

- https://xarray-simlab.readthedocs.io/en/latest/inspect_model.html
- https://xarray-simlab.readthedocs.io/en/latest/run_model.html

Let's import some packages first (you can install them using ``conda``).

In [None]:
import numpy as np
import xarray as xr
import xsimlab as xs
import fastscape

In [None]:
print('xarray-simlab version: ', xs.__version__)
print('fastscape version: ', fastscape.__version__)

## Import the basic model

Note: the ``fastscape`` package is not yet available as a ``conda`` package, but it will soon!

In [None]:
from fastscape.models import basic_model

This model simulates the long-term evolution of topographic surface elevation (hereafter noted $h$) on a 2D regular grid. The local rate of elevation change, $\partial h/\partial t$, is determined by the balance between uplift (uniform in space and time) $U$ and erosion $E$.

$$\frac{\partial h}{\partial t} = U - E$$

Total erosion $E$ is the combined effect of the erosion of (bedrock) river channels, noted $E_r$, and erosion-transport on hillslopes, noted $E_d$:

$$E = E_r + E_d$$

Erosion of river channels is given by the stream power law:

$$E_r = K_r A^m (\nabla h)^n$$

where $A$ is the drainage area and $K_r$, $m$ and $n$ are parameters.
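
As a rough numerical illustration of the stream power law above, the sketch below evaluates $E_r$ for one hypothetical channel node. The function name `stream_power_erosion`, the drainage area and the slope are assumptions made for this example; the parameter values match those used in the setup later in this notebook. It also estimates the slope at which river erosion alone would balance uplift ($E_r = U$), neglecting hillslope diffusion.

```python
def stream_power_erosion(k_r, area, slope, m, n):
    """Evaluate the stream power law E_r = K_r * A**m * S**n."""
    return k_r * area**m * slope**n


# Parameter values taken from the setup used later in this notebook
k_r, m, n = 1e-4, 0.4, 1.0
uplift = 1e-3

# Hypothetical node (assumed values): drainage area and local slope
area = 1e6    # drainage area
slope = 0.01  # local slope (dimensionless)

e_r = stream_power_erosion(k_r, area, slope, m, n)

# Slope for which river erosion alone balances uplift (E_r = U)
steady_slope = (uplift / (k_r * area**m)) ** (1.0 / n)

print(f"E_r = {e_r:.3e}")                    # about 2.5e-4
print(f"steady slope = {steady_slope:.4f}")  # about 0.04
```

With these values, erosion at the node is roughly a quarter of the uplift rate, so the channel would have to steepen to about four times the assumed slope before erosion keeps pace with uplift.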

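Erosion on hillslopes is given by a linear diffusion law (the minus sign makes $E_d$ positive where the surface is convex, e.g., on ridges, so that diffusion smooths the topography):

$$E_d = -K_d \nabla^2 h$$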
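
To make the effect of linear diffusion concrete, here is a minimal NumPy sketch that computes the rate of elevation change $K_d \nabla^2 h$ with second-order central differences. It is only an illustration under stated assumptions: the function name `diffusion_rate` is hypothetical, the left and right borders are treated as periodic and the top and bottom rows as fixed, and this explicit stencil is not claimed to be the scheme that FastScape uses internally.

```python
import numpy as np


def diffusion_rate(h, k_d, dy, dx):
    """Return k_d * laplacian(h) on a regular grid (rows = y, columns = x).

    Columns wrap around (periodic in x); the first and last rows are
    fixed-elevation boundaries, so their rate of change is set to zero.
    """
    # Second derivative along x, with periodic wrap-around
    d2h_dx2 = (np.roll(h, -1, axis=1) - 2.0 * h + np.roll(h, 1, axis=1)) / dx**2

    # Second derivative along y, interior rows only
    d2h_dy2 = np.zeros_like(h)
    d2h_dy2[1:-1, :] = (h[2:, :] - 2.0 * h[1:-1, :] + h[:-2, :]) / dy**2

    rate = k_d * (d2h_dx2 + d2h_dy2)
    rate[0, :] = 0.0
    rate[-1, :] = 0.0
    return rate


# A single bump of height 10 in the middle of a flat 5 x 5 grid
h = np.zeros((5, 5))
h[2, 2] = 10.0

rate = diffusion_rate(h, k_d=1e-1, dy=100.0, dx=100.0)

# The bump is lowered while its four neighbours are raised
print(rate[2, 2], rate[2, 1])
```

The rates sum to zero over the grid in this example: material removed from the bump is redistributed to its neighbours, which is the smoothing behaviour expected from a diffusion law.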


``xarray-simlab`` is a modular framework, where model inputs are automatically retrieved from model components. ``basic_model`` is an ``xsimlab.Model`` object that contains several components. Simply typing ``basic_model`` shows the ordered list of components as well as all model inputs (parameters), grouped by the component to which they belong:

In [None]:
basic_model

To get a better picture of all processes (and inputs and/or variables) in the model, we can visualize it as a graph. Processes are in blue and inputs are in yellow. The order in the graph corresponds to the order in which the processes will be executed during a simulation.

Note: the visualization requires the ``graphviz`` and ``python-graphviz`` packages (both can be installed using ``conda`` and the ``conda-forge`` channel).

In [None]:
basic_model.visualize(show_inputs=True)

More information can be shown for each process in the model, e.g., for the grid component below. We can see all the variables defined in that component (thus not only those that are inputs of ``basic_model``).

In [None]:
basic_model.grid

## Create a model setup

We create a simulation setup using the `create_setup` function.

In [None]:
ny = 101  # number of grid nodes along y
nx = 201  # number of grid nodes along x

in_ds = xs.create_setup(
    model=basic_model,
    clocks={
        'time': np.linspace(0., 1e6, 101),
        'out': np.linspace(0., 1e6, 11)
    },
    master_clock='time',
    input_vars={
        'grid__shape': [ny, nx],
        'grid__length': [1e4, 2e4],
        'boundary__status': ['looped', 'looped', 'fixed_value', 'fixed_value'],
        'uplift__rate': 1e-3,
        'spl': {'k_coef': 1e-4, 'area_exp': 0.4, 'slope_exp': 1.},
        'diffusion__diffusivity': 1e-1
    },
    output_vars={
        'topography__elevation': 'out',
        'drainage__area': 'out',
        'flow__basin': 'out',
        'spl__chi': None
    }
)

Some explanation about the arguments of `create_setup` and the values given above:

- we specify the model we want to use, here `basic_model`,
- we specify values for clock coordinates (i.e., time coordinates),
- among these coordinates, we specify the master clock, i.e., the coordinate that will be used to
  set the time steps,
- we set values for model inputs (may be grouped by process in the model),
- we set the model variables and the clock coordinate for which we want to take snapshots during a simulation (`None` means that only one snapshot will be taken at the end of the simulation).
  
In the setup above, we define a 'time' coordinate and another coordinate, 'out', with much larger but aligned time steps (the values are in years). 'time' sets the simulation time steps, while 'out' is used to take a few evenly spaced snapshots of variables such as topographic elevation, drainage area and catchments. We also save the $\chi$ values at the end of the simulation.
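
The relation between the two clocks can be checked with NumPy alone. This minimal sketch rebuilds the same coordinate arrays as in the setup above and verifies that the steps are uniform and that every 'out' label coincides with a 'time' step; the variable names are only illustrative.

```python
import numpy as np

# Same clock values as in the setup (in years)
time = np.linspace(0., 1e6, 101)
out = np.linspace(0., 1e6, 11)

# Step sizes of each clock
dt_time = np.diff(time)
dt_out = np.diff(out)

# Every snapshot label should also be a simulation time step
aligned = np.isin(out, time).all()

print(dt_time[0], dt_out[0], aligned)
```

Here the simulation advances in steps of 10,000 years and a snapshot is taken every 10 steps, i.e., every 100,000 years.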

The initial conditions consist here of a nearly flat topographic surface with small random perturbations. Boundaries are periodic on the left and right borders and fixed on the top and bottom borders.
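
A sketch of the grid geometry and initial surface described above, assuming that `grid__shape` is ordered as (y, x), that the node spacing equals length / (number of nodes - 1) and that coordinates start at zero. The random amplitude (values in [0, 1)) and the seed are arbitrary choices for this example, not necessarily what `basic_model` uses.

```python
import numpy as np

# Values from the setup; the (y, x) ordering is an assumption of this sketch
shape = np.array([101, 201])
length = np.array([1e4, 2e4])

# Node spacing along y and x (assumed: length / (number of nodes - 1))
spacing = length / (shape - 1)

# Node coordinates, assumed to start at zero
y = np.linspace(0., length[0], shape[0])
x = np.linspace(0., length[1], shape[1])

# Nearly flat surface with small random perturbations (arbitrary amplitude)
rng = np.random.default_rng(seed=0)
elevation = rng.random(tuple(shape))

print(spacing, elevation.shape)
```

Under these assumptions the spacing is 100 in both directions, so a coordinate label such as `x=1000` falls exactly on a grid node.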

`create_setup` returns an `xarray.Dataset` object that contains everything we need to run the simulation.

More info about xarray: http://xarray.pydata.org/en/stable/

In [None]:
in_ds

If present, the metadata (e.g., description, units, math_symbol...) associated with each input variable in the model are added as attributes in the dataset, e.g.,

In [None]:
in_ds.spl__k_coef

## Run the model

We run the model simply by calling `in_ds.xsimlab.run()`, which returns a new Dataset with both the inputs and the outputs. 

In [None]:
out_ds = in_ds.xsimlab.run(model=basic_model)

out_ds

Note, for example, that the `topography__elevation` variable in `out_ds` now has an additional `out` dimension.

In [None]:
out_ds.topography__elevation

## Analyse, plot and save the results (some examples)

Having all the input and output data bundled into an ``xarray.Dataset`` makes it convenient to post-process and visualize the results or to write them to disk (e.g., as a netCDF file).

``xarray`` is a powerful library that is well connected to other libraries of the scientific Python ecosystem.

Plot the elevation values at the end of the simulation (note: the xarray plotting functions are built on top of the [matplotlib](https://matplotlib.org/) library):

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

out_ds.isel(out=-1).topography__elevation.plot(size=5, aspect=2);

Or plot it at a given time (one label of the ``out`` coordinate):

In [None]:
out_ds.sel(out=1e5).topography__elevation.plot(size=5, aspect=2);

It is also easy to extract profiles and plot them:

In [None]:
out_ds.sel(out=1e5, x=1000).topography__elevation.plot(size=5, aspect=2);

Or extract and plot swath profiles:

In [None]:
out_ds.sel(out=1e5).mean(dim='x').topography__elevation.plot(size=5, aspect=2);

With ``xarray``, you can extract the same swath profile for all output variables (and all saved time steps) at once:

In [None]:
out_ds.mean(dim='x')
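
The selection and reduction calls used above can be tried without running a simulation. The sketch below builds a small synthetic dataset (the name `toy_ds` and all its values are made up for this example) with the same dimension names, then applies position-based selection (`isel`), label-based selection (`sel`) and a mean along `x`.

```python
import numpy as np
import xarray as xr

# Toy stand-in for the simulation output: 3 snapshots on a 2 x 3 grid
elev = np.arange(18, dtype=float).reshape(3, 2, 3)
toy_ds = xr.Dataset(
    {"topography__elevation": (("out", "y", "x"), elev)},
    coords={
        "out": [0.0, 1e5, 2e5],
        "y": [0.0, 100.0],
        "x": [0.0, 100.0, 200.0],
    },
)

# Position-based selection: last snapshot
last = toy_ds.isel(out=-1).topography__elevation

# Label-based selection: the label must exist in the coordinate...
at_1e5 = toy_ds.sel(out=1e5).topography__elevation

# ...unless a lookup method such as "nearest" is given
nearest = toy_ds.sel(x=120.0, method="nearest")

# Swath profile: average along x, keeping the 'out' and 'y' dimensions
swath = toy_ds.mean(dim="x").topography__elevation

print(swath.dims, swath.shape)
```

Note that `sel` without `method` raises a `KeyError` when the requested label is not one of the coordinate values, so `method="nearest"` is a convenient option when the exact node coordinates are not known in advance.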

We can also use ``hvplot`` (built on top of ``holoviews`` and ``bokeh``) to create interactive figures.

More info: https://hvplot.pyviz.org/user_guide/Gridded_Data.html

In [None]:
import hvplot.xarray

out_ds.topography__elevation.hvplot.image(x='x', y='y',
                                          cmap=plt.cm.viridis,
                                          groupby='out')

In [None]:
out_ds.flow__basin.hvplot.image(x='x', y='y',
                                cmap=plt.cm.tab20b,
                                width=800)

In [None]:
out_ds.mean(dim='x').topography__elevation.hvplot(groupby='out', ylim=(0, 300))

As a more advanced example, let's compute the mean elevation of the largest drainage basins at the last time step.

In [None]:
# extract last time step dataset
last_step_ds = out_ds.isel(out=-1)

# count the number of grid nodes in each basin
nnodes_per_basin = last_step_ds.groupby('flow__basin').count()

# get ids of large basins (i.e., more than 10 nodes)
basin_ids = (xr.where(nnodes_per_basin.topography__elevation > 10, 1, np.nan)
               .dropna('flow__basin')
               .flow__basin)

# extract mean elevation per basin
mean_elev = last_step_ds.groupby('flow__basin').mean().topography__elevation

# select only large basins
mean_elev_basins = mean_elev.sel(flow__basin=basin_ids)

# show histogram
mean_elev_basins.plot.hist();
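
The grouping logic of the cell above can be illustrated with plain NumPy on a tiny synthetic grid. The arrays, the variable names and the size threshold (more than 2 nodes instead of more than 10) are assumptions chosen so that the result can be checked by hand; this is an equivalent sketch, not what `xarray` does internally.

```python
import numpy as np

# Synthetic basin ids and elevations on a 3 x 4 grid
basin = np.array([[1, 1, 1, 2],
                  [1, 1, 2, 2],
                  [3, 1, 2, 2]])
elev = np.array([[10., 20., 30., 5.],
                 [40., 50., 15., 25.],
                 [7., 60., 35., 45.]])

# Unique basin ids, the group index of each node and the node count per basin
ids, inverse, counts = np.unique(
    basin.ravel(), return_inverse=True, return_counts=True
)

# Sum of elevation per basin, then mean elevation per basin
sums = np.bincount(inverse, weights=elev.ravel())
means = sums / counts

# Keep only the "large" basins (here: more than 2 nodes)
large = counts > 2
large_ids = ids[large]
large_means = means[large]

print(large_ids, large_means)
```

Basin 3 has a single node and is filtered out, leaving basins 1 and 2 with mean elevations of 35 and 25.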

Export the simulation data as netCDF files:

In [None]:
# TODO: fix the border fill-value issue; for now, drop the attribute if present
out_ds.border.attrs.pop("_FillValue", None)

in_ds.to_netcdf('basic_input.nc')

out_ds.to_netcdf('basic_output.nc')