# grama Demo

---

`grama` is a *grammar of model analysis*: a language for describing and analyzing mathematical models. Heavily inspired by [ggplot](https://ggplot2.tidyverse.org/index.html), `py_grama` is a Python package that provides tools for defining and exploring models. This notebook illustrates how to use `grama`.

In [None]:
### Setup
from dfply import *
import grama as gr
import numpy as np
import pandas as pd
import seaborn as sns

# Quick Tour: Analyzing a model

---

`grama` separates the model *definition* from model *analysis*; once the model is fully defined, only minimal information is necessary for further analysis.

As a quick demonstration, we import a fully-defined model provided with `grama`, and carry out a few analyses.

In [None]:
from grama.models import make_cantilever_beam

model_beam = make_cantilever_beam()
model_beam.printpretty()

A `grama` model has **functions** and **inputs**. The method `printpretty()` gives a quick summary of the model's inputs and function outputs. Model inputs are organized into:

|            | Deterministic | Random     |
| ---------- | ------------- | ---------- |
| Variables  | `var_det`     | `var_rand` |
| Parameters | `d_param`     | (Future*)  |

- **Variables** are inputs to the model's functions
  + **Deterministic** variables are chosen by the user; the model above has `w, t`
  + **Random** variables are not controlled; the model above has `H, V, E, Y`
- **Parameters** define random variables
  + **Deterministic** parameters are currently implemented; these are listed under `var_rand` with their associated random variable
  + **Random** parameters* are not yet implemented

The `outputs` section lists the various model outputs. The model above has `c_area, g_stress, g_displacement`.

## Studying model behavior with uncertainty

Since the model has sources of randomness (`var_rand`), we must account for this when studying its behavior. We can do so through a Monte Carlo analysis. We make decisions about the deterministic inputs by specifying `df_det`, and the `grama` function `gr.ev_monte_carlo` automatically handles the random inputs. Below we fix a nominal value `w = 0.5 * (2 + 4)`, sweep over values for `t`, and account for the randomness via Monte Carlo.

In [None]:
df_beam_det = pd.DataFrame(
    data={
        "w": [0.5 * (2 + 4)] * 10,
        "t": np.linspace(2.5, 3, num=10)
    }
)

df_beam_mc = \
    model_beam >> \
    gr.ev_monte_carlo(n=1e2, df_det=df_beam_det)
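The idea behind this kind of evaluation can be sketched without `grama`. The example below is a minimal illustration, assuming that every deterministic setting is paired with every random draw; that pairing is an assumption made for the sketch, not a statement about how `gr.ev_monte_carlo` is implemented. The function `toy_stress` and the distribution chosen for `V` are hypothetical stand-ins, not the cantilever beam model.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(101)

def toy_stress(t, V):
    """Hypothetical stand-in for a model function (not the beam model)."""
    return V / t**2

n = 100                               # number of random draws
t_grid = np.linspace(2.5, 3, num=10)  # deterministic sweep over t
V_draws = rng.normal(loc=1000, scale=100, size=n)  # assumed distribution

# Pair every deterministic setting with every random draw
df_mc = pd.DataFrame({
    "t": np.repeat(t_grid, n),
    "V": np.tile(V_draws, len(t_grid)),
})
df_mc["stress"] = toy_stress(df_mc["t"], df_mc["V"])

# Summarize the output at each deterministic setting
summary = df_mc.groupby("t")["stress"].agg(["mean", "std"])
print(summary)
```

Reusing the same draws at every value of `t` keeps the comparison across `t` free of sampling noise between settings, at the cost of correlated estimates.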

To help plot the data, we use `dfply` to wrangle the data, and `seaborn` to quickly visualize results.

In [None]:
df_beam_wrangled = \
    df_beam_mc >> \
    gather("output", "y", ["c_area", "g_stress", "g_displacement"])

g = sns.FacetGrid(df_beam_wrangled, col="output", sharey=False)
g.map(sns.lineplot, "t", "y")

From this plot, we can see:

- The random variables have no effect on `c_area`
- Comparing `g_stress` and `g_displacement`, the former is more strongly affected by the random inputs, as illustrated by its wider uncertainty band.
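The reshaping and the per-`t` spread behind such a plot can be sketched with plain pandas. This is a hedged illustration on synthetic data: the numbers below are invented for the example and only borrow the column names `c_area` and `g_stress`. Here pandas `melt` performs a wide-to-long reshape comparable to the `gather` call above, and a grouped standard deviation stands in for the width of the band.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic wide-format results: 50 random draws at each of two t values
t = np.repeat([2.5, 3.0], 50)
df_wide = pd.DataFrame({
    "t": t,
    "c_area": 3.0 * t,  # depends only on deterministic inputs
    "g_stress": rng.normal(loc=0.1, scale=0.05, size=t.size),  # varies per draw
})

# Wide-to-long reshape: one row per (t, output) observation
df_long = df_wide.melt(
    id_vars=["t"],
    value_vars=["c_area", "g_stress"],
    var_name="output",
    value_name="y",
)

# Spread of each output at each t; zero spread means no random effect
spread = df_long.groupby(["output", "t"])["y"].std()
print(spread)
```

An output whose spread is zero at every `t`, like the synthetic `c_area` here, is unaffected by the random draws.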

## Probing random variable effects

One way to quantify the effects of random variables is through *Sobol' indices*, which quantify variable importance by the fraction of output variance "explained" by each random variable. Since distribution information is included in the model, we can carry out a *hybrid-point Monte Carlo* and analyze the results with two calls to `grama`.

In [None]:
df_sobol = model_beam >> \
    gr.ev_hybrid(n_samples=1e3, df_det="nom", seed=101) >> \
    gr.tf_sobol()
df_sobol >> \
    select(X.g_stress, X.g_displacement, X.ind) >> \
    mask(str_detect(X.ind, "S_"))

These results suggest that `g_stress` is largely insensitive to `E`, while `g_displacement` is insensitive to `Y`. For `g_displacement`, the input `V` contributes about three times as much variance as `H` or `E`.
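To make the notion of "fraction of variance explained" concrete, here is a minimal sketch of a first-order Sobol' index estimate on a toy function. It uses a standard pick-freeze estimator with two independent sample matrices, which is not necessarily the scheme behind `gr.ev_hybrid` and `gr.tf_sobol`. The names `first_order_sobol` and `toy_model` are hypothetical, and the inputs are assumed to be independent standard normals.

```python
import numpy as np

def first_order_sobol(func, n_inputs, n_samples=100_000, seed=101):
    """Estimate first-order Sobol' indices with a pick-freeze scheme."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n_samples, n_inputs))
    B = rng.standard_normal((n_samples, n_inputs))
    f_A = func(A)
    f_B = func(B)
    var_total = np.var(np.concatenate([f_A, f_B]))

    indices = np.empty(n_inputs)
    for i in range(n_inputs):
        # AB matches B only in column i; every other column comes from A
        AB = A.copy()
        AB[:, i] = B[:, i]
        # Covariance induced by sharing input i, as a fraction of total variance
        indices[i] = np.mean(f_B * (func(AB) - f_A)) / var_total
    return indices

def toy_model(X):
    """Linear toy function: the third input has no effect."""
    return 1.0 * X[:, 0] + 2.0 * X[:, 1] + 0.0 * X[:, 2]

sobol = first_order_sobol(toy_model, n_inputs=3)
print(sobol)
```

For this linear function the exact indices are the squared coefficients divided by their sum, that is 0.2, 0.8, and 0, so the estimates should land close to those values.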

To get a *qualitative* sense of how the random variables affect our model, we can perform a set of sweeps over random variable space with a *sinew* design. `grama` generates the design, then we use `dfply` to wrangle the data for plotting.

In [None]:
df_beam_sweeps = \
    model_beam >> \
    gr.ev_sinews(n_density=50, n_sweeps=10, df_det="nom")

First, we visualize the design in the four-dimensional random variable space of `[H,V,E,Y]`.

In [None]:
sns.pairplot(
    data=df_beam_sweeps,
    vars=model_beam.var_rand,
    hue="sweep_ind"
)

Here we can see the sweeps cross the domain in straight lines at random starting locations. Each of these sweeps gives us a "straight shot" within a single variable. Visualizing the outputs for these sweeps will give us a sense of a single variable's influence, contextualized by the effects of the other random variables.
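The structure of such a design can be sketched in a few lines. This is a simplified illustration on the unit hypercube, assuming uniform random starting points and evenly spaced steps along the swept variable; `grama`'s own construction may differ, for example in how it maps points to each variable's distribution. The function `make_sweeps` is hypothetical, and its column names only mirror those used in this notebook.

```python
import numpy as np
import pandas as pd

def make_sweeps(var_names, n_density=50, n_sweeps=10, seed=101):
    """Build axis-aligned sweeps from random starting points in [0, 1]^d."""
    rng = np.random.default_rng(seed)
    frames = []
    for j, name in enumerate(var_names):
        for k in range(n_sweeps):
            # Random starting location shared by every point in this sweep
            base = rng.uniform(size=len(var_names))
            points = np.tile(base, (n_density, 1))
            # Vary only the swept variable, holding the others fixed
            points[:, j] = np.linspace(0, 1, n_density)
            df = pd.DataFrame(points, columns=var_names)
            df["sweep_var"] = name
            df["sweep_ind"] = k
            frames.append(df)
    return pd.concat(frames, ignore_index=True)

df_design = make_sweeps(["H", "V", "E", "Y"])
print(df_design.head())
```

With four variables, 10 sweeps per variable, and 50 points per sweep, the design has 2000 rows, and within any one sweep only the swept column changes.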

In [None]:
sns.relplot(
    data=df_beam_sweeps >> \
        gather("input", "x", model_beam.var_rand) >> \
        gather("output", "y", model_beam.outputs) >> \
        mask(X.sweep_var == X.input),
    x="x",
    y="y",
    hue="sweep_ind",
    col="input",
    row="output",
    kind="line",
    facet_kws=dict(sharex=False, sharey=False)
)

Based on this plot, we can see:

- The output `c_area` is insensitive to all the random variables
- As the Sobol' analysis above suggested, `g_stress` is insensitive to `E`, and `g_displacement` is insensitive to `Y`
- Visualizing the results shows that inputs `H,E` tend to 'saturate' in their effects on `g_displacement`, while `V` is linear over its domain. This may explain the difference in contributed variance

---

## Defining a Model