# Using WEAVE tools to run a Pyranda-based Rayleigh-Taylor study.

[Pyranda](https://github.com/LLNL/pyranda) is an MPI-parallel, high-order finite-difference solver for arbitrary hyperbolic PDE systems.  This set of examples works through simulations of the Rayleigh-Taylor problem,
which is an instability at the interface between two fluids under non-impulsive acceleration (see Richtmyer-Meshkov for the impulsively accelerated version).

## Setup your WEAVE environment

### Global WEAVE Environment

Run [setup.sh](../setup.sh) in the [Public](..) directory to create a virtual environment with all necessary dependencies and install the jupyter kernel. 

### Activate the environment

Activate the environment according to the instructions given at the end of the [setup.sh](../setup.sh) script.

### Install Pyranda

Go to [https://github.com/LLNL/pyranda/blob/master/INSTALL.md](https://github.com/LLNL/pyranda/blob/master/INSTALL.md) and follow the instructions there.

### Setup Merlin (optional)

Brian will put the container or lc instructions here

Remember that before you can run the Merlin studies you need to start your flux allocation, e.g.:

```bash
flux start -s 112
```

**Then** run the merlin command within that flux allocation.

TODO BRIAN and JEREMY please check if there are other ways (like putting the flux allocation id in the study yaml)

## Nominal behaviors

The first set of models demonstrates the phenomenon and explores the effect of different fluid densities via the non-dimensional Atwood number, which expresses the light/heavy fluid
density ratio.  There are a variety of regimes that can be probed, but we'll focus on multimode initial interface perturbations with miscible fluids.  In this
setup, the mixing width grows as ~ `alpha*A*g*t^2`, where A = Atwood number, g = acceleration (often gravity), t = time, and alpha is a roughly constant factor.  There are some caveats:
low-wavenumber (long-wavelength) content in the initial condition tends to dominate and grow faster, so the scaling law breaks down somewhat when the initial condition contains a lot of
low-wavenumber content.  The initial study will show some of these effects, with the caveat that a 2D simulation can't quite give the right answer owing to the significant 3D effects in
such problems.
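
As a rough illustration of the quantities involved, the sketch below computes the Atwood number from two densities and evaluates the self-similar estimate `alpha*A*g*t^2`. The function names, the densities, and the value `alpha=0.05` are placeholders chosen for this example, not values taken from the study.

```python
def atwood_number(rho_heavy, rho_light):
    """Non-dimensional density contrast A = (rho_h - rho_l) / (rho_h + rho_l)."""
    if rho_heavy <= 0 or rho_light <= 0:
        raise ValueError("densities must be positive")
    return (rho_heavy - rho_light) / (rho_heavy + rho_light)


def mixing_width(atwood, g, t, alpha=0.05):
    """Self-similar estimate h ~ alpha * A * g * t**2 (alpha is an assumed placeholder)."""
    return alpha * atwood * g * t ** 2


# Hypothetical densities, for illustration only.
A = atwood_number(rho_heavy=3.0, rho_light=1.0)
for t in (1.0, 2.0, 4.0):
    print(f"t={t:4.1f}  h={mixing_width(A, g=9.81, t=t):.4f}")
```

Doubling the time quadruples the estimated width, which is the signature of the quadratic growth law.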

<img src="rho_contour_long.gif" align="center" alt="Rayleigh-Taylor Simulation in Pyranda">

Later in the notebook we will set up a surrogate model to predict the mixing width (see graph below).


<img src="mixing_width.png" align="center" alt="mixing width def">

### Use Maestro To Run Experiments with pyranda

The [rayleigh_taylor.yaml](rayleigh_taylor.yaml) file is a Maestro study specification that runs parameter sweeps of our Pyranda Rayleigh-Taylor model, varying the Atwood number and
the seed fed into a randomized velocity perturbation applied to the fluid interface. This will run 7 variations of the simulation.


In [6]:
# Edit the path to point to your weave env
# the -o option forces the output directory name instead of a timestamped one
# the -s option asks Maestro to check status more frequently (every 10 seconds) than the default 60 s
!../weave_demos_venv/bin/maestro run -y -s 10 -o RT_0 rayleigh_taylor.yaml

[2024-10-14 16:15:08: INFO] INFO Logging Level -- Enabled
[2024-10-14 16:15:08: CRITICAL] CRITICAL Logging Level -- Enabled
[2024-10-14 16:15:08: INFO] Loading specification -- path = rayleigh_taylor.yaml
[2024-10-14 16:15:08: INFO] Directory does not exist. Creating directories to /usr/WS2/cdoutrix/git/weave_docs/docs/tutorials/Public/pyranda_rayleigh_taylor/RT_STUDIES/rayleigh_taylor_pyranda_20241014-161508/logs
[2024-10-14 16:15:08: INFO] Adding step 'run-pyranda' to study 'rayleigh_taylor_pyranda'...
[2024-10-14 16:15:08: INFO] Adding step 'post-process-simulation' to study 'rayleigh_taylor_pyranda'...
[2024-10-14 16:15:08: INFO] post-process-simulation is dependent on run-pyranda. Creating edge (run-pyranda, post-process-simulation)...
[2024-10-14 16:15:08: INFO] Adding step 'post-process-all' to study 'rayleigh_

In [9]:
# Let's take a look at the status
import time
time.sleep(30)
!../weave_demos_venv/bin/maestro status RT_0

                                     Study:                                     
/usr/WS2/cdoutrix/git/weave_docs/docs/tutorials/Public/pyranda_rayleigh_taylor/R
               T_STUDIES/rayleigh_taylor_pyranda_20241014-161702                
┏━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━┓
┃       ┃       ┃       ┃       ┃       ┃       ┃       ┃       ┃       ┃ Numb ┃
┃       ┃       ┃       ┃       ┃       ┃ Elaps ┃       ┃ Submi ┃       ┃ er   ┃
┃ Step  ┃ Job   ┃ Works ┃       ┃ Run   ┃ ed    ┃ Start ┃ t     ┃ End  

The [rayleigh_taylor_overview.yaml](rayleigh_taylor_overview.yaml) file adds a final post-processing step that generates a picture using the [overview_post_proc.py](overview_post_proc.py) script.

In [10]:
# Edit the path to point to your weave env
# the -o option forces the output directory name instead of a timestamped one
# the -s option asks Maestro to check status more frequently (every 10 seconds) than the default 60 s
!../weave_demos_venv/bin/maestro run -y -s 10 -o RT_overview rayleigh_taylor_overview.yaml

[2024-10-14 16:58:55: INFO] INFO Logging Level -- Enabled
[2024-10-14 16:58:55: CRITICAL] CRITICAL Logging Level -- Enabled
[2024-10-14 16:58:55: INFO] Loading specification -- path = rayleigh_taylor_overview.yaml
[2024-10-14 16:58:55: INFO] Directory does not exist. Creating directories to /usr/WS2/cdoutrix/git/weave_docs/docs/tutorials/Public/pyranda_rayleigh_taylor/RT_overview/logs
[2024-10-14 16:58:55: INFO] Adding step 'run-pyranda' to study 'rayleigh_taylor_overview_pyranda'...
[2024-10-14 16:58:55: INFO] Adding step 'post-process-all' to study 'rayleigh_taylor_overview_pyranda'...
[2024-10-14 16:58:55: INFO] post-process-all is dependent on run-pyranda_*. Creating edge (run-pyranda_*, post-process-all)...
[2024-10-14 16:58:55: INFO] 
------------------------------------------
Submission attempts =       1
Submi

In [12]:
# Let's take a look at the status
time.sleep(30)
!../weave_demos_venv/bin/maestro status RT_overview


No status to report -- the Maestro study in this path either unexpectedly crashed or the path does not contain a Maestro study.


### UQ

At this point we are going to run a full-blown ensemble study.

#### Goal

Goal: Understand the uncertainty or variability of the mixing width in fluid mixing situations

Information sources:
* Physics equations in Pyranda simulation
* Real world experimental data

Use Bayesian inference to update our knowledge of fluid mixing width with experimental data

#### What is Bayesian?

<img src="bayesian.png" align="center" alt="what is bayesian?">
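
As a toy illustration of Bayesian updating (this is not the IBIS workflow used later), the sketch below combines a normal prior belief about the mixing width with a few noisy measurements, assuming the measurement noise is normal with a known standard deviation. The function name and all numbers are hypothetical.

```python
def normal_update(prior_mean, prior_std, observations, noise_std):
    """Posterior mean/std for a normal prior and a normal likelihood with known noise."""
    prior_prec = 1.0 / prior_std ** 2              # precision = 1 / variance
    data_prec = len(observations) / noise_std ** 2
    post_prec = prior_prec + data_prec             # precisions add
    post_mean = (prior_prec * prior_mean
                 + sum(observations) / noise_std ** 2) / post_prec
    return post_mean, post_prec ** -0.5


# Hypothetical numbers: a prior belief from simulation, then three "experimental" widths.
mean, std = normal_update(prior_mean=1.0, prior_std=0.5,
                          observations=[1.4, 1.3, 1.5], noise_std=0.2)
print(f"posterior mean={mean:.3f}, std={std:.3f}")
```

The posterior mean moves from the prior toward the data, and the posterior standard deviation is smaller than both the prior and the single-measurement noise, which is the sense in which data "updates our knowledge".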

#### The workflow

In this UQ Study we will:
* Generate a lot of data with simulation (Trata and Maestro/Merlin)
* Fit surrogate model to simulation data for faster inference
* Use IBIS to do Bayesian Inference and understand input/output variance

Because our AWS instances were not very large, generating all the data with Maestro alone could take a long time. With Merlin we were able to take advantage of all of our instances and distribute the work across them. In this notebook we assume a single user and a single system, and we demonstrate how to use both [Maestro](rayleigh_taylor_full_uq.yaml) and [Merlin](rayleigh_taylor_full_uq_merlin.yaml).

Notice that the two yaml files are very similar; Merlin's yaml adds a `merlin` block that describes the resources (workers) and samples.

Because we need a lot of points (and their simulations), rather than entering them manually in the yaml file we use [this parameter generator script](simulations_pgen.py) to fill in the parameters for 100 simulations covering our domain.
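
To give a feel for what "covering our domain" can mean, here is a minimal pure-Python Latin hypercube sampler. It is a sketch only: it is neither the Trata implementation nor the contents of `simulations_pgen.py`, and the function name and the bounds for the two inputs are assumptions made for this example.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Return n_samples points; each dimension is split into n_samples equal
    strata and every stratum is used exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One random point inside each stratum, then shuffle the stratum order
        # so that the dimensions are paired up at random.
        column = [low + (high - low) * (i + rng.random()) / n_samples
                  for i in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Hypothetical bounds for (Atwood number, velocity magnitude).
samples = latin_hypercube(100, bounds=[(0.3, 0.8), (0.5, 1.5)])
print(samples[:3])
```

Compared with plain random sampling, stratifying each dimension guarantees that no region of an input range is left empty, which matters when each point costs a full simulation.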

Note the difference between Merlin and Maestro here: Maestro defines SEED as a full-blown parameter, whereas Merlin uses its sampling capabilities to generate the SEED. As a result, the `pgen` script checks for the presence of `SAMPLE_BOUNDS`, which is available only in Merlin workflows, to determine whether it needs to generate the `SEED` parameters.

These results will be fed to scikit-learn to generate a Gaussian process (GP) surrogate that predicts the mixing width at 60 seconds. IBIS's MCMC sampling will then use the GP surrogate to evaluate the new samples drawn while estimating the posterior distribution.
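
The role of the surrogate inside MCMC can be sketched with a small Metropolis sampler. This is an illustration under stated assumptions, not the IBIS API: a made-up linear function stands in for the trained GP, only one input (the Atwood number) is sampled, and the observed width and noise level are hypothetical.

```python
import math
import random


def surrogate(atwood):
    """Stand-in for the trained GP: predicted mixing width at 60 s (made-up linear fit)."""
    return 2.0 * atwood


def log_posterior(atwood, observed, noise_std):
    """Uniform prior on [0, 1] plus a normal likelihood around the surrogate prediction."""
    if not 0.0 <= atwood <= 1.0:
        return -math.inf
    resid = observed - surrogate(atwood)
    return -0.5 * (resid / noise_std) ** 2


def metropolis(observed, noise_std, n_steps=20000, step=0.05, seed=0):
    """Random-walk Metropolis; every proposal costs one cheap surrogate call."""
    rng = random.Random(seed)
    current = 0.5
    current_lp = log_posterior(current, observed, noise_std)
    chain = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, observed, noise_std)
        # Accept with probability min(1, posterior ratio); 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


chain = metropolis(observed=1.2, noise_std=0.1)
burned = chain[2000:]  # discard the warm-up portion of the chain
print(f"posterior mean Atwood ~ {sum(burned) / len(burned):.3f}")
```

The chain needs tens of thousands of model evaluations, which is why a fast surrogate is used in place of the full simulation at this stage.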

#### Running the study


In [10]:
# Edit the path to point to your weave env
# the -o option forces the output directory name instead of a timestamped one
# the -s option asks Maestro to check status more frequently (every 10 seconds) than the default 60 s
# We overwrite the N_RUNS parameter from the command line using --pargs
!../weave_demos_venv/bin/maestro run -y -s 10 -o RT_UQ -p simulations_pgen.py --pargs "N_RUNS:100" rayleigh_taylor_full_uq.yaml
# For Merlin you would instead run the following (within a flux allocation)
# Start the study similar to Maestro
!../weave_demos_venv/bin/merlin run -p simulations_pgen.py --pargs "N_RUNS:100" rayleigh_taylor_full_uq_merlin.yaml
# Start the workers
!../weave_demos_venv/bin/merlin run-workers rayleigh_taylor_full_uq_merlin.yaml

#### The results

The plot step generates plots of the input points; they show that the sampling generated via Trata covers the space well.


<img src="atwood_vs_vel.png" align="center" alt="hypercube distribution">

It also generates a plot showing the spread of the mixing width

<img src="all_mixing_width.png" align="center" alt="mix width distribution">

as well as how well the GP performs (Renee please replace with new figure)

<img src="GP_at_60.0_s.png" align="center" alt="mix width distribution">

Finally, we used IBIS MCMC inference to test our GP model, as shown below:


<img src="ibis_results.png" align="center" alt="mix width distribution">


#### Iterative study

Using Merlin we can run this UQ study iteratively, generating more data at each iteration until we are satisfied with our surrogate.

A few considerations:
* In the staging step we copy the store from the previous iteration. We could copy the store in $(SPECROOT) only at the first iteration, but it might be advantageous to keep the store for each iteration.
* We use LHS to get samples in the first iteration; in subsequent iterations we use EIS and some new sample points to optimize the next set of points.


In [None]:
# Edit the path to point to your weave env
# Start the study similar to Maestro
!../weave_demos_venv/bin/merlin run -p simulations_pgen.py --pargs "N_RUNS:100" rayleigh_taylor_full_uq_merlin_iterative.yaml
# Start the workers
!../weave_demos_venv/bin/merlin run-workers rayleigh_taylor_full_uq_merlin_iterative.yaml

You should end up with two timestamped directories.

In the first one, the `plot_all` step shows the original 25 points (controlled by `N_RUNS` in the yaml file).

<img src="atwood_vs_vel_iter1.png" align="center" alt="hypercube distribution at first iteration">

In the second directory you will notice that the same plot now shows 10 new points (`N_CAND` in the yaml).

<img src="atwood_vs_vel_iter2.png" align="center" alt="hypercube distribution at second iteration">
