# 2024 SCOPED Workshop — Wavefield Simulations Using SPECFEM

## Notebook 5: Introduction to SeisFlows — Exercise Solutions

- In this notebook we will introduce two open-source Python packages for facilitating/automating seismic imaging  
- **Objective**: To introduce and tour around SeisFlows and Pyatoa, and see how they can be used to simplify working with SPECFEM     
- These instructions should be run from inside a Docker container, using Jupyter Lab (see instructions [here](https://github.com/adjtomo/adjdocs/blob/main/readmes/docker_image_install.md)).  
-----------

**Relevant Links:** 
- This Notebook: https://github.com/adjtomo/adjdocs/blob/main/workshops/2024-5-21_scoped_uw/5_intro_seisflows.ipynb

**adjTomo Software Suite:** 
- adjTomo: https://github.com/adjtomo
- SeisFlows GitHub Page: https://github.com/adjtomo/seisflows
- SeisFlows Documentation: https://seisflows.readthedocs.io/en/latest/


**Jupyter Quick Tips:**

- **Run cells** one-by-one by hitting the $\blacktriangleright$ button at the top, or by hitting `Shift + Enter`
- **Run all cells** by hitting the $\blacktriangleright\blacktriangleright$ button at the top, or by running `Run -> Run All Cells`
- **Currently running cells** that are still processing will have a `[*]` symbol next to them
- **Finished cells** will have a `[1]` symbol next to them. The number inside the brackets represents what order this cell has been run in.
- Commands that start with `!` are Bash commands (i.e., commands you would run from the terminal)
- Commands that start with `%` are Jupyter Magic commands.
----------

In [None]:
# Required Python packages for today's notebook
import os
import shutil
import numpy as np
from glob import glob
from pyasdf import ASDFDataSet
from pyatoa import Inspector
from seisflows.tools import unix
from IPython.display import Image

-----------
## 2) Exercise: Run an Inversion w/ SeisFlows

- Okay, now that we have solved the forward problem, we can tackle the inverse problem
- We will take our current working directory and make adjustments to the required modules to run an inversion
- First we'll clean up our working directory prior to getting started

In [None]:
# Move to the SeisFlows working directory
%cd /home/scoped/work/intro_seisflows
! ls 

------------
To run our inversion, we need a few components that we did not have in the forward problem.
The tasks below will guide you through setting up your inversion. Much of the code you will need is available
in previous notebooks. 

### Task 1) Create 'Data'

#### Background
- To run an inversion, we need some kind of 'data' to compare to our synthetics. The data-synthetic differences (i.e., **misfit**) will guide the inversion.
- Often tomographers will run **synthetic inversions**, where the data consist of synthetic waveforms generated using a **target model**.
- In this example, we will use the waveforms we just generated in our forward simulations as our **target synthetics**.

> NOTE: SeisFlows has a `unix` module that allows you to run Unix commands through Python. For example, `unix.cp` mimics the `cp` command.

#### Exercise Tasks
1) Identify `path_data` in the 'parameters.yaml' file; this is where SeisFlows expects waveform data  
   - You can open the file with the file manager, or use `seisflows par`
2) Create the required directory structure in `path_data`, which follows the format `{path_data}/{event_id}/`   
   - Each source requires its own sub-directory
   - Follow the source naming convention we covered earlier
   - Check parameter `ntask` to determine how many sources will be used
3) Move or copy the synthetics generated by the forward problem we just ran into the directories you created in (2)  
   - Remember that synthetics are stored in: `scratch/solver/{event_id}/traces/syn/*`  
   - You can do this manually, with Bash commands, or with Python
4) Confirm that you have `ntask` sub-directories in `path_data`, each containing synthetic waveform data

In [None]:
# 1. Figure out where data are to be stored
! seisflows par path_data

In [None]:
# 2. Create the correct directory structure
! seisflows par ntask  # how many sources will we be using
! echo
! ls scratch/solver

In [None]:
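# 2. Generate source directories, one per source (`ntask` is 10 here)
for i in range(1, 11):
    # makedirs also creates `waveforms/` if missing and is safe to re-run
    os.makedirs(f"waveforms/{i:0>3}", exist_ok=True)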
! ls waveforms

In [None]:
# 3. Copy the "data" into these directories
for i in range(1, 11):
    for src in glob(f"scratch/solver/{i:0>3}/traces/syn/*"):
        dst = f"waveforms/{i:0>3}"
        unix.cp(src, dst)

In [None]:
! ls waveforms/001
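
Step 4 asks you to confirm that every source directory contains waveform data, but the cell above only lists `waveforms/001`. The following is a minimal sketch of a check over all sources, assuming event directories are named `001`, `002`, ... as in the cells above. The function name `count_data_files` is hypothetical, and the demonstration runs on a throwaway directory with a made-up trace file name.

```python
import os
import tempfile
from glob import glob


def count_data_files(path_data, ntask):
    """Return {event_id: number of files} for event directories 001..ntask."""
    counts = {}
    for i in range(1, ntask + 1):
        event_id = f"{i:0>3}"
        counts[event_id] = len(glob(os.path.join(path_data, event_id, "*")))
    return counts


# Demonstration on a temporary directory with two hypothetical events
with tempfile.TemporaryDirectory() as tmp:
    for event_id in ("001", "002"):
        os.makedirs(os.path.join(tmp, event_id))
    # Only event 001 receives a (made-up) trace file
    open(os.path.join(tmp, "001", "AA.S000000.BXY.semd"), "w").close()
    demo_counts = count_data_files(tmp, ntask=2)

# Any event with zero files still needs its 'data' copied over
empty = [event_id for event_id, n in demo_counts.items() if n == 0]
print(demo_counts, "-> events missing data:", empty)
```

In this notebook you could call `count_data_files("waveforms", 10)` from the working directory. Any event reported with zero files means the copy in step 3 did not reach that directory.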

---------------
### Task 2) Generate a new 'Starting Model'

- Because our 'data' was generated using the checkerboard model shown above, we need a new 'starting model'
- If we do not change our starting model, the synthetics we generate will be the same as our **target synthetics**, resulting in 0 misfit
- Let's modify the model located in `specfem2d_workdir`. There are two approaches, with Option 1 being easier than Option 2.  
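
To see why an unchanged starting model gives the inversion nothing to work with, here is a small sketch using an L2 waveform misfit, $\chi = \frac{1}{2}\int [s(t) - d(t)]^2\,dt$. The traces are hypothetical sine waves, and the function `waveform_misfit` is written for illustration only. The misfit function SeisFlows actually applies depends on your `preprocess` settings.

```python
import numpy as np


def waveform_misfit(obs, syn, dt):
    """L2 waveform misfit: 0.5 * integral of (syn - obs)^2 over time."""
    residual = np.asarray(syn) - np.asarray(obs)
    return 0.5 * np.sum(residual ** 2) * dt


# Hypothetical traces: a 1 Hz sine sampled every 0.01 s for 2 s
dt = 0.01
t = np.arange(0, 2, dt)
target = np.sin(2 * np.pi * t)

# Same model for 'data' and synthetics: the misfit is exactly zero
print(waveform_misfit(target, target, dt))

# A small time shift, as a different starting model might produce
shifted = np.sin(2 * np.pi * (t - 0.05))
print(waveform_misfit(target, shifted, dt))
```

The first value is zero because the residual vanishes at every sample, so there would be no gradient to follow. The second is positive, which is what a modified starting model provides.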

#### Exercise Tasks

**Option 1 (Homogeneous Halfspace):**
1) Change the value of parameter `Model` in `specfem2d_workdir/DATA/Par_file` from `gll` -> `Default`
    - You can do this manually or use `seisflows sempar`
    - This will tell the internal mesher to use the parameter file definition of the model, which is a homogeneous halfspace
2) Rerun `xmeshfem2D` and `xspecfem2D` to generate the required Model files. You can find the syntax for running these commands in previous notebooks.
3) Reset `Model` parameter to `gll` for the inversion
   - We do this because the actual inversion uses this option to be able to update model parameters

**Option 2 (Checkerboard Perturbation):**
>Warning: This requires some Python skill
1) Change the value of parameter `Model` in `specfem2d_workdir/DATA/Par_file` from `gll` -> `legacy`
    - This will tell the internal mesher to read model values from the file `model_velocity.dat_input`
2) Find the file that defines the legacy model values in `specfem2d_workdir/DATA` 
3) Modify this file in order to perturb the checkerboard model
    - The easiest thing to do is increase or decrease P and S-wave velocity structure by some percentage of their original value (5%?)
    - The column structure of this file is: `index, x-coordinate [m], y-coordinate [m], density, Vp [m/s], Vs [m/s]`
    - Probably best to use Python to read, write and modify the file (e.g., with NumPy `loadtxt` and `savetxt`)
4) Rerun `xmeshfem2D` and `xspecfem2D` to generate the required Model files. You can find these commands in previous notebooks.
5) Reset `Model` parameter to `gll` for the inversion
   - We do this because the actual inversion uses this option to be able to update model parameters


In [None]:
# Option 1: 
# Change parameter type
%cd /home/scoped/work/intro_seisflows/specfem2d_workdir
! seisflows sempar -P DATA/Par_file model default

In [None]:
# Run mesher and simulation code
! mpirun -n 1 bin/xmeshfem2D > OUTPUT_FILES/output_meshfem.txt
! mpirun -n 1 bin/xspecfem2D > OUTPUT_FILES/output_solver.txt

In [None]:
# Move model to the correct location
! mv DATA/proc000000_*.bin OUTPUT_FILES

In [None]:
# Reset model parameter for inversion
%cd /home/scoped/work/intro_seisflows
! seisflows sempar -P specfem2d_workdir/DATA/Par_file model gll

-----------

In [None]:
# Option 2: Change parameter type
%cd /home/scoped/work/intro_seisflows/specfem2d_workdir
! seisflows sempar -P DATA/Par_file model legacy

In [None]:
# Identify file
! ls DATA/*.dat_input

In [None]:
# Look at the input model
! echo "    index        x[m]               y[m]        density         Vp [m/s]        Vs [m/s]"
! echo
! head -5 DATA/proc000000_model_velocity.dat_input

In [None]:
# Modify the input checkerboard model
# Make sure we keep the original file in case we make a mistake
! cp DATA/proc000000_model_velocity.dat_input DATA/proc000000_model_velocity.dat_input_original  
data = np.loadtxt("DATA/proc000000_model_velocity.dat_input")
print(data)

# Modify velocity structure
data[:, 4] *= 1.05  # Increase Vp by 5%
data[:, 5] *= 1.05  # Increase Vs by 5%

In [None]:
# Overwrite the existing file
np.savetxt("DATA/proc000000_model_velocity.dat_input", data, fmt="%10.4f", delimiter="\t")
! head DATA/proc000000_model_velocity.dat_input
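
Before rerunning the mesher, it can be worth confirming that only the velocity columns changed. The sketch below compares two model arrays column by column. The function `percent_change` and the two-point model are hypothetical, and the division assumes the column contains no zeros.

```python
import numpy as np


def percent_change(original, modified, column):
    """Mean percent change of one column between two model arrays."""
    ratio = modified[:, column] / original[:, column]
    return 100.0 * (ratio.mean() - 1.0)


# Hypothetical two-point model: index, x, y, density, Vp, Vs
original = np.array([
    [1.0, 0.0, 0.0, 2600.0, 5800.0, 3500.0],
    [2.0, 1000.0, 0.0, 2600.0, 6000.0, 3600.0],
])
modified = original.copy()
modified[:, 4:6] *= 1.05  # same 5% increase applied to Vp and Vs

for name, column in (("density", 3), ("Vp", 4), ("Vs", 5)):
    print(f"{name}: {percent_change(original, modified, column):+.2f}%")
```

In the notebook, `original` and `modified` would instead come from `np.loadtxt` applied to `DATA/proc000000_model_velocity.dat_input_original` and `DATA/proc000000_model_velocity.dat_input`. Expect small deviations from exactly 5% because the file is written with four decimal places.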

In [None]:
# Run mesher and simulation code
! mpirun -n 1 bin/xmeshfem2D > OUTPUT_FILES/output_meshfem.txt
! mpirun -n 1 bin/xspecfem2D > OUTPUT_FILES/output_solver.txt

In [None]:
%cd /home/scoped/work/intro_seisflows
! seisflows sempar -P specfem2d_workdir/DATA/Par_file model gll

----------------
### Task 3) Set up your SeisFlows Parameter File

- Now we need to modify our existing parameter file to switch our workflow from Forward simulations to Inversion
- Inversion workflows require an additional `preprocess` module for data-synthetic comparisons
- They also require an `optimize` module, which is in charge of model updates
- We will use the `seisflows swap` command which swaps in the set of parameters associated with a given module
- You can use the command `seisflows print modules` to check the available choices for each module

#### Exercise Tasks

1) `Swap` the `preprocess` module to option: `default`
    - SeisFlows currently has two preprocessing modules, 'Default' and 'Pyaflowa'
    - Both modules perform similar functionality, but Pyaflowa provides richer features such as windowing, improved data storage, and plotting
2) `Swap` the `optimize` module to option: `gradient`
    - The optimize module takes care of gradient regularization and model updates
    - Other optimization modules include L-BFGS and Nonlinear Conjugate Gradient (NLCG)
3) `Swap` the `workflow` module to option: `inversion`
    - The `inversion` submodule builds upon the forward simulation and adds in functionality for generating kernels and updating models
    - Other workflow modules include: Forward, Migration (for generating kernels), and NoiseInversion (for ambient noise adjoint tomography)
4) Change the location of `path_model_init` which points to your starting model.  
   - Note: in (2) we generated a starting model in `specfem2d_workdir/DATA` (this is specific to SPECFEM2D, model files 
   - You might use the command `seisflows par` to change parameters from the command line, or do this manually
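
Once the swaps are done, you may want to confirm the result without scrolling through the whole parameter file. The sketch below is a deliberately simple line parser, not a full YAML reader. It assumes the module choices and `path_model_init` appear as top-level `key: value` lines, and it runs on a hypothetical excerpt rather than your real file. The function name `read_top_level` is made up for this example.

```python
import os
import tempfile


def read_top_level(path, keys):
    """Collect `key: value` pairs for the requested top-level keys."""
    found = {}
    with open(path) as f:
        for line in f:
            # Skip indented lines, comment lines, and lines without a key
            if line.startswith((" ", "#")) or ":" not in line:
                continue
            key, value = line.split(":", 1)
            if key in keys:
                # Drop any trailing inline comment
                found[key] = value.split("#", 1)[0].strip()
    return found


# Hypothetical excerpt of a parameter file after the swaps
excerpt = """\
workflow: inversion
preprocess: default
optimize: gradient
path_model_init: /home/scoped/work/intro_seisflows/specfem2d_workdir/OUTPUT_FILES
"""
with tempfile.TemporaryDirectory() as tmp:
    demo_file = os.path.join(tmp, "parameters.yaml")
    with open(demo_file, "w") as f:
        f.write(excerpt)
    settings = read_top_level(
        demo_file, ("workflow", "preprocess", "optimize", "path_model_init")
    )

for key, value in settings.items():
    print(f"{key}: {value}")
```

Pointing `read_top_level` at your own 'parameters.yaml' gives a quick summary of the four settings this task changes. `seisflows par` remains the supported way to inspect or edit individual parameters.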

#### Optional Tasks
- Have a look through the remainder of the parameter file, are there parameters you think would be useful to change?
- You can run the Inversion as is, but advanced users may play around with filtering (preprocess module) and smoothing (solver module).

In [None]:
%cd /home/scoped/work/intro_seisflows

! seisflows swap preprocess default
! seisflows swap optimize gradient
! seisflows swap workflow inversion
! seisflows par path_model_init /home/scoped/work/intro_seisflows/specfem2d_workdir/OUTPUT_FILES

--------------
### Task 4) Clean Up The Working Directory

- Run `seisflows clean` to delete all of the files from the previous Forward simulation, getting ready for our inversion.
- You can use the `-f/--force` option to skip over any 'are you sure about that?' prompts.

In [None]:
! seisflows clean -f

-------------
### Task 5) Ready to Run? Check and See!

- When your data are ready and your parameter file is set up, you can perform a sanity check
- Run `seisflows check` to perform a number of internal checks that make sure paths and parameters are set properly  
- If you receive any error messages from `seisflows check`, please fix them and re-run `seisflows check` to see if new errors pop up.

In [None]:
! seisflows check

### Task 6) Let's go!

If you think you're ready, run `seisflows submit` to start your inversion. 

In [None]:
! seisflows submit