# SPECFEM Users Workshop -- Day 1 (Oct. 5, 2022)

## Part 1B: Homogeneous Halfspace Model

Following up on Part 1A, this notebook lets Users experiment with their own SPECFEM2D homogeneous halfspace example. It is meant to familiarize Users with setting `SOURCE` and `STATION` attributes, adjusting velocity model parameters, and assessing their simulation work.

-----------

### 0) Relevant Information

>__NOTE:__ These instructions should be run from inside the Docker container, using Jupyter Lab. The Docker container should have the adjTomo toolkit installed (SeisFlows, Pyatoa, PySEP), as well as SPECFEM2D and SPECFEM3D compiled with MPI.

**Relevant Links:** 
- !!! ADD DAY 1 SLIDES HERE !!!
- Workshop Material: https://github.com/adjtomo/adjdocs/tree/main/workshops/2022-10-05_specfem_users
- Today's Notebook: !!! ADD THIS !!!

**Jupyter Quick Tips:**

- **Run cells** one-by-one by hitting the `Run` button at the top, or by hitting `Shift + Enter`
- **Currently running cells** that are still processing will have a `[*]` symbol next to them
- **Finished cells** will have a `[1]` symbol next to them, where the number inside the brackets indicates the order in which the cell was run.
- Commands that start with `!` are Bash commands (i.e., commands you would run from the terminal)
- Commands that start with `%` are Jupyter Magic commands


## 1) Setting Up 

It is often desirable to run SPECFEM outside of the cloned repository, in order to keep files and outputs manageable. The trick here is that SPECFEM only requires 3 components for a successful simulation: the `bin/`, `DATA/`, and `OUTPUT_FILES/` directories. In this section we will set up a SPECFEM2D working directory that we can play around with.

>__NOTE:__ We will be doing all our work in the directory /home/scoped/work_day_1. All the following cells assume that we are in this directory, so you must evaluate the '%cd' command to ensure that cells work as expected.

In [None]:
# The -p flag avoids an error if the directory already exists (e.g., when re-running this cell)
! mkdir -p /home/scoped/work_day_1
%cd /home/scoped/work_day_1

In [None]:
# Symlink the binary files, and copy the relevant DATA/ directory
! ln -s /home/scoped/specfem2d/bin .
! cp -r /home/scoped/specfem2d/EXAMPLES/Tape2007/DATA .
! mkdir OUTPUT_FILES

In [None]:
! ls
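
The same three-directory layout can also be built from Python. The following is a minimal sketch, not part of SPECFEM or SeisFlows: `setup_workdir` is a hypothetical helper, and the demonstration runs against a throwaway directory tree that only mimics the repository layout used above (it assumes a filesystem that supports symlinks).

```python
import shutil
import tempfile
from pathlib import Path


def setup_workdir(specfem_repo, example_data, workdir):
    """Create a minimal working directory with bin/, DATA/ and OUTPUT_FILES/."""
    specfem_repo = Path(specfem_repo)
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)

    # Symlink the compiled binaries rather than copying them
    bin_link = workdir / "bin"
    if not bin_link.exists():
        bin_link.symlink_to(specfem_repo / "bin", target_is_directory=True)

    # Copy the example DATA/ so that edits do not touch the repository
    data_dir = workdir / "DATA"
    if not data_dir.exists():
        shutil.copytree(example_data, data_dir)

    (workdir / "OUTPUT_FILES").mkdir(exist_ok=True)
    return workdir


# Demonstration on a temporary, fake repository tree (hypothetical paths)
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "specfem2d"
    example = repo / "EXAMPLES" / "Tape2007" / "DATA"
    (repo / "bin").mkdir(parents=True)
    example.mkdir(parents=True)
    (example / "Par_file").write_text("")

    work = setup_workdir(repo, example, Path(tmp) / "work_day_1")
    created = sorted(p.name for p in work.iterdir())
    bin_is_link = (work / "bin").is_symlink()

print(created, bin_is_link)
```

Because each step checks whether its target already exists, the helper can be re-run without raising errors, unlike a plain `mkdir` or `ln -s`.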

### 2) Tape et al. 2007 Example

We will be working with example data from the [Tape et al. 2007 GJI publication](https://academic.oup.com/gji/article/168/3/1105/929373). This example contains two synthetic models, a number of seismic sources, and a list of 132 station locations. The two synthetic models are 1) a homogeneous halfspace model, and 2) a checkerboard model with approximately $\pm$10\% perturbations with respect to the homogeneous halfspace model of (1). 

In [None]:
! ls DATA/

### a) The Homogeneous Halfspace Model

The homogeneous halfspace model in this example is defined in the `Par_file`, on Line 255. We can use the `seisflows sempar velocity_model` command to look at its values.

In [None]:
! seisflows sempar -P DATA/Par_file_Tape2007_onerec velocity_model

According to the `Par_file` comments, the model parameter values represent the following:
`model_number 1 rho Vp Vs 0 0 QKappa Qmu  0 0 0 0 0 0`. 

Therefore, the homogeneous halfspace model defines a region with P-wave velocity 5.8 km/s and S-wave velocity 3.5 km/s.
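
To make the column layout concrete, the sketch below splits such an entry into named fields. `parse_model_line` is an illustrative helper (not part of SPECFEM or SeisFlows), and the sample line is hypothetical: its density and Q values are placeholders, and it assumes velocities are stored in m/s using Fortran-style double-precision literals.

```python
def fortran_float(token):
    """Convert a Fortran-style literal such as '5800.d0' to a Python float."""
    return float(token.lower().replace("d", "e"))


def parse_model_line(line):
    """Split a `model_number 1 rho Vp Vs 0 0 QKappa Qmu ...` entry into fields."""
    fields = line.split()
    return {
        "model_number": int(fields[0]),
        "rho": fortran_float(fields[2]),
        "vp": fortran_float(fields[3]),
        "vs": fortran_float(fields[4]),
        "qkappa": fortran_float(fields[7]),
        "qmu": fortran_float(fields[8]),
    }


# Hypothetical entry; only Vp and Vs are chosen to match the values quoted above
sample = "1 1 2600.d0 5800.d0 3500.0d0 0 0 10.d0 10.d0 0 0 0 0 0 0"
model = parse_model_line(sample)

# Convert from m/s to km/s for display
print(model["vp"] / 1000.0, model["vs"] / 1000.0)
```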

### b) Visualizing the Checkerboard

We can use Matplotlib and NumPy to help us visualize the checkerboard model a bit better. First we'll have a look at the checkerboard model data file, and then we'll plot it directly.

In [None]:
# The columns of the data file correspond to the following:
# line_no x[m] z[m] density vp[m/s] vs[m/s]
! head DATA/model_velocity.dat_checker

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Only grabbing X, Z, Vp and Vs (columns 1, 2, 4 and 5; density in column 3 is skipped)
data = np.genfromtxt("DATA/model_velocity.dat_checker", dtype=float, usecols=[1,2,4,5])
chkbd_x, chkbd_z, chkbd_vp, chkbd_vs = data.T

In [None]:
# Plotting Vp
plt.tricontourf(chkbd_x, chkbd_z, chkbd_vp, levels=125, cmap="seismic_r")
plt.xlabel("X [m]")
plt.ylabel("Z [m]")
plt.title("Checkerboard Vp [m/s]")
plt.colorbar()

In [None]:
# Plotting Vs
plt.tricontourf(chkbd_x, chkbd_z, chkbd_vs, levels=125, cmap="seismic_r")
plt.xlabel("X [m]")
plt.ylabel("Z [m]")
plt.title("Checkerboard Vs [m/s]")
plt.colorbar()
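
One way to check the "approximately ±10%" statement is to express velocities as a percent perturbation relative to the homogeneous halfspace value. The sketch below uses plain Python lists and hypothetical sample values rather than the real model file; `percent_perturbation` is an illustrative name. With NumPy arrays such as `chkbd_vs`, the same arithmetic can be applied to the whole array at once.

```python
def percent_perturbation(values, reference):
    """Return 100 * (v - reference) / reference for each value."""
    return [100.0 * (v - reference) / reference for v in values]


# Hypothetical Vs samples [m/s] around the 3500 m/s halfspace value
vs_samples = [3150.0, 3500.0, 3850.0]
pert = percent_perturbation(vs_samples, reference=3500.0)
rounded = [round(p, 1) for p in pert]

print(rounded)
```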

### c) Visualizing Source-Receiver Geometry

We can similarly plot the SOURCES and STATIONS available to see what the experimental setup looks like.
We can use Python to grab Cartesian coordinate values from these files.

In [None]:
# Grab STATIONS data
! head DATA/STATIONS_checker

# Small code snippet to grab coordinates from STATIONS file
sta_x, sta_z = np.genfromtxt("DATA/STATIONS_checker", dtype=float, usecols=[2, 3]).T
print(sta_x[:10])

In [None]:
# There are 25 source files (SOURCE_001 to SOURCE_025)
# Counting from 0, `xs` is defined on line 2 and `zs` on line 3
! ls DATA/SOURCE_???
! echo
! head DATA/SOURCE_001

In [None]:
# Small code snippet to grab coordinates from SOURCE files
ev_x, ev_z = [], []
for i in range(1, 26):
    source_file = f"DATA/SOURCE_{i:0>3}"
    with open(source_file, "r") as f:
        lines = f.readlines()
    # Trying to break apart the following line
    # 'xs = 299367.72      # source location x in meters\n'
    xs = float(lines[2].split("=")[1].split("#")[0].strip())
    zs = float(lines[3].split("=")[1].split("#")[0].strip())
    
    ev_x.append(xs)
    ev_z.append(zs)

print(ev_x[:10])
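
Indexing by line number works as long as every SOURCE file has the same layout. A slightly more defensive alternative is to parse each `key = value  # comment` line into a dictionary and look up `xs` and `zs` by name. The sketch below is illustrative only: `read_parameters` is a hypothetical helper, and the sample text is made up apart from the `xs` line quoted in the cell above.

```python
def read_parameters(text):
    """Parse `key = value  # comment` lines into a dict of strings."""
    params = {}
    for line in text.splitlines():
        # Drop everything after the first '#', which starts a comment
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


# Hypothetical file contents (the zs value is a placeholder)
sample_source = """\
## Source 1
source_surf = .false.   # source inside the medium
xs = 299367.72      # source location x in meters
zs = 168000.00      # source location z in meters
"""

params = read_parameters(sample_source)
xs, zs = float(params["xs"]), float(params["zs"])
print(xs, zs)
```

Looking values up by key means the parser keeps working if a line is added or reordered, at the cost of a few more lines of code.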

In [None]:
# Plot SOURCES and STATIONS together on top of the Vs checkerboard
plt.tricontourf(chkbd_x, chkbd_z, chkbd_vs, levels=250, cmap="seismic_r")
plt.scatter(ev_x, ev_z, c="y", marker="*", s=100, edgecolor="k")
plt.scatter(sta_x, sta_z, c="c", marker="v", s=20, edgecolor="k")
plt.title("SOURCE-RECEIVER GEOMETRY")

In [None]:
# Plot SOURCES next to source names
plt.tricontourf(chkbd_x, chkbd_z, chkbd_vs, levels=250, cmap="seismic_r", alpha=0.1)
# Number labels from 1 so that they match the file names SOURCE_001 ... SOURCE_025
for i, (x, z) in enumerate(zip(ev_x, ev_z), start=1):
    plt.scatter(x, z, c="y", marker="*", s=45)
    plt.text(x, z, f"{i:0>3}")
plt.title("SOURCES")

In [None]:
# Plot STATIONS next to their station numbers (counting from 0)
plt.tricontourf(chkbd_x, chkbd_z, chkbd_vs, levels=250, cmap="seismic_r", alpha=0.1)
for i, (x, z) in enumerate(zip(sta_x, sta_z)):
    plt.scatter(x, z, c="c", marker="v", s=8)
    plt.text(x, z, f"{i:0>3}", fontsize=9)
plt.title("STATIONS")

In the above figures, the upside-down cyan triangles represent the 132 receivers in this example, while the 25 yellow stars are the sources. Now that we are familiar with our experimental setup, we can run SPECFEM2D to generate synthetics.
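
When picking a source and a subset of stations, it can help to know how far each receiver is from the source. The following sketch computes straight-line distances in the X-Z plane. The coordinates are hypothetical and `source_receiver_distances` is an illustrative helper. In the notebook, the inputs would come from `ev_x`, `ev_z`, `sta_x` and `sta_z`.

```python
import math


def source_receiver_distances(source_xz, stations_xz):
    """Return the Cartesian distance from one source to each station."""
    sx, sz = source_xz
    return [math.hypot(x - sx, z - sz) for x, z in stations_xz]


# Hypothetical coordinates in meters
source = (0.0, 0.0)
stations = [(6000.0, 8000.0), (3000.0, 4000.0)]

dists = source_receiver_distances(source, stations)
# Index of the station closest to the source
nearest = min(range(len(dists)), key=dists.__getitem__)

print(dists, nearest)
```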

## 2) Running the Example

To run the example, we'll have to do some setup of our working directory to get files in the correct place.

In [None]:
! ls DATA

### a) Choose a Source file

SPECFEM2D will look for a file named `SOURCE` in the *DATA/* directory. There are 25 sources to choose from. You can have a look at the SOURCE plot we created in the previous section to choose which SOURCE you'd like to run. By default the notebook chooses SOURCE_001 as the main source.

In [None]:
# > CHOOSE your source file here by replacing 'SOURCE_001'
! cp -f DATA/SOURCE_001 DATA/SOURCE

In [None]:
! head -1 DATA/SOURCE

### b) Choosing Stations

The `STATIONS` file defines 132 different station locations, as visualized earlier. We can choose what stations we use by copying over a subset of the original station list.
By default the example chooses to use all 132 stations. 

>__NOTE:__ Because the wavefield is simulated in the entire domain, and individual synthetic seismograms simply extract the wavefield at a chosen location, computational expense is not tied to the number of stations. In other words, choosing 1 or 132 stations results in the same computational expense.

In [None]:
! head DATA/STATIONS_checker

In [None]:
# Write out a NEW stations file by choosing station numbers
# Change the range, or write your own list to choose station values
# e.g., STATION_NUMBER_CHOICE = [0, 1, 2, 3]
STATION_NUMBER_CHOICE = range(0, 132) 

# Read the existing stations file (the `with` block closes the file for us)
with open("DATA/STATIONS_checker", "r") as f:
    stations = f.readlines()

# Write out only User defined stations
with open("DATA/STATIONS", "w") as f:
    for i in STATION_NUMBER_CHOICE:
        f.write(stations[i])

In [None]:
! tail DATA/STATIONS

### c) Setting up the `Par_file`

We need to change a few key parameters in the `Par_file` to run SPECFEM2D with desired behavior. We'll explain each of the parameter changes below, and use the `seisflows sempar` command to make the changes.
Optionally, you are welcome to open the `Par_file` directly (by double clicking) and edit parameters yourself. Be sure to check your spelling!

In [None]:
# Copy in the Example parameter file
! cp -f DATA/Par_file_Tape2007_132rec_checker DATA/Par_file

# Ensure that the checkerboard is named appropriately so SPECFEM can find it
! cp -f DATA/model_velocity.dat_checker DATA/proc000000_model_velocity.dat_input

! seisflows sempar -P DATA/Par_file model legacy
! seisflows sempar -P DATA/Par_file save_model binary
! seisflows sempar -P DATA/Par_file setup_with_binary_database 1

Explanations of the changes we made include:

`MODEL`: Set to 'legacy', which tells SPECFEM2D to read an ASCII file defining the checkerboard model  
`setup_with_binary_database`: Writes database files in binary format     
`SAVE_MODEL`: Write model files in binary (.bin) format  
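
For readers curious what such an edit amounts to, the sketch below rewrites the value of one `key = value` entry in a block of Par_file-style text while leaving trailing comments intact. This is not the SeisFlows implementation: `set_parameter` is a hypothetical helper, and the sample text is a made-up fragment rather than the real `Par_file`.

```python
import re


def set_parameter(par_text, key, value):
    """Return par_text with the value of `key = ...` replaced by `value`."""
    # Match the key at the start of a line (case-insensitive), then '=',
    # then the first whitespace-delimited token, which is the current value
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*)(\S+)",
        re.IGNORECASE | re.MULTILINE,
    )
    new_text, count = pattern.subn(
        lambda m: m.group(1) + str(value), par_text, count=1
    )
    if count == 0:
        raise KeyError(f"parameter {key!r} not found")
    return new_text


# Hypothetical Par_file fragment
sample_par = (
    "MODEL                           = default   # velocity model type\n"
    "SAVE_MODEL                      = default\n"
    "setup_with_binary_database      = 0\n"
)

updated = set_parameter(sample_par, "model", "legacy")
updated = set_parameter(updated, "setup_with_binary_database", 1)
print(updated)
```

Anchoring the pattern at the start of the line keeps a request for `model` from matching `SAVE_MODEL`, which is the kind of slip that is easy to make when editing the file by hand.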

### d) Run the example

Now that we have set up the `Par_file`, `SOURCE`, and `STATIONS` files, we can run `xmeshfem2D` to generate the mesh and `xspecfem2D` to run our forward simulation.

In [None]:
# Ensures we're running with a clean OUTPUT directory
! rm -rf OUTPUT_FILES
! mkdir OUTPUT_FILES

! mpirun -n 1 bin/xmeshfem2D > OUTPUT_FILES/output_meshfem.txt
! mpirun -n 1 bin/xspecfem2D > OUTPUT_FILES/output_solver.txt
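
If either command fails, a missing input file is a common culprit. The sketch below is a simple pre-flight check, assuming the three input files prepared in the previous steps are the ones required. `missing_inputs` and `REQUIRED_INPUTS` are illustrative names, and the demonstration uses a temporary directory rather than the real working directory.

```python
import tempfile
from pathlib import Path

# Input files prepared in the previous steps of this notebook
REQUIRED_INPUTS = ("DATA/Par_file", "DATA/SOURCE", "DATA/STATIONS")


def missing_inputs(workdir, required=REQUIRED_INPUTS):
    """Return the required input files that do not exist under workdir."""
    workdir = Path(workdir)
    return [name for name in required if not (workdir / name).is_file()]


# Demonstration: a temporary working directory that lacks DATA/STATIONS
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "DATA"
    data.mkdir()
    (data / "Par_file").write_text("")
    (data / "SOURCE").write_text("")
    missing = missing_inputs(tmp)

print(missing)
```

In the notebook, calling `missing_inputs(".")` from the working directory before launching the mesher should return an empty list.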

### e) Examine Output Files

!!! ANALYZE LOG MESSAGES HERE !!!

In [None]:
! ls OUTPUT_FILES/

In [None]:
# We can use the record section tool in PySEP to plot our waveforms
! recsec --syn_path OUTPUT_FILES/ --cmtsolution DATA/SOURCE --stations DATA/STATIONS --components Y --cartesian -L INFO 

In [None]:
from IPython.display import Image
Image("record_section.png")

In [None]:
# We can also look at the wavefield snapshots
Image("OUTPUT_FILES/forward_image000000800.jpg")

In [None]:
# We can also look at the wavefield snapshots
Image("OUTPUT_FILES/forward_image000001200.jpg")

In [None]:
# We can also look at the wavefield snapshots
Image("OUTPUT_FILES/forward_image000002200.jpg")
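
Rather than typing snapshot names by hand, the available images can be listed and sorted. The sketch below assumes the number embedded in each `forward_image*.jpg` file name is the time step, as the names used above suggest. `list_snapshots` is a hypothetical helper, and the demonstration creates empty placeholder files in a temporary directory.

```python
import re
import tempfile
from pathlib import Path


def list_snapshots(output_dir):
    """Return (step, path) pairs for forward_image*.jpg files, sorted by step."""
    pattern = re.compile(r"forward_image(\d+)\.jpg$")
    found = []
    for path in Path(output_dir).glob("forward_image*.jpg"):
        match = pattern.search(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return sorted(found)


# Demonstration with empty placeholder files named like the snapshots above
with tempfile.TemporaryDirectory() as tmp:
    for step in (2200, 800, 1200):
        (Path(tmp) / f"forward_image{step:09d}.jpg").touch()
    steps = [step for step, _ in list_snapshots(tmp)]

print(steps)
```

In the notebook, looping over `list_snapshots("OUTPUT_FILES")` and passing each path to `Image` would display the snapshots in time order.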

## 3) Choose Your Own Adventure

Now that we have a working directory that we know produces synthetics, we can play around with our setup.
Some things you might try:

- Change out the model to a homogeneous halfspace using the `Par_file` definition of the velocity model
- Define a uniform grid of stations to record synthetics throughout the domain
- Choose a different source, or increase the energy released by the source (using the moment tensor)
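
As a starting point for the uniform station grid, the sketch below generates one text line per station. It assumes a six-column layout (station name, network, x, z, and two trailing values set to 0.0), chosen to be consistent with the X and Z columns read from `STATIONS_checker` earlier. Compare it against `head DATA/STATIONS_checker` before using it. `grid_stations`, the network code and the domain extent are all hypothetical.

```python
def grid_stations(x_min, x_max, z_min, z_max, nx, nz, network="AA"):
    """Return STATIONS-style lines for an nx-by-nz uniform grid."""
    if nx < 2 or nz < 2:
        raise ValueError("nx and nz must both be at least 2")
    dx = (x_max - x_min) / (nx - 1)
    dz = (z_max - z_min) / (nz - 1)
    lines = []
    for iz in range(nz):
        for ix in range(nx):
            x = x_min + ix * dx
            z = z_min + iz * dz
            # Station names are numbered in the order they are generated
            name = f"S{len(lines):06d}"
            lines.append(f"{name} {network} {x:.4f} {z:.4f} 0.0 0.0")
    return lines


# Hypothetical 3 x 2 grid over a 1 km by 2 km region
grid = grid_stations(0.0, 1000.0, 0.0, 2000.0, nx=3, nz=2)
print("\n".join(grid))
```

Writing `"\n".join(grid) + "\n"` to `DATA/STATIONS` would then replace the station list used by the next simulation.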

## 4) Automating Forward Simulations with SeisFlows

SeisFlows is an automated workflow tool which takes care of all the tasks required to run SPECFEM. In essence, SeisFlows is a Python wrapper for SPECFEM, which includes modular components for interfacing with various compute systems. It also employs various preprocessing and optimization methods for seismic inversions (to be covered on Days 2 and 3). We can automate forward simulations for multiple events in the Example we just ran. SeisFlows Example 3 runs automated, en-masse forward simulations.

In [None]:
# Prints the help dialogue for SeisFlows example 3
! seisflows examples 3

In [None]:
! mkdir -p /home/scoped/work_day_1/example_3
%cd /home/scoped/work_day_1/example_3

! seisflows examples run 3 -r /home/scoped/specfem2d/ --with_mpi

In [None]:
! ls 