# SPECFEM Users Workshop -- Day 1 (Oct. 5, 2022)

## >> TO DO (ctrl + f `!!!`)
1) Add link to google slides
2) Background material (section 1)
3) Analyze mesh parameters and interface file (section 3b)
4) Analyze mesher log file (section 3d)
5) Analyze solver log file (section 4)

## Part 1A: Intro to SPECFEM2D

- Day 1A notebook is meant to walk Users through an introduction to `SPECFEM2D`, which includes:
    - navigating a SPECFEM2D working directory
    - generating a 2D mesh with homogeneous halfspace model
    - running forward simulations to generate synthetic seismograms. 
- We will note important files and key steps to take when running SPECFEM. 
- **Objective**: Understand `SPECFEM2D` in order to draw parallels with `SPECFEM3D` and to better understand how software like `SeisFlows` automates SPECFEM
- These instructions should be run from inside the Docker container, using Jupyter Lab (see *Docker Preamble*). 
-----------


**Relevant Links:** 
- Day 1 Slides: !!! ADD THIS !!!
- Today's Notebook: https://github.com/adjtomo/adjdocs/blob/main/workshops/2022-10-05_specfem_users/day_1a_intro_specfem.ipynb
- Completed Notebook: https://github.com/adjtomo/adjdocs/blob/main/workshops/2022-10-05_specfem_users/completed_notebooks/day_1a_intro_specfem.ipynb
- Day 0 Notebook (Container Testing): https://github.com/adjtomo/adjdocs/blob/main/workshops/2022-10-05_specfem_users/day_0_container_testing.ipynb
- SPECFEM2D GitHub Repository: https://github.com/geodynamics/specfem2d/tree/devel
- SPECFEM2D Manual: https://specfem2d.readthedocs.io/en/latest/

**Jupyter Quick Tips:**

- **Run cells** one-by-one by hitting the $\blacktriangleright$ button at the top, or by hitting `Shift + Enter`
- **Run all cells** by hitting the $\blacktriangleright\blacktriangleright$ button at the top, or by running `Run -> Run All Cells`
- **Currently running cells** that are still processing will have a `[*]` symbol next to them
- **Finished cells** will have a `[1]` symbol next to them. The number inside the brackets represents what order this cell has been run in.
- Commands that start with `!` are Bash commands (i.e., commands you would run from the terminal)
- Commands that start with `%` are Jupyter Magic commands.
- To time a task, put a `%time` before the command (e.g., `%time ! ls`)


## 1) Background !!! TODO !!!


Potential topics: 
- Seismic waveforms
- Numerical modeling
- Spectral element method
- Meshes

## 2) SPECFEM2D Directory Tour

- In this section we will tour around the SPECFEM2D repository, which is located in `/home/scoped/specfem2d`.  
- **Note** that this directory will not exactly match a directory you clone from GitHub because we have removed a number of large files and directories in order to keep the size of this container reasonable.
- The `devel` branch of each SPECFEM version contains the most up-to-date codebase

In [None]:
import numpy as np
import matplotlib.pyplot as plt 
from IPython.display import Image
from seisflows.tools.specfem import Model

%cd /home/scoped/specfem2d

### a) Binary Executables in *bin/* directory

- In this workshop container, we have already downloaded (git clone), configured (choosing compilers and compiler options) and compiled (make all) SPECFEM2D.  
- The binary executable files are located in the `bin/` directory. 
- Each of these executables performs a different function in the package.

In [None]:
# Let's have a look at the executables
! ls bin

The two most important executables we will be using today are `xmeshfem2D` and `xspecfem2D`. 
- `xmeshfem2D`: used to generate our numerical mesh, the skeleton of the domain upon which we run our numerical simulations. 
- `xspecfem2D`: runs the spectral element solver, generating synthetic seismograms for a given source and set of stations.

Some other important executables we will use in Day 2:
- `xsmooth_sem`: smooths volumetric quantities by convolving them with a 2D Gaussian. Users can define the horizontal and vertical half-widths of the Gaussian.
- `xcombine_sem`: combines multiple volumetric quantities, such as summing kernels to form the gradient.
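
To build intuition for what smoothing with a 2D Gaussian means, here is a minimal NumPy sketch. It assumes a regular grid and uses hypothetical names (`gaussian_kernel`, `smooth_2d`, `sigma_h`, `sigma_v`, given in grid samples). SPECFEM smooths quantities defined on its own mesh, so this illustrates the idea only and is not how `xsmooth_sem` is implemented.

```python
import numpy as np


def gaussian_kernel(sigma, truncate=3.0):
    """Return a normalized 1D Gaussian kernel (sigma in grid samples)."""
    half = int(np.ceil(truncate * sigma))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def smooth_2d(field, sigma_h, sigma_v):
    """Smooth a 2D array (rows=Z, columns=X) with a separable Gaussian."""
    kh = gaussian_kernel(sigma_h)
    kv = gaussian_kernel(sigma_v)
    # Pad with edge values so the boundaries are not pulled toward zero
    pad_z, pad_x = len(kv) // 2, len(kh) // 2
    padded = np.pad(field, ((pad_z, pad_z), (pad_x, pad_x)), mode="edge")
    # Convolve along X (each row), then along Z (each column)
    rows = np.apply_along_axis(np.convolve, 1, padded, kh, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kv, mode="valid")


# A single spike standing in for a rough, unsmoothed quantity
raw = np.zeros((21, 41))
raw[10, 20] = 1.0
smoothed = smooth_2d(raw, sigma_h=4.0, sigma_v=2.0)
print(smoothed.shape, smoothed.max())
```

Using different widths in the horizontal and vertical directions spreads the spike into an elongated blob, which mirrors the separate horizontal and vertical half-widths mentioned above.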

### b) Metadata in *DATA/* directory

- Data that the User will provide to SPECFEM should be stored in the *DATA/* directory. 
- The most important files that we will concern ourselves with are the `Par_file`, `SOURCE` and `STATIONS` files.

    `Par_file`: The parameter file which allows the User to adjust parameters for a given simulation  
    `SOURCE`: Defines source characteristics (e.g., moment tensor, force). **NOTE**: SPECFEM2D and SPECFEM3D have a number of different types of available source files (e.g., SOURCE, FORCESOLUTION, CMTSOLUTION)  
    `STATIONS`: Defines station codes and locations (either Cartesian or geographic). **NOTE**: In SPECFEM2D, station information may also be defined in the `Par_file`

The following commands open these files for the SPECFEM2D example problem

In [None]:
# Look at the DATA/ directory
! ls DATA

In [None]:
# Illustrates that the Par_file is an ASCII file with key-value pairs
! head -38 DATA/Par_file
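
Because the `Par_file` is plain ASCII with key-value pairs, it is easy to read programmatically. The sketch below assumes lines of the form `KEY = value  # comment`; the function name `parse_par_file` and the sample values are hypothetical and only for illustration.

```python
def parse_par_file(text):
    """Parse 'key = value  # comment' lines into a dict of strings."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue  # skip blank lines and lines that are not key-value pairs
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


# Hypothetical excerpt; values are for illustration only
sample = """
# simulation input parameters
NPROC                           = 1      # number of processes
MODEL                           = default
SAVE_MODEL                      = default   # or binary, ascii
"""
params = parse_par_file(sample)
print(params["NPROC"], params["MODEL"])
```

All values are kept as strings here; converting them to integers or floats is left to the caller.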

In [None]:
# The SPECFEM2D SOURCE file defines a 2D seismic source
! cat DATA/SOURCE

In [None]:
# Sometimes in SPECFEM2D, the Par_file defines station information directly
! head -194 DATA/Par_file | tail -n 16

In [None]:
# However, other examples may define station information using STATIONS files, which is formatted:
# STATION NETWORK X[m] Z[m] burial[m] elevation[m]
! head -5 EXAMPLES/Tape2007/DATA/STATIONS_checker
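
The column layout described above can be read with NumPy. This is a small sketch using made-up station lines (the coordinates are illustrative, not taken from the example) and an in-memory buffer in place of a real `STATIONS` file.

```python
import io

import numpy as np

# Hypothetical STATIONS content; coordinates are illustrative only
# Columns: STATION NETWORK X[m] Z[m] burial[m] elevation[m]
stations_text = """\
S0001    AA    300.0    3000.0    0.0    0.0
S0002    AA    640.0    3000.0    0.0    0.0
S0003    AA    980.0    3000.0    0.0    0.0
"""

names = np.genfromtxt(io.StringIO(stations_text), dtype=str, usecols=[0, 1])
coords = np.genfromtxt(io.StringIO(stations_text), dtype=float, usecols=[2, 3])

# Build NETWORK.STATION codes from the first two columns
codes = [f"{net}.{sta}" for sta, net in names]
for code, (x, z) in zip(codes, coords):
    print(f"{code}: x={x:.1f} m, z={z:.1f} m")
```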

### c) Results stored in *OUTPUT_FILES/* directory

- Any outputs generated by SPECFEM will be stored in the `OUTPUT_FILES/` directory. 
- Outputs include log and error messages, synthetic seismograms, figures, and database files
- Most executables will put their outputs here
- **NOTE:** SPECFEM also maintains a DATABASE directory (typically called `DATABASES_MPI/`) which is used to store large database files containing the entire GLL mesh and model. This directory may be the same as `OUTPUT_FILES/`, or may be its own separate directory.

In [None]:
# Currently empty because we have not run any executables
! ls OUTPUT_FILES

## 3) Running the mesher `xmeshfem2D` 

- The first thing we need to do when approaching numerical simulations is to generate our numerical mesh. 
- There are multiple approaches to meshing, such as using external software like Trelis. 
- During this workshop we will use SPECFEM's internal meshing software, known as `Meshfem`.
- We will use two terms to talk about meshing:  
    - *MESH*: a numerical grid which defines coordinate points only (i.e., X and Z in 2D).  
    - *MODEL*: parameter values (e.g., seismic velocity) approximating structure, assigned to locations on the MESH.  
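
The distinction between the two terms can be sketched with NumPy arrays. This example assumes a small regular grid and a made-up two-layer velocity structure. Real SPECFEM meshes are not regular grids, so treat it as an illustration of the concept only.

```python
import numpy as np

# MESH: coordinates only (hypothetical 5 x 3 regular grid, in meters)
x = np.linspace(0.0, 4000.0, 5)
z = np.linspace(0.0, 3000.0, 3)
mesh_x, mesh_z = np.meshgrid(x, z)

# MODEL: one parameter value per mesh point (illustrative two-layer Vs in m/s)
vs = np.where(mesh_z >= 1500.0, 1500.0, 2000.0)

print("mesh shape:", mesh_x.shape)
print("model shape:", vs.shape)
```

The same mesh could carry a completely different model (for example, a constant Vs everywhere) without changing any coordinates.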

In [None]:
%cd /home/scoped/specfem2d

### a) Velocity Model Parameters

- In SPECFEM2D, *mesh* and *model* parameters are defined in the `Par_file`.
- There are various parameter options we can use to customize our mesher run.
- The following parameter set allows us to read input values from the `Par_file`  
    `NPROC`: defines the number of processors used to partition the *mesh*  
    `MODEL`: set as 'default' which reads *model* parameters from `Par_file`  

>__NOTE:__ In SPECFEM3D, mesh files are defined separately from the `Par_file` to provide more control over a 3D domain. These files are typically stored in `specfem3d/DATA/meshfem3D_files`.

In [None]:
# Look at the definition of the model in the Par_file
! head -273 DATA/Par_file | tail -n 34

- In the output above we can see that our `Par_file` defines 4 separate regions, each with varying values for density and velocity.

```bash
REG - RHO   VP[m/s] VS[m/s]    - - QKAP QMU  - - - - -
1 1 2700.d0 3000.d0 1732.051d0 0 0 9999 9999 0 0 0 0 0 0
2 1 2500.d0 2700.d0 0 0 0 9999 9999 0 0 0 0 0 0
3 1 2200.d0 2500.d0 1443.375d0 0 0 9999 9999 0 0 0 0 0 0
4 1 2200.d0 2200.d0 1343.375d0 0 0 9999 9999 0 0 0 0 0 0
```
- These regions have **no** sense of space. They only represent material properties.
- These regions will be assigned to parts of the *mesh* in the following section
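
To get a feel for the material properties listed above, the sketch below derives a few standard isotropic quantities from the density and velocity columns. It assumes density is in kg/m^3, uses the textbook relations mu = rho Vs^2 and lambda = rho (Vp^2 - 2 Vs^2), and labels a region with Vs = 0 as acoustic (fluid). The variable names are hypothetical.

```python
import math

# (rho [kg/m^3, assumed], vp [m/s], vs [m/s]) copied from the table above
regions = {
    1: (2700.0, 3000.0, 1732.051),
    2: (2500.0, 2700.0, 0.0),
    3: (2200.0, 2500.0, 1443.375),
    4: (2200.0, 2200.0, 1343.375),
}

summary = {}
for reg, (rho, vp, vs) in regions.items():
    mu = rho * vs ** 2                    # shear modulus
    lam = rho * (vp ** 2 - 2 * vs ** 2)   # first Lame parameter
    kind = "elastic" if vs > 0 else "acoustic (fluid)"
    ratio = vp / vs if vs > 0 else math.inf
    summary[reg] = {"kind": kind, "vp_vs": ratio, "mu": mu, "lambda": lam}
    print(f"region {reg}: {kind}, Vp/Vs = {ratio:.3f}, mu = {mu:.3e} Pa")
```

Region 2 has zero shear velocity, so its shear modulus is zero and it cannot support shear waves.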

### b) Meshfem Parameters

- The internal mesher has a set of parameters that allows Users to define interfaces, geometry, and absorbing boundary conditions.
- These parameters allow a User to customize a 2D domain to fit their research problem
- The *mesh* parameters also distribute *model* properties defined above, to specific parts of the mesh


In [None]:
! head -320 DATA/Par_file | tail -n 29

In [None]:
# Look at the `interfacesfile` which defines boundary interfaces 
! cat /home/scoped/specfem2d/EXAMPLES/simple_topography_and_also_a_simple_fluid_layer/DATA/interfaces_simple_topo_curved.dat

!!! TO DO ANALYZE MESH PARAMETERS AND INTERFACE FILE !!!

### c) Setting Up Meshfem

- We need to set a few `Par_file` parameters to tell SPECFEM to output additional files that will facilitate understanding the outputs of `xmeshfem2D`.
- We will use the `seisflows sempar` command to print and edit values from the SPECFEM2D `Par_file`. This is simply a convenience function and can be replaced by bash commands like 'cat' + 'awk', or by opening the `Par_file` with a text editor.


The `sempar` (spectral element method parameter) command syntax is as follows
```bash
seisflows sempar -P <Par_file> <key> <value:optional>
```
where `<Par_file>` is the path to the SPECFEM `Par_file`, `<key>` is a parameter in the `Par_file` (case-insensitive), and `<value>` is an optional new value that overwrites the existing one. 
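
For readers curious about what such an edit involves, here is a minimal Python sketch of overwriting one key in `Par_file`-style text. The function `set_par` is hypothetical and is not the SeisFlows implementation. It assumes `KEY = value  # comment` lines with values that contain no spaces.

```python
import re


def set_par(text, key, value):
    """Return Par_file text with `key` overwritten by `value`.

    The key match is case-insensitive and any trailing comment is kept.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*)(\S+)",
        re.IGNORECASE | re.MULTILINE,
    )
    new_text, count = pattern.subn(lambda m: f"{m.group(1)}{value}", text)
    if count == 0:
        raise KeyError(f"{key} not found")
    return new_text


# Hypothetical Par_file line, for illustration only
sample = "NPROC                           = 1      # number of processes\n"
updated = set_par(sample, "nproc", 4)
print(updated, end="")
```

Anchoring the pattern at the start of the line keeps a key such as `MODEL` from matching inside `SAVE_MODEL`.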
    

In [None]:
! seisflows sempar -P DATA/Par_file model
! seisflows sempar -P DATA/Par_file nproc 4
! seisflows sempar -P DATA/Par_file setup_with_binary_database 1
! seisflows sempar -P DATA/Par_file save_model binary

#### Meshfem Parameter Explanations

`MODEL`: Must be 'default' to use the model defined in the `Par_file` (this is the default option)  
`NPROC`: Number of MPI processes to run on. The mesh itself is partitioned into `NPROC` sections, each of which is provided to a separate processor.  
`setup_with_binary_database`: Writes database files in binary format, whereas by default they are not saved  
`SAVE_MODEL`: Writes model files in Fortran `binary` format, as opposed to other formats like ASCII

#### Database Files

- Database files are files in which SPECFEM stores its internal representation of mesh and model
- These can take on various formats but in this workshop we store them as Fortran binary files
- SPECFEM3D operates in the same manner, storing mesh and model representations in DATABASE files
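
Fortran binary (unformatted sequential) files wrap each record in length markers, which is why they cannot be read as a flat array of numbers. The round trip below is a sketch under stated assumptions: 4-byte markers (a common compiler default), little-endian byte order, and single-precision values. The helper names are hypothetical, and real SPECFEM files may use other types or contain several records.

```python
import io

import numpy as np


def write_fortran_record(buffer, values):
    """Write one unformatted sequential record with 4-byte length markers."""
    data = np.asarray(values, dtype="<f4").tobytes()
    marker = np.array([len(data)], dtype="<i4").tobytes()
    buffer.write(marker + data + marker)


def read_fortran_record(buffer):
    """Read one record laid out as [nbytes][float32 data][nbytes]."""
    nbytes = int(np.frombuffer(buffer.read(4), dtype="<i4")[0])
    data = np.frombuffer(buffer.read(nbytes), dtype="<f4")
    trailer = int(np.frombuffer(buffer.read(4), dtype="<i4")[0])
    if trailer != nbytes:
        raise ValueError("record markers do not match")
    return data


# Round trip on an in-memory buffer with illustrative values
buf = io.BytesIO()
write_fortran_record(buf, [1732.051, 0.0, 1443.375])
buf.seek(0)
values = read_fortran_record(buf)
print(values)
```

Checking that the leading and trailing markers agree is a cheap way to catch a wrong assumption about marker size or byte order.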

### d) Run Meshfem Executable

- This example problem is already set up to run, so we simply execute: `xmeshfem2D` 
- Under the hood, SPECFEM will look for relevant data in the *DATA/* directory
- It then generates our numerical mesh in the *OUTPUT_FILES/* directory. 
- The *model* will not be output until we run the solver (`xspecfem2D`) later in the notebook
- We run the problem with MPI on n=4 processors.
- We also redirect the output to a log file so that we can take a look at different parts of it.

In [None]:
! mpirun -n 4 bin/xmeshfem2D > OUTPUT_FILES/output_meshfem2d.txt

In [None]:
# The log file contains important information on how the mesher ran
! head OUTPUT_FILES/output_meshfem2d.txt

In [None]:
# e.g., the mesher has created our STATIONS File
! head -255 OUTPUT_FILES/output_meshfem2d.txt | tail -n 30

In [None]:
# The STATIONS file matches what the log file tells us 
! cat DATA/STATIONS

!!! TODO Look at other parts of the output log file here !!!!

In [None]:
# Database files are stored in the OUTPUT_FILES/ directory as FORTRAN binary (.bin) files
# The Database stores the internal definition of the numerical mesh
! ls OUTPUT_FILES/Database*.bin

## 4) Running the solver `xspecfem2D`

- The Solver `xspecfem2D` will now take the Database files generated by `xmeshfem2D` and run a forward simulation 
- `xspecfem2D` uses the provided `SOURCE` and `STATIONS` files. 
- We will view some of the parameters to look at how the output synthetic seismograms are generated

In [None]:
%cd /home/scoped/specfem2d

In [None]:
# Look at the Solver-specific parameters
! head -168 DATA/Par_file | tail -n 32

### Important Solver Parameters

`seismotype`: Sets the units of the output seismograms. This example outputs in units of 'displacement'  
`USER_T0`: Defines the earliest starting time, prior to time step 0. This allows some zero padding before initiating the source, and is useful, e.g., in cases where you have very short source-receiver distances  
`save_ASCII_seismograms`: Outputs seismograms in two-column ASCII files.

In [None]:
# Run the solver on 4 cores
! mpirun -n 4 bin/xspecfem2D > OUTPUT_FILES/output_solver.txt

In [None]:
# Again, the log file contains important information on the process
! head OUTPUT_FILES/output_solver.txt

In [None]:
# Mesh and Model parameters are assigned here
! head -280 OUTPUT_FILES/output_solver.txt | tail -n 16

In [None]:
# During the simulation, the log file updates the User on progress
! head -1023 OUTPUT_FILES/output_solver.txt | tail -n 33

In [None]:
# The solver writes out model files at the end of the simulation
! head -1400 OUTPUT_FILES/output_solver.txt | tail -n 23

!!! TODO Go through more of the solver log here !!!

## 5) Understanding SPECFEM2D Output Files

- `xspecfem2D` has created a plethora of results
- We will look at them one by one to see what each of these files is, and how it can help us understand our simulation.

### a) Velocity Model

- `xspecfem2D` outputs the velocity model into the *DATA/* directory. 
- We can use some utility functions written into `SeisFlows` to plot this model to help us visualize our domain.
- SPECFEM3D outputs velocity model files in `LOCAL_PATH` which is commonly `OUTPUT_FILES/DATABASES_MPI`
- We'll use Python to visualize results from our simulation

In [None]:
%cd /home/scoped/specfem2d

In [None]:
# The .bin files define our velocity model
! ls DATA/*bin

In [None]:
# Grab STATION coordinates by reading ASCII files
sta_x, sta_z = np.genfromtxt("DATA/STATIONS", dtype=float, usecols=[2, 3]).T
sta_id = np.genfromtxt("DATA/STATIONS", dtype=str, usecols=[0]).T

# Grab SOURCE coordinates from SOURCE file
source_file = "DATA/SOURCE"
with open(source_file, "r") as f:
    lines = f.readlines()
    
# Trying to break apart the following line
# 'xs = 299367.72      # source location x in meters\n'
ev_x = float(lines[2].split("=")[1].split("#")[0].strip())
ev_z = float(lines[3].split("=")[1].split("#")[0].strip())

In [None]:
# We can use SeisFlows to plot this model in 2D as it knows how to read .bin files
m = Model(path="DATA")
m.plot2d(parameter="vs", show=False)

# Plot SOURCE and STATIONS on top of the model
for x_, z_, id_ in zip(sta_x, sta_z, sta_id):
    plt.scatter(float(x_), float(z_), c="g", marker="v", ec="k", s=50)
    plt.text(x_, z_, id_)
plt.scatter(ev_x, ev_z, c="y", marker="*", ec="k", s=250)

- Model above shows shear wave velocities (Vs) in a 2D domain
- Our model is defined by 3 distinct layers. 
    - Top: from Z=3500m down to Z=2000m, features a moderate velocity with topography at the surface (Z>3000m). 
    - Middle: from Z=2000m down to Z=1000m shows a low velocity zone with a high-velocity column (turquoise square). 
    - Bottom: from Z=1000m to Z=0m, features a relatively fast velocity. 
- 22 Station locations (green triangles), along the top boundary, and in a 'borehole' below the event
- 1 event, yellow star, colocated with station S0012
- Each interface (topography and contact between layers), was defined in a file specified by `Par_file` parameter `interfacesfile`.

In [None]:
# We can compare the interface file with the mesh above
! head -39 EXAMPLES/simple_topography_and_also_a_simple_fluid_layer/DATA/interfaces_simple_topo_curved.dat | tail -n 34

### b) Synthetic waveforms

- During the simulation, `xspecfem2D` initiated the source defined in the `SOURCE` file at time 0. 
- Over the course of the simulation, seismic waves propagated outward and were recorded at synthetic receiver locations defined by the `STATIONS` file. 
- Each `STATION` therefore has a corresponding synthetic seismogram located in the *OUTPUT_FILES/* directory.
- We specified output in units of displacement with parameter `seismotype`, so our synthetics have the file extension `.semd`
- Here, 'd' stands for displacement. Velocity seismograms would have the extension `.semv`, and acceleration seismograms `.sema`. This is the same in SPECFEM3D
- Synthetic waveforms can be generated in a variety of formats. For simplicity we have chosen to output our synthetics in ASCII format. These ASCII files have two columns, representing time and amplitude, respectively.
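
The relationship between displacement and velocity seismograms can be illustrated numerically: velocity is the time derivative of displacement. The sketch below uses a made-up sine-wave trace in the two-column (time, amplitude) layout. It does not read any SPECFEM output, and it is not how the solver produces `.semv` files.

```python
import numpy as np

# Hypothetical displacement trace: a 2 Hz sine sampled every 0.0011 s
dt = 0.0011
time = np.arange(0, 2000) * dt
displacement = np.sin(2 * np.pi * 2.0 * time)

# Two-column layout (time, amplitude), as in the ASCII seismograms
trace = np.column_stack([time, displacement])

# Velocity is the time derivative of displacement
velocity = np.gradient(trace[:, 1], trace[:, 0])

print(f"peak displacement: {np.abs(trace[:, 1]).max():.3f}")
print(f"peak velocity: {np.abs(velocity).max():.3f}")
```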

In [None]:
# We have generated synthetics for each station location shown above
! ls OUTPUT_FILES/*.semd

In [None]:
# The first 10 lines of a seismogram show the two-column (time, amplitude) format
! head -10 OUTPUT_FILES/AA.S0001.BXX.semd

In [None]:
# We can easily plot these using NumPy and Matplotlib
data = np.loadtxt("OUTPUT_FILES/AA.S0001.BXX.semd", dtype=float)
plt.plot(data[:,0], data[:,1], c="k")
plt.title("AA.S0001.BXX.semd")
plt.xlabel("Time [s]")
plt.ylabel("Displacement [m]")

In [None]:
# SeisFlows also has a simple command line tool to plot seismograms using ObsPy
! seisflows plotst OUTPUT_FILES/AA.S0001.BXX.semd --savefig AA.S0001.BXX.semd.png
Image("AA.S0001.BXX.semd.png")

In [None]:
# We can use PySEP's record section (RecSec) tool to plot SPECFEM2D synthetics
# Because SPECFEM2D's SOURCE files don't contain origin time information, RecSec uses a dummy time
! recsec --syn_path OUTPUT_FILES/ --cmtsolution DATA/SOURCE --stations DATA/STATIONS --cartesian --overwrite
Image("record_section.png")

### c) SPECFEM2D Wavefield Snapshots 

- `xspecfem2D` generates snapshots of the forward wavefield. 
- These are automatically generated during a simulation as .jpg files
- The `Par_file` parameter `NTSTEP_BETWEEN_OUTPUT_IMAGES` controls how often it generates figures during a simulation. 
- We can see below that `NTSTEP_BETWEEN_OUTPUT_IMAGES`=100 and `DT`=0.0011, so images are output every 100 x 0.0011 = 0.11 s of simulation time. 
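
The arithmetic linking time steps to snapshot times can be written out as a short sketch. It uses the values quoted in this section (a time step of 0.0011 s and an image every 100 steps) and the file name pattern of the images shown in the following cells. Times are measured from the first time step and ignore any `USER_T0` offset.

```python
dt = 0.0011                          # DT from the Par_file
ntstep_between_output_images = 100   # NTSTEP_BETWEEN_OUTPUT_IMAGES

# Simulation time between consecutive snapshots
snapshot_interval = ntstep_between_output_images * dt

# File name and elapsed time for a few snapshots
snapshots = []
for step in (100, 400, 700):
    fname = f"forward_image{step:09d}.jpg"
    snapshots.append((fname, step * dt))
    print(f"{fname}: t = {step * dt:.2f} s")
```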

In [None]:
# Use sempar to display parameter values
! seisflows sempar -P DATA/Par_file ntstep_between_output_images
! seisflows sempar -P DATA/Par_file dt

In [None]:
# Wavefield snapshots every 100 time steps
! ls OUTPUT_FILES/*.jpg

In [None]:
# Forward wavefield at NSTEP=100, T=.11s
Image("OUTPUT_FILES/forward_image000000100.jpg")

In [None]:
# Forward wavefield at NSTEP=400, T=.44s
Image("OUTPUT_FILES/forward_image000000400.jpg")

In [None]:
# Forward wavefield at NSTEP=700, T=.77s
Image("OUTPUT_FILES/forward_image000000700.jpg")

## 6) Conclusions

- In this notebook we explored SPECFEM2D, and learned to run the default example mesh generation and forward simulation.  
- We took a look at the most important files required for a simulation, and how Users can manipulate various parameters and files to run their own simulations. 
- We had a look at the results of a SPECFEM2D simulation, including waveforms, models, and wavefield snapshots.