# Working Directory Structure

SeisFlows3 hardcodes its own working directory structure when executing a workflow. Below we explore the working directory set up by the SPECFEM2D-workstation example. Working directories may differ slightly depending on the chosen workflow, but will broadly follow the structure shown here. Any specfem2d directory listed below (here, specfem2d_workdir) is not part of the SeisFlows3 working directory.

In [23]:
%cd ~/Work/official/workshop_pyatoa_sf3/ex1_specfem2d_workstation
! ls

/home/bchow/Work/official/workshop_pyatoa_sf3/ex1_specfem2d_workstation
logs	output_sf3.txt	 scratch	    stats
output	parameters.yaml  specfem2d_workdir
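
For working directories that differ from this example, a short helper can print the layout to a fixed depth. The following is a minimal sketch using only the standard library; the function name `list_tree` is hypothetical and is not part of SeisFlows3.

```python
from pathlib import Path


def list_tree(root, max_depth=2, _depth=0):
    """Return indented entry names found below `root`, down to `max_depth` levels."""
    lines = []
    for entry in sorted(Path(root).iterdir()):
        suffix = "/" if entry.is_dir() else ""
        lines.append("    " * _depth + entry.name + suffix)
        # Descend only while the next level is still within max_depth
        if entry.is_dir() and _depth + 1 < max_depth:
            lines.extend(list_tree(entry, max_depth, _depth + 1))
    return lines


if __name__ == "__main__":
    # Run from inside the working directory to see its top two levels
    print("\n".join(list_tree(".", max_depth=2)))
```

Directories are marked with a trailing slash so they can be told apart from files such as parameters.yaml.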


----------------------
## scratch/
The active working directory of SeisFlows3, where all of the heavy lifting takes place. Each module in the SeisFlows3 package may have its own sub-directory where it stores temporary work data. Additionally, there are two eval*/ directories, which store the files for objective function evaluation (evalfunc) and gradient evaluation (evalgrad).

In [24]:
! ls scratch

evalfunc  evalgrad  optimize  preprocess  solver  system


### solver/

A collection of event-specific directories (one directory per event), each of which is a self-contained SPECFEM run directory (i.e., it contains all the files necessary to run the SPECFEM binaries).

In [25]:
! ls scratch/solver

001  002  003  mainsolver


In [26]:
! ls scratch/solver/mainsolver

bin  DATA  kernel_paths  mesher.log  OUTPUT_FILES  SEM	solver.log  traces


The bin/, DATA/ and OUTPUT_FILES/ directories are the same as those found in SPECFEM. SEM defines the location of the adjoint sources, which is dictated by SPECFEM. The traces/ directory contains all of the waveforms required by this event, separated into observed (obs), synthetic (syn) and adjoint (adj) waveforms.

In [27]:
! ls scratch/solver/mainsolver/traces

adj  obs  syn


In [28]:
! ls scratch/solver/mainsolver/traces/obs

AA.S0001.BXY.semd


In [33]:
# These waveforms are saved in a two-column ASCII format
! tail scratch/solver/mainsolver/traces/obs/AA.S0001.BXY.semd

   251.39999999999998         -1.1814422395268879E-005
   251.45999999999998         -1.1800275583562581E-005
   251.51999999999998         -1.1769315129746346E-005
   251.57999999999998         -1.1721248953632887E-005
   251.63999999999999         -1.1655830825336088E-005
   251.69999999999999         -1.1572872866742356E-005
   251.75999999999999         -1.1472248505521453E-005
   251.81999999999999         -1.1353902449899163E-005
   251.88000000000000         -1.1217847351013855E-005
   251.94000000000000         -1.1064166223014224E-005
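
The two-column layout above can be read directly with NumPy. The following is a minimal sketch; the function name `read_two_column_ascii` is hypothetical, and it assumes the first column is time and the second is amplitude, as the values above suggest. The demo parses three of the samples shown above from an in-memory string so that it runs without the example directory.

```python
import io
import numpy as np


def read_two_column_ascii(source):
    """Load a two-column (time, amplitude) ASCII trace into two 1-D arrays."""
    data = np.loadtxt(source, ndmin=2)
    return data[:, 0], data[:, 1]


# Demo on the last three samples printed above. For a real trace, pass a path
# such as "scratch/solver/mainsolver/traces/obs/AA.S0001.BXY.semd" instead.
sample = io.StringIO(
    "   251.81999999999999         -1.1353902449899163E-005\n"
    "   251.88000000000000         -1.1217847351013855E-005\n"
    "   251.94000000000000         -1.1064166223014224E-005\n"
)
times, amplitudes = read_two_column_ascii(sample)
dt = times[1] - times[0]  # sampling interval inferred from the time column
print(len(times), round(dt, 6))
```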


### optimize/

Values relating to the optimization algorithm. These variables define model vectors, misfits, gradient directions and search directions. Optimization vectors are stored as NumPy arrays and tagged with the .npy suffix. Optimization scalars are stored as text files and tagged with the .txt suffix.

The files in this directory are named after the optimization variables they store:

In [41]:
! ls scratch/optimize

alpha.npy  f_old.txt  g_old.npy  m_new.npy  p_old.npy
f_new.txt  f_try.txt  LBFGS	 m_old.npy


In [43]:
import numpy as np
m_new = np.load("scratch/optimize/m_new.npy")
print(m_new)

[5800.         5800.         5800.         ... 3499.77655379 3499.9021825
 3499.99078301]


In [45]:
! cat scratch/optimize/f_new.txt

2.591424e-03
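
Because vectors and scalars follow the .npy and .txt conventions described above, the whole directory can be gathered into one dictionary. The following is a sketch under that assumption; `load_optimize_state` is a hypothetical helper, not a SeisFlows3 function, and it assumes each .txt file holds a single number, as f_new.txt does above.

```python
from pathlib import Path
import numpy as np


def load_optimize_state(directory):
    """Collect .npy vectors and .txt scalars from `directory` into a dict."""
    state = {}
    for path in sorted(Path(directory).iterdir()):
        if path.suffix == ".npy":
            state[path.stem] = np.load(path)
        elif path.suffix == ".txt":
            # Assumes the file holds exactly one number
            state[path.stem] = float(path.read_text().strip())
        # Any other entry (e.g., LBFGS) is skipped
    return state


# Runs only when executed from inside the example working directory
if Path("scratch/optimize").is_dir():
    state = load_optimize_state("scratch/optimize")
    print(sorted(state))
```

Keys are the file names without their suffix, so `state["m_new"]` is the array loaded above and `state["f_new"]` is the scalar printed by `cat`.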


### evalfunc/ & evalgrad/

Scratch directories containing objective function evaluation and gradient evaluation files. These include (1) the current **model** being used for misfit evaluation, and (2) **residuals**, which define the misfit for each event. **evalgrad/** also contains **kernels**, the per-event kernels that are summed and manipulated by the postprocess module.

In [48]:
! ls scratch/evalfunc
! echo
! ls scratch/evalgrad

model  residuals

kernels  model	residuals


In [49]:
! ls scratch/evalgrad/residuals

001  002  003


In [55]:
! cat scratch/evalgrad/residuals/001

2.413801941841247842e-02
2.413801941841247842e-02
2.413801941841247842e-02
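
The per-event residual files can be loaded into a dictionary keyed by event name. The following is a minimal sketch that assumes one value per line, as in the output above; `load_residuals` is a hypothetical helper. It makes no claim about how SeisFlows3 combines these values into the scalar misfit.

```python
from pathlib import Path
import numpy as np


def load_residuals(directory):
    """Map each event name (e.g., "001") to a 1-D array of its residual values."""
    return {
        path.name: np.loadtxt(path, ndmin=1)
        for path in sorted(Path(directory).iterdir())
        if path.is_file()
    }


# Runs only when executed from inside the example working directory
if Path("scratch/evalgrad/residuals").is_dir():
    for event, values in load_residuals("scratch/evalgrad/residuals").items():
        print(event, values.size)
```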


In [56]:
! ls scratch/evalgrad/kernels

001  002  003  sum


In [57]:
! ls scratch/evalgrad/kernels/sum

proc000000_vp_kernel.bin  proc000000_vs_kernel.bin
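
The .bin files are binary and cannot be inspected with `cat`. The following sketch reads one with NumPy, under an assumption that is not stated in this document: that each file is a single Fortran unformatted record, i.e., a 4-byte integer byte count, then float32 values, then the same byte count repeated, all in native byte order. The function name `read_fortran_record` is hypothetical. If the two record markers disagree, the assumption does not hold for that file and an error is raised.

```python
import numpy as np


def read_fortran_record(path, dtype="float32"):
    """Read one Fortran unformatted record: int32 length, payload, int32 length."""
    itemsize = np.dtype(dtype).itemsize
    with open(path, "rb") as f:
        nbytes = int(np.fromfile(f, dtype="int32", count=1)[0])
        data = np.fromfile(f, dtype=dtype, count=nbytes // itemsize)
        trailer = int(np.fromfile(f, dtype="int32", count=1)[0])
    if trailer != nbytes:
        raise ValueError(f"record markers disagree: {nbytes} != {trailer}")
    return data
```

With the example above, a call such as `read_fortran_record("scratch/evalgrad/kernels/sum/proc000000_vs_kernel.bin")` would return the kernel values as a 1-D array, provided the layout assumption is correct.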


### system/ & preprocess/

These two directories are empty in our example problem, but they are catch-all directories where module-specific files can be written. If you are extending SeisFlows3 with other base classes or subclasses, it is preferable to adhere to this structure, in which each module only interacts with its own directory.

In [58]:
! ls scratch/system
! ls scratch/preprocess

---------------------
## output/
The current active state of SeisFlows3, containing pickle (.p) and JSON files which describe the Python environment of the current workflow. Additionally, files to be permanently saved (e.g., models, gradients, traces) are stored here. These are tagged in ascending order, e.g., model_0001 refers to the updated model derived during the first iteration.

In [38]:
! ls output

gradient_0001  seisflows_optimize.p	  seisflows_solver.p
kwargs	       seisflows_parameters.json  seisflows_system.p
model_0001     seisflows_paths.json	  seisflows_workflow.p
model_init     seisflows_postprocess.p
model_true     seisflows_preprocess.p


In [39]:
! ls output/model_0001

proc000000_vp.bin  proc000000_vs.bin


In [40]:
! ls output/gradient_0001

proc000000_vp_kernel.bin  proc000000_vs_kernel.bin


-----------------------------
## logs/
Where any text logs are stored. If running on a cluster, all submitted jobs will be instructed to write their logs into this directory. Additionally, if a workflow is resumed (i.e., previous log files already exist in the working directory), copies of those files are saved to this directory.


In [37]:
! ls logs

output_sf3_001.txt  parameters_001.yaml


------------------------------
## stats/

Text files describing the optimization statistics of the current workflow. This directory is only relevant if you are running an inversion workflow. 

In [6]:
! ls stats

factor.txt	      line_search.txt  slope.txt	theta.txt
gradient_norm_L1.txt  misfit.txt       step_count.txt
gradient_norm_L2.txt  restarted.txt    step_length.txt


In [7]:
! cat stats/step_count.txt

ITER          STEP_COUNT
   1        0.000000E+00
   1        2.000000E+00
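
A stats file like the one above can be read into arrays for plotting or further analysis. The following is a minimal sketch that assumes a one-line header followed by two numeric columns, as in step_count.txt; whether every file in stats/ shares this layout is an assumption. The function name `read_stats` is hypothetical, and the demo parses the output shown above from an in-memory string.

```python
import io
import numpy as np


def read_stats(source):
    """Return (iterations, values) from a two-column stats file with a header row."""
    table = np.loadtxt(source, skiprows=1, ndmin=2)
    return table[:, 0].astype(int), table[:, 1]


# Demo on the file contents shown above. For a real file, pass a path
# such as "stats/step_count.txt" instead.
sample = io.StringIO(
    "ITER          STEP_COUNT\n"
    "   1        0.000000E+00\n"
    "   1        2.000000E+00\n"
)
iterations, values = read_stats(sample)
print(iterations.tolist(), values.tolist())
```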


----------------------------
## output_sf3.txt

The main log file for SeisFlows3, where all log statements written to stdout are recorded during a workflow.

In [36]:
! head -50 output_sf3.txt

2022-04-29 16:45:35 | initializing SeisFlows3 in sys.modules
2022-04-29 16:45:39 | copying par/log file to: /home/bchow/Work/official/workshop_pyatoa_sf3/ex1_specfem2d_workstation/logs/output_sf3_001.txt
2022-04-29 16:45:39 | copying par/log file to: /home/bchow/Work/official/workshop_pyatoa_sf3/ex1_specfem2d_workstation/logs/parameters_001.yaml
2022-04-29 16:45:39 | exporting current working environment to disk
2022-04-29 16:45:39 | 
////////////////////////////////////////////////////////////////////////////////
                   WORKFLOW WILL STOP AFTER FUNC: 'finalize'                    
////////////////////////////////////////////////////////////////////////////////
2022-04-29 16:45:39 | 
                          STARTING INVERSION WORKFLOW                           
2022-04-29 16:45:39 | 
////////////////////////////////////////////////////////////////////////////////
                                ITERATION 1 / 1                                 
////////////////