<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Overview" data-toc-modified-id="Overview-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Overview</a></span></li><li><span><a href="#Parameter-settings" data-toc-modified-id="Parameter-settings-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Parameter settings</a></span></li><li><span><a href="#PFLOTRAN-preparation-&amp;-spin-up" data-toc-modified-id="PFLOTRAN-preparation-&amp;-spin-up-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>PFLOTRAN preparation &amp; spin-up</a></span><ul class="toc-item"><li><span><a href="#Generate-PFLOTRAN.in-and-parameter.h5" data-toc-modified-id="Generate-PFLOTRAN.in-and-parameter.h5-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Generate <code>PFLOTRAN.in</code> and <code>parameter.h5</code></a></span></li><li><span><a href="#Model-spin-up" data-toc-modified-id="Model-spin-up-3.2"><span class="toc-item-num">3.2&nbsp;&nbsp;</span>Model spin-up</a></span></li></ul></li><li><span><a href="#DART-files-preparation" data-toc-modified-id="DART-files-preparation-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>DART files preparation</a></span><ul class="toc-item"><li><span><a href="#Generate-the-templates-for-DART-generic-variable-quantity-files" data-toc-modified-id="Generate-the-templates-for-DART-generic-variable-quantity-files-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Generate the templates for DART generic variable quantity files</a></span></li><li><span><a href="#Generate-prior.nc-from-the-model-spin-up" data-toc-modified-id="Generate-prior.nc-from-the-model-spin-up-4.2"><span class="toc-item-num">4.2&nbsp;&nbsp;</span>Generate <code>prior.nc</code> from the model spin-up</a></span></li><li><span><a href="#Generate-input.nml" data-toc-modified-id="Generate-input.nml-4.3"><span class="toc-item-num">4.3&nbsp;&nbsp;</span>Generate <code>input.nml</code></a></span></li><li><span><a href="#Convert-observation-to-DART-observation-format" 
data-toc-modified-id="Convert-observation-to-DART-observation-format-4.4"><span class="toc-item-num">4.4&nbsp;&nbsp;</span>Convert observation to DART observation format</a></span></li><li><span><a href="#Run-check_model_mod" data-toc-modified-id="Run-check_model_mod-4.5"><span class="toc-item-num">4.5&nbsp;&nbsp;</span>Run <code>check_model_mod</code></a></span></li></ul></li><li><span><a href="#(TODO)-Run-DART-and-PFLOTRAN" data-toc-modified-id="(TODO)-Run-DART-and-PFLOTRAN-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>(TODO) Run DART and PFLOTRAN</a></span></li></ul></div>

# Overview
- The objective of this notebook.
- Data Assimilation Research Testbed (DART) background
- Include the links for the below sections

# Parameter settings

In [59]:
import re
import pickle

In [60]:
# Directories
obs_kind_dir    = '../obs_kind/'
obs_type_dir    = '../obs_type/'
obs_convert_dir = '../obs_converter/'
utils_dir       = '../utils/'
work_dir        = '../work/'
pflotran_in_dir = '../pflotran_input/'
pflotran_out_dir= '../pflotran_output/'
dart_data_dir   = '../dart_inout/'

# DART file names
def_obs_kind   = obs_kind_dir+'DEFAULT_obs_kind_mod.f90'
obs_type       = obs_type_dir+'obs_def_pflotran_mod.f90'
input_nml      = work_dir+'input.nml'
input_nml_dict = work_dir+'inputnml.p'

# PFLOTRAN file names
pflotran_sh   = utils_dir+'pflotran.sh'
pflotran_exe  = '/Users/jian449/Codes/pflotran/src/pflotran/pflotran'
pflotran_in   = pflotran_in_dir+'pflotran.in'
pflotran_para = pflotran_in_dir+'parameter.h5'
pflotran_out  = pflotran_out_dir+'R[ENS].h5'

# Data file names, including observations and model input/output
obs_original = pflotran_in_dir+'temperature.csv'
obs_nc       = pflotran_in_dir+'obs_pflotran.nc'
obs_dart     = dart_data_dir+'obs_seq_pflotran.out'
dart_prior_nc= dart_data_dir+'prior_R[ENS].nc'

# Some shell scripts or executable files
compile_convert_nc      = work_dir+'dart_seq_convert.csh'
compile_model_check_mod = work_dir+'check_model_mod.csh'
run_filter              = work_dir+'run_filter.csh'
advance_model           = work_dir+'advance_model.csh'
convert_nc              = work_dir+'convert_nc'

# MPI settings
mpi_exe = '/usr/local/bin/mpirun'
ncore   = 1

# Utility file names
csv_to_nc          = utils_dir+'csv2nc.py'
to_dartqty         = utils_dir+'list2dartqty.py'
obs_type_qty_ind   = utils_dir+'obs_type_qty_ind.txt'
prep_pflotran_in   = utils_dir+'prepare_pflotran_inpara.py'
prep_convert_nc    = utils_dir+'prepare_convert_nc.py'
prep_prior_nc      = utils_dir+'prepare_prior_nc.py'
prep_inputnml      = utils_dir+'prepare_input_nml.py'
prep_convertncnml  = utils_dir+'prepare_convertnc_nml.py'

In [61]:
# Data assimilation configurations
# More settings need to be added...
# Later on, these DA settings can be saved in a txt or pickle file for further loading
obs_timestep = 300.0  # observation timestep in seconds
obs_error    = 0.1    # observation error
spinup       = 1      # whether to run the model spin-up
nens         = 30     # number of ensemble members

# Specify PFLOTRAN variables used as observation and state vector/parameters in DART
obs_var_set = ['TEMPERATURE']
para_set    = ['FLOW_FLUX','POROSITY','THERMAL_CONDUCTIVITY']
# state_set   = ['LIQUID_SATURATION','LIQUID_PRESSURE', 'TEST_VARIABLE', 'TEST_VARIABLEAAA']
pflotran_parastate_set = obs_var_set + para_set
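
The comment above mentions saving the DA settings to a file for later loading. A minimal sketch of one way to do that with `pickle` follows; the dictionary layout, the helper names `save_settings`/`load_settings`, and the file name `da_settings.p` are assumptions for illustration and are not part of the existing utilities.

```python
import os
import pickle
import tempfile

# Hypothetical container mirroring the settings defined in the cell above
da_settings = {
    "obs_timestep": 300.0,
    "obs_error": 0.1,
    "spinup": 1,
    "nens": 30,
    "obs_var_set": ["TEMPERATURE"],
    "para_set": ["FLOW_FLUX", "POROSITY", "THERMAL_CONDUCTIVITY"],
}

def save_settings(settings, path):
    """Write the settings dictionary to a pickle file."""
    with open(path, "wb") as f:
        pickle.dump(settings, f)

def load_settings(path):
    """Read the settings dictionary back from a pickle file."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Round trip through a temporary directory (file name is illustrative)
settings_file = os.path.join(tempfile.mkdtemp(), "da_settings.p")
save_settings(da_settings, settings_file)
restored = load_settings(settings_file)
```

In the notebook itself, the file would more naturally live under `work_dir`, next to `inputnml.p`.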

# PFLOTRAN preparation & spin-up
**Here, we use Kewei's 1D thermal model as an example for generating PFLOTRAN input card and parameter.h5.**

In this section, the following procedures are conducted for PFLOTRAN:
- generate ```PFLOTRAN.in``` (for 1D thermal model, check Kewei's code)
- generate the parameter file in HDF5 format, ```parameter.h5``` (for 1D thermal model, check Kewei's code)
- conduct model spin-up and generate PFLOTRAN output in HDF5 format, ```PFLOTRAN.h5```

## Generate ```PFLOTRAN.in``` and ```parameter.h5```
- Run: ```prepare_pflotran_inpara.py```
- Code input arguments:
    - filename for ```pflotran.in```
    - filename for ```parameter.h5```
    - data assimilation settings (**to be revised later on**): observation timestep, observation error, number of ensemble members, and whether to run the spin-up

In [62]:
%%script bash -s "$prep_pflotran_in" "$pflotran_in" "$pflotran_para" "$obs_timestep" "$obs_error" "$nens" "$spinup"
python $1 $2 $3 $4 $5 $6 $7

Finished generating the input card for PFLOTRAN...
Finished generating the DBASE for PFLOTRAN...
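
The cells in this notebook pass notebook variables to scripts as positional arguments through `%%script bash`. As a rough Python-only equivalent, the same call could be assembled with `subprocess`; the helper `build_command` below is hypothetical, and the argument order simply mirrors the cell above.

```python
import subprocess
import sys

def build_command(script, *args, python=sys.executable):
    """Return an argument list: interpreter, script, then each argument as a string."""
    return [python, script] + [str(a) for a in args]

# Same positional order as the cell above:
# pflotran.in, parameter.h5, obs_timestep, obs_error, nens, spinup
cmd = build_command(
    "../utils/prepare_pflotran_inpara.py",
    "../pflotran_input/pflotran.in",
    "../pflotran_input/parameter.h5",
    300.0, 0.1, 30, 1,
)
print(cmd)

# Uncomment to actually run the script; check=True raises on a non-zero exit code
# subprocess.run(cmd, check=True)
```

Passing a list (rather than one shell string) avoids word splitting when an argument contains spaces.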


## Model spin-up
Take the ```pflotran.in``` and ```parameter.h5``` files and conduct the model spin-up by running ```pflotran.sh```, a simple shell script that runs the PFLOTRAN ensemble simulation using MPI.

In [63]:
%%script bash -s "$pflotran_sh" "$pflotran_exe" "$pflotran_in" "$pflotran_out_dir" "$nens" "$mpi_exe" "$ncore"
echo $3
$1 $2 $3 $4 $5 $6 $7

../pflotran_input/pflotran.in
 here
          30
------------------------------ Provenance --------------------------------------
pflotran_compile_date_time = unknown
pflotran_compile_user = unknown
pflotran_compile_hostname = unknown
pflotran_changeset = unknown
pflotran_status = unknown
petsc_changeset = unknown
petsc_status = unknown
--------------------------------------------------------------------------------
 "grid_structured_type" set to default value.
 Opening hdf5 file: ../pflotran_input/parameter.h5
 pflotran card:: TIMESTEPPER
 pflotran card:: TIMESTEPPER
 pflotran card:: NEWTON_SOLVER
 pflotran card:: LINEAR_SOLVER
 pflotran card:: NEWTON_SOLVER
 pflotran card:: LINEAR_SOLVER
 pflotran card:: GRID
 pflotran card:: FLUID_PROPERTY
 "FLUID_PROPERTY,diffusion_coeffient units" set to default value.
 pflotran card:: MATERIAL_PROPERTY
   Name :: Alluvium
 "MATERIAL_PROPERTY,rock density units" set to default value.
 "MATERIAL_PROPERTY,specific heat units" set to default value.
 

# DART files preparation
In this section, the following procedures are conducted for DART:
- generate the template for DART generic variable quantity files (i.e., ```DEFAULT_obs_kind_mod.F90``` and ```obs_def_pflotran_mod.f90```);
- run ```preprocess``` to generate the rest of DART generic variable quantity files;
- convert the observation file to DART observation format;
- generate ```prior.nc``` at the first time step;
- generate ```input.nml```;
- conduct ```check_model_mod```.

## Generate the templates for DART generic variable quantity files
- Run: ```list2dartqty.py``` to sequentially generate
    - a mapping between PFLOTRAN variables and DART generic quantities in ```obs_def_pflotran_mod.F90```
    - the default DART generic quantity definition file ```DEFAULT_obs_kind_mod.F90```
- Code input arguments:
    - filename for ```obs_def_pflotran_mod.F90```
    - filename for ```DEFAULT_obs_kind_mod.F90```
    - a list of variables required to be assimilated

In [80]:
%%script bash -s "$to_dartqty" "$obs_type" "$def_obs_kind" "$pflotran_parastate_set"
python $1 $2 $3 $4

No new DART variable quantity is added...
Finished generating the ../obs_kind/DEFAULT_obs_kind_mod.f90...
Finished generating the ../obs_type/obs_def_pflotran_mod.f90...
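
To illustrate the mapping, the sketch below builds DART quantity names from PFLOTRAN variable names using the `QTY_PFLOTRAN_` prefix that this notebook uses for `var_qtynames` in `model_nml`. The helper `to_dart_qty` is hypothetical; the actual logic in `list2dartqty.py` may differ.

```python
def to_dart_qty(var_names, prefix="QTY_PFLOTRAN_"):
    """Map each PFLOTRAN variable name to a DART generic quantity name."""
    return {v: prefix + v for v in var_names}

# Same variables as obs_var_set + para_set in the parameter settings
pflotran_vars = ["TEMPERATURE", "FLOW_FLUX", "POROSITY", "THERMAL_CONDUCTIVITY"]
qty_map = to_dart_qty(pflotran_vars)

for var, qty in qty_map.items():
    print(var, "->", qty)
```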


## Generate ```prior.nc``` from the model spin-up
- The structure of ```prior_R[ENS].nc``` file (```[ENS]``` refers to the ensemble number):

| NetCDF dimensions |                      NetCDF variables                      |
|:-----------------:|:----------------------------------------------------------:|
| time: 1           | time: shape(time)                                          |
| x_location: nx    | x_location: shape(x_location)                              |
| y_location: ny    | y_location: shape(y_location)                              |
| z_location: nz    | z_location: shape(z_location)                              |
| member: 1         | member: shape(member)                                      |
|                   | physical variable: shape(x_location,y_location,z_location) |

**Note that**, as required by DART, each ```prior_R[ENS].nc``` file includes the state/parameter values of only one ensemble member at one given time. The PFLOTRAN model time starts from an initial time of 0 and is converted to *days* (as required by DART's ```read_model_time``` subroutine). Also, this structure differs from that of the [observation NetCDF](#observationconvertion), because ```prior_R[ENS].nc``` targets structured Cartesian grids while the observation NetCDF targets the general case.
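
The layout in the table and the time conversion can be sketched in plain Python as follows. This only describes dimensions and shapes and does not write a NetCDF file. The helper names, the grid size `nz = 64`, and the assumption that the source time is in seconds are all illustrative.

```python
SECONDS_PER_DAY = 86400.0

def seconds_to_days(t_seconds):
    """Convert a model time in seconds (assumed source unit) to days."""
    return t_seconds / SECONDS_PER_DAY

def prior_layout(nx, ny, nz, variables):
    """Return the dimension sizes and variable shapes of one prior file."""
    dims = {"time": 1, "x_location": nx, "y_location": ny,
            "z_location": nz, "member": 1}
    # Each coordinate variable is 1D along its own dimension
    shapes = {name: (size,) for name, size in dims.items()}
    # Each physical variable is defined on the full (x, y, z) grid
    for v in variables:
        shapes[v] = (nx, ny, nz)
    return dims, shapes

# Illustrative 1D column: one cell in x and y, 64 cells in z (assumed size)
dims, shapes = prior_layout(1, 1, 64, ["TEMPERATURE", "POROSITY"])
print(dims)
print(shapes["TEMPERATURE"])
print(seconds_to_days(43200.0))
```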

- Run: ```prepare_prior_nc.py``` to generate 
    - the prior input file ```prior_R[ENS].nc``` for DART
    - the prior template file (copied from ```prior_R1.nc```) for ```input.nml```
- Code input arguments:
    - filename ```R[ENS].h5``` from PFLOTRAN model output
    - PFLOTRAN parameter HDF5 file ```parameter.h5```
    - filename ```prior_R[ENS].nc``` for the prior input file for DART
    - number of ensemble members
    - a list of variables to be assimilated
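
The ```[ENS]``` placeholder in the file names is replaced by the ensemble member number. A minimal sketch of that expansion, using the same `re.sub` pattern this notebook applies when building the template file name, is shown below; the helper `expand_ens` is hypothetical, and numbering from 1 follows the ```prior_R1.nc``` name mentioned above.

```python
import re

def expand_ens(pattern, nens):
    """Replace the literal [ENS] placeholder with member numbers 1..nens."""
    return [re.sub(r"\[ENS\]", str(i), pattern) for i in range(1, nens + 1)]

# Three members only, to keep the printed list short
prior_files = expand_ens("../dart_inout/prior_R[ENS].nc", 3)
print(prior_files)

# The template file name is built the same way, with a fixed label
template_file = re.sub(r"\[ENS\]", "template", "../dart_inout/prior_R[ENS].nc")
print(template_file)
```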

In [91]:
%%script bash -s "$prep_prior_nc" "$pflotran_out" "$pflotran_para" "$dart_prior_nc" "$nens" "$pflotran_parastate_set"
python $1 $2 $3 $4 $5 $6

Converting state/parameter into NetCDF file for ensemble 1...
Converting state/parameter into NetCDF file for ensemble 2...
Converting state/parameter into NetCDF file for ensemble 3...
Converting state/parameter into NetCDF file for ensemble 4...
Converting state/parameter into NetCDF file for ensemble 5...
Converting state/parameter into NetCDF file for ensemble 6...
Converting state/parameter into NetCDF file for ensemble 7...
Converting state/parameter into NetCDF file for ensemble 8...
Converting state/parameter into NetCDF file for ensemble 9...
Converting state/parameter into NetCDF file for ensemble 10...
Converting state/parameter into NetCDF file for ensemble 11...
Converting state/parameter into NetCDF file for ensemble 12...
Converting state/parameter into NetCDF file for ensemble 13...
Converting state/parameter into NetCDF file for ensemble 14...
Converting state/parameter into NetCDF file for ensemble 15...
Converting state/parameter into NetCDF file for ensemble 16...
C

## Generate ```input.nml```
Provide the parameters for the following namelists in ```input.nml```:
- [filter_nml](https://www.image.ucar.edu/DAReS/DART/Manhattan/assimilation_code/modules/observations/obs_sequence_mod.html): namelist of the main module for driving ensemble filter assimilations
- [obs_kind_nml](https://www.image.ucar.edu/DAReS/DART/Manhattan/assimilation_code/modules/observations/obs_kind_mod.html#Namelist): namelist for controlling what observation types are to be assimilated
- [preprocess_nml](file:///Users/jian449/Codes/DART/manhattan/assimilation_code/programs/preprocess/preprocess): namelist of the DART-supplied preprocessor program which creates observation kind and observation definition modules from a set of other specially formatted Fortran 90 files
- model_nml: a self-defined namelist for providing the basic information in the model
- convertnc_nml: a self-defined namelist for providing the NetCDF observation file name and the DART observation file name used in ```convert_nc.f90```

***************
Assemble all the namelists in input.nml

In [93]:
# Parameters for different namelists in input.nml
filter_nml = {"input_state_file_list":"filter_input_list.txt",
              "output_state_file_list":"filter_output_list.txt",
              "ens_size":nens,
              "async":2,
              "adv_ens_command":advance_model,
              "obs_sequence_in_name":obs_dart}
obs_kind_nml = {"assimilate_these_obs_types":obs_var_set}
model_nml = {"time_step_days":0,
             "time_step_seconds":0,
             "nvar":len(pflotran_parastate_set),
             "var_names":pflotran_parastate_set,
             "template_file":re.sub(r"\[ENS\]",'template',dart_prior_nc),
             "var_qtynames":['QTY_PFLOTRAN_'+v for v in pflotran_parastate_set]}
preprocess_nml = {"input_files":obs_type,
                  "input_obs_kind_mod_file":def_obs_kind}
convertnc_nml = {"netcdf_file": obs_nc,
                 "out_file": obs_dart}
inputnml = {"filter_nml":filter_nml,
            "obs_kind_nml":obs_kind_nml,
            "model_nml":model_nml,
            "preprocess_nml":preprocess_nml,
            "convertnc_nml":convertnc_nml}

# Save it in a temporary pickle file
with open(input_nml_dict, 'wb') as f:
    pickle.dump(inputnml, f)

***************
- Run: ```prepare_input_nml.py```
- Code input arguments:
    - input_nml: the ```input.nml``` namelist file
    - input_nml_dict: the ```inputnml.p``` pickle file

In [94]:
%%script bash -s  "$prep_inputnml" "$input_nml" "$input_nml_dict"
python $1 $2 $3

Finished generating the input namelist file...
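
For orientation, the sketch below shows how a nested dictionary like `inputnml` could be rendered as Fortran namelist text. It is an illustration only: the helpers `format_value` and `render_namelist` are hypothetical, and ```prepare_input_nml.py``` may well work differently (for example, by editing an existing template).

```python
def format_value(value):
    """Format a Python value as a Fortran namelist value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return ".true." if value else ".false."
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, (list, tuple)):
        return ", ".join(format_value(v) for v in value)
    return "'{}'".format(value)  # everything else is written as a quoted string

def render_namelist(name, entries):
    """Render one namelist group: '&name', one 'key = value' per line, then '/'."""
    lines = ["&" + name]
    for key, value in entries.items():
        lines.append("   {} = {}".format(key, format_value(value)))
    lines.append("/")
    return "\n".join(lines)

# A small subset of the dictionary assembled above
example = {
    "filter_nml": {"ens_size": 30, "async": 2,
                   "obs_sequence_in_name": "../dart_inout/obs_seq_pflotran.out"},
    "obs_kind_nml": {"assimilate_these_obs_types": ["TEMPERATURE"]},
}
text = "\n\n".join(render_namelist(n, e) for n, e in example.items())
print(text)
```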


<a id='observationconvertion'></a>
## Convert observation to DART observation format
In this section, the observation data are converted into DART format. We first convert the raw observation data into NetCDF format, and then convert the NetCDF file into DART format. The NetCDF observation file has the following structure:

| NetCDF dimensions |             NetCDF variables            |
|:-----------------:|:---------------------------------------:|
| time: 1           | time: shape(time)                       |
| location: nloc    | location: shape(location)               |
|                   | physical variable: shape(time,location) |

**Note that** if the time calendar follows *gregorian*, the time unit should be entered as ```seconds since YYYY-MM-DD HH:MM:SS```. Otherwise, set the time calendar to *None* and the time unit to ```second``` (make sure to convert your measurement times to seconds).
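
As a small example of the *gregorian* case, the sketch below builds the unit string and converts timestamps to seconds since a reference time using the standard library. The reference date and the helper names are assumptions; the 300-second spacing matches `obs_timestep` in the parameter settings.

```python
from datetime import datetime

def time_units(reference):
    """Build the unit string 'seconds since YYYY-MM-DD HH:MM:SS'."""
    return reference.strftime("seconds since %Y-%m-%d %H:%M:%S")

def to_seconds(timestamps, reference):
    """Convert datetimes to elapsed seconds since the reference time."""
    return [(t - reference).total_seconds() for t in timestamps]

# Hypothetical reference time and two measurements 300 s apart
reference = datetime(2017, 4, 1, 0, 0, 0)
measurements = [datetime(2017, 4, 1, 0, 0, 0), datetime(2017, 4, 1, 0, 5, 0)]

units = time_units(reference)
elapsed = to_seconds(measurements, reference)
print(units)
print(elapsed)
```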

***************
- Run: ```csv2nc.py``` to convert the raw csv temperature observations to NetCDF file
- Code input arguments:
    - filename for the original observed temperature file
    - filename for the observation NetCDF file

In [68]:
%%script bash -s "$csv_to_nc" "$obs_original" "$obs_nc"
python $1 $2 $3

Finished converting raw observation in NetCDF format...


***************
- Run: ```prepare_convert_nc.py``` to prepare the ```convert_nc.f90``` based on the list of observation variables.
- Code input arguments:
    - filename for the observation NetCDF file

In [69]:
%%script bash -s "$prep_convert_nc" "$obs_nc"
python $1 $2

***************
- Run shell script: ```dart_seq_convert.csh``` to
    - preprocess the DART generic variable quantity files prepared in the previous section
    - generate an executable file for converting the observation file from NetCDF format to DART format, used in the next step

In [72]:
%%script bash -s "$work_dir" "$compile_convert_nc"
cd $1
csh $2

---------------------------------------------------------------
Removing *.o *.mod files


---------------------------------------------------------------
NetCDF converters build number 1 is preprocess
 Makefile is ready.
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/modules/utilities/types_mod.f90
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/modules/utilities/utilities_mod.f90
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/modules/utilities/time_manager_mod.f90
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/modules/utilities/null_mpi_utilities_mod.f90
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/programs/preprocess/preprocess.f90
gfortran null_mpi_utilities_mod.o ut

.........................................

***************
- Run: ```convert_nc``` to convert the observation file in NetCDF to DART format

In [17]:
%%script bash -s "$obs_convert_dir" "$convert_nc"
cd $1
$2


 --------------------------------------
 Starting ... at YYYY MM DD HH MM SS = 
                 2019  9 19 13 25 31
 Program convert_nc
 --------------------------------------

  set_nml_output Echo NML values to log file only
  write_obs_seq  opening formatted observation sequence file "../dart_inout/obs_seq_pflotran.out"
  write_obs_seq  closed observation sequence file "../dart_inout/obs_seq_pflotran.out"

 --------------------------------------
 Finished ... at YYYY MM DD HH MM SS = 
                 2019  9 19 13 25 33
 --------------------------------------



## Run ```check_model_mod```
- Run shell script: ```check_model_mod.csh``` to check the ```model_mod.F90``` interface

In [92]:
%%script bash -s "$work_dir" "$compile_model_check_mod"
cd $1
csh $2



---------------------------------------------------------------
pflotran build number 1 is preprocess
 Makefile is ready.
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/modules/utilities/types_mod.f90
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/modules/utilities/utilities_mod.f90
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/modules/utilities/time_manager_mod.f90
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/modules/utilities/null_mpi_utilities_mod.f90
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/programs/preprocess/preprocess.f90
gfortran types_mod.o utilities_mod.o null_mpi_utilities_mod.o time_manager_mod.o preprocess.o -o preprocess  -O2 -ffree-line-length-no

...................................................

# (TODO) Run DART and PFLOTRAN
In this section, we run the shell script ```run_filter.csh``` to couple DART and PFLOTRAN.

In [None]:
%%script bash -s "$work_dir" "$run_filter"
cd $1
csh $2