# Overview
The **objective** of this notebook is to present the workflow for conducting data assimilation with [PFLOTRAN](https://www.pflotran.org/) using [DART](https://www.image.ucar.edu/DAReS/DART/). Briefly, the procedure is as follows:
- [x] [Configuration](#parameter): define directories, file locations, and other parameters
- [x] [PFLOTRAN preparation](#pflotran_prepare): generate PFLOTRAN input files
- [x] [PFLOTRAN model spin-up](#pflotran_spinup): conduct model spin-up
- [x] [DART files preparation](#dart_prepare): add new DART quantities, prepare DART input namelists, prepare DART prior data, prepare observations in DART format, and check ```model_mod``` interface
- [x] [Generate all the executable files](#dart_executables): generate all the executables, convert observations in DART format, check ```model_mod``` interface, and test the filter
- [ ] [Run DART and PFLOTRAN](#run_dart_pflotran): run the shell script for integrating DART filter and PFLOTRAN model

Here, we perform inverse modeling on a 1D thermal model for illustration. The model assimilates temperature observations to update its parameters (i.e., flow flux, porosity, and thermal conductivity). For now, the ensemble Kalman filter (EnKF) is used for assimilation.

<a id='parameter'></a>
# Configuration

In [118]:
import os
import re
import sys
import shutil
import pickle
import f90nml
from math import floor
from datetime import datetime, timedelta
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [119]:
# MPI settings
mpi_exe  = '/usr/local/bin/mpirun'  # The location of mpirun
ncore_da = 1                        # The number of MPI cores used for DART
ncore_pf = 1                        # The number of MPI cores used for PFLOTRAN

# PFLOTRAN executable
pflotran_exe  = '/Users/jian449/Codes/pflotran/src/pflotran/pflotran'

# Main directory names
temp_app_dir = os.path.abspath("../template" )          # The template for application folder
app_dir      = os.path.abspath("../1dthermal")          # The application folder name
dart_dir     = os.path.abspath("../../../../")
dart_pf_dir  = os.path.join(dart_dir, "models/pflotran")     # The DART PFLOTRAN utility folder name

# configs = {}
configs = f90nml.namelist.Namelist()
configs["main_dir_cfg"] = {"app_dir": app_dir, "dart_dir": dart_dir, "dart_pf_dir": dart_pf_dir}
configs["exe_cfg"]      = {"pflotran_exe": pflotran_exe, "mpi_exe": mpi_exe, 
                           "ncore_pf": ncore_pf, "ncore_da": ncore_da}

****************
**Prepare the application directory**

In [120]:
# Create the application directory if it does not exist
if not os.path.isdir(app_dir):
    shutil.copytree(temp_app_dir, app_dir)

****************
**Load all the required file paths/names from ```file_paths.nml```**

In [121]:
sys.path.append(dart_pf_dir)
from utils.read_filepaths_nml import read_filepaths_nml

In [122]:
dirs_cfg, files_cfg  = read_filepaths_nml(app_dir=app_dir, dart_pf_dir=dart_pf_dir)
configs["other_dir_cfg"] = dirs_cfg
configs["file_cfg" ]     = files_cfg
config_file              = files_cfg["config_file"]

****************
**Specify the observation data to be assimilated and the PFLOTRAN parameters to be analyzed in DART**

In [123]:
# Observation data to be assimilated
obs_set  = ['TEMPERATURE']
# The PFLOTRAN parameters to be analyzed
para_set = ['FLOW_FLUX','POROSITY','THERMAL_CONDUCTIVITY']

configs["obspara_set_cfg"] = {"obs_set": obs_set, "para_set": para_set}

****************
**Specify the temporal information**
- model spinup time/start time
- the mapping between the start of observation assimilation and the model start time

**Note that** the model start time is counted from the end of the spinup.

In [124]:
time_cfg = {}
# Model spinup length
time_cfg["spinup_length"]  = 0.5    # spinup time (day)
time_cfg["is_spinup_done"] = False  # whether spinup is conducted

# Model start time
time_cfg["current_model_time"] = 0.     # model start time zero (after spinup)

# Map between assimilation start time and model start time
assim_start = datetime(2017,4,1,0,0,0)
time_cfg["assim_start"] = assim_start.strftime("%Y-%m-%d %H:%M:%S")

# The maximum time for observation
time_cfg["last_obs_time_days"]    = 0
time_cfg["last_obs_time_seconds"] = 1200*7
time_cfg["last_obs_time_size"] = time_cfg["last_obs_time_days"]+float(time_cfg["last_obs_time_seconds"])/86400. # day

# Save them to configs
configs["time_cfg"] = time_cfg
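The mapping between model time and calendar time configured above can be made concrete with a short sketch. The helper `model_time_to_datetime` is hypothetical and not part of the notebook's utilities; it only assumes that model time is counted in days from `assim_start`.

```python
from datetime import datetime, timedelta

# Values copied from the configuration cell above
assim_start = datetime(2017, 4, 1, 0, 0, 0)
last_obs_time_days = 0
last_obs_time_seconds = 1200 * 7


def model_time_to_datetime(model_time_days, reference=assim_start):
    """Convert model time (days after spinup) to a calendar time.

    Hypothetical helper: assumes model time zero coincides with `reference`.
    """
    return reference + timedelta(days=model_time_days)


# Last observation time expressed in fractional days, as in time_cfg
last_obs_time_size = last_obs_time_days + last_obs_time_seconds / 86400.0

print(model_time_to_datetime(0.0))                 # start of assimilation
print(model_time_to_datetime(last_obs_time_size))  # time of the last observation
```

With the settings above, the last observation falls 8400 seconds (2 hours 20 minutes) after the start of assimilation.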

****************
**Define the data assimilation configurations**

In [125]:
da_cfg = {}
# More need to be added...
# And later on, these DA settings can be saved in a txt or pickle file for further loading
da_cfg["obs_reso"]  = 300.0  # second
da_cfg["obs_error"] = 0.1    # observation error
da_cfg["nens"]      = 30     # number of ensemble members

# Assimilation window (passed to model_nml as time_step_days + time_step_seconds)
da_cfg["assim_window_days"]    = 0     # assimilation time window/step (day)
da_cfg["assim_window_seconds"] = 1200  # assimilation time window/step  (second)
da_cfg["assim_window_size"] = da_cfg["assim_window_days"]+float(da_cfg["assim_window_seconds"])/86400. # day

# Assimilation start and end time
da_cfg["assim_start_days"]    = int(floor(time_cfg["current_model_time"]))
da_cfg["assim_start_seconds"] = int((time_cfg["current_model_time"] - da_cfg["assim_start_days"])*86400)
da_cfg["assim_end_days"]      = int(floor(time_cfg["current_model_time"]+da_cfg["assim_window_size"]))
da_cfg["assim_end_seconds"]   = int((time_cfg["current_model_time"]+da_cfg["assim_window_size"] - 
                                     da_cfg["assim_end_days"])*86400-1)

# Save them to configs
configs["da_cfg"] = da_cfg
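The window arithmetic in the cell above splits a time in fractional days into whole days and whole seconds, and ends each window one second before the next one starts. The sketch below is a variant of that logic, not the notebook's own code: the helper names `split_days` and `window_bounds` are hypothetical, and it uses `round` instead of `int` truncation as a guard against floating-point round-off. It also handles a window that ends exactly on a day boundary.

```python
from math import floor

SECONDS_PER_DAY = 86400


def split_days(t_days):
    """Split a time in fractional days into (whole days, whole seconds)."""
    days = int(floor(t_days))
    seconds = int(round((t_days - days) * SECONDS_PER_DAY))
    if seconds == SECONDS_PER_DAY:  # rounding spilled into the next day
        days, seconds = days + 1, 0
    return days, seconds


def window_bounds(start_days, window_days, window_seconds):
    """Return ((start_day, start_sec), (end_day, end_sec)) for one window.

    The end is placed one second before the start of the next window.
    """
    window_size = window_days + window_seconds / float(SECONDS_PER_DAY)
    start = split_days(start_days)
    end_day, end_sec = split_days(start_days + window_size)
    if end_sec == 0:
        # Window ends on a day boundary: step back into the previous day
        end_day, end_sec = end_day - 1, SECONDS_PER_DAY - 1
    else:
        end_sec -= 1
    return start, (end_day, end_sec)


# Settings from da_cfg: a 1200-second window starting at model time zero
print(window_bounds(0.0, 0, 1200))
```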

****************
**Save all the configurations to the namelist configuration file**

In [126]:
# Save it in a temporary pickle file or namelist???
# with open(config_file, 'wb') as f:
#     pickle.dump(configs, f)
configs.write(config_file, force=True)

<a id='pflotran_prepare'></a>
# PFLOTRAN preparation
*Here, we use Kewei's 1D thermal model as an example for generating PFLOTRAN input card and parameter.h5.*

In this section, the following procedures are performed:
- generate the PFLOTRAN input deck file ```pflotran.in```
- generate the parameter file in HDF5 format, ```parameter_prior.h5```, used by the PFLOTRAN input deck file

**Note that**
- ```pflotran.in``` for each DA scenario should be prepared by users.

**Run code**
- Run: ```prepare_pflotran_input.py```
- Code input arguments (loaded from the configuration file):
    - <span style="background-color:yellow">pflotran_in</span>: filename for ```pflotran.in```
    - <span style="background-color:yellow">pflotran_para</span>: filename for ```parameter_prior.h5```
    - <span style="background-color:yellow">obs_resolution, obs_error, nens, spinup_length, spinup</span>: data assimilation settings (i.e., observation time step, observation error, number of ensemble members, spinup length, and whether the run is a spinup; **to be revised**)

In [10]:
prep_pflotran_input, pflotran_in = files_cfg["prep_pflotran_input_file"], files_cfg["pflotran_in_file"]

In [11]:
%%script bash -s "$prep_pflotran_input" "$config_file"
python $1 $2

Finished generating the input card for PFLOTRAN...
Finished generating the DBASE for PFLOTRAN...


In [12]:
%%script bash -s "$pflotran_in"
head $1

#Description: 1D thermal

SIMULATION
  SIMULATION_TYPE SUBSURFACE
  PROCESS_MODELS
    SUBSURFACE_FLOW FLOW
      MODE TH
#      OPTIONS
#	REVERT_PARAMETERS_ON_RESTART
#      /


<a id='pflotran_spinup'></a>
# PFLOTRAN model spin-up
Take the ```pflotran.in``` and ```parameter.h5``` files and conduct the model spin-up by running the ```pflotran.sh``` file, a simple shell script that executes an ensemble simulation of PFLOTRAN using MPI.

**Run the code**
- Run: ```pflotran.sh```
- Code input arguments (loaded from the configuration file):
    - <span style="background-color:yellow">pflotran_exe</span>: location of the executable PFLOTRAN
    - <span style="background-color:yellow">pflotran_in</span>: filename for ```pflotran.in```
    - <span style="background-color:yellow">pflotran_out_dir</span>: directory of PFLOTRAN output
    - <span style="background-color:yellow">nens</span>: number of ensemble members
    - <span style="background-color:yellow">mpi_exe, ncore</span>: location of mpirun and number of cpu cores
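As an illustration of how the arguments listed above could be combined into one MPI launch, the sketch below assembles a command line without running it. It is an assumption about what a script like ```pflotran.sh``` might do, not a copy of it: the helper `build_pflotran_command` is hypothetical, and the choice of PFLOTRAN's multi-realization options (`-stochastic`, `-num_realizations`, `-num_groups`) with one processor group per core is only a guess consistent with the ```R[ENS].h5``` output names.

```python
import shlex


def build_pflotran_command(mpi_exe, ncore, pflotran_exe, pflotran_in, nens):
    """Assemble an ensemble launch command as a list of arguments.

    Hypothetical helper; the real pflotran.sh may use different options.
    """
    return [
        mpi_exe, "-n", str(ncore),
        pflotran_exe,
        "-pflotranin", pflotran_in,
        "-stochastic",                   # multi-realization mode
        "-num_realizations", str(nens),  # one realization per ensemble member
        "-num_groups", str(ncore),       # assumption: one group per core
    ]


cmd = build_pflotran_command(
    mpi_exe="/usr/local/bin/mpirun",
    ncore=1,
    pflotran_exe="pflotran",      # placeholder path
    pflotran_in="pflotran.in",
    nens=30,
)
# Print the command in a shell-safe form instead of executing it
print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run` later without worrying about shell quoting.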

In [13]:
pflotran_sh, pflotran_out_dir = files_cfg["pflotran_sh_file"], dirs_cfg["pflotran_out_dir"]

In [14]:
%%script bash -s "$pflotran_sh" "$config_file"
$1 $2
# %%script bash -s "$pflotran_sh" "$pflotran_exe" "$pflotran_in" "$pflotran_in_dir" "$pflotran_out_dir" "$nens" "$mpi_exe" "$ncore_pf"
# $1 $2 $3 $4 $5 $6 $7 $8

 here
          30
------------------------------ Provenance --------------------------------------
pflotran_compile_date_time = unknown
pflotran_compile_user = unknown
pflotran_compile_hostname = unknown
pflotran_changeset = unknown
pflotran_status = unknown
petsc_changeset = unknown
petsc_status = unknown
--------------------------------------------------------------------------------
 "grid_structured_type" set to default value.
 Opening hdf5 file: /Users/jian449/Codes/DART/manhattan/models/pflotran/applications/1dthermal/pflotran_input/parameter_prior.h5
 pflotran card:: TIMESTEPPER
 pflotran card:: TIMESTEPPER
 pflotran card:: NEWTON_SOLVER
 pflotran card:: LINEAR_SOLVER
 pflotran card:: NEWTON_SOLVER
 pflotran card:: LINEAR_SOLVER
 pflotran card:: GRID
 pflotran card:: FLUID_PROPERTY
 "FLUID_PROPERTY,diffusion_coeffient units" set to default value.
 pflotran card:: MATERIAL_PROPERTY
   Name :: Alluvium
 "MATERIAL_PROPERTY,rock density units" set to default value.
 "MATERIAL_PROPE

****************
**Once the model spinup finishes, modify the corresponding configuration entry**

In [15]:
configs["time_cfg"]["is_spinup_done"] = True
configs.write(config_file, force=True)

In [16]:
# %%script bash -s "$pflotran_out_dir"
# cd $1
# ls *.h5

<a id='dart_prepare'></a>
# DART files preparation
In this section, the following procedures are performed:
- generate the template for DART generic variable quantity files (i.e., ```DEFAULT_obs_kind_mod.F90``` and ```obs_def_pflotran_mod.f90```);
- generate the DART input namelists;
- generate DART prior NetCDF data ```prior_ensemble_[ENS].nc``` from PFLOTRAN's parameter and outputs;
- generate DART posterior NetCDF files (*sharing the same variable names and dimensions as the prior NetCDF files but without the data values*);
- convert the observation file to DART observation format;
- check ```model_mod.F90``` based on current setting by using the ```check_model_mod``` provided by DART.

<a id='dart_generic_prepare'></a>
## Generate the templates for DART generic variable quantity files
- Run: ```list2dartqty.py``` to sequentially generate
    - a mapping between PFLOTRAN variables and DART generic quantities in ```obs_def_pflotran_mod.F90```
    - the default DART generic quantity definition file ```DEFAULT_obs_kind_mod.F90```
- Code input arguments:
    - <span style="background-color:yellow">obs_type</span>: filename for ```obs_def_pflotran_mod.F90```
    - <span style="background-color:yellow">def_obs_kind</span>: filename for ```DEFAULT_obs_kind_mod.F90```
    - <span style="background-color:yellow">pflotran_parastate_set</span>: a list of variables required to be assimilated

In [17]:
to_dartqty, obs_type_file = files_cfg["to_dartqty_file"], files_cfg["obs_type_file"]

In [18]:
%%script bash -s "$to_dartqty" "$config_file"
python $1 $2

No new DART variable quantity is added...
Finished generating the /Users/jian449/Codes/DART/manhattan/models/pflotran/obs_kind/DEFAULT_obs_kind_mod.f90...
Finished generating the /Users/jian449/Codes/DART/manhattan/models/pflotran/applications/1dthermal/obs_type/obs_def_pflotran_mod.f90...


In [19]:
%%script bash -s "$obs_type_file"
cat $1

! BEGIN DART PREPROCESS KIND LIST
!TEMPERATURE,  QTY_PFLOTRAN_TEMPERATURE, COMMON_CODE
!FLOW_FLUX,  QTY_PFLOTRAN_FLOW_FLUX, COMMON_CODE
!POROSITY,  QTY_PFLOTRAN_POROSITY, COMMON_CODE
!THERMAL_CONDUCTIVITY,  QTY_PFLOTRAN_THERMAL_CONDUCTIVITY, COMMON_CODE
! END DART PREPROCESS KIND LIST
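The kind list printed above follows a regular pattern: one line per variable, pairing the PFLOTRAN variable name with a `QTY_PFLOTRAN_`-prefixed DART quantity. A minimal sketch that reproduces this block from the variable lists is given below. The function `preprocess_kind_list` is hypothetical and only mimics the printed output; it is not the implementation in ```list2dartqty.py```.

```python
def preprocess_kind_list(variables, prefix="QTY_PFLOTRAN_"):
    """Build the DART preprocess kind-list block for the given variables."""
    lines = ["! BEGIN DART PREPROCESS KIND LIST"]
    for name in variables:
        # Same layout as the file shown above: name, quantity, COMMON_CODE
        lines.append("!{0},  {1}{0}, COMMON_CODE".format(name, prefix))
    lines.append("! END DART PREPROCESS KIND LIST")
    return lines


obs_set = ["TEMPERATURE"]
para_set = ["FLOW_FLUX", "POROSITY", "THERMAL_CONDUCTIVITY"]

kind_lines = preprocess_kind_list(obs_set + para_set)
print("\n".join(kind_lines))
```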


## Generate DART input namelists in ```input.nml```

The ```input.nml``` file is generated based on a template ```input.nml.template``` by modifying the following namelist entries:

```input.nml.template``` $\rightarrow$ ```input.nml```

|filter_nml|obs_kind_nml|preprocess_nml|model_nml|convertnc_nml|
|:--:|:--:|:--:|:--:|:--:|
| input_state_file_list, output_state_file_list, ens_size, async, adv_ens_command, obs_sequence_in_name | assimilate_these_obs_types | input_files, input_obs_kind_mod_file | time_step_days, time_step_seconds, nvar, var_names, template_file, var_qtynames | netcdf_file, out_file |

**Namelists from DART**
- [filter_nml](https://www.image.ucar.edu/DAReS/DART/Manhattan/assimilation_code/modules/assimilation/filter_mod.html): namelist of the main module for driving ensemble filter assimilations
- [obs_kind_nml](https://www.image.ucar.edu/DAReS/DART/Manhattan/assimilation_code/modules/observations/obs_kind_mod.html#Namelist): namelist for controlling what observation types are to be assimilated
- [preprocess_nml](https://www.image.ucar.edu/DAReS/Codes/DART/manhattan/assimilation_code/programs/preprocess/preprocess): namelist of the DART-supplied preprocessor program which creates observation kind and observation definition modules from a set of other specially formatted Fortran 90 files

**Self-defined namelists**
- model_nml: a self-defined namelist for providing the basic information in the model
    - time_step_days, time_step_seconds: the assimilation time window
    - template_file: the template prior NetCDF file for ```model_mod.F90``` to digest the spatial information of the model
    - var_names: the original variable names
    - var_qtynames: the corresponding DART variable quantities
    - nvar: the number of variables
- convertnc_nml: a self-defined namelist for providing the NetCDF observation file name and the DART observation file name used in ```convert_nc.f90```
    - netcdf_file: the location of the NetCDF file containing the observation data
    - out_file: the location of the DART observation file

**Note that**
- ```input.nml.template``` contains more namelists, and more entries within the namelists above, than are listed here. Users can edit the Python dictionary ```inputnml``` below to include their modifications.
- Users can also include more namelists provided by DART by modifying ```inputnml```.
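The idea of overriding only selected entries of a template can be sketched with plain nested dictionaries. This is an illustration under stated assumptions, not the logic of ```prepare_inputnml.py```: the helper `apply_overrides` is hypothetical, and the template values below are made up for the example.

```python
import copy


def apply_overrides(template, overrides):
    """Return a copy of `template` with `overrides` applied group by group.

    Entries not mentioned in `overrides` keep their template values, and
    namelist groups missing from the template are added.
    """
    merged = copy.deepcopy(template)
    for group, entries in overrides.items():
        merged.setdefault(group, {}).update(entries)
    return merged


# Made-up template content, for illustration only
template = {
    "filter_nml": {"ens_size": 20, "async": 0},
    "assim_tools_nml": {"cutoff": 1000000.0},
}
overrides = {
    "filter_nml": {"ens_size": 30},
    "obs_kind_nml": {"assimilate_these_obs_types": ["TEMPERATURE"]},
}

merged = apply_overrides(template, overrides)
print(merged["filter_nml"])
```

Working on a deep copy leaves the template untouched, so the same template can be reused for several DA scenarios.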

***************
**Assemble all the namelists in input.nml**

In [127]:
# Parameters for different namelists in input.nml
filter_nml = {"input_state_file_list":files_cfg["dart_input_list_file"],
              "output_state_file_list":files_cfg["dart_output_list_file"],
              "ens_size":da_cfg["nens"],
              "num_output_state_members":da_cfg["nens"],
              "obs_sequence_in_name":files_cfg["obs_dart_file"]}
#               "obs_window_days":obs_window_days,
#               "obs_window_seconds":obs_window_seconds}
obs_kind_nml = {"assimilate_these_obs_types":obs_set}
model_nml = {"time_step_days":da_cfg["assim_window_days"],
             "time_step_seconds":da_cfg["assim_window_seconds"],
             "nvar":len(obs_set)+len(para_set),
             "var_names":obs_set+para_set,
             "template_file":files_cfg["dart_prior_template_file"],
             "var_qtynames":['QTY_PFLOTRAN_'+v for v in obs_set]+['QTY_PFLOTRAN_'+v for v in para_set]}
preprocess_nml = {"input_files":files_cfg["obs_type_file"],
                  "input_obs_kind_mod_file":files_cfg["def_obs_kind_file"]}
convertnc_nml = {"netcdf_file": files_cfg["obs_nc_file"],
                 "out_file": files_cfg["obs_dart_file"],
                 "obs_start_day": da_cfg["assim_start_days"],
                 "obs_start_second": da_cfg["assim_start_seconds"],
                 "obs_end_day": da_cfg["assim_end_days"],
                 "obs_end_second":da_cfg["assim_end_seconds"]}
modelmodcheck_nml = {"input_state_files": files_cfg["dart_prior_template_file"]}
inputnml = {"filter_nml":filter_nml,
            "obs_kind_nml":obs_kind_nml,
            "model_nml":model_nml,
            "preprocess_nml":preprocess_nml,
            "convert_nc_nml":convertnc_nml,
            "model_mod_check_nml":modelmodcheck_nml}


configs["inputnml_cfg"] = inputnml

# Save the configurations
configs.write(config_file, force=True)
# with open(config_pickle, 'wb') as f:
#     pickle.dump(configs, f)

***************
**Run the code**
- Run: ```prepare_inputnml.py```
- Code input arguments:
    - <span style="background-color:yellow">input_nml</span>: the ```input.nml``` namelist file
    - <span style="background-color:yellow">input_nml_dict</span>: the ```inputnml.p``` pickle file

In [128]:
prep_inputnml = files_cfg["prep_inputnml_file"]

In [129]:
%%script bash -s  "$prep_inputnml" "$config_file"
python $1 $2

Finished generating the input namelist file...


## Convert the model output to DART prior NetCDF and generate the preliminary DART posterior NetCDF file
- The structure of ```prior_ensemble_[ENS].nc``` and ```posterior_ensemble_[ENS].nc``` files (```[ENS]``` refers to the ensemble number):

| NetCDF dimensions |                      NetCDF variables                      |
|:-----------------:|:----------------------------------------------------------:|
| time: 1           | time: shape(time)                                          |
| x_location: nx    | x_location: shape(x_location)                              |
| y_location: ny    | y_location: shape(y_location)                              |
| z_location: nz    | z_location: shape(z_location)                              |
| member: 1         | member: shape(member)                                      |
|                   | physical variable: shape(x_location,y_location,z_location) |

**Note that**
- As required by DART, each ```prior_ensemble_[ENS].nc``` file only includes the state/parameter values of one ensemble member at one given time.
- For the time, we set the initial time to 0, with the time units converted to *day* (required by DART's ```read_model_time``` subroutine).
- The structure also differs from that of the [observation NetCDF](#observationconvertion), because ```prior_ensemble_[ENS].nc``` targets structured Cartesian grids while the observation NetCDF targets the general case.

**Run the code**
- Run: ```prepare_prior_nc.py``` to generate 
    - the DART prior input file ```prior_ensemble_[ENS].nc```
    - the DART posterior output file ```posterior_ensemble_[ENS].nc``` (*sharing the same variable names and dimensions as the prior files but without the variable values*)
    - the prior template file (copied from ```prior_ensemble_1.nc```) used by ```input.nml```
    - the dart_input_list and dart_output_list used by DART
- Code input arguments:
    - <span style="background-color:yellow">pflotran_out</span>: filename ```R[ENS].h5``` from PFLOTRAN model output
    - <span style="background-color:yellow">pflotran_para</span>: pflotran parameter HDF file ```parameter.h5```
    - <span style="background-color:yellow">dart_prior_nc</span>: filename ```prior_ensemble_[ENS].nc``` for the prior input file for DART
    - <span style="background-color:yellow">dart_input_list</span>: filename for recording the list of dart_prior_nc
    - <span style="background-color:yellow">nens</span>: number of ensemble members
    - <span style="background-color:yellow">spinup</span>: whether it is spinup (if yes, the time is set to zero; otherwise, the time is read from ```R[ENS].h5```)
    - <span style="background-color:yellow">pflotran_parastate_set</span>: a list of variables to be assimilated
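The `dart_input_list` and `dart_output_list` files are plain text files with one NetCDF file name per line, one line per ensemble member. A minimal sketch of writing such a list follows. The helper `write_file_list` and the list file name `dart_input_list.txt` are hypothetical; only the ```prior_ensemble_[ENS].nc``` naming pattern is taken from the text above.

```python
import os
import tempfile


def write_file_list(list_path, pattern, nens):
    """Write one file name per ensemble member to `list_path`."""
    names = [pattern.format(ens) for ens in range(1, nens + 1)]
    with open(list_path, "w") as f:
        f.write("\n".join(names) + "\n")
    return names


# Write a three-member list into a temporary directory and read it back
with tempfile.TemporaryDirectory() as tmp:
    list_path = os.path.join(tmp, "dart_input_list.txt")
    names = write_file_list(list_path, "prior_ensemble_{}.nc", 3)
    with open(list_path) as f:
        content = f.read()

print(content)
```

The same helper can produce the output list by passing `"posterior_ensemble_{}.nc"` as the pattern.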

In [23]:
prep_prior_nc, dart_prior_template = files_cfg["prep_prior_nc_file"], files_cfg["dart_prior_template_file"]

In [57]:
%%script bash -s "$prep_prior_nc" "$config_file"
python $1 $2

Converting state/parameter into NetCDF file for ensemble 1...
Converting state/parameter into NetCDF file for ensemble 2...
Converting state/parameter into NetCDF file for ensemble 3...
Converting state/parameter into NetCDF file for ensemble 4...
Converting state/parameter into NetCDF file for ensemble 5...
Converting state/parameter into NetCDF file for ensemble 6...
Converting state/parameter into NetCDF file for ensemble 7...
Converting state/parameter into NetCDF file for ensemble 8...
Converting state/parameter into NetCDF file for ensemble 9...
Converting state/parameter into NetCDF file for ensemble 10...
Converting state/parameter into NetCDF file for ensemble 11...
Converting state/parameter into NetCDF file for ensemble 12...
Converting state/parameter into NetCDF file for ensemble 13...
Converting state/parameter into NetCDF file for ensemble 14...
Converting state/parameter into NetCDF file for ensemble 15...
Converting state/parameter into NetCDF file for ensemble 16...
C

In [26]:
%%script bash -s "$dart_prior_template"
ncdump -h $1

netcdf prior_ensemble_template {
dimensions:
	x_location = 1 ;
	y_location = 1 ;
	z_location = 64 ;
	time = 1 ;
	member = 1 ;
variables:
	double time(time) ;
		time:units = "day" ;
		time:calendar = "none" ;
		time:type = "dimension_value" ;
	double member(member) ;
		member:type = "dimension_value" ;
	double x_location(x_location) ;
		x_location:units = "m" ;
		x_location:type = "dimension_value" ;
	double y_location(y_location) ;
		y_location:units = "m" ;
		y_location:type = "dimension_value" ;
	double z_location(z_location) ;
		z_location:units = "m" ;
		z_location:type = "dimension_value" ;
	double TEMPERATURE(z_location, y_location, x_location) ;
		TEMPERATURE:type = "observation_value" ;
		TEMPERATURE:unit = "[C]" ;
	double FLOW_FLUX(z_location, y_location, x_location) ;
		FLOW_FLUX:type = "observation_value" ;
		FLOW_FLUX:unit = "" ;
	double POROSITY(z_location, y_location, x_location) ;
		POROSITY:type = "observation_value" ;
		POROSITY:unit = "" ;
	double THERMAL_CONDUCTIVITY

<a id='observationconvertion'></a>
## Prepare the observation conversion to DART observation format
In this section, we prepare the conversion of the observation data to DART format. We first convert the raw observation data into NetCDF format. Then, a Fortran program is prepared to convert the NetCDF file to DART format. The structure of the NetCDF file for recording the observations is as follows:

| NetCDF dimensions |           NetCDF variables          |
|:-----------------:|:-----------------------------------:|
| time: 1           | time: shape(time)                   |
| location: nloc    | location: shape(location)           |
|                   | physical variable: shape(time,nloc) |

**Note that** 
- if the time calendar follows *gregorian*, the time unit should be entered as ```seconds since YYYY-MM-DD HH:MM:SS```. Otherwise, set the time calendar to *None* and the time unit to ```second``` (make sure to convert your measurement times to seconds).

***************
**Convert the raw csv temperature observations to NetCDF file**
- Run: ```csv2nc.py```
- Code input arguments:
    - <span style="background-color:yellow">obs_original</span>: filename for the original observed temperature file
    - <span style="background-color:yellow">obs_nc</span>: filename for the observation NetCDF file
    - <span style="background-color:yellow">assim_start_str</span>: the reference time taken as time zero

In [27]:
csv_to_nc, obs_nc = files_cfg["csv_to_nc_file"], files_cfg["obs_nc_file"]

In [28]:
%%script bash -s "$csv_to_nc" "$config_file"
python $1 $2

Finished converting raw observation in NetCDF format...


In [29]:
%%script bash -s "$obs_nc"
ncdump -h $1

netcdf obs_pflotran {
dimensions:
	time = 8641 ;
	location = 5 ;
variables:
	double time(time) ;
		time:calendar = "None" ;
		time:units = "days" ;
		time:type = "dimension_value" ;
	double x_location(location) ;
		x_location:units = "m" ;
		x_location:type = "dimension_value" ;
	double y_location(location) ;
		y_location:units = "m" ;
		y_location:type = "dimension_value" ;
	double z_location(location) ;
		z_location:units = "m" ;
		z_location:type = "dimension_value" ;
	double TEMPERATURE(location, time) ;
		TEMPERATURE:_FillValue = -99999. ;
		TEMPERATURE:unit = "C" ;
		TEMPERATURE:type = "observation_value" ;
}


***************
**Prepare the ```convert_nc.f90``` based on the list of observation variables**
- Run: ```prepare_convert_nc.py```
- Code input arguments:
    - <span style="background-color:yellow">obs_nc</span>: filename for the observation NetCDF file

In [30]:
prep_convert_nc, convert_nc_file = files_cfg["prep_convert_nc_file"], files_cfg["convert_nc_file"]

In [31]:
%%script bash -s "$prep_convert_nc" "$config_file"
python $1 $2

In [32]:
%%script bash -s "$convert_nc_file"
head $1

! DART software - Copyright UCAR. This open source software is provided
! by UCAR, "as is", without charge, subject to all terms of use at
! http://www.image.ucar.edu/DAReS/DART/DART_download
!
! Revised from convert_madis_profiler.f90 written by Nancy Colin
! $Id: convert_nc.f90 2019-09-09 15:48:00Z peishi.jiang@pnnl.gov $

program convert_nc

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


<a id='dart_executables'></a>
# Generate all the executable files
Now, we compile all the executables from ```mkmf_*```. The following executables are generated here:
- ```preprocess```: for preprocessing the [prepared DART generic variable quantity files](#dart_generic_prepare)
- ```convert_nc```: for [converting the observations from NetCDF to DART format](#observationconvertion)
- ```model_mod_check```: for checking ```model_mod.F90``` interface file
- ```filter```: for conducting the [DART data assimilation](https://www.image.ucar.edu/DAReS/DART/Manhattan/assimilation_code/programs/filter/filter.html)

## Generate the executables
- Run: ```quickbuild.csh```
- Code input arguments:
    - <span style="background-color:yellow">app_work_dir</span>: location of the application work folder

In [33]:
dart_work_dir, app_work_dir = dirs_cfg["dart_work_dir"], dirs_cfg["app_work_dir"]
quickbuild = files_cfg["quickbuild_csh"]

In [34]:
%%script bash -s "$dart_work_dir" "$quickbuild" "$app_work_dir"
cd $1
csh $2 $3

---------------------------------------------------------------
Removing *.o *.mod files


---------------------------------------------------------------
PFLOTRAN build number 1 is preprocess
 Makefile is ready.
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/modules/utilities/types_mod.f90
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/modules/utilities/utilities_mod.f90
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/modules/utilities/time_manager_mod.f90
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/programs/preprocess/preprocess.f90
gfortran -O2 -ffree-line-length-none -I/usr/local/Cellar/netcdf/4.6.3_1/include  -c	../../../assimilation_code/modules/utilities/null_mpi_utilities_mod.f90
gfortran time_manager_mod.o utilities_mod.o p

rm: No match.
rm: No match.
.......rm: No match.
..................................rm: No match.
......................................................................................

## Convert the observation file in NetCDF to DART format
- Run: ```convert_nc```

In [78]:
convert_nc, obs_dart = files_cfg["convert_nc_exe"], files_cfg["obs_dart_file"]

In [79]:
%%script bash -s "$app_work_dir" "$convert_nc"
cd $1
$2


 --------------------------------------
 Starting ... at YYYY MM DD HH MM SS = 
                 2019 10  1 10 30 43
 Program convert_nc
 --------------------------------------

  set_nml_output No echo of NML values
  write_obs_seq  opening formatted observation sequence file "/Users/jian449/Codes/DART/manhattan/models/pflotran/applications/1dthermal/dart_inout/obs_seq_pflotran.out"

 --------------------------------------------------------
 -------------- ASSIMILATE_THESE_OBS_TYPES --------------
    TEMPERATURE
 --------------------------------------------------------
 -------------- EVALUATE_THESE_OBS_TYPES   --------------
    none
 --------------------------------------------------------
 ---------- USE_PRECOMPUTED_FO_OBS_TYPES   --------------
    none
 --------------------------------------------------------

  write_obs_seq  closed observation sequence file "/Users/jian449/Codes/DART/manhattan/models/pflotran/applications/1dthermal/dart_inout/obs_seq_pflotran.out"

 ---------

In [80]:
%%script bash -s "$obs_dart"
head -n20 $1

 obs_sequence
obs_kind_definitions
           1
           1 TEMPERATURE                                                     
  num_copies:            1  num_qc:            1
  num_obs:           20  max_num_obs:           20
observation                                                     
Data QC                                                         
  first:            1  last:           20
 OBS            1
   5.3399999999999999     
   1.0000000000000000     
          -1           2          -1
obdef
loc3Dxyz
     0.000000000000000         0.000000000000000       -0.1000000000000000E-01
kind
           1
     0          0
   1.0000000000000000     


## Check ```model_mod.F90``` interface file
- Run: ```model_mod_check```

In [38]:
model_mod_check = files_cfg["model_mod_check_exe"]

In [39]:
%%script bash -s "$app_work_dir" "$model_mod_check"
cd $1
$2


 --------------------------------------
 Starting ... at YYYY MM DD HH MM SS = 
                 2019 10  1  9 36 50
 Program model_mod_check
 --------------------------------------

  set_nml_output No echo of NML values
  initialize_mpi_utilities: Running single process


***************** RUNNING    TEST 0    ***********************
 -- Reading the model_mod namelist and implicitly running static_init_model
**************************************************************

 --------------------------------------------------------
 -------------- ASSIMILATE_THESE_OBS_TYPES --------------
    TEMPERATURE
 --------------------------------------------------------
 -------------- EVALUATE_THESE_OBS_TYPES   --------------
    none
 --------------------------------------------------------
 ---------- USE_PRECOMPUTED_FO_OBS_TYPES   --------------
    none
 --------------------------------------------------------



***************** FINISHED   TEST 0    ***********************
xxxxxxxxxxxxxxx

<a id='run_dart_pflotran'></a>
# (TODO) Run DART and PFLOTRAN
In this section, we run the shell script that couples the DART filter with the PFLOTRAN model.
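Because this step is still marked as TODO, the sketch below only illustrates the time bookkeeping that such a coupling presumably needs: splitting the observation period into consecutive assimilation windows. The function `assimilation_windows` is hypothetical, and the assumption that the coupling script cycles through windows in this way is ours. The contents of ```run_DART_PFLOTRAN.csh``` are not shown in this notebook.

```python
def assimilation_windows(window_seconds, last_obs_seconds):
    """List (start, end) second offsets of consecutive assimilation windows.

    Each window ends one second before the next one starts.
    """
    windows = []
    start = 0
    while start < last_obs_seconds:
        end = min(start + window_seconds, last_obs_seconds)
        windows.append((start, end - 1))
        start = end
    return windows


# Settings from the configuration: 1200-second windows, last observation at 1200*7 s
windows = assimilation_windows(1200, 1200 * 7)
for k, (start, end) in enumerate(windows, 1):
    # In a real workflow, each window would run the filter and then PFLOTRAN
    print("window %d: %d s to %d s" % (k, start, end))
```

Working in integer seconds avoids the round-off that can appear when window sizes are accumulated in fractional days.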

In [106]:
dart_work_dir = dirs_cfg["dart_work_dir"]
inputnml_file = files_cfg["input_nml_file"]

In [117]:
%%script bash -s "$dart_work_dir" "$inputnml_file" "$config_file"
cd $1
csh run_DART_PFLOTRAN.csh $2 $3


 --------------------------------------
 Starting ... at YYYY MM DD HH MM SS = 
                 2019 10  1 15 18 10
 Program Filter
 --------------------------------------

  set_nml_output No echo of NML values
  initialize_mpi_utilities: Running single process

 --------------------------------------------------------
 -------------- ASSIMILATE_THESE_OBS_TYPES --------------
    TEMPERATURE
 --------------------------------------------------------
 -------------- EVALUATE_THESE_OBS_TYPES   --------------
    none
 --------------------------------------------------------
 ---------- USE_PRECOMPUTED_FO_OBS_TYPES   --------------
    none
 --------------------------------------------------------

  quality_control_mod: Will reject obs with Data QC larger than    3
  quality_control_mod: No observation outlier threshold rejection will be done
  assim_tools_init: Selected filter type is Ensemble Kalman Filter (ENKF)
  assim_tools_init: The cutoff namelist value is     1000000.000000
  a

PREPARE_PFLOTRAN_INPUT: Undefined variable.


CalledProcessError: Command 'b'cd $1\ncsh run_DART_PFLOTRAN.csh $2 $3\n'' returned non-zero exit status 1.

# Some tests

## Test ```update_confignml_time.py```

In [140]:
update_confignml_time, inputnml_file = files_cfg["update_confignml_time_file"], files_cfg["input_nml_file"]

In [141]:
update_confignml_time

'/Users/jian449/Codes/DART/manhattan/models/pflotran/utils/update_confignml_time.py'

In [142]:
%%script bash -s "$update_confignml_time" "$config_file"
python $1 $2

/Users/jian449/Codes/DART/manhattan/models/pflotran/utils/update_confignml_time.py
0.013888888888888888


## Test ```filter```
- Run: ```filter```

In [81]:
filter_exe = files_cfg["filter_exe"]

In [138]:
%%script bash -s "$app_work_dir" "$filter_exe"
cd $1
$2


 --------------------------------------
 Starting ... at YYYY MM DD HH MM SS = 
                 2019 10  1 15 56 47
 Program Filter
 --------------------------------------

  set_nml_output No echo of NML values
  initialize_mpi_utilities: Running single process

 --------------------------------------------------------
 -------------- ASSIMILATE_THESE_OBS_TYPES --------------
    TEMPERATURE
 --------------------------------------------------------
 -------------- EVALUATE_THESE_OBS_TYPES   --------------
    none
 --------------------------------------------------------
 ---------- USE_PRECOMPUTED_FO_OBS_TYPES   --------------
    none
 --------------------------------------------------------

  quality_control_mod: Will reject obs with Data QC larger than    3
  quality_control_mod: No observation outlier threshold rejection will be done
  assim_tools_init: Selected filter type is Ensemble Kalman Filter (ENKF)
  assim_tools_init: The cutoff namelist value is     1000000.000000
  a