# Automated Post Processing Quick Start



This utility is designed to be as simple to use as possible, but a number of configuration options must be set before the first run, so you will need to collect a few necessary pieces of information. This document assumes you are running on the acme1.llnl.gov server; if you are running elsewhere, only the paths should differ. All of this information needs to be written to the run configuration file, run.cfg.


## Setup

#### The model run directory on the compute machine. 

An example would be the 20170313.beta1_2 run by Chris Golaz on Edison: /scratch2/scratchdirs/golaz/ACME_simulations/20170313.beta1_02.A_WCYCL1850S.ne30_oECv3_ICG.edison/run. 

This path should be written to the run.cfg under 
* [transfer] source_path = /remote/model/path/run

#### Your desired output path. 

An example would be /p/cscratch/acme/[USERNAME]/output_20170313

This should be written to 
* [global] output_path = /your/output/path

#### The local model storage path.

Example: /p/cscratch/acme/[USERNAME]/input_20170313

This should be written to 
* [global] data_cache_path = /your/input/path

#### The length of the simulation. 

This doesn't have to match the actual model run; it's for the post processor's purposes only. If the simulation ran for 100 years but you only want to run against the first 50 years, that's fine. Conversely, if the simulation is currently running and is only 5 years in, you can set the end year to 100 even if it never produces 100 years of output. The simulation start year is assumed to be 1, but you can set it to anything you like.

This should be written to 
* [global] simulation_end_year = SOMENUMBER

#### The run frequency

This is the length of each set of diagnostic runs. If you set it to 10 years, then the climatologies, time series, and diagnostics will be produced for every 10 years of output. The value is a list of year lengths, so you could set it to [10, 50], which would cause output to be generated for every 10-year span as well as every 50-year span.

This should be written to 
* [global] set_frequency = [LIST, OF, NUMBERS]
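
To make the effect of set_frequency concrete, here is a minimal sketch of how a list of year lengths could divide a simulation into spans. It is not the post processor's own code: the function name `year_sets` is hypothetical, and it assumes spans are counted from the start year and that an incomplete trailing span is dropped, neither of which this document confirms.

```python
def year_sets(start_year, end_year, frequencies):
    """Map each frequency to the list of complete (first_year, last_year) spans.

    Assumption: spans start at start_year and a partial span at the end
    of the simulation is ignored.
    """
    sets = {}
    for freq in frequencies:
        spans = []
        first = start_year
        # Keep adding spans while a full span still fits before end_year.
        while first + freq - 1 <= end_year:
            spans.append((first, first + freq - 1))
            first += freq
        sets[freq] = spans
    return sets


# simulation_start_year = 1, simulation_end_year = 100, set_frequency = [10, 50]
print(year_sets(1, 100, [10, 50])[50])  # → [(1, 50), (51, 100)]
```

Under these assumptions, the example configuration in this document (end year 20, set_frequency = [10]) would yield two spans: years 1 to 10 and years 11 to 20.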

#### The run id.

This is a unique name for this run of the automated post processor. This is needed to differentiate the paths to the diagnostic output. If you were using the example model, an appropriate run_id would be 20170313.beta1_02

This should be written to 
* [global] run_id = YOUR_RUN_ID

#### compute_username and compute_password

Although it's not required to add your password to the config file, doing so makes the run process faster. These should be written to:

* [monitor] compute_username = YOUR_EDISON_USERNAME
* [monitor] compute_password = YOUR_EDISON_PASSWORD

#### processing_username and processing_password

* [transfer] processing_username = YOUR_ACME1_USERNAME
* [transfer] processing_password = YOUR_ACME1_PASSWORD
* [transfer] globus_username = YOUR_GLOBUS_USERNAME 
* [transfer] globus_password = YOUR_GLOBUS_PASSWORD

### Example run configuration

This may seem like a lot, but you only need to adjust the keys mentioned above. Everything else is needed for the run, but the default values shouldn't need to be changed.

```
[global]
output_path = /p/cscratch/acme/[YOUR_USERNAME]/output_20170313
data_cache_path = /p/cscratch/acme/[YOUR_USERNAME]/input_20170313
output_patterns = {"STREAMS":"streams", "ATM":"cam.h0", "MPAS_AM": "mpaso.hist.am.timeSeriesStatsMonthly", "MPAS_CICE":"mpascice.hist.am.timeSeriesStatsMonthly", "MPAS_RST": "mpaso.rst.0", "MPAS_O_IN": "mpas-o_in", "MPAS_CICE_IN": "mpas-cice_in"}
simulation_start_year =  1
simulation_end_year = 20
set_frequency = [10]
experiment = case_scripts
batch_system_type = slurm
run_id = 20170313.beta1_2
img_host_server = https://acme-viewer.llnl.gov

[amwg]
diag_home = /p/cscratch/acme/amwg/amwg_diag
host_directory = /var/www/html/amwg
host_prefix = amwg

[monitor]
compute_host = edison.nersc.gov
compute_username = YOUR_EDISON_USERNAME
compute_password = YOUR_EDISON_PASSWORD

[ncclimo]
regrid_map_path = /p/cscratch/acme/data/map_ne30np4_to_fv129x256_aave.20150901.nc
ncclimo_path = /p/cscratch/acme/bin
var_list = FSNTOA,FLUT,FSNT,FLNT,FSNS,FLNS,SHFLX,QFLX,PRECC,PRECL,PRECSC,PRECSL,TS,TREFHT

[uvcmetrics]
obs_for_diagnostics_path = /p/cscratch/acme/data/obs_for_diagnostics
host_prefix = uvcmetrics

[coupled_diags]
host_directory = /var/www/html/coupled_diag
host_prefix = coupled
coupled_diags_home = /p/cscratch/acme/data/PreAndPostProcessingScripts/coupled_diags
mpas_meshfile = /p/cscratch/acme/data/mapping/gridfile.oEC60to30.nc
mpas_remapfile = /p/cscratch/acme/data/mapping/map_oEC60to30_TO_0.5x0.5degree_blin.160412.nc 
pop_remapfile = /p/cscratch/acme/data/mapping/map_gx1v6_TO_0.5x0.5degree_blin.160413.nc 
remap_files_dir = /p/cscratch/acme/data/mapping/maps
gpcp_regrid_wgt_file = /p/cscratch/acme/data/ne30-to-GPCP.conservative.wgts.nc 
ceres_ebaf_regrid_wgt_file = /p/cscratch/acme/data/ne30-to-CERES-EBAF.conservative.wgts.nc 
ers_regrid_wgt_file = /p/cscratch/acme/data/ne30-to-ERS.conservative.wgts.nc 
test_native_res = ne30 
yr_offset =  1849
ref_case = obs 
ref_archive_dir = /p/cscratch/acme/data/obs_for_diagnostics 
ref_case_dir = /p/cscratch/acme/data/obs_for_diagnostics 
obs_ocndir = /p/cscratch/acme/data/observations/Ocean 
obs_seaicedir = /p/cscratch/acme/data/observations/SeaIce 
obs_sstdir = /p/cscratch/acme/data/observations/Ocean/SST 
obs_iceareaNH = /p/cscratch/acme/data/observations/SeaIce/IceArea_timeseries/iceAreaNH_climo.nc 
obs_iceareaSH = /p/cscratch/acme/data/observations/SeaIce/IceArea_timeseries/iceAreaSH_climo.nc 
obs_icevolNH = /p/cscratch/acme/data/observations/SeaIce/PIOMAS/PIOMASvolume_monthly_climo.nc


[upload_diagnostic]
diag_viewer_username = btest
diag_viewer_password = test
diag_viewer_server = https://acme-viewer.llnl.gov

[transfer]
source_path =  /scratch2/scratchdirs/golaz/ACME_simulations/20170313.beta1_02.A_WCYCL1850S.ne30_oECv3_ICG.edison/run 
source_endpoint = b9d02196-6d04-11e5-ba46-22000b92c6ec
destination_endpoint = 43d64772-a82e-11e5-99d3-22000b96db58
processing_host = acme1.llnl.gov 
processing_username = YOUR_ACME1_USERNAME
processing_password = YOUR_ACME1_PASSWORD
globus_username = YOUR_GLOBUS_USERNAME
globus_password = YOUR_GLOBUS_PASSWORD
```
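
Before the first run it can help to confirm that the keys described in the Setup section are present. The following is a rough sketch using Python's standard `configparser`; the post processor itself may read run.cfg with a different parser, and the `REQUIRED_KEYS` table and `missing_keys` function are illustrative names, not part of the tool. Passwords are left out of the check because this document describes them as optional.

```python
import configparser

# Keys this quick start asks you to set, grouped by section (illustrative list).
REQUIRED_KEYS = {
    "global": ["output_path", "data_cache_path", "simulation_end_year",
               "set_frequency", "run_id"],
    "monitor": ["compute_username"],
    "transfer": ["source_path", "processing_username", "globus_username"],
}


def missing_keys(config_text):
    """Return a list of '[section] key' strings that are absent or empty."""
    # interpolation=None so '%' characters in values are taken literally.
    # Note: with default settings configparser rejects a key that appears
    # twice in the same section.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        for key in keys:
            # has_option returns False when the section itself is missing.
            if not parser.has_option(section, key) or not parser.get(section, key).strip():
                missing.append(f"[{section}] {key}")
    return missing


sample = """
[global]
output_path = /your/output/path
run_id = YOUR_RUN_ID

[transfer]
source_path = /remote/model/path/run
"""

for entry in missing_keys(sample):
    print("missing:", entry)
```

To check a real file, read it first (for example `open("run.cfg").read()`) and pass the text to `missing_keys`.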

### Running

Once setup is done, running is simple. Activate your conda environment and run the following command to start the post processor in interactive mode:

    python workflow.py -c run.cfg

This starts a new run and begins downloading the data.

![initial run](https://github.com/sterlingbaldwin/acme_workflow/blob/master/resources/initial_run.png)

Once Globus has transferred the first year_set of data, it will start running the post processing jobs.

![run in progress](https://github.com/sterlingbaldwin/acme_workflow/blob/master/resources/run_in_process.png)

### Subsequent runs

After your initial setup, the only values you should need to change to start a new run are:
```
[global] output_path
[global] data_cache_path
[global] set_frequency
[global] simulation_end_year
[global] run_id
[transfer] source_path
```
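
If you start new runs often, the per-run values can be substituted into a base configuration programmatically. The sketch below uses Python's standard `configparser` and is only one possible approach: `config_for_new_run` is a hypothetical helper, the tool may parse run.cfg differently, and `configparser` drops any comments when it writes the file back out.

```python
import configparser
import io


def config_for_new_run(base_text, overrides):
    """Return config text with {(section, key): value} overrides applied."""
    parser = configparser.ConfigParser(interpolation=None)
    # Preserve key case (for example obs_iceareaNH); must be set before reading.
    parser.optionxform = str
    parser.read_string(base_text)
    for (section, key), value in overrides.items():
        # Values must be strings; a missing section raises NoSectionError.
        parser.set(section, key, value)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


base = """
[global]
run_id = 20170313.beta1_2
simulation_end_year = 20

[transfer]
source_path = /remote/model/path/run
"""

new_text = config_for_new_run(base, {
    ("global", "run_id"): "NEW_RUN_ID",
    ("global", "simulation_end_year"): "50",
})
print(new_text)
```

The returned text can then be written to a new file and passed to `workflow.py` with the `-c` option shown above.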