# Index of Runs and Variables

These notebooks document model runs that are available for analysis.

In [1]:
import cosima_cookbook as cc
import dataset
import pandas as pd
from IPython.display import display, HTML
import tqdm

Currently, the cookbook searches for NetCDF4 files in the following directories:

In [2]:
cc.netcdf_index.directoriesToSearch

['/g/data3/hh5/tmp/cosima/', '/g/data1/v45/APE-MOM']

We first generate a database of all variables in all NetCDF4 files found within these directories. Note that this needs to be called only once, as the database will persist between sessions. If new output files are added to the data directories, `build_index()` will import only the new files.

If this database ever becomes corrupted, it can be safely deleted and will be recreated the next time `build_index()` is called.

In [3]:
cc.build_index()

Searching /g/data3/hh5/tmp/cosima/
Searching /g/data1/v45/APE-MOM
Found 38641 .nc files
Using database sqlite:////g/data1/v45/cosima-cookbook/cosima-cookbook.db
Files already indexed: 38232
Files found but not yet indexed: 409
Indexing new .nc files...

Found 0 new variables
Saving results in database...
Indexing complete.


True
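The incremental behaviour reported in the log above (files already indexed versus files found but not yet indexed) can be sketched with the standard library alone. This is not the cookbook's implementation: the `files_to_index` helper, the two-column table layout, the toy file names, and the in-memory database are all assumptions made for illustration.

```python
import sqlite3


def files_to_index(conn, found_files):
    """Return the found files that do not yet appear in the index.

    Hypothetical helper: assumes a table `ncfiles` with an `ncfile` column.
    """
    indexed = {row[0] for row in conn.execute("SELECT DISTINCT ncfile FROM ncfiles")}
    return sorted(set(found_files) - indexed)


# An in-memory database stands in for the persistent index (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ncfiles (ncfile TEXT, variable TEXT)")
conn.executemany(
    "INSERT INTO ncfiles VALUES (?, ?)",
    [
        ("exp/output000/ocean.nc", "temp"),  # one row per variable in a file
        ("exp/output000/ocean.nc", "salt"),
    ],
)

# Toy result of a directory search: one file is already indexed, one is new.
found = ["exp/output000/ocean.nc", "exp/output001/ocean.nc"]
new_files = files_to_index(conn, found)
print(new_files)
```

Only `exp/output001/ocean.nc` is returned, so a second call after indexing it would have nothing left to do.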

This index of all variables is stored in a SQL database. If needed, it can be accessed directly using the `dataset` module.

In [3]:
db = dataset.connect(cc.netcdf_index.database_url)

In [6]:
rows = db.query("select * from ncfiles where variable = 'tau_x' and experiment = '025deg_jra55_ryf_spinup1' limit 10")
for row in rows:
    print(row)

OrderedDict([('id', 87074), ('ncfile', '/g/data3/hh5/tmp/cosima/access-om2-025/025deg_jra55_ryf_spinup1/output003/ocean/ocean_month.nc'), ('rootdir', '/g/data3/hh5/tmp/cosima'), ('configuration', 'access-om2-025'), ('experiment', '025deg_jra55_ryf_spinup1'), ('run', 'output003'), ('basename', 'ocean_month.nc'), ('variable', 'tau_x'), ('dimensions', "('time', 'yu_ocean', 'xu_ocean')"), ('chunking', 'None')])
OrderedDict([('id', 380293), ('ncfile', '/g/data3/hh5/tmp/cosima/access-om2-025/025deg_jra55_ryf_spinup1/output004/ocean/ocean_month.nc'), ('rootdir', '/g/data3/hh5/tmp/cosima'), ('configuration', 'access-om2-025'), ('experiment', '025deg_jra55_ryf_spinup1'), ('run', 'output004'), ('basename', 'ocean_month.nc'), ('variable', 'tau_x'), ('dimensions', "('time', 'yu_ocean', 'xu_ocean')"), ('chunking', 'None')])
OrderedDict([('id', 392292), ('ncfile', '/g/data3/hh5/tmp/cosima/access-om2-025/025deg_jra55_ryf_spinup1/output008/ocean/ocean_month.nc'), ('rootdir', '/g/data3/hh5/tmp/cosima')
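When the variable or experiment name comes from user input, it is usually safer to pass it as a bound parameter than to paste it into the SQL string. The sketch below shows the idea with the standard `sqlite3` module. The toy table, its rows, and the `find_files` helper are assumptions for illustration; only the column names are taken from the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can then be converted to dicts
conn.execute(
    "CREATE TABLE ncfiles (ncfile TEXT, experiment TEXT, run TEXT, variable TEXT)"
)
conn.executemany(
    "INSERT INTO ncfiles VALUES (?, ?, ?, ?)",
    [
        ("exp_a/output003/ocean_month.nc", "exp_a", "output003", "tau_x"),
        ("exp_a/output003/ocean_month.nc", "exp_a", "output003", "tau_y"),
        ("exp_b/output000/ocean_month.nc", "exp_b", "output000", "tau_x"),
    ],
)


def find_files(conn, variable, experiment, limit=10):
    """Hypothetical helper: look up files by variable and experiment."""
    sql = (
        "SELECT ncfile, run FROM ncfiles "
        "WHERE variable = ? AND experiment = ? LIMIT ?"
    )
    # The `?` placeholders are filled in by the driver, not by string formatting.
    return [dict(row) for row in conn.execute(sql, (variable, experiment, limit))]


matches = find_files(conn, "tau_x", "exp_a")
for match in matches:
    print(match)
```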

Let's bring this database into memory as a Pandas DataFrame for further analysis.

In [5]:
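# Read every row of the index into memory. `total` is a hard-coded estimate
# used only to scale the progress bar; it does not limit how many rows are read.
data = []
for row in tqdm.tqdm_notebook(db['ncfiles'].all(), total=1200000):
    data.append(row)
df = pd.DataFrame(data)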




Many of the output files have names of the form `output__123_45.nc`.  Here we construct a more generalized name for this output file, namely the regular expression `output__\d+_\d+.nc`.

[To do: add basename_pattern to build_index() ]

In [7]:
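# Raw strings keep the regex backslashes (\d, \-, \.) from being read as string escapes.
pat = r'(?P<root>[^\d]+)(?P<index>__\d+_\d+)?(?P<indexice>.\d+\-\d+)?(?P<ext>\.nc)'

# Replace each numeric index found in a name with the regex fragment that describes it.
repl = lambda m: m.group('root') + (r'__\d+_\d+' if m.group('index') else '') + (r'.\d+-\d+' if m.group('indexice') else '') + m.group('ext')
# A callable replacement requires regex=True in recent pandas versions.
df['basename_pattern'] = df.basename.str.replace(pat, repl, regex=True)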
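To see what this substitution does to individual file names, the same pattern can be applied with the standard `re` module, outside pandas. The `generalize` function name is an assumption introduced here; the pattern itself is copied from the cell above.

```python
import re

# Same pattern as in the notebook cell, compiled once.
PAT = re.compile(
    r'(?P<root>[^\d]+)(?P<index>__\d+_\d+)?(?P<indexice>.\d+\-\d+)?(?P<ext>\.nc)'
)


def generalize(basename):
    """Replace the numeric index in a file name with a regex fragment."""
    def repl(m):
        # A string returned from a callable replacement is inserted literally,
        # so the backslashes below survive into the result.
        return (
            m.group('root')
            + (r'__\d+_\d+' if m.group('index') else '')
            + (r'.\d+-\d+' if m.group('indexice') else '')
            + m.group('ext')
        )
    return PAT.sub(repl, basename)


print(generalize('output__123_45.nc'))  # output__\d+_\d+.nc
print(generalize('ocean_month.nc'))     # ocean_month.nc
```

Names without a numeric index, such as `ocean_month.nc`, pass through unchanged, so all files of one type share a single `basename_pattern`.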

### Number of runs per configuration/experiment

The data directories contain several model __configurations__ (e.g. mom01v5 or mom025).

Each configuration contains a number of __experiments__ (e.g. KDS75 or KDS75_wind).

The output of each experiment is a set of several __runs__ (e.g. output266).
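This three-level hierarchy is visible in the file paths themselves. The sketch below splits one of the paths returned by the earlier `tau_x` query into its components. It assumes the layout `rootdir/configuration/experiment/run/.../basename` inferred from that query output; `split_ncfile` is a hypothetical helper, not necessarily how the cookbook parses paths.

```python
from pathlib import PurePosixPath


def split_ncfile(ncfile, rootdir):
    """Split a NetCDF file path into the fields used by the index.

    Assumes the first three directories below rootdir are the
    configuration, experiment, and run, in that order.
    """
    rel = PurePosixPath(ncfile).relative_to(rootdir)
    configuration, experiment, run = rel.parts[:3]
    return {
        'configuration': configuration,
        'experiment': experiment,
        'run': run,
        'basename': rel.name,
    }


fields = split_ncfile(
    '/g/data3/hh5/tmp/cosima/access-om2-025/025deg_jra55_ryf_spinup1/output003/ocean/ocean_month.nc',
    '/g/data3/hh5/tmp/cosima',
)
print(fields)
```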

In [39]:
table = pd.pivot_table(df, index=["rootdir", "configuration"],  
                       values=['experiment'], 
                       aggfunc=lambda x: len(x.unique()))
display(table)

| rootdir | configuration | experiment |
|---------|---------------|------------|
| /g/data1/v45 | APE-MOM | 4 |
| /g/data3/hh5/tmp/cosima | access-om2 | 2 |
| /g/data3/hh5/tmp/cosima | access-om2-025 | 4 |
| /g/data3/hh5/tmp/cosima | mom01v5 | 7 |
| /g/data3/hh5/tmp/cosima | mom025 | 6 |


This table shows the number of experiments available for analysis in each configuration.
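Counting the runs in each experiment follows the same idea: collect the distinct `run` values for every (configuration, experiment) pair. The sketch below spells out that logic in plain Python on a few invented rows; the `count_runs` helper and the toy data are assumptions. On the DataFrame built earlier, `df.groupby(['configuration', 'experiment'])['run'].nunique()` should express the same count.

```python
from collections import defaultdict


def count_runs(rows):
    """Count distinct runs per (configuration, experiment) pair."""
    runs = defaultdict(set)
    for row in rows:
        # The index holds one row per variable per file, so a run appears
        # many times; a set keeps each run only once.
        runs[(row['configuration'], row['experiment'])].add(row['run'])
    return {key: len(value) for key, value in runs.items()}


# Toy rows with the same column names as the index (values are invented).
rows = [
    {'configuration': 'mom01v5', 'experiment': 'KDS75', 'run': 'output266', 'variable': 'temp'},
    {'configuration': 'mom01v5', 'experiment': 'KDS75', 'run': 'output266', 'variable': 'salt'},
    {'configuration': 'mom01v5', 'experiment': 'KDS75', 'run': 'output267', 'variable': 'temp'},
    {'configuration': 'mom025', 'experiment': 'mom025_nyf', 'run': 'output000', 'variable': 'temp'},
]
run_counts = count_runs(rows)
print(run_counts)
```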

## MOM-SIS 0.25$^\circ$ Diagnostics

| Experiment Name | Description |
|-----------------|-----------------|
|mom025_nyf | Original simulation, rerun from WOA13 initial conditions.|
|mom025_nyf_salt | As above, with a new salt restoring file from WOA13 surface data. (Not running yet)|

## MOM-SIS 0.1$^\circ$

| Experiment Name | Description |
|-----------------|-----------------|
|GFDL50 | Original simulation with 50 vertical levels. Ran from Levitus for about 60 years, but data output only saved from about year 40.|
|KDS75 | Branched from GFDL50 at year 45 (re-zeroed), but with Kial Stewart's 75 level scheme. Has now run for 103 years. Years 90-100 have 5-daily output.|
| KDS75_wind | Short (5-year) Antarctic wind perturbation case, branched from KDS75 at year 40.|
| KDS75_PI | Paul Spence's Poleward Intensification wind experiment. Branched from KDS75 at year 70, will run until year 100 with 5-daily output for the last decade.|
| KDS75_UP | Paul Spence's Increased winds case. Branched from KDS75 at year 70, will run until year 100 with 5-daily output for the last decade. (In Progress) |

## ACCESS-OM2-025 Preliminary Analysis

| **Run Name** | **Forcing** | **Description** | **Status** |
|--------------|---------|-------------------------------------------------|-------------|
|025deg_jra55_ryf_spinup1 | JRA55 RYF9091| This is our initial 0.25° test. Ran for a decade before sea ice build-up overwhelmed us!  | Aborted after 10 years.| 
|025deg_jra55_ryf_spinup2 | JRA55 RYF9091| This is our initial 0.25° test with the sea ice parameter fixed. Less sea ice buildup, but there seems to be a problem with salinity conservation. It seems we are not doing runoff properly ... | Stopped at 50 years.| 
|025deg_jra55_ryf_spinup3 | JRA55 RYF9091| Third attempt at 0.25° test. This run is very unstable, and we think it might be something to do with runoff. Will try to fix this and start again. | Up to 8 years.| 
|025deg_jra55_ryf | JRA55 RYF9091| Latest attempt at 0.25° test.  | Started 5/8/17| 



## ACCESS-OM2 Preliminary Analysis

| **Run Name** | **Forcing** | **Description** | **Status** |
|--------------|---------|-------------------------------------------------|-------------|
|1deg_jra55_ryf_spinup1 | JRA55 RYF9091| A short 10 year spinup with first pre-release code. Had bugs in runoff and salt fluxes.| Aborted after 10 years.| 
|1deg_jra55_ryf | JRA55 RYF9091| Second attempt at 1° test. | Up to 50 years.|

