# Make Your Own Database

The COSIMA cookbook uses a database to access information about experiments and to help with loading model output. We maintain a default database for ACCESS-OM2 experiments, but there are occasions when you might want to make your own database. This tutorial outlines the process of making your own private database.

**Requirements:** We recommend that you use the most recent `conda/analysis3` kernel on NCI, or your own up-to-date cookbook installation.

In [1]:
%matplotlib inline
import cosima_cookbook as cc

**First, create a database session** using the inbuilt `create_session` function. To do this, you need to specify a path for the database. Choose a location where you have write permission (that is, not the example path that I have given here):

In [2]:
db = '/g/data/e14/rmh561/access-om2/archive/databases/cc_database_nummix.db'
session = cc.database.create_session(db)

Note that you need to create the database session every time you start up your notebook; you can then update this database as many times as you like.
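Since the database path must be somewhere you can write to, it can be worth checking the location before creating the session. The following is a minimal sketch using only the Python standard library; `check_db_location` is a hypothetical helper (not part of the cookbook), and the path in the usage lines is a placeholder.

```python
import os
from pathlib import Path


def check_db_location(db_path):
    """Return True if a database file could be created or updated at db_path.

    Hypothetical helper for illustration; not part of the COSIMA cookbook.
    """
    path = Path(db_path)
    parent = path.parent
    # The containing directory must exist and be writable and searchable.
    parent_ok = parent.is_dir() and os.access(parent, os.W_OK | os.X_OK)
    if path.exists():
        # An existing database file must itself be writable to be updated.
        return parent_ok and os.access(path, os.W_OK)
    return parent_ok


# Placeholder path: replace with a location in your own project or scratch space.
db = os.path.join(os.path.expanduser("~"), "cc_database_example.db")
print(db, "is usable:", check_db_location(db))
```

If the check returns `False`, pick a different directory before calling `cc.database.create_session`.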

**Now you are ready to build a database.** First, select which *experiments* you want to include in your database. For these purposes, an *experiment* is a directory containing output from a single simulation. (If you use a higher-level directory, you won't be able to distinguish between experiments.)

My example below lists experiment directories at three different resolutions. All but one are commented out, so only the uncommented 1/4-degree experiment is indexed here; uncomment any others you need. The database will be built to index all netCDF files in each directory.

In [3]:
dir_list=[# 1-degree runs
          #   '/g/data/e14/rmh561/access-om2/archive/1deg_jra55_ryf_kds50',
          #   '/g/data/e14/rmh561/access-om2/archive/1deg_jra55_ryf_kds75',
          #   '/g/data/e14/rmh561/access-om2/archive/1deg_jra55_ryf_kds100',
          #   '/g/data/e14/rmh561/access-om2/archive/1deg_jra55_ryf_kds135',
          #   '/g/data/e14/rmh561/access-om2/archive/1deg_jra55_ryf_gfdl50',
          # # 1/4-degree runs
          #   '/g/data/e14/rmh561/access-om2/archive/025deg_jra55_ryf_rediGM',
          #   '/g/data/e14/rmh561/access-om2/archive/025deg_jra55_ryf_norediGM',
          #   '/g/data/e14/rmh561/access-om2/archive/025deg_jra55_ryf_noGM',
          #   '/g/data/e14/rmh561/access-om2/archive/025deg_jra55_ryf_rediGM_kb1em5',
          #   '/g/data/e14/rmh561/access-om2/archive/025deg_jra55_ryf_rediGM_kbvar',
          #   '/g/data/e14/rmh561/access-om2/archive/025deg_jra55_ryf_kds75',
            '/g/data/e14/rmh561/access-om2/archive/025deg_jra55_ryf_norediGM_smoothkppbl',
          # # 1/10-degree runs
          #   '/g/data/e14/rmh561/access-om2/archive/01deg_jra55_ryf', # diathermal diags run (some snapshots 2nd year)
          #   '/g/data/ik11/outputs/access-om2-01/01deg_jra55v13_ryf9091', # original 01-degree run 
          #   '/g/data/ik11/outputs/access-om2-01/01deg_jra55v13_ryf9091_k_smag_iso3' # k_smag run
            ]
cc.database.build_index(dir_list, session, update=True, prune=True)

Indexing experiment: 025deg_jra55_ryf_norediGM_smoothkppbl

100%|          | 36/36 [00:05<00:00,  7.11it/s]

36

Note that this operation may take a little while the first time through, but later updates are relatively painless, **provided that you have the `update=True` flag switched on**. You now have your own database. Remember to pass your own database session when you load model output; otherwise the cookbook will look for your experiment in the default database.
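If indexing picks up fewer files than you expected, one quick check is to count the netCDF files under each experiment directory yourself. The sketch below does this with the standard library only. `count_netcdf_files` is a hypothetical helper, and it assumes that the files of interest end in `.nc`; the count may not match the number reported by `build_index` exactly, for example if some files were already indexed.

```python
from pathlib import Path


def count_netcdf_files(directories, pattern="*.nc"):
    """Map each directory to the number of files beneath it that match pattern.

    Hypothetical helper for illustration; not part of the COSIMA cookbook.
    Directories that do not exist are mapped to None.
    """
    counts = {}
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            counts[str(directory)] = None
            continue
        # rglob searches the directory and all of its subdirectories.
        counts[str(directory)] = sum(1 for p in root.rglob(pattern) if p.is_file())
    return counts


# Placeholder list: replace with your own experiment directories.
example_dirs = ["/path/to/your/experiment"]
for directory, n_files in count_netcdf_files(example_dirs).items():
    status = "missing" if n_files is None else f"{n_files} netCDF files"
    print(f"{directory}: {status}")
```

A directory reported as missing usually points to a typo in the path or a project you do not have access to.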

## Using the database
For details on how to use this database effectively, see the companion tutorial, `COSIMA_CookBook_Tutorial`. Alternatively, here is a sample that shows how you might load a variable from an experiment in your database.

In [None]:
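# Use an experiment that is indexed in your own database (here, the one built above).
expt = '025deg_jra55_ryf_norediGM_smoothkppbl'
# Replace with any variable that is present in your experiment's output.
variable = 'ke_tot'
# Pass your own session so the lookup uses your database rather than the default one.
darray = cc.querying.getvar(expt, variable, session)
annual_average = darray.resample(time='A').mean(dim='time')
annual_average.plot()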

If you want to know more about the inbuilt functions used above, you can use the help function at any time, for example:

In [None]:
help(cc.database.create_session)

In [None]:
help(cc.querying.getvar)