# Introduction to the pyaerocom web tools

This notebook shows how to set up and run the AeroCom evaluation tools, which create the json files used by the various [AeroVal](https://aerocom-evaluation.met.no/) evaluation interfaces.

## Note

It is recommended to first work through [setup_and_intro.ipynb](setup_and_intro.ipynb) and make sure everything is in place to use pyaerocom with access to PPI.

## Setting up the configuration for the analysis

In [1]:
import os
from warnings import filterwarnings
filterwarnings('ignore')

In [2]:
import pyaerocom as pya
pya.__version__

'0.10.0'

In the following, a complete configuration setup is specified; see the comments for details.

In [3]:
# Output directory (where json files are stored). NOTE: this should point to the *data/json* directory in your clone of the data GitLab repo
OUT_BASEDIR = os.path.abspath('../../data/json/')

# ID of project (please use this ID: it is linked with the URL later on and ensures that output is written to the correct place in the GitLab repo, under data/json/{PROJ_ID})
# IMPORTANT NOTE: for the workshop, everyone should use this project ID
PROJ_ID = 'workshop2021'

# ID of experiment (name of the subdirectory data/json/{PROJ_ID}/{EXP_ID}), also used for experiment navigation in the web interface.
# IMPORTANT NOTE: PLEASE CHANGE THIS FOR YOUR OWN EXPERIMENT (e.g. Group1, Group2, ...)
EXP_ID = 'example1'

# Directory where colocated NetCDF files are stored (this is not relevant for the website, so it can be set flexibly)
COLDATA_BASEDIR = os.path.abspath('./coldata')

### Make sure `OUT_BASEDIR` is set correctly relative to the *web* repo

In [4]:
from pathlib import Path
path = Path(OUT_BASEDIR)
assert path.exists()
assert path.name == 'json'
assert path.parent.name == 'data'
assert 'web' in os.listdir(path.parent.parent)
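The bare assertions above fail without saying which condition was violated. As a minimal sketch, the same checks can be written so that they report what is wrong. The helper name `check_out_basedir` is hypothetical and not part of pyaerocom.

```python
from pathlib import Path

def check_out_basedir(out_basedir):
    """Return a list of problems with the expected <root>/data/json layout.

    Hypothetical helper (not part of pyaerocom). An empty list means the
    directory passes the same checks as the assertions in the cell above.
    """
    path = Path(out_basedir)
    problems = []
    if not path.exists():
        problems.append(f"{path} does not exist")
    if path.name != "json":
        problems.append(f"last path component is {path.name!r}, expected 'json'")
    if path.parent.name != "data":
        problems.append(f"parent directory is {path.parent.name!r}, expected 'data'")
    if not (path.parent.parent / "web").exists():
        problems.append(f"no 'web' entry found next to {path.parent}")
    return problems
```

Calling `check_out_basedir(OUT_BASEDIR)` and printing the returned list shows every failed condition at once instead of stopping at the first one.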

### Create analysis setup class 

In [5]:
stp = pya.web.AerocomEvaluation(proj_id=PROJ_ID, exp_id=EXP_ID, 
                                exp_name='Tutorial experiment for pyaerocom workshop',
                                out_basedir=OUT_BASEDIR,
                                basedir_coldata=COLDATA_BASEDIR)
print(stp)


Pyaerocom AerocomEvaluation
---------------------------
Project ID: workshop2021
Eperiment ID: example1
Experiment name: Tutorial experiment for pyaerocom workshop
colocation_settings: (will be updated for each run from model_config and obs_config entry)
  save_coldata: True
  _obs_cache_only: False
  obs_vars: None
  obs_vert_type: None
  model_vert_type_alt: None
  read_opts_ungridded: None
  obs_ts_type_read: None
  model_use_vars: None
  model_add_vars: None
  model_keep_outliers: True
  model_to_stp: False
  model_id: None
  model_name: None
  model_data_dir: None
  obs_id: None
  obs_name: None
  obs_data_dir: None
  obs_keep_outliers: False
  obs_use_climatology: False
  obs_add_meta: []
  gridded_reader_id: {'model': 'ReadGridded', 'obs': 'ReadGridded'}
  start: None
  stop: None
  ts_type: None
  filter_name: None
  remove_outliers: True
  apply_time_resampling_constraints: None
  min_num_obs: None
  resample_how: None
  var_outlier_ranges: None
  var_ref_outlier_ranges: None

The most important things to define for the analysis are:

- `obs_config`: dictionary of dictionaries containing the observations to be used
- `model_config`: dictionary of dictionaries containing the models to be used
- `colocation_settings`: (see above) most of these can be left untouched; the essential ones are set further below

### Observation setup

The `obs_config` entry defines the observations to be used. Below we define one set of observations, Aeronet (AOD and Angstrom Exponent). At the end, this setup is assigned to the evaluation class that we just created.

In [6]:
obs_cfg = {
    # key is name as it appears in web interface, value contains setup 
    'Aeronet' : {
        'obs_id'        : 'AeronetSunV3Lev2.daily', # ID of obsnetwork
        'obs_vars'      : ['ang4487aer', 'od550aer'], # list of variables (Angstrom Exponent, 440-870nm, and AOD at 550 nm)
        'obs_vert_type' : 'Column', # this is needed, choose from Column or Surface
        'obs_filters'   : {'altitude' : [0, 1000]},
        'ignore_station_names' : 'DRAGON*'
    }
}

stp['obs_config'] = obs_cfg
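The value `'DRAGON*'` above is a wildcard pattern. To illustrate which station names such a pattern selects, the sketch below uses shell-style matching from Python's `fnmatch` module. Whether pyaerocom matches names in exactly this way internally is an assumption, and the station names are made up.

```python
from fnmatch import fnmatchcase

def drop_ignored(station_names, pattern):
    """Keep only station names that do NOT match the shell-style pattern."""
    return [name for name in station_names if not fnmatchcase(name, pattern)]

# Made-up station names, for illustration only
stations = ["DRAGON_Site_A", "Station_B", "Station_C_DRAGON"]
kept = drop_ignored(stations, "DRAGON*")
print(kept)
```

With this kind of matching, the pattern only excludes names that start with `DRAGON`; a name that merely contains the word later on is kept.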

### Defining models to be used for evaluation

In [7]:
model_cfg = {
    'Aerocom-Median' : {'model_id' : 'AEROCOM-MEDIAN-2x3-GLISSETAL2020-1_AP3-CTRL'},
    'EC-Earth'    : {'model_id' : 'EC-Earth3-AerChem-met2010_AP3-CTRL2019'}
}

stp['model_config'] = model_cfg

## Colocation setup

In the following, we define the essential settings for the colocation of model / obs / var. Note: these can be overridden in each individual model or obs config entry where needed.

In [8]:
DEFAULT_COLOCATION_SETTINGS = dict(
    start = 2010, 
    stop = 2011,
    ts_type = 'daily', # desired output frequency of colocated data objects
    colocate_time = False, # if True and if input "ts_type" is lower resolution than highest available in model and obs, then model and obs are first colocated in higher res. before resampling to "ts_type"
    weighted_stats = True, # only relevant if models are evaluated against gridded satellite data
    apply_time_resampling_constraints = True,
    min_num_obs = pya.const.OBS_MIN_NUM_RESAMPLE,
    reanalyse_existing = True, # relevant for re-runs. If True, pre-existing colocated data files are re-used for computation of json files 
    remove_outliers=True, # remove outliers during colocation
    harmonise_units=True, # harmonise units before colocation (e.g. if obs is in ug m-3 and model is in kg m-3). Will crash if unit conversion cannot be done (e.g. obs in ug m-3 and model in nmole mole-1).
    model_keep_outliers=True, # if True, and remove_outliers is True, then only obs outliers are removed  (default behaviour)
)

stp.update(**DEFAULT_COLOCATION_SETTINGS)
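The idea that entry-specific settings take precedence over the defaults can be sketched with plain dictionaries. This is only an illustration: the function `settings_for_run` is hypothetical, the `ts_type` override in the model entry is invented for the example, and the order in which pyaerocom applies obs and model entries is an assumption.

```python
def settings_for_run(defaults, obs_entry, model_entry):
    """Merge settings so that entry-specific values win over the defaults.

    Hypothetical illustration; dictionaries applied later override
    values set by earlier ones.
    """
    merged = dict(defaults)      # copy, so the defaults stay untouched
    merged.update(obs_entry)     # assumed: obs entry overrides defaults
    merged.update(model_entry)   # assumed: model entry is applied last
    return merged

defaults = {"ts_type": "daily", "start": 2010, "remove_outliers": True}
obs_entry = {"obs_id": "AeronetSunV3Lev2.daily", "obs_vert_type": "Column"}
# The ts_type override below is invented for this example
model_entry = {"model_id": "EC-Earth3-AerChem-met2010_AP3-CTRL2019",
               "ts_type": "monthly"}

run = settings_for_run(defaults, obs_entry, model_entry)
print(run["ts_type"], run["start"])
```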

In [9]:
print(stp)


Pyaerocom AerocomEvaluation
---------------------------
Project ID: workshop2021
Eperiment ID: example1
Experiment name: Tutorial experiment for pyaerocom workshop
colocation_settings: (will be updated for each run from model_config and obs_config entry)
  save_coldata: True
  _obs_cache_only: False
  obs_vars: None
  obs_vert_type: None
  model_vert_type_alt: None
  read_opts_ungridded: None
  obs_ts_type_read: None
  model_use_vars: None
  model_add_vars: None
  model_keep_outliers: True
  model_to_stp: False
  model_id: None
  model_name: None
  model_data_dir: None
  obs_id: None
  obs_name: None
  obs_data_dir: None
  obs_keep_outliers: False
  obs_use_climatology: False
  obs_add_meta: []
  gridded_reader_id: {'model': 'ReadGridded', 'obs': 'ReadGridded'}
  start: 2010
  stop: 2011
  ts_type: daily
  filter_name: None
  remove_outliers: True
  apply_time_resampling_constraints: True
  min_num_obs: {'yearly': {'monthly': 3}, 'monthly': {'daily': 7}, 'daily': {'hourly': 6}, 'hourl
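The `min_num_obs` entry in the (truncated) output above contains, among others, `'monthly': {'daily': 7}`. A minimal sketch of what such a constraint could mean, assuming it reads as "a monthly value needs at least 7 daily values", is shown below in plain Python. It does not reproduce pyaerocom's own resampling code, and the data are made up.

```python
from collections import defaultdict
from statistics import mean

MIN_DAILY_PER_MONTH = 7  # taken from the 'monthly': {'daily': 7} entry

def monthly_means(daily_values, min_num=MIN_DAILY_PER_MONTH):
    """Average daily values per month; months with too few values become None.

    daily_values maps (year, month, day) tuples to floats.
    """
    by_month = defaultdict(list)
    for (year, month, _day), value in daily_values.items():
        by_month[(year, month)].append(value)
    return {
        key: mean(vals) if len(vals) >= min_num else None
        for key, vals in sorted(by_month.items())
    }

# Made-up data: 10 daily values in January 2010, only 3 in February 2010
daily = {(2010, 1, d): 0.2 for d in range(1, 11)}
daily.update({(2010, 2, d): 0.4 for d in range(1, 4)})
result = monthly_means(daily)
print(result)
```

In this example, January has enough daily values to yield a monthly mean, while February falls below the threshold and is left empty.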

In [10]:
stp.var_mapping = pya.web.web_naming_conventions.VAR_MAPPING

## Running the evaluation

The following command runs the evaluation. In a nested loop, it will:

- go through all observation network entries in `stp.obs_config`
- go through all models in `stp.model_config`
- colocate each obs / var / model entry and save the corresponding `ColocatedData` object(s) as NetCDF under `COLDATA_BASEDIR`
- compute, based on the colocated NetCDF files, all json files relevant for the web interface
- store these json files in a dedicated subdirectory under `OUT_BASEDIR`
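The steps above can be sketched as a plain nested loop. This is not pyaerocom's implementation: `run_evaluation_sketch` is a hypothetical function, and `colocate` and `compute_json` are placeholder callables that stand in for the real colocation and json-export steps.

```python
def run_evaluation_sketch(obs_config, model_config, colocate, compute_json):
    """Loop over obs networks, models and variables; colocate, then export.

    `colocate` and `compute_json` are placeholders supplied by the caller.
    """
    json_results = []
    for obs_name, obs_entry in obs_config.items():
        for model_name in model_config:
            for var in obs_entry["obs_vars"]:
                coldata = colocate(model_name, obs_name, var)
                json_results.append(compute_json(coldata))
    return json_results

obs = {"Aeronet": {"obs_vars": ["ang4487aer", "od550aer"]}}
models = {"Aerocom-Median": {}, "EC-Earth": {}}
done = run_evaluation_sketch(
    obs,
    models,
    colocate=lambda model, obs_name, var: (model, obs_name, var),
    compute_json=lambda coldata: "_".join(coldata),
)
print(len(done))
```

With one observation network, two models and two variables, the loop body runs four times, which matches the number of colocated files expected for this setup.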

In [11]:
stp.run_evaluation()


Running analysis:
Obs. names: ['Aeronet']
Model names: ['Aerocom-Median', 'EC-Earth']
Remove outliers: True
Harmonise units: True
Delete existing json files before reanalysis: True
Reanalyse existing colocated NetCDF files: True
Run only colocation (no json files computed): False
Raise exceptions if they occur: False

Running colocation of Aerocom-Median against Aeronet
PREPARING colocation of AEROCOM-MEDIAN-2x3-GLISSETAL2020-1_AP3-CTRL vs. AeronetSunV3Lev2.daily
The following variable combinations will be colocated
MODEL-VAR	OBS-VAR
ang4487aer	ang4487aer
od550aer	od550aer
Running AEROCOM-MEDIAN-2x3-GLISSETAL2020-1_AP3-CTRL / AeronetSunV3Lev2.daily (ang4487aer, ang4487aer)
Updating ts_type from daily to monthly (highest available in model AEROCOM-MEDIAN-2x3-GLISSETAL2020-1_AP3-CTRL)
Input filters {'altitude': [-1000000.0, 1000.0]} result in unchanged data object
Input filters {'latitude': [-89.0, 89.0], 'longitude': [-178.5, 178.5]} result in unchanged data object
WRITE: ang4487aer_RE

['/home/jonasg/MyPyaerocom/ws21/pyaerocom-meetings/Feb2021_Workshop/coldata/EC-Earth3-AerChem-met2010_AP3-CTRL2019/ang4487aer_REF-Aeronet_MOD-EC-Earth_20100101_20101231_monthly_None.nc',
 '/home/jonasg/MyPyaerocom/ws21/pyaerocom-meetings/Feb2021_Workshop/coldata/EC-Earth3-AerChem-met2010_AP3-CTRL2019/od550aer_REF-Aeronet_MOD-EC-Earth_20100101_20101231_monthly_None.nc']
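The two file names in the output above appear to follow the pattern variable, `REF-` obs name, `MOD-` model name, start date, stop date, frequency, and a final field. The sketch below parses a name under that assumption. The pattern is inferred only from these two examples, and the field names (`filter` in particular) are guesses rather than a documented pyaerocom convention.

```python
import re

# Pattern inferred from the two file names shown above (an assumption)
COLDATA_NAME = re.compile(
    r"^(?P<var>.+?)_REF-(?P<obs>.+?)_MOD-(?P<model>.+?)"
    r"_(?P<start>\d{8})_(?P<stop>\d{8})_(?P<ts_type>[^_]+)_(?P<filter>.+)\.nc$"
)

def parse_coldata_name(filename):
    """Split a colocated file name into its parts, or return None if no match."""
    match = COLDATA_NAME.match(filename)
    return match.groupdict() if match else None

info = parse_coldata_name(
    "ang4487aer_REF-Aeronet_MOD-EC-Earth_20100101_20101231_monthly_None.nc"
)
print(info)
```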

This should create a bunch of colocated NetCDF files in the colocated data directory: 

In [12]:
COLDATA_BASEDIR

'/home/jonasg/MyPyaerocom/ws21/pyaerocom-meetings/Feb2021_Workshop/coldata'

And based on those it computes all required json files, which are stored in:

In [13]:
f'{OUT_BASEDIR}/{PROJ_ID}/{EXP_ID}'

'/home/jonasg/MyPyaerocom/ws21/data/json/workshop2021/example1'
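To verify that the run produced output, the experiment directory can be searched for json files. The sketch below searches recursively because the subdirectory layout is not described in this notebook; `list_json_files` is a hypothetical helper, not a pyaerocom function.

```python
from pathlib import Path

def list_json_files(exp_dir):
    """Return all *.json files below exp_dir as sorted paths relative to it."""
    root = Path(exp_dir)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.json"))
```

For example, `list_json_files(f'{OUT_BASEDIR}/{PROJ_ID}/{EXP_ID}')` returns an empty list if the evaluation has not written anything yet.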

## Launching the local version of the web interface

If you have cloned all required repos correctly, then the folder containing the [data](https://gitlab.met.no/aerocom-evaluation/data) GitLab repo (here: ws21) should also contain the [web](https://gitlab.met.no/aerocom-evaluation/web) GitLab repo. You also need PHP installed locally.

If this is the case (and the analysis has finished), you should be able to check out your results in a local version of the web interface.

**From within the web repo, call:**
```bash
php -S localhost:8000
```

Then open http://localhost:8000 in your browser, click on one of the projects (e.g. AeroCom) and, in the URL, replace "project=aerocom" with "project=workshop2021". You should then see your experiment in the experiment menu. This should be the URL:

http://localhost:8000/main.php?project=workshop2021