# Introduction to the pyaerocom web tools using local obs and model data

This notebook shows how to set up and run the AeroCom evaluation tools that create the json files used by the various [AeroVal](https://aerocom-evaluation.met.no/) evaluation interfaces.

## Note

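This notebook uses a local copy of the obsdata; see [here](https://github.com/metno/pyaerocom-meetings/tree/master/Feb2021_Workshop#speedup---create-a-local-copy-of-relevant-obsdata) for instructions on how to get the data onto your computer.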

Before starting, it is recommended to work through [setup_and_intro.ipynb](setup_and_intro.ipynb) and [example_webeval.ipynb](example_webeval.ipynb) and to make sure everything is in place to use pyaerocom with access to PPI.

## Setting up the configuration for the analysis

In [1]:
import os
from warnings import filterwarnings
filterwarnings('ignore')

In [2]:
import pyaerocom as pya
pya.__version__

'0.10.0'

The following cell specifies the complete configuration; see the comments for details.

In [3]:
# Output directory (where json files are stored): NOTE: this should point to the *data* Gitlab repo that you should have cloned
OUT_BASEDIR = os.path.abspath('../../data/json/')

# ID of project (please use this ID, as this is linked with the URL later on and will make sure to write into the correct GitLab repo, under data/json/{PROJ_ID})
# IMPORTANT NOTE: for the workshop, please all use this project ID
PROJ_ID = 'workshop2021'

# ID of experiment (will be name of subdirectory under data/json/{PROJ_ID}/{EXP_ID}) and used for experiment navigation in the web interface.
# IMPORTANT NOTE: PLEASE CHANGE THIS FOR YOUR OWN EXPERIMENT (e.g. Group1, Group2, ...)
EXP_ID = 'example2'

# Directory where colocated NetCDF files are stored (this is not relevant for the website, so it can be set flexibly)
COLDATA_BASEDIR = os.path.abspath('./coldata')

### Make sure `OUT_BASEDIR` is set correctly relative to the *web* repo

In [4]:
from pathlib import Path
path = Path(OUT_BASEDIR)
assert path.exists()
assert path.name == 'json'
assert path.parent.name == 'data'
assert 'web' in os.listdir(path.parent.parent)
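
The bare `assert` statements above stop at the first failure without saying which expectation was violated. A minimal sketch of a more verbose check follows. `check_out_basedir` is a hypothetical helper (not part of pyaerocom), and it assumes the same layout as the asserts: a `data/json` directory with a `web` directory next to `data`.

```python
from pathlib import Path


def check_out_basedir(out_basedir):
    """Return a list of problems found with the output directory layout.

    Hypothetical helper for illustration (not part of pyaerocom). It mirrors
    the asserts above, but is slightly stricter: 'web' must be a directory.
    """
    path = Path(out_basedir)
    problems = []
    if not path.exists():
        problems.append(f"{path} does not exist")
    if path.name != "json":
        problems.append(f"expected directory name 'json', got '{path.name}'")
    if path.parent.name != "data":
        problems.append(f"expected parent directory 'data', got '{path.parent.name}'")
    if not (path.parent.parent / "web").is_dir():
        problems.append(f"no 'web' directory found in {path.parent.parent}")
    return problems
```

Calling `check_out_basedir(OUT_BASEDIR)` returns an empty list if the layout matches, and otherwise one message per violated expectation.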

### Create analysis setup class 

In [5]:
stp = pya.web.AerocomEvaluation(proj_id=PROJ_ID, exp_id=EXP_ID, 
                                exp_name='Tutorial experiment 2 for pyaerocom workshop',
                                out_basedir=OUT_BASEDIR,
                                basedir_coldata=COLDATA_BASEDIR)

### Observation setup

#### Set and check access to local obsdata 

Specify the directory that contains the unpacked [tarball](https://github.com/metno/pyaerocom-meetings/tree/master/Feb2021_Workshop#speedup---create-a-local-copy-of-relevant-obsdata):

In [6]:
# PLEASE MODIFY
OBS_BASEDIR = '/home/jonasg/MyPyaerocom/ws21/obslocal'

GHOST_EEA_DAILY_LOCAL = os.path.join(OBS_BASEDIR, 'GHOST/data/EEA_AQ_eReporting/daily')
GHOST_EBAS_DAILY_LOCAL =  os.path.join(OBS_BASEDIR, 'GHOST/data/EBAS/daily')

# make sure the directories exist
assert os.path.exists(GHOST_EEA_DAILY_LOCAL)
assert os.path.exists(GHOST_EBAS_DAILY_LOCAL)

#### Define `obs_config` in `AerocomEvaluation` class with local copy of GHOST dataset

In [7]:
obs_cfg = {
    # key is name as it appears in web interface, value contains setup 
    'G-EBAS-d-rural'     : {
        'obs_id'        :'GHOST.EBAS.daily',
        'obs_data_dir'  : GHOST_EBAS_DAILY_LOCAL,
        'obs_vars'      : ['concpm10', 'concpm25'], # list of variables (PM10 and PM2.5 mass concentrations)
        'obs_vert_type' : 'Surface', # this is needed, choose from Column or Surface
        'obs_filters'   : {'altitude' : [0, 1000], 
                           'set_flags_nan' : True, # Invalidate flagged data
                           'station_classification'  :   ['background'],
                           'area_classification'     :   ['rural','rural-near_city',
                                                          'rural-regional', 'rural-remote']
                          }
    },
    'G-EEA-d-rural'     : {
        'obs_id'        :'GHOST.EEA.daily',
        'obs_data_dir'  : GHOST_EEA_DAILY_LOCAL,
        'obs_vars'      : ['concpm10', 'concpm25'], # list of variables (PM10 and PM2.5 mass concentrations)
        'obs_vert_type' : 'Surface', # this is needed, choose from Column or Surface
        'obs_filters'   : {'altitude' : [0, 1000], 
                           'set_flags_nan' : True, # Invalidate flagged data
                           'station_classification'  :   ['background'],
                           'area_classification'     :   ['rural','rural-near_city',
                                                          'rural-regional', 'rural-remote']
                          }
    }
}

stp['obs_config'] = obs_cfg
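
The two entries above differ only in `obs_id` and `obs_data_dir`. One way to avoid the duplication is a small factory function. The sketch below is an illustration only: `make_rural_obs_entry` and `obs_cfg_sketch` are hypothetical names (not part of pyaerocom), and the paths are placeholders for `GHOST_EBAS_DAILY_LOCAL` and `GHOST_EEA_DAILY_LOCAL`.

```python
def make_rural_obs_entry(obs_id, obs_data_dir):
    """Build one obs_config entry with the rural-background filters used above.

    Hypothetical helper for illustration; not part of pyaerocom.
    Each call returns fresh dicts and lists, so entries share no mutable state.
    """
    return {
        "obs_id": obs_id,
        "obs_data_dir": obs_data_dir,
        "obs_vars": ["concpm10", "concpm25"],
        "obs_vert_type": "Surface",
        "obs_filters": {
            "altitude": [0, 1000],
            "set_flags_nan": True,
            "station_classification": ["background"],
            "area_classification": [
                "rural", "rural-near_city", "rural-regional", "rural-remote",
            ],
        },
    }


# Placeholder paths; in this notebook they would be the two local GHOST directories.
obs_cfg_sketch = {
    "G-EBAS-d-rural": make_rural_obs_entry("GHOST.EBAS.daily", "/path/to/EBAS/daily"),
    "G-EEA-d-rural": make_rural_obs_entry("GHOST.EEA.daily", "/path/to/EEA/daily"),
}
```

With this pattern, a change to the filters only has to be made in one place.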

### Defining models to be used for evaluation

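Each model is identified by its `model_id`. If you have a local copy of the model data, uncomment `model_data_dir` and set it to that location: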

In [8]:
model_cfg = {
    'IFS-CTRL'     : {
        'model_id' : 'ECMWF_CNTRL',
        #'model_data_dir' : '<path to modeldata if you have it locally>'
    },
    'IFS-OSUITE'     : {
        'model_id' : 'ECMWF_OSUITE',
        #'model_data_dir' : '<path to modeldata if you have it locally>'
    }
}

stp['model_config'] = model_cfg

## Colocation setup

The following cell defines the essential settings for the colocation of model / obs / var. Note: these can be overridden in each individual model or obs config entry where needed.

In [9]:
DEFAULT_COLOCATION_SETTINGS = dict(
    start = 2019, 
    stop = None,
    ts_type = 'daily', # desired output frequency of colocated data objects
    colocate_time = False, # if True and if input "ts_type" is lower resolution than highest available in model and obs, then model and obs are first colocated in higher res. before resampling to "ts_type"
    weighted_stats = True, # only relevant if models are evaluated against gridded satellite data (no example provided)
    apply_time_resampling_constraints = True,
    # set conservative min_num_obs requirement (ca 75% coverage at daily and weekly levels)
    min_num_obs = {'monthly' : {'daily': 22} # at least 22 days per month
    }, # resampling
    reanalyse_existing = False, # relevant for re-runs. If False, pre-existing colocated data files are re-used for computation of json files; if True, they are recomputed
    remove_outliers=True, # remove outliers during colocation
    harmonise_units=True, # harmonise units before colocation (e.g. if obs is in ug m-3 and model is in kg m-3). Will crash if unit conversion cannot be done (e.g. obs in ug m-3 and model in nmole mole-1).
    model_keep_outliers=True, # if True, and remove_outliers is True, then only obs outliers are removed  (default behaviour)
)

stp.update(**DEFAULT_COLOCATION_SETTINGS)
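
The comment on `min_num_obs` mentions roughly 75% coverage. As a quick check of that arithmetic, the sketch below computes the fraction of days that 22 valid daily values represent in each month of 2019. It uses only the standard library; `monthly_coverage` is a hypothetical helper name.

```python
import calendar

MIN_DAYS = 22  # value used for 'daily' -> 'monthly' in min_num_obs above


def monthly_coverage(year, min_days=MIN_DAYS):
    """Map month number (1-12) to the fraction of days covered by min_days."""
    return {
        month: min_days / calendar.monthrange(year, month)[1]
        for month in range(1, 13)
    }


coverage_2019 = monthly_coverage(2019)
for month, fraction in coverage_2019.items():
    print(f"{calendar.month_abbr[month]}: {fraction:.1%}")
```

The required coverage therefore ranges from about 71% in 31-day months to about 79% in February 2019.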

In [10]:
print(stp)


Pyaerocom AerocomEvaluation
---------------------------
Project ID: workshop2021
Eperiment ID: example2
Experiment name: Tutorial experiment 2 for pyaerocom workshop
colocation_settings: (will be updated for each run from model_config and obs_config entry)
  save_coldata: True
  _obs_cache_only: False
  obs_vars: None
  obs_vert_type: None
  model_vert_type_alt: None
  read_opts_ungridded: None
  obs_ts_type_read: None
  model_use_vars: None
  model_add_vars: None
  model_keep_outliers: True
  model_to_stp: False
  model_id: None
  model_name: None
  model_data_dir: None
  obs_id: None
  obs_name: None
  obs_data_dir: None
  obs_keep_outliers: False
  obs_use_climatology: False
  obs_add_meta: []
  gridded_reader_id: {'model': 'ReadGridded', 'obs': 'ReadGridded'}
  start: 2019
  stop: None
  ts_type: daily
  filter_name: None
  remove_outliers: True
  apply_time_resampling_constraints: True
  min_num_obs: {'monthly': {'daily': 22}}
  resample_how: None
  var_outlier_ranges: None
  var
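
The printout notes that the colocation settings are updated for each run from the `model_config` and `obs_config` entries. Conceptually, this behaves like layered dictionary updates. The sketch below illustrates the idea with plain dicts. It is not pyaerocom's implementation: the precedence order (defaults, then model entry, then obs entry) is an assumption, `settings_for_run` is a hypothetical name, and the `ts_type` override is made up for the example.

```python
def settings_for_run(defaults, model_entry, obs_entry):
    """Merge settings for one model/obs pair; later layers win.

    Illustration only; the precedence order is an assumption.
    """
    merged = dict(defaults)      # copy, so the defaults stay untouched
    merged.update(model_entry)   # model-specific settings
    merged.update(obs_entry)     # obs-specific settings
    return merged


defaults = {"start": 2019, "ts_type": "daily", "remove_outliers": True}
model_entry = {"model_id": "ECMWF_CNTRL"}
# Hypothetical override: this obs entry asks for monthly output.
obs_entry = {"obs_id": "GHOST.EBAS.daily", "ts_type": "monthly"}

run_settings = settings_for_run(defaults, model_entry, obs_entry)
```

Here `run_settings["ts_type"]` is `'monthly'` for this run, while `defaults` still holds `'daily'` for all other runs.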

In [11]:
stp.var_mapping = pya.web.web_naming_conventions.VAR_MAPPING

In [12]:
stp.run_evaluation()


Running analysis:
Obs. names: ['G-EBAS-d-rural', 'G-EEA-d-rural']
Model names: ['IFS-CTRL', 'IFS-OSUITE']
Remove outliers: True
Harmonise units: True
Delete existing json files before reanalysis: True
Reanalyse existing colocated NetCDF files: False
Run only colocation (no json files computed): False
Raise exceptions if they occur: False

Running colocation of IFS-CTRL against G-EBAS-d-rural
PREPARING colocation of ECMWF_CNTRL vs. GHOST.EBAS.daily
The following variable combinations will be colocated
MODEL-VAR	OBS-VAR
concpm10	concpm10
concpm25	concpm25
Running ECMWF_CNTRL / GHOST.EBAS.daily (concpm10, concpm10)
Deactivating file search by vertical code for ECMWF_CNTRL, since filenames do not include information about vertical code (probably AeroCom 2 convention)
Did not find concpm10 field but sconcpm10. Using the latter instead
Skip concpm10_REF-G-EBAS-d-rural_MOD-IFS-CTRL_20190101_20191231_daily_None.nc (file already exists)
Running ECMWF_CNTRL / GHOST.EBAS.daily (concpm25, concpm2

['/home/jonasg/MyPyaerocom/ws21/pyaerocom-meetings/Feb2021_Workshop/coldata/ECMWF_OSUITE/concpm25_REF-G-EEA-d-rural_MOD-IFS-OSUITE_20190101_20191231_daily_None.nc',
 '/home/jonasg/MyPyaerocom/ws21/pyaerocom-meetings/Feb2021_Workshop/coldata/ECMWF_OSUITE/concpm10_REF-G-EEA-d-rural_MOD-IFS-OSUITE_20190101_20191231_daily_None.nc']
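
The names of the colocated files listed above follow a visible pattern: variable, `REF-` plus obs name, `MOD-` plus model name, start and stop date, frequency, and a final field. The sketch below splits such a name into its parts. The pattern is inferred only from the file names shown in this notebook, not from the pyaerocom documentation, and `parse_coldata_name` is a hypothetical helper. What the final field (`None` here) represents is not confirmed, so it is kept as a plain string.

```python
import re
from pathlib import Path

# Pattern inferred from the file names in the output above (an assumption).
COLDATA_NAME = re.compile(
    r"^(?P<var>[^_]+)_REF-(?P<obs_name>.+)_MOD-(?P<model_name>.+)"
    r"_(?P<start>\d{8})_(?P<stop>\d{8})_(?P<ts_type>[^_]+)_(?P<last_field>[^_]+)\.nc$"
)


def parse_coldata_name(filepath):
    """Split a colocated-data file name into a dict of its components."""
    match = COLDATA_NAME.match(Path(filepath).name)
    if match is None:
        raise ValueError(f"unexpected file name: {filepath}")
    return match.groupdict()


info = parse_coldata_name(
    "concpm25_REF-G-EEA-d-rural_MOD-IFS-OSUITE_20190101_20191231_daily_None.nc"
)
```

For this file, `info["obs_name"]` is `'G-EEA-d-rural'` and `info["model_name"]` is `'IFS-OSUITE'`, which makes it easy to group the returned files by model or observation network.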

## Looking at the results

If you have not done so already, launch the local webserver (from **web** repo):

```bash
php -S localhost:8000
```

Then open:

http://localhost:8000/main.php?project=workshop2021