# TROPoe - The Basics

This notebook is intended as an introduction to the files produced by the Tropospheric Remotely Observed Profiling via Optimal Estimation (TROPoe) algorithm, which we use to retrieve temperature and humidity profiles from CLAMPS instrumentation. The algorithm uses optimal estimation to retrieve the thermodynamic state from infrared spectrometer and microwave radiometer observations.

The Atmospheric Emitted Radiance Interferometer (AERI) measures downwelling infrared radiance from 3-19 µm at high spectral resolution. Profiles of temperature and water vapor are retrieved from these observations, as well as cloud properties and trace gas information. Two blackbody targets maintain calibration to better than 1%. AERI systems are essentially identical between the CLAMPS facilities.

The Microwave Radiometer (MWR) measures downwelling microwave radiance from 22 to 60 GHz in 10-20 channels (depending on model/configuration). Profiles of temperature and water vapor are retrieved from these observations. The MWR has lower vertical resolution than the AERI, but is able to get some information through clouds. These measurements are used in a thermodynamic retrieval algorithm (either AERIoe or TROPoe). Both CLAMPS facilities are equipped with an RPG HATPRO microwave radiometer.

The files that TROPoe produces contain a lot of information, but most users only care about a few components. This notebook introduces the basic structure of a TROPoe file, parses out the most commonly used variables, and provides some example figures to show best practices when plotting these types of data.

For more information on optimal estimation retrievals using these instruments, here are some references:

- Maahn, M., D. D. Turner, U. Löhnert, D. J. Posselt, K. Ebell, G. G. Mace, and J. M. Comstock, 2020: Optimal Estimation Retrievals and Their Uncertainties: What Every Atmospheric Scientist Should Know. Bull. Amer. Meteor. Soc., 101, E1512–E1523, https://doi.org/10.1175/BAMS-D-19-0027.1.

- Turner, D. D., and U. Löhnert, 2014: Information Content and Uncertainties in Thermodynamic Profiles and Liquid Cloud Properties Retrieved from the Ground-Based Atmospheric Emitted Radiance Interferometer (AERI). J. Appl. Meteor. Climatol., 53, 752–771, https://doi.org/10.1175/JAMC-D-13-0126.1.

- Turner, D. D., and W. G. Blumberg, 2019: Improvements to the AERIoe Thermodynamic Profile Retrieval Algorithm. IEEE J. Sel. Top. Appl. Earth Observations Remote Sensing, 12, 1339–1354, https://doi.org/10.1109/JSTARS.2018.2874968.



In [1]:
%matplotlib widget 
from ipywidgets import interact

from datetime import datetime

import cmocean
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
from netCDF4 import Dataset

from utils import timeheight

## Basic file structure

First, let's open one of the datasets from our THREDDS server, as in the introduction notebook: `clampstropoe10.aeri_mwr.v2.C1.20190920.001005.cdf`

The filename is split up into multiple parts and is loosely based on the ARM file name standard, though slightly modified to make it more apparent what instruments are included in the retrieval. Here is how to interpret it:

- `clampstropoe10`: This means that this is a TROPoe file produced from CLAMPS observations. The 10 means the retrieval was run at 10-minute resolution
- `aeri_mwr`: This means that this retrieval includes both AERI and MWR observations
- `v2`: This is the version of the retrieval. Always make sure you have the most up-to-date version of a retrieval, since there are often multiple rounds of QC
- `C1`: This means the observations were from CLAMPS1 
- `20190920.001005`: This is the first observation date/time in the file
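
The naming convention above can be unpacked programmatically. The following is a minimal sketch, not part of the CLAMPS tooling: `parse_tropoe_filename` is a hypothetical helper, and it assumes the time field is formatted as HHMMSS and that file names always contain exactly these dot-separated parts.

```python
from datetime import datetime


def parse_tropoe_filename(filename):
    """Split a CLAMPS TROPoe file name into labelled parts (illustrative helper)."""
    # Assumes exactly seven dot-separated fields, as in the example file name
    product, instruments, version, facility, date, time, extension = filename.split(".")

    # The trailing digits of the product name give the resolution in minutes
    name = product.rstrip("0123456789")
    resolution = int(product[len(name):])

    return {
        "product": name,
        "resolution_min": resolution,
        "instruments": instruments.split("_"),
        "version": version,
        "facility": facility,
        # Assumes the time field is HHMMSS
        "start": datetime.strptime(date + time, "%Y%m%d%H%M%S"),
        "extension": extension,
    }


info = parse_tropoe_filename("clampstropoe10.aeri_mwr.v2.C1.20190920.001005.cdf")
print(info)
```

A helper like this can be handy when looping over a directory listing to pick out a particular facility or retrieval version.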


In [2]:
nc = Dataset('https://data.nssl.noaa.gov/thredds/dodsC/FRDD/CLAMPS/clamps/clamps1/processed/clampstropoe10.aeri_mwr.v2.C1/clampstropoe10.aeri_mwr.v2.C1.20190920.001005.cdf')


As mentioned before, if we print out the header of this netCDF file, we see that it includes a large number of variables:

In [3]:
nc

<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF3_CLASSIC data model, file format DAP2):
    algorithm_code: TROPoe Retrieval Code
    algorithm_author: Dave Turner, Earth System Research Laboratory / NOAA dave.turner@noaa.gov
    algorithm_comment1: TROPoe is a physical-iterative algorithm that retrieves thermodynamic profiles from a wide range of ground-based remote sensors.  It was primarily designed to use either infrared spectrometers or microwave radiometers as the primary instrument, and include observations from other sources to improve the quality of the retrieved profiles
    algorithm_comment2: Original code was written in IDL and is described by the "AERIoe" papers listed below
    algorithm_comment3: Code was ported to python by Joshua Gebauer with contributions from Tyler Bell (both at the University of Oklahoma)
    algorithm_version: 0.2.36
    algorithm_reference1: DD Turner and U Loehnert, Information Content and Uncertanties in Thermodynamic Profiles and Liquid 

### Important meteorological variables

In reality, most users of these data are interested primarily in the following variables and their associated dimensions:

**base_time**: Epoch time -- ()

**time_offset**: Time offset from base_time -- ('time',)

**hour**: Time -- ('time',)

**qc_flag**: Manual QC flag -- ('time',)

**height**: height -- ('height',)

**temperature**: temperature -- ('time', 'height')

**waterVapor**: water vapor mixing ratio -- ('time', 'height')

**sigma_temperature**: 1-sigma uncertainty in temperature -- ('time', 'height')

**sigma_waterVapor**: 1-sigma uncertainty in water vapor mixing ratio -- ('time', 'height')

**converged_flag**: convergence flag -- ('time',)

**rmsr**: root mean square error between AERI obs in the observation vector and the forward calculation -- ('time',)

**rmsa**: root mean square error between observation vector and the forward calculation -- ('time',)

**rmsp**: root mean square error between prior T/q profile and the retrieved T/q profile -- ('time',)

**pressure**: derived pressure -- ('time', 'height')

**theta**: potential temperature -- ('time', 'height')

**thetae**: equivalent potential temperature -- ('time', 'height')

**rh**: relative humidity -- ('time', 'height')

**dewpt**: dew point temperature -- ('time', 'height')

**dindices**: derived indices -- ('time', 'index_dim')

**sigma_dindices**: 1-sigma uncertainties in the derived indices -- ('time', 'index_dim')

**lat**: latitude -- ()

**lon**: longitude -- ()

**alt**: altitude -- ()

These netCDF files have quite a few different dimensions because of the large amount of diagnostic information they include. We cover some of the other dimensions in the 12_TROPoe_Advanced notebook, but for now the most important are `time`, `height`, and `index_dim`. The `time` and `height` dimensions are self-explanatory, but `index_dim` may be a little confusing. This dimension is for the derived indices (PWV, PBLH, surface inversion height, surface inversion magnitude, and LCL) contained in the `dindices` variable. The uncertainties of these indices are calculated through Monte Carlo simulation and stored in `sigma_dindices`. See below for which index corresponds to each variable:

In [4]:
nc['dindices']

<class 'netCDF4._netCDF4.Variable'>
float32 dindices(time, index_dim)
    long_name: derived indices
    units: units depends on the index; see comments below
    comment0: This field is derived from the retrieved fields
    comment1: A value of -999 indicates that this inded could not be computed (typically because the value was aphysical)
    field_0_name: pwv
    field_0_units: cm
    field_1_name: pblh
    field_1_units: km AGL
    field_2_name: sbih
    field_2_units: km AGL
    field_3_name: sbim
    field_3_units: C
    field_4_name: lcl
    field_4_units: km AGL
    _ChunkSizes: [1 5]
unlimited dimensions: time
current shape = (143, 5)
filling off
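
Because the column order is recorded in the `field_N_name` and `field_N_units` attributes, a derived index can be looked up by name instead of by a hard-coded column number. The sketch below is one possible approach, not part of TROPoe itself: `index_columns` and `extract_index` are hypothetical helpers, and the sample array holds made-up values standing in for `nc['dindices'][:]`. It also masks the -999 missing value described in `comment1`.

```python
import numpy as np

MISSING = -999.0  # missing-value marker described in the variable's comment1


def index_columns(attrs):
    """Map each derived-index name to (column, units) from field_N_* attributes."""
    columns = {}
    n = 0
    while f"field_{n}_name" in attrs:
        columns[attrs[f"field_{n}_name"]] = (n, attrs[f"field_{n}_units"])
        n += 1
    return columns


def extract_index(dindices, columns, name):
    """Return one derived index as a masked array, together with its units."""
    col, units = columns[name]
    values = np.ma.masked_equal(np.asarray(dindices)[:, col], MISSING)
    return values, units


# Attributes copied from the header above; with a real file they could be read with
# {k: nc['dindices'].getncattr(k) for k in nc['dindices'].ncattrs()}
attrs = {
    "field_0_name": "pwv", "field_0_units": "cm",
    "field_1_name": "pblh", "field_1_units": "km AGL",
    "field_2_name": "sbih", "field_2_units": "km AGL",
    "field_3_name": "sbim", "field_3_units": "C",
    "field_4_name": "lcl", "field_4_units": "km AGL",
}

# Made-up sample with shape (time, index_dim); the second PBLH is missing
sample = np.array([[2.1, 0.45, 0.10, 1.5, 0.9],
                   [2.3, -999.0, 0.20, 2.0, 1.1]], dtype=np.float32)

columns = index_columns(attrs)
pblh, pblh_units = extract_index(sample, columns, "pblh")
print(pblh, pblh_units)
```

Masking the missing values up front keeps them out of later statistics and plots, since masked entries are skipped by NumPy reductions and left blank by matplotlib.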

### Plotting data 
Now we'll see how to plot some sample data. 

First we need to decode the times. This is a fairly simple approach that works for most of the CLAMPS datasets:

In [5]:
times = np.array([datetime.utcfromtimestamp(d) for d in nc['base_time'][0]+nc['time_offset'][:]])
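
As a side note, `datetime.utcfromtimestamp` returns naive datetimes and is deprecated as of Python 3.12. A timezone-aware alternative could look like the sketch below. `decode_times` is a hypothetical helper, the numbers are made up for illustration, and it assumes `base_time` is in seconds since the Unix epoch with `time_offset` in seconds, as the cell above implies.

```python
from datetime import datetime, timedelta, timezone


def decode_times(base_time, time_offsets):
    """Convert an epoch base time plus per-sample offsets (seconds) to UTC datetimes."""
    start = datetime.fromtimestamp(float(base_time), tz=timezone.utc)
    return [start + timedelta(seconds=float(offset)) for offset in time_offsets]


# Made-up values: a base time on 2019-09-20 and three samples 10 minutes apart
example_times = decode_times(1568938205, [0.0, 600.0, 1200.0])
print(example_times[0].isoformat())
```

With a real file, the inputs would be `nc['base_time'][:]` and `nc['time_offset'][:]`. Note that timezone-aware and naive datetimes cannot be compared directly, so it is best to pick one convention and use it throughout a notebook.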

Now we'll plot some simple profiles of the temperature and water vapor mixing ratio, along with their uncertainty. One of the nice things about the TROPoe retrieval is the uncertainty is explicitly calculated by taking into account all the sources of uncerta

In [7]:

fig, (ax1, ax2) = plt.subplots(1, 2)
fig.set_figheight(5)
fig.set_figwidth(7.5)

# The height grid is the same for every profile, so read it once
height = nc['height'][:]


@interact(ind=(0, nc.dimensions['time'].size-1))
def doplot(ind):
    # Temperature profile and its 1-sigma uncertainty at the selected time
    t_op = nc['temperature'][ind, :]
    t_err = nc['sigma_temperature'][ind, :]
    ax1.cla()
    ax1.plot(t_op, height, color='maroon')
    ax1.fill_betweenx(height, t_op+t_err, t_op-t_err, color='maroon', alpha=.2)
    ax1.set_ylim(0, 3)
    ax1.set_xlim(0, 35)
    ax1.grid()
    ax1.set_xlabel("Temperature [C]")
    ax1.set_ylabel("Height [km]")

    # Water vapor mixing ratio profile and its 1-sigma uncertainty
    w_op = nc['waterVapor'][ind, :]
    w_err = nc['sigma_waterVapor'][ind, :]
    ax2.cla()
    ax2.plot(w_op, height, color='C0')
    ax2.fill_betweenx(height, w_op+w_err, w_op-w_err, color='C0', alpha=.2)
    ax2.set_ylim(0, 3)
    ax2.set_xlim(0, 15)
    ax2.grid()
    ax2.set_xlabel("WVMR [g/kg]")
    ax2.set_ylabel("Height [km]")

    # Label the figure with the time of the selected profile
    fig.suptitle(times[ind])


Now we will use the `timeheight` function located in `utils.py` to create a time height figure of the temperature and water vapor mixing ratio. 

In [8]:
# Decode the times, using the same approach as above
t = [datetime.utcfromtimestamp(d) for d in (nc['base_time'][:]+nc['time_offset'][:])]
h = nc['height'][:]

# Create the figure 
fig, (temp_ax, wvmr_ax) =  plt.subplots(2, sharex=True)
fig.set_figheight(7.5)
fig.set_figwidth(12.5)

X, Y = np.meshgrid(t, h)

timeheight(X, Y, nc['temperature'][:].T, 't', temp_ax, zmin=0, zmax=3, datamin=0, datamax=35)
timeheight(X, Y, nc['waterVapor'][:].T, 'q', wvmr_ax, zmin=0, zmax=3, datamin=0, datamax=18)

temp_ax.set_title(f"Temperature -- {times[0]:%Y-%m-%d}")
wvmr_ax.set_title(f"WVMR -- {times[0]:%Y-%m-%d}")


You might notice something strange happening between 10Z and 15Z. Additionally, the data look a little 'streaky' above 0.5 km or so. We'll investigate this further in the advanced notebook.