# Lab 2b: getting started with APCEMM

<a target="_blank" href="https://colab.research.google.com/github/contrailcirrus/2024-06-contrails-workshop/blob/main/labs/apcemm/APCEMM.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

**Tristan Abbott (tristan.abbott@breakthroughenergy.org)**

This lab introduces the Aircraft Plume Chemistry Emission and Microphysics Model (APCEMM), an intermediate-complexity contrail model developed at the [MIT Laboratory for Aviation and the Environment](https://lae.mit.edu/), and demonstrates how to use the [pycontrails](https://py.contrails.org/) interface to run APCEMM with real-world flight and meteorology data.

## Building APCEMM

APCEMM (https://github.com/MIT-LAE/APCEMM) is written in C++ and must be compiled before use. The build process is straightforward but can take up to an hour if dependencies (managed by vcpkg) also have to be built. The first cell below contains commands for building APCEMM locally, with the source code pinned to the most recent commit hash tested in the pycontrails interface. The second cell downloads and unzips the result from building APCEMM in Colab.

Lab attendees should run the second cell to download the pre-built version. (This approach is somewhat fragile--the pre-built version may not work if different Colab instances run on sufficiently different architectures--but building APCEMM from scratch would take up most of the lab.)

In [None]:
# build APCEMM locally (~45 minutes)
!git clone https://github.com/MIT-LAE/APCEMM ~/APCEMM && \
    cd ~/APCEMM && \
    git reset --hard 9d8e1ee && \
    git submodule update --init --recursive && \
    mkdir build && \
    cd build && \
    cmake ../Code.v05-00 && \
    cmake --build .

In [None]:
# download results from building APCEMM in Colab (<5 minutes)
!cd ~ && wget https://storage.googleapis.com/2024-06-contrails-workshop/apcemm/APCEMM.zip && unzip APCEMM.zip && rm APCEMM.zip

## Installing pycontrails

This command installs pycontrails plus all optional dependencies besides `jupyter`, which conflicts with Colab requirements. It also installs `pyarrow`, which is needed to read parquet files.

In [None]:
!pip install "pycontrails[ecmwf,gcp,gfs,pyproj,sat,vis,zarr]"
!pip install pyarrow

## Download required data

This command downloads meteorology and flight data from a public cloud bucket. Downloading the data rather than reading it directly from cloud storage avoids the need for users to authenticate using a Google Cloud account.

In [None]:
!cd ~ && \
    wget https://storage.googleapis.com/2024-06-contrails-workshop/apcemm/iagos.pq && \
    wget https://storage.googleapis.com/2024-06-contrails-workshop/apcemm/era5.zarr.zip && \
    unzip era5.zarr.zip && \
    rm era5.zarr.zip

## Case 1: APCEMM with idealized meteorology

We will use an idealized case (a contrail that forms in an ISSR of finite depth and limited duration) to demonstrate the workflow for a single APCEMM simulation.

### Step 1a: constructing meteorology input file

APCEMM expects meteorology data to be provided in a netCDF file that contains a timeseries of atmospheric profiles along the Lagrangian trajectory of an advected contrail segment. Note that the trajectory of the advected segment must be estimated *before* running APCEMM. Unlike CoCiP, APCEMM does not internally track changes to contrail position over time.

APCEMM requires pressure at t = 0 plus time-varying temperature, RH over ice, segment-normal wind shear, and vertical velocity as input. We will use the pressure and temperature profiles from the [International Standard Atmosphere](https://en.wikipedia.org/wiki/International_Standard_Atmosphere), a simple step function for RHi, and constant values of 0.01 1/s and 0 m/s for segment-normal shear and vertical velocity.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

In [None]:
z = np.linspace(0, 20, 41)  # altitude (km)
t = np.linspace(0, 4, 49)  # elapsed time (hours)

In [None]:
zz, tt = np.meshgrid(z, t, indexing="ij")

In [None]:
p = np.where(
    z < 11.0,
    101325*(1 - 6.5*z/288.15)**(9.80/(6.5e-3*287)),
    22632*np.exp(-9.80*(z - 11)*1e3/(287*216.65))
)
T = np.where(zz < 11.0, 288.15 - 6.5*zz, 216.65)
rhi = np.where((tt <= 0.5) & (zz > 9) & (zz < 11), 1.2, 0.2)
shear = np.full_like(zz, 0.01)
w = np.full_like(zz, 0.0)

In [None]:
ds = xr.Dataset(data_vars = {
    "pressure": (("altitude",), p/100, {"units": "hPa"}),
    "temperature": (("altitude", "time"), T, {"units": "K"}),
    "relative_humidity_ice": (("altitude", "time"), rhi*100, {"units": "percent"}),
    "shear": (("altitude", "time"), shear, {"units": "1/s"}),
    "w": (("altitude", "time"), w, {"units": "m/s"})
}, coords = {
    "altitude": ("altitude", z, {"units": "km"}),
    "time": ("time", t, {"units": "hours"})
})  

In [None]:
plt.figure(figsize=(12.8, 4.8))
plt.subplot(121)
ds["relative_humidity_ice"].plot()
plt.subplot(122)
ds["temperature"].plot()

### Step 1b: constructing input YAML file

Most APCEMM input parameters are configured in a YAML file. The file format is largely self-describing, and examples distributed with APCEMM include explanatory comments. We will generate the YAML file using some pycontrails utilities that expose many (but, for now, not all) of the YAML file parameters.

The pycontrails utilities require that the user provide
- the initial exhaust plume location (we will pick an arbitrary location)
- meteorological conditions at the point of emission (we will derive these from the idealized meteorology dataset)
- aircraft performance and emissions parameters (we will use nominal values)

Other YAML parameters are set to reasonable default values but can be overridden by the user. Note that the default time step for APCEMM numerics is set to 1 minute, a conservative value that is likely shorter than required.
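
Two of the meteorological inputs at the point of emission, relative humidity over liquid water and the Brunt-Väisälä frequency, are not stored in the idealized dataset and have to be derived from it. The sketch below shows one way to do this with only the standard library, under stated assumptions: the helper names are hypothetical, the saturation vapor pressures use Magnus-type fits rather than the `pycontrails.physics.thermo` functions used in this lab, and the Brunt-Väisälä frequency is taken as $N = \sqrt{(g/\theta)\,\partial\theta/\partial z}$ with altitude in meters.

```python
import math

G = 9.80      # gravitational acceleration (m/s^2), same value as the lab's ISA profile
R_D = 287.0   # gas constant for dry air (J/kg/K)
C_P = 1004.0  # specific heat of dry air at constant pressure (J/kg/K); an assumed value


def isa_temperature(z_m):
    """ISA temperature (K) at altitude z_m (m); valid below 20 km."""
    return 288.15 - 6.5e-3 * z_m if z_m < 11e3 else 216.65


def isa_pressure(z_m):
    """ISA pressure (Pa) at altitude z_m (m); valid below 20 km."""
    if z_m < 11e3:
        return 101325.0 * (1 - 6.5e-3 * z_m / 288.15) ** (G / (6.5e-3 * R_D))
    return 22632.0 * math.exp(-G * (z_m - 11e3) / (R_D * 216.65))


def potential_temperature(z_m):
    """Potential temperature (K) referenced to the ISA surface pressure."""
    return isa_temperature(z_m) * (101325.0 / isa_pressure(z_m)) ** (R_D / C_P)


def brunt_vaisala_frequency(z_m, dz_m=500.0):
    """N (1/s) from a centered difference of potential temperature over 2*dz_m."""
    theta = potential_temperature(z_m)
    dtheta_dz = (
        potential_temperature(z_m + dz_m) - potential_temperature(z_m - dz_m)
    ) / (2 * dz_m)
    return math.sqrt(G / theta * dtheta_dz)


def e_sat_liquid_hpa(t_k):
    """Magnus-type saturation vapor pressure over liquid water (hPa)."""
    t_c = t_k - 273.15
    return 6.1094 * math.exp(17.625 * t_c / (243.04 + t_c))


def e_sat_ice_hpa(t_k):
    """Magnus-type saturation vapor pressure over ice (hPa)."""
    t_c = t_k - 273.15
    return 6.1121 * math.exp(22.587 * t_c / (273.86 + t_c))


def rhi_to_rhw(rhi, t_k):
    """Convert relative humidity over ice to relative humidity over liquid water."""
    return rhi * e_sat_ice_hpa(t_k) / e_sat_liquid_hpa(t_k)


# Conditions at 10 km inside the idealized ISSR (RHi = 1.2)
n_10km = brunt_vaisala_frequency(10e3)
rhw_10km = rhi_to_rhw(1.2, isa_temperature(10e3))
print(f"N = {n_10km:.4f} 1/s, RHw = {rhw_10km:.2f}")
```

Values of order 0.01 1/s for the Brunt-Väisälä frequency are typical of the upper troposphere, which makes this a convenient sanity check on the value passed to `APCEMMInput`.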

In [None]:
from pycontrails.models.apcemm import utils
from pycontrails.models.apcemm.inputs import APCEMMInput
from pycontrails.physics import thermo

In [None]:
iz = 20  # form contrail at 10 km
theta = T[:,0]*(p[0]/p)**(287/1000)
params = APCEMMInput(
    # required parameters
    longitude=0,
    latitude=45,
    day_of_year=1,
    hour_of_day=12,
    air_pressure=p[iz],
    air_temperature=T[iz,0],
    rhw=rhi[iz,0]*thermo.e_sat_ice(T[iz,0])/thermo.e_sat_liquid(T[iz,0]),
    normal_shear=shear[iz,0],
    brunt_vaisala_frequency=np.sqrt(9.80*(theta[iz+1] - theta[iz-1])/((z[iz+1] - z[iz-1])*1e3)/theta[iz]),  # N = sqrt(g/theta * dtheta/dz), with z converted from km to m
    nox_ei=10e-3,
    co_ei=1e-3,
    hc_ei=0.6e-3,
    so2_ei=1.2e-3,
    nvpm_ei_m=0.008e-3,
    soot_radius=20e-9,
    fuel_flow=0.7,
    aircraft_mass=1e5,
    true_airspeed=260.0,
    n_engine=2,
    wingspan=35,
    core_exit_temp=550,
    core_exit_area=1,
    # optional parameters
    max_age=np.timedelta64(4, "h"),  # stop after no more than 4 hours
    dt_input_met=np.timedelta64(5, "m"),  # must match time step of met input file
    dt_apcemm_nc_output=np.timedelta64(10, "m"),  # frequency of netcdf output files
)
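
The comment on `dt_input_met` above is worth checking explicitly, because the time step is implied by the `time` coordinate of the meteorology file rather than stored in it. A minimal sketch of that check follows; the helper name is hypothetical, and the numbers are the ones used for the idealized case (49 evenly spaced samples spanning 0 to 4 hours).

```python
import math


def met_time_step_minutes(t_start_h, t_end_h, n_times):
    """Spacing (minutes) of n_times evenly spaced samples from t_start_h to t_end_h (hours)."""
    return (t_end_h - t_start_h) / (n_times - 1) * 60.0


# Time axis of the idealized meteorology: np.linspace(0, 4, 49) in hours
dt_met_minutes = met_time_step_minutes(0.0, 4.0, 49)

# Value passed to APCEMMInput as dt_input_met, expressed in minutes
dt_input_met_minutes = 5.0

if not math.isclose(dt_met_minutes, dt_input_met_minutes):
    raise ValueError(
        f"met file time step ({dt_met_minutes} min) does not match "
        f"dt_input_met ({dt_input_met_minutes} min)"
    )
print(f"met time step: {dt_met_minutes:.1f} min")
```

If the number of samples or the time span of the meteorology file changes, `dt_input_met` has to change with it.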

### Step 2: write input files to disk and run APCEMM

In [None]:
import os

In [None]:
rundir = os.path.expanduser("~/APCEMM_run/case_1")
os.makedirs(rundir, exist_ok=True)

with open(os.path.join(rundir, "input.yaml"), "w") as f:
    yaml = utils.generate_apcemm_input_yaml(params)
    f.write(yaml)

ds.to_netcdf(os.path.join(rundir, "input.nc"))

In [None]:
!cd ~/APCEMM_run/case_1 && ~/APCEMM/build/APCEMM input.yaml

### Step 3: view output

The APCEMM simulation created two types of output files inside a subdirectory called `out`:
- `Micro000000.out`: output from the "early plume model"; i.e., the parameterization of the aircraft exhaust plume and downwash vortex, formatted as a CSV file
- `ts_aerosol_case0_HHMM.nc`: output from a finite volume model of the contrail cross-section initialized from the early plume model, formatted as netCDF files with HHMM replaced by the hour and minute of the simulation when each file was written.
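
The `HHMM` naming convention makes it easy to build the list of netCDF files to look for. The sketch below is illustrative only: the helper name is hypothetical, and it assumes one file per output interval, starting one interval after initialization and continuing up to the maximum simulation age. A simulation that ends early (for example, because the contrail dissipates) may produce fewer files.

```python
def expected_output_names(max_age_min, dt_output_min, prefix="ts_aerosol_case0_"):
    """List candidate netCDF output names, one per output interval."""
    names = []
    for elapsed in range(dt_output_min, max_age_min + 1, dt_output_min):
        hours, minutes = divmod(elapsed, 60)
        names.append(f"{prefix}{hours:02d}{minutes:02d}.nc")
    return names


# Idealized case: 4-hour maximum age, output every 10 minutes
names = expected_output_names(max_age_min=240, dt_output_min=10)
print(names[0], names[-1], len(names))
```

Checking each candidate with `os.path.exists` before opening it avoids errors when a simulation stops before the maximum age.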

In [None]:
import pandas as pd

The early plume model resolves the transient spike in relative humidity as the exhaust plume mixes with ambient air:

In [None]:
df = pd.read_csv(os.path.join(rundir, "out", "Micro000000.out"), skiprows=[1]).rename(columns=lambda x: x.strip())

In [None]:
plt.plot(df["Time [s]"], df["RH_w [-]"], "b-", label="over water")
plt.plot(df["Time [s]"], df["RH_i [-]"], "k-", label="over ice")
plt.xlabel("Elapsed time (s)")
plt.ylabel("Relative humidity (nondim.)")
plt.gca().set_xscale("log")
plt.gca().axhline(y=1, color="gray", zorder=-1)
plt.legend(loc="upper left", frameon=False)

The finite-volume model simulates the evolution of the contrail that forms from the exhaust plume. It relaxes the Gaussian plume assumption used by CoCiP...

In [None]:
plt.figure(figsize=(12.8, 9.6))
plt.subplot(221)
ds = xr.open_dataset(os.path.join(rundir, "out", "ts_aerosol_case0_0010.nc"), decode_cf=False)
ds["IWC"].plot(cmap="Blues_r", vmin=0)
plt.annotate("00:10", xy=(0.98, 0.98), xycoords="axes fraction", va="top", ha="right", color="white")

plt.subplot(222)
ds = xr.open_dataset(os.path.join(rundir, "out", "ts_aerosol_case0_0030.nc"), decode_cf=False)
ds["IWC"].plot(cmap="Blues_r", vmin=0)
plt.annotate("00:30", xy=(0.98, 0.98), xycoords="axes fraction", va="top", ha="right", color="white")

plt.subplot(223)
ds = xr.open_dataset(os.path.join(rundir, "out", "ts_aerosol_case0_0100.nc"), decode_cf=False)
ds["IWC"].plot(cmap="Blues_r", vmin=0)
plt.annotate("01:00", xy=(0.98, 0.98), xycoords="axes fraction", va="top", ha="right", color="white")

plt.subplot(224)
ds = xr.open_dataset(os.path.join(rundir, "out", "ts_aerosol_case0_0230.nc"), decode_cf=False)
ds["IWC"].plot(cmap="Blues_r", vmin=0)
plt.annotate("02:30", xy=(0.98, 0.98), xycoords="axes fraction", va="top", ha="right", color="white")

... and explicitly simulates the evolution of the ice crystal size distribution.

In [None]:
plt.figure()

ds = xr.open_dataset(os.path.join(rundir, "out", "ts_aerosol_case0_0010.nc"), decode_cf=False)
plt.plot(ds["r"]*1e6, ds["Overall size distribution"]/1e6, label="00:10")

ds = xr.open_dataset(os.path.join(rundir, "out", "ts_aerosol_case0_0030.nc"), decode_cf=False)
plt.plot(ds["r"]*1e6, ds["Overall size distribution"]/1e6, label="00:30")

ds = xr.open_dataset(os.path.join(rundir, "out", "ts_aerosol_case0_0100.nc"), decode_cf=False)
plt.plot(ds["r"]*1e6, ds["Overall size distribution"]/1e6, label="01:00")

ds = xr.open_dataset(os.path.join(rundir, "out", "ts_aerosol_case0_0230.nc"), decode_cf=False)
plt.plot(ds["r"]*1e6, ds["Overall size distribution"]/1e6*100, label="02:30 (x100)")

plt.xlabel(r"Radius ($\mu$m)")
plt.ylabel(r"Density (particles/$\mu$m)")
plt.legend(loc="upper right", frameon=False)

## Case 2: APCEMM on a real-world flight

We'll select an IAGOS flight between San Diego and Frankfurt in early February 2019 and use the pycontrails APCEMM interface to run APCEMM simulations initialized at a couple of waypoints. The interface is designed to be similar to other pycontrails models: you load meteorology data into a [MetDataset](https://py.contrails.org/api/pycontrails.MetDataset.html#pycontrails.MetDataset), create an `APCEMM` model, and call the model's `eval` method on a [Flight](https://py.contrails.org/api/pycontrails.core.flight.html#pycontrails.core.flight.Flight) instance.

In [None]:
import cartopy.crs as ccrs
import matplotlib.dates as mdates

from pycontrails.core import MetDataset, Flight
from pycontrails.models.apcemm import APCEMM
from pycontrails.models.issr import ISSR
from pycontrails.models.humidity_scaling import HistogramMatching
from pycontrails.models.ps_model import PSFlight

### Step 1: load meteorology and flight data

The required data is staged in a public cloud bucket for this lab. After loading the data, we'll use the pycontrails [ISSR](https://py.contrails.org/api/pycontrails.models.issr.html#pycontrails.models.issr.ISSR) model to quickly compute and plot ERA5 RHi at the two waypoints where we'll run APCEMM.

In [None]:
ds = xr.open_zarr("~/era5.zarr")
met = MetDataset(ds, provider="ECMWF", dataset="ERA5", product="reanalysis")

In [None]:
df = pd.read_parquet("~/iagos.pq")
flight = Flight(data=df, attrs={"flight_id": "0"}).resample_and_fill("1min")

In [None]:
model = ISSR(met=met, humidity_scaling=HistogramMatching())
result = model.eval(flight)

In [None]:
waypoints = [180, 300]

In [None]:
ax = plt.subplot(111, projection=ccrs.PlateCarree())
ax.coastlines()
ax.set_global()
ax.plot(flight["longitude"], flight["latitude"], "b-", transform=ccrs.Geodetic())
for idx in waypoints:
    ax.plot(flight["longitude"][idx], flight["latitude"][idx], "r.", transform=ccrs.Geodetic())

In [None]:
plt.plot(df["time"], df["rhi"], "k-", label="IAGOS")
plt.plot(result["time"], result["rhi"], "b-", label="ERA5")
plt.legend(loc="best", frameon=False)
plt.xlabel("Time")
plt.ylabel("RHi (nondim)")
plt.gca().xaxis.set_major_locator(plt.MaxNLocator(5))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter("%m-%d %H:%M"))
for idx in waypoints:
    plt.gca().axvline(flight["time"][idx], color="red", zorder=-1)

### Step 2: create and evaluate an `APCEMM` model

This model automates the steps we did by hand before running the idealized APCEMM case earlier in the lab. Specifically, it
- runs an aircraft performance model (in this case [PSFlight](https://py.contrails.org/api/pycontrails.models.ps_model.PSFlight.html)) to compute performance and emissions parameters,
- combines results from the aircraft performance model and meteorology data to create input YAML files, 
- runs a [DryAdvection](https://py.contrails.org/notebooks/advection.html) model to estimate the Lagrangian trajectories of advected contrail segments,
- uses the computed trajectories plus the MetDataset passed to the model to create netCDF meteorology files in the format expected by APCEMM,
- creates run directories (under `~/.cache/pycontrails/apcemm` by default, though this can be changed by passing a custom [DiskCacheStore](https://py.contrails.org/api/pycontrails.DiskCacheStore.html#pycontrails-diskcachestore) to the model) and writes YAML and netCDF input files to disk,
- runs APCEMM simulations, and
- does light postprocessing of APCEMM output.
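
To make the trajectory step in the list above more concrete, here is a deliberately simplified sketch of Lagrangian advection. It is not the pycontrails `DryAdvection` implementation: it assumes a constant horizontal wind, ignores vertical motion, and takes forward-Euler steps on a spherical Earth. The function name and the wind values are hypothetical.

```python
import math

R_EARTH = 6_371_000.0  # mean Earth radius (m)


def advect_point(lon_deg, lat_deg, u_ms, v_ms, dt_s, n_steps):
    """Advect a point with a constant eastward (u) and northward (v) wind.

    Returns a list of (longitude, latitude) pairs in degrees,
    including the starting position.
    """
    track = [(lon_deg, lat_deg)]
    for _ in range(n_steps):
        lat_rad = math.radians(lat_deg)
        # Eastward displacement maps to a longitude change that grows toward the poles
        lon_deg += math.degrees(u_ms * dt_s / (R_EARTH * math.cos(lat_rad)))
        lat_deg += math.degrees(v_ms * dt_s / R_EARTH)
        track.append((lon_deg, lat_deg))
    return track


# One hour of advection in 10-minute steps with a 20 m/s westerly wind
track = advect_point(-100.0, 45.0, u_ms=20.0, v_ms=0.0, dt_s=600.0, n_steps=6)
print(track[-1])
```

In the real workflow the winds vary in space and time and are interpolated from the `MetDataset` at each step, which is why the trajectories must be computed before the APCEMM meteorology files can be written.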

To limit runtime, we'll set the maximum simulation duration to 2 hours and increase the APCEMM transport, coagulation, and ice growth time steps to 10 minutes.

In [None]:
model = APCEMM(
    apcemm_path=os.path.expanduser("~/APCEMM/build/APCEMM"),
    met=met,
    max_age=np.timedelta64(2, "h"),
    aircraft_performance=PSFlight(),
    humidity_scaling=HistogramMatching(),
    apcemm_input_params=dict(
        dt_apcemm_nc_output=np.timedelta64(10, "m"),
        dt_apcemm_transport=np.timedelta64(10, "m"),
        dt_apcemm_coagulation=np.timedelta64(10, "m"),
        dt_apcemm_ice_growth=np.timedelta64(10, "m")
    ) 
)

In [None]:
result = model.eval(flight, waypoints=waypoints, n_jobs=1)

### Step 3: examine model output

Lagrangian trajectories of advected contrail segments are stored in `model.trajectories`. (Note that Lagrangian trajectories are computed for all waypoints, not just waypoints where APCEMM simulations are initialized, but we'll plot them only for waypoints where we ran simulations.)

In [None]:
model.trajectories

In [None]:
df = model.trajectories.dataframe

ax = plt.subplot(111, projection=ccrs.PlateCarree())
ax.coastlines()
ax.set_extent([-110, -50, 30, 65], crs=ccrs.Geodetic())
ax.plot(flight["longitude"], flight["latitude"], "b-", transform=ccrs.Geodetic(), label="Flight trajectory")
for i, waypoint in enumerate(waypoints):
    label = "Contrail segment trajectories" if i == 0 else ""
    head = flight.dataframe[flight.dataframe.index == waypoint]
    tail = df[df["waypoint"] == waypoint]
    traj = pd.concat((head, tail))
    ax.plot(traj["longitude"], traj["latitude"], "r-", transform=ccrs.Geodetic(), label=label)
ax.legend(loc="upper left")

The output from `model.eval` stores quantities calculated for APCEMM input files plus the status of APCEMM simulations initialized at each waypoint.

In [None]:
result

"Incomplete" indicates that a persistent contrail formed but did not dissipate before the maximum simulation time was reached.

In [None]:
result.dataframe[result.dataframe["waypoint"] == 180]["status"]

"NoWaterSaturation" indicates that no contrail formed because the exhaust plume never reached saturation over liquid water while mixing with ambient air.

In [None]:
result.dataframe[result.dataframe["waypoint"] == 300]["status"]

"NoSimulation" indicates that no APCEMM simulation was initialized at the waypoint:

In [None]:
result.dataframe[result.dataframe["waypoint"] == 400]["status"]
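
When simulations are initialized at many waypoints, it can help to tabulate the status values rather than inspect them one at a time. The sketch below is a minimal example using only the three statuses described in this lab; the helper name is hypothetical, and the interface may report other statuses that are not covered here.

```python
from collections import Counter

# Meanings of the status strings described in this lab
STATUS_MEANING = {
    "Incomplete": "persistent contrail still present at the maximum simulation time",
    "NoWaterSaturation": "plume never reached saturation over liquid water; no contrail",
    "NoSimulation": "no APCEMM simulation initialized at this waypoint",
}


def summarize_statuses(statuses):
    """Count each status and attach its meaning, if it is one described in this lab."""
    counts = Counter(statuses)
    return {
        status: (count, STATUS_MEANING.get(status, "not described in this lab"))
        for status, count in counts.items()
    }


# Synthetic example: two simulated waypoints among four
summary = summarize_statuses(
    ["NoSimulation", "Incomplete", "NoSimulation", "NoWaterSaturation"]
)
for status, (count, meaning) in summary.items():
    print(f"{status}: {count} ({meaning})")
```

With the output from this lab, the same helper could be called as `summarize_statuses(result.dataframe["status"])`.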

Output from the APCEMM early plume model is stored in a DataFrame in `model.vortex`:

In [None]:
model.vortex

In [None]:
plt.figure(figsize=(12.8, 4.8))

df = model.vortex[model.vortex["waypoint"] == 180]
elapsed_time = (df["time"] - df["time"].min())/np.timedelta64(1, "s")
plt.subplot(121)
plt.plot(elapsed_time, df["RH_w [-]"], "b-", label="over water")
plt.plot(elapsed_time, df["RH_i [-]"], "k-", label="over ice")
plt.xlabel("Elapsed time (s)")
plt.ylabel("Relative humidity (nondim.)")
plt.gca().set_xscale("log")
plt.gca().axhline(y=1, color="gray", zorder=-1)
plt.title("Waypoint 180 (persistent contrail)")
plt.legend(loc="upper left", frameon=False)

df = model.vortex[model.vortex["waypoint"] == 300]
elapsed_time = (df["time"] - df["time"].min())/np.timedelta64(1, "s")
plt.subplot(122)
plt.plot(elapsed_time, df["RH_w [-]"], "b-", label="over water")
plt.plot(elapsed_time, df["RH_i [-]"], "k-", label="over ice")
plt.xlabel("Elapsed time (s)")
plt.ylabel("Relative humidity (nondim.)")
plt.gca().set_xscale("log")
plt.gca().axhline(y=1, color="gray", zorder=-1)
plt.title("Waypoint 300 (no contrail formation)")
plt.legend(loc="upper left", frameon=False)

Finally, paths to netCDF output from the finite-volume contrail cross-section model are stored in `model.contrail`:

In [None]:
model.contrail

In [None]:
plt.figure(figsize=(12.8, 9.6))

plt.subplot(221)
df = model.contrail.iloc[1]
ds = xr.open_dataset(df["path"], decode_cf=False)
ds["IWC"].plot(cmap="Blues_r", vmin=0)
plt.annotate(df["time"], xy=(0.98, 0.98), xycoords="axes fraction", va="top", ha="right", color="white")

plt.subplot(222)
df = model.contrail.iloc[4]
ds = xr.open_dataset(df["path"], decode_cf=False)
ds["IWC"].plot(cmap="Blues_r", vmin=0)
plt.annotate(df["time"], xy=(0.98, 0.98), xycoords="axes fraction", va="top", ha="right", color="white")

plt.subplot(223)
df = model.contrail.iloc[7]
ds = xr.open_dataset(df["path"], decode_cf=False)
ds["IWC"].plot(cmap="Blues_r", vmin=0)
plt.annotate(df["time"], xy=(0.98, 0.98), xycoords="axes fraction", va="top", ha="right", color="white")

plt.subplot(224)
df = model.contrail.iloc[13]
ds = xr.open_dataset(df["path"], decode_cf=False)
ds["IWC"].plot(cmap="Blues_r", vmin=0)
plt.annotate(df["time"], xy=(0.98, 0.98), xycoords="axes fraction", va="top", ha="right", color="white")

## Concluding remarks

- APCEMM is more expensive to run than CoCiP, but simulations can easily be parallelized across waypoints. If you're running on a large machine, the `n_jobs` parameter can be used in the `APCEMM` constructor or in `APCEMM.eval` to run multiple simulations in parallel.

- The pycontrails APCEMM interface is relatively immature. If you're interested in using it and it's missing features you need or you think you've found a bug, please get in touch or open an issue on [GitHub](https://github.com/contrailcirrus/pycontrails/issues). (And remember: you do not have to use pycontrails to use APCEMM!)

- APCEMM does not provide contrail radiative forcing as an output, though it provides all of the quantities needed to compute contrail radiative forcing offline, and this is a feature we would like to add to the interface eventually. If it's something you'd like to be able to use, please let us know.