In [1]:
import scipy
import xarray as xr
import grib2io
import pandas as pd
import datetime
from glob import glob
import numpy as np
from corner import quantile
import matplotlib.pyplot as plt
import matplotlib.colors as colors
import matplotlib.cm as cm
import matplotlib
matplotlib.rcParams.update({
 "savefig.facecolor": "w",
 "figure.facecolor" : 'w',
 "figure.figsize" : (8,6),
 "text.color": "k",
 "legend.fontsize" : 20,
 "font.size" : 30,
 "axes.edgecolor": "k",
 "axes.labelcolor": "k",
 "axes.linewidth": 3,
 "xtick.color": "k",
 "ytick.color": "k",
 "xtick.labelsize" : 25,
 "ytick.labelsize" : 25,
 "ytick.major.size" : 12,
 "xtick.major.size" : 12,
 "ytick.major.width" : 2,
 "xtick.major.width" : 2,
 "font.family": 'STIXGeneral',
 "mathtext.fontset" : "cm"})
from PIL import Image
import statsmodels.api as sm
from statsmodels.distributions.empirical_distribution import ECDF
from mpl_toolkits.basemap import Basemap
from pyproj import Proj
from metpy.units import units
from mpl_toolkits.axes_grid1.inset_locator import inset_axes

## Data Assimilation and Coalescence

The process of combining observations and short-range forecasts to obtain an initial condition for numerical weather prediction (NWP) is called data assimilation. In the most general terms, data assimilation techniques attempt to combine theoretical models with observations in the "best way possible" to improve both sets of data. This amounts to rectifying errors in both the forecast (magnitude, location, and timing of an event) and the necessarily sparse observations of the true system state: "for large scale systems, it is almost impossible to experimentally measure the full state of the system at a given time. For example, imagine simulating the atmospheric or oceanic flow, then you need to measure the velocity, temperature, density, etc. at every location corresponding to your numerical grid" (Ahmed+2020).

Data assimilation proceeds sequentially in time, applying a correction to the forecast based on a set of observed data and the estimated errors present in both the observations and the forecast itself. The model organizes and propagates forward the information from previous observations. As new observations arrive, they are used to modify the model state so that it is as consistent as possible with both the new data *and* the previous information: a sort of ouroboros of analysis and forecast.
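
The cycle described above can be sketched with a single scalar state. This is a minimal illustration, not the procedure of the blend or of any operational system: the linear model `a`, the error variances `q` and `r`, and all variable names are assumptions chosen for the example. The update weights the forecast and the observation by their error variances (a scalar Kalman-type update).

```python
import numpy as np

rng = np.random.default_rng(0)

a = 1.02   # hypothetical linear model: x_{k+1} = a * x_k
q = 0.05   # assumed model-error variance added at each forecast step
r = 0.5    # assumed observation-error variance

x_true = 1.0           # true state (unknown in practice)
x_a, p_a = 0.0, 2.0    # initial analysis and its assumed error variance

history = []
for k in range(20):
    # Forecast step: propagate the state estimate and its error variance
    x_true = a * x_true
    x_f = a * x_a
    p_f = a**2 * p_a + q

    # A new, noisy observation of the true state arrives
    w = x_true + rng.normal(0.0, np.sqrt(r))

    # Analysis step: correct the forecast toward the observation,
    # weighted by the relative size of the two error variances
    gain = p_f / (p_f + r)
    x_a = x_f + gain * (w - x_f)
    p_a = (1.0 - gain) * p_f

    history.append((x_f, w, x_a, p_f, p_a))
```

Each analysis becomes the starting point of the next forecast, which is the loop the paragraph above describes. By construction, the analysis error variance `p_a` is smaller than both the forecast variance `p_f` and the observation variance `r`.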

![Pasted image 20231206164305.png](attachment:c0655903-1d21-4f90-89b5-40fa09a81749.png)


The very first step of the data assimilation cycle (which we don't see, as it happens pre-blend) is the production of something like URMA: taking irregularly spaced observations (both direct and indirect) of atmospheric state variables and estimating them on a regular grid (i.e., spatial analysis, for which several techniques exist). Once gridded, these can be modified by the background field and then supplied to numerical models as a supplement to the initial conditions (ICs).
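
As a rough illustration of the gridding step only, the sketch below interpolates scattered "station" values onto a regular grid with `scipy.interpolate.griddata`. The station locations, the synthetic linear field, and the grid are all made up for the example, and simple linear interpolation stands in for whatever spatial analysis technique is actually used to produce a product like URMA.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(1)

def synthetic_field(x, y):
    # Hypothetical smooth "temperature" field used to generate fake observations
    return 280.0 + 10.0 * x - 5.0 * y

# Irregularly spaced station locations in a unit square, plus the four corners
# so that the regular grid below lies inside the convex hull of the stations
corners = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
obs_xy = np.vstack([rng.uniform(0.0, 1.0, size=(200, 2)), corners])
obs_val = synthetic_field(obs_xy[:, 0], obs_xy[:, 1])

# Regular analysis grid
gx, gy = np.meshgrid(np.linspace(0.2, 0.8, 25), np.linspace(0.2, 0.8, 25))

# Estimate the field at every grid point from the scattered observations
gridded = griddata(obs_xy, obs_val, (gx, gy), method="linear")
```

Because the synthetic field is linear, linear interpolation recovers it essentially exactly here; real fields and real station networks are far less forgiving, which is where the background field comes in.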

In essence, this amounts to a pre-processing step for NWP, but it can also be used to combine predictions from several models with corrections from contemporaneous observations: the blend! In particular, correcting for forecast errors in both position and amplitude is difficult: "the standard measures of forecast skill, such as root-mean-square (r.m.s.) error and anomaly correlation all measure forecast error as the difference between a forecast and an analysis at the same point in space and time. They are incapable of identifying a phase error as such. For example, a forecast of an intense, fast-moving feature can have a low skill score because of a small phase error that a human forecaster may consider minor. Hence, there is a need for an objective skill evaluation method that accounts for the presence of both phase and amplitude errors" (Nehrkorn+2003).
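
The point about phase errors can be seen in a toy 1D case. In the sketch below (all shapes and numbers are invented for illustration), one forecast has the right feature slightly displaced, and another misses the feature entirely; the RMSE of each is computed against the truth.

```python
import numpy as np

x = np.linspace(0.0, 100.0, 1001)

def feature(center, amplitude=1.0, width=2.0):
    # A narrow Gaussian-shaped feature standing in for an intense weather event
    return amplitude * np.exp(-0.5 * ((x - center) / width) ** 2)

def rmse(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

truth = feature(50.0)
shifted_forecast = feature(55.0)    # correct amplitude, small phase error
flat_forecast = np.zeros_like(x)    # misses the event entirely

rmse_shifted = rmse(shifted_forecast, truth)
rmse_flat = rmse(flat_forecast, truth)
```

In this setup the displaced forecast has a larger RMSE than the flat one, even though a human forecaster would likely prefer it: the pointwise score penalizes it twice, once for the missing feature at the true location and once for the "false alarm" at the displaced location.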

Related is the treatment of probabilistic forecasts generated from ensembles of perturbed models. In the blend, there are ~17 models used to construct the value distributions for each variable (temp, precip, etc.). Some of these models are themselves ensembles of members with different perturbation states (so in reality, 200+ realizations). In quantile mapping (QM), each of these ensemble members would be mapped to the CDF constructed from the entire sample, but this can introduce issues originating from errors in the magnitude, timing, and location of weather events as predicted by the model. Furthermore, while QM (theoretically) provides really great probabilistic predictions for weather events, it's difficult to know how to communicate more deterministic estimates from an ensemble.
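
For reference, the core of quantile mapping can be sketched in a few lines. This is a generic empirical version, not the blend's implementation: the gamma-distributed training samples, the function name `quantile_map`, and its arguments are all assumptions made for the example. Each forecast value is converted to its non-exceedance probability in the forecast sample, and then replaced by the observed value at that same probability.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic training samples: forecasts are biased high relative to observations
fcst_train = rng.gamma(shape=2.0, scale=3.0, size=5000)
obs_train = rng.gamma(shape=2.0, scale=2.0, size=5000)

def quantile_map(values, fcst_sample, obs_sample):
    """Map forecast values onto the observed distribution (hypothetical helper)."""
    fcst_sorted = np.sort(fcst_sample)
    # Empirical CDF of the forecast sample evaluated at each value
    probs = np.searchsorted(fcst_sorted, values, side="right") / fcst_sorted.size
    probs = np.clip(probs, 0.0, 1.0)
    # Inverse empirical CDF of the observations at those probabilities
    return np.quantile(obs_sample, probs)

mapped = quantile_map(fcst_train, fcst_train, obs_train)
```

After mapping, the forecast values follow the observed distribution, but each value is still tied to the place and time where the model put it, which is why magnitude, timing, and location errors can survive the correction.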

An improvement we hope to make to the blend is to efficiently combine the physically meaningful information encoded in the entire distribution of predictions from the ensemble. One way to do that is via coalescence, which seeks to quantify the uncertainty for coherent structures that can suffer multiple types of error in an ensemble of forecasts (as noted above, it is necessary to correct for both phase and amplitude errors). "Consider, as an illustration, an ensemble of “one dimensional” fronts that contain position and amplitude errors, as shown in (the Figure below). If we were to ask what the mean front is, clearly the simplest solution is to take the mean of these fields, that is, calculate the mean vector. That would be terribly wrong, of course, because the mean simply does not look like any front in the ensemble. Coalescence instead calculates the mean amplitude field by marginalizing relative position errors. The method invokes an “N-body” type solution where each member in the ensemble gravitates to the others. In so doing, all of them discover a mean position where the amplitude mean is meaningful" (Ravela+2012).

![image.png](attachment:671d84a8-e458-4bbb-a715-0f31818f300a.png)
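
To make the problem with the plain ensemble mean concrete, the sketch below builds a synthetic ensemble of 1D fronts with position and amplitude errors and compares the pointwise mean with a mean taken after aligning the members. The alignment here is a deliberately crude stand-in (each member is shifted so that its steepest gradient sits at the ensemble-mean position); it is not the N-body coalescence algorithm of Ravela+2012. The front shape, the error ranges, and the helper `front_position` are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(-10.0, 10.0, 401)

n_members = 20
positions = np.linspace(-3.0, 3.0, n_members)    # assumed position errors
amplitudes = rng.uniform(0.8, 1.2, n_members)    # assumed amplitude errors
ensemble = np.array([a * np.tanh((x - p) / 0.3)
                     for a, p in zip(amplitudes, positions)])

# Pointwise mean: smears the front over the whole range of positions
naive_mean = ensemble.mean(axis=0)

def front_position(member):
    # Hypothetical position estimate: location of the steepest gradient
    return x[np.argmax(np.abs(np.gradient(member, x)))]

estimated = np.array([front_position(m) for m in ensemble])
mean_position = estimated.mean()

# Shift every member to the mean position, then average the amplitudes
aligned = np.array([np.interp(x - (mean_position - p_hat), x, m)
                    for m, p_hat in zip(ensemble, estimated)])
aligned_mean = aligned.mean(axis=0)

naive_slope = np.max(np.gradient(naive_mean, x))
aligned_slope = np.max(np.gradient(aligned_mean, x))
```

The pointwise mean is a broad ramp that resembles none of the members, while the aligned mean keeps a sharp front at the mean position with the mean amplitude, which is the behavior the quoted passage asks for.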

Here, I'll attempt to walk through some examples outlined in the handy beginner-level demo of Ahmed+2020 and then expand to the technique of coalescence as defined in Ravela+2012.

## Data Assimilation as in Ahmed+2020

As a starting point in Ahmed+2020, we define a state vector **u** that evolves over time and describes all of the information about the actual state of (in our case) the atmosphere. The time evolution of the state is governed by a set of dynamical equations which, unfortunately for us, are very sensitive to the input initial conditions (this sensitivity is really the impetus for the entire process). **u$_t$** is then the vector of true values of the state. The background **u$_b$** contains prior information about the current state of the system. Of course, we cannot observe every aspect of the system, so the collection of information we actually know about the system at a specific time is **w**(t). In fact, we usually cannot directly observe most of the physical quantities of the state (e.g., temperature, humidity) and instead observe them indirectly, for example via radar measurements. In light of this, **w** is related to **u$_t$** via a mapping *h* (the observation operator) between state space and measurement space, plus some measurement noise/error.
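
In symbols, the relationships just described can be written compactly as below. The error symbols $\boldsymbol{\varepsilon}_m$ (measurement error) and $\boldsymbol{\varepsilon}_b$ (background error) are notation assumed here for illustration and may differ from the symbols used in Ahmed+2020.

$$
\mathbf{w} = h(\mathbf{u}_t) + \boldsymbol{\varepsilon}_m, \qquad
\mathbf{u}_b = \mathbf{u}_t + \boldsymbol{\varepsilon}_b
$$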

The objective of DA is to combine our prior knowledge of the system state (**u$_b$**) and observations (**w**) to approximate the true state of the system (**u$_t$**). This approximation is the analysis **u$_a$**. 

To demonstrate how this works, we'll use an example dynamic system (called Lorenz 63) that has been well tested. We'll walk through the 3D variational DA method (though not an exhaustive derivation) and how we can optimize the analysis from simple assumptions about the background and measurement errors.
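
A minimal sketch of this setup follows, assuming the standard Lorenz 63 parameters ($\sigma = 10$, $\rho = 28$, $\beta = 8/3$), a linear observation operator `H`, and simple diagonal background and observation error covariances `B` and `R`. Under those assumptions the 3D variational analysis has a closed form, which is what `analysis_3dvar` computes. The function names, the time step, the observation frequency, and the covariance values are choices made for this example and are not taken from Ahmed+2020.

```python
import numpy as np

def lorenz63(u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    # Right-hand side of the Lorenz 63 equations
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(f, u, dt):
    # One fourth-order Runge-Kutta time step
    k1 = f(u)
    k2 = f(u + 0.5 * dt * k1)
    k3 = f(u + 0.5 * dt * k2)
    k4 = f(u + dt * k3)
    return u + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def analysis_3dvar(u_b, w, H, B, R):
    # Minimizer of the quadratic 3DVar cost for a linear observation operator:
    # u_a = u_b + B H^T (H B H^T + R)^{-1} (w - H u_b)
    innovation = w - H @ u_b
    gain = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    return u_b + gain @ innovation

rng = np.random.default_rng(4)
dt = 0.01
obs_std = 0.2                      # assumed observation noise (std. dev.)
H = np.eye(3)                      # assumed: all three components observed
B = 1.0 * np.eye(3)                # assumed background error covariance
R = obs_std**2 * np.eye(3)         # observation error covariance

u_true = np.array([1.0, 1.0, 1.0])
u_b = np.array([2.0, 3.0, 4.0])    # deliberately wrong initial background

for step in range(1, 1001):
    u_true = rk4_step(lorenz63, u_true, dt)
    u_b = rk4_step(lorenz63, u_b, dt)
    if step % 20 == 0:
        # Observe the truth with noise, then replace the background by the analysis
        w = H @ u_true + rng.normal(0.0, obs_std, size=3)
        u_b = analysis_3dvar(u_b, w, H, B, R)

final_error = np.linalg.norm(u_b - u_true)
```

Observing every component is the easiest possible case; replacing `H` with a matrix that selects only some components is closer to the situation described above, but then the quality of the analysis depends much more strongly on how `B` is specified.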

## Correcting Position and Amplitude Errors with Field Alignment (Ravela+2006)

As an example, I'll walk through the two 1D examples of forecasts with amplitude and position errors. In the first example, the forecast ensemble has a position error but every ensemble member has the _same_ offset from the true position, in addition to amplitude errors. In the second example, the ensemble again has position and amplitude errors but each member has a unique position offset.

The demo in Ravela+2006 first demonstrates that typical data assimilation techniques that consider only amplitude errors are unfit for these problems.

Example 1:

![image.png](attachment:52e4ae98-08b6-432b-b2b8-605abbe1a025.png)


Example 2:

![image.png](attachment:0cfe90d0-16b0-454b-9a18-de35805b8ef6.png)


"The analysis from the procedure just discussed is shown. It is clear that the analysis (solid line) looks like neither the forecast ensemble nor the truth. It’s somewhere in between, being pulled by the observations in some places and the background at others. It has replaced a single front with a bimodal front of far weaker strength," (Ravela+2006).

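The failure of an amplitude-only update can be reproduced in a stripped-down version of the first example. The sketch below is a toy setup, not the configuration used in Ravela+2006, and it does not attempt to reproduce the figures: the Gaussian-shaped feature, the positions (forecast at 40, truth at 60), the observation spacing, and the error variances are all assumptions. The ensemble members share the same (wrong) position and differ only in amplitude, and the analysis uses the ensemble sample covariance in the usual linear update.

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(0.0, 100.0, 201)

def bump(center, amplitude=1.0, width=3.0):
    # Gaussian-shaped stand-in for a 1D front
    return amplitude * np.exp(-0.5 * ((x - center) / width) ** 2)

truth = bump(60.0)

# Ensemble: common position error (all at 40) plus random amplitude errors
n_members = 30
ensemble = np.array([bump(40.0, a) for a in rng.uniform(0.8, 1.2, n_members)])
forecast_mean = ensemble.mean(axis=0)
B = np.cov(ensemble, rowvar=False)     # sample background covariance

# Sparse observations of the truth (noise-free here to keep the example simple)
obs_idx = np.arange(0, x.size, 10)
obs = truth[obs_idx]
R = 0.01 * np.eye(obs_idx.size)        # assumed observation error covariance

# Amplitude-only analysis: mean + B H^T (H B H^T + R)^{-1} (obs - H mean)
innovation = obs - forecast_mean[obs_idx]
weights = np.linalg.solve(B[np.ix_(obs_idx, obs_idx)] + R, innovation)
analysis = forecast_mean + B[:, obs_idx] @ weights
```

Because the ensemble only varies in amplitude, the covariance has no way to move the feature. The analysis is a weakened copy of the forecast, still centered at 40, and it shows nothing at the true location of 60.
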

"To address the position error problem, we reformulate the classical quadratic objective in a way that allows position adjustments in addition to amplitude adjustments. The key step in this new approach is to explicitly represent and minimize position errors. Therefore, we introduce auxiliary control variables (displacements) that are estimated along with amplitudes. The displacement variables are defined at each node of the grid representing the state and specify a deformation of the grid."