# Adaptive Waveform Inversion as Extended Source Inversion

## Overview

Adaptive Waveform Inversion (AWI) is in some ways similar to the approach we worked out for the single trace transmission problem. It is based on an extension that amounts to one single trace transmission problem for every source-receiver pair: that is, every such pair gets its own wavelet, and the goal is to concentrate the wavelet near $t=0$ as much as possible. There are several notable differences. First, AWI presumes that the "true" wavelet that generates the data is known. So it's possible to focus on an "adaptive" kernel that produces the extended wavelet by convolution with the known "true" wavelet. This adaptive kernel should be exactly $\delta(t)$ at the correct model, so multiplication by $t$ is a genuine annihilator, without having to make a "small support" assumption. We could have done this as well, but of course we got an estimate of the wavelet for our trouble, without having to assume a known "true" wavelet. Second, AWI deals with multidimensional models, and many traces, not just one.

There is some understanding to be gained by applying the methodology we developed to this larger scale problem. First, I'll review AWI as it is presented in the literature, along with a computational approach that makes it practical, and an example. Next comes recasting AWI into the penalty form that we used in studying the single trace transmission problem, and a re-interpretation of a normalization as a preconditioner. Applied to transmission problems, I believe that AWI is not immune to stagnation at useless estimates - that was the motivation for Surface Source Extension inversion, which is a generalization. Finally, I do not believe that any of these techniques is any better than FWI when applied to reflection data. So there are two interesting negative results to be had. However, we will also be able to explain exactly why AWI works, when it does work.

## Warner-Guasch formulation of AWI
 
The version of AWI introduced by Michael Warner and Lluis Guasch (SEG 2014 and *Geophysics* 2016) assumes that seismic waves are governed by linear acoustics, and that each shot is associated to an isotropic point source with known location and wavelet. That is, the pressure and velocity fields $p({\bf x},t;{\bf x}_s)$, ${\bf v}({\bf x},t;{\bf x}_s)$ for the shot location ${\bf x}_s$ depend on the bulk modulus $\kappa({\bf x})$, buoyancy $\beta({\bf x})$ (reciprocal of the density $\rho({\bf x})$), and wavelet $w(t;{\bf x}_s)$ through the acoustic system
$$
\frac{\partial p}{\partial t} = - \kappa \nabla \cdot {\bf v} +
w(t;{\bf x}_s) \delta({\bf x}-{\bf x}_s), \quad
\frac{\partial {\bf v}}{\partial t} = - \beta \nabla p, \quad
p, {\bf v} = 0 \mbox{ for }  t \ll 0.
$$
The forward map or modeling operator is $S[\kappa,\beta]w = \{p({\bf x}_r,t;{\bf x}_s)\}$, in which shot and receiver positions ${\bf x}_s, {\bf x}_r$ define the acquisition geometry.

For now, assume that the data $d({\bf x}_r,t;{\bf x}_s)$ is the output of the modeling operator for "true" bulk modulus, buoyancy, and wavelet $\kappa_*, \beta_*, w_*$: that is, $d = S[\kappa_*, \beta_*]w_*$.

The extended modeling operator ${\bar S}$ maps extended sources $\bar{w}({\bf x}_r,t;{\bf x}_s)$ to the same sampling of the pressure field. That is, the extended source depends on the receiver location as well as the source location (so there is one acoustic system for each source *and* receiver position - a lot of wave equations!). 

AWI assumes that the extended sources are time convolutions of the (known) exact source with a kernel $u({\bf x}_r,t;{\bf x}_s)$: $\bar{w} = u * w_*$ - the asterisk denotes convolution in time. Since linear acoustics is time-translation invariant, its solution commutes with time convolution, that is,
$$
\bar{S}[\kappa,\beta]\bar{w} = u*S[\kappa,\beta]w_*.
$$
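To spell out why the commutation holds, here is a short sketch that uses only linearity and the fact that $\kappa$ and $\beta$ do not depend on $t$ (the shift variable $\tau$ is notation introduced here). If $p, {\bf v}$ solve the acoustic system with wavelet $w_*(t)$, then for each fixed $\tau$ the shifted fields $p({\bf x},t-\tau), {\bf v}({\bf x},t-\tau)$ solve the same system with wavelet $w_*(t-\tau)$. Superposing these solutions with weights $u(\tau)$ gives a solution with pressure and wavelet
$$
\int d\tau\, u(\tau)\, p({\bf x},t-\tau) = (u*p)({\bf x},t), \quad
\int d\tau\, u(\tau)\, w_*(t-\tau) = (u*w_*)(t).
$$
Choosing $u = u({\bf x}_r,\cdot\,;{\bf x}_s)$ for each source-receiver pair and sampling the pressure at ${\bf x}_r$ yields the identity above.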
With this set-up, the object of inversion can be formulated as:

Given $d$, find $\kappa, \beta$ and $\bar{w}$ so that $u({\bf x}_r,t;{\bf x}_s) = \delta(t)$ (so $\bar{w}=w_*$) and $\bar{S}[\kappa,\beta]\bar{w} \approx d$. 

Warner and Guasch assume (implicitly) that there is always a (near-)zero-residual solution of the extended inversion problem. That is, they assume that for any $\kappa, \beta$, there is a $\bar{w}$ for which $\bar{S}[\kappa,\beta]\bar{w} \approx d$. If $\kappa \approx \kappa_*, \beta \approx \beta_*$, then $\bar{w}$ should be $\approx w_*$, so the adaptive kernel $u$ should be approximately $\delta(t)$ and independent of ${\bf x}_s,{\bf x}_r$. Such a $u$ is in the null space of multiplication by $t$. Thus a first version of the AWI algorithm:

1. Given $\kappa,\beta$, solve the problem $\bar{S}[\kappa,\beta]\bar{w} = d$ for $\bar{w}$. 
2. Deconvolve $w_*$ from $\bar{w}$ to obtain $u$ for which $\bar{w}=u * w_*$. 
3. Compute the objective 
$$
J[\kappa,\beta] = \int dx_s\, dx_r \left(\frac{\int dt\, |tu|^2}{\int dt\, |u|^2}\right)
$$
and its gradient, then update $\kappa,\beta$ by some gradient descent method. (The role of the normalization per source and receiver by the $L^2$ norm of $u$ will be explained later.)
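The objective in step 3 can be sketched in discrete form as follows. This is a minimal illustration, not the Warner-Guasch implementation: the array layout `(n_src, n_rec, n_t)`, the function name `awi_objective`, and the uniform weighting of source-receiver pairs (the quadrature weights for $dx_s\,dx_r$ are dropped) are all assumptions made for the example.

```python
import numpy as np

def awi_objective(u, t):
    """Discrete J: sum over traces of ||t u||^2 / ||u||^2.

    u : array (n_src, n_rec, n_t) of adaptive kernels (hypothetical layout)
    t : array (n_t,), time axis on which u is sampled; must contain t = 0
    """
    num = np.sum((t * u) ** 2, axis=-1)  # int dt |t u|^2 per trace
    den = np.sum(u ** 2, axis=-1)        # int dt |u|^2 per trace (dt cancels in the ratio)
    return float(np.sum(num / den))      # uniform weights over sources and receivers

# Time axis symmetric about zero, so a spike at t = 0 is representable.
t = np.linspace(-1.0, 1.0, 201)
n_src, n_rec = 3, 4

focused = np.zeros((n_src, n_rec, t.size))
focused[..., 100] = 1.0                  # discrete delta at t = 0

shifted = np.zeros_like(focused)
shifted[..., 130] = 1.0                  # same spike moved to t = 0.3

J_focused = awi_objective(focused, t)
J_shifted = awi_objective(shifted, t)
J_scaled = awi_objective(5.0 * shifted, t)  # rescaled kernel, same shape
```

The focused kernel gives a value of zero, the shifted kernel gives $0.3^2$ per trace, and rescaling $u$ leaves the value unchanged, which is the amplitude insensitivity that the per-trace normalization is meant to provide.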

There are several things wrong with this algorithm. First, it involves way too many wave equation solves. Second, it may be sensitive to the amplitude of $\bar{w}$. Third, it depends on the assumption that you can fit the data exactly (which may or may not be the case, see examples to follow). Warner and Guasch fix the first two problems. The third is solved by relaxing the fit assumption via a penalty formulation (next section).

To solve the first problem, observe that the extended pressure trace $\bar{p}$ at each receiver is the time convolution of the Green's function $G[\kappa,\beta]$ (solution of the acoustic system above with $w(t;{\bf x}_s) = \delta(t)$, sampled at the receivers) with the extended source $\bar{w}$. Since the impulsive source that defines $G$ is independent of the receiver position, only one wave equation needs to be solved for each source position, same as for non-extended modeling. Accordingly,
$$
\bar{S}[\kappa,\beta]\bar{w} = G * \bar{w}
$$
Convolution is far cheaper than a wave equation solve: thus the cost of computing the forward map is reduced to one wave equation solve per source, and one convolution per source and receiver.
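A minimal sketch of the convolutional forward map, assuming the Green's function traces have already been computed and stored as an array of shape `(n_src, n_rec, n_t)`. The helper `conv_time` and the random stand-in arrays are hypothetical, and the factor $dt$ in the discrete convolution is omitted. The sketch also checks numerically that $G * (u * w_*) = u * (G * w_*)$.

```python
import numpy as np

def conv_time(a, b):
    """Causal linear convolution along the last axis, truncated to n_t samples."""
    n = a.shape[-1]
    nfft = 2 * n                          # zero-pad so nothing wraps around
    A = np.fft.rfft(a, nfft, axis=-1)
    B = np.fft.rfft(b, nfft, axis=-1)     # broadcasts if b is a single trace
    return np.fft.irfft(A * B, nfft, axis=-1)[..., :n]

rng = np.random.default_rng(0)
n_src, n_rec, n_t = 2, 3, 256

# Random stand-ins: in practice G comes from one wave equation solve per source.
G = rng.standard_normal((n_src, n_rec, n_t))
w_true = rng.standard_normal(n_t)
u = rng.standard_normal((n_src, n_rec, n_t))

w_bar = conv_time(u, w_true)              # extended source, one per source-receiver pair
lhs = conv_time(G, w_bar)                 # extended modeling: G * w_bar
rhs = conv_time(u, conv_time(G, w_true))  # u * (non-extended modeling)

# Cross-check one trace against NumPy's direct convolution.
ref = np.convolve(G[0, 0], w_bar[0, 0])[:n_t]
```

Only the computation of `G` involves wave equation solves; everything else is one FFT-based convolution per trace.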

[COMPUTATIONAL EXAMPLE]

On top of that, denote the convolution inverse of $G$ by $\check{G}$, and the convolution inverse of $w_*$ by $\check{w_*}$. Strictly speaking, neither $G$ nor $w_*$ need actually have a convolution inverse. A solution of a regularized least-squares problem can play the same role. Moreover, such regularized inverses are cheaply approximated via the discrete Fourier transform.

Then
$$
\bar{w} = \check{G} * d, \qquad u =  \check{w_*}*\bar{w} = \check{w_*}*\check{G}*d
$$
and the first two steps in the above AWI algorithm can be replaced by inexpensive convolutions. This is the second version of the AWI algorithm. It is inexpensive enough to be employed on 3D data at field scale.
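As an illustration of the Fourier approach, the sketch below computes $u$ for a single trace by damped spectral division of the data by the predicted trace $G * w_*$, which combines $\check{w_*}$ and $\check{G}$ in one step. The damping rule (`eps` times the peak power), the Ricker test traces, and all function names are assumptions chosen for the example, not details taken from Warner and Guasch.

```python
import numpy as np

def adaptive_kernel(pred, data, eps=1e-3):
    """Damped Fourier deconvolution: u such that u * pred ~ data.

    pred plays the role of G * w_*; eps is a hypothetical damping level
    relative to the peak power of pred. Returns (lags, u), lag 0 centred.
    """
    n = pred.shape[-1]
    nfft = 2 * n                                  # zero-pad against wrap-around
    P = np.fft.rfft(pred, nfft)
    D = np.fft.rfft(data, nfft)
    power = np.abs(P) ** 2
    U = np.conj(P) * D / (power + eps * power.max())
    u = np.fft.fftshift(np.fft.irfft(U, nfft))    # negative lags first
    return np.arange(-n, n), u

def ricker_trace(n, dt, f0, arrival):
    """Ricker wavelet of peak frequency f0 centred at time `arrival`."""
    tau = np.arange(n) * dt - arrival
    a = (np.pi * f0 * tau) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def focus(lags, u, dt):
    """Normalized second moment of u, i.e. one trace's contribution to J."""
    t = lags * dt
    return np.sum((t * u) ** 2) / np.sum(u ** 2)

n, dt, f0 = 256, 0.004, 20.0
data = ricker_trace(n, dt, f0, arrival=80 * dt)       # arrival at sample 80
pred_good = ricker_trace(n, dt, f0, arrival=80 * dt)  # correct trial model
pred_fast = ricker_trace(n, dt, f0, arrival=70 * dt)  # trial model 10 samples too fast

lags, u_good = adaptive_kernel(pred_good, data)
_, u_fast = adaptive_kernel(pred_fast, data)
lag_good = int(lags[np.argmax(u_good)])
lag_fast = int(lags[np.argmax(u_fast)])
J_good, J_fast = focus(lags, u_good, dt), focus(lags, u_fast, dt)
```

With the correct trial model the kernel is a band-limited spike centred at zero lag. With the trial model that is too fast, the spike moves to a lag of 10 samples and the normalized second moment increases, which is the signal the AWI objective responds to.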


