# Full waveform inversion

## What is FWI?

Seismic imaging comprises a suite of minimally destructive techniques for measuring subsurface properties, with applications in areas including hydrocarbon and mineral exploration, civil engineering and medical imaging. The associated inversion problems are among the most computationally demanding in industrial and academic research.

<tr>
    <td> <img src="figures/survey-ship-diagram.png" alt="Drawing" style="width: 450px;"/> </td>
    <td> <img src="figures/Marmousi3D.png" alt="Drawing" style="width: 450px;"/> </td>
</tr>

**Left:** Sketch of offshore seismic survey. **Right:** Example model result for $v_p$.

## The FWI algorithm

The aim of FWI is to find a model that minimises some measure of the misfit between the dataset predicted by that model and an observed dataset. This measure is called the *objective function*.
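As an illustration only, a common choice of objective function (assumed here, not prescribed by this tutorial) is the least-squares misfit. Writing $\mathbf{p}(\mathbf{m})$ for the data predicted by the model $\mathbf{m}$, $\mathbf{d}$ for the observed data and $\delta\mathbf{d} = \mathbf{p}(\mathbf{m}) - \mathbf{d}$ for their difference, it reads:

$$
f(\mathbf{m}) = \frac{1}{2}\,\lVert \mathbf{p}(\mathbf{m}) - \mathbf{d} \rVert_2^2 = \frac{1}{2}\,\delta\mathbf{d}^{T}\delta\mathbf{d}
$$

The factor of $\frac{1}{2}$ is a convention that simplifies the gradient; other norms and misfit measures are also possible.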

A simple geometric analogy, in which the model has just two parameters, is to regard the misfit as being represented by the local height of a two-dimensional error surface, and the two model parameters as representing the $x$ and $y$-coordinates of a point on this surface. FWI then involves starting at some point on this surface, and trying to find the bottom of the deepest valley by heading downhill in a sequence of finite steps. To do this, we have to discover which way is downhill, and how far to step. In real FWI, the model has not just two parameters, but many millions, but the analogy is still appropriate. The algorithm proceeds as follows:
1. Calculate the direction of the local gradient $\nabla_\mathbf{m} f$ of the objective function $f$ with respect to the model parameters - this points uphill
    - Using the *starting model* $\mathbf{m}$ and a known *source* $\mathbf{s}$, calculate the forward *wavefield* $\mathbf{u}$ everywhere in the model including the *predicted data* $\mathbf{p}$ at the receivers.
    - At the receivers, subtract the observed data $\mathbf{d}$ from the predicted data to obtain the *residual data* $\delta\mathbf{d}$.
    - Treating the receivers as virtual sources, back-propagate the residual data into the model, to generate the residual wavefield $\delta\mathbf{u}$.
    - Scale the residual wavefield by the local slowness $1/c$, and differentiate it twice in time.
    - At every point in the model, cross-correlate the forward and scaled residual wavefields, and take the zero lag in time to generate the *gradient* for one source.
    - Do this for every source, and stack together the results to make the global gradient.
2. Find the step length - how far is the bottom of the hill?
    - Take a small step and a larger step directly downhill, and calculate the objective function at the current model and in these two new models.
    - Assume a linear relationship between changes in the model and changes in the residual data so that there will be a parabolic relationship between changes in the model and changes in the objective function, then fit a parabola through these three points.
    - The lowest point on this parabola represents the optimal step length (assuming a locally linear relationship).
    - Step downhill by the required amount, and update the model.
3. Do it all over again
    - Use the new model as the starting model, and repeat steps 1. and 2.
    - Repeat this process until the model is 'good enough', that is, until the model is no longer changing (to some numerical tolerance), or until we run out of time, money or patience.
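The three steps above can be sketched on the two-parameter analogy. The following is a minimal toy example in plain Python, not a real FWI code: it assumes a small linear forward model and a least-squares objective function, so the gradient is available in closed form rather than through wave propagation. The matrix `G`, the model `M_TRUE` and all function names are hypothetical and chosen purely for illustration.

```python
# Toy linear forward model: two model parameters, three data values.
# G, M_TRUE and every function name below are hypothetical.
G = [[1.0, 0.5],
     [0.2, 1.5],
     [1.0, 1.0]]
M_TRUE = [2.0, -1.0]


def predict(m):
    """Predicted data p = G m."""
    return [row[0] * m[0] + row[1] * m[1] for row in G]


def residual(m, d):
    """Residual data: predicted minus observed."""
    return [p - o for p, o in zip(predict(m), d)]


def objective(m, d):
    """Assumed least-squares misfit: half the sum of squared residuals."""
    return 0.5 * sum(r * r for r in residual(m, d))


def gradient(m, d):
    """Gradient of the misfit for this linear toy, G^T (G m - d); points uphill."""
    r = residual(m, d)
    return [sum(G[i][j] * r[i] for i in range(len(G))) for j in range(len(m))]


def step_length(m, g, d, a1=0.1, a2=0.2):
    """Fit a parabola through the misfit at steps 0, a1 and a2 taken downhill."""
    def f_at(a):
        return objective([mk - a * gk for mk, gk in zip(m, g)], d)

    f0, f1, f2 = f_at(0.0), f_at(a1), f_at(a2)
    # Model the misfit along the line as f0 + b*a + c*a**2.
    c = ((f2 - f0) / a2 - (f1 - f0) / a1) / (a2 - a1)
    b = (f1 - f0) / a1 - c * a1
    if c <= 0.0:
        return a2  # no interior minimum: fall back to the larger trial step
    return -b / (2.0 * c)  # lowest point of the fitted parabola


def invert(m0, d, max_iter=200, tol=1e-10):
    """Repeat gradient and step-length calculations until the model stops changing."""
    m = list(m0)
    for _ in range(max_iter):
        g = gradient(m, d)
        alpha = step_length(m, g, d)
        update = [alpha * gk for gk in g]
        m = [mk - uk for mk, uk in zip(m, update)]  # step downhill
        if max(abs(uk) for uk in update) < tol:
            break
    return m


observed = predict(M_TRUE)
m_final = invert([0.0, 0.0], observed)
print(m_final)
```

Because the toy forward model is linear, the misfit along the search direction is exactly parabolic and the fitted step length is optimal for that direction. In real FWI the relationship is only approximately linear, so the parabola is a local approximation.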

This is the basic algorithm. There are several ways to enhance and improve it, but nearly all of these involve a greater computational cost (which is already high).

## Forward (and backward) modelling

From the description of the FWI workflow above, it's clear that modelling the propagation of waves in the medium is an important part of the algorithm. For this modelling, a whole range of wave-equations (isotropic-acoustic, elastic, viscoelastic, anisotropic-viscoacoustic etc.) and numerical techniques (finite difference, finite element, spectral element) are available.

Which choices are 'better' depends on a huge number of factors:
- What problem are we solving?
- How good is our field/lab data and what information does it contain?
- What computational resources are available, i.e. how big is our computer and how much time and money do we have?
- What codes do we have?

For the purpose of this tutorial series, we'll focus on solving the isotropic acoustic wave-equation using the finite difference method. To build our wave-propagator we'll use the domain specific language *Devito*.
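To make the idea of a finite difference wave-propagator concrete before introducing Devito, here is a minimal sketch in plain Python (deliberately not using Devito). It assumes the constant-density form of the isotropic acoustic wave-equation, $\frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} - \nabla^2 u = s$, in one spatial dimension, with second-order differences in both time and space and fixed (non-absorbing) boundaries. The function names, grid size, time step and Ricker source wavelet are all illustrative assumptions.

```python
import math


def ricker(t, f0, t0):
    """Ricker wavelet with peak frequency f0 (Hz), centred at time t0 (s)."""
    a = (math.pi * f0 * (t - t0)) ** 2
    return (1.0 - 2.0 * a) * math.exp(-a)


def propagate_1d(c, dx, dt, nt, src_index, f0=10.0):
    """Advance the 1D acoustic wavefield by nt time steps and return the final field."""
    nx = len(c)
    u_prev = [0.0] * nx  # wavefield at time step n - 1
    u_curr = [0.0] * nx  # wavefield at time step n
    for it in range(nt):
        u_next = [0.0] * nx  # end points stay at zero (fixed boundaries)
        for i in range(1, nx - 1):
            # Second-order central difference for the second space derivative.
            d2u_dx2 = (u_curr[i - 1] + u_curr[i + 1] - 2.0 * u_curr[i]) / dx ** 2
            # Second-order central difference in time, rearranged for the next step.
            u_next[i] = 2.0 * u_curr[i] - u_prev[i] + (c[i] * dt) ** 2 * d2u_dx2
        # Inject the source term at a single grid point.
        u_next[src_index] += (c[src_index] * dt) ** 2 * ricker(it * dt, f0, 1.0 / f0)
        u_prev, u_curr = u_curr, u_next
    return u_curr


# Homogeneous 1500 m/s model, 10 m grid spacing, 1 ms time step, 0.3 s of propagation.
nx, dx, dt, nt = 201, 10.0, 0.001, 300
velocity = [1500.0] * nx
wavefield = propagate_1d(velocity, dx, dt, nt, src_index=nx // 2)
```

With these values the Courant number $c\,\Delta t/\Delta x$ is 0.15, comfortably inside the stability limit of this explicit scheme. Writing such loops by hand quickly becomes laborious and slow in 2D and 3D, which is the motivation for generating them from a symbolic specification.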

## What is a domain specific language (DSL) and what is Devito?

From [Wikipedia](https://en.wikipedia.org/wiki/Domain-specific_language):

*A domain-specific language (DSL) is a computer language specialized to a particular application domain. This is in contrast to a general-purpose language (GPL), which is broadly applicable across domains. There are a wide variety of DSLs, ranging from widely used languages for common domains, such as HTML for web pages, down to languages used by only one or a few pieces of software, such as MUSH soft code. DSLs can be further subdivided by the kind of language, and include domain-specific markup languages, domain-specific modeling languages (more generally, specification languages), and domain-specific programming languages. Special-purpose computer languages have always existed in the computer age, but the term "domain-specific language" has become more popular due to the rise of domain-specific modeling. Simpler DSLs, particularly ones used by a single application, are sometimes informally called mini-languages.*

Devito, for example, is a DSL designed for solving partial differential equations via the finite difference method (or, more generally, for performing stencil-based computations on structured grids). A few more facts about Devito:
- Devito is an [open source](https://github.com/devitocodes/devito) project developed in the Department of Earth Science and Engineering at Imperial College London.
- It is in fact a DSL and **compiler**:
    - The DSL heavily subclasses (makes lots of use of) `SymPy`, and the compiler *converts* the symbolic specification of the mathematical problem into optimized C code.
- Out of the box, Devito can generate C code suitable for use on supercomputers and GPUs.

Before going any further, let's dive in and see Devito in action.

## Introduction to the Devito DSL and finite difference re-cap


