# Documentation for SeisAcoustic 

## Theory 

### Acoustic wave equation
The first order acoustic wave equation can be expressed as
$$\begin{align}
\rho \frac{\partial v_z}{\partial t} &=  \frac{\partial }{\partial z} \left( p_z + p_x \right)  \\
\rho \frac{\partial v_x}{\partial t} &=  \frac{\partial }{\partial x} \left( p_z + p_x \right) \\
\frac{\partial p_z}{\partial t} &= k \frac{\partial v_z}{\partial z} + S(t)\\
\frac{\partial p_x}{\partial t} &= k \frac{\partial v_x}{\partial x} \\
\end{align},$$
where $\rho$ is the density and $k$ is the bulk modulus. The pressure $p$ is split into two unphysical components $p_z$ and $p_x$ with $p = p_z + p_x$; the split is used to accommodate the PML absorbing boundary conditions. $S(t)$ is the pressure source term, which enters the sum of the last two equations once.

### Born approximation

Assume the density is constant, so that the scattered wavefield is produced only by a perturbation $\Delta k$ of the bulk modulus $k$. The perturbed wavefield is then governed by the following equations:
$$
\begin{align}
\rho \frac{\partial (v_z + \Delta v_z)}{\partial t} &=  \frac{\partial }{\partial z} \left( p_{z} + p_x + \Delta p_z  + \Delta p_x \right) \\
\rho \frac{\partial (v_x+ \Delta v_x)}{\partial t} &=  \frac{\partial }{\partial x} \left( p_{z} + p_x + \Delta p_z  + \Delta p_x \right) \\
\frac{\partial (p_{z} + \Delta p_{z})}{\partial t} &=(k+\Delta k) \frac{\partial (v_z + \Delta v_z)}{\partial z} + S(t)\\
\frac{\partial (p_{x} + \Delta p_x)}{\partial t} &= (k+\Delta k) \frac{\partial (v_x + \Delta v_x)}{\partial x} \\
\end{align}
$$
Subtracting the original wave equations for the unperturbed model and ignoring the second-order terms (products of $\Delta k$ with $\Delta v_z$ or $\Delta v_x$), we get the scattered wavefield equations
$$
\begin{align}
\rho \frac{\partial \Delta v_z}{\partial t} &=  \frac{\partial }{\partial z} \left(\Delta p_z  + \Delta p_x \right) \\
\rho \frac{\partial \Delta v_x}{\partial t} &=  \frac{\partial }{\partial x} \left(\Delta p_z  + \Delta p_x \right) \\
\frac{\partial  \Delta p_{z}}{\partial t} &= k \frac{\partial  \Delta v_z}{\partial z} + \Delta k \frac{\partial v_z}{\partial z}\\
\frac{\partial \Delta p_x}{\partial t} &= k \frac{\partial  \Delta v_x}{\partial x} + \Delta k \frac{\partial v_x}{\partial x} \\
\end{align}
$$
Combining the last two equations, we get
$$
\frac{\partial (\Delta p_z + \Delta p_x)}{\partial t} = k (\frac{\partial  \Delta v_z}{\partial z} + \frac{\partial  \Delta v_x}{\partial x} ) + \Delta k (\frac{\partial v_z}{\partial z} + \frac{\partial v_x}{\partial x}) \\
$$
The sum of the last two equations of the original system is $$\frac{\partial (p_z + p_x)}{\partial t} = k( \frac{\partial v_z}{\partial z} + \frac{\partial v_x}{\partial x}) + S(t)$$
With $p = p_z + p_x$, this can be solved for the divergence of the particle velocity: $$ \frac{\partial v_z}{\partial z} + \frac{\partial v_x}{\partial x} = \frac{ \frac{\partial p}{\partial t} - S(t)}{k} $$

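Only the sum $\Delta p_z + \Delta p_x$ enters the velocity equations, so the whole scattering term can be placed in the $\Delta p_z$ equation. The Born approximation is then obtained as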
$$
\begin{align}
\rho \frac{\partial \Delta v_z}{\partial t} &=  \frac{\partial }{\partial z} \left(\Delta p_z  + \Delta p_x \right) \\
\rho \frac{\partial \Delta v_x}{\partial t} &=  \frac{\partial }{\partial x} \left(\Delta p_z  + \Delta p_x \right) \\
\frac{\partial  \Delta p_{z}}{\partial t} &= k \frac{\partial  \Delta v_z}{\partial z} +  \frac{ \frac{\partial p}{\partial t} - S(t)}{k} \Delta k \\
\frac{\partial \Delta p_x}{\partial t} &= k \frac{\partial  \Delta v_x}{\partial x}  \\
\end{align}
$$
$\frac{ \frac{\partial p}{\partial t} - S(t)}{k}$ is the so-called source-side wavefield and $\Delta k$ is the model parameter perturbation.


### Naming convention

1. Structs use camel case; for example, the struct for model parameters is named *ModelParams*.
2. Function and variable names use lower-case letters with underscores separating words, for example *fd_coefficients*.

### The locations of sources and receivers are specified by grid index, and the index starts from 1.

## Discretization of Acoustic wave equation 

We use a staggered-grid finite-difference scheme to discretize the acoustic wave equation. The particle velocity $v_z$ is placed on the staggered grid $[i+\frac{1}{2},j,k]$, $v_x$ on the staggered grid $[i, j+\frac{1}{2}, k]$, and the pressure components on the grid $[i,j,k+\frac{1}{2}]$, where $i,j,k$ are the indices in the vertical, horizontal, and time directions, respectively. As a grid index, $k$ denotes the time step and should not be confused with the bulk modulus $k$.

### updating ${\bf v}_z$

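We move the density to the right-hand side of the equation, where it appears as the buoyancy $b = 1/\rho$, and add the damping term $\gamma_z v_z$, which is nonzero only inside the PML region.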
\begin{equation}
\frac{\partial v_z}{\partial t} + \gamma_z v_z = b \frac{\partial p}{\partial z}
\end{equation}

The partial derivatives are approximated by finite differences, and both sides of the equation are centered on $[i+\frac{1}{2},j,k+\frac{1}{2}]$. Typically the time step is much smaller than the spatial grid size, so the finite difference is second order in time, while the spatial derivative can be of any user-defined order.

$$
\frac{v_z[i+\frac{1}{2},j,k+1] - v_z[i+\frac{1}{2},j,k]}{dt} + \gamma_z[i+\frac{1}{2}]
\frac{v_z[i+\frac{1}{2},j,k+1] + v_z[i+\frac{1}{2},j,k]}{2}  = b[i+\frac{1}{2},j] \frac{1}{dz} \left( \sum_{n=1}^N  a_n (p[i+n,j,k+\frac{1}{2}] - p[i-n+1,j,k+\frac{1}{2}])  \right)
$$

Multiplying both sides of the equation by $2\,dt$, we get
$$
2(v_z[i+\frac{1}{2},j,k+1] - v_z[i+\frac{1}{2},j,k]) + dt\, \gamma_z[i+\frac{1}{2}]
(v_z[i+\frac{1}{2},j,k+1] + v_z[i+\frac{1}{2},j,k])  = 2b[i+\frac{1}{2},j] \frac{dt}{dz} \left( \sum_{n=1}^N  a_n (p[i+n,j,k+\frac{1}{2}] - p[i-n+1,j,k+\frac{1}{2}])  \right)
$$

We keep only the terms involving $v_z[i+\frac{1}{2},j,k+1]$ on the left-hand side and move all other terms to the right-hand side.

$$
(2+ dt\, \gamma_z[i+\frac{1}{2}])v_z[i+\frac{1}{2},j,k+1] =  (2- dt\, \gamma_z[i+\frac{1}{2}])v_z[i+\frac{1}{2},j,k] + 2b[i+\frac{1}{2},j]\frac{dt}{dz}
\left( \sum_{n=1}^N  a_n (p[i+n,j,k+\frac{1}{2}] - p[i-n+1,j,k+\frac{1}{2}])  \right)
$$

The final form of the equation becomes
$$
v_z[i+\frac{1}{2},j,k+1] =  \frac{2- dt\, \gamma_z[i+\frac{1}{2}]}{(2+ dt\, \gamma_z[i+\frac{1}{2}])}v_z[i+\frac{1}{2},j,k] + \frac{2b[i+\frac{1}{2},j] \, dt}{(2+ dt\, \gamma_z[i+\frac{1}{2}])}
\left(\frac{1}{dz} \sum_{n=1}^N  a_n (p[i+n,j,k+\frac{1}{2}] - p[i-n+1,j,k+\frac{1}{2}])  \right)
$$
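
As a minimal sketch of the final update formula above, the following Python function advances $v_z$ by one time step on a single vertical column. The function name `update_vz`, the argument layout, and the 0-based storage (entry `i` of `vz` holds the value at the half point below integer point `i`) are assumptions made for illustration and are not taken from SeisAcoustic. The pressure outside the grid is treated as zero.

```python
def update_vz(vz, p, gamma_z, b, a, dt, dz):
    """Advance v_z by one time step on one vertical column.

    vz[i]      : v_z at the half point i+1/2, time step k (assumed layout)
    p[i]       : p = p_z + p_x at the integer point i, time step k+1/2
    gamma_z[i] : PML damping at i+1/2
    b[i]       : buoyancy (1/rho) at i+1/2
    a          : staggered-grid coefficients [a_1, ..., a_N]
    """
    nz = len(vz)

    def p_at(i):
        # the wavefield outside the computation area is taken as zero
        return p[i] if 0 <= i < nz else 0.0

    new_vz = []
    for i in range(nz):
        # (1/dz) * sum_n a_n * (p[i+n] - p[i-n+1])
        dpdz = sum(a[n - 1] * (p_at(i + n) - p_at(i - n + 1))
                   for n in range(1, len(a) + 1)) / dz
        denom = 2.0 + dt * gamma_z[i]
        new_vz.append((2.0 - dt * gamma_z[i]) / denom * vz[i]
                      + 2.0 * b[i] * dt / denom * dpdz)
    return new_vz
```

With `gamma_z` set to zero the two coefficients reduce to $1$ and $b\,dt$, which is the undamped staggered-grid update.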

#### Details about coding:
We create a vector ${\bf M}_{v_z}^{v_z}$ of length $N_z$, which stores the pre-computed values of $\frac{2- dt\, \gamma_z[i+\frac{1}{2}]}{(2+ dt\, \gamma_z[i+\frac{1}{2}])}$, and another vector ${\bf M}_p^{v_z}$ of length $N_z \cdot N_x$, which stores the pre-computed values of $\frac{2b[i+\frac{1}{2},j]}{(2+ dt\, \gamma_z[i+\frac{1}{2}])}\frac{dt}{dz}$. In our implementation, each wavefield is stored as one long vector.
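
The pre-computation of the two coefficient vectors can be sketched as follows. The function name `precompute_vz_coefficients` and the column-major flattening (the $z$ index varies fastest) are assumptions for illustration; the text above only states that the wavefield is stored as one long vector.

```python
def precompute_vz_coefficients(gamma_z, b, dt, dz):
    """Return (M_vz_vz, M_p_vz) for the v_z update.

    gamma_z : list of length nz, damping at i+1/2
    b       : nested list b[i][j], buoyancy at (i+1/2, j)
    """
    nz = len(gamma_z)
    nx = len(b[0])
    # one damping coefficient per depth index, length nz
    m_vz_vz = [(2.0 - dt * g) / (2.0 + dt * g) for g in gamma_z]
    # one coefficient per grid point, length nz * nx
    m_p_vz = [0.0] * (nz * nx)
    for j in range(nx):
        for i in range(nz):
            # assumed column-major layout: z index varies fastest
            m_p_vz[i + j * nz] = (2.0 * b[i][j]
                                  / (2.0 + dt * gamma_z[i]) * dt / dz)
    return m_vz_vz, m_p_vz
```

Computing these vectors once before the time loop leaves only multiplications and additions inside each time step.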

For the spatial derivative, we use a fourth-order finite difference on a 5-point grid as an example. We write it in matrix-times-vector form, which makes it easier to derive the associated adjoint operator.
$$
\begin{bmatrix}
\frac{\partial p}{\partial z}[1+\frac{1}{2}] \\
\frac{\partial p}{\partial z}[2+\frac{1}{2}] \\
\frac{\partial p}{\partial z}[3+\frac{1}{2}] \\
\frac{\partial p}{\partial z}[4+\frac{1}{2}] \\
\frac{\partial p}{\partial z}[5+\frac{1}{2}] \\
\end{bmatrix}
= \begin{bmatrix}
-a_1 & a_1 & a_2 & & \\
-a_2 & -a_1 & a_1 & a_2 & \\
& -a_2 & -a_1 & a_1 & a_2 \\
& & -a_2 & -a_1 & a_1 \\
& & & -a_2 & -a_1 \\
\end{bmatrix}
\begin{bmatrix}
p[1] \\
p[2] \\
p[3] \\
p[4] \\
p[5] \\
\end{bmatrix}
$$

In the above expression, we assume that the wavefield is zero outside the computational area. The adjoint operator is given as
$$
\begin{bmatrix}
p[1] \\
p[2] \\
p[3] \\
p[4] \\
p[5] \\
\end{bmatrix}
= - \begin{bmatrix}
a_1 & a_2 & & & \\
-a_1 & a_1 & a_2 &  & \\
-a_2 & -a_1 & a_1 & a_2 &  \\
& -a_2& -a_1 & a_1 & a_2 \\
& & -a_2 & -a_1 & a_1 \\
\end{bmatrix}
\begin{bmatrix}
\frac{\partial p}{\partial z}[1+\frac{1}{2}] \\
\frac{\partial p}{\partial z}[2+\frac{1}{2}] \\
\frac{\partial p}{\partial z}[3+\frac{1}{2}] \\
\frac{\partial p}{\partial z}[4+\frac{1}{2}] \\
\frac{\partial p}{\partial z}[5+\frac{1}{2}] \\
\end{bmatrix}
$$
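
The adjoint relation can be checked numerically with a dot-product test, $\langle {\bf D}{\bf p}, {\bf q}\rangle = \langle {\bf p}, {\bf D}^T{\bf q}\rangle$. The sketch below builds the $5 \times 5$ forward matrix and compares both sides. The helper names and the test vectors are hypothetical, and the values $a_1 = 9/8$, $a_2 = -1/24$ are the common fourth-order staggered-grid coefficients, used here only as sample numbers.

```python
def forward_matrix(a1, a2, n=5):
    """Matrix mapping p at integer points to dp/dz at half points."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for offset, coeff in ((-1, -a2), (0, -a1), (1, a1), (2, a2)):
            j = i + offset
            if 0 <= j < n:  # zero wavefield outside the grid
                D[i][j] = coeff
    return D


def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


a1, a2 = 9.0 / 8.0, -1.0 / 24.0
D = forward_matrix(a1, a2)
p = [1.0, -2.0, 3.0, 0.5, 4.0]   # arbitrary test vectors
q = [0.3, 1.0, -1.5, 2.0, 0.7]
lhs = dot(matvec(D, p), q)             # <D p, q>
rhs = dot(p, matvec(transpose(D), q))  # <p, D^T q>
```

The two inner products agree to rounding error, and the first row of `transpose(D)` is $[-a_1, -a_2, 0, 0, 0]$, matching the adjoint matrix above once the leading minus sign is applied.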

### updating ${\bf v}_x$

The second equation, with the damping term added and the density expressed as buoyancy, is
$$
\frac{\partial v_x}{\partial t} + \gamma_x v_x = b \frac{\partial p}{\partial x}
$$

We just give the final form of the discretized equation as 
$$
v_x[i,j+\frac{1}{2},k+1] =  \frac{2- dt\, \gamma_x[j+\frac{1}{2}]}{(2+ dt\, \gamma_x[j+\frac{1}{2}])}v_x[i,j+\frac{1}{2},k] + \frac{2b[i,j+\frac{1}{2}]}{(2+ dt\, \gamma_x[j+\frac{1}{2}])}\frac{dt}{dx}
\left( \sum_{n=1}^N  a_n (p[i,j+n,k+\frac{1}{2}] - p[i,j-n+1,k+\frac{1}{2}]) \right)
$$

#### Details about coding:
We create a vector ${\bf M}_{v_x}^{v_x}$ of length $N_x$, which stores the pre-computed values of $\frac{2- dt\, \gamma_x[j+\frac{1}{2}]}{(2+ dt\, \gamma_x[j+\frac{1}{2}])}$, and another vector ${\bf M}_p^{v_x}$ of length $N_z \cdot N_x$, which stores the pre-computed values of $\frac{2b[i,j+\frac{1}{2}]}{(2+ dt\, \gamma_x[j+\frac{1}{2}])}\frac{dt}{dx}$.

For the spatial derivative, we use a fourth-order finite difference on a 5-point grid as an example. We write it in matrix-times-vector form, which makes it easier to derive the associated adjoint operator.
$$
\begin{bmatrix}
\frac{\partial p}{\partial x}[1+\frac{1}{2}] \\
\frac{\partial p}{\partial x}[2+\frac{1}{2}] \\
\frac{\partial p}{\partial x}[3+\frac{1}{2}] \\
\frac{\partial p}{\partial x}[4+\frac{1}{2}] \\
\frac{\partial p}{\partial x}[5+\frac{1}{2}] \\
\end{bmatrix}
= \begin{bmatrix}
-a_1 & a_1 & a_2 & & \\
-a_2 & -a_1 & a_1 & a_2 & \\
& -a_2 & -a_1 & a_1 & a_2 \\
& & -a_2 & -a_1 & a_1 \\
& & & -a_2 & -a_1 \\
\end{bmatrix}
\begin{bmatrix}
p[1] \\
p[2] \\
p[3] \\
p[4] \\
p[5] \\
\end{bmatrix}
$$

In the above expression, we assume that the wavefield is zero outside the computational area. The adjoint operator is given as
$$
\begin{bmatrix}
p[1] \\
p[2] \\
p[3] \\
p[4] \\
p[5] \\
\end{bmatrix}
= - \begin{bmatrix}
a_1 & a_2 & & & \\
-a_1 & a_1 & a_2 &  & \\
-a_2 & -a_1 & a_1 & a_2 &  \\
& -a_2& -a_1 & a_1 & a_2 \\
& & -a_2 & -a_1 & a_1 \\
\end{bmatrix}
\begin{bmatrix}
\frac{\partial p}{\partial x}[1+\frac{1}{2}] \\
\frac{\partial p}{\partial x}[2+\frac{1}{2}] \\
\frac{\partial p}{\partial x}[3+\frac{1}{2}] \\
\frac{\partial p}{\partial x}[4+\frac{1}{2}] \\
\frac{\partial p}{\partial x}[5+\frac{1}{2}] \\
\end{bmatrix}
$$

The last equation is given as
$$
\frac{\partial p}{\partial t} = k(\frac{\partial v_z}{\partial z} + \frac{\partial v_x}{\partial x}) + \delta(z-z_0, x-x_0)f(t)
$$

where $f(t)$ is the source signature and $(z_0,x_0)$ is the source location.

To accommodate the PML absorbing boundary conditions, the pressure is split into two unphysical components $p_z$ and $p_x$, with $p = p_z + p_x$. This gives two equations for the pressure components.

$$
\begin{align}
\frac{\partial p_z}{\partial t} + \gamma_z p_z &= k \frac{\partial v_z}{\partial z} \\
\frac{\partial p_x}{\partial t} + \gamma_x p_x &= k \frac{\partial v_x}{\partial x} +  \delta(z-z_0, x-x_0)f(t) \\
\end{align}
$$

Because the source term is additive, it can be added to either one of the above equations.

### updating ${\bf p}_z$

$$
\frac{\partial p_z}{\partial t} + \gamma_z p_z = k \frac{\partial v_z}{\partial z}
$$

where $k$ is the bulk modulus, $k = \rho v^2$, with $v$ the acoustic velocity.

The partial derivatives are approximated by finite differences, and both sides of the equation are centered on $[i,j,k+1]$.

$$
\frac{p_z[i,j,k+\frac{3}{2}] - p_z[i,j,k+\frac{1}{2}]}{dt} + \gamma_z[i]
\frac{p_z[i,j,k+\frac{3}{2}] + p_z[i,j,k+\frac{1}{2}]}{2} = k[i,j] \frac{1}{dz} \left( \sum_{n=1}^N  a_n (v_z[i+n-\frac{1}{2},j,k+1] - v_z[i-n+\frac{1}{2},j,k+1]) \right)
$$

Multiplying both sides of the equation by $2\,dt$, we get
$$
2(p_z[i,j,k+\frac{3}{2}] - p_z[i,j,k+\frac{1}{2}]) + dt \, \gamma_z[i]
(p_z[i,j,k+\frac{3}{2}] + p_z[i,j,k+\frac{1}{2}]) = 2k[i,j] \frac{dt}{dz} \left( \sum_{n=1}^N  a_n (v_z[i+n-\frac{1}{2},j,k+1] - v_z[i-n+\frac{1}{2},j,k+1]) \right)
$$

Collecting the terms involving $p_z[i,j,k+\frac{3}{2}]$ on the left-hand side, we get
$$
(2+dt \, \gamma_z[i])p_z[i,j,k+\frac{3}{2}] = (2-dt \, \gamma_z[i]) p_z[i,j,k+\frac{1}{2}] + 2k[i,j] \frac{dt}{dz} \left( \sum_{n=1}^N  a_n (v_z[i+n-\frac{1}{2},j,k+1] - v_z[i-n+\frac{1}{2},j,k+1]) \right)
$$

The final form of this equation is
$$
p_z[i,j,k+\frac{3}{2}] = \frac{2-dt \, \gamma_z[i]}{2+dt \, \gamma_z[i]} p_z[i,j,k+\frac{1}{2}] + \frac{2k[i,j]}{2+dt \, \gamma_z[i]}\frac{dt}{dz} \left( \sum_{n=1}^N  a_n (v_z[i+n-\frac{1}{2},j,k+1] - v_z[i-n+\frac{1}{2},j,k+1]) \right)
$$


#### Details about coding:
We create a vector ${\bf M}_{p_z}^{p_z}$ of length $N_z$, which stores the pre-computed values of $\frac{2- dt\, \gamma_z[i]}{(2+ dt\, \gamma_z[i])}$, and another vector ${\bf M}_{v_z}^{p_z}$ of length $N_z \cdot N_x$, which stores the pre-computed values of $\frac{2k[i,j]}{(2+ dt\, \gamma_z[i])}\frac{dt}{dz}$.

For the spatial derivative, we use a fourth-order finite difference on a 5-point grid as an example. We write it in matrix-times-vector form, which makes it easier to derive the associated adjoint operator.
$$
\begin{bmatrix}
\frac{\partial v_z}{\partial z}[1] \\
\frac{\partial v_z}{\partial z}[2] \\
\frac{\partial v_z}{\partial z}[3] \\
\frac{\partial v_z}{\partial z}[4] \\
\frac{\partial v_z}{\partial z}[5] \\
\end{bmatrix}
= \begin{bmatrix}
a_1 & a_2 & & &\\
-a_1& a_1 & a_2 & & \\
-a_2& -a_1 & a_1 & a_2 & \\
& -a_2 & -a_1 & a_1 & a_2 \\
& & -a_2 & -a_1 & a_1 \\
\end{bmatrix}
\begin{bmatrix}
v_z[1+\frac{1}{2}] \\
v_z[2+\frac{1}{2}] \\
v_z[3+\frac{1}{2}] \\
v_z[4+\frac{1}{2}] \\
v_z[5+\frac{1}{2}] \\
\end{bmatrix}
$$

In the above expression, we assume that the wavefield is zero outside the computational area. The adjoint operator is given as
$$
\begin{bmatrix}
v_z[1+\frac{1}{2}] \\
v_z[2+\frac{1}{2}] \\
v_z[3+\frac{1}{2}] \\
v_z[4+\frac{1}{2}] \\
v_z[5+\frac{1}{2}] \\
\end{bmatrix}
= - \begin{bmatrix}
-a_1 & a_1 & a_2 & & \\
-a_2 & -a_1 & a_1 & a_2 &   \\
& -a_2 & -a_1 & a_1 & a_2  \\
& & -a_2 & -a_1 & a_1  \\
& & & -a_2 & -a_1  \\
\end{bmatrix}
\begin{bmatrix}
\frac{\partial v_z}{\partial z}[1] \\
\frac{\partial v_z}{\partial z}[2] \\
\frac{\partial v_z}{\partial z}[3] \\
\frac{\partial v_z}{\partial z}[4] \\
\frac{\partial v_z}{\partial z}[5] \\
\end{bmatrix}
$$
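
Comparing the two pairs of matrices shows that the matrix applied to $v_z$ is the negative transpose of the matrix applied to $p$, so the adjoint of one staggered derivative is minus the other. The sketch below checks this entry by entry. The helper name `staggered_matrix`, its `shift` argument, and the sample coefficients $a_1 = 9/8$, $a_2 = -1/24$ are assumptions for illustration.

```python
def staggered_matrix(a1, a2, shift, n=5):
    """Build an n-by-n fourth-order staggered derivative matrix.

    shift=0: half points -> integer points (the v_z -> dv_z/dz matrix)
    shift=1: integer points -> half points (the p -> dp/dz matrix)
    """
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for offset, coeff in ((-2, -a2), (-1, -a1), (0, a1), (1, a2)):
            j = i + offset + shift
            if 0 <= j < n:  # zero wavefield outside the grid
                M[i][j] = coeff
    return M


a1, a2 = 9.0 / 8.0, -1.0 / 24.0
D = staggered_matrix(a1, a2, shift=1)  # acts on p
E = staggered_matrix(a1, a2, shift=0)  # acts on v_z

# E equals -D^T entry by entry
is_negative_transpose = all(E[i][j] == -D[j][i]
                            for i in range(5) for j in range(5))
```

This property relies on the zero-value boundary assumption stated above; other boundary treatments would change the first and last rows.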

### updating ${\bf p}_x$
Similarly, we can get the discretized equations for the last equation

$$
\frac{\partial p_x}{\partial t} + \gamma_x p_x = k \frac{\partial v_x}{\partial x} +  \delta(z-z_0, x-x_0)f(t) \\
$$

$$
\frac{p_x[i,j,k+\frac{3}{2}] - p_x[i,j,k+\frac{1}{2}]}{dt} + \gamma_x[j]
\frac{p_x[i,j,k+\frac{3}{2}] + p_x[i,j,k+\frac{1}{2}]}{2} = k[i,j] \frac{1}{dx} \left( \sum_{n=1}^N  a_n (v_x[i,j+n-\frac{1}{2},k+1] - v_x[i,j-n+\frac{1}{2},k+1]) \right) + \delta(z-z_0, x-x_0) f(t)
$$

Multiplying both sides of the equation by $2\,dt$, we get
$$
2(p_x[i,j,k+\frac{3}{2}] - p_x[i,j,k+\frac{1}{2}]) + dt \, \gamma_x[j]
(p_x[i,j,k+\frac{3}{2}] + p_x[i,j,k+\frac{1}{2}]) = 2 k[i,j] \frac{dt}{dx} \left( \sum_{n=1}^N  a_n (v_x[i,j+n-\frac{1}{2},k+1] - v_x[i,j-n+\frac{1}{2},k+1]) \right) + 2 dt\, \delta(z-z_0, x-x_0)f(t)
$$

Moving the terms that include $p_x[i,j,k+\frac{3}{2}]$ to the left-hand side of the equation gives
$$
(2+dt \, \gamma_x[j])p_x[i,j,k+\frac{3}{2}] = (2-dt \, \gamma_x[j]) p_x[i,j,k+\frac{1}{2}] + 2k[i,j] \frac{dt}{dx} \left( \sum_{n=1}^N  a_n (v_x[i,j+n-\frac{1}{2},k+1] - v_x[i,j-n+\frac{1}{2},k+1]) \right) + 2 dt\, \delta(z-z_0, x-x_0)f(t)
$$

Dividing by $(2+ dt\, \gamma_x[j])$ gives
$$
p_x[i,j,k+\frac{3}{2}] = \frac{2-dt \, \gamma_x[j]}{2+dt \, \gamma_x[j]} p_x[i,j,k+\frac{1}{2}] + \frac{2k[i,j]}{2+dt \, \gamma_x[j]}\frac{dt}{dx} \left( \sum_{n=1}^N  a_n (v_x[i,j+n-\frac{1}{2},k+1] - v_x[i,j-n+\frac{1}{2},k+1]) \right) + \frac{2\,dt}{2+dt \, \gamma_x[j]}\,\delta(z-z_0, x-x_0)f(t)
$$

Outside of the PML region, $\gamma_x$ is always zero, so for a source located there the source coefficient reduces to
$$
\frac{2\,dt}{2+dt\, \gamma_x[j]}\,\delta(z-z_0, x-x_0)f(t)  = dt \,\delta(z-z_0, x-x_0)f(t) 
$$
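
A minimal sketch of the source injection is given below. The function name `inject_source`, the column-major flattening of $p_x$ (the $z$ index varies fastest), and the choice of a discrete delta function equal to 1 at the source node are all assumptions for illustration. The grid indices are 1-based, as stated in the conventions above.

```python
def inject_source(px, iz, ix, nz, f_value, dt, gamma_x=0.0):
    """Add the source contribution to p_x at grid point (iz, ix).

    iz, ix  : 1-based grid indices of the source
    px      : flat list, assumed column-major (z index fastest)
    f_value : source signature f(t) at the current time step
    gamma_x : damping at the source column (zero outside the PML)
    """
    # convert the 1-based grid index to a 0-based offset
    idx = (iz - 1) + (ix - 1) * nz
    # general coefficient 2*dt / (2 + dt*gamma_x); equals dt if gamma_x == 0
    px[idx] += 2.0 * dt / (2.0 + dt * gamma_x) * f_value
    return px
```

For example, on a grid with `nz = 3` a source at `(iz, ix) = (2, 2)` maps to offset 4 of the flat vector.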


#### Details about coding:
We create a vector ${\bf M}_{p_x}^{p_x}$ of length $N_x$, which stores the pre-computed values of $\frac{2- dt\, \gamma_x[j]}{(2+ dt\, \gamma_x[j])}$, and another vector ${\bf M}_{v_x}^{p_x}$ of length $N_z \cdot N_x$, which stores the pre-computed values of $\frac{2k[i,j]}{(2+ dt\, \gamma_x[j])}\frac{dt}{dx}$.

For the spatial derivative, we use a fourth-order finite difference on a 5-point grid as an example. We write it in matrix-times-vector form, which makes it easier to derive the associated adjoint operator.
$$
\begin{bmatrix}
\frac{\partial v_x}{\partial x}[1] \\
\frac{\partial v_x}{\partial x}[2] \\
\frac{\partial v_x}{\partial x}[3] \\
\frac{\partial v_x}{\partial x}[4] \\
\frac{\partial v_x}{\partial x}[5] \\
\end{bmatrix}
= \begin{bmatrix}
a_1 & a_2 & & &\\
-a_1& a_1 & a_2 & & \\
-a_2& -a_1 & a_1 & a_2 & \\
& -a_2 & -a_1 & a_1 & a_2 \\
& & -a_2 & -a_1 & a_1 \\
\end{bmatrix}
\begin{bmatrix}
v_x[1+\frac{1}{2}] \\
v_x[2+\frac{1}{2}] \\
v_x[3+\frac{1}{2}] \\
v_x[4+\frac{1}{2}] \\
v_x[5+\frac{1}{2}] \\
\end{bmatrix}
$$

In the above expression, we assume that the wavefield is zero outside the computational area. The adjoint operator is given as
$$
\begin{bmatrix}
v_x[1+\frac{1}{2}] \\
v_x[2+\frac{1}{2}] \\
v_x[3+\frac{1}{2}] \\
v_x[4+\frac{1}{2}] \\
v_x[5+\frac{1}{2}] \\
\end{bmatrix}
= - \begin{bmatrix}
-a_1 & a_1 & a_2 & & \\
-a_2 & -a_1 & a_1 & a_2 &   \\
& -a_2 & -a_1 & a_1 & a_2  \\
& & -a_2 & -a_1 & a_1  \\
& & & -a_2 & -a_1  \\
\end{bmatrix}
\begin{bmatrix}
\frac{\partial v_x}{\partial x}[1] \\
\frac{\partial v_x}{\partial x}[2] \\
\frac{\partial v_x}{\partial x}[3] \\
\frac{\partial v_x}{\partial x}[4] \\
\frac{\partial v_x}{\partial x}[5] \\
\end{bmatrix}
$$

We define the state vectors at two neighbouring time steps as
$$
{\bf u}^{k} = \begin{bmatrix}
{\bf v}_z^{k} \\
{\bf v}_x^{k} \\
{\bf p}_z^{k+\frac{1}{2}} \\
{\bf p}_x^{k+\frac{1}{2}} \\
\end{bmatrix}, \,\,\, 
{\bf u}^{k+1} = \begin{bmatrix}
{\bf v}_z^{k+1} \\
{\bf v}_x^{k+1} \\
{\bf p}_z^{k+1+\frac{1}{2}} \\
{\bf p}_x^{k+1+\frac{1}{2}} \\
\end{bmatrix},
$$

With these expressions, one step of forward modeling is given as
$$
\begin{bmatrix}
{\bf v}_z^{k+1} \\
{\bf v}_x^{k+1} \\
{\bf p}_z^{k+1+\frac{1}{2}} \\
{\bf p}_x^{k+1+\frac{1}{2}} \\
\end{bmatrix}
=\begin{bmatrix}
\bf{I}   & \bf{0}   & \bf{0} & \bf{0}   \\
\bf{0}   & \bf{I}   & \bf{0} & \bf{0}   \\
{\bf M}_{v_z}^{p_z} & \bf{0} & {\bf M}_{p_z}^{p_z} & \bf{0}   \\ 
\bf{0} & {\bf M}_{v_x}^{p_x} & \bf{0} & {\bf M}_{p_x}^{p_x} \\ 
\end{bmatrix}
\begin{bmatrix}
{\bf M}_{v_z}^{v_z} & \bf{0} & {\bf M}_{p}^{v_z} & {\bf M}_{p}^{v_z} \\
\bf{0} & {\bf M}_{v_x}^{v_x} & {\bf M}_{p}^{v_x} & {\bf M}_{p}^{v_x} \\
\bf{0}   & \bf{0}   & \bf{I}   & \bf{0}   \\
\bf{0}   & \bf{0}   & \bf{0}  & \bf{I}    \\
\end{bmatrix}
\begin{bmatrix}
{\bf v}_z^{k} \\
{\bf v}_x^{k} \\
{\bf p}_z^{k+\frac{1}{2}} \\
{\bf p}_x^{k+\frac{1}{2}}
\end{bmatrix}
$$

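Or, in compact form, ${\bf u}^{k+1} = {\bf L} {\bf u}^{k}$, where ${\bf L} = {\bf L}_p {\bf L}_v$; ${\bf L}_v$ is the right-hand factor above (velocity update) and ${\bf L}_p$ is the left-hand factor (pressure update).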
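
The order of the two factors matters: the velocity is updated first from the current pressure, and the pressure is then updated from the new velocity. The sketch below illustrates this for a simplified 1D case with a single pressure field (no split). All function names and the 0-based storage are assumptions for illustration, not the SeisAcoustic implementation.

```python
def diff_to_half(p, a, dz):
    """dp/dz at half points i+1/2 from values at integer points."""
    n = len(p)

    def get(i):
        return p[i] if 0 <= i < n else 0.0  # zero outside the grid

    return [sum(a[m - 1] * (get(i + m) - get(i - m + 1))
                for m in range(1, len(a) + 1)) / dz for i in range(n)]


def diff_to_int(v, a, dz):
    """dv/dz at integer points i from values at half points i+1/2."""
    n = len(v)

    def get(i):
        return v[i] if 0 <= i < n else 0.0  # zero outside the grid

    return [sum(a[m - 1] * (get(i + m - 1) - get(i - m))
                for m in range(1, len(a) + 1)) / dz for i in range(n)]


def one_step(vz, pz, gamma_half, gamma_int, b, k, a, dt, dz):
    """Apply L = L_p L_v once: velocity update first, then pressure."""
    # L_v: new velocity from the old velocity and the current pressure
    dp = diff_to_half(pz, a, dz)
    vz_new = [((2 - dt * g) * v + 2 * bi * dt * d) / (2 + dt * g)
              for v, g, bi, d in zip(vz, gamma_half, b, dp)]
    # L_p: new pressure from the old pressure and the updated velocity
    dv = diff_to_int(vz_new, a, dz)
    pz_new = [((2 - dt * g) * p + 2 * ki * dt * d) / (2 + dt * g)
              for p, g, ki, d in zip(pz, gamma_int, k, dv)]
    return vz_new, pz_new
```

Starting from a unit pressure spike and zero velocity, one call spreads the spike symmetrically to its two neighbours, as expected for a wave propagating in both directions.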

The multi-step forward modelling can be expressed in the same way. Here we use four time steps as an example, which makes the derivation of the inversion process much easier; ${\bf f}_k$ denotes the source term injected at time step $k$.
$$
\begin{bmatrix}
{\bf u}^1 \\
{\bf u}^2 \\
{\bf u}^3 \\
{\bf u}^4 \\
\end{bmatrix}
=
\begin{bmatrix}
{\bf I} & & & \\
 & {\bf I} & & \\
 & &  {\bf I}& \\
 & & {\bf L}& {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
 & {\bf I} & & \\
 & {\bf L} &  {\bf I}& \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
{\bf L} & {\bf I} & & \\
 & &  {\bf I}& \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
& {\bf I} & & \\
 & &  {\bf I}& \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf f}_1 \\
{\bf f}_2 \\
{\bf f}_3 \\
{\bf f}_4 \\
\end{bmatrix}
$$
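
Multiplying out the block matrices from right to left gives the recursion ${\bf u}^1 = {\bf f}_1$ and ${\bf u}^{k+1} = {\bf L}{\bf u}^k + {\bf f}_{k+1}$. The sketch below implements this recursion for a small dense matrix. The function names and the $2 \times 2$ operator in the usage note are hypothetical and stand in for the much larger sparse operator ${\bf L}$.

```python
def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def forward(L, f):
    """Return [u_1, ..., u_nt] with u_1 = f_1, u_{k+1} = L u_k + f_{k+1}."""
    u = [list(f[0])]
    for fk in f[1:]:
        Lu = matvec(L, u[-1])
        u.append([x + y for x, y in zip(Lu, fk)])
    return u
```

For instance, with `L = [[0.0, 1.0], [-1.0, 0.0]]` and a source that is nonzero only at the first step, `f = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]`, the state simply rotates by a quarter turn at each step.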

The synthetic data at one time step are ${\bf d}^{k}= {\bf S} {\bf u}^{k}$, where ${\bf S}$ is the sampling operator, which depends on the location of the receivers. For FWI, we estimate the model parameters by minimizing the cost function $$
J = \frac{1}{2}|| {\bf d} - {\bf d}_{obs}||_2^2$$ where ${\bf d}_{obs}$ is the observed data.
The next step is to derive the gradient of the objective function with respect to one model parameter $m_i$.
$$\frac{\partial J}{\partial m_i} = (\frac{\partial {\bf u}}{\partial m_i})^T \frac{\partial J}{\partial {\bf u}}$$

where $\frac{\partial J}{\partial {\bf u}} = {\bf S}^T ({\bf d} - {\bf d}_{obs})$. The difference between the synthetic data and the observed data is usually called the residual ${\bf r}$, so we have $$
\frac{\partial J}{\partial {\bf u}} = {\bf S}^T {\bf r}
$$

Then we derive $\frac{\partial {\bf u}}{\partial m_i}$ by applying the product rule to the factorized forward operator:
$$
\begin{align}
\frac{\partial {\bf u}}{\partial m_i} &= \begin{bmatrix}
{\bf 0} & & & \\
 & {\bf 0} & & \\
 & &  {\bf 0}& \\
 & & \frac{\partial {\bf L}}{\partial m_i}& {\bf 0} \\
\end{bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
 & {\bf I} & & \\
 & {\bf L} &  {\bf I}& \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
{\bf L} & {\bf I} & & \\
 & &  {\bf I}& \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
& {\bf I} & & \\
 & &  {\bf I}& \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf f}_1 \\
{\bf f}_2 \\
{\bf f}_3 \\
{\bf f}_4 \\
\end{bmatrix}                  \\
&+ \begin{bmatrix}
{\bf I} & & & \\
 & {\bf I} & & \\
 & &  {\bf I}& \\
 & & {\bf L}& {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf 0} & & & \\
 & {\bf 0} & & \\
 & \frac{\partial {\bf L}}{\partial m_i} & {\bf 0}& \\
 & & & {\bf 0} \\
\end {bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
{\bf L} & {\bf I} & & \\
 & &  {\bf I}& \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
& {\bf I} & & \\
 & &  {\bf I}& \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf f}_1 \\
{\bf f}_2 \\
{\bf f}_3 \\
{\bf f}_4 \\
\end{bmatrix}                 \\
&+ \begin{bmatrix}
{\bf I} & & & \\
 & {\bf I} & & \\
 & &  {\bf I}& \\
 & & {\bf L}& {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
 & {\bf I} & & \\
 & {\bf L} &  {\bf I}& \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf 0} & & & \\
\frac{\partial {\bf L}}{\partial m_i} & {\bf 0} & & \\
 & &  {\bf 0}& \\
 & & & {\bf 0} \\
\end {bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
& {\bf I} & & \\
 & &  {\bf I}& \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf f}_1 \\
{\bf f}_2 \\
{\bf f}_3 \\
{\bf f}_4 \\
\end{bmatrix}
\end{align}
$$

The above expression can be simplified as follows, where the time index is now written as a subscript (${\bf u}_k \equiv {\bf u}^k$):
$$
\begin{align}
\frac{\partial {\bf u}}{\partial m_i} &= \begin{bmatrix}
{\bf 0} & & & \\
 & {\bf 0} & & \\
 & &  {\bf 0}& \\
 & & \frac{\partial {\bf L}}{\partial m_i}& {\bf 0} \\
\end{bmatrix}
\begin{bmatrix}
{\bf u}_1 \\
{\bf u}_2 \\
{\bf u}_3 \\
{\bf f}_4 \\
\end{bmatrix}                  \\
&+ \begin{bmatrix}
{\bf I} & & & \\
 & {\bf I} & & \\
 & &  {\bf I}& \\
 & & {\bf L}& {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf 0} & & & \\
 & {\bf 0} & & \\
 & \frac{\partial {\bf L}}{\partial m_i} & {\bf 0}& \\
 & & & {\bf 0} \\
\end {bmatrix}
\begin{bmatrix}
{\bf u}_1 \\
{\bf u}_2 \\
{\bf f}_3 \\
{\bf f}_4 \\
\end{bmatrix}                 \\
&+ \begin{bmatrix}
{\bf I} & & & \\
 & {\bf I} & & \\
 & &  {\bf I}& \\
 & & {\bf L}& {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
 & {\bf I} & & \\
 & {\bf L} &  {\bf I}& \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf 0} & & & \\
\frac{\partial {\bf L}}{\partial m_i} & {\bf 0} & & \\
 & &  {\bf 0}& \\
 & & & {\bf 0} \\
\end {bmatrix}
\begin{bmatrix}
{\bf u}_1 \\
{\bf f}_2 \\
{\bf f}_3 \\
{\bf f}_4 \\
\end{bmatrix}
\end{align}
$$

$$
\begin{align}
\frac{\partial J}{\partial m_i} &= (\frac{\partial {\bf u}}{\partial m_i})^T \frac{\partial J}{\partial {\bf u}} \\
&= \begin{bmatrix}
{\bf u}_1^T & {\bf u}_2^T & {\bf u}_3^T & {\bf f}_4^T
\end{bmatrix}
\begin{bmatrix}
{\bf 0} & & & \\
 & {\bf 0} & & \\
 & &  {\bf 0}& (\frac{\partial {\bf L}}{\partial m_i})^T \\
 & &  & {\bf 0} \\
\end{bmatrix}
\begin{bmatrix}
{\bf S}^T {\bf r}_1 \\
{\bf S}^T {\bf r}_2 \\
{\bf S}^T {\bf r}_3 \\
{\bf S}^T {\bf r}_4 \\
\end{bmatrix}            \\
&+ \begin{bmatrix}
{\bf u}_1^T & {\bf u}_2^T & {\bf f}_3^T & {\bf f}_4^T
\end{bmatrix}
\begin{bmatrix}
{\bf 0} & & & \\
 & {\bf 0} & (\frac{\partial {\bf L}}{\partial m_i})^T & \\
 & &  {\bf 0}&  \\
 & &  & {\bf 0} \\
\end{bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
 & {\bf I} & & \\
 & &  {\bf I}& {\bf L}^T \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf S}^T {\bf r}_1 \\
{\bf S}^T {\bf r}_2 \\
{\bf S}^T {\bf r}_3 \\
{\bf S}^T {\bf r}_4 \\
\end{bmatrix}            \\
&+ \begin{bmatrix}
{\bf u}_1^T & {\bf f}_2^T & {\bf f}_3^T & {\bf f}_4^T
\end{bmatrix}
\begin{bmatrix}
{\bf 0} & (\frac{\partial {\bf L}}{\partial m_i})^T   & & \\
 & {\bf 0} & & \\
 & &  {\bf 0}& \\
 & &  & {\bf 0} \\
\end{bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
 & {\bf I} & {\bf L}^T & \\
 & &  {\bf I}&  \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf I} & & & \\
 & {\bf I} & & \\
 & &  {\bf I}& {\bf L}^T \\
 & & & {\bf I} \\
\end {bmatrix}
\begin{bmatrix}
{\bf S}^T {\bf r}_1 \\
{\bf S}^T {\bf r}_2 \\
{\bf S}^T {\bf r}_3 \\
{\bf S}^T {\bf r}_4 \\
\end{bmatrix}            \\
\end{align}
$$

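We denote the adjoint wavefield by ${\bf b}_k$, defined by the backward recursion ${\bf b}_4 = {\bf S}^T {\bf r}_4$ and ${\bf b}_k = {\bf S}^T {\bf r}_k + {\bf L}^T {\bf b}_{k+1}$, so the gradient can be simplified as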
$$
\begin{align}
\frac{\partial J}{\partial m_i} &= (\frac{\partial {\bf u}}{\partial m_i})^T \frac{\partial J}{\partial {\bf u}} \\
&= \begin{bmatrix}
{\bf u}_1^T & {\bf u}_2^T & {\bf u}_3^T & {\bf f}_4^T
\end{bmatrix}
\begin{bmatrix}
{\bf 0} & & & \\
 & {\bf 0} & & \\
 & &  {\bf 0}& (\frac{\partial {\bf L}}{\partial m_i})^T \\
 & &  & {\bf 0} \\
\end{bmatrix}
\begin{bmatrix}
{\bf S}^T {\bf r}_1 \\
{\bf S}^T {\bf r}_2 \\
{\bf S}^T {\bf r}_3 \\
{\bf b}_4 \\
\end{bmatrix}            \\
&+ \begin{bmatrix}
{\bf u}_1^T & {\bf u}_2^T & {\bf f}_3^T & {\bf f}_4^T
\end{bmatrix}
\begin{bmatrix}
{\bf 0} & & & \\
 & {\bf 0} & (\frac{\partial {\bf L}}{\partial m_i})^T & \\
 & &  {\bf 0}&  \\
 & &  & {\bf 0} \\
\end{bmatrix}
\begin{bmatrix}
{\bf S}^T {\bf r}_1 \\
{\bf S}^T {\bf r}_2 \\
 {\bf b}_3 \\
 {\bf b}_4 \\
\end{bmatrix}            \\
&+ \begin{bmatrix}
{\bf u}_1^T & {\bf f}_2^T & {\bf f}_3^T & {\bf f}_4^T
\end{bmatrix}
\begin{bmatrix}
{\bf 0} & (\frac{\partial {\bf L}}{\partial m_i})^T   & & \\
 & {\bf 0} & & \\
 & &  {\bf 0}& \\
 & &  & {\bf 0} \\
\end{bmatrix}
\begin{bmatrix}
{\bf S}^T {\bf r}_1 \\
 {\bf b}_2 \\
 {\bf b}_3 \\
 {\bf b}_4 \\
\end{bmatrix}            \\
\end{align}
$$

So the final form is 
$$\frac{\partial J}{\partial m_i} = {\bf u}_3^T (\frac{\partial {\bf L}}{\partial m_i})^T {\bf b}_4 
+ {\bf u}_2^T (\frac{\partial {\bf L}}{\partial m_i})^T {\bf b}_3 
+ {\bf u}_1^T (\frac{\partial {\bf L}}{\partial m_i})^T {\bf b}_2 $$
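
The final formula can be verified on a toy problem by comparing it with a finite-difference derivative of $J$. In the sketch below, the operator is taken as ${\bf L}(m) = m{\bf A}$ with a fixed matrix ${\bf A}$, so that $\partial {\bf L}/\partial m = {\bf A}$, and the sampling operator ${\bf S}$ is the identity. Both choices, and all function names, are assumptions made only to keep the example small.

```python
def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def forward(L, f):
    """u_1 = f_1, u_{k+1} = L u_k + f_{k+1}."""
    u = [list(f[0])]
    for fk in f[1:]:
        u.append([x + y for x, y in zip(matvec(L, u[-1]), fk)])
    return u


def residuals(m, A, f, d_obs):
    L = [[m * aij for aij in row] for row in A]
    u = forward(L, f)
    r = [[x - y for x, y in zip(uk, dk)] for uk, dk in zip(u, d_obs)]
    return L, u, r


def misfit(m, A, f, d_obs):
    """J = 1/2 * sum_k ||u_k - d_obs_k||^2 with S = identity (assumed)."""
    _, _, r = residuals(m, A, f, d_obs)
    return 0.5 * sum(dot(rk, rk) for rk in r)


def gradient(m, A, f, d_obs):
    """Adjoint-state gradient dJ/dm for the toy operator L(m) = m * A."""
    L, u, r = residuals(m, A, f, d_obs)
    Lt = transpose(L)
    dLt = transpose(A)      # (dL/dm)^T, since dL/dm = A
    b = list(r[-1])         # adjoint wavefield at the last time step
    grad = 0.0
    for k in range(len(u) - 2, -1, -1):
        grad += dot(u[k], matvec(dLt, b))  # u_k^T (dL/dm)^T b_{k+1}
        b = [x + y for x, y in zip(r[k], matvec(Lt, b))]  # b_k
    return grad
```

The backward loop needs the forward wavefields in reverse time order, which is why they are stored (or recomputed) before the adjoint pass.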
