# 1.7:  Uncertainty estimation with the bootstrap - Earthquake location

*Andrew Valentine & Malcolm Sambridge - Research School of Earth Sciences, The Australian National University - Last updated Oct. 2020*

<!--<badge>--><a href="https://colab.research.google.com/github/anu-ilab/JupyterPracticals/blob/main/S1.7 - Bootstrap error propagation_earthquake.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a><!--</badge>-->

This practical explores the use of the bootstrap method for uncertainty estimation. Recall that both linear-theory error estimation and Monte Carlo error propagation require knowledge of the size of the data errors, in the form of a data covariance matrix. The bootstrap can be used to estimate the error in a solution without knowledge of the size of the errors in the data. Instead, it assumes that the data errors (or, more usually, the data residuals) are independent and identically distributed (IID). This can be a reasonable assumption if correlation between data errors is minimal.

Here we apply the bootstrap to a nonlinear inverse problem that requires an iterative solution. 

As an example, we will consider earthquake location.

Here we make use of a ready-made Python script to iteratively update an earthquake's location $(x,y,z)$ and origin time $t$. We use a homogeneous crustal Earth model with wave speed $v=5.8$ km/s.
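As a minimal sketch of the forward problem under this model, the predicted arrival time at a station is the origin time plus the straight-ray distance divided by $v$. The function name `predicted_arrivals` and the Cartesian coordinates in km are illustrative assumptions; the course routine works with longitude, latitude, and elevation and may differ in detail.

```python
import numpy as np

def predicted_arrivals(source, t0, stations, v=5.8):
    """Arrival times (s) for straight rays in a homogeneous medium.

    source   : (3,) hypothetical source position in km
    t0       : origin time in s
    stations : (n, 3) hypothetical station positions in km
    v        : wave speed in km/s
    """
    source = np.asarray(source, dtype=float)
    stations = np.asarray(stations, dtype=float)
    dist = np.linalg.norm(stations - source, axis=1)  # source-station distances
    return t0 + dist / v

# Two made-up stations: one at the source, one 58 km away
stations = np.array([[0.0, 0.0, 0.0], [58.0, 0.0, 0.0]])
print(predicted_arrivals([0.0, 0.0, 0.0], 10.0, stations))
```

Locating the earthquake amounts to adjusting the source position and origin time until these predictions match the observed arrival times as closely as possible.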


First load some libraries.

In [None]:
# -------------------------------------------------------- #
#                                                          #
#     Uncomment below to set up environment on "colab"     #
#                                                          #
# -------------------------------------------------------- #

# !pip install -U anu-inversion-course
# !git clone https://github.com/anu-ilab/JupyterPracticals
# %cd JupyterPracticals

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
from scipy import stats
import numpy as np
import math
import pickle
from anu_inversion_course import plotcovellipse as pc
from anu_inversion_course import eqlocate as eq


Now read in the seismic station locations and the arrival times of the earthquake at these stations. We also read in a border for a plot of the iterative solution.

In [2]:
# load station locations (la, lo, el) and observed arrival times (ts)
with open("Datasets/loctim.pickle","rb") as pickle_eq:
    [la,lo,el,ts] = pickle.load(pickle_eq)

vp0=5.8 # seismic wavespeed of halfspace Earth model (km/s)
vp = vp0*np.ones(la.shape[0]) # same wavespeed assigned to every station

# load border.xy
with open("Datasets/border.pickle","rb") as pickle_b:
    [borderx,bordery] = pickle.load(pickle_b)

**Task 1** 

Use the earthquake location routine *eqlocate.py* to find the best fit location from the data.

The format of the routine eqlocate is as follows:
```
# sols holds the iterative solutions found; res holds the final residuals.
# If solvedep=True is given as an argument, the routine will solve for depth;
# otherwise depth is fixed at the input guess value.
sols, res = eq.eqlocate(x0,y0,z0,ts,la,lo,el,vp,tol)

# Output of the routine is as follows:
# res  = the observed minus predicted arrival times of the final solution.
# sols = each iteration of the location procedure, so sols[-1] gives the final solution.

# Parameters are arranged as follows:
tfinal = sols[-1,0] # First parameter is the origin time
xfinal = sols[-1,1] # Second parameter is the longitude
yfinal = sols[-1,2] # Third parameter is the latitude
zfinal = sols[-1,3] # Fourth parameter is the depth
```

In [3]:
# Try it here! You can insert more cells by selecting Cell > Insert Cell Above/Below from the menu
# bar, or by pressing Esc to enter command mode and then hitting A or B (for above/below). 


**Task 2** 

Describe how you would apply the bootstrap to calculate errors in the earthquake location parameters.

##### Enter your answer here.  You can use Markdown to format text.


**Task 3** 

Use the bootstrap to calculate 5000 bootstrap solutions and then plot these about the original solution.

Answer: We draw 5000 bootstrap resamples of the arrival time residuals (sampling with replacement), add each resample to the predicted arrival times of the best fit solution, and relocate the event each time.
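The procedure can be sketched on a self-contained toy problem. Everything here is an assumption made for illustration: the synthetic stations and arrival times, the Cartesian geometry with fixed depth, and the simple Gauss-Newton `locate` function, which stands in for `eq.eqlocate` and is not its implementation. Only the resampling loop at the end mirrors the answer above.

```python
import numpy as np

V = 5.8   # km/s, homogeneous wave speed as in the text
Z = 10.0  # km, fixed source depth (assumed for this toy problem)

def distance(m, sta):
    """Source-station distance (km) for m = (t0, x, y) at fixed depth Z."""
    return np.sqrt((sta[:, 0] - m[1])**2 + (sta[:, 1] - m[2])**2 + Z**2)

def predict(m, sta):
    """Predicted arrival times (s): origin time plus straight-ray travel time."""
    return m[0] + distance(m, sta) / V

def locate(t_obs, sta, m0, n_iter=15):
    """Toy Gauss-Newton locator; a stand-in for eq.eqlocate, not its code."""
    m = np.array(m0, dtype=float)
    for _ in range(n_iter):
        d = distance(m, sta)
        # Partial derivatives of arrival time with respect to (t0, x, y)
        G = np.column_stack([np.ones(len(sta)),
                             (m[1] - sta[:, 0]) / (V * d),
                             (m[2] - sta[:, 1]) / (V * d)])
        dm = np.linalg.lstsq(G, t_obs - predict(m, sta), rcond=None)[0]
        m += dm
    return m

rng = np.random.default_rng(0)
sta = rng.uniform(-100.0, 100.0, size=(15, 2))   # synthetic stations (km)
m_true = np.array([5.0, 10.0, -20.0])            # synthetic (t0, x, y)
t_obs = predict(m_true, sta) + rng.normal(0.0, 0.2, size=15)

# Best fit solution, its predictions, and residuals (observed minus predicted)
m_best = locate(t_obs, sta, m0=[0.0, 0.0, 0.0])
t_pred = predict(m_best, sta)
res = t_obs - t_pred

# Residual bootstrap: resample residuals with replacement, add them to the
# predicted arrival times, and relocate from the best fit solution
B = 500  # kept small for speed; the task asks for 5000
boot = np.empty((B, 3))
for b in range(B):
    res_star = rng.choice(res, size=res.size, replace=True)
    boot[b] = locate(t_pred + res_star, sta, m0=m_best)

print("best fit      :", m_best)
print("bootstrap std :", boot.std(axis=0))
```

With the course routine, the same loop would call `eq.eqlocate` in place of `locate` and take `sols[-1]` as each bootstrap solution.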

In [4]:
# Try it here! You can insert more cells by selecting Cell > Insert Cell Above/Below from the menu
# bar, or by pressing Esc to enter command mode and then hitting A or B (for above/below). 



In [5]:
# Try it here! You can insert more cells by selecting Cell > Insert Cell Above/Below from the menu
# bar, or by pressing Esc to enter command mode and then hitting A or B (for above/below). 


Answer: A covariance matrix can be calculated from the ensemble of solutions.
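As a hedged illustration, the sketch below estimates a covariance matrix from an ensemble of $(x, y)$ solutions and converts it into the semi-axes and orientation of a 95% ellipse. The ensemble is synthetic and Gaussian, which is an assumption made so that the example runs on its own; a real bootstrap ensemble need not be Gaussian, in which case the ellipse is only an approximate summary.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for an ensemble of bootstrap (x, y) solutions
true_cov = np.array([[4.0, 1.5],
                     [1.5, 1.0]])
xy = rng.multivariate_normal([0.0, 0.0], true_cov, size=5000)

# Sample covariance of the ensemble (rows are samples, columns are parameters)
C = np.cov(xy, rowvar=False)

# Eigen-decomposition gives the ellipse axes; eigh returns ascending eigenvalues
evals, evecs = np.linalg.eigh(C)

# 95% quantile of a chi-square distribution with 2 degrees of freedom
scale = -2.0 * np.log(1.0 - 0.95)
semi_axes = np.sqrt(scale * evals)                          # minor, major
angle = np.degrees(np.arctan2(evecs[1, -1], evecs[0, -1]))  # major-axis direction

# Check: fraction of samples falling inside the ellipse
dev = xy - xy.mean(axis=0)
d2 = np.einsum("ij,jk,ik->i", dev, np.linalg.inv(C), dev)
frac = np.mean(d2 <= scale)

print("covariance:\n", C)
print("semi-axes :", semi_axes, " angle (deg):", angle)
print("fraction inside 95% ellipse:", frac)
```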

**Task 4:** 

From the bootstrap output samples
$(t^i, x^i, y^i, z^i), (i=1,\dots, B)$, calculate i) **the mean bootstrap solution**, ii) **the model covariance**, iii) **the bias-corrected solution**,
and iv) **the 95% confidence intervals** for each of the unknowns. The bias correction is the mean of the differences between each bootstrap solution and the estimator itself, which in this case is the best fit solution. This is subtracted from the best fit to produce **the bias-corrected solution**, as described in the notes.

The mean should look similar to the best fit values and
the bias should be small. The variance and confidence intervals
characterize the error in the estimated values of the unknowns.
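One possible way to compute these four quantities is sketched below. The function name `bootstrap_summary` and the synthetic ensemble are hypothetical, and the confidence intervals use the simple percentile method, which is an assumption; the course notes may prescribe a different interval construction.

```python
import numpy as np

def bootstrap_summary(boot, best):
    """Summarise a bootstrap ensemble.

    boot : (B, n) array of bootstrap solutions
    best : (n,) best fit solution (the estimator)
    Returns the mean, covariance, bias-corrected solution, and an (n, 2)
    array of 95% percentile confidence limits.
    """
    boot = np.asarray(boot, dtype=float)
    best = np.asarray(best, dtype=float)
    mean = boot.mean(axis=0)
    cov = np.cov(boot, rowvar=False)
    bias = (boot - best).mean(axis=0)   # mean difference from the estimator
    corrected = best - bias             # bias subtracted from the best fit
    lower, upper = np.percentile(boot, [2.5, 97.5], axis=0)
    return mean, cov, corrected, np.column_stack([lower, upper])

# Synthetic ensemble scattered about a made-up best fit solution
rng = np.random.default_rng(2)
best = np.array([5.0, 10.0, -20.0])
boot = best + rng.normal(0.0, [0.1, 0.5, 0.4], size=(5000, 3))

mean, cov, corrected, ci = bootstrap_summary(boot, best)
print("mean           :", mean)
print("bias-corrected :", corrected)
print("95% intervals  :\n", ci)
```

Because the bias is the bootstrap mean minus the best fit, the bias-corrected solution equals twice the best fit minus the bootstrap mean.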



In [6]:
# Try it here! You can insert more cells by selecting Cell > Insert Cell Above/Below from the menu
# bar, or by pressing Esc to enter command mode and then hitting A or B (for above/below). 



----