# Script 2:

This script uses the template event we built in the previous script to detect new earthquakes using template matching. It shows how to write the input arrays in the right format for FMF.

In [None]:
import sys
import os
sys.path.append(os.getcwd())

import h5py as h5
import numpy as np
import utils

from obspy.core import UTCDateTime as udt
import matplotlib.pyplot as plt
import fast_matched_filter as fmf
from time import time as give_time

You might have received a warning message saying that the CUDA library could not be found. This happens if the C/CUDA code could not be compiled when installing Fast Matched Filter (FMF). This is not a problem if you do not have any Nvidia GPUs, but you might want to recompile FMF if you want to benefit from GPUs.

## Load the data and the template that we have just created

In [None]:
# load the data from day 2013-03-17
data = utils.load_data('data_FMF_tutorial.h5')

# load the template event that we have just built
template = utils.load_template('template.h5', path='./output/')
print('template is a Python dictionary and has 3 new categories compared '
      'to the template_metadata previously used '
      '(the P- and S-wave moveouts and the waveforms):\n', template.keys())

## Prepare the inputs for FMF

FMF expects 6 arguments:
- template_array $(n_{\mathrm{templates}} \times n_{\mathrm{stations}} \times n_{\mathrm{components}} \times n_{\mathrm{samples-in-template}})$
- moveout_array $(n_{\mathrm{templates}} \times n_{\mathrm{stations}} \times n_{\mathrm{components}})$
- weight_array $(n_{\mathrm{templates}} \times n_{\mathrm{stations}} \times n_{\mathrm{components}})$
- data_array $(n_{\mathrm{stations}} \times n_{\mathrm{components}} \times n_{\mathrm{samples-in-data}})$
- matched_filter_step 
- architecture

Because this example only uses a single template, $n_{\mathrm{templates}} = 1$.
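
The shape requirements listed above can be checked before calling FMF. The cell below is a minimal sketch using synthetic arrays: the sizes, the helper `check_fmf_shapes`, and the integer dtype of the moveouts are assumptions made for illustration, not part of FMF's API.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
n_templates, n_stations, n_components = 1, 8, 3
n_samples_template, n_samples_data = 400, 10_000

rng = np.random.default_rng(0)
template_array = rng.standard_normal(
    (n_templates, n_stations, n_components, n_samples_template)
).astype(np.float32)
# moveouts are assumed here to be integer numbers of samples
moveout_array = np.zeros((n_templates, n_stations, n_components), dtype=np.int32)
weight_array = np.full((n_templates, n_stations, n_components),
                       1.0 / (n_stations * n_components), dtype=np.float32)
data_array = rng.standard_normal(
    (n_stations, n_components, n_samples_data)
).astype(np.float32)


def check_fmf_shapes(templates, moveouts, weights, data):
    """Return True if the four arrays have mutually consistent shapes."""
    n_t, n_s, n_c, n_st = templates.shape
    return (moveouts.shape == (n_t, n_s, n_c)
            and weights.shape == (n_t, n_s, n_c)
            and data.shape[:2] == (n_s, n_c)
            and data.shape[2] >= n_st)


shapes_ok = check_fmf_shapes(template_array, moveout_array,
                             weight_array, data_array)
print('Shapes are consistent:', shapes_ok)
```

A check like this catches the most common formatting mistake, which is forgetting the leading template dimension when only one template is used.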


In [None]:
# format the inputs for fmf by adding a new dimension
# with 1 element (because this example only uses one template)
template_array = template['waveforms'][np.newaxis, :]
print('Shape of the template array: ', template_array.shape)
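# one column of moveouts per component: S-wave moveouts for the first two
# components and P-wave moveouts for the third (this assumes the components
# are ordered as two horizontals followed by the vertical)
moveouts = np.hstack( (template['moveouts_S'].reshape(-1, 1),
                       template['moveouts_S'].reshape(-1, 1),
                       template['moveouts_P'].reshape(-1, 1)) )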
moveout_array = moveouts[np.newaxis, :]
print('Shape of the moveout array: ', moveout_array.shape)
# fmf requires a weight matrix used to compute the weighted correlation
# coefficient sum
weight_array = np.ones_like(moveout_array, dtype=np.float32)
n_stations = weight_array.shape[1]
n_components = weight_array.shape[2]
# normalize so that the max value is 1 (optional)
weight_array /= np.float32(n_stations * n_components)
print('Shape of the weight array: ', weight_array.shape)

# fmf needs two extra arguments:
matched_filter_step = 1 # if set to 1, the sliding windows are taken every sample
architecture = 'cpu' # run fmf on CPUs (use 'gpu' to run on Nvidia GPUs)

## Run FMF!

FMF takes care of the core task of template matching: computing the correlation coefficients between the template and the data.
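
To give a rough picture of what this computes, the cell below is a deliberately slow NumPy reference for a single template: at each position, it sums over stations and components the weighted normalized correlation between the template and the data window shifted by the moveout. This is a sketch, not FMF's implementation. The function `naive_cc_sum`, the synthetic sizes, the choice not to demean the windows, and the output length are all assumptions made for illustration.

```python
import numpy as np


def naive_cc_sum(template, moveouts, weights, data, step=1):
    """Slow reference for one template (hypothetical helper).

    template: (n_stations, n_components, n_samples_template)
    moveouts: (n_stations, n_components), in samples
    weights:  (n_stations, n_components)
    data:     (n_stations, n_components, n_samples_data)
    """
    n_stations, n_components, n_t = template.shape
    n_d = data.shape[-1]
    # number of positions for which every shifted window fits in the data
    n_positions = (n_d - n_t - moveouts.max()) // step + 1
    cc_sum = np.zeros(n_positions, dtype=np.float64)
    for s in range(n_stations):
        for c in range(n_components):
            tpl = template[s, c]
            tpl_norm = np.sqrt(np.sum(tpl ** 2))
            for k in range(n_positions):
                start = k * step + moveouts[s, c]
                win = data[s, c, start:start + n_t]
                denom = tpl_norm * np.sqrt(np.sum(win ** 2))
                if denom > 0:
                    cc_sum[k] += weights[s, c] * np.dot(tpl, win) / denom
    return cc_sum


# synthetic test: cut a "template" out of random data at sample t0
rng = np.random.default_rng(42)
n_stations, n_components, n_t, n_d = 3, 3, 50, 600
data = rng.standard_normal((n_stations, n_components, n_d))
moveouts = rng.integers(0, 20, size=(n_stations, n_components))
t0 = 200
template = np.stack([
    np.stack([data[s, c, t0 + moveouts[s, c]:t0 + moveouts[s, c] + n_t]
              for c in range(n_components)])
    for s in range(n_stations)])
weights = np.full((n_stations, n_components),
                  1.0 / (n_stations * n_components))

cc = naive_cc_sum(template, moveouts, weights, data)
best = int(np.argmax(cc))
print('best position:', best, 'cc_sum at best position:', cc[best])
```

Because the template was cut from the data with the same moveouts, the best position is `t0` and the sum there is 1 up to floating-point error. FMF performs the same kind of search far more efficiently in C or CUDA.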

In [None]:
t_start = give_time()
cc_sum = fmf.matched_filter(template_array,
                            moveout_array,
                            weight_array,
                            data['waveforms'],
                            matched_filter_step,
                            arch=architecture)
t_end = give_time()
print('{:.2f} s to run the matched filter search'.format(t_end-t_start))

## Save the correlation coefficient time series

In [None]:
# save the output
with h5.File('./output/cc_sum.h5', mode='w') as f:
    f.create_dataset('cc_sum', data=cc_sum, compression='gzip')

## Plot the correlation coefficients

Note that the maximum value should be 1, since we extracted the template event from this day and we are using **matched_filter_step** = 1. This only holds when the times used to extract the windows and the moveouts given to FMF are exactly consistent; a rounding mismatch between the two is a common error. Remember that we rounded our travel times before extracting the windows, so that the window shifts and the moveouts are consistent.
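
The kind of mismatch described above can be illustrated with a few made-up numbers. In the sketch below, the sampling rate and travel times are hypothetical; it only shows that converting the same travel times to samples by rounding in one place and by truncating in another can disagree by one sample, which is enough to prevent the correlation from reaching 1.

```python
import numpy as np

sampling_rate = 50.0                             # Hz, hypothetical value
travel_times = np.array([3.217, 5.489, 7.991])   # seconds, hypothetical values

# conversion used when extracting the template windows (rounding)
moveouts_round = np.round(travel_times * sampling_rate).astype(np.int32)
# a different conversion used elsewhere by mistake (truncation)
moveouts_trunc = (travel_times * sampling_rate).astype(np.int32)

mismatch = moveouts_round - moveouts_trunc
print('rounded:  ', moveouts_round)
print('truncated:', moveouts_trunc)
print('mismatch in samples:', mismatch)
```

Using one and the same integer array both to cut the template windows and as the moveouts avoids this problem.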

In [None]:
# tune some plotting parameters
font = {'family': 'serif', 
        'size': 18}
plt.rc('font', **font)

In [None]:
# plot the cc time series
figsize = (28, 12)
plt.figure('cc_sum', figsize=figsize)
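# time of each correlation coefficient, in seconds since the start of the day
time = np.arange(cc_sum.shape[1]) / data['metadata']['sampling_rate']
# only plot the samples exceeding 2.5 standard deviations to lighten the figure
smart_plot = np.abs(cc_sum[0, :]) > 2.5 * np.std(cc_sum[0, :])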
plt.plot(time[smart_plot], cc_sum[0, smart_plot], lw=0.5)
plt.axhline(1, lw=2, ls='--', color='k')
plt.xlabel('Time (s)')
plt.ylabel('Average correlation coefficient')
plt.xlim(time.min(), time.max())
plt.show()

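If the plot looks odd, with missing values, this is because we only plot the samples whose absolute value exceeds 2.5 standard deviations of the correlation coefficients. Discarding most of the low values makes the plot lighter for your computer.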
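
To get a feel for how much the plotting mask removes, the cell below applies the same criterion to a synthetic time series. The Gaussian noise, its amplitude, and the single spike standing in for a detection are assumptions made for illustration; real correlation coefficients are not exactly Gaussian.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 100_000
# synthetic stand-in for cc_sum[0, :]: low-amplitude noise plus one detection
cc_synth = 0.05 * rng.standard_normal(n_samples)
spike_idx = 40_000
cc_synth[spike_idx] = 1.0

# same criterion as in the plotting cell
mask = np.abs(cc_synth) > 2.5 * np.std(cc_synth)
kept_fraction = mask.sum() / n_samples
print('fraction of samples kept: {:.4f}'.format(kept_fraction))
print('detection kept:', bool(mask[spike_idx]))
```

With noise of this kind only about one percent of the samples survive, while the large values of interest are always kept.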