# Deep learning-based event classification

For this next exercise, it's time to start building deep learning models ourselves.

The scenario is to read in waveforms of unknown content, and have a neural network decide whether they contain signal from an earthquake event, or just noise. To achieve this we need

1. to build a suitable neural network, and
2. to train it.

For point 2, we need labelled data, and ideally as much of it as possible. We start out by downloading files with our training and testing data, containing 100k and 10k waveforms, respectively. In total it's around 2.5GB, so it might take a few minutes.

In [None]:
! wget --no-check-certificate "https://drive.usercontent.google.com/download?id=1jD0RetG2ZGZdFaZKF7O0y6Ogzzj9fgFz&confirm=t" -O "events_classification_Zonly_TRAIN.h5"
! wget --no-check-certificate "https://drive.usercontent.google.com/download?id=1jQ7Cf0W1dM5VgmOCB6gu08ddHdYmuh3F&confirm=t" -O "events_classification_Zonly_TEST.h5"

Import the usual libraries.

In [None]:
import numpy as np
import h5py
import scipy.signal
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt

## Load data from files

With the files in place, we can load the contents.

In case you are running this on a laptop, remember that we need around 3GB of memory to load it all (so you might want to close some Chrome tabs first).

First, the training data.

**To be nice to the memory of the machine, we can set the maximum number of events to load.** If set to `None` we load all events in the file, so if you run into trouble, consider setting it to e.g. 10000 and see how that goes. If everything runs fine, you can raise it and try again.

In [None]:
# If you change this, also re-run the following two code cells.
max_events = None

In [None]:
with h5py.File('events_classification_Zonly_TRAIN.h5') as train_file:

    wf_dataset = train_file.get('waveforms')
    label_dataset = train_file.get('type') 

    if max_events is None:
        train_waveforms = wf_dataset[:]
        train_labels = label_dataset[:]
    else:
        train_waveforms = wf_dataset[:max_events]
        train_labels = label_dataset[:max_events]

# Add a trailing axis so the labels have shape (n_events, 1), matching the model's single output
train_labels = np.expand_dims(train_labels, axis=-1)

# Normalise the waveforms
max_vals = np.max(np.abs(train_waveforms), axis=1, keepdims=True)
train_waveforms /= (max_vals + 1e-8)

# Check the shapes:
print('train_waveforms.shape:', train_waveforms.shape)
print('train_labels.shape:', train_labels.shape)
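
To see what the normalisation step does, here is a minimal sketch on a made-up array with the same `(events, samples, channels)` layout. The toy values are hypothetical and only there for illustration; the point is that each waveform is divided by its own peak absolute amplitude, and that the small constant protects against all-zero traces.

```python
import numpy as np

# Hypothetical toy batch: 3 waveforms, 5 samples, 1 channel (not the real data).
toy = np.array([
    [[0.0], [2.0], [-4.0], [1.0], [0.5]],
    [[10.0], [-5.0], [0.0], [2.5], [-10.0]],
    [[0.0], [0.0], [0.0], [0.0], [0.0]],   # a dead trace
], dtype=np.float32)

# Peak absolute amplitude of each waveform, taken along the time axis.
peak = np.max(np.abs(toy), axis=1, keepdims=True)   # shape (3, 1, 1)

# The small constant keeps the all-zero trace from dividing by zero.
normalised = toy / (peak + 1e-8)

print(peak.shape)
print(np.max(np.abs(normalised), axis=(1, 2)))
```

After this step every non-empty waveform peaks at an absolute amplitude of 1, so the network sees events of very different magnitude on the same scale.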

Then the test data:

In [None]:
with h5py.File('events_classification_Zonly_TEST.h5') as test_file:

    wf_dataset = test_file.get('waveforms')
    label_dataset = test_file.get('type') 

    if max_events is None:
        test_waveforms = wf_dataset[:]
        test_labels = label_dataset[:]
    else:
        test_waveforms = wf_dataset[:max_events]
        test_labels = label_dataset[:max_events]

test_labels = np.expand_dims(test_labels, axis=-1)

# Normalise 
max_vals = np.max(np.abs(test_waveforms), axis=1, keepdims=True)
test_waveforms /= (max_vals + 1e-8)

# Check the shapes:
print('test_waveforms.shape:', test_waveforms.shape)
print('test_labels.shape:', test_labels.shape)

## Build a classification model

Let's start assembling the neural network. Our first model will have inputs sequentially processed through a series of layers, so we can specify it as an ordered list of layers, and give it to `keras.Sequential`.

Because we want to detect patterns at any location in the waveforms, we choose convolutional layers (`Conv1D`) as the backbone of our network. To look at increasingly long stretches of the input, we also downsample after each convolution, using `MaxPooling1D` layers.

Finally, we collect the detected patterns using a `GlobalMaxPooling1D` layer, and add a `Dense` layer with a single output at the end, which will be our class prediction. Note that we use the _sigmoid_ activation function, which ensures that the prediction will be between 0 (noise) and 1 (signal).

In [None]:
model = keras.Sequential(
    [
        keras.Input(shape=(train_waveforms.shape[1], train_waveforms.shape[2])),
        
        keras.layers.Conv1D(filters=16, kernel_size=3, activation='relu'),
        keras.layers.MaxPooling1D(2),
        
        keras.layers.Conv1D(filters=16, kernel_size=3, activation='relu'),
        keras.layers.MaxPooling1D(2),
        
        keras.layers.Conv1D(filters=16, kernel_size=3, activation='relu'),
        
        keras.layers.GlobalMaxPooling1D(),
        keras.layers.Dense(1, activation='sigmoid')
    ]
)
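
The reason a convolution followed by a global maximum can find a pattern "at any location" can be illustrated with plain NumPy. This is only a sketch under simplifying assumptions: a single hand-picked filter, no bias, no noise, and a hypothetical three-sample pattern. The helper names are made up for this example.

```python
import numpy as np

# Hypothetical pattern; the traces below contain it at different offsets.
pattern = np.array([1.0, -2.0, 1.0])

def trace_with_pattern(offset, length=50):
    trace = np.zeros(length)
    trace[offset:offset + len(pattern)] = pattern
    return trace

def global_max_response(trace, kernel):
    # 'valid' cross-correlation slides the kernel over the trace, similar to
    # an un-padded Conv1D with a single filter and no bias.
    response = np.correlate(trace, kernel, mode='valid')
    # ReLU followed by a global maximum over time.
    return np.max(np.maximum(response, 0.0))

early = global_max_response(trace_with_pattern(5), pattern)
late = global_max_response(trace_with_pattern(40), pattern)
print(early, late)
```

Both traces give the same response, even though the pattern sits at different positions. In the real network the filters are learned rather than hand-picked, but the same mechanism applies.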

To print the structure of our network, we can use `summary()`:

In [None]:
model.summary()
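
The time-axis lengths in the summary can be reproduced by hand. The sketch below assumes the Keras defaults used above (un-padded convolutions with stride 1, and pooling with stride equal to the pool size); the input length of 6000 samples is a hypothetical placeholder, so substitute `train_waveforms.shape[1]` for the real value.

```python
def conv1d_valid_length(length, kernel_size, stride=1):
    # Output length of an un-padded ('valid') 1D convolution.
    return (length - kernel_size) // stride + 1

def maxpool1d_length(length, pool_size):
    # Output length of max pooling with stride equal to pool_size, no padding.
    return (length - pool_size) // pool_size + 1

n_samples = 6000  # hypothetical; use train_waveforms.shape[1] for the real value

length = n_samples
for layer in ['conv', 'pool', 'conv', 'pool', 'conv']:
    if layer == 'conv':
        length = conv1d_valid_length(length, kernel_size=3)
    else:
        length = maxpool1d_length(length, pool_size=2)
    print(layer, length)
```

Each convolution trims two samples and each pooling layer halves the length, which is how later layers end up covering longer stretches of the original waveform.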

Cool. In order to train the model, we need a measure of how wrong its predictions are, known as the _loss function_. It is high when we are making bad predictions, and approaches zero when we are making perfect predictions.

The `compile()` function can take a lot of arguments, but for now we just specify the loss function typically used for binary classification, and add that we want to compute the accuracy of predictions during training, too. Since we don't pass an optimizer, Keras uses its default (`rmsprop`).

In [None]:
model.compile(
    loss="binary_crossentropy",
    metrics=["accuracy"]
)
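
To get a feeling for the numbers this loss produces, here is a minimal NumPy sketch of the textbook binary cross-entropy formula. It is an illustration only: the function below is written for this example, and the Keras implementation may differ in details such as how predictions are clipped. The labels and predictions are made up.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions away from 0 and 1 so the logarithms stay finite.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    per_event = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return per_event.mean()

labels = np.array([1.0, 0.0, 1.0, 0.0])

confident_right = binary_crossentropy(labels, np.array([0.99, 0.01, 0.99, 0.01]))
undecided = binary_crossentropy(labels, np.array([0.5, 0.5, 0.5, 0.5]))
confident_wrong = binary_crossentropy(labels, np.array([0.01, 0.99, 0.01, 0.99]))

print(confident_right, undecided, confident_wrong)
```

A model that always answers 0.5 gets a loss of ln 2 (about 0.69), so a training loss well below that value suggests the network has learned something beyond guessing.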

## Train the model

Now for the computationally intensive bit: minimising the loss function.

Two important options here:

- **epochs:** The number of times to iterate over the entire dataset
- **batch_size:** How many events to group together when running gradient descent

Further options are described in the [documentation](https://keras.io/api/models/model_training_apis/#fit-method).

In [None]:
model.fit(
    train_waveforms,
    train_labels,
    validation_data=(test_waveforms, test_labels),
    batch_size=128,
    epochs=2,
    verbose=1
)
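
As a rough check on how `epochs` and `batch_size` interact, the number of weight updates can be counted directly. The sketch assumes the full training set of 100k waveforms is loaded; if `max_events` is set, use that number instead.

```python
import math

n_events = 100_000   # size of the full training set described above
batch_size = 128
epochs = 2

# One weight update per batch; the last batch of an epoch may be smaller.
steps_per_epoch = math.ceil(n_events / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)
```

A smaller batch size means more (but noisier) updates per epoch, while more epochs simply repeat the pass over the data.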

During training, we ideally want to see the loss value go down and converge to a low value. Let's investigate a little first, and then tune our settings later.

## Evaluate results

With our model trained and ready to go, it's time to evaluate performance!

First we can compute the accuracy on the independent test data (accuracy of 0 means all predictions are wrong, 0.5 equals random guessing, and 1 means all predictions are correct). The call returns the loss first, followed by the accuracy:

In [None]:
model.evaluate(test_waveforms, test_labels)
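
The accuracy itself is simple to compute by hand, which can be useful when you want to look at individual mistakes. The sketch below uses made-up sigmoid outputs and labels (not real results) and the same 0.5 threshold as the plotting function further down.

```python
import numpy as np

# Hypothetical sigmoid outputs and true labels for six events (not real results).
predictions = np.array([[0.92], [0.10], [0.55], [0.48], [0.03], [0.71]])
labels = np.array([[1], [0], [0], [1], [0], [1]])

# Threshold at 0.5: below is noise (0), otherwise earthquake (1).
predicted_class = (predictions >= 0.5).astype(int)

# Fraction of events where the predicted class matches the label.
accuracy = np.mean(predicted_class == labels)
print(accuracy)
```

With real data, `predictions` would come from something like `model.predict(test_waveforms)` and `labels` from `test_labels`.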

We can also plot a few events and visually inspect predictions.

In [None]:
def plot_prediction(event_index):

    fig = plt.figure(figsize=(12,5))

    # Select from the test data.
    waveform = test_waveforms[event_index, :, :]
    waveform = np.expand_dims(waveform, axis=0)

    true_label = test_labels[event_index]
    
    # The model returns a tensor of shape (1, 1); pull out the single value as a float.
    prediction = float(model(waveform)[0, 0])
    
    true_type = 'Noise' if true_label < 0.5 else 'Earthquake'
    pred_type = 'Noise' if prediction < 0.5 else 'Earthquake'
    
    plt.plot(np.arange(waveform.shape[1]), waveform[0, :, 0])

    plt.title(f'True label: {true_type}, predicted: {pred_type} (model output: {prediction:.2f})')


In [None]:
plot_prediction(event_index=0)

### Exercise

Now it's your turn to improve on the model! Things to try:

- Increasing the epoch number
- Adding more `Conv1D` layers to the model
- Tuning the options of the `Conv1D` layer: Increasing filters, kernel sizes, stride lengths... See options [here](https://keras.io/api/layers/convolution_layers/convolution1d/)
- Adding or substituting in other types of layers

For the full list of layers to try, check the documentation: https://keras.io/api/
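
When changing filters and kernel sizes, it can help to keep an eye on how many trainable parameters the model ends up with. The following sketch counts them for the starting model, assuming a single input channel (as the "Zonly" file names suggest); the helper functions are written for this example and are not part of Keras. You can compare the total with what `model.summary()` reports.

```python
def conv1d_params(in_channels, filters, kernel_size):
    # Each filter has kernel_size * in_channels weights plus one bias.
    return (kernel_size * in_channels + 1) * filters

def dense_params(in_features, units):
    # One weight per input-output pair plus one bias per unit.
    return (in_features + 1) * units

n_channels = 1  # assumption: single (vertical) component per waveform

counts = [
    conv1d_params(n_channels, 16, 3),   # first Conv1D
    conv1d_params(16, 16, 3),           # second Conv1D
    conv1d_params(16, 16, 3),           # third Conv1D
    dense_params(16, 1),                # final Dense layer
]
print(counts, sum(counts))
```

Pooling layers have no trainable parameters, so they do not appear in the count.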