# Applying deep learning to the automatic classification of seismic traces

Hello, everyone! Welcome to the lab on deep learning! Deep learning has achieved unprecedented success in many fields of study and has led to a large number of AI-driven products. <br>
<br>
In this lab exercise, you are going to use [TensorFlow](https://www.tensorflow.org/) to train a deep neural network that can automatically classify seismic P-wave receiver functions into two categories: good and bad. You will see that with TensorFlow, it is fairly straightforward to implement deep learning, and with deep learning, it is fairly easy to achieve a prediction accuracy of ~91%. <br>
<br>
After finishing this lab exercise, you can expect to be able to:<br>
1. use TensorFlow to implement deep learning;
2. train your first deep learning model for classification of seismic traces; <br>
<br>

Author: Jiajia Sun at University of Houston, 04/16/2019

## 1. Import and prepare data
The way to import the seismic data is much the same as before. 

In [1]:
import matplotlib.pyplot as plt

In [2]:
import numpy as np
import h5py
with h5py.File("../Traces_qc.mat", "r") as f:  # open read-only
    ampdata = [f[element[0]][:] for element in f["Data"]["amps"]]
    flag = [f[element[0]][:] for element in f["Data"]["Flags"]]
    ntr = [f[element[0]][:] for element in f["Data"]["ntr"]]
    time = [f[element[0]][:] for element in f["Data"]["time"]]
    staname = [f[element[0]][:] for element in f["Data"]["staname"]]
    
ampall = np.zeros((1,651))
flagall = np.zeros(1)
for i in np.arange(201):
    ampall = np.vstack((ampall, ampdata[i]))
    flagall = np.vstack((flagall, flag[i]))
amp_data = np.delete(ampall, 0, 0)
label_data = np.delete(flagall, 0, 0)
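
The loop above grows `ampall` one station at a time, starting from a dummy row of zeros that is deleted afterwards. As a minimal sketch of an equivalent approach, `np.vstack` can take the whole list of arrays in one call, assuming every station's array has the same number of columns. The toy arrays below are made up for illustration and merely stand in for `ampdata` and `flag`.

```python
import numpy as np

# Hypothetical stand-ins for ampdata and flag: three "stations" with
# 2, 3 and 1 traces, each trace having 651 samples.
rng = np.random.default_rng(0)
toy_ampdata = [rng.normal(size=(n, 651)) for n in (2, 3, 1)]
toy_flag = [rng.integers(0, 2, size=(n, 1)).astype(float) for n in (2, 3, 1)]

# Stack all stations in one call; no dummy first row is needed.
toy_amp = np.vstack(toy_ampdata)
toy_labels = np.vstack(toy_flag)

print(toy_amp.shape, toy_labels.shape)  # (6, 651) (6, 1)
```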


In [3]:
np.random.seed(42)
all_data = np.append(amp_data,label_data,1) # put all the seismic traces and their labels into one matrix (labels in the last column).

Now, we need to prepare the data properly.

In [4]:
all_data_permute = all_data[np.random.permutation(all_data.shape[0]),:] 

In [5]:
X_train = all_data_permute[:10000,:-1]
y_train = all_data_permute[:10000,-1]

X_validation = all_data_permute[10000:,:-1]
y_validation = all_data_permute[10000:,-1]
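
The reason for putting traces and labels into one matrix before shuffling is that a row permutation then moves each trace together with its label. The sketch below illustrates this on made-up toy data (10 rows instead of the real data set, 8 of them used for training); the variable names are hypothetical.

```python
import numpy as np

np.random.seed(42)

# Toy data: 10 "traces" with 4 samples each. Row i holds the value i in
# every sample column, so rows are easy to track after shuffling.
toy_X = np.repeat(np.arange(10)[:, None], 4, axis=1).astype(float)
toy_y = (np.arange(10) % 2).reshape(-1, 1).astype(float)
toy_all = np.append(toy_X, toy_y, 1)  # label goes into the last column

# Shuffle whole rows, so each trace stays attached to its label.
toy_perm = toy_all[np.random.permutation(toy_all.shape[0]), :]

# First 8 rows for training, the remaining 2 for validation.
n_train = 8
X_tr, y_tr = toy_perm[:n_train, :-1], toy_perm[:n_train, -1]
X_va, y_va = toy_perm[n_train:, :-1], toy_perm[n_train:, -1]

# Every label still matches its trace: label == (row id mod 2).
assert np.all(y_tr == X_tr[:, 0] % 2)
print(X_tr.shape, y_tr.shape, X_va.shape, y_va.shape)  # (8, 4) (8,) (2, 4) (2,)
```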

In [6]:
# The following code performs one hot encoding.
tmpa = np.zeros((y_train.shape[0],2))
tmpa[np.arange(y_train.shape[0]), y_train.astype(int)] = 1
y_train_new = tmpa

tmpb = np.zeros((y_validation.shape[0],2))
tmpb[np.arange(y_validation.shape[0]), y_validation.astype(int)] = 1
y_validation_new = tmpb

Note that in the cell above I transformed **y_train** and **y_validation** into **y_train_new** and **y_validation_new**. The reason is that our output layer will consist of 2 neurons. Therefore, we need to convert 0, the categorical number representing bad seismic traces, to the 1D array [1,0]. Similarly, we convert 1, the categorical number representing good seismic traces, to the 1D array [0,1]. <br>
<br>
This is called '**one hot encoding**' in the machine learning community. It is a very common practice in deep learning when it comes to classification problems. If you want to learn more about one hot encoding, you can simply search 'one hot encoding' in Google, or go to this Scikit-Learn [webpage](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html). <br>
<br>
For the purpose of this exercise, please do not worry too much about it. The simplest way to understand one hot encoding is that, we represent our categories (or, different classes), which are the expected outcomes of a deep neural network, using some sort of coding. For example, suppose we have four classes to predict. Then, using 'one hot encoding', we would encode our first class as [1,0,0,0], the second class as [0,1,0,0], the third class as [0,0,1,0], and the fourth class as [0,0,0,1]. <br>
<br>
One thing that you do need to pay close attention to is that your training data are now **[X_train, <font color = red>y_train_new</font>]**, and your validation data are now **[X_validation, <font color = red>y_validation_new</font>]**.
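
The four-class example described above can be sketched in NumPy with the same indexing trick used for the two-class labels. The label values below are made up for illustration.

```python
import numpy as np

# Hypothetical integer class labels for six samples and four classes.
labels = np.array([0, 1, 2, 3, 1, 0])
n_classes = 4

# Same indexing trick as in the cell above: set one entry per row to 1.
one_hot = np.zeros((labels.shape[0], n_classes))
one_hot[np.arange(labels.shape[0]), labels] = 1

# np.argmax along each row recovers the original integer labels.
recovered = np.argmax(one_hot, axis=1)
print(one_hot)
print(recovered)
```

Each row contains exactly one 1, and its position is the class index, which is why `np.argmax` can undo the encoding.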

## 2. Create the computation graph

<font color = red>**Task 1:**</font> Explain what a computation graph is in the context of TensorFlow and deep neural networks. <font color = red>**(5 points)**</font> <br>

In [None]:
# Answer to Task 1:


The first thing to do is to import TensorFlow into our workspace. Note that this notebook uses the TensorFlow 1.x graph API (placeholders and sessions).

In [None]:
import tensorflow as tf

Now, let us define our neural networks. We will be using a neural network with two hidden layers and one output layer. Specifically, there are 300 neurons in the first hidden layer, 100 neurons in the second hidden layer and 2 neurons in the output layer. <br>
<br>
How many neurons should we have for our input layer? <br> 
<font color=red>**651**</font>!

<font color = red>**Task 2:**</font> Please explain why we are having 651 neurons for our input layer. <font color = red>**(5 points)**</font> <br>

In [None]:
# Answer to Task 2:


In [None]:
n_inputs = 651
n_hidden1 = 300
n_hidden2 = 100
n_outputs = 2
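
To get a feel for the size of this network, the sketch below counts its trainable parameters. It assumes that every layer is fully connected and has one bias per neuron; `dense_params` is a hypothetical helper written for this illustration, not a TensorFlow function.

```python
# Layer sizes as defined in the cell above.
n_inputs, n_hidden1, n_hidden2, n_outputs = 651, 300, 100, 2

def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output neuron."""
    return n_in * n_out + n_out

sizes = [n_inputs, n_hidden1, n_hidden2, n_outputs]
per_layer = [dense_params(a, b) for a, b in zip(sizes[:-1], sizes[1:])]
total = sum(per_layer)
print(per_layer, total)  # [195600, 30100, 202] 225902
```

Most of the parameters sit in the first hidden layer, because it connects to all 651 input values.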

<font color = red>**Task 3:**</font> Create two placeholders, one for X, the other for y. The placeholder X will hold the seismic amplitude values from each seismic trace and the placeholder y will hold the labels for all these seismic traces. <font color = red>**(10 points)**</font> <br>
<br>
**HINT:** Since we do not know exactly how many seismic traces will be fed into our neural network, the shape of X can be *(None, n_inputs)*, where None simply means that the number of instances is not known in advance. The shape of y should be *(None, 2)* because we have used one-hot encoding, and for the same reason its data type should be tf.float32. To learn more about tf.placeholder, please see this [webpage](https://www.tensorflow.org/api_docs/python/tf/placeholder).

In [None]:
# Answer to Task 3


<font color = red>**Task 4:**</font> Create the two hidden layers as well as the output layer using *tf.layers.dense*. <font color = red>**(15 points)**</font> <br>
<br>
**HINT:** If you forget how to use *tf.layers.dense*, please refer to the accompanying notebook entitled 'TF_DNN_example.ipynb'.

In [None]:
# Answer to Task 4


Now, we need to define our cost function. 

<font color = red>**Task 5:**</font> Define the cost function. <font color = red>**(15 points)**</font> <br>
<br>
**HINT:** Because of the use of one-hot-encoding, you should use *[softmax_cross_entropy_with_logits](https://www.tensorflow.org/api_docs/python/tf/nn/softmax_cross_entropy_with_logits)* to define your cost function, as opposed to *[sparse_softmax_cross_entropy_with_logits](https://www.tensorflow.org/api_docs/python/tf/nn/sparse_softmax_cross_entropy_with_logits)*. 
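
The difference between the two functions is the label format: the first expects one-hot (or probability) rows, the second expects integer class indices. The NumPy sketch below only illustrates the underlying math on made-up logits and is not the TensorFlow implementation; all function names in it are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy_one_hot(logits, y_one_hot):
    """Per-sample cross-entropy when labels are one-hot rows."""
    return -np.sum(y_one_hot * np.log(softmax(logits)), axis=1)

def cross_entropy_sparse(logits, y_int):
    """Per-sample cross-entropy when labels are integer class indices."""
    probs = softmax(logits)
    return -np.log(probs[np.arange(len(y_int)), y_int])

# Made-up logits for three traces and two classes (bad = 0, good = 1).
logits = np.array([[2.0, -1.0], [0.5, 0.5], [-3.0, 1.0]])
y_int = np.array([0, 1, 1])
y_one_hot = np.eye(2)[y_int]

loss_a = cross_entropy_one_hot(logits, y_one_hot)
loss_b = cross_entropy_sparse(logits, y_int)
print(loss_a)
print(loss_b)
```

Both label formats give the same per-sample loss; the cost function used for training is typically the mean of these values over the mini-batch.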

In [None]:
# Answer to Task 5


<font color = red>**Task 6:**</font> Define the optimization scheme. We are going to use GradientDescentOptimizer with a learning rate of 0.1, and we want to minimize the cost function that was defined previously. <font color = red>**(15 points)**</font> <br>
<br>
**HINT:** This can be done by defining *optimizer* and *training*, the same way as in the accompanying notebook entitled 'TF_DNN_example.ipynb'.
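
Gradient descent with a fixed learning rate repeatedly moves each parameter a small step against the gradient of the cost. The sketch below shows that update rule in plain Python on a made-up one-dimensional cost, f(w) = (w - 3)^2, using the same learning rate of 0.1. It is an illustration of the idea only, not the optimizer used for the network.

```python
def cost(w):
    """Toy cost with its minimum at w = 3."""
    return (w - 3.0) ** 2

def grad(w):
    """Derivative of the toy cost with respect to w."""
    return 2.0 * (w - 3.0)

learning_rate = 0.1
w = 0.0  # arbitrary starting point
history = [cost(w)]

for step in range(50):
    w = w - learning_rate * grad(w)  # the gradient descent update
    history.append(cost(w))

print(w, history[0], history[-1])
```

With this learning rate the distance to the minimum shrinks by a factor of 0.8 at every step, so the cost decreases monotonically.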

In [None]:
# Answer to Task 6


Now, define how the predictions are going to be evaluated.

In [None]:
compare = tf.equal(tf.argmax(output,1),tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(compare,tf.float32))
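
The two lines above compare the index of the largest network output with the index of the 1 in the one-hot label, then average the matches. A NumPy sketch of the same logic on made-up values (the `toy_` names are hypothetical):

```python
import numpy as np

# Made-up network outputs for four traces and their one-hot labels.
toy_output = np.array([[1.2, -0.3], [0.1, 0.9], [2.0, 1.5], [-1.0, 0.4]])
toy_y = np.array([[1, 0], [0, 1], [0, 1], [0, 1]])

# Predicted class = index of the largest output; true class = index of the 1.
toy_compare = np.argmax(toy_output, axis=1) == np.argmax(toy_y, axis=1)

# Casting True/False to 1.0/0.0 and averaging gives the fraction correct.
toy_accuracy = toy_compare.astype(np.float32).mean()
print(toy_compare, toy_accuracy)  # [ True  True False  True] 0.75
```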

## 3. Execute the computation graph

<font color = red>**Task 7:**</font> Define a global variable initializer that will initialize all the variables automatically for you once a TensorFlow session is run. Please name your initializer *init*. <font color = red>**(5 points)**</font> <br>

In [None]:
# Answer to Task 7


In [None]:
saver = tf.train.Saver()  # This is to save the model that you are going to train.

We are going to use mini-batch gradient descent to train our neural network. Therefore, we need to define our mini-batch size as well as how many epochs we would like to have. If these terms are unfamiliar, please refer back to our lecture on gradient descent.

In [None]:
n_epochs = 30
batch_size = 50
n_batches = int(np.ceil(10000/batch_size))
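
As a quick check of what these settings imply, the sketch below works out the number of mini-batches per epoch and the total number of parameter updates. The batch size of 64 in the last line is a made-up value, used only to show why `np.ceil` is needed.

```python
import numpy as np

n_train = 10000   # number of training traces, as in the split above
batch_size = 50
n_epochs = 30

n_batches = int(np.ceil(n_train / batch_size))
total_updates = n_epochs * n_batches
print(n_batches, total_updates)  # 200 6000

# np.ceil matters when the batch size does not divide the training set evenly.
n_batches_64 = int(np.ceil(n_train / 64))
print(n_batches_64)  # 157
```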

Next, we define how to fetch a mini-batch from our training dataset. Note that, in the example notebook 'TF_DNN_example.ipynb', I used **.train.next_batch** to fetch mini-batches from the MNIST training dataset. However, this function is not generally applicable. So, we need to define our own way of fetching mini-batches.

In [None]:
def fetch_batch(epoch, batch_index, batch_size):
    np.random.seed(epoch * n_batches + batch_index) 
    indices = np.random.randint(10000, size=batch_size)  
    X_batch = X_train[indices] 
    y_batch = y_train_new[indices,:]
    return X_batch, y_batch
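
Note that `np.random.randint` draws the indices with replacement, so within one epoch some traces may be used more than once and others not at all. That is fine for this lab. For comparison, the sketch below shows a common alternative (not the method used here) that shuffles the data once per epoch and then slices it, so every trace is visited exactly once. The function name and the toy data are hypothetical.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, seed):
    """Yield mini-batches that together cover every row exactly once."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(X.shape[0])
    for start in range(0, X.shape[0], batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]

# Toy data: 10 samples, 3 features, one-hot labels for two classes.
toy_X = np.arange(30, dtype=float).reshape(10, 3)
toy_y = np.eye(2)[np.arange(10) % 2]

batches = list(iterate_minibatches(toy_X, toy_y, batch_size=4, seed=0))
batch_sizes = [xb.shape[0] for xb, _ in batches]
print(batch_sizes)  # [4, 4, 2]
```

The last mini-batch is smaller whenever the batch size does not divide the number of samples evenly.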

In [None]:
# Task 8: Open a new tf.Session (5 points)
with 
    # Task 9: Initialize all the variables (5 points)
    
    for epoch in np.arange(n_epochs):
        for batch_index in np.arange(n_batches):
            # Task 10: fetch the mini-batch using the fetch_batch function defined above (5 points)

            # Task 11: run the training using the mini-batch that you just fetched. (5 points)

        # Task 12: compute the prediction accuracies on both training and validation data sets, and name them 
        # accuracy_train and accuracy_test, respectively. (10 points) 
        # HINT: pay close attention to what you feed to the calculation of prediction accuracy on the validation dataset.
        
        
        print(epoch, "Train accuracy:", accuracy_train, "Test accuracy:", accuracy_test)
        if epoch == 0:  # attach the legend labels once, on the first epoch
            plt.plot(epoch, accuracy_train,'-ro',label='accuracy_train')
            plt.plot(epoch, accuracy_test,'-bo',label='accuracy_test')
        else:
            plt.plot(epoch, accuracy_train,'-ro')
            plt.plot(epoch, accuracy_test,'-bo')
    
    plt.legend(loc="lower right", fontsize=16)
    plt.show()
    
    save_path = saver.save(sess,'./my_DNN_model.ckpt')

## 4. Make predictions

Let us compare the predictions on some of the seismic traces in the validation dataset with their true labels. Note: you do not need to do anything. Just click the cells and run them.

In [None]:
with tf.Session() as sess:
    saver.restore(sess, "./my_DNN_model.ckpt") # or better, use save_path
    X_new_scaled = X_validation[:30,:]
    Z = output.eval(feed_dict={X: X_new_scaled})
    y_pred = np.argmax(Z, axis=1)

In [None]:
print("Predicted classes:", y_pred)
print("Actual classes:   ", y_validation.astype(int)[:30])
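
Beyond comparing the two printed lists by eye, the predictions can be summarized in a 2x2 confusion matrix. The sketch below uses made-up predictions and labels in place of `y_pred` and `y_validation`; the `toy_` names are hypothetical.

```python
import numpy as np

# Made-up predictions and true labels (0 = bad trace, 1 = good trace).
toy_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])
toy_true = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# confusion[i, j] counts traces whose true class is i and predicted class is j.
confusion = np.zeros((2, 2), dtype=int)
np.add.at(confusion, (toy_true, toy_pred), 1)

# Correct predictions lie on the diagonal.
toy_accuracy = np.trace(confusion) / confusion.sum()
print(confusion)
print(toy_accuracy)  # 0.75
```

The off-diagonal entries show which kind of mistake the model makes: bad traces accepted as good, or good traces rejected as bad.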

## 5. Visualize the computation graph 

Now, let us visualize the computation graph that you created before. Note: you do not need to do anything below. Simply, click the cell, and run it.

In [None]:
# The following code is copied from Aurelien Geron's book. It was originally written by A. Mordvintsev.
from IPython.display import clear_output, Image, display, HTML

def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                tensor.tensor_content = b"<stripped %d bytes>"%size
    return strip_def

def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))
    display(HTML(iframe))

In [None]:
show_graph(tf.get_default_graph())

## BONUS 
<font color = red>(**10 points**):</font> Now that you know how to implement deep learning using TensorFlow, try to change the network structure (e.g., the number of layers, and the number of neurons in each layer), and see if you can achieve a prediction accuracy of >92%. 

In [None]:
# Answer to Bonus:


## Acknowledgments
I would like to thank Ying Zhang for manually labeling all the seismic traces, and Prof. Aibing Li for making this data set available to the students in this class. <br>

<img src = "photo.png" width="400">

## Congratulations! You have now become one of those people who know how to use TensorFlow to do deep learning!