# Prediction Of Damages Caused By Floods & Earthquakes Using ANN

### Abstract

Natural disasters are increasingly frequent and have a direct, life-altering impact on the regions they hit and on the welfare of their residents. Depending on where we live, hurricanes, earthquakes, floods, droughts, etc. threaten lives, property, productive assets and financial resources. The growing incidence of natural disasters is closely tied to the increasing vulnerability of households and communities in affected regions. In this work, artificial neural networks are used to predict the damages caused by natural disasters, which can be felt at the community, city and state level as well as across an entire country. Artificial neural networks are mathematical models inspired by biological neural networks and their basic unit, the biological neuron. They are used to model complex relationships between inputs and outputs and to find and match patterns in data. This report compares several machine learning algorithms currently used for such predictions, with the aim of increasing prediction accuracy. After training various neural networks, the damages caused by floods and earthquakes are estimated on test data.

### Implementation

Natural disasters cause massive casualties and damage and leave many people injured. Human beings cannot stop them, but timely prediction and appropriate safety measures can prevent the loss of human lives and save valuable property. The main focus of this project is the application of data-driven models to real-time forecasting of these damages.

This section follows an implementation plan that covers data selection, data preprocessing and visualization, the application of artificial neural networks, and their performance evaluation.

#### Importing Packages

In [None]:
# External Packages
import pandas as pd
import numpy as np
import seaborn as sns
import datetime
import time

# Visualization Packages
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import squarify
import plotly.plotly as py
import plotly.graph_objs as go
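import os
import plotly
# Read the Plotly credentials from the environment rather than hard-coding them
# (PLOTLY_USERNAME and PLOTLY_API_KEY are assumed variable names)
plotly.tools.set_credentials_file(username=os.environ['PLOTLY_USERNAME'], api_key=os.environ['PLOTLY_API_KEY'])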

# Encoding Packages
import category_encoders as ce  #Category Encoder
from sklearn.preprocessing import LabelEncoder  #Label Encoder

# Preprocessing Packages
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.preprocessing import MinMaxScaler

# Machine Learning Models
from sklearn import svm  #SVM Model
from sklearn.tree import DecisionTreeClassifier  #Decision Tree Classifier
from sklearn.ensemble import RandomForestClassifier  #Random Forest Classifier

# Artificial Neural Network Models
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers.core import Dense  #FFNN
from keras.layers.recurrent import LSTM  #RNN
from keras.layers import Embedding
from keras.optimizers import RMSprop
from rbflayer import RBFLayer, InitCentersRandom  #RBFN 

# Evaluation Packages
from sklearn.metrics import confusion_matrix
from sklearn.utils.multiclass import unique_labels
from sklearn.metrics import cohen_kappa_score
from sklearn.metrics import classification_report

#### Confusion Matrix Function 

An n x n confusion matrix, where n is the number of classes, tabulates a classifier's predicted classes against the actual classes: entry (i, j) counts the samples of actual class i that were predicted as class j. The prediction accuracy and the classification error can both be obtained from this matrix.
The function below prints and plots the confusion matrix.
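
As a minimal sketch of how accuracy and classification error fall out of the matrix, the following pure-Python example builds a 3 x 3 confusion matrix from made-up labels. The labels and the helper name `build_confusion_matrix` are illustrative assumptions; they are not part of the flood dataset or of scikit-learn.

```python
def build_confusion_matrix(y_true, y_pred, n_classes):
    """Return an n x n matrix where cm[i][j] counts samples of
    true class i that were predicted as class j."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

# Hypothetical labels, for illustration only
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 1]

cm = build_confusion_matrix(y_true, y_pred, n_classes=3)

# Correct predictions sit on the diagonal of the matrix
total = sum(sum(row) for row in cm)
correct = sum(cm[i][i] for i in range(len(cm)))
accuracy = correct / total
error = 1 - accuracy

print(cm)
print("Accuracy:", accuracy)
print("Classification error:", error)
```

The diagonal holds the correctly classified samples, so accuracy is the diagonal sum divided by the total count, and the classification error is its complement.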

In [None]:
def plot_confusion_matrix(y_true, y_pred, classes, normalize=False, title=None, cmap=plt.cm.Blues):
    
    if not title:
        if normalize:
            title = 'Normalized confusion matrix'
        else:
            title = 'Confusion matrix, without normalization'

    # Compute confusion matrix
    cm = confusion_matrix(y_true, y_pred)
    if normalize:
        # Divide each row by its total so that every true class sums to 1
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')
    print(cm)
        
    fig, ax = plt.subplots()
    im = ax.imshow(cm, interpolation='nearest', cmap=cmap)
    ax.figure.colorbar(im, ax=ax)
    # We want to show all ticks...
    ax.set(xticks=np.arange(cm.shape[1]),
           yticks=np.arange(cm.shape[0]),
           # ... and label them with the respective list entries
           xticklabels=classes, yticklabels=classes,
           title=title,
           ylabel='True label',
           xlabel='Predicted label')

    # Rotate the tick labels and set their alignment.
    plt.setp(ax.get_xticklabels(), rotation=45, ha="right",
             rotation_mode="anchor")

    # Loop over data dimensions and create text annotations.
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, format(cm[i, j], fmt),
                    ha="center", va="center",
                    color="white" if cm[i, j] > thresh else "black")
    fig.tight_layout()
    return ax

#### Evaluation Function

**Precision:** The proportion of positive identifications that were actually correct.<br> 
**Recall:** The proportion of actual positives that were identified correctly.<br>
**F1 Score:** The harmonic mean of precision and recall, useful when a balance between the two is needed.<br>
**Cohen Kappa Score:** A metric that compares the observed accuracy with the expected accuracy, i.e. the agreement that would occur by chance.<br>
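
To make these definitions concrete, here is a small worked sketch for a binary classifier. The four counts are invented for illustration and do not come from the flood data; the formulas are the standard ones for each metric.

```python
# Hypothetical binary confusion-matrix counts, for illustration only
tp, fp, fn, tn = 40, 10, 20, 30
total = tp + fp + fn + tn

precision = tp / (tp + fp)  # correct positives among predicted positives
recall = tp / (tp + fn)     # correct positives among actual positives
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

# Observed accuracy: fraction of samples classified correctly
observed = (tp + tn) / total
# Expected accuracy: agreement expected by chance, from the marginal totals
expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2
kappa = (observed - expected) / (1 - expected)

print("Precision:", round(precision, 4))
print("Recall:", round(recall, 4))
print("F1 Score:", round(f1, 4))
print("Cohen Kappa Score:", round(kappa, 4))
```

A kappa of 0 means the classifier agrees with the true labels no more often than chance would, while a kappa of 1 means perfect agreement.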


In [None]:
def evaluate_model(y_true, pred):
    # Class labels 0-4, used for the plot ticks and the report
    class_names = np.array(['0','1','2','3','4'])
    # Use the y_true argument rather than the global y_test
    plot_confusion_matrix(y_true, pred, classes=class_names, title='Confusion matrix, without normalization')
    plt.show()
    print("Cohen Kappa Score: "+ str(cohen_kappa_score(y_true, pred)))
    print("Classification report \n" + str(classification_report(y_true, pred, target_names=class_names)))

#### Read Data

Records of floods are obtained from the Storm Events Database (SED), maintained by the National Oceanic and Atmospheric Administration's National Weather Service (NWS).

The dataset used here covers the years 2006 through 2018 and contains 16,449 rows.

In [None]:
# A forward slash avoids the invalid "\d" escape sequence and also works on Windows
floods = pd.read_csv("Project/database.csv", index_col=0)
np.random.seed(0)

#### Visualization of Data

In [None]:
viz_floods = floods.copy()
# Count flood records per state; naming both columns explicitly keeps this
# independent of the default column names of the installed pandas version
freq_floods = viz_floods['State'].value_counts().rename_axis('State').reset_index(name='Frequency')

Squarify Plot <br>
Flood-Prone States of the USA

In [None]:
cmap = matplotlib.cm.Blues
mini=freq_floods['Frequency'][0:10].min()
maxi=freq_floods['Frequency'][0:10].max()
norm = matplotlib.colors.Normalize(vmin=mini, vmax=maxi)
colors = [cmap(norm(value)) for value in freq_floods['Frequency'][0:10]]

squarify.plot(sizes=freq_floods['Frequency'][0:10], label=freq_floods['State'][0:10], alpha=.6, color=colors)
plt.axis('off')
plt.show()

Plotly <br>
USA Map of Flood-Prone States

In [None]:
floods_gb=floods.groupby(['State', 'Code']).size()
floods_gb = floods_gb.to_frame().reset_index()
floods_gb = floods_gb.rename(columns= {0: "Frequency"})

In [None]:
for col in floods_gb.columns:
    floods_gb[col] = floods_gb[col].astype(str)

scl = [
    [0.0, '#E8EAF6'],
    [0.1, '#C5CAE9'],
    [0.2, '#9FA8DA'],
    [0.3, '#7986CB'],
    [0.4, '#5C6BC0'],
    [0.5, '#3F51B5'],
    [0.6, '#3949AB'],
    [0.7, '#303F9F'],
    [0.8, '#283593'],
    [0.9, '#1A237E'],
    [1.0, '#0c1359']
]


floods_gb['text'] = floods_gb['State']

data = [go.Choropleth(
    colorscale = scl,
    autocolorscale = False,
    locations = floods_gb['Code'],
    z = floods_gb['Frequency'].astype(float),
    locationmode = 'USA-states',
    text = floods_gb['text'],
    marker = go.choropleth.Marker(
        line = go.choropleth.marker.Line(
            color = 'rgb(255,255,255)',
            width = 2
        )),
    colorbar = go.choropleth.ColorBar(
        title = "Frequency")
)]

layout = go.Layout(
    title = go.layout.Title(
        text = ''
    ),
    geo = go.layout.Geo(
        scope = 'usa',
        projection = go.layout.geo.Projection(type = 'albers usa'),
        showlakes = True,
        lakecolor = 'rgb(255, 255, 255)'),
)

fig = go.Figure(data = data, layout = layout)
py.iplot(fig, filename = 'd3-cloropleth-map')

#### Preprocessing of Data
**Discarding Null Values:** Preprocessing of the floods dataset involved discarding rows with null values, because many machine learning models do not support null values and, where they are tolerated, they skewed the reported accuracy. <br>
**Binary Encoding:** This technique is less intuitive than one-hot encoding. The categories are first encoded as ordinal integers, those integers are then converted into binary code, and finally the digits of each binary string are split into separate columns. This encodes the data in fewer dimensions than one-hot encoding. The Category Encoders package (imported as ce) is used to perform the binary encoding.
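
The three steps of binary encoding can be sketched in plain Python as follows. This is an illustration only: the category names are hypothetical, the helper `binary_encode` is not part of Category Encoders, and the library's exact ordinal numbering and number of output columns may differ from this sketch.

```python
def binary_encode(values, prefix):
    """Binary-encode a list of categories into one dict of 0/1 columns per row."""
    # Step 1: ordinal encoding, numbering categories from 1 in order of first appearance
    ordinals = {}
    for v in values:
        if v not in ordinals:
            ordinals[v] = len(ordinals) + 1
    # Step 2: number of binary digits needed to write the largest ordinal
    width = max(ordinals.values()).bit_length()
    # Step 3: split each fixed-width binary string into one column per digit
    rows = []
    for v in values:
        bits = format(ordinals[v], "0{}b".format(width))
        rows.append({"{}_{}".format(prefix, i): int(b) for i, b in enumerate(bits)})
    return rows

# Hypothetical flood causes, for illustration only
causes = ["Heavy Rain", "Ice Jam", "Dam Failure", "Heavy Rain", "Snowmelt"]
encoded = binary_encode(causes, "Flood_Cause")
for cause, row in zip(causes, encoded):
    print(cause, row)
```

Here four distinct categories fit into three binary columns, whereas one-hot encoding would need four; the saving grows quickly as the number of categories increases.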

In [None]:
# Drop every row that has a null value in any of the columns used later
required_columns = ["Damage_Property", "Range_Damage_Property", "Flood_Cause",
                    "Begin_Lat", "Begin_Lon", "End_Lat", "End_Lon"]
final_floods_drop = floods.dropna(subset=required_columns).copy()
print(final_floods_drop.isnull().sum())

In [None]:
final_floods_ce = final_floods_drop.copy()
encoder = ce.BinaryEncoder(cols=['Flood_Cause'])
final_floods= encoder.fit_transform(final_floods_ce)

#### Application of Models

This project applies six models, which form the core of a comprehensive comparative study for predicting the damages caused by natural disasters.

**Machine Learning Models**

In [None]:
X = final_floods[['Flood_Cause_0', 'Flood_Cause_1', 'Flood_Cause_2', 'Flood_Cause_3','Begin_Lat', 'Begin_Lon', 'End_Lat', 'End_Lon']].copy()
y = final_floods[['Range_Damage_Property']].copy()

In [None]:
# Split Data into Train & Test Data  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,random_state=0)
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)
print(y_train['Range_Damage_Property'].value_counts())
print(y_test['Range_Damage_Property'].value_counts())

**Support Vector Machine (SVM) Model**

In [None]:
# Create an SVM Model
clf = svm.SVC(kernel='linear')

# Train the model using the training sets
clf.fit(X_train, y_train)

# Predict the response for test dataset
y_pred = clf.predict(X_test)

# Print Predictions and Model Accuracy
print(y_pred)
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

# Evaluate Model (reuse the predictions instead of predicting again)
evaluate_model(y_test, y_pred)

**Decision Tree Classifier**

In [None]:
# Create Decision Tree classifier
clf = DecisionTreeClassifier(criterion="entropy", splitter="best",max_depth=11)

# Train the model using the training sets
clf = clf.fit(X_train,y_train)

# Predict the response for test dataset
y_pred = clf.predict(X_test)

# Print Model Accuracy
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

# Evaluate Model
evaluate_model(y_test, y_pred)

**Random Forest Classifier**

In [None]:
# Create Random Forest classifier
clf=RandomForestClassifier(n_estimators=100,criterion='entropy',max_depth=10,min_samples_split=5,verbose=1)

# Train the model using the training sets
clf.fit(X_train,y_train)

# Predict the response for test dataset
y_pred=clf.predict(X_test)

# Print Model Accuracy
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

# Evaluate Model
evaluate_model(y_test, y_pred)

**Artificial Neural Network**

In [None]:
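# Convert classes to categorical (one-hot) type

# Fit the encoder on the training labels only, then reuse it for the test
# labels so that both splits share the same class-to-index mapping
encoder = LabelEncoder()
encoder.fit(y_train.values.ravel())
num_classes = len(encoder.classes_)

encoded_Y = encoder.transform(y_train.values.ravel())
encoded_y_train = np_utils.to_categorical(encoded_Y, num_classes)

encoded_Y = encoder.transform(y_test.values.ravel())
encoded_y_test = np_utils.to_categorical(encoded_Y, num_classes)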

**Feed Forward Neural Network**

In [None]:
# Create Feed Forward Neural Network
model = Sequential()
model.add(Dense(16, activation='relu', input_dim=8))
model.add(Dense(16, activation='relu'))
model.add(Dense(5, activation='softmax'))

# Compile the Network
model.compile(loss='categorical_crossentropy', optimizer='SGD', metrics=['accuracy'])

# Train the model using the training sets
model.fit(X_train, encoded_y_train, batch_size=10, epochs=20, verbose=1, validation_data=(X_test, encoded_y_test))

# Print Model Accuracy
[test_loss, test_acc] = model.evaluate(X_test, encoded_y_test)
print("Evaluation result on Test Data : Loss = {}, accuracy = {}".format(test_loss, test_acc))
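# Take the most probable class per sample; predict_classes is not available in newer Keras releases
pred = np.argmax(model.predict(X_test), axis=1)
print(pred)

# Evaluate Model
evaluate_model(y_test, pred)

# Print Model Summary (summary() prints directly and returns None)
model.summary()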

**Recurrent Neural Network**

In [None]:
# Create Recurrent Neural Network
embed_dim = 128
lstm_out = 200
batch_size = 32

model = Sequential()
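# Note: Embedding expects non-negative integer indices smaller than 2500.
# The Keras 2 argument names are used below; Embedding no longer accepts a dropout argument.
model.add(Embedding(2500, embed_dim, input_length = X.shape[1]))
model.add(LSTM(lstm_out, dropout = 0.2, recurrent_dropout = 0.2))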
model.add(Dense(16, activation='relu'))
model.add(Dense(5,activation='softmax'))

# Compile the Network
model.compile(loss = 'categorical_crossentropy', optimizer='SGD',metrics = ['accuracy'])

# Train the model using the training sets
model.fit(X_train, encoded_y_train, batch_size=10, epochs=20, verbose=1, validation_data=(X_test, encoded_y_test))

# Print Model Accuracy
[test_loss, test_acc] = model.evaluate(X_test,encoded_y_test)
print("Evaluation result on Test Data : Loss = {}, accuracy = {}".format(test_loss, test_acc))
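# Take the most probable class per sample; predict_classes is not available in newer Keras releases
pred = np.argmax(model.predict(X_test), axis=1)
print(pred)

# Evaluate Model
evaluate_model(y_test, pred)

# Print Model Summary (summary() prints directly and returns None)
model.summary()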

**Radial Basis Function Network**

In [None]:
# Create Radial Basis Function Network
X=X_train.values
y=encoded_y_train

model = Sequential()
rbflayer = RBFLayer(16,
                    initializer=InitCentersRandom(X), 
                    betas=1.0,
                    input_shape=(8,))
model.add(rbflayer)
model.add(Dense(16, activation='relu', input_dim=8))
model.add(Dense(16, activation='relu'))
model.add(Dense(5, activation='softmax'))

# Compile the Network
model.compile(loss='mean_squared_error',
              optimizer=RMSprop(), metrics=['accuracy'])

# Train the model using the training sets
model.fit(X, y,
          batch_size=10,
          epochs=20,
          verbose=1)

# Predictions on the training data
y_pred = model.predict(X)

In [None]:
# Print Model Accuracy
test_X=X_test.values
test_Y=encoded_y_test
[test_loss, test_acc] = model.evaluate(test_X,test_Y)
print("Evaluation result on Test Data : Loss = {}, accuracy = {}".format(test_loss, test_acc))

# Take the most probable class per sample; predict_classes is not available in newer Keras releases
pred = np.argmax(model.predict(test_X), axis=1)
print(pred)
# Number of test samples predicted in each class
for label in range(5):
    print("{}.0: {}".format(label, np.count_nonzero(pred == label)))

# Evaluate Model
evaluate_model(y_test, pred)

# Print Model Summary (summary() prints directly and returns None)
model.summary()