# Iceberg Recognition from Satellite Images Using Keras Deep NN

In this notebook I develop a deep neural network to recognise icebergs in the ocean from satellite images. For this image recognition problem I use Keras, a high-level deep NN library that runs on top of a TensorFlow or Theano backend. Keras provides a high level of abstraction and the flexibility to build deep networks in a very simple way.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory

from subprocess import check_output
print(check_output(["ls", "../input"]).decode("utf8"))

# Any results you write to the current directory are saved as output.

The data for this problem is image data in JSON format. We read it using pd.read_json(); each image has two bands (channels), and the training data contains 1604 images. The data can be downloaded from this [link](https://www.kaggle.com/c/statoil-iceberg-classifier-challenge/data).

In [None]:
train_df = pd.read_json('../input/train.json')
test_df = pd.read_json('../input/test.json')
train_df.head()

We need to preprocess the data and reshape it to (1604,75,75,3): the first two channels are band_1 and band_2, and the third channel is their average. The resulting training data is stored in X_band.

In [None]:
X_band_1=np.array([np.array(band).astype(np.float32).reshape(75, 75) for band in train_df["band_1"]])
X_band_2=np.array([np.array(band).astype(np.float32).reshape(75, 75) for band in train_df["band_2"]])

In [None]:
X_band = np.zeros([1604,75,75,3])
for t in range(1604):
    X_band[t,:,:,0] = X_band_1[t]
    X_band[t,:,:,1] = X_band_2[t]
    X_band[t,:,:,2] = (X_band_1[t]+X_band_2[t])/2
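
The loop above can also be written without hardcoding the number of images. The following is a minimal sketch, assuming the same (75, 75) band layout; `stack_bands` is a hypothetical helper name, and the random arrays are synthetic stand-ins for X_band_1 and X_band_2. Unlike the loop, it keeps the data as float32.

```python
import numpy as np

def stack_bands(band_1, band_2):
    """Stack two (N, 75, 75) band arrays into one (N, 75, 75, 3) array.

    The third channel is the per-pixel mean of the two bands,
    matching the loop in the cell above.
    """
    band_1 = np.asarray(band_1, dtype=np.float32)
    band_2 = np.asarray(band_2, dtype=np.float32)
    mean_band = (band_1 + band_2) / 2
    # np.stack adds a new trailing axis that holds the three channels.
    return np.stack([band_1, band_2, mean_band], axis=-1)

# Synthetic stand-ins for X_band_1 and X_band_2 (4 images instead of 1604).
rng = np.random.default_rng(0)
demo_band_1 = rng.normal(size=(4, 75, 75)).astype(np.float32)
demo_band_2 = rng.normal(size=(4, 75, 75)).astype(np.float32)

demo_X = stack_bands(demo_band_1, demo_band_2)
print(demo_X.shape)  # (4, 75, 75, 3)
```

With the notebook's own arrays, the call would be stack_bands(X_band_1, X_band_2), and the same helper could be reused for the test bands.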

Now we import all the required Keras packages.

In [None]:
from keras import layers
from keras.layers import Input, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D
from keras.layers import AveragePooling2D, MaxPooling2D, Dropout, GlobalMaxPooling2D, GlobalAveragePooling2D
from keras.models import Model
from keras.preprocessing import image
from keras.utils import layer_utils
from keras.utils.data_utils import get_file
from keras.applications.imagenet_utils import preprocess_input

from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
from keras.utils import plot_model

import keras.backend as K
K.set_image_data_format('channels_last')
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow

A deep NN model in Keras can be built in two ways: with the Sequential API or with the Model API. I am using the Model API. The model consists of three convolutional blocks followed by two fully connected layers. Each convolutional block is made of Conv2D, BatchNormalization, ReLU and MaxPooling2D layers.

In [None]:
def Iceberg_model(input_shape):
    X_in = Input(input_shape)
    
    # The input shape is already defined by Input(input_shape) above
    X = Conv2D(32,kernel_size=(5,5))(X_in)
    X = BatchNormalization()(X)
    X = Activation('relu')(X)
    X = MaxPooling2D(pool_size=(2,2))(X)
    
    X = Conv2D(32,kernel_size=(5,5))(X)
    X = BatchNormalization()(X)
    X = Activation('relu')(X)
    X = MaxPooling2D(pool_size=(2,2))(X)
    
    X = Conv2D(16,kernel_size=(5,5))(X)
    X = BatchNormalization()(X)
    X = Activation('relu')(X)
    X = MaxPooling2D(pool_size=(2,2))(X)
    
    X = Flatten()(X)
    X = Dense(128)(X)
    X = Activation('relu')(X)
    
    X = Dense(1)(X)
    X = Activation('sigmoid')(X)
    
    model = Model(inputs=X_in,outputs=X,name='Iceberg_model')
    return model
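
To see how the spatial size shrinks through the three blocks, the shape arithmetic can be worked out by hand. This is a small sketch that assumes the Keras defaults used in the model above (no padding and stride 1 for Conv2D, non-overlapping 2x2 windows for MaxPooling2D); the helper names are hypothetical and no Keras import is needed.

```python
def conv_output_size(size, kernel_size, stride=1):
    """Spatial size after a convolution with no padding ('valid')."""
    return (size - kernel_size) // stride + 1

def pool_output_size(size, pool_size):
    """Spatial size after non-overlapping pooling with no padding."""
    return size // pool_size

size = 75  # input images are 75 x 75
filters_per_block = [32, 32, 16]

for filters in filters_per_block:
    size = conv_output_size(size, kernel_size=5)
    size = pool_output_size(size, pool_size=2)
    print(f"after block with {filters} filters: {size} x {size} x {filters}")

# Number of values that Flatten() passes to the first Dense layer.
flattened_units = size * size * filters_per_block[-1]
print(flattened_units)  # 400
```

The sizes go 75 → 71 → 35 → 31 → 15 → 11 → 5, so the first Dense layer receives 5 x 5 x 16 = 400 inputs. Calling IcebergModel.summary() after building the model is a way to check these numbers against Keras itself.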

In Keras we follow 4 simple steps to build a NN model:
1.  Create the model.
2.  Compile the model.
3.  Fit the model on the training data.
4.  Evaluate the model on the test data.

To create the model we provide the shape of a single training image, (75,75,3).

In [None]:
IcebergModel = Iceberg_model((75,75,3))

Compile the model using the 'adam' optimizer, the 'binary_crossentropy' loss and the accuracy metric.

In [None]:
IcebergModel.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
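
For intuition about what the 'binary_crossentropy' loss measures, here is a minimal NumPy sketch of the formula, averaged over samples. The function below is an illustration, not Keras' own implementation, and the clipping constant `eps` is an assumed value chosen only to avoid log(0).

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    y_true = np.asarray(y_true, dtype=np.float64)
    # Clip probabilities away from 0 and 1 so the logarithms stay finite.
    y_prob = np.clip(np.asarray(y_prob, dtype=np.float64), eps, 1 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))

# Two toy samples: one iceberg (1) and one ship (0).
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])

print(confident_right)  # about 0.105
print(confident_wrong)  # about 2.303
```

Confident wrong predictions are penalised much more heavily than confident correct ones are rewarded, which is why the loss can stay high even when the accuracy looks good.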

Now we fit the model on the training data for 20 epochs with a batch size of 128.

In [None]:
target = train_df['is_iceberg'].values
IcebergModel.fit(x=X_band,y=target,epochs=20,batch_size=128)

Evaluating the model on the training data gives an accuracy of 96.5 %.

In [None]:
IcebergModel.evaluate(x=X_band,y=target)
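
Since the evaluation above uses the same images the model was trained on, it may overstate how well the model generalises. One common alternative is to hold out part of the training data for validation. The sketch below only builds the index split with NumPy; `holdout_split`, the 20 % fraction and the seed are assumptions for illustration, not part of the original notebook.

```python
import numpy as np

def holdout_split(n_samples, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) as disjoint, shuffled index arrays."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    n_val = int(round(n_samples * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]

train_idx, val_idx = holdout_split(1604)
print(len(train_idx), len(val_idx))  # 1283 321

# With the notebook's arrays and a freshly created model, the split
# could be used like this (not executed here):
# IcebergModel.fit(x=X_band[train_idx], y=target[train_idx],
#                  epochs=20, batch_size=128,
#                  validation_data=(X_band[val_idx], target[val_idx]))
# IcebergModel.evaluate(x=X_band[val_idx], y=target[val_idx])
```

The accuracy on the held-out indices is then measured on images the model has not seen during fitting.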

Classification report on the training data.

In [None]:
from sklearn.metrics import classification_report
pred_label = IcebergModel.predict(x=X_band)
pred_label[pred_label>0.5]=1
pred_label[pred_label<=0.5]=0
print(classification_report(target,pred_label))
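
The precision and recall values in the report can be reproduced by hand from the thresholded predictions. This is a minimal sketch for the positive (iceberg) class only, using toy arrays in place of target and the model output; `precision_recall` is a hypothetical helper, and the 0.5 threshold matches the cell above.

```python
import numpy as np

def precision_recall(y_true, y_prob, threshold=0.5):
    """Precision and recall of the positive class at a given threshold."""
    y_true = np.asarray(y_true).ravel()
    # Probabilities above the threshold are labelled 1, the rest 0.
    y_pred = (np.asarray(y_prob).ravel() > threshold).astype(int)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # true positives
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # false positives
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy labels and (N, 1) shaped probabilities, like the output of predict().
demo_target = np.array([1, 1, 1, 0, 0, 0])
demo_prob = np.array([[0.9], [0.8], [0.3], [0.6], [0.2], [0.1]])

precision, recall = precision_recall(demo_target, demo_prob)
print(precision, recall)
```

In the toy data, two of the three predicted icebergs are correct and two of the three true icebergs are found, so both values are 2/3.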

In [None]:
X_band_test_1=np.array([np.array(band).astype(np.float32).reshape(75, 75) for band in test_df["band_1"]])
X_band_test_2=np.array([np.array(band).astype(np.float32).reshape(75, 75) for band in test_df["band_2"]])
X_test = np.zeros([8424,75,75,3])
for t in range(8424):
    X_test[t,:,:,0] = X_band_test_1[t]
    X_test[t,:,:,1] = X_band_test_2[t]
    X_test[t,:,:,2] = (X_band_test_1[t]+X_band_test_2[t])/2

In [None]:
pred = IcebergModel.predict(x=X_test)

sub_df = pd.DataFrame()
sub_df['id'] = test_df['id']
sub_df['is_iceberg'] = pred.ravel()  # flatten the (N,1) predictions to one probability per id
sub_df.to_csv('output.csv',index=False)