Detecting land types from satellite imagery



I tried two solutions for this problem:
    - a custom convolutional neural network
    - a pretrained ResNet network

Before starting on the classification algorithms, I checked whether the dataset contains pictures that are completely white or completely black, to make sure that no obviously invalid data goes into the algorithm.

In [10]:

# import libraries for file handling and picture manipulation
import os
from PIL import Image


for folder in os.listdir('images_original'):
    # use a name that does not shadow the built-in list
    pics = os.listdir('images_original/' + folder)
    for pic in pics:
        img = Image.open('images_original/' + folder + '/' + pic)
        # convert the pic to 8-bit grayscale and find the (min, max) pixel values
        extrema = img.convert("L").getextrema()
        if extrema == (0, 0):
            print('pic is all black: ' + folder + '/' + pic)
        elif extrema == (255, 255):
            # in 8-bit grayscale, white is 255 (not 1)
            print('pic is all white: ' + folder + '/' + pic)
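
As a quick sanity check of the comparison values, here is a minimal sketch on synthetic in-memory pictures. The helper name `blank_kind` and the three 28x28 test pictures are made up for illustration and are not part of the dataset. In 8-bit grayscale mode "L", `getextrema()` returns (0, 0) for an all-black picture and (255, 255) for an all-white one.

```python
from PIL import Image


def blank_kind(img):
    """Return 'black', 'white' or None for a PIL image (hypothetical helper)."""
    extrema = img.convert("L").getextrema()  # (min, max) over all pixels
    if extrema == (0, 0):
        return "black"
    if extrema == (255, 255):
        return "white"
    return None


# synthetic pictures, created in memory only for this check
black = Image.new("RGB", (28, 28), (0, 0, 0))
white = Image.new("RGB", (28, 28), (255, 255, 255))
mixed = Image.new("RGB", (28, 28), (0, 0, 0))
mixed.putpixel((0, 0), (255, 255, 255))  # a single white pixel

print(blank_kind(black), blank_kind(white), blank_kind(mixed))
```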


The check did not report any all-white or all-black pictures.
Next, I will resize all the pictures from (28,28) to (32,32) pixels, because the Keras ResNet50 does not accept inputs smaller than (32,32). The resized dataset is used only for the solution with the pretrained ResNet.

In [52]:
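import glob
import os
from PIL import Image


files = glob.glob('images_original/*/*.png')
for file in files:
    im = Image.open(file)
    # LANCZOS is the current name of the filter formerly exposed as ANTIALIAS
    im2 = im.resize((32, 32), Image.LANCZOS)
    # normalise Windows path separators and switch to the output folder
    out_file = file.replace('\\', '/').replace('images_original', 'images_resized')
    # create the class folder if it does not exist yet
    os.makedirs(os.path.dirname(out_file), exist_ok=True)
    im2.save(out_file, 'PNG', dpi=(100, 100))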
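
A rough way to see what a (32,32) input means for ResNet50: the network halves the spatial resolution five times (an overall stride of 32), so a (32,32) picture reaches the last convolutional stage as a single 1x1 position. The sketch below only does this arithmetic. The function name is made up, and the round-up-per-halving rule is an assumption based on the standard ResNet50 layout, not something measured in this notebook.

```python
import math


def resnet50_feature_side(side, halvings=5):
    """Side length of the last feature map for a square input (hypothetical helper)."""
    for _ in range(halvings):
        # each stride-2 stage halves the side, rounding up
        side = math.ceil(side / 2)
    return side


for side in (224, 32):
    print(side, '->', resnet50_feature_side(side))
```

Under this assumption, the base model without its top layers outputs a 1x1x2048 tensor for the resized pictures, so flattening it yields 2048 features.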

Next, I will build the classification solution on top of a pretrained neural network (ResNet). For this demo it is trained for only a few epochs because of time and processing power constraints. With more training time, there is room to add extra image augmentation methods, try different optimizers, and increase the number of epochs.

In [28]:
#  Pretrained neural network (ResNet)
import numpy as np
import os
import keras
import matplotlib.pyplot as plt
from keras import models
from keras import layers
from keras.layers import Dense,GlobalAveragePooling2D,Flatten,Dropout,Input
#from keras.applications.vgg16 import VGG16
#from keras.applications.vgg16 import preprocess_input
from keras.applications.resnet50 import ResNet50,preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Model
from keras.optimizers import Adam
from keras import backend as K
from keras.utils import plot_model


pics_xlength=32
pics_ylength=32

train_data_dir = 'images_resized'
epochs=2
#batch size is very important: if too small, training takes long;
#if too large, memory issues
batch_size=64



#import pretrained resnet with 50 layers
base_model=ResNet50(weights='imagenet',include_top=False,
                 input_shape=(pics_xlength,pics_ylength,3)) 
#base_model.summary()



#add two extra dense layers, one for extra training
#and one for the softmax function
model=models.Sequential()
model.add(base_model)
model.add(layers.Flatten())
model.add(layers.Dense(512,activation='relu'))
model.add(layers.BatchNormalization())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(6,activation='softmax'))

# freeze everything except the last 4 layers of the resnet model;
# those 4 layers and the layers added above will be retrained
for layer in base_model.layers[:-4]:
    layer.trainable=False
    
#optional check of which layers are trainable
#for layer in base_model.layers:    
#    print(layer, layer.trainable)
#for layer in model.layers:    
#    print(layer, layer.trainable)



#Augmentation available
#Preprocessing function for resnet50 mandatory
'''
train_datagen=ImageDataGenerator(
                        rotation_range=40,
                        width_shift_range=0.2,
                        height_shift_range=0.2,
                        fill_mode='nearest',
                        shear_range=0.2,
                        zoom_range=0.2,
                        horizontal_flip=True,
                        validation_split=0.2,
                        preprocessing_function=preprocess_input)
'''
# Lighter ImageData Generator
train_datagen=ImageDataGenerator(
                        validation_split=0.2,
                        preprocessing_function=preprocess_input)

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])


#Input from Directory. Divide in train and validate generator
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(pics_xlength, pics_ylength),
    batch_size=batch_size,
    class_mode='categorical',
    subset='training')

validation_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(pics_xlength, pics_ylength),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation')

model.fit_generator(
    train_generator,
    steps_per_epoch = train_generator.samples//batch_size,
    validation_data=validation_generator,
    validation_steps = validation_generator.samples // batch_size,
    epochs = epochs)

Found 256803 images belonging to 6 classes.
Found 64197 images belonging to 6 classes.
Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x1ffc603be80>
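
The step counts passed to `fit_generator` follow from the image counts reported above. The sketch below redoes that arithmetic in plain Python. The variable names are my own, and the rounded-up variant is shown only for comparison; it is not what the training cell uses.

```python
import math

batch_size = 64
train_samples = 256803       # "Found 256803 images" (training subset)
validation_samples = 64197   # "Found 64197 images" (validation subset)

# floor division, as in the training cell
steps_per_epoch = train_samples // batch_size
validation_steps = validation_samples // batch_size

# samples that do not fit into the full batches counted above
train_remainder = train_samples - steps_per_epoch * batch_size
validation_remainder = validation_samples - validation_steps * batch_size

# rounding up instead would also count the final, smaller batch
steps_rounded_up = math.ceil(train_samples / batch_size)

print(steps_per_epoch, train_remainder)            # 4012 35
print(validation_steps, validation_remainder)      # 1003 5
print(steps_rounded_up)                            # 4013
```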

In [30]:
model.save('resnet_2epochs_noaugmentation.h5')

Test the model on a sample picture from the image dataset. The model assigns a probability of about 0.9999 to a single class, which is the correct one for this picture.

In [33]:
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array


# load one picture and add the batch dimension expected by the model
image = load_img('images_resized/building/3001.png', target_size=(32, 32))
image = img_to_array(image)
image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))
# apply the same ResNet50 preprocessing that was used during training
image=preprocess_input(image)
pred=model.predict(image)
print(pred)


[[9.5269519e-05 9.9987113e-01 4.2752369e-08 3.3256722e-05 3.1623904e-07
  1.1554472e-09]]
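
To read the printed vector as a label, the position of the largest probability has to be mapped back to a folder name. In the notebook, `train_generator.class_indices` holds the real mapping. The dictionary below is only a hypothetical stand-in: "building" at index 1 follows from the prediction being correct, the positions of "road" and "water" are my assumption based on the class sizes mentioned in this notebook, and "class_0", "class_2" and "class_4" are placeholders.

```python
# probabilities printed by model.predict above
pred = [[9.5269519e-05, 9.9987113e-01, 4.2752369e-08,
         3.3256722e-05, 3.1623904e-07, 1.1554472e-09]]

# hypothetical stand-in for train_generator.class_indices
class_indices = {"class_0": 0, "building": 1, "class_2": 2,
                 "road": 3, "class_4": 4, "water": 5}
index_to_label = {index: label for label, index in class_indices.items()}

probs = pred[0]
best = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
print(index_to_label[best], round(probs[best], 4))    # building 0.9999
```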


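Create the confusion matrix and calculate the balanced accuracy score.
The balanced accuracy score accounts for the fact that the data is not balanced (the 'water' class has about 120k samples, the 'road' class about 8k). Note that the generator below reads the whole image folder, so the score also covers pictures that were used for training.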

In [50]:
from sklearn.metrics import balanced_accuracy_score, confusion_matrix

test_datagen=ImageDataGenerator(
                        preprocessing_function=preprocess_input)
test_generator = test_datagen.flow_from_directory(
    train_data_dir,
    target_size=(pics_xlength, pics_ylength),
    batch_size=batch_size,
    shuffle=False)
# round up so that the last partial batch is included exactly once
steps = int(np.ceil(test_generator.samples / batch_size))
Y_pred = model.predict_generator(test_generator, steps)
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
print(confusion_matrix(test_generator.classes, y_pred))
print('Balanced Accuracy Score')
print(balanced_accuracy_score(test_generator.classes, y_pred))

Found 321000 images belonging to 6 classes.
Confusion Matrix
[[ 71559     31   1628     14    154     11]
 [   218  11387     28    248      5     37]
 [  1425      2  47925     11    969     15]
 [    54    244     19   7865      7      3]
 [    42      3    587      5  56125     47]
 [     9     37     39      8     89 120150]]
Balanced Accuracy Score
0.9714045020677893
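
The balanced accuracy score is the mean of the per-class recalls, so it can be recomputed by hand from the confusion matrix above (rows are true classes, columns are predicted classes). The sketch below does this in plain Python, with variable names of my own, and also computes the plain accuracy for comparison.

```python
# confusion matrix printed above (rows: true class, columns: predicted class)
confusion = [
    [71559, 31, 1628, 14, 154, 11],
    [218, 11387, 28, 248, 5, 37],
    [1425, 2, 47925, 11, 969, 15],
    [54, 244, 19, 7865, 7, 3],
    [42, 3, 587, 5, 56125, 47],
    [9, 37, 39, 8, 89, 120150],
]

# recall of a class = correct predictions / all samples of that class
recalls = [row[i] / sum(row) for i, row in enumerate(confusion)]

# balanced accuracy: every class counts equally, whatever its size
balanced_accuracy = sum(recalls) / len(recalls)

# plain accuracy: every sample counts equally, so large classes dominate
correct = sum(row[i] for i, row in enumerate(confusion))
plain_accuracy = correct / sum(sum(row) for row in confusion)

print([round(r, 4) for r in recalls])
print(round(balanced_accuracy, 4), round(plain_accuracy, 4))
```

The balanced value is lower than the plain accuracy here because the largest class also has the highest recall.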


I also trained a custom convolutional neural network with 3 blocks of Conv filters + ReLU + MaxPooling, followed by a Dense layer. Batch normalization was really important here: it increased accuracy by 5%. I also used a few more image augmentation tricks than before, plus rescaling of the pixel values. The input picture size for this network is (28,28,3).

In [4]:
# Custom Convolutional Neural Network
from keras import backend as K
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation,Dropout,Flatten,Dense
from keras.layers import BatchNormalization

#set pixel length and width
pics_xlength=28
pics_ylength=28


train_data_dir = 'images_original'
epochs=50
batch_size=64

# set input of the neural network
# (pics_xlength,pics_ylength,nr_of_channels(rgb))

if K.image_data_format() == 'channels_first':
    input_shape = (3, pics_xlength, pics_ylength)
else:
    input_shape = (pics_xlength, pics_ylength, 3)
    

# blocks of Conv+Activation+MaxPooling
# important params: nr of filters, activation function,pool_size
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(6))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

train_datagen=ImageDataGenerator(rescale=1. / 255,                                   
                        shear_range=0.2,
                        zoom_range=0.2,
                        horizontal_flip=True,
                        validation_split=0.2)

#Input from Directory. Divide in train and validate generator
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(pics_xlength, pics_ylength),
    batch_size=batch_size,
    class_mode='categorical',
    subset='training')

validation_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(pics_xlength, pics_ylength),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation')

model.fit_generator(
    train_generator,
    steps_per_epoch = train_generator.samples//batch_size,
    validation_data=validation_generator,
    validation_steps = validation_generator.samples // batch_size,
    epochs = epochs)

Found 256803 images belonging to 6 classes.
Found 64197 images belonging to 6 classes.
Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x1453b3ddda0>
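
To get a feel for how small the feature maps become in the custom network, the spatial sizes can be traced by hand. The sketch below assumes the Keras defaults used in the cell above (3x3 convolutions without padding and stride 1, 2x2 max pooling that rounds down). The helper names are made up for illustration.

```python
def conv_valid(side, kernel=3):
    """Output side of an unpadded, stride-1 convolution."""
    return side - kernel + 1


def max_pool(side, pool=2):
    """Output side of a max pooling layer whose stride equals the pool size."""
    return side // pool


side = 28                 # input pictures are (28,28,3)
filters = [32, 32, 64]    # number of filters in the three Conv blocks
shapes = []
for n_filters in filters:
    side = max_pool(conv_valid(side))
    shapes.append((side, side, n_filters))

flattened = side * side * filters[-1]
print(shapes)      # [(13, 13, 32), (5, 5, 32), (1, 1, 64)]
print(flattened)   # 64
```

Under these assumptions the Flatten layer hands only 64 values to the Dense(64) layer, so a fourth Conv+MaxPooling block would not fit on (28,28) inputs without padding.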