# Identify Aircraft Based On Photos - Part 1
An attempt to identify aircraft models from photos alone, without relying on any additional information (e.g., no EXIF data and no tags).

- **For part 1**, I train an image classifier (a convolutional neural network, aka *CNN*) to distinguish between photos of Airbus **<a href="http://www.airbus.com/aircraftfamilies/passengeraircraft/a320family/a321/" target="_blank">A321</a>**s and **<a href="http://www.airbus.com/aircraftfamilies/passengeraircraft/a340family/" target="_blank">A340</a>**s.
 - The CNN consists of 3 sets of (convolution + activation + max-pooling) layers, followed by 2 fully-connected layers.
 ![](https://docs.google.com/drawings/d/1B7g5OCWWrKzFPE_hOj0yvVsN6PBxiov54MjmGW8FNYE/pub?w=959&h=429)
- In subsequent parts, I'll be extending this work to recognize and distinguish between photos of 15+ different models of commercial aircraft.

## Dataset
All required aircraft photos are under the `train` and `test` subdirectories of the `data` directory. All photos should already have been resized to the standard size (1024 x 683 pixels) and placed in one subdirectory per aircraft model inside `train` and `test`.

- Train data : 1106 and 1052 photos for A321 and A340 models respectively.
- Test data  : 225 and 125 photos respectively.
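
The counts above also give a useful reference point: the accuracy of a classifier that always predicts the more common class. A minimal sketch of that arithmetic, using only the numbers listed above (the variable and function names are illustrative, not part of the notebook):

```python
# Photo counts per class, copied from the lists above.
train_counts = {"A321": 1106, "A340": 1052}
test_counts = {"A321": 225, "A340": 125}

def majority_baseline(counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(counts.values()) / float(sum(counts.values()))

print("train total:", sum(train_counts.values()))  # 2158
print("test total:", sum(test_counts.values()))    # 350
print("test majority baseline: {:.3f}".format(majority_baseline(test_counts)))  # 0.643
```

Since 225 of the 350 test photos are A321s, a validation accuracy near 0.64 is what a constant "A321" predictor would score, so only values clearly above that indicate the model has learned something.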

To compensate for the small dataset, photos are augmented using several random transformations. In the example below, the image on top is the original, and the 3 images beneath it are samples from the (*potentially infinite*) set of transformations.
![](https://docs.google.com/drawings/d/1vQ6hsOmnHD15vC3m_g77HG7oirvK1PpHzMQP163fUTI/pub?w=827&h=321)

## Train

In [1]:
import logging
import numpy as np
import os

np.random.seed(730521) # For reproducibility, needs to be set before Keras is loaded
from importlib import reload # `imp` is deprecated in Python 3
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.models import Sequential
from keras.preprocessing.image import ImageDataGenerator

reload(logging)
logging.basicConfig(format="%(asctime)s: %(message)s", level=logging.INFO, datefmt="%H:%M:%S")

Using Theano backend.


In [2]:
#img_width, img_height = 256, 170 # Approximately 25% of the original
img_width, img_height = 150, 150

train_data_dir = "../data/train"
validation_data_dir = "../data/test"

nb_train_samples = 2000 # samples drawn per epoch; the train set actually holds 2158 photos
nb_validation_samples = 350
nb_epoch = 20

logging.info("Current PID : {}".format(os.getpid()))

21:36:32: Current PID : 4464


In [3]:
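# Channels-first input shape, as used with the Theano backend.
# Width and height are equal here, so their order does not matter.
model_input_shape = (3, img_width, img_height)

model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=model_input_shape))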
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(32, 3, 3))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(64, 3, 3))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation("sigmoid"))

model.compile(loss="binary_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
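
To get a feel for the size of this network, the feature-map sizes and parameter counts can be worked out by hand. The sketch below assumes the layer defaults I believe apply here (unpadded stride-1 convolutions, and non-overlapping 2x2 pooling); it re-derives the numbers with plain arithmetic rather than asking Keras, and the helper names are made up for illustration.

```python
def conv_out(size, kernel=3):
    """Side length after an unpadded ("valid"), stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Side length after non-overlapping max-pooling (floor division)."""
    return size // pool

side = 150              # img_width == img_height == 150
in_channels = 3         # RGB input
filters = [32, 32, 64]  # filters in the three convolution blocks

conv_params = 0
for n_filters in filters:
    # Each filter has 3*3*in_channels weights plus one bias.
    conv_params += (3 * 3 * in_channels + 1) * n_filters
    side = pool_out(conv_out(side))
    in_channels = n_filters

flat = filters[-1] * side * side                  # inputs to Dense(64) after Flatten
dense_params = (flat + 1) * 64 + (64 + 1) * 1     # Dense(64) followed by Dense(1)
total = conv_params + dense_params

print("final feature map: {0}x{0}".format(side))  # 17x17
print("flattened features:", flat)                # 18496
print("total parameters:", total)                 # 1212513
```

Under these assumptions almost all of the parameters sit in the first fully-connected layer, which is one reason the `Dropout(0.5)` after it matters for such a small dataset.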

The train and test generators load the photos and generate batches indefinitely. Training batches are randomly augmented (flip, shear, zoom); test batches are only rescaled.

In [4]:
train_datagen = ImageDataGenerator(
    horizontal_flip=True,
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2)

test_datagen = ImageDataGenerator(rescale=1./255)
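
To make two of these options concrete, here is a rough NumPy-only sketch of what `rescale` and `horizontal_flip` do to a single channels-first image. It is not how Keras implements them, the function names are hypothetical, and shear and zoom are left out because they need interpolation.

```python
import numpy as np

def rescale(image, factor=1.0 / 255):
    """Scale integer pixel values into roughly [0, 1], like rescale=1./255."""
    return image.astype("float32") * factor

def horizontal_flip(image):
    """Mirror a channels-first (channels, rows, cols) image left to right."""
    return image[:, :, ::-1]

def random_horizontal_flip(image, rng):
    """Flip with probability 0.5, so either orientation can show up in a batch."""
    return horizontal_flip(image) if rng.rand() < 0.5 else image

rng = np.random.RandomState(0)
# Random stand-in for a real photo: 3 channels, 4 rows, 6 columns.
photo = rng.randint(0, 256, size=(3, 4, 6)).astype("uint8")

augmented = random_horizontal_flip(rescale(photo), rng)
print(augmented.shape, augmented.dtype)  # (3, 4, 6) float32
```

Flipping is a safe augmentation here because a mirrored photo of an aircraft is still a plausible photo of the same model.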

In [5]:
labels = ["A321", "A340"]
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    batch_size=32,
    classes=labels,
    target_size=(img_width, img_height),
    class_mode="binary")

Found 2158 images belonging to 2 classes.


In [6]:
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    batch_size=32,
    classes=labels,
    target_size=(img_width, img_height),
    class_mode="binary")

Found 350 images belonging to 2 classes.
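
With `class_mode="binary"`, each photo gets a single 0/1 target, and the sigmoid output can be read as the probability of class 1. The sketch below assumes the class indices follow the order of `labels` (A321 is 0, A340 is 1), which can be checked via `train_generator.class_indices`. The `to_labels` helper and the probabilities fed to it are hypothetical.

```python
import numpy as np

labels = ["A321", "A340"]  # same order as passed to flow_from_directory

def to_labels(probabilities, threshold=0.5):
    """Map sigmoid outputs to class names; index 1 is treated as the positive class."""
    indices = (np.asarray(probabilities).ravel() > threshold).astype(int)
    return [labels[i] for i in indices]

# Made-up model outputs with shape (n_samples, 1), for illustration only.
print(to_labels([[0.08], [0.93], [0.51]]))  # ['A321', 'A340', 'A340']
```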


In [7]:
model.fit_generator(
    train_generator,
    nb_epoch=nb_epoch,
    nb_val_samples=nb_validation_samples,
    samples_per_epoch=nb_train_samples,
    validation_data=validation_generator,
    verbose=2)
model.save_weights("model_1_weights_{}_epochs.h5".format(nb_epoch))
logging.info("Done")

Epoch 1/20
152s - loss: 0.7563 - acc: 0.5531 - val_loss: 0.6758 - val_acc: 0.5600
Epoch 2/20
153s - loss: 0.6401 - acc: 0.6389 - val_loss: 0.6575 - val_acc: 0.5971
Epoch 3/20
157s - loss: 0.5772 - acc: 0.6926 - val_loss: 0.6560 - val_acc: 0.6057
Epoch 4/20
156s - loss: 0.5264 - acc: 0.7212 - val_loss: 0.8842 - val_acc: 0.4857
Epoch 5/20
151s - loss: 0.4926 - acc: 0.7626 - val_loss: 0.5853 - val_acc: 0.6771
Epoch 6/20
151s - loss: 0.4418 - acc: 0.7931 - val_loss: 0.6307 - val_acc: 0.6686
Epoch 7/20
152s - loss: 0.4085 - acc: 0.8158 - val_loss: 0.4522 - val_acc: 0.8086
Epoch 8/20
151s - loss: 0.3740 - acc: 0.8414 - val_loss: 0.4426 - val_acc: 0.7800
Epoch 9/20
152s - loss: 0.3216 - acc: 0.8650 - val_loss: 0.3453 - val_acc: 0.8771
Epoch 10/20
153s - loss: 0.2838 - acc: 0.8862 - val_loss: 0.6225 - val_acc: 0.6943
Epoch 11/20
153s - loss: 0.2666 - acc: 0.8877 - val_loss: 0.3435 - val_acc: 0.8486
Epoch 12/20
152s - loss: 0.2155 - acc: 0.9089 - val_loss: 0.5461 - val_acc: 0.8057
Epoch 13/20
153s - loss:

22:28:10: Done


[TIP] Next time specify overwrite=True in save_weights!
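
The validation numbers in the output above jump around quite a bit, so it helps to pull them out of the log and look at them directly. A small sketch that parses the 12 complete epochs shown above (the regular expression and variable names are my own; the truncated epoch 13 line is skipped because it does not match):

```python
import re

# Training output copied from above; epoch 13 is cut off in the log.
log = """\
Epoch 1/20
152s - loss: 0.7563 - acc: 0.5531 - val_loss: 0.6758 - val_acc: 0.5600
Epoch 2/20
153s - loss: 0.6401 - acc: 0.6389 - val_loss: 0.6575 - val_acc: 0.5971
Epoch 3/20
157s - loss: 0.5772 - acc: 0.6926 - val_loss: 0.6560 - val_acc: 0.6057
Epoch 4/20
156s - loss: 0.5264 - acc: 0.7212 - val_loss: 0.8842 - val_acc: 0.4857
Epoch 5/20
151s - loss: 0.4926 - acc: 0.7626 - val_loss: 0.5853 - val_acc: 0.6771
Epoch 6/20
151s - loss: 0.4418 - acc: 0.7931 - val_loss: 0.6307 - val_acc: 0.6686
Epoch 7/20
152s - loss: 0.4085 - acc: 0.8158 - val_loss: 0.4522 - val_acc: 0.8086
Epoch 8/20
151s - loss: 0.3740 - acc: 0.8414 - val_loss: 0.4426 - val_acc: 0.7800
Epoch 9/20
152s - loss: 0.3216 - acc: 0.8650 - val_loss: 0.3453 - val_acc: 0.8771
Epoch 10/20
153s - loss: 0.2838 - acc: 0.8862 - val_loss: 0.6225 - val_acc: 0.6943
Epoch 11/20
153s - loss: 0.2666 - acc: 0.8877 - val_loss: 0.3435 - val_acc: 0.8486
Epoch 12/20
152s - loss: 0.2155 - acc: 0.9089 - val_loss: 0.5461 - val_acc: 0.8057
Epoch 13/20
153s - loss:
"""

pattern = re.compile(
    r"Epoch (\d+)/\d+\s+\d+s - loss: ([\d.]+) - acc: ([\d.]+)"
    r" - val_loss: ([\d.]+) - val_acc: ([\d.]+)")

# One dict per fully logged epoch.
rows = [
    {"epoch": int(m.group(1)), "loss": float(m.group(2)), "acc": float(m.group(3)),
     "val_loss": float(m.group(4)), "val_acc": float(m.group(5))}
    for m in pattern.finditer(log)
]

best = max(rows, key=lambda r: r["val_acc"])
print(len(rows), "complete epochs parsed")                            # 12 complete epochs parsed
print("best val_acc: {val_acc:.4f} at epoch {epoch}".format(**best))  # best val_acc: 0.8771 at epoch 9
```

Training accuracy rises steadily over these epochs, while validation accuracy swings between epochs (0.8771 at epoch 9, then 0.6943 at epoch 10), which is worth keeping in mind before trusting any single epoch's number.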


## Credits
- <a href="https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html" target="_blank">https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html</a>