# Identify Aircraft Based On Limited Number Of Photos - Part 1
This is an attempt to train a deep neural network to identify different models of commercial aircraft based on very few photos - without relying on any additional information (i.e., no EXIF data, no tags, etc.). The underlying technique can be used for other image classification tasks where we have a small training dataset.

I train a convolutional neural network (aka *CNN*) to distinguish between photos of different models of commercial aircraft. The CNN comprises 3 sets of (convolution + activation + max-pooling) layers, followed by 2 fully-connected layers, as shown below:
![](https://docs.google.com/drawings/d/1B7g5OCWWrKzFPE_hOj0yvVsN6PBxiov54MjmGW8FNYE/pub?w=959&h=429)

## Dataset
To date, I have curated a dataset of 5920 photos of 39 different models of commercial aircraft ([TSV](https://docs.google.com/spreadsheets/d/1zSUNhlpGDKtngK271UMJobtXcwojHiewczJ5FLpX4Es/pub?output=tsv), [Web](https://docs.google.com/spreadsheets/d/1zSUNhlpGDKtngK271UMJobtXcwojHiewczJ5FLpX4Es/pubhtml)). They were obtained from the Yahoo Flickr Creative Commons (YFCC) dataset, Wikipedia and other sources.

The photos have been harvested from the indicated `ImageURL`s and placed in one subdirectory per aircraft model under the `train` and `test` subdirectories of the `data` directory. The harvesting script should already have resized each photo to the standard size of 1024 x 683 pixels.
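
The directory layout described above (`data/train/<model>/...` and `data/test/<model>/...`) can be sanity-checked with a short script. This is only a sketch: the helper name `count_photos` and the list of file extensions are assumptions for illustration, not part of the original harvesting script.

```python
from pathlib import Path


def count_photos(split_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Count image files in each class subdirectory of split_dir.

    Hypothetical helper: it assumes one subdirectory per aircraft model,
    e.g. data/train/A321, data/train/B747, and so on.
    """
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for path in class_dir.iterdir()
                if path.is_file() and path.suffix.lower() in extensions
            )
    return counts
```

Calling `count_photos("../data/train")` would return a dictionary mapping each model name to its number of photos, which makes it easy to spot classes with very few examples.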

### Tiny Dataset
To compensate for the small dataset, photos are augmented using several random transformations. As an example, the image on top is the original, and 3 of the (*potentially infinite*) transformations are shown below it.
![](https://docs.google.com/drawings/d/1vQ6hsOmnHD15vC3m_g77HG7oirvK1PpHzMQP163fUTI/pub?w=827&h=321)
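
To make the idea of a random transformation concrete, here is a minimal NumPy sketch that flips an image horizontally with probability 0.5 and zooms in by a random factor. It is an illustration only, not what Keras does internally: the function name `random_augment` is hypothetical, the zoom only goes inward, and shear is left out.

```python
import numpy as np


def random_augment(image, rng, zoom_range=0.2):
    """Return a randomly flipped and zoomed copy of an (H, W, C) image.

    Illustrative sketch only; the output has the same shape as the input.
    """
    height, width = image.shape[:2]
    out = image
    # Mirror left-to-right half of the time.
    if rng.random() < 0.5:
        out = out[:, ::-1]
    # Zoom in: keep a central crop covering between (1 - zoom_range) and 1
    # of each side, then stretch it back with nearest-neighbour sampling.
    scale = 1.0 - rng.uniform(0.0, zoom_range)
    crop_h = max(1, int(round(height * scale)))
    crop_w = max(1, int(round(width * scale)))
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    crop = out[top:top + crop_h, left:left + crop_w]
    rows = np.arange(height) * crop_h // height
    cols = np.arange(width) * crop_w // width
    return crop[rows][:, cols]
```

Because every call draws fresh random numbers, one photo can yield many distinct training samples, for example `random_augment(photo, np.random.default_rng(0))`.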

## Train
I trained the classifier on the 4 aircraft models for which I have the most photos, as shown below. I have added a few columns that describe features useful for visual identification (they are not used for training).

| Aircraft Model | Train | Test | NumberOfEngines | EngineMount | Wingtips | OverWingExits | T-tail |
|----------------|-------|------|-----------------|-------------|----------|---------------|--------|
| A321           | 1106  | 225  | 2               | Under Wing  | Small    | None          | No     |
| A340           | 1052  | 125  | 4               | Under Wing  | Medium   | None          | No     |
| B747           | 1169  | 150  | 4               | Under Wing  | Large    | 1             | No     |
| CRJ900         | 304   | 130  | 2               | Rear        | Medium   | 2             | Yes    |

In [1]:
import logging
import numpy as np
import os

np.random.seed(371250) # For reproducibility, needs to be set before Keras is loaded
from importlib import reload # Python 3.4+; the imp module is deprecated
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.models import Sequential
from keras.preprocessing.image import ImageDataGenerator

reload(logging)
logging.basicConfig(format="%(asctime)s: %(message)s", level=logging.INFO, datefmt="%H:%M:%S")

Using Theano backend.


In [2]:
#img_width, img_height = 256, 170 # Approximately 25% of the original
img_width, img_height = 150, 150

train_data_dir = "../data/train"
validation_data_dir = "../data/test"

labels = ["A321", "A340", "B747", "CRJ900"]
nb_train_samples = 3700 # actual 3726
nb_validation_samples = 630
nb_epoch = 20

logging.info("Current PID : {}".format(os.getpid()))

17:06:31: Current PID : 2093


In [3]:
# Channels-first input shape, as used with the Theano backend
model_input_shape = (3, img_width, img_height)

model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=model_input_shape))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(32, 3, 3))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(64, 3, 3))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(Dense(len(labels)))
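# NOTE: softmax is the conventional output activation with categorical_crossentropy;
# sigmoid is kept here because it produced the training log below.
model.add(Activation("sigmoid"))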

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
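
To see how the three convolution + max-pooling blocks shrink the 150 x 150 input, the arithmetic can be worked through in plain Python. This sketch assumes the layer defaults of the Keras version used here: unpadded ("valid") 3 x 3 convolutions and 2 x 2 pooling with a stride equal to the pool size. The helper name `conv_pool_output` is made up for the example.

```python
def conv_pool_output(size, kernel=3, pool=2):
    """Spatial size after an unpadded convolution and non-overlapping pooling."""
    return (size - kernel + 1) // pool


size = 150
sizes = []
for _ in range(3):  # three convolution + pooling blocks
    size = conv_pool_output(size)
    sizes.append(size)

# The last block has 64 filters, so Flatten() yields 64 * size * size values.
flattened = 64 * size * size
# Weights plus biases of the first fully-connected layer, Dense(64).
dense_params = flattened * 64 + 64

print(sizes)         # [74, 36, 17]
print(flattened)     # 18496
print(dense_params)  # 1183808
```

Under these assumptions, most of the model's parameters sit in the first fully-connected layer, which is one reason the `Dropout(0.5)` that follows it matters on a small dataset.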

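The train generator loads the photos and indefinitely yields batches of randomly augmented photos. The test generator yields batches indefinitely as well, but it only rescales the pixel values and applies no augmentation.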

In [4]:
train_datagen = ImageDataGenerator(
    fill_mode="nearest",
    horizontal_flip=True,
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2)

test_datagen = ImageDataGenerator(rescale=1./255)

In [5]:
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    batch_size=32,
    classes=labels,
    target_size=(img_width, img_height),
    class_mode="categorical")

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    batch_size=32,
    classes=labels,
    target_size=(img_width, img_height),
    class_mode="categorical")

Found 3726 images belonging to 4 classes.
Found 630 images belonging to 4 classes.
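
With `class_mode="categorical"`, each batch carries one-hot label vectors whose column order follows the `classes` list passed to `flow_from_directory`. The sketch below reproduces that encoding by hand. The helper `one_hot` is hypothetical and only illustrates the mapping.

```python
import numpy as np

labels = ["A321", "A340", "B747", "CRJ900"]


def one_hot(label, classes):
    """Encode a class name as a one-hot vector ordered like `classes`."""
    vector = np.zeros(len(classes), dtype="float32")
    vector[classes.index(label)] = 1.0
    return vector


print(one_hot("B747", labels))  # [0. 0. 1. 0.]
```

This is why the final `Dense` layer has `len(labels)` outputs: one per column of the label vector.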


In [6]:
model.fit_generator(
    train_generator,
    nb_epoch=nb_epoch,
    nb_val_samples=nb_validation_samples,
    samples_per_epoch=nb_train_samples,
    validation_data=validation_generator,
    verbose=2)
model.save_weights("model_1_weights_{}_epochs.h5".format(nb_epoch))
logging.info("Done")

Epoch 1/20
414s - loss: 1.2779 - acc: 0.3489 - val_loss: 1.4801 - val_acc: 0.2349
Epoch 2/20
372s - loss: 1.2190 - acc: 0.4055 - val_loss: 1.3702 - val_acc: 0.2921
Epoch 3/20
358s - loss: 1.1527 - acc: 0.4557 - val_loss: 1.3290 - val_acc: 0.3492
Epoch 4/20
373s - loss: 1.0743 - acc: 0.5220 - val_loss: 1.2505 - val_acc: 0.4079
Epoch 5/20
371s - loss: 0.9636 - acc: 0.6020 - val_loss: 1.0415 - val_acc: 0.5667
Epoch 6/20
390s - loss: 0.8492 - acc: 0.6592 - val_loss: 0.9655 - val_acc: 0.6127
Epoch 7/20
447s - loss: 0.7783 - acc: 0.6975 - val_loss: 0.8281 - val_acc: 0.6587
Epoch 8/20
415s - loss: 0.6852 - acc: 0.7338 - val_loss: 0.7147 - val_acc: 0.7365
Epoch 9/20
441s - loss: 0.6525 - acc: 0.7461 - val_loss: 0.6405 - val_acc: 0.7587
Epoch 10/20
442s - loss: 0.5973 - acc: 0.7689 - val_loss: 0.7014 - val_acc: 0.7413
Epoch 11/20
398s - loss: 0.5671 - acc: 0.7743 - val_loss: 0.5731 - val_acc: 0.7603
Epoch 12/20
372s - loss: 0.5270 - acc: 0.7971 - val_loss: 0.5835 - val_acc: 0.7810
Epoch 13/20
423s - loss:

19:18:11: Done


[TIP] Next time specify overwrite=True in save_weights!
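
The `fit_generator` call above uses arguments that count samples (`samples_per_epoch`, `nb_val_samples`). Later Keras releases count batches instead (`steps_per_epoch`, `validation_steps`), so porting the notebook needs a small conversion. A minimal sketch, assuming the batch size of 32 and the image counts reported by the generators:

```python
import math


def steps_for(n_samples, batch_size):
    """Number of batches needed to see every sample at least once."""
    return math.ceil(n_samples / batch_size)


train_steps = steps_for(3726, 32)  # 3726 training images found
val_steps = steps_for(630, 32)     # 630 test images found

print(train_steps, val_steps)  # 117 20
```

Rounding up means the last batch of each pass is partly filled from the next pass of the generator, which is harmless for training.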


### Credits
- [Building powerful image classification models using very little data](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html)