# Image Analysis
for the Happywhale Dataset in the Whale and Dolphin Detection Competition

In this notebook, I want to take a look at the different images in the dataset. Some of them have properties that need special attention: image size, multiple animals in one image, near-duplicates, and various objects in the background. I will try to point out how each property might affect training and, where possible, suggest a way to handle it. If you find additional images that are worth knowing about, feel free to contact me 😉

Thanks to Andrada for the inspiration in https://www.kaggle.com/c/happy-whale-and-dolphin/discussion/308026

In [None]:
import os
import cv2
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

# load the data
root = '../input/happy-whale-and-dolphin/'
dataset = pd.read_csv(root+'train.csv')
dataset['image_path'] = root+'train_images/'+dataset['image']
tst = pd.DataFrame({'image': os.listdir(root+'test_images/')})
tst['image_path'] = root+'test_images/'+tst['image']
# DataFrame.append was removed in pandas 2.0, so concatenate instead
dataset = pd.concat([dataset, tst], ignore_index=True)

# some functions for plotting the images
def get_path(image_name):
    pth = dataset.image_path[dataset.image==image_name].values[0]
    return pth

def plot(imgs, labels=None, figsize=(20,10)):
    # lay the images out in a grid with at most two columns
    n = len(imgs)
    cols = 2 if n >= 2 else 1
    rows = int(np.ceil(n / cols))
    # squeeze=False always returns a 2D array of axes, even for a single image
    fig, axs = plt.subplots(rows, cols, figsize=figsize, squeeze=False)
    for i, ax in enumerate(axs.flatten()):
        ax.axis('off')
        if i >= n:
            continue  # leave the unused cell of an odd-sized grid empty
        ax.imshow(plt.imread(get_path(imgs[i])))
        ax.set_title(imgs[i] if labels is None else labels[i])
    plt.show()

# Image Size
The dataset contains a wide variety of image sizes. Height and width differ for almost every image. The gap between the smallest image, with 68x75 pixels, and the largest, with 5399x3599 pixels, should not be neglected either: that is a factor of roughly 50 to 80 along each side! This is important to know before you start processing the dataset, because neural networks usually expect inputs of a fixed size and such differences can cause big problems. I would therefore recommend bringing all images to a uniform size.

In [None]:
imgs = ['4d817d2f3e6298.jpg','99c03bb23874cd.jpg']
labels = [f'{name}\nSize: {plt.imread(get_path(name)).shape[:2]}' for name in imgs]
plot(imgs, labels=labels, figsize=(20,10))
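
One way to bring images to a uniform size is to scale the longer side to a target length and pad the rest, so that the animal is not distorted. The sketch below is only an illustration: the function name `letterbox`, the target size of 256, and the nearest-neighbour sampling are my assumptions, and the arrays are stand-ins instead of real images. In practice a library call such as `cv2.resize` would give smoother results.

```python
import numpy as np

def letterbox(img, size=256):
    """Resize `img` (H x W or H x W x C) to size x size, keeping the aspect ratio."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    # nearest-neighbour sampling: pick the source row/column for each target pixel
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = img[rows][:, cols]
    # pad the shorter side with black pixels so the result is square
    canvas = np.zeros((size, size) + img.shape[2:], dtype=img.dtype)
    top, left = (size - new_h) // 2, (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas

small = np.full((68, 75, 3), 200, dtype=np.uint8)    # stand-in for the smallest image
large = np.full((360, 540, 3), 200, dtype=np.uint8)  # scaled-down stand-in for a large image
print(letterbox(small).shape, letterbox(large).shape)
```

Padding keeps the proportions of the fin intact, at the cost of some unused pixels; plain stretching to a square would be the simpler alternative.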

# Duplicates
Some images look identical at first glance, like duplicates. Two examples are shown below. On closer inspection they are not really identical: they were taken only moments apart, so the differences are hardly noticeable. I think it is important to know about this phenomenon, but it should not have a measurable impact on training.

In [None]:
imgs = ['090d7f9228a6bc.jpg','bb875ffcb8d064.jpg',
        '7ad3a277f55107.jpg','d4d8ac80cb3a4b.jpg']
plot(imgs, figsize=(20,13))
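
If you want to search for such near-duplicates automatically, one common idea is a perceptual hash. The sketch below implements a simple average hash; it is not what I used to find the examples above, the function names are made up for illustration, and the arrays are synthetic stand-ins for gray scale images.

```python
import numpy as np

def average_hash(gray, hash_size=8):
    """Return a boolean fingerprint of a 2D gray scale image."""
    h, w = gray.shape
    # crop so that the image divides evenly into hash_size x hash_size blocks
    gray = gray[: h - h % hash_size, : w - w % hash_size].astype(float)
    blocks = gray.reshape(hash_size, gray.shape[0] // hash_size,
                          hash_size, gray.shape[1] // hash_size).mean(axis=(1, 3))
    # a bit is set where a block is brighter than the average block
    return (blocks > blocks.mean()).flatten()

def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return int(np.count_nonzero(a != b))

frame = np.add.outer(np.arange(64), np.arange(96))  # synthetic gradient image
next_frame = np.roll(frame, 1, axis=1)              # same scene, shifted by one pixel
other = frame[::-1, ::-1]                           # a clearly different image

d_near = hamming(average_hash(frame), average_hash(next_frame))
d_other = hamming(average_hash(frame), average_hash(other))
print(d_near, d_other)
```

A small distance suggests that two images show nearly the same scene; the threshold for "small" has to be chosen by looking at real pairs.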

# Annotations
Some images contain annotations about the photograph and the animal shown. These include the date of the photograph, the location where the animal was found, and some other data. Such overlays could significantly disturb the algorithm, but they cannot be avoided easily.

In [None]:
imgs = ['00a5f0c7e639ce.jpg','373a98f033ef87.jpg']
plot(imgs, figsize=(20,10))

# Multiple Individuals
In some images of the dataset, more than one animal can be observed. No wonder, because many of the species live together in groups and are rarely seen alone. This is not a big problem as long as the labeled animal can be seen in the foreground of the image. But if there are many individuals in the image and it is not clear which one is meant, it becomes very difficult for the algorithm to extract the relevant information from the pixels.

As an example: the right image shows many animals, but only one of them is stored with its individual_id in the dataset. This does not matter in this concrete example, because there are 87 more images of the animal. But if such an image appears in the test dataset, the algorithm will most likely not be able to predict the correct individual.

In [None]:
imgs = ['33fc1508754452.jpg','197227f33561d5.jpg']
plot(imgs, figsize=(20,10))

# Other body parts
Most of the images in the dataset show the dorsal fin of the animal. It stands to reason that the neural network will handle such images best and assign them most reliably. Images showing other body parts, such as the tail fin or even the complete animal, can cause problems for the algorithm, because they show details that are not visible in most other images.

In [None]:
#imgs = ['2cfd2066e9df1c.jpg','03479f7a9301f8.jpg',
#        '4f43555e842ade.jpg','0086e36dfb15fd.jpg']
#plot_4_imgs(imgs,figsize=(20,14))
imgs = ['2cfd2066e9df1c.jpg','4f43555e842ade.jpg']
plot(imgs, figsize=(20,10))

# Night View
Some pictures look as if they were taken with some kind of night vision camera, because they have the typical green tint. Such a camera uses only one color channel to represent the image, which is why everything appears in shades of green. What is striking here is that most of these pictures show beluga whales.

In [None]:
imgs = ['0039955230421d.jpg','49b10aa82994f6.jpg',
        '01bb74f1413cda.jpg','49f06f464824ae.jpg']
plot(imgs, figsize=(20,7))

Looking at the two lower images more closely, I realize that they cannot have been taken with a night vision camera, because the annotations on the images are green as well. The real cause is that these are gray scale images: ```matplotlib.pyplot.imread``` returns them as a two-dimensional array, and ```plt.imshow``` renders a single-channel array with its default colormap, which produces the green tint. An alternative is the ```cv2.imread``` function, which by default loads every image with three channels. Paired with a color conversion from BGR to RGB, it displays these images correctly in black and white. The same two calls can also be used for all other images of the dataset without any problems.

In [None]:
imgs = ['01bb74f1413cda.jpg']
plt.figure(figsize=(10,5))
img = cv2.imread(get_path(imgs[0]))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
plt.axis('off')
plt.show()
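
As far as I can tell, the green tint appears because a gray scale file is read as a two-dimensional array, and `plt.imshow` applies its default colormap to single-channel data. If you prefer to stay with matplotlib, passing `cmap='gray'` to `plt.imshow` should fix the display. For training it can also help to give every image three channels; the helper `to_rgb` below is a hypothetical name and the array is a stand-in for a loaded image.

```python
import numpy as np

def to_rgb(img):
    """Return a three-channel version of `img`, whether it was gray or color."""
    if img.ndim == 2:                    # gray scale: height x width
        return np.stack([img] * 3, axis=-1)
    return img[..., :3]                  # color: drop an alpha channel if present

gray = np.linspace(0, 255, 12, dtype=np.uint8).reshape(3, 4)  # stand-in gray image
print(gray.shape, to_rgb(gray).shape)
```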

# Landscape Pictures
Some pictures show the beautiful nature in the habitats of whales and dolphins. What looks beautiful at first can quickly become a problem for the algorithm. If the main focus of the image is on the background rather than on the animal we are actually interested in, this can confuse the model. Sometimes the animal is so far away that it is hard to recognize at all. The information that interests us is then hidden in just a few pixels of the image. Such images are not easy for the network to handle, but they cannot be simplified automatically either.

Similar to the landscape images, some shots show ice textures in the background. This is usually the case with animals that are native to icy waters, such as the killer whale. The irregular ice structures can interfere with the algorithm. On the other hand, the dark animals stand out quite well against the white ice.

In [None]:
#imgs = ['1287089c50aabe.jpg','25bea55efd69ae.jpg',
#        '4d5577bb182a72.jpg','2942870f6daf30.jpg']
imgs = ['2942870f6daf30.jpg','4d5577bb182a72.jpg',
        '1dc1a8f6925f40.jpg','2bc42058aa5024.jpg']
plot(imgs, figsize=(20,13))

# Image Lighting
The lighting varies a lot across the dataset: some pictures were taken in the dark, some at sunset, and so on. To compensate for these differences, I would recommend converting the pictures to black and white. This would also help to avoid errors for several of the other issues in this dataset.

In [None]:
#imgs = ['00afe60388a560.jpg','00e828b22fe365.jpg',
#        '292ceb0ffef4e8.jpg','2d597e33193464.jpg']
#plot(imgs, figsize=(20,12))
imgs = ['292ceb0ffef4e8.jpg','2d597e33193464.jpg']
plot(imgs, figsize=(20,10))
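
A minimal sketch of such a conversion, assuming RGB input: the images are reduced to one channel with the usual luma weights and then stretched to the full intensity range, so that dark and bright shots end up on a comparable scale. The function names and the random stand-in images are assumptions for illustration only.

```python
import numpy as np

def to_gray(rgb):
    """Convert an H x W x 3 RGB image to gray scale with the usual luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3].astype(float) @ weights

def stretch_contrast(gray):
    """Rescale intensities linearly to the full 0..255 range."""
    lo, hi = gray.min(), gray.max()
    if hi == lo:                         # flat image: nothing to stretch
        return np.zeros_like(gray, dtype=np.uint8)
    return ((gray - lo) / (hi - lo) * 255).astype(np.uint8)

rng = np.random.default_rng(0)
dark = rng.integers(0, 60, size=(4, 4, 3), dtype=np.uint8)       # stand-in for a night shot
bright = rng.integers(150, 256, size=(4, 4, 3), dtype=np.uint8)  # stand-in for a sunny shot
for img in (dark, bright):
    out = stretch_contrast(to_gray(img))
    print(out.min(), out.max())
```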

# Birds/Penguins and People
Some pictures raise a different issue from the ones before. There are other animals, like birds or penguins, or even people (tourists and scientists) in the field of view. This could be very problematic for the algorithm, because they add contours that have nothing to do with the animal we want to identify. This should be kept in mind...

In [None]:
#imgs = ['4f5265c6192229.jpg','550d6e2843d83d.jpg',
#        '5bf1396d350169.jpg','9b0b44b19ba412.jpg']
#imgs = ['090d7f9228a6bc.jpg','538ead52ffa324.jpg',
#        '50da695b997738.jpg','12a7b25090e1b9.jpg']
imgs = ['550d6e2843d83d.jpg','5bf1396d350169.jpg',
        '538ead52ffa324.jpg','50da695b997738.jpg']
plot(imgs, figsize=(20,14))

# Buildings and Boats
Analogous to the landscape pictures, there are also images with buildings and boats in the background, which could be problematic for the algorithm.

In [None]:
#imgs = ['398214a90dd3b3.jpg','3c15e996c183aa.jpg',
#        '57198326a6461d.jpg','2c54be7b88181a.jpg']
#plot(imgs, figsize=(20,13))
imgs = ['3c15e996c183aa.jpg','57198326a6461d.jpg']
plot(imgs, figsize=(20,10))

# Edited Images
Another topic is the cropped and rotated images in the dataset. I don't know why, but these pictures were edited before they were saved in the dataset. To me this doesn't make any sense, but it should not influence the training process. In this chapter I also want to mention horizontal flips: some animals swim to the left, others to the right, so their dorsal fin is shown from different sides. I recommend using augmentation to handle both topics, the rotation and the horizontal flip.

In [None]:
imgs = ['0246806606bc80.jpg','2c1a75a9d2fa14.jpg']
plot(imgs, figsize=(20,10))
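
As a small illustration of the flip part, here is a sketch of a random horizontal flip with plain NumPy. The function name `random_hflip` and the probability `p` are assumptions; rotation by arbitrary angles is left out because it needs an image library.

```python
import numpy as np

def random_hflip(img, rng, p=0.5):
    """Mirror the image left-to-right with probability p."""
    if rng.random() < p:
        return img[:, ::-1].copy()
    return img

rng = np.random.default_rng(42)
img = np.arange(12).reshape(3, 4)        # stand-in for an image
flipped = random_hflip(img, rng, p=1.0)  # p=1.0 forces the flip for the demo
print(flipped[0])
```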

# Hidden Lens
You know the situation: you want to take a picture and your finger covers part of the lens. This also happened to some of the photographers of the images in the dataset. I think that, as long as the animal is still visible, these mistakes will not disturb the learning process.

In [None]:
imgs = ['1ecbddc0acaf11.jpg','d5b42024509635.jpg']
plot(imgs, figsize=(20,10))

# Screenshots
I don't know why, but these pictures were saved as screenshots. Processing them can be very difficult, because the image contains information that we don't need. There are only a few of these pictures in the dataset. Once you know them all, you can crop them manually if you want. I don't really know another way to handle them.

In [None]:
imgs = ['35d677992a4f2e.jpg','008c5d3fc215ac.jpg']
plot(imgs, figsize=(20,10))
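
Manual cropping could be organized as a small lookup table that is applied while loading. The sketch below is only an illustration: the file name and the box coordinates are placeholders, not measured values for the screenshots above, and `crop_if_listed` is a hypothetical helper.

```python
import numpy as np

# hypothetical crop boxes (top, bottom, left, right) in pixels; the file name and
# the numbers are placeholders that have to be replaced by hand-measured values
CROP_BOXES = {
    'example_screenshot.jpg': (40, 400, 0, 600),
}

def crop_if_listed(name, img):
    """Crop `img` to its stored box, or return it unchanged if none is stored."""
    if name not in CROP_BOXES:
        return img
    top, bottom, left, right = CROP_BOXES[name]
    return img[top:bottom, left:right]

screenshot = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a loaded image
print(crop_if_listed('example_screenshot.jpg', screenshot).shape)
print(crop_if_listed('some_other_image.jpg', screenshot).shape)
```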

# Devices
I found two pictures with devices attached to the animals. In the right picture it seems to be a transmitter for tracking the animal. Such objects can easily confuse the model. Good to know: both pictures are included in the test data.

In [None]:
imgs = ['8c660e44867f8a.jpg','67e5fb9a6110b0.jpg']
plot(imgs, figsize=(20,10))

# ... and something else
I cannot explain it, but there is no animal in this picture. The image is officially classified as a gray whale with the individual ID **fc0f7c162cc0**. There are 72 more pictures of this specific animal, and they are pretty good, so I would recommend removing this image from the dataset to avoid errors in the algorithm.

In [None]:
imgs = ['cd5fe465c60cb9.jpg']
plot(imgs, figsize=(15,8))
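
Removing such an image from the training table could look like the sketch below. It uses a tiny made-up DataFrame with the same `image` and `individual_id` columns; apart from the image and ID named above, all values are placeholders.

```python
import pandas as pd

# toy stand-in for the training table; only the first row refers to a real file
toy = pd.DataFrame({
    'image': ['cd5fe465c60cb9.jpg', 'placeholder_1.jpg', 'placeholder_2.jpg'],
    'individual_id': ['fc0f7c162cc0', 'fc0f7c162cc0', 'placeholder_id'],
})

bad_images = ['cd5fe465c60cb9.jpg']  # images to exclude from training
cleaned = toy[~toy['image'].isin(bad_images)].reset_index(drop=True)
print(cleaned['individual_id'].value_counts())
```

Keeping the excluded file names in one list makes it easy to add further images later.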