In [None]:
%pylab notebook

from IPython.display import YouTubeVideo

# Helper for showing several images in a grid, `width` images per row.
import math
def _imshow(width, axes, *images):
    from matplotlib import pyplot as plt
    fig = plt.figure()

    # int() keeps the row count an integer on Python 2 as well as Python 3
    height = int(math.ceil(len(images) / float(width)))
    for i, im in enumerate(images):
        ax = fig.add_subplot(height, width, i + 1)
        ax.imshow(im, cmap=plt.cm.cubehelix)

        # Hide the axes of this subplot unless they were asked for
        if not axes:
            ax.axis('off')

    plt.show()

# Recap of Project

This project is about *shadow detection for mobile robots* - detecting shadows with an *active camera*. This is challenging because the majority of previous work relies on a background model to detect changed parts of a scene. A background model assumes a fixed viewpoint, so it can't work for active video!

## Hypothesis

The detection of shadows is possible by combining information about texture features (something that remains largely unchanged under shadow) with other information - including local maxima, smoothness, or other colourimetric features.

## Key Points

* Capturing multiple datasets
* Simple shadow detection with colour/brightness
* Texture feature investigation
* Clustering texture features into contiguous regions
* Combining texture features with other information
* Machine learning techniques

## Datasets

### Static Camera (or "easy")

In [None]:
images = [imread("../data/static/%s/images/small-00044.png" % (i))
          for i in ["bobbly-slabs", "bricks", "smooth-slabs", "tarmac"]]  
_imshow(2, False, *images)

### Active Camera (or "slightly less easy")

In [None]:
images = [imread("../data/active/%s/images/small-00044.png" % (i))
          for i in ["grass-path", "seafront-gravel", "seafront-path"]]  
_imshow(2, False, *images)

### Ground Truth

The following video is an example of some of the initial techniques that I was using to generate ground truth. This involved using local brightness maxima and some morphological post-processing to reduce noise and fill holes (opening/closing). This is pretty much the point I'd reached at the time of the mid-project demonstration.
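
The idea described above can be sketched roughly as follows. This is a minimal illustration rather than the code used in the project: the helper name `sketch_shadow_mask`, the `window` and `ratio` values, and the rule "much darker than the brightest nearby pixel" are all assumptions made for the example. Only the overall shape (local brightness maxima, then opening and closing) comes from the description.

```python
import numpy as np
from scipy import ndimage


def sketch_shadow_mask(brightness, window=31, ratio=0.6):
    """Rough shadow mask from a 2D brightness image (hypothetical helper)."""
    brightness = np.asarray(brightness, dtype=float)

    # Brightest value in the neighbourhood of each pixel.
    local_max = ndimage.maximum_filter(brightness, size=window)

    # Assumption: a pixel is a shadow candidate if it is much darker than
    # the brightest pixel nearby. `ratio` is an illustrative value.
    candidates = brightness < ratio * local_max

    # Opening removes isolated specks; closing then fills small holes.
    structure = np.ones((3, 3), dtype=bool)
    opened = ndimage.binary_opening(candidates, structure=structure)
    return ndimage.binary_closing(opened, structure=structure)


# Synthetic example: a bright scene with one dark square "shadow",
# a single dark noise pixel, and a single bright hole in the shadow.
scene = np.ones((40, 40))
scene[10:30, 10:30] = 0.3
scene[2, 2] = 0.3    # noise speck, removed by opening
scene[20, 20] = 1.0  # hole in the shadow, filled by closing
mask = sketch_shadow_mask(scene)
```

On this synthetic scene the resulting mask covers the dark square only. Note that the window has to be larger than the shadow for this rule to work, which is one reason the output still needed manual checking as ground truth.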

In [None]:
# a video in case of demo effect
YouTubeVideo("j5PmcK6b4dQ")

In [None]:
# So, for example, the ground truth looks like this...
ground_truth = imread("../data/static/smooth-slabs/ground-truth/small-00052.png")
image = imread("../data/static/smooth-slabs/images/small-00052.png")
_imshow(1, False, ground_truth, image)

## Texture

### Choice of Texture Feature

A large part of this project revolved around choosing texture features based on their shadow invariance. In the end, local binary patterns (LBPs) turned out to be the *best* texture feature, although they were chosen based on a combination of factors rather than solely shadow invariance.
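
To illustrate why LBPs are attractive here, below is a minimal sketch of the basic 8-neighbour LBP written in plain NumPy. It is not necessarily the variant used in the project, and the function name `lbp_codes` is made up for the example. The "shadow" is simulated by halving the brightness, which is an idealisation: real shadows also add noise and colour shifts.

```python
import numpy as np


def lbp_codes(image):
    """Basic 8-neighbour LBP codes for the interior pixels of a 2D image."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    centre = image[1:-1, 1:-1]

    # Neighbour offsets, clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]

    codes = np.zeros(centre.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright
        # as the centre pixel.
        codes += (neighbour >= centre).astype(np.uint8) * np.uint8(1 << bit)
    return codes


rng = np.random.default_rng(0)
surface = rng.uniform(0.2, 1.0, size=(32, 32))
shadowed = 0.5 * surface  # idealised shadow: uniform darkening

codes_lit = lbp_codes(surface)
codes_shadow = lbp_codes(shadowed)
unchanged = np.array_equal(codes_lit, codes_shadow)
```

Because each code only depends on whether a neighbour is brighter than the centre, uniformly scaling the brightness leaves every code unchanged.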

### Flaws in Analytical Method

The simplest way of comparing the shadow invariance of texture features seemed to be extracting two sets of features for an image pair - one image under shadow, and the other with no shadow - then using a simple metric for difference, such as mean squared error (essentially, the squared Euclidean distance divided by the number of elements).
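
As a minimal sketch of that comparison, assuming each feature set has already been extracted and can be flattened into a vector of the same length (the helper name `feature_mse` and the numbers are illustrative only):

```python
import numpy as np


def feature_mse(features_a, features_b):
    """Mean squared error between two equally sized feature arrays."""
    a = np.asarray(features_a, dtype=float).ravel()
    b = np.asarray(features_b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("feature arrays must have the same number of elements")
    return float(np.mean((a - b) ** 2))


# Illustrative feature vectors for a shadow-free and a shadowed image.
lit = np.array([0.0, 0.0])
shadowed = np.array([3.0, 4.0])

mse = feature_mse(lit, shadowed)
euclidean = float(np.linalg.norm(lit - shadowed))
```

Here the Euclidean distance is 5 and the MSE is 12.5, which is the squared distance (25) divided by the number of elements (2).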

However, as became clear during experimentation, different feature types can't necessarily be compared in this manner, as their data structures can be completely different. A sparse data structure (for example, a GLCM, which is a mostly-empty array) may exhibit a *very low* mean squared error, whereas a denser structure (for example, an LBP) may give a considerably different result.

Given more time, I would have changed this method or investigated it further.
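
The effect can be reproduced with made-up numbers. In the sketch below, the two sparse arrays stand in for GLCM-like features and share *no* non-zero entries at all, while the two dense vectors stand in for small histograms that differ only moderately. All of the values are invented for illustration.

```python
import numpy as np


def mse(a, b):
    """Mean squared error over every element of two equally shaped arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))


# Sparse, GLCM-like arrays: 256 x 256 with only 16 non-zero entries each,
# placed so that the two arrays do not overlap anywhere.
sparse_a = np.zeros((256, 256))
sparse_b = np.zeros((256, 256))
for i in range(16):
    sparse_a[i, i] = 1.0 / 16
    sparse_b[i, 255 - i] = 1.0 / 16

# Dense histogram-like vectors with 8 bins, each summing to 1.
dense_a = np.full(8, 1.0 / 8)
dense_b = np.array([0.25, 0.25, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0])

sparse_mse = mse(sparse_a, sparse_b)
dense_mse = mse(dense_a, dense_b)
```

The sparse pair scores a far lower MSE than the dense pair despite being completely different, simply because the error is averaged over tens of thousands of zero entries. Comparing raw MSE values across feature types therefore says more about the data structure than about shadow invariance.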

### Texture Feature Clustering

K-means clustering was the primary clustering technique used in this project. As well as having a particularly performant implementation in `scikit-learn` (a fantastic mini-batch implementation), it was also the most straightforward algorithm to use - its only drawback was having to specify $k$.
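
A minimal sketch of mini-batch k-means with `scikit-learn` is shown below. The two synthetic blobs stand in for per-pixel texture feature vectors, and the parameter values are assumptions for the example rather than the settings used in the project.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Synthetic stand-in for texture features: two well-separated groups
# of 2D feature vectors (100 vectors each).
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(100, 2)),
    rng.normal(loc=5.0, scale=0.1, size=(100, 2)),
])

# k has to be chosen up front; here we happen to know it is 2.
k = 2
model = MiniBatchKMeans(n_clusters=k, n_init=3, random_state=0)
labels = model.fit_predict(features)
```

For an image, the label array would then be reshaped back to the image dimensions to give a map of texture regions.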

#### Notable Mention

It's annoying to have to specify $k$ for each image, particularly on unknown image sequences. Online, unsupervised estimation of $k$ is a whole other project in itself, so I also tried DBSCAN, a clustering algorithm that does not have this requirement. DBSCAN essentially looks for areas of high density separated by sparse areas. It was incredibly slow (almost unusable, actually), so it wasn't used in the end - but it is worth mentioning.
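
For comparison, here is a minimal DBSCAN sketch on synthetic data. Note that although $k$ is no longer needed, DBSCAN has its own parameters (`eps` and `min_samples`); the values below are illustrative assumptions chosen to suit the synthetic blobs.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic stand-in for texture features: two dense groups of 2D points.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.1, size=(50, 2)),
])

# eps is the neighbourhood radius; min_samples is the number of points
# needed within that radius for a point to count as part of a dense area.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(features)

# DBSCAN marks noise points with the label -1, so exclude it when counting.
n_clusters = len(set(labels.tolist()) - {-1})
```

The number of clusters falls out of the density structure of the data rather than being specified in advance.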

### Testing Clustering Quality

Because the analysis of shadow-invariant texture features was flawed, I instead evaluated how consistent texture clusters were under shadow. This simply involved applying a clustering algorithm to texture features from shadow-free/shadowed image pairs, and comparing the resultant clusters with various metrics (e.g. Adjusted Rand Index, homogeneity, etc.).
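
A minimal sketch of that comparison using `scikit-learn`'s clustering metrics follows. The label arrays are invented: in practice they would be the per-pixel cluster labels from the shadow-free and shadowed images.

```python
from sklearn.metrics import adjusted_rand_score, homogeneity_score

# Cluster labels for the same six pixels in a shadow-free image...
labels_lit = [0, 0, 1, 1, 2, 2]

# ...and under shadow. Same grouping, but the cluster ids are permuted,
# which is expected since cluster numbering is arbitrary.
labels_shadow_same = [1, 1, 0, 0, 2, 2]

# A worse case: the shadow causes two clusters to merge into one.
labels_shadow_merged = [0, 0, 0, 0, 2, 2]

ari_same = adjusted_rand_score(labels_lit, labels_shadow_same)
ari_merged = adjusted_rand_score(labels_lit, labels_shadow_merged)
hom_same = homogeneity_score(labels_lit, labels_shadow_same)
hom_merged = homogeneity_score(labels_lit, labels_shadow_merged)
```

Both metrics ignore how the clusters are numbered, so the permuted labelling scores a perfect 1.0, while the merged labelling scores lower on both.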

## Simple Clustering-Based Shadow Detection

### Without Variance Threshold

### With Variance Threshold

## Machine Learning

Very close to the end of the project, there were one or two days spare - so it made sense to experiment with some machine learning techniques before time ran out. Two types were trialled - decision trees and random forests.

### Decision Trees

Decision trees were the most effective machine learning technique used in this project, mostly due to their extreme simplicity and excellent performance.
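
As a rough sketch of the approach, the example below trains a `scikit-learn` decision tree on synthetic per-pixel features. The two features (a brightness value and a texture response) and the rule used to generate the labels are assumptions for illustration; the project's actual features and ground truth are not reproduced here.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic per-pixel features: column 0 is brightness, column 1 is a
# texture response. Both are random stand-ins for real measurements.
rng = np.random.default_rng(0)
brightness = rng.uniform(0.0, 1.0, size=200)
texture = rng.uniform(0.0, 1.0, size=200)
X = np.column_stack([brightness, texture])

# Made-up ground truth: a pixel is "shadow" (1) if it is dark.
y = (brightness < 0.4).astype(int)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

# Accuracy on the training data only; a real evaluation would use
# held-out frames.
train_accuracy = tree.score(X, y)
```

A shallow tree like this is cheap to train and to evaluate per pixel, which fits the simplicity argument above.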

In [None]:
# a video in case of demo effect
YouTubeVideo("zBXaazf0vI8")

### Random Forests

## Demonstration