# Identification and localization of Sea Lion anatomical features using transfer learning

Will Dampier

- In my occasionally free moments.

In [1]:

from IPython.display import HTML

In [2]:
HTML('<iframe src="https://player.vimeo.com/video/266222134" width="640" height="360" frameborder="0" allowfullscreen></iframe>')

# Goal

#### Turn this 
![unlabeled-lion](imgs/cutie2.JPG)

#### To this
![labeled-lion](imgs/cutie2_labeled.jpg)

## And eventually

Are these two sea lions the same?

| Lion 1| Lion 2|
|:------:|:-------:|
|![unlabeled-lion](imgs/cutie2.JPG)| ![unlabeled-lion](imgs/cutie.JPG)|
| | |


## Previous Sea Lion identification research

![whisker labeling](imgs/marked_whiskers.png)

![original requirements](imgs/orig_requirements.png)

But how do you get those 180 images in the wild?

![Sleeping lion](imgs/one-side.jpg)

## Biological Analogies


![Neuron](imgs/neuron-field.png)

[Source](http://neuroclusterbrain.com/neuron_model.html)

![Human visual pathway](imgs/Human_visual_pathway.svg)

[Source](https://creativecommons.org/licenses/by-sa/4.0)

![saccade example](imgs/saccades.png)

![saccade example](imgs/saccades-together.png)

### Computational Strategy

![Digital Cat](imgs/digital_cat.png)

[Source](http://cs231n.github.io/classification/)

![sliding convolutions](imgs/sliding_windows.gif)

[Source](https://medium.freecodecamp.org/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050)

![sliding convolutions](imgs/sliding_windows_multi.gif)

[Source](http://cs231n.github.io/convolutional-networks/)
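
The sliding-window animations above boil down to a few nested loops. Here is a minimal pure-Python sketch, assuming no padding and a stride of 1. The function name `convolve2d`, the toy image, and the edge kernel are all made up for illustration; real networks learn their kernels and run this on a GPU.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1).

    Both arguments are lists of rows. Like most CNN libraries, this
    computes a cross-correlation: the kernel is not flipped.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            # Multiply the window under the kernel element-wise and sum.
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written kernel that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
response = convolve2d(image, vertical_edge)
```

The response is largest in the middle column, exactly where the window straddles the edge.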

ResNet is a state-of-the-art image classification model. Its 50-layer variant has about 25.6M parameters to tune.
![Resnet](imgs/resnet_model.png)

### ImageNet & Transfer Learning

![ImageNet](imgs/imgnet-logo.jpg)


About 15 million labeled high-resolution images in around 22,000 categories. A 1,000-category subset of this collection was used to train ResNet to a ~5% top-5 error rate.

![imagenet examples](imgs/imagenet.png)

![Layers](imgs/feature_levels.png)

[Source](https://torres.ai/artificial-intelligence-content/deeplearning/)
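
The transfer learning idea is to keep the layers that were already trained on ImageNet fixed and fit only a small new "head" on our own labels. Below is a toy sketch of that split, not the actual training code. `frozen_features` is a hypothetical stand-in for the pretrained layers, and the four data points are invented.

```python
import math


def frozen_features(x):
    """Stand-in for a pretrained backbone: fixed and never updated."""
    a, b = x
    return [a, b, a * b, 1.0]  # the trailing 1.0 acts as a bias input


def predict(head_weights, x):
    """Logistic-regression head on top of the frozen features."""
    z = sum(w * f for w, f in zip(head_weights, frozen_features(x)))
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=200, learning_rate=0.5):
    """Fit only the head weights with plain stochastic gradient descent."""
    head_weights = [0.0] * 4
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = predict(head_weights, x) - y
            features = frozen_features(x)
            for i, feature in enumerate(features):
                head_weights[i] -= learning_rate * error * feature
    return head_weights


# Invented data: the label is 1 when both inputs share a sign.
samples = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
labels = [1, 1, 0, 0]
head_weights = train_head(samples, labels)
```

Only four numbers get trained here. The same reasoning is why a handful of labeled images can be enough when the backbone already knows useful features.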

Cool and all, but this is what we'd get: the ability to tell whether the __whole image__ contains a sea lion.

| Lion 97% | Lion 91% | Not Lion 0.12% |
|:------:|:-------:|:-------:|
|![unlabeled-lion](imgs/cutie2.JPG)| ![unlabeled-lion](imgs/cutie.JPG)| ![not-lion](imgs/not_lion.jpg)|


### RetinaNet

If you want to know __where__ an object is, you have to change your question a bit.

![yolo-1](imgs/yolo-1.jpeg)

[Source](https://heartbeat.fritz.ai/gentle-guide-on-how-yolo-object-localization-works-with-keras-part-2-65fe59ac12d)

![yolo-2](imgs/yolo-2.jpg)

[Source](https://heartbeat.fritz.ai/gentle-guide-on-how-yolo-object-localization-works-with-keras-part-2-65fe59ac12d)
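
Once the model predicts boxes instead of a single label, we need a way to say how close a predicted box is to an annotated one. The usual measure in object detection is intersection over union (IoU). This is a minimal sketch, assuming boxes are given as `(x_min, y_min, x_max, y_max)`. The function name and the box layout are my choices for illustration, not something taken from a specific library.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max). Returns a value in [0, 1].
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; zero if the boxes do not touch.
    overlap_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    overlap_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = overlap_w * overlap_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0
```

A predicted "face" box that exactly matches the annotation scores 1.0, and a box that misses it entirely scores 0.0.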

### Annotating Images

![dataturks](imgs/dataturks.png)

![annotation](imgs/annotation.png)

### Current Results

Training used only 40 images, all labeled by a single annotator (me). I used the `mobilenet` architecture with the smallest model configuration possible.

![first-retinanet fun](imgs/first_retina_net.png)

![harder retinanet fun](imgs/hard_retina_net.png)

What's the image with the most "recognizable" face?

![Not a face](imgs/not_a_face.jpg)

## Future Directions

(Condensed from [source](https://medium.com/@ageitgey/machine-learning-is-fun-part-4-modern-face-recognition-with-deep-learning-c3cffc121d78))

![science](imgs/face_science.gif)

### Landmark detection

![landmarks](imgs/face_mask.png)

### That's so 1990

- Does not generalize well to faces that are not "front facing".
- False positive rate increases as the database size increases.
- Are these really the "best" points to use?

### Deep Learning Way

![triplet](imgs/triplet_example.png)

This strategy has many advantages:
  - Increasing the database __decreases__ false-positive rate.
  - Images can be taken from any angle.
  - Allows for "outlier" detection of novel faces that aren't in the database.
  - Face databases can be constructed in linear time and searched in `O(log n)` time.
  - Generalizes to things beyond faces ... like whiskers and flippers.
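
A rough sketch of the two pieces behind this strategy: the triplet loss used to train the embedding, and a lookup that reports "unknown" when nothing in the database is close enough. Everything here is illustrative. The 2-D embeddings, the `margin` and `threshold` values, and the function names are assumptions. The lookup is a plain linear scan for brevity, so the `log(n)` search would need a tree-based or approximate index instead.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the positive is closer to the anchor than the negative,
    by at least `margin`; positive otherwise."""
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


def identify(query, database, threshold):
    """Return the name of the closest known embedding, or None when the
    closest one is farther than `threshold` (a possible novel animal)."""
    best_name, best_distance = None, float("inf")
    for name, embedding in database.items():
        d = squared_distance(query, embedding)
        if d < best_distance:
            best_name, best_distance = name, d
    return best_name if best_distance <= threshold else None


# Invented embeddings for two known sea lions.
database = {"lion_a": [0.0, 1.0], "lion_b": [1.0, 0.0]}
known = identify([0.1, 0.9], database, threshold=0.25)
novel = identify([5.0, 5.0], database, threshold=0.25)
```

With these toy numbers, `known` resolves to `"lion_a"` and `novel` comes back as `None`, which is the "outlier" case.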

Challenges yet to overcome:
  - We don't actually have the standard triplets. We have to make some assumptions about which sea lions are definitely not the same. 
    - Sea Lions recorded on disparate beaches on the same day are likely different lions.
  - How do we integrate flipper matching, face recognition, whisker recognition, etc. into one score?
  - Orders of magnitude fewer images. FaceNet is trained with at least 10 images each of >50 million faces; we have ~3000 images of an unknown number of unique sea lions.
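
The "disparate beaches on the same day" assumption can be turned into negative pairs for triplets with a few lines of bookkeeping. This is a sketch under loud assumptions: the `(image_id, beach, day)` record layout is hypothetical, and any two different beaches are treated as "disparate". A real version would need a distance cutoff between beaches.

```python
from datetime import date
from itertools import combinations


def likely_different_pairs(sightings):
    """Return image-id pairs that probably show different sea lions.

    `sightings` is a list of (image_id, beach, day) tuples. Two images
    are paired when they were taken on the same day at different beaches.
    """
    pairs = []
    for first, second in combinations(sightings, 2):
        image_a, beach_a, day_a = first
        image_b, beach_b, day_b = second
        if day_a == day_b and beach_a != beach_b:
            pairs.append((image_a, image_b))
    return pairs


# Invented sightings for illustration.
sightings = [
    ("img1", "north", date(2018, 5, 1)),
    ("img2", "south", date(2018, 5, 1)),
    ("img3", "north", date(2018, 5, 2)),
    ("img4", "north", date(2018, 5, 1)),
]
negative_pairs = likely_different_pairs(sightings)
```

This only supplies the "definitely not the same" half of a triplet. Positive pairs still need another source, such as multiple frames of one animal in a single session.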