# Titanic

## Can we leverage deep learning on irregular domains to save lives?

---

*Teo Stocco, Pierre-Alexandre Lee, Yves Lamonato, Charles Thiebaut*, [EPFL](https://epfl.ch).

[Network Tour of Data Science](https://github.com/mdeff/ntds_2017) final project. This notebook gives a detailed overview of the whole project and its essential parts. As this work required several attempts and much exploration, only the relevant parts are kept here. You can however access the associated (unguided) notebooks with the full process where mentioned. This project was **not shared** with any other class.


[Binder access](https://mybinder.org/v2/gh/zifeo/Titanic/) | [nbviewer access](https://nbviewer.jupyter.org/github/zifeo/Titanic/blob/master/project.ipynb)

TODO:

- spell check plugin
- isolate code blocks in functions in separate Python module
- abstract
- notebook toc plugin
- README with getting started

## 1 Introduction

Icebergs and ships do not get along well. To avoid dramatic events such as the one that happened a century ago, we aim to help a noble quest: telling icebergs apart from ships based on radar data, to see whether any iceberg is drifting and might cross the route of a ship.

|© Statoil/C-CORE - Icebergs and ships examples|
|-|
|![](./img/statoil-ccore.png)|

These remote sensing measurements can be performed either by planes or by satellites. The latter can provide radar information up to 14 times a day, as in the case of [Sentinel-1](https://fr.wikipedia.org/wiki/Sentinel-1). The C-band radar manages to capture data in numerous conditions (e.g. darkness, rain, cloud, fog) and measures the energy reflected back, called backscatter (Torres et al., 2012). These data can later be analyzed and used to rule out potential collisions between icebergs and ships.

Building on recent advances in signal processing on graphs (Shuman et al., 2013) and deep learning on irregular domains (Bronstein et al., 2017), we investigate the performance of standard machine learning methods and the relevance of graph-based convolutional neural networks for binary classification in this specific case. The new method provides a convenient way of getting rotational invariance over the data and sets up a flexible framework for structured pooling (Defferrard et al., 2016). Pooling operations require adequate aggregation by coarsening the graph between layers. We experiment with how this framework can be exploited through various processes: Graclus multilevel algorithm, ... TODO

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import torch
```

## 2 Data source

The dataset is provided by Statoil, an oil and gas company, and C-CORE, a monitoring company using computer vision, to keep operations safe and efficient. It was released on Kaggle as a prediction competition in late 2017. The full dataset contains 10,028 iceberg or ship cases, of which only 1,604 are labelled. Some of the images were computer generated to prevent hand labelling in the competition. As we only focus on the labelled ones, this should not matter.

### Description

- HH: transmit and receive horizontally
- HV: transmit horizontally and receive vertically

Quantitative / Qualitative


| Feature | Description | Type | Has N/A | Comment |
| - | - | - | - | - |
| id | | | | |
| band_1 | | | | |
| band_2 | | | | |
| inc_angle | | | | |
| is_iceberg | | | | |
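
As a rough illustration of how the features above might be turned into arrays, the sketch below stacks the two bands of each record into a two-channel image. It assumes that each band is stored as a flattened list of 75 × 75 values (the image size is not stated in this notebook) and uses a synthetic record instead of the real file. `records_to_images` is a hypothetical helper, not part of the project code.

```python
import numpy as np

SIDE = 75  # assumed image side length (not stated in this notebook)

def records_to_images(records, side=SIDE):
    """Stack band_1 (HH) and band_2 (HV) of each record into an array
    of shape (n_records, 2, side, side)."""
    images = []
    for rec in records:
        hh = np.asarray(rec["band_1"], dtype=np.float32).reshape(side, side)
        hv = np.asarray(rec["band_2"], dtype=np.float32).reshape(side, side)
        images.append(np.stack([hh, hv]))
    return np.stack(images)

# Synthetic stand-in for one labelled record (values are random noise)
rng = np.random.default_rng(0)
fake_records = [{
    "id": "example",
    "band_1": rng.normal(size=SIDE * SIDE).tolist(),
    "band_2": rng.normal(size=SIDE * SIDE).tolist(),
    "inc_angle": 40.0,
    "is_iceberg": 1,
}]

images = records_to_images(fake_records)
print(images.shape)
```

Keeping HH and HV as separate channels leaves the choice of combining them (or not) to the later feature extraction step.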

### Exploration

> More details on associated notebook [xx]()

some cases

3D picture 

### Visualization

PCA
clustering

## 3 Preprocessing

> More details on [associated notebook]()


```python
# set seeds for reproducibility
np.random.seed(0)
torch.manual_seed(0);
```

### Cleaning

Cause of N/A?

Only the angle is affected; our trials replaced it by the median, the mean, or a fixed number.
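
The three replacement strategies can be sketched as follows. This is a minimal illustration, assuming that missing angles are encoded as the string `"na"` in the raw data. `impute_angle` and its parameters are hypothetical names, and the sample values are made up.

```python
import numpy as np

def impute_angle(angles, strategy="median", fill_value=0.0):
    """Replace missing incidence angles (NaN) using the chosen strategy."""
    angles = np.asarray(angles, dtype=float).copy()
    missing = np.isnan(angles)
    if strategy == "median":
        value = np.nanmedian(angles)
    elif strategy == "mean":
        value = np.nanmean(angles)
    elif strategy == "fixed":
        value = fill_value
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    angles[missing] = value
    return angles

# Made-up raw values; assumption: missing entries are the string "na"
raw = ["38.2", "na", "43.9", "40.1"]
parsed = np.array([np.nan if a == "na" else float(a) for a in raw])

filled_median = impute_angle(parsed, "median")
filled_mean = impute_angle(parsed, "mean")
filled_fixed = impute_angle(parsed, "fixed", fill_value=0.0)
print(filled_median)
```

To avoid leaking information, the replacement value would normally be computed on the training split only and reused for the validation data.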

### Scaling/normalization

### Features

features extraction

### Validation set

## 4 Graphs

### Grid

circular

### 2D grid
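
One possible way to build such a graph is sketched below, assuming a 4-connected grid in which every pixel is linked to its horizontal and vertical neighbours with unit weights. The connectivity and weighting are assumptions for illustration, and `grid_adjacency` is a hypothetical helper.

```python
import numpy as np

def grid_adjacency(height, width):
    """Dense adjacency matrix of a 4-connected height x width pixel grid."""
    n = height * width
    idx = np.arange(n).reshape(height, width)  # node index of each pixel
    adj = np.zeros((n, n), dtype=np.float32)
    # link each pixel to its right-hand neighbour
    adj[idx[:, :-1].ravel(), idx[:, 1:].ravel()] = 1
    # link each pixel to the neighbour below it
    adj[idx[:-1, :].ravel(), idx[1:, :].ravel()] = 1
    # make the graph undirected
    return np.maximum(adj, adj.T)

A = grid_adjacency(3, 4)
print(A.shape, int(A.sum()) // 2)  # number of nodes squared, number of edges
```

A dense matrix is only practical for small grids; for full-size images a sparse representation would be preferable.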

### Pattern-based graph


## 5 Modelling

### Naive methods

knn, logistic
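
For reference, a k-nearest-neighbours baseline can be written in a few lines of NumPy. The sketch below assumes Euclidean distance and a majority vote over binary labels, and runs on a tiny hand-made dataset rather than on the radar features. `knn_predict` is a hypothetical name.

```python
import numpy as np

def knn_predict(x_train, y_train, x_test, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    # pairwise squared distances, shape (n_test, n_train)
    d2 = ((x_test[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]  # binary labels of the neighbours, shape (n_test, k)
    return (votes.mean(axis=1) > 0.5).astype(int)

# Toy data: class 0 around (-2, -2), class 1 around (2, 2)
x_train = np.array([[-2.0, -2.0], [-2.0, -1.0], [-1.0, -2.0],
                    [2.0, 2.0], [2.0, 1.0], [1.0, 2.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
x_test = np.array([[-1.5, -1.5], [1.5, 1.5]])

pred = knn_predict(x_train, y_train, x_test, k=3)
print(pred)
```

An odd `k` avoids ties in the binary vote.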

### Graph Convolution

fourier

chebyshev
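
The idea behind Chebyshev filtering (Defferrard et al., 2016) is to approximate a spectral filter by a polynomial of the graph Laplacian, evaluated with the recurrence $T_k(x) = 2x\,T_{k-1}(x) - T_{k-2}(x)$, so that no eigendecomposition is needed. The sketch below is a minimal NumPy illustration, not the implementation used in this project. It assumes the symmetric normalized Laplacian, whose eigenvalues are bounded by 2, and the function names and the toy path graph are hypothetical.

```python
import numpy as np

def normalized_laplacian(adj):
    """L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg, dtype=float)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def chebyshev_filter(lap, x, theta, lmax=2.0):
    """Apply sum_k theta[k] * T_k(L_scaled) to the signal x.

    L_scaled = 2 L / lmax - I maps the spectrum into [-1, 1].
    """
    n = lap.shape[0]
    lap_scaled = 2.0 * lap / lmax - np.eye(n)
    t_prev = x                        # T_0(L_scaled) x = x
    out = theta[0] * t_prev
    if len(theta) > 1:
        t_curr = lap_scaled @ x       # T_1(L_scaled) x = L_scaled x
        out = out + theta[1] * t_curr
        for k in range(2, len(theta)):
            t_next = 2.0 * (lap_scaled @ t_curr) - t_prev
            out = out + theta[k] * t_next
            t_prev, t_curr = t_curr, t_next
    return out

# Toy example: path graph with 4 nodes and an impulse on node 0
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
L = normalized_laplacian(adj)
x = np.array([1.0, 0.0, 0.0, 0.0])
theta = np.array([0.5, 0.3, 0.2])     # arbitrary coefficients, order K = 2

y = chebyshev_filter(L, x, theta)
print(y)
```

With three coefficients the filter has order 2, so the impulse on node 0 cannot reach node 3, which is three hops away. This locality is what makes the filters comparable to small convolution kernels on a regular grid.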

### Pooling

The Graclus implementation is courtesy of Michaël Defferrard.

korn?

algebraic (spectral cut)

MST

pattern-based

## 6 Training

### Train/test split

### Skorch

## 7 Evaluation

### Models

### Stacking

## 8 Conclusion

### Future work

### Acknowledgements

## 9 References

- TORRES, Ramon, SNOEIJ, Paul, GEUDTNER, Dirk, et al. GMES Sentinel-1 mission. Remote Sensing of Environment, 2012, vol. 120, p. 9-24.
- SHUMAN, David I., NARANG, Sunil K., FROSSARD, Pascal, et al. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine, 2013, vol. 30, no 3, p. 83-98.
- BRONSTEIN, Michael M., BRUNA, Joan, LECUN, Yann, et al. Geometric deep learning: going beyond euclidean data. IEEE Signal Processing Magazine, 2017, vol. 34, no 4, p. 18-42.
- DEFFERRARD, Michaël, BRESSON, Xavier, et VANDERGHEYNST, Pierre. Convolutional neural networks on graphs with fast localized spectral filtering. In : Advances in Neural Information Processing Systems. 2016. p. 3844-3852.