# Principles and Patterns for ML Practitioners

### S.O.L.I.D (and more) principles applied to an ML problem

##### By Wolfgang Giersche, Zühlke Engineering AG

![solid gold](images/solid_gold.jpg)

---

# Principles and Practices in Code

#### - Motivation: Typical Python ML code
#### - SWE's S.O.L.I.D Principles
#### - Background: Machine Learning with Tensorflow
#### - Tutorial: Structured Experiments in Python

# Principles and Practices in Collaboration

#### - Explore - Experiment - Build - Infer

---

# Motivation

[The official Tensorflow MNIST example](mnist_original.py)

---

# There's More to Code Than Coding

## Minimize the learning curve for those who come after you
## Code is written once, but read and changed many times
## Dare to touch a running system: make it easy to change
## Reduce the effort needed for testing
## Minimize dependencies and reduce complexity


---

## Precisely because data analytics and machine learning

## are *exploratory* by nature,

## practices should support *code and config changes*

## *without endangering* the quality of the code.

---

# The Anatomy of a Machine Learning Experiment
![Anatomy of an ML epic](images/Anatomy-of-an-experiment.png)


---

# Principles to the Rescue: S.O.L.I.D
The [S.O.L.I.D. Principles](http://www.cvc.uab.es/shared/teach/a21291/temes/object_oriented_design/materials_adicionals/principles_and_patterns.pdf) 
are commonly attributed to [Robert C. Martin (Uncle Bob)](https://de.wikipedia.org/wiki/Robert_Cecil_Martin).

### SRP = Single Responsibility Principle
### OCP = Open-Closed Principle
### LSP = Liskov Substitution Principle
### ISP = Interface Segregation Principle
### DIP = Dependency Inversion Principle
#### ...and following those principles leads to patterns
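
As a minimal, framework-free sketch of how some of these principles might look in an ML setting: the evaluation code below depends only on an abstract `Model` (DIP), so new models can be added without touching it (OCP) and any implementation can stand in for another (LSP). All class and function names here are hypothetical illustrations and are not taken from the tutorial code.

```python
from abc import ABC, abstractmethod
from typing import Sequence


class Model(ABC):
    """Abstraction that the evaluation code depends on (DIP)."""

    @abstractmethod
    def predict(self, features: Sequence[float]) -> int:
        """Return a class label for one sample."""


class ThresholdModel(Model):
    """Predicts class 1 when the feature sum exceeds a threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def predict(self, features: Sequence[float]) -> int:
        return int(sum(features) > self.threshold)


class ConstantModel(Model):
    """Baseline that always predicts the same label."""

    def __init__(self, label: int) -> None:
        self.label = label

    def predict(self, features: Sequence[float]) -> int:
        return self.label


def accuracy(model: Model, samples, labels) -> float:
    """Evaluation has a single job (SRP) and accepts any Model (LSP, OCP)."""
    hits = sum(model.predict(x) == y for x, y in zip(samples, labels))
    return hits / len(labels)


samples = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.9], [0.0, 0.1]]
labels = [0, 1, 1, 0]
threshold_acc = accuracy(ThresholdModel(1.0), samples, labels)
constant_acc = accuracy(ConstantModel(0), samples, labels)
```

Adding a third model means writing one more `Model` subclass; `accuracy` stays unchanged.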

---

# Background: Tensorflow

### Tensorflow already sports an extremely helpful design

### The actual processing is described by a computational graph

### ```Dataset```s, ```Estimator```s, and ```Tower```s manage the training for you


The content here is heavily inspired by the 
[TensorFlow models repository on GitHub](https://github.com/tensorflow/models/tree/master/official/mnist). 
The code was initially copied from there and then significantly refactored to demonstrate how SWE patterns and principles make it more readable, testable, and reusable.

We're using [Zalando Research's Fashion Dataset](https://github.com/zalandoresearch/fashion-mnist)
in addition to the well-known [Handwritten Digits](http://yann.lecun.com/exdb/mnist/).

---

The pipeline | the neural network
- | - 
![alt](images/ds-pipeline.png) | ![alt](images/nn-training.png)

---

# Tensorflow Building Blocks
##### I am using the most current TF API 1.8.0 with the following building blocks:

- [Tensorflow Dataset API](https://www.tensorflow.org/programmers_guide/datasets)
    - Allows for preprocessing with a monadic API (`map`, `flat_map`, etc.)
    - Preprocessing can even run in a parallel, streaming fashion
    
- [Estimator API](https://www.tensorflow.org/programmers_guide/estimators)
    - Very convenient high-level API
    - Checkpointing and recovery
    - Tensorboard summaries
    - Much more...
    
- [Multi-GPU Training of contrib.estimator package](https://www.tensorflow.org/api_docs/python/tf/contrib/estimator/)
    - Convenient wrapper to distribute training across any number of GPUs on a single machine
    - Works by means of synchronous gradient averaging over parallel mini-batches
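
To illustrate the idea of synchronous gradient averaging, here is a minimal pure-Python sketch. It is not the `contrib.estimator` implementation: the one-parameter linear model, the function names, and the assumption that the batch divides evenly among the towers are all made up for illustration.

```python
def mse_gradient(w, shard):
    """Gradient of mean((w*x - y)**2) with respect to w over one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def split_batch(batch, num_towers):
    """Split a batch into equally sized shards, one per tower (GPU).

    Assumes len(batch) is divisible by num_towers.
    """
    size = len(batch) // num_towers
    return [batch[i * size:(i + 1) * size] for i in range(num_towers)]


def synchronous_step(w, batch, num_towers, learning_rate):
    """One update: every tower computes a gradient, then the average is applied."""
    shards = split_batch(batch, num_towers)
    # On real hardware each tower would compute its gradient in parallel.
    tower_grads = [mse_gradient(w, shard) for shard in shards]
    avg_grad = sum(tower_grads) / len(tower_grads)
    return w - learning_rate * avg_grad


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_multi = synchronous_step(0.0, batch, num_towers=2, learning_rate=0.01)
w_single = synchronous_step(0.0, batch, num_towers=1, learning_rate=0.01)
```

With equally sized shards, the averaged tower gradient equals the gradient over the full batch, so `w_multi` and `w_single` agree.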

---

### The ```Dataset``` API

``` python
def train_input_fn():
    # Training input: the 95% share of the 60000 training records.
    ds_tr = dataset.training_dataset(hparams.data_dir, DATA_SET)
    ds_tr_tr, _ = split_datasource(ds_tr, 60000, 0.95)
    # Shuffle buffer covers the whole training share (0.95 * 60000 = 57000).
    ds1 = (ds_tr_tr.cache()
           .shuffle(buffer_size=57000)
           .repeat(hparams.train_epochs)
           .batch(hparams.batch_size))
    return ds1

def eval_input_fn():
    # Evaluation input: the remaining 5%, batched once without shuffling.
    ds_tr = dataset.training_dataset(hparams.data_dir, DATA_SET)
    _, ds_tr_ev = split_datasource(ds_tr, 60000, 0.95)
    ds2 = ds_tr_ev.batch(hparams.batch_size)
    return ds2
```
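
The implementation of `split_datasource` is not shown on this slide. One plausible sketch, assuming it simply cuts the dataset at a fixed position, relies only on `take` and `skip`, both of which `tf.data.Dataset` provides. The `ListDataset` class is a hypothetical list-backed stand-in so that the sketch runs without TensorFlow.

```python
def split_datasource(ds, num_records, ratio):
    """Split `ds` into a leading share of `ratio` and the remainder.

    Hypothetical sketch: works on any object offering take(n) and skip(n).
    """
    # round() guards against float truncation such as 56999.999... -> 56999.
    num_first = int(round(num_records * ratio))
    return ds.take(num_first), ds.skip(num_first)


class ListDataset:
    """Tiny stand-in that exposes take/skip in the style of tf.data.Dataset."""

    def __init__(self, items):
        self.items = list(items)

    def take(self, count):
        return ListDataset(self.items[:count])

    def skip(self, count):
        return ListDataset(self.items[count:])


train_part, eval_part = split_datasource(ListDataset(range(100)), 100, 0.95)
```

A positional split like this only keeps the two parts disjoint if the records arrive in the same order on every call, which is why any shuffling would have to come after the split.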

---

### The ```Estimator``` API
Create an ```Estimator``` by passing a *model function* to the constructor:

``` python
mnist_classifier = tf.estimator.Estimator(
    model_fn=model_function,
    model_dir=hparams.model_dir,
    params={
        'data_format': data_format,
        'multi_gpu': hparams.multi_gpu
    })
```

The model function must return an appropriate ```EstimatorSpec``` for each mode: 'TRAIN', 'EVAL', or 'PREDICT'. We create it in its own module using a given ```Model```.

A ```Model``` is the function that actually creates the graph. Two possible implementations can be found in their own modules in the ```models``` package.

``` python
model_function = create_model_fn(
    lambda params: Model(params),
    tf.train.AdamOptimizer(),
    tf.losses.sparse_softmax_cross_entropy,
    hparams)
```
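
The body of `create_model_fn` lives in its own module and is not shown here. The following framework-free sketch only illustrates the underlying pattern: a factory closes over its collaborators and returns a model function that dispatches on the mode. `Spec` is a hypothetical stand-in for `EstimatorSpec`, the mode constants are plain strings chosen for this sketch, and the toy model, loss, and optimizer are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for EstimatorSpec, reduced to the fields used here.
Spec = namedtuple("Spec", ["mode", "predictions", "loss", "train_op"])

# Plain-string mode names for this sketch only.
TRAIN, EVAL, PREDICT = "train", "eval", "predict"


def create_model_fn(model_factory, optimizer, loss_fn, hparams):
    """Return a model function that closes over its collaborators.

    `hparams` is accepted to mirror the call on the slide; this sketch
    does not use it.
    """

    def model_fn(features, labels, mode, params):
        model = model_factory(params)
        predictions = model(features)
        if mode == PREDICT:
            # No labels available: only predictions are returned.
            return Spec(mode, predictions, loss=None, train_op=None)
        loss = loss_fn(labels, predictions)
        if mode == EVAL:
            return Spec(mode, predictions, loss, train_op=None)
        return Spec(mode, predictions, loss, train_op=optimizer(loss))

    return model_fn


# Toy collaborators: a scaling "model", mean squared error, a fake optimizer.
model_fn = create_model_fn(
    model_factory=lambda params: (lambda xs: [x * params["scale"] for x in xs]),
    optimizer=lambda loss: "minimize(%.2f)" % loss,
    loss_fn=lambda ys, ps: sum((y - p) ** 2 for y, p in zip(ys, ps)) / len(ys),
    hparams=None,
)
train_spec = model_fn([1.0, 2.0], [2.0, 4.0], TRAIN, {"scale": 2.0})
predict_spec = model_fn([1.0], None, PREDICT, {"scale": 2.0})
```

Because the model, the loss, and the optimizer are all passed in, each of them can be swapped or tested in isolation without editing the model function itself.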

---

## Train
``` python
import time

# Measure the wall-clock duration of the training run.
start_time = time.time()
mnist_classifier.train(input_fn=train_input_fn, hooks=[logging_hook])
duration = time.time() - start_time
```

## Evaluate
``` python
eval_results = mnist_classifier.evaluate(input_fn=eval_input_fn)
accuracy = eval_results['accuracy']
steps = eval_results['global_step']
# Truncate the training duration measured above to whole seconds.
duration = int(duration)
```

---

# Tutorial

[run_experiment.ipynb](run_experiment.ipynb)

---

# Explore - Experiment - Train - Inference
![ex-ex-tr-inf](images/Ex-Ex-Tr-Inf.png)