In [None]:
%matplotlib inline
import matplotlib
import seaborn as sns
sns.set()
matplotlib.rcParams['figure.dpi'] = 144

In [None]:
import numpy as np
import matplotlib.pyplot as plt
matplotlib.rcParams['figure.dpi'] = 100

import time
from datetime import datetime, timedelta

In [None]:
#Start Spark with BigDL support
from pyspark import SparkContext
import bigdl
import bigdl.util.common
sc = SparkContext.getOrCreate(conf=bigdl.util.common.create_spark_conf().setMaster("local[3]")
                              .set("spark.driver.memory","1g"))
bigdl.util.common.init_engine()

# Keras, TensorFlow, Caffe

BigDL is still a very new player in the deep learning space.  TensorFlow and Caffe are much more established, and there are many systems that use them and pre-trained models you can download, such as the Inception model used in DeepDream.  In addition, Keras is slowly becoming a standardized interface for deep learning.

As such, BigDL has support for all three.  We can load models in all three formats, and design our models for BigDL in Keras with a little work.  BigDL is not yet a proper Keras backend, but the way models are designed in BigDL is very similar to how they are designed in Keras.  This has much to do with history, as BigDL's interface is actually modeled on that of Torch, another, older deep learning interface.

## Working with existing models

It is straightforward to save and load native BigDL models.  If we have a trained model, we can save it to a file (well, two files) with the `saveModel` function.  Let's do this with our simple logistic regressor from the Intro notebook.

In [None]:
import matplotlib.pyplot as plt
from bigdl.util.common import Sample

centers = np.array([[0, 0]] * 150 + [[1, 0.5]] * 150 + [[0,1]]*150)
np.random.seed(41)
data = np.random.normal(0, 0.2, (450, 2)) + centers
labels = np.array([[1]] * 150 + [[2]] * 150 + [[3]]*150)

#Real data, of course, will be coming from elsewhere and will most likely already be an RDD at this point
# Materialize the pairs as a list so they have a length and can be reused
data_with_labels = list(zip(data, labels))
samples = sc.parallelize(data_with_labels).map(lambda x: Sample.from_ndarray(x[0],x[1]))

plt.scatter(data[:,0], data[:,1], c=labels.reshape(-1), cmap=plt.cm.brg)
plt.colorbar();

In [None]:
from bigdl.nn import layer
from bigdl.nn import criterion 
from bigdl.optim import optimizer

lin = layer.Linear(2,3)()
soft = layer.SoftMax()(lin)
model = layer.Model([lin],[soft])

fitter = optimizer.Optimizer(model=model, training_rdd=samples, 
                             criterion=criterion.ClassNLLCriterion(logProbAsInput=False), 
                             optim_method=optimizer.SGD(0.05), end_trigger=optimizer.MaxEpoch(20), 
                             batch_size=60)

trained_model = fitter.optimize()

def get_accuracy(predicts, trues):
    return sum([int(predicts[i] == trues[i]) for i in range(len(predicts))]) * 1.0 / len(trues)

predictions = model.predict(samples).map(lambda x: x.argmax() + 1).collect()
get_accuracy(predictions, labels)
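
The accuracy helper above compares one element at a time.  As a minimal sketch, assuming only NumPy, the same quantity can be computed in vectorized form.  The name `get_accuracy_np` and the example arrays are hypothetical and not part of the original notebook; flattening both inputs means the `(n, 1)` label array and the flat list of predictions can be compared directly.

```python
import numpy as np

def get_accuracy_np(predicts, trues):
    """Fraction of predictions that match the true labels."""
    # Flatten both inputs so (n, 1) labels and flat predictions line up
    predicts = np.asarray(predicts).reshape(-1)
    trues = np.asarray(trues).reshape(-1)
    if predicts.shape != trues.shape:
        raise ValueError("predictions and labels must have the same length")
    return float(np.mean(predicts == trues))

# Illustrative values only: labels shaped (n, 1), predictions as a flat list
example_labels = np.array([[1], [2], [3], [3]])
example_predictions = [1, 2, 3, 1]
print(get_accuracy_np(example_predictions, example_labels))
```

With the notebook's objects, this would be called as `get_accuracy_np(predictions, labels)`.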

In [None]:
# Order: file for graph/model shape, file for weights, whether to
# overwrite existing model.  These can be local files, HDFS, or S3
model.saveModel('./models/logistic.graph', './models/logistic.weights', True)

The format is an internal BigDL-specific binary format.  It is fairly compact, but very much not human readable.

To load it back in, we merely need to call `loadModel` with the same files.

In [None]:
loaded_model = layer.Model.loadModel('./models/logistic.graph', './models/logistic.weights')

And we can confirm that it's the same model by making the same prediction:

In [None]:
predictions = loaded_model.predict(samples).map(lambda x: x.argmax() + 1).collect()
get_accuracy(predictions, labels)
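
Matching accuracy is a fairly weak check, since two different models could score the same.  A stricter comparison looks at the raw outputs themselves.  The sketch below assumes only NumPy; `outputs_match` and the stand-in arrays are hypothetical names used for illustration, and it assumes both sets of outputs come back in the same order.

```python
import numpy as np

def outputs_match(outputs_a, outputs_b, tol=1e-6):
    """True if two collections of output vectors agree within a tolerance."""
    a = np.asarray(outputs_a, dtype=float)
    b = np.asarray(outputs_b, dtype=float)
    return a.shape == b.shape and bool(np.allclose(a, b, atol=tol))

# Stand-in arrays; with the notebook's objects one would instead pass
# model.predict(samples).collect() and loaded_model.predict(samples).collect()
original = [np.array([0.7, 0.2, 0.1]), np.array([0.1, 0.8, 0.1])]
reloaded = [np.array([0.7, 0.2, 0.1]), np.array([0.1, 0.8, 0.1])]
print(outputs_match(original, reloaded))
```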

## Working with TensorFlow

BigDL is quite interoperable with TensorFlow.  We can save our models to TF format and we can read in a TF model (both trained and untrained).

Note that we defined our model using the Functional interface above.  That was intentional, as only the Functional models can be imported and exported to TensorFlow.  TensorFlow also needs to have names for the placeholder variables used for input, and their shape.  We'll have to supply that as well.

In [None]:
# Export to TF, untrained model
# Order: list of inputs in the form (name, shape), then the file to save
# to.  .pb is TF's format
model.save_tensorflow([('input',[10,2])], './models/logistic.pb')

We can read in a trained model from TensorFlow.  If it's a checkpoint (like BigDL's checkpoint files, these save the state of a model as it is trained), we'll need to do an extra step.

In TF, these checkpoints are traditionally named `something.ckpt.*`, with the extra extensions being `.meta`, `.data` and `.index`.  BigDL comes with a tool called `export_tf_checkpoint.py` that turns these into the `.pb` and `.bin` files it needs.  This is an external Python script that you need to run separately on the checkpoint files.  The resulting files contain the graph definition (`.pb`) and the weight values (`.bin`).
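
Before running the conversion, it can help to confirm that all the pieces of the checkpoint are actually present.  The following is a minimal sketch using only the standard library; the helper name `checkpoint_parts` and the demo file names are hypothetical, and it assumes the data part may carry a shard suffix such as `.data-00000-of-00001`.

```python
import glob
import os
import tempfile

def checkpoint_parts(prefix):
    """Report which pieces of a TF checkpoint exist for a given prefix."""
    return {
        "meta": os.path.exists(prefix + ".meta"),
        "index": os.path.exists(prefix + ".index"),
        # The data part is usually sharded, e.g. .data-00000-of-00001
        "data": len(glob.glob(glob.escape(prefix) + ".data*")) > 0,
    }

# Demo with empty placeholder files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    demo_prefix = os.path.join(tmp, "something.ckpt")
    for suffix in (".meta", ".index", ".data-00000-of-00001"):
        open(demo_prefix + suffix, "w").close()
    demo_report = checkpoint_parts(demo_prefix)

print(demo_report)
```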

Once you have these, it's straightforward-ish to load the model.  You'll need to know what the input and output placeholders/variables were in the original TF graph.  We'll just read in the one we just wrote out and leave off the `.bin` part, so we'll get an untrained model.  If we had the `.bin` file, we'd need to add a `bin_file=` flag to our call.  Since this was a BigDL model, the output layer got a rather strange name.

In [None]:
outname = model.flattened_layers()[-1].name()
print(outname)

In [None]:
model = layer.Model.load_tensorflow('./models/logistic.pb', ['input'], [outname])

## Importing a trained model: Caffe

In Caffe, things are stored in external files by default, so we don't need special saving strategies.  If we have a text file describing a network architecture (called a `prototxt`) and the trained model file, importing is another simple call:

```python
# Model lives in the layer module imported earlier (from bigdl.nn import layer)
model = layer.Model.load_caffe_model(prototxt_filename, model_filename)
```

## Importing a trained model: Keras

Keras models are stored in JSON files, with trained weights in the `HDF5` binary format.  Keras is a front end rather than a full learning system, so it relies on one of several backends, notably TensorFlow and Theano.  While the `HDF5` files should be independent of the choice of backend, BigDL is only tested with files from the TensorFlow backend.

To load an untrained model, simply call:

```python
keras_model = layer.Model.load_keras(json_path=json_file)
```

With trained weights, you just need to also specify the `HDF5` file:

```python
keras_model = layer.Model.load_keras(json_path=json_file,
                                     hdf5_path=hdf5_file,
                                     by_name=False)
```

The `by_name` flag is optional and defaults to False.  When it is False, BigDL ignores the names of the layers and fills in the model from the `HDF5` file by just following the architecture.  When it is True, BigDL reads in only those layers that have the same name in both files; layers in the JSON that are not in the `HDF5` file are given default weight values.
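
The difference between the two modes can be illustrated with a toy sketch in plain Python.  This is not BigDL's implementation: `assign_weights`, the layer names, and the choice to raise an error when the layer counts differ are all assumptions made purely for illustration.

```python
def assign_weights(model_layers, saved_layers, by_name=False, default=None):
    """Toy illustration of positional versus by-name weight loading.

    model_layers: layer names taken from the JSON architecture
    saved_layers: (name, weights) pairs taken from the weights file
    """
    if by_name:
        saved = dict(saved_layers)
        # Only layers whose names appear in both places get saved weights;
        # the rest fall back to a default value
        return {name: saved.get(name, default) for name in model_layers}
    if len(model_layers) != len(saved_layers):
        # Assumption for this sketch: positional loading needs matching counts
        raise ValueError("positional loading needs the same number of layers")
    # Names in the weights file are ignored; weights are assigned in order
    return {name: w for name, (_, w) in zip(model_layers, saved_layers)}

architecture = ["dense_a", "dense_b"]
weights_file = [("dense_a", [0.1, 0.2]), ("other", [0.3])]

positional = assign_weights(architecture, weights_file, by_name=False)
named = assign_weights(architecture, weights_file, by_name=True, default="default init")
print(positional)
print(named)
```

In the positional case `dense_b` receives the weights stored under `other`, while in the by-name case it keeps its default value because no layer of that name exists in the weights file.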

## Building a model in Keras

Since BigDL is not a supported Keras backend yet, we cannot _train_ directly using Keras.  We can, however, make our models in Keras and then read them into BigDL.  Designing a model in Keras is much like designing models in BigDL, with both Sequential and Functional interfaces.  All Keras models have a `to_json()` function, which outputs the model structure as a JSON string.  If you write this to a file, you can then use the load call above.
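
Writing the JSON string to a file needs nothing beyond the standard library.  The sketch below uses a stand-in string in place of the output of a real model's `to_json()`, so that it runs without Keras installed; `write_model_json` and the stand-in contents are hypothetical and only illustrate the file handling.

```python
import json
import os
import tempfile

def write_model_json(json_string, path):
    """Write an architecture string to disk after checking that it parses."""
    json.loads(json_string)  # raises ValueError if the string is not valid JSON
    with open(path, "w") as f:
        f.write(json_string)
    return path

# Stand-in for the string returned by a Keras model's to_json()
fake_json = json.dumps({"class_name": "Sequential", "config": []})

with tempfile.TemporaryDirectory() as tmp:
    json_file = write_model_json(fake_json, os.path.join(tmp, "model.json"))
    # Read the file back to confirm it round-trips
    with open(json_file) as f:
        round_trip = json.load(f)

print(round_trip["class_name"])
```

With a real Keras model, the call would be `write_model_json(keras_model.to_json(), json_file)`, and the resulting path is what `load_keras` expects as `json_path`.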

Why bother with this extra complication?  There are a lot of models out there already in Keras form that you can download and manipulate or use as a jumping-off point.  And many deep learning utilities support Keras to some extent, so saving to the Keras JSON format means having your model in a format that is widely usable.

*Copyright &copy; 2018 The Data Incubator.  All rights reserved.*