# save and restore

following along [here](https://www.tensorflow.org/programmers_guide/saved_model)

use `tf.train.Saver` to save and restore models

In [22]:
import tensorflow as tf

import utils

## save and restore variables

best target for saving: variables (they already represent persistent state, now they persist between programs instead of just session runs). use `tf.train.Saver` to create a `save` and `restore` operation and then execute these.

tensorflow save files have a "binary checkpoint" format mapping variable names to tensor values

### save variables

`tf.train.Saver.save` is the go-to:

In [23]:
# Create some variables.
v1 = tf.get_variable("v1", shape=[3], initializer = tf.zeros_initializer)
v2 = tf.get_variable("v2", shape=[5], initializer = tf.zeros_initializer)

In [24]:
# create some tensors as outputs of the assign operation
inc_v1 = v1.assign(v1 + 1)
dec_v2 = v2.assign(v2 - 1)

In [25]:
# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()

In [26]:
# Add ops to save and restore all the variables.
saver = tf.train.Saver()

In [27]:
# Later, launch the model, initialize the variables, do some work, and save the
# variables to disk.
with tf.Session() as sess:
    sess.run(init_op)
    
    # Do some work with the model.
    inc_v1.op.run()
    dec_v2.op.run()
    
    # Save the variables to disk.
    save_path = saver.save(sess, "/tmp/model.ckpt")
    
    print("Model saved in path: {}".format(save_path))

Model saved in path: /tmp/model.ckpt


In [28]:
ll /tmp/

total 24
-rw-r--r-- 1 zlamberty   87 Jul  3 18:50 checkpoint
-rw-r--r-- 1 zlamberty   32 Jul  3 18:50 model.ckpt.data-00000-of-00001
-rw-r--r-- 1 zlamberty  143 Jul  3 18:50 model.ckpt.index
-rw-r--r-- 1 zlamberty 3931 Jul  3 18:50 model.ckpt.meta
drwx------ 2 root      4096 Apr 28 00:42 tmp51j6z2lb_kernels/
drwx------ 2 root      4096 Jun 14 19:00 tmpgk_8bk3y_kernels/


In [29]:
!cat /tmp/checkpoint

model_checkpoint_path: "/tmp/model.ckpt"
all_model_checkpoint_paths: "/tmp/model.ckpt"
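
the `checkpoint` file above is just a small text file recording the most recent checkpoint prefix. `tf.train.latest_checkpoint(directory)` is the supported way to read it; the sketch below does the same lookup by hand so the format is visible. the helper name `read_checkpoint_state` and the temp directory are made up for illustration, and the regex assumes the simple `key: "value"` layout shown above.

```python
import os
import re
import tempfile


def read_checkpoint_state(path):
    """Parse a `checkpoint` state file into {key: [values, ...]}.

    Hypothetical helper: assumes one `key: "value"` pair per line.
    """
    state = {}
    with open(path) as fh:
        for line in fh:
            match = re.match(r'\s*(\w+):\s*"(.*)"\s*$', line)
            if match:
                state.setdefault(match.group(1), []).append(match.group(2))
    return state


# write a copy of the file contents shown above into a throwaway directory
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "checkpoint")
with open(demo_file, "w") as fh:
    fh.write('model_checkpoint_path: "/tmp/model.ckpt"\n')
    fh.write('all_model_checkpoint_paths: "/tmp/model.ckpt"\n')

state = read_checkpoint_state(demo_file)
latest = state["model_checkpoint_path"][0]
print(latest)
```

the value is a *prefix*, not a file: the actual data lives in the `.index`, `.meta`, and `.data-*` files listed earlier.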


### quick aside on nuking the default graph

how did I not see `tf.reset_default_graph()` before now??

In [30]:
utils.show_graph(tf.get_default_graph().as_graph_def())

In [31]:
tf.reset_default_graph()

In [32]:
utils.show_graph(tf.get_default_graph().as_graph_def())

### restore variables

we saved 'em. big whoop. show me how to restore them and I'll *really* be impressed

In [33]:
# well I'll be damned, first time I've seen this, and it's pretty goddamn useful
tf.reset_default_graph()

In [35]:
# Create some variables.
v1 = tf.get_variable("v1", shape=[3])
v2 = tf.get_variable("v2", shape=[5])

In [36]:
# Add ops to save and restore all the variables.
saver = tf.train.Saver()

In [37]:
# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
    # Check the values of the variables
    print("v1 : %s" % v1.eval())
    print("v2 : %s" % v2.eval())

INFO:tensorflow:Restoring parameters from /tmp/model.ckpt
Model restored.
v1 : [1. 1. 1.]
v2 : [-1. -1. -1. -1. -1.]


### choose variables to save and restore

by default, `tf.train.Saver` will save *all* variables in the graph, and will save them with a key being the name of the variable as it was created

+ want different keys / names? want to pick up a variable named `x` but call it `y` this time? provided example: `weights --> params`.
+ want to save only one of a berjuilion variables?

`tf.train.Saver` takes a list or mapping of variable names to pay attention to:

In [79]:
tf.reset_default_graph()
# Create some variables.
v1 = tf.get_variable("v1", [3], initializer = tf.zeros_initializer)
v2 = tf.get_variable("v2", [5], initializer = tf.zeros_initializer)

# Add ops to save and restore only `v2` using the name "v2"
saver = tf.train.Saver({"v2": v2})

# Use the saver object normally after that.
with tf.Session() as sess:
    # Initialize v1 since the saver will not.
    v1.initializer.run()
    saver.restore(sess, "/tmp/model.ckpt")
  
    print("v1 : %s" % v1.eval())
    print("v2 : %s" % v2.eval())

INFO:tensorflow:Restoring parameters from /tmp/model.ckpt
v1 : [0. 0. 0.]
v2 : [-1. -1. -1. -1. -1.]


so, in the above, we told `saver` to only care about `v2`, and lo! only `v2` was restored when we called `saver.restore`. not bad.

good idea: we can have multiple savers for different contexts / intentions.
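
the renaming case from the list above (`weights --> params`) isn't exercised by that cell, so here is a minimal sketch of it. assumptions: it imports `tensorflow.compat.v1` so that it also runs under tensorflow 2.x (on 1.x a plain `import tensorflow as tf` would be the equivalent), and it writes to a throwaway temp directory rather than `/tmp/model.ckpt`. the variable names are just the ones from the docs' example.

```python
import os
import tempfile

# assumption: compat import so the 1.x-style API also works on tensorflow 2.x
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()

ckpt_path = os.path.join(tempfile.mkdtemp(), "model.ckpt")

# first graph: save a variable under the checkpoint key "weights"
tf.reset_default_graph()
weights = tf.get_variable("weights", shape=[3], initializer=tf.ones_initializer())
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.train.Saver().save(sess, ckpt_path)

# second graph: the variable is now called "params", but the mapping tells
# the saver to fill it from the checkpoint entry named "weights"
tf.reset_default_graph()
params = tf.get_variable("params", shape=[3], initializer=tf.zeros_initializer())
saver = tf.train.Saver({"weights": params})
with tf.Session() as sess:
    saver.restore(sess, ckpt_path)  # no initializer needed, restore assigns the value
    restored = sess.run(params)

print(restored)
```

the dict keys are checkpoint names and the values are the variables in the current graph, so the same trick works for loading one checkpoint into a differently-scoped model.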

### inspect variables in a checkpoint

you can inspect variables in a checkpoint with the `inspect_checkpoint` library (it lives at `tensorflow.python.tools.inspect_checkpoint`, not directly under `tf`):

In [83]:
# import the inspect_checkpoint library
from tensorflow.python.tools import inspect_checkpoint as chkp

In [86]:
chkp.print_tensors_in_checkpoint_file?

In [87]:
# print all tensors in checkpoint file
chkp.print_tensors_in_checkpoint_file(
    file_name="/tmp/model.ckpt",
    tensor_name='',
    all_tensors=True
)

tensor_name:  v1
[1. 1. 1.]
tensor_name:  v2
[-1. -1. -1. -1. -1.]


In [88]:
# print only tensor v1 in checkpoint file
chkp.print_tensors_in_checkpoint_file(
    "/tmp/model.ckpt",
    tensor_name='v1',
    all_tensors=False
)

tensor_name:  v1
[1. 1. 1.]


In [89]:
# print only tensor v2 in checkpoint file
chkp.print_tensors_in_checkpoint_file(
    "/tmp/model.ckpt",
    tensor_name='v2',
    all_tensors=False
)

tensor_name:  v2
[-1. -1. -1. -1. -1.]


not bad

## save and restore models

variables are cool and all but what bout dem models boi?

claim: `SavedModel` is used to save and load models. this is meant to be language-neutral. there are several methods of interacting with `SavedModel` discussed below

## build and load a `SavedModel`

### simple save

just use `tf.saved_model.simple_save` *a la*

```python
tf.saved_model.simple_save(
    session,
    export_dir,
    inputs={'x': x, 'y': y},
    outputs={'z': z}
)
```

this presupposes the model is built and defined with the stated inputs and outputs. my assumption is that the entire intermediary graph is then persisted intact, and all you need to know on re-instantiation is that it will expect those inputs and emit those outputs.
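
a quick way to test that assumption is a round trip: save a tiny graph with `simple_save`, load it into a fresh graph, and run it by looking the tensor names up in the signature. this is a sketch under assumptions: the `tensorflow.compat.v1` import is there so it also runs on tensorflow 2.x, the toy "model" `z = x * w` and the temp export directory are made up, and it relies on `simple_save` using the `serve` tag and the `serving_default` signature key.

```python
import os
import tempfile

# assumption: compat import so the 1.x-style API also works on tensorflow 2.x
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()

# simple_save refuses to write into an existing directory, so use a fresh subdir
export_dir = os.path.join(tempfile.mkdtemp(), "export")

# toy model: z = x * w, with w a variable so there is state to persist
tf.reset_default_graph()
x = tf.placeholder(tf.float32, shape=[None], name="x")
w = tf.get_variable("w", shape=[], initializer=tf.constant_initializer(2.0))
z = tf.multiply(x, w, name="z")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.saved_model.simple_save(sess, export_dir, inputs={"x": x}, outputs={"z": z})

# load into a brand new graph; nothing from the code above is reused
with tf.Session(graph=tf.Graph()) as sess:
    meta_graph = tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], export_dir
    )
    signature = meta_graph.signature_def[
        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
    ]
    x_name = signature.inputs["x"].name    # tensor name of the input
    z_name = signature.outputs["z"].name   # tensor name of the output
    result = sess.run(z_name, feed_dict={x_name: [1.0, 2.0, 3.0]})

print(result)
```

the signature only stores the *names* of the input and output tensors; the graph between them and the variable values come back with the load.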

### manually build a `SavedModel`

an alternative for more complicated scenarios (not sure what those scenarios are yet) is to use the `tf.saved_model.builder.SavedModelBuilder` class. this class will save `MetaGraphDef` objects: objects which represent a dataflow graph plus variables, assets, and "signatures" (inputs and outputs, i.e. function signatures). this is a protocol buffer repr of a regular `MetaGraph`.

example use:

```python
export_dir = ...

# do stuff
# ...
    

builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
with tf.Session(graph=tf.Graph()) as sess:
    # do stuff
    # ...
    
    builder.add_meta_graph_and_variables(
        sess,
        [tf.saved_model.tag_constants.TRAINING],
        signature_def_map=foo_signatures,
        assets_collection=foo_assets,
        strip_default_attrs=True
    )

# Add a second MetaGraphDef for inference.
with tf.Session(graph=tf.Graph()) as sess:
    # do stuff
    # ...
    
    builder.add_meta_graph([tf.saved_model.tag_constants.SERVING], strip_default_attrs=True)

# do stuff
# ...

builder.save()
```

#### forward compatibility via `strip_default_attrs = True`

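says what it means: default-valued attributes are stripped from the node definitions on export, so a model exported by a newer tensorflow (whose ops may have gained new attributes with defaults) can still be loaded by an older one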

### loading a `SavedModel` in `python`

assuming a model has been saved using the `SavedModelBuilder` as above, we can load the `MetaGraphDef` associated with any set of tags:

```python
export_dir = ...

with tf.Session(graph=tf.Graph()) as sess:
  tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.TRAINING], export_dir)
```

### load a `SavedModel` in `c++`

rare moment of c code creepin in

```c++
const string export_dir = ...
SavedModelBundle bundle;
...
LoadSavedModel(session_options, run_options, export_dir, {kSavedModelTagTrain},
               &bundle);
```

### load and serve a `SavedModel` in `tensorflow` serving

the `tensorflow_model_server` package comes with a `cli` for serving `SavedModel` objects from file. as long as they all live in directories `{model_base_path}/{numeric_index}`, they will be loaded when the value of `--model_base_path` is set as that containing directory.

their example: models saved to `/tmp/model/0001` and `/tmp/model/0002` are served via the call

```bash
tensorflow_model_server --port=port-numbers --model_name=your-model-name --model_base_path=your_model_base_path
```
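
to make the `{model_base_path}/{numeric_index}` layout concrete, here is a small sketch that scans a base directory for numeric version subdirectories and picks the highest one (as far as I know, serving the latest version is the server's default policy). the helper name `list_model_versions` and the temp directory are made up; nothing here talks to a real server.

```python
import os
import tempfile


def list_model_versions(model_base_path):
    """Return sorted (version_number, path) pairs for numeric subdirectories.

    Hypothetical helper: anything that is not a purely numeric directory
    name is ignored.
    """
    versions = []
    for name in os.listdir(model_base_path):
        full_path = os.path.join(model_base_path, name)
        if os.path.isdir(full_path) and name.isdigit():
            versions.append((int(name), full_path))
    return sorted(versions)


# fake base path with two versions and one unrelated directory
base = tempfile.mkdtemp()
for name in ["0001", "0002", "notes"]:
    os.makedirs(os.path.join(base, name))

versions = list_model_versions(base)
latest_version, latest_dir = versions[-1]
print(latest_version, latest_dir)
```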

### standard constants

there are a lot of built-in constants living in:

In [96]:
[_ for _ in dir(tf.saved_model.tag_constants) if _[:2] != '__']

['GPU', 'SERVING', 'TPU', 'TRAINING']

In [97]:
[_ for _ in dir(tf.saved_model.signature_constants) if _[:2] != '__']

['CLASSIFY_INPUTS',
 'CLASSIFY_METHOD_NAME',
 'CLASSIFY_OUTPUT_CLASSES',
 'CLASSIFY_OUTPUT_SCORES',
 'DEFAULT_SERVING_SIGNATURE_DEF_KEY',
 'PREDICT_INPUTS',
 'PREDICT_METHOD_NAME',
 'PREDICT_OUTPUTS',
 'REGRESS_INPUTS',
 'REGRESS_METHOD_NAME',
 'REGRESS_OUTPUTS']

the idea is that these are common tags you might use in the arguments to `tf.saved_model` functions, so keep them normalized and generally accessible

## using `SavedModel` with `Estimator`s

elsewhere we've covered `Estimator` objects; here we show how they interact with `SavedModel` objects. the use case is pretty obvious -- we've trained a good model, now let's use it to make scores / predictions.

the idea, then, is to take that trained estimator and to use it to create a new object which is an encapsulated model *service*. the format for that model service is an assumed standard (I have never heard of it so I will have to trust tf on this one): the `SavedModel` format

the steps:

1. specify output nodes and apis
1. export model to `SavedModel` format
1. serve model and request predictions

### prepare serving inputs

when you create a training estimator you have to write an `input_fn` to ingest and preprocess data. it would seem easy enough to re-use this, but for reasons beyond me you actually have a separate function for serving: `serving_input_receiver_fn`.

the `serving_input_receiver_fn` ingests and preprocesses inference requests, returning a `tf.estimator.export.ServingInputReceiver`

typical usage pattern:

1. we start by knowing that the server will eventually receive an inference request
    1. it is a serialized `tf.Example`
1. `serving_input_receiver_fn` creates a string placeholder which can take on the value of the serialized inference request
1. `serving_input_receiver_fn` parses `tf.Example` with `tf.parse_example` operations
    1. this requires a parsing specification (how do I take what I need out of `tf.Example`?)
        1. a parsing specification is a lot like a feature dict -- it has keys that are feature names and then `tf.FixedLenFeature`, `tf.VarLenFeature`, or `tf.SparseFeature`

a full example:

In [111]:
# a feature specification is required for the `tf.parse_example`
# call, which is an operation that eats a string (e.g. a 
# string-serialized protobuf of a scorable record, sent over grpc)
# and converts it into tensors of known dtype (dense with a fixed
# shape for `FixedLenFeature`, sparse for `VarLenFeature`)
feature_spec = {
    'foo': tf.FixedLenFeature(dtype=tf.float32, shape=[4]),
    'bar': tf.VarLenFeature(dtype=tf.float32)
}

In [112]:
default_batch_size = 32

def serving_input_receiver_fn():
    """An input receiver that expects a serialized tf.Example."""
    # the placeholder tensor which will eventually hold the serialized inference request
    serialized_tf_example = tf.placeholder(
        dtype=tf.string,
        shape=[default_batch_size],
        name='input_example_tensor'
    )
    
    # the tensors the server actually receives, keyed by name
    receiver_tensors = {'examples': serialized_tf_example}
    
    # features here is built out of the placeholder and the feature_spec we defined above
    features = tf.parse_example(serialized_tf_example, feature_spec)
    
    # the parsed features plus the receiver tensors come together to give us the ServingInputReceiver
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)

In [113]:
serving_input_receiver_fn()

ServingInputReceiver(features={'bar': <tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7f32bffefcf8>, 'foo': <tf.Tensor 'ParseExample_3/ParseExample:3' shape=(32, 4) dtype=float32>}, receiver_tensors={'examples': <tf.Tensor 'input_example_tensor_3:0' shape=(32,) dtype=string>}, receiver_tensors_alternatives=None)
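
to see the parsing step actually do something, the sketch below builds one serialized `tf.Example` by hand and pushes it through `tf.parse_example` with the same `feature_spec`. assumptions: the `tensorflow.compat.v1` import (so it also runs on tensorflow 2.x), a placeholder with an unconstrained batch dimension instead of the fixed 32 used above, and made-up feature values.

```python
# assumption: compat import so the 1.x-style API also works on tensorflow 2.x
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()

feature_spec = {
    "foo": tf.FixedLenFeature(dtype=tf.float32, shape=[4]),
    "bar": tf.VarLenFeature(dtype=tf.float32),
}

# build one record the way a client would before sending it over the wire
example = tf.train.Example(features=tf.train.Features(feature={
    "foo": tf.train.Feature(float_list=tf.train.FloatList(value=[1.0, 2.0, 3.0, 4.0])),
    "bar": tf.train.Feature(float_list=tf.train.FloatList(value=[5.0, 6.0])),
}))
serialized = example.SerializeToString()

tf.reset_default_graph()
# batch dimension left open here (illustrative choice, not what the cell above does)
serialized_input = tf.placeholder(tf.string, shape=[None])
features = tf.parse_example(serialized_input, feature_spec)

with tf.Session() as sess:
    parsed = sess.run(features, feed_dict={serialized_input: [serialized]})

foo = parsed["foo"]                # dense array, one row per example
bar_values = parsed["bar"].values  # sparse result: only the values present
print(foo.shape, bar_values)
```

`foo` comes back dense with the shape promised by `FixedLenFeature`, while `bar` comes back as a sparse value because its length can differ per record.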

the above is only necessary if the input to your serving process is serialized protobuf example records. if you are serving things locally, they will exist in "raw" in-memory form, and you can consume them directly. **even then**, you need to create a placeholder to be updated, and you still must provide a `serving_input_receiver_fn` to do this.

they recommend using the built-in `tf.estimator.export.build_raw_serving_input_receiver_fn` for this use case.

a more concrete way of discussing all this: when serving a model, you have fixed inputs to the model, but not necessarily fixed inputs to the interface of the serving process. therefore, tensorflow requires you define a function -- even an extremely simple one -- to perform the transform stage of that ETL from received input to model input, and this function is called the `serving_input_receiver_fn`

### perform the export

the work we did in the last section was to define how a `SavedModel` object should expect received inference requests to be communicated (raw tensors? serialized example records?) and then how it should convert them to the model's expected input.

given a trained model and this `serving_input_receiver_fn` set of operations/tensors, we can now save the combination together as a `SavedModel` via the cleverly named export function `tf.estimator.Estimator.export_savedmodel`

the function call will typically look like:

```python
estimator.export_savedmodel(
    export_dir_base,  # where the model will be saved
    serving_input_receiver_fn,  # the function we defined above
    strip_default_attrs=True
)
```

### specify the outputs of a custom model

in a previous notebook (007_creating_custom_estimators) we defined a custom model. defining a custom estimator is basically equivalent to defining a custom `model_fn`:

```python
def my_model_fn(features, labels, mode, params):
    # 1. build the model
    net = tf.feature_column.input_layer(features=features,
                                        feature_columns=params['feature_columns'])
    
    for units in params['hidden_units']:
        net = tf.layers.dense(net, units=units, activation=tf.nn.relu)
    
    logits = tf.layers.dense(net, params['n_classes'], activation=None)
    
    # 2. define mode behavior
    predicted_classes = tf.argmax(logits, 1)
    
    # 2.a. PREDICT
    if mode == tf.estimator.ModeKeys.PREDICT:
        predictions = {'class_ids': predicted_classes[:, tf.newaxis],
                       'probabilities': tf.nn.softmax(logits),
                       'logits': logits}
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)
    
    # the remaining two modes both require loss and accuracy calculations
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    
    accuracy = tf.metrics.accuracy(labels=labels,
                                   predictions=predicted_classes,
                                   name='acc_op')
        
    # 2.b. EVAL
    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(mode,
                                          loss=loss,
                                          eval_metric_ops={'accuracy': accuracy})
    
    # 2.c. TRAIN
    if mode == tf.estimator.ModeKeys.TRAIN:
        optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)
        train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
    
    raise ValueError("mode must be one of tf.estimator.ModeKeys.{PREDICT,EVAL,TRAIN}")
```

note that in the above, what our `model_fn` is *actually* returning is a `tf.estimator.EstimatorSpec`

In [124]:
tf.estimator.EstimatorSpec.__new__?

one of the parameters to `tf.estimator.EstimatorSpec` is `export_outputs`. this argument is explicitly related to the `SavedModel` format and defines the type of output that will be produced by this model function in the provided `mode`.

to steal directly from the documentation:

> ```
> export_outputs: Describes the output signatures to be exported to
>     `SavedModel` and used during serving.
>     A dict `{name: output}` where:
>     * name: An arbitrary name for this output.
>     * output: an `ExportOutput` object such as `ClassificationOutput`,
>         `RegressionOutput`, or `PredictOutput`.
>     Single-headed models only need to specify one entry in this dictionary.
>     Multi-headed models should specify one entry for each head, one of
>     which must be named using
>     signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY.
> ```

so, what's that mean?

basically, the tensorflow model serving api is tightly defined, and you have to map the desired output type for your model to one of the defined api methods. the three output types (`Classification`, `Regression`, and `Predict`) will correspond to different api endpoints. also, if the model has multiple outputs (heads) being served, the `name` in the `export_outputs` dictionary will also be used in the api call to pick which one is served for any given api call

### serve the exported model locally

for development you can run the opensource tensorflow serving. you build it with `bazel`. it seems like a pain.

also looks like there's a `docker` image for ya: https://www.tensorflow.org/serving/docker

### request predictions from a local server

there is some basic information here about how the service is built and how `python` programs can invoke it.

for example, in order to have `python` protobufs downloaded and available for the service to use, you must inform the `bazel` build tool of those dependencies ahead of time by providing the dependency list:

```python
deps = [
    # optional, often not all will be required / defined
    "//tensorflow_serving/apis:classification_proto_py_pb2",
    "//tensorflow_serving/apis:regression_proto_py_pb2",
    "//tensorflow_serving/apis:predict_proto_py_pb2",
    # required for the whole durn thing
    "//tensorflow_serving/apis:prediction_service_proto_py_pb2",
]
```

a service that installs those dependencies will be accessible through `python` code with officially maintained api libraries:

```python
from tensorflow_serving.apis import classification_pb2
from tensorflow_serving.apis import regression_pb2
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2
```

so in this instance, your `SavedModel` service will be exposed over `rpc` and your `python` code can make simple direct connections to that service for classification, regression, prediction, etc.

here's an example program for interacting with a service:

```python
# library required to build grpc protobuf messages
from grpc.beta import implementations

# create the connection to the service
channel = implementations.insecure_channel(host, int(port))
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

# create an empty protobuf request and then add images to it
request = classification_pb2.ClassificationRequest()
example = request.input.example_list.examples.add()
example.features.feature['x'].float_list.value.extend(image[0].astype(float))

# make the classification request
result = stub.Classify(request, 10.0)  # 10 secs timeout
```

## cli to inspect and execute `SavedModel`

a simple cli for sanity checks (e.g. introspection of inputs and outputs)

### install the `SavedModel` cli

the `SavedModel` cli is installed on this system by virtue of being packaged with the `tensorflow` `docker` base image. we would possibly have had to install it separately under different circumstances

### overview of commands

In [127]:
!saved_model_cli -h

usage: saved_model_cli [-h] [-v] {show,run,scan} ...

saved_model_cli: Command-line interface for SavedModel

optional arguments:
  -h, --help       show this help message and exit
  -v, --version    show program's version number and exit

commands:
  valid commands

  {show,run,scan}  additional help


as it says: `show`, `run`, and `scan` exist. docs only cover the first two

### `show` command

In [128]:
!saved_model_cli show -h

usage: saved_model_cli show [-h] --dir DIR [--all] [--tag_set TAG_SET]
                            [--signature_def SIGNATURE_DEF_KEY]

Usage examples:
To show all tag-sets in a SavedModel:
$saved_model_cli show --dir /tmp/saved_model

To show all available SignatureDef keys in a MetaGraphDef specified by its tag-set:
$saved_model_cli show --dir /tmp/saved_model --tag_set serve

For a MetaGraphDef with multiple tags in the tag-set, all tags must be passed in, separated by ';':
$saved_model_cli show --dir /tmp/saved_model --tag_set serve,gpu

To show all inputs and outputs TensorInfo for a specific SignatureDef specified by the SignatureDef key in a MetaGraph.
$saved_model_cli show --dir /tmp/saved_model --tag_set serve --signature_def serving_default

To show all available information in the SavedModel:
$saved_model_cli show --dir /tmp/saved_model --all

optional arguments:
  -h, --help            show this help message a

there are also some interesting examples provided:

```sh
$ saved_model_cli show --dir /tmp/saved_model_dir
The given SavedModel contains the following tag-sets:
serve
serve, gpu
```

```sh
$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve
The given SavedModel `MetaGraphDef` contains `SignatureDefs` with the
following keys:
SignatureDef key: "classify_x2_to_y3"
SignatureDef key: "classify_x_to_y"
SignatureDef key: "regress_x2_to_y3"
SignatureDef key: "regress_x_to_y"
SignatureDef key: "regress_x_to_y2"
SignatureDef key: "serving_default"
```

### `run` command

In [129]:
!saved_model_cli run -h

usage: saved_model_cli run [-h] --dir DIR --tag_set TAG_SET --signature_def
                           SIGNATURE_DEF_KEY [--inputs INPUTS]
                           [--input_exprs INPUT_EXPRS]
                           [--input_examples INPUT_EXAMPLES] [--outdir OUTDIR]
                           [--overwrite] [--tf_debug]

Usage example:
To run input tensors from files through a MetaGraphDef and save the output tensors to files:
$saved_model_cli show --dir /tmp/saved_model --tag_set serve \
   --signature_def serving_default \
   --inputs input1_key=/tmp/124.npz[x],input2_key=/tmp/123.npy \
   --input_exprs 'input3_key=np.ones(2)' \
   --input_examples 'input4_key=[{"id":[26],"weights":[0.5, 0.5]}]' \
   --outdir=/out

For more information about input file format, please see:
https://www.tensorflow.org/programmers_guide/saved_model_cli

optional arguments:
  -h, --help            show this help message and exit
  --dir

there are several supported ways of passing tensors to the saved model:

1. `--inputs`: accepts files in `npy`, `npz`, or `pkl` format
    1. e.g. `--inputs x=x.npy`
    1. e.g. `--inputs x=xyz.pkl[x]`
1. `--input_exprs`: accepts full `python` expressions
    1. e.g. `--input_exprs x=[[1],[2],[3]]`
    1. e.g. `--input_exprs x=np.ones((32,32,3))`
1. `--input_examples`: accepts tensorflow examples as fully written out strings obeying the `tf.train.Example` protobuf specification
    1. e.g. `--input_examples x=[{"age":[22,24],"education":["BS","MS"]}]`
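
for the `--inputs` flavor you need files on disk first. a minimal sketch of producing each of the three formats with `numpy` and `pickle`; the file names mirror the examples above, the array contents and temp directory are made up, and my reading of the cli docs is that the bracketed name (`[x]`) selects a key inside an `npz` archive or a pickled dict.

```python
import os
import pickle
import tempfile

import numpy as np

out_dir = tempfile.mkdtemp()
x = np.ones((2, 3), dtype=np.float32)  # illustrative input tensor

# single array -> would be passed as `--inputs x=<out_dir>/x.npy`
npy_path = os.path.join(out_dir, "x.npy")
np.save(npy_path, x)

# several named arrays in one archive -> `--inputs x=<out_dir>/xyz.npz[x]`
npz_path = os.path.join(out_dir, "xyz.npz")
np.savez(npz_path, x=x, y=x * 2)

# pickled dict of arrays -> `--inputs x=<out_dir>/xyz.pkl[x]`
pkl_path = os.path.join(out_dir, "xyz.pkl")
with open(pkl_path, "wb") as fh:
    pickle.dump({"x": x}, fh)

# read everything back to confirm each file holds the same tensor
from_npy = np.load(npy_path)
from_npz = np.load(npz_path)["x"]
with open(pkl_path, "rb") as fh:
    from_pkl = pickle.load(fh)["x"]

print(from_npy.shape, from_npz.shape, from_pkl.shape)
```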

## structure of a `SavedModel` directory

basic discussion of the contents in the directory created when you save a model:

```
assets/
assets.extra/
variables/
    variables.data-?????-of-?????
    variables.index
saved_model.pb|saved_model.pbtxt
```
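
a small sanity check against that layout, useful before pointing a server at a directory. this is only a sketch: the helper name `describe_saved_model_dir` is made up, it builds a fake export directory out of empty files for the demo, and it checks file names only (it does not parse the protobuf).

```python
import os
import tempfile


def describe_saved_model_dir(export_dir):
    """Report which pieces of the SavedModel layout are present (names only)."""
    graph_files = [
        name for name in ("saved_model.pb", "saved_model.pbtxt")
        if os.path.isfile(os.path.join(export_dir, name))
    ]
    variables_dir = os.path.join(export_dir, "variables")
    variable_files = (
        sorted(os.listdir(variables_dir)) if os.path.isdir(variables_dir) else []
    )
    return {
        "graph_files": graph_files,
        "variable_files": variable_files,
        "has_assets": os.path.isdir(os.path.join(export_dir, "assets")),
        "looks_like_saved_model": bool(graph_files),
    }


# fake export directory made of empty files, just to exercise the helper
fake_export = tempfile.mkdtemp()
os.makedirs(os.path.join(fake_export, "variables"))
for relative in ["saved_model.pb",
                 os.path.join("variables", "variables.index"),
                 os.path.join("variables", "variables.data-00000-of-00001")]:
    open(os.path.join(fake_export, relative), "w").close()

report = describe_saved_model_dir(fake_export)
print(report)
```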

# summary

whew. a lot.

1. there are a few ways of saving and restoring things, depending on what you're saving
    1. *variables* can be saved with `tf.train.Saver` 
    1. *models* are saved as `SavedModel` objects
1. variable saving (via `tf.train.Saver`) is extremely straight-forward
    1. create a `tf.train.Saver` object (this adds the `save` and `restore` operations to the graph)
    1. within some session, call `saver.save` or `saver.restore` with a directory / filename prefix
1. what is a "model" here?
    1. a "model" here is defined as the "variables, the graph, and the graph's metadata"
    1. models are saved in a language-independent format called `SavedModel`
1. saving models
    1. the exact way we save models depends on the use case
        1. saving so we can re-open in `python` or `c++`
            1. simple, support "only" the `Predict` api: save using `tf.saved_model.simple_save`
                1. docs go out of their way to explain that this will only support the `Predict` api, not sure what that excludes
            1. beyond that: create a `tf.saved_model.builder.SavedModelBuilder` object and get explicit with it
1. saving `Estimators` such that we can feed them into a `tensorflow_model_server`
    1. this is a *huge* digression into exactly how this is done. basic steps:
        1. add a few special items to the already-trained estimator:
            1. a function which determines the input of records to the service and transforms them to the expected input type of the model
            1. if the estimator is custom, define the output types `export_outputs` dictionary as part of the `model_fn` definition
        1. use the estimator's `estimator.export_savedmodel` method