# TensorFlow Deploy - Labels 

Welcome to the second `tfd` training notebook. If you are here, you are most likely already familiar with the first notebook in the series.

This time we would like to show you the real power and benefits of using `tfd` in your own or your company's projects. A/B tests are a crucial part of many ML projects, and we will show you how to run them easily with `tfd`.

In [None]:
from helpers import grpc_client, show, show_label, return_mistakes
import tensorflow as tf
import tensorflow_deploy_utils as tfd

In [None]:
tf.__version__, tfd.__version__

## Data preparation

In [None]:
(train_data, train_labels), (test_data, test_labels) = tf.keras.datasets.mnist.load_data()

train_data = train_data / 255.0
test_data = test_data / 255.0
train_labels = tf.keras.utils.to_categorical(train_labels)
test_labels = tf.keras.utils.to_categorical(test_labels)

## Model 1 - basic

This repeats what you saw in the introductory notebook, so you already know this part.

In [None]:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=5, input_shape=(28, 28)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(10, activation="softmax"))

model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()

_ = model.fit(x=train_data, y=train_labels, validation_split=0.2, epochs=1, batch_size=64)

loss, accuracy = model.evaluate(x=test_data, y=test_labels, batch_size=64, verbose=0)
print("Test accuracy: {a}".format(a=accuracy))

EXPORT_PATH = "/tmp/models/mnist/1"
tf.saved_model.save(model, EXPORT_PATH)

TEAM = "demo"
PROJECT = "labels"
NAME = "mnist"
LABEL = "basic"  # optional

tfd_cursor = tfd.TFD(team=TEAM, project=PROJECT, name=NAME, label=LABEL, host="tfd")

DESCRIPTION = """
This is a simple model version for the MNIST problem. A simple two-layer neural network with
a few neurons is used to solve it.
"""

tfd_cursor.generate_model_readme(dst_path=EXPORT_PATH, description=DESCRIPTION, 
                                 metrics={"accuracy": accuracy, "loss": loss})

response = tfd_cursor.deploy_model(EXPORT_PATH)
print(response)

Let's check that the model really reached `tfd` and is available.

In [None]:
tfd_cursor.list_models()

We can see that the status of our `basic` model for the `labels` project is set to `ready`.

**Attention**: we changed the project's name here, i.e. our model reached a different `tfs` instance than the one described in the introductory notebook.

We are aware that our first model is very simple and its quality is far from perfect (accuracy tests gave us results of 87-90%, which is quite poor). Let's have a look at a few examples where the model is wrong.

To do that, we will send our test data to `tfs` and then compare the decisions of the `basic` model with the true labels (`test_labels`).

In [None]:
predictions = grpc_client(dataset=test_data, host=f"tfs-{TEAM}-{PROJECT}", port=8500, 
                          model_name=NAME, model_label="basic")
predictions = [p.outputs["dense_1"].float_val for p in predictions]

In [None]:
mistakes = return_mistakes(test_data, test_labels, predictions, mistakes_number=5)
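
The `return_mistakes` helper comes from `helpers`, and its implementation is not shown in this notebook. As a rough sketch of the idea, assuming predictions are per-class score vectors and labels are one-hot encoded, the comparison could look like the following. The function name `find_mistakes` and its return format are illustrative assumptions, not the helper's actual code.

```python
def argmax(scores):
    """Return the index of the largest value in a sequence of scores."""
    return max(range(len(scores)), key=lambda i: scores[i])


def find_mistakes(samples, one_hot_labels, predictions, mistakes_number=5):
    """Collect up to `mistakes_number` samples where prediction != true label.

    Hypothetical stand-in for `helpers.return_mistakes`; returns a list of
    (sample, true_label, predicted_label) tuples.
    """
    mistakes = []
    for sample, one_hot, scores in zip(samples, one_hot_labels, predictions):
        true_label = argmax(one_hot)
        predicted_label = argmax(scores)
        if predicted_label != true_label:
            mistakes.append((sample, true_label, predicted_label))
            if len(mistakes) >= mistakes_number:
                break
    return mistakes


# Toy data: three "images" with three classes each.
samples = ["img_a", "img_b", "img_c"]
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
scores = [[0.9, 0.05, 0.05], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]
toy_mistakes = find_mistakes(samples, labels, scores)
print(toy_mistakes)
```

Only the second toy sample is reported, because its highest score points at class 0 while its true class is 1.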

In [None]:
for i, (img, true_label, predicted_label) in enumerate(mistakes):
    show(array=img, title=f"Mistake #{i + 1}. True label: {true_label};   Prediction: {predicted_label}")

The results above are not satisfactory. Some of the examples may be hard to interpret even for a human, but others are obvious mistakes. We will therefore try to improve our model and get rid of these obvious mistakes.

## Model 2 - extended

As you have just seen, the basic model is not the best solution for production, as it can make mistakes even in very clear cases. Let's see how to quickly replace it with a more capable one. In the following cell we will build a neural network with a more advanced structure and then deploy it under the label `extended`.

Once it is deployed, we will re-test the predictions for the previous mistakes by querying `tfs`.

In [None]:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=32, input_shape=(28, 28), activation="relu", name="dense"))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(64, activation="relu"))
model.add(tf.keras.layers.Dense(32, activation="relu"))
model.add(tf.keras.layers.Dense(10, activation="softmax"))

model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()

history = model.fit(x=train_data, y=train_labels, validation_split=0.2, epochs=5, batch_size=64)
loss, accuracy = model.evaluate(x=test_data, y=test_labels, batch_size=64, verbose=0)
print(f"\nTest accuracy: {accuracy}\n")

EXPORT_PATH = "/tmp/models/mnist/2"
tf.saved_model.save(model, EXPORT_PATH)

DESCRIPTION = """
Advanced model for the MNIST problem. It is a simple MLP with 32 units in the first hidden layer,
64 in the second hidden layer and 32 in the third one.
"""

tfd_cursor.generate_model_readme(dst_path=EXPORT_PATH, description=DESCRIPTION, 
                                 metrics={"accuracy": accuracy, "loss": loss})
response = tfd_cursor.deploy_model(EXPORT_PATH, label="extended")
print(response)

After running the accuracy test, we can see that our new model is considerably better than the previous one. Let's see how the new model handles the mistakes of the previous one.

In [None]:
new_predictions = grpc_client(dataset=[img for img, _, _ in mistakes], host=f"tfs-{TEAM}-{PROJECT}", port=8500, 
                              model_name=NAME, model_label="extended")
new_predictions = [show_label(p.outputs["dense_4"].float_val) for p in new_predictions]
new_mistakes = [(img, true_label, new_prediction) for ((img, true_label, _), new_prediction) in zip(mistakes, new_predictions)]

In [None]:
for i, (img, true_label, predicted_label) in enumerate(new_mistakes):
    show(array=img, title=f"Mistake #{i + 1}. True label: {true_label};   Prediction: {predicted_label}")
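
To put a number on the comparison, one could count how many of the earlier mistakes the new model now classifies correctly. The sketch below assumes tuples of the form `(img, true_label, predicted_label)` with directly comparable labels, as in `new_mistakes`; the helper name `count_fixed` and the toy data are made up for illustration.

```python
def count_fixed(relabelled):
    """Count entries whose new prediction matches the true label.

    `relabelled` is assumed to be an iterable of
    (img, true_label, predicted_label) tuples.
    """
    return sum(1 for _, true_label, predicted in relabelled if predicted == true_label)


# Toy stand-in for `new_mistakes`: images replaced by placeholder strings.
toy_new_mistakes = [("img_0", 5, 6), ("img_1", 4, 4), ("img_2", 9, 9)]
fixed = count_fixed(toy_new_mistakes)
print(f"Fixed {fixed} of {len(toy_new_mistakes)} previous mistakes")
```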

## Summary

The advanced model version eliminated almost every mistake that occurred in the basic model. Only the first example, a very unclear 5, was classified as 6, but I (as a human being) would go that way too ;)

This notebook gave us an overview of a unique functionality delivered by `tfs`: model labelling. Labels are extremely useful, especially when we want to run A/B tests. `tfd` allowed us to use that advanced functionality in a very simple way, just by assigning another label.
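
As an illustration of how labels could support an A/B test on the client side, the sketch below deterministically assigns each user to the `basic` or `extended` label by hashing a user identifier. The function `choose_label`, the `user_id` argument, and the 50/50 split are assumptions for this example and are not part of `tfd` or `tfs`; the chosen label would then be passed as `model_label` to a client such as `grpc_client`.

```python
import hashlib


def choose_label(user_id, labels=("basic", "extended"), share_first=0.5):
    """Deterministically map a user to one of two model labels.

    The same `user_id` always gets the same label, so a user sees a
    consistent model variant for the whole experiment.
    """
    digest = hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()
    # Map the first 8 hex digits to a number in [0, 1).
    bucket = int(digest[:8], 16) / 16**8
    return labels[0] if bucket < share_first else labels[1]


# Hypothetical user ids, used only to show the resulting split.
assignments = [choose_label(f"user-{i}") for i in range(1000)]
print(assignments.count("basic"), assignments.count("extended"))
```

Hashing rather than random sampling keeps the assignment stable across requests without storing any state, which makes the two groups easier to compare afterwards.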

You are very welcome to read our next notebook if you are interested in other **TensorFlow Deploy** features.