# Serve a Keras model with Konduit Serving 

This notebook illustrates a simple client-server interaction to perform inference on a Keras LSTM model using the Python SDK for Konduit Serving.


In [1]:
from konduit import ModelConfig, ParallelInferenceConfig, ModelConfigType, \
ModelStep, ServingConfig

from konduit.server import Server
from konduit.client import Client
import os 
import numpy as np 

# Saving models in Keras H5 format

HDF5 model files can be saved with the `.save()` method. Refer to the [TensorFlow documentation for Keras](https://www.tensorflow.org/guide/keras/save_and_serialize) for details. 
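As a rough illustration of the `.save()` call mentioned above, the sketch below builds a tiny, untrained model and writes it to an HDF5 file. The function name `save_demo_model`, the file name `demo_model.h5`, and the layer sizes are hypothetical; this is not the model served later in this notebook. It assumes TensorFlow 2 is installed.

```python
def save_demo_model(path: str = "demo_model.h5") -> str:
    """Build a tiny, untrained Keras model and save it in HDF5 format."""
    # Imported inside the function so the snippet can be loaded
    # even where TensorFlow is not installed.
    import tensorflow as tf

    # Hypothetical architecture, chosen only to keep the example small.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(10,)),
        tf.keras.layers.Dense(4, activation="relu"),
        tf.keras.layers.Dense(1),
    ])

    # The .h5 suffix tells Keras to write the HDF5 format.
    model.save(path)
    return path
```

Calling `save_demo_model()` would write `demo_model.h5` to the current working directory.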

Note: Keras model loading functionality in Konduit Serving converts Keras models to Deeplearning4J models. As a result, Keras models containing operations not supported in Deeplearning4J cannot be served in Konduit Serving. See [issue 8348](https://github.com/eclipse/deeplearning4j/issues/8348). 

# Configuring the server

Konduit Serving works by defining a series of **steps**. Steps can include:

1. Pre- or post-processing steps
2. One or more machine learning models
3. Transformations that turn the output into a human-readable form

If deploying your model requires neither pre- nor post-processing, only one step (a machine learning model) is required. This configuration is defined using a single `ModelStep`.
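The idea of chaining steps can be sketched in plain Python, independent of Konduit Serving. Everything below (`run_pipeline`, `scale_inputs`, `toy_model`, `to_label`) is a hypothetical stand-in used only to illustrate how the output of one step feeds the next; it is not how Konduit Serving implements steps.

```python
from typing import Callable, List, Sequence

# A "step" here is simply a function that maps one value to the next.
Step = Callable[[object], object]


def run_pipeline(steps: Sequence[Step], data):
    """Apply each step in order, feeding the output of one into the next."""
    for step in steps:
        data = step(data)
    return data


def scale_inputs(values: List[float]) -> List[float]:
    """Pre-processing: rescale raw values from [0, 255] to [0, 1]."""
    return [v / 255.0 for v in values]


def toy_model(values: List[float]) -> float:
    """Placeholder 'model' that just averages its inputs."""
    return sum(values) / len(values)


def to_label(score: float) -> str:
    """Post-processing: turn a numeric score into a readable label."""
    return "bright" if score >= 0.5 else "dark"


label = run_pipeline([scale_inputs, toy_model, to_label], [255.0, 255.0, 0.0, 255.0])
print(label)  # prints "bright" (the mean of the scaled inputs is 0.75)
```

A pipeline with no pre- or post-processing corresponds to passing a single step, for example `run_pipeline([toy_model], values)`.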

Before running this notebook, run the `build_jar.py` script and copy the JAR (`konduit.jar`) to this folder. Refer to the [Python SDK README](https://github.com/KonduitAI/konduit-serving/blob/master/python/README.md) for details. 

## Configuring the step

Define the Keras configuration as a `ModelConfig` object. 
- `model_config_type`: This argument requires a `ModelConfigType` object. Specify `model_type` as `KERAS`, and set `model_loading_path` to the location of the Keras model file saved in the HDF5 format.

For the `ModelStep` object, the following parameters are specified:
- `model_config`: pass the `ModelConfig` object here
- `parallel_inference_config`: specify the number of workers to run in parallel. Here, we specify `workers=1`.
- `input_names`: names for the input nodes
- `output_names`: names for the output nodes

Input and output names can be obtained by visualizing the graph in [Netron](https://github.com/lutzroeder/netron). 

In [2]:
keras_config = ModelConfig(    
    model_config_type = ModelConfigType(
        model_type='KERAS',
        model_loading_path=os.path.abspath(
            '../data/keras/embedding_lstm_tensorflow_2.h5'
        )
    )
)

keras_step = ModelStep(
    model_config=keras_config, 
    parallel_inference_config=ParallelInferenceConfig(workers=1), 
    input_names=["input"], 
    output_names=["lstm_1"]
)

## Configuring the server 

In the `ServingConfig`, specify the port number the server should listen on.

Pass the `ServingConfig` to `Server` together with the steps, which are given as a Python list. In this case, there is a single step: `keras_step`.

In [3]:
serving_config = ServingConfig(http_port=1337)

server = Server(
    serving_config=serving_config, 
    steps=[keras_step]
)

# Running the server

In [4]:
server.start()

Starting server...

Server has started successfully.


<subprocess.Popen at 0x193bcb8bb00>

## Configuring the client

To configure the client, create a `Client` object and pass the server's port as the `port` argument.

Note that you should create the `Client` object after the `Server` has started, so that the `Client` can inherit the `Server`'s attributes.
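One generic way to check that something is listening before creating a client is to poll the port with the standard library. This is a minimal sketch and not part of the Konduit SDK: `wait_for_port` is a hypothetical helper. It also assumes that an accepted TCP connection means the server is ready, which may not hold if the model is still loading.

```python
import socket
import time


def wait_for_port(port: int, host: str = "localhost", timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not reachable yet; wait briefly before retrying.
            time.sleep(0.5)
    return False
```

With the configuration used in this notebook, `wait_for_port(1337)` could be called right before `Client(port=1337)`.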

In [5]:
client = Client(port=1337)

In [6]:
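# Random input of length 10.
input_array = np.random.uniform(size=[10])

# Add a leading batch dimension, giving shape (1, 10), and send it under the input name "input".
prediction = client.predict({"input": np.expand_dims(input_array, axis=0)})

# Shut the server down once the prediction has been received.
server.stop()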
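If `client.predict` raises an exception in the cell above, `server.stop()` is never reached and the server process keeps running. One way to guard against this is a small context manager. The sketch below is not part of the Konduit SDK: `running` is a hypothetical helper that relies only on the `start()` and `stop()` methods used in this notebook.

```python
from contextlib import contextmanager


@contextmanager
def running(server):
    """Start `server`, yield it, and always stop it afterwards."""
    server.start()
    try:
        yield server
    finally:
        # Runs even if the body raises, so the server is not left running.
        server.stop()
```

Used as `with running(server): ...`, the prediction code would go inside the `with` block, and the server would be stopped when the block exits.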

In [7]:
print(prediction) 
prediction.shape

[[[-3.53171281e-03 -6.55398145e-03 -9.07615386e-03 -1.11385630e-02
   -1.27962595e-02 -1.41087854e-02 -1.51341194e-02 -1.59251783e-02
   -1.65283810e-02 -1.69831831e-02]
  [-8.98816856e-04 -2.16921209e-03 -3.50447348e-03 -4.74313367e-03
   -5.81311202e-03 -6.69444632e-03 -7.39536854e-03 -7.93745089e-03
   -8.34691338e-03 -8.64967890e-03]
  [ 4.57964110e-04  7.27277948e-04  8.35937157e-04  8.26211413e-04
    7.38725299e-04  6.06592686e-04  4.54327965e-04  2.98691681e-04
    1.50312349e-04  1.51993027e-05]
  [-3.72108625e-04 -7.85349868e-04 -1.15279201e-03 -1.44809973e-03
   -1.67194800e-03 -1.83526019e-03 -1.95118226e-03 -2.03184737e-03
   -2.08712765e-03 -2.12461245e-03]
  [-1.70938822e-03 -2.67439662e-03 -3.14217363e-03 -3.29596084e-03
   -3.26494290e-03 -3.13617825e-03 -2.96555483e-03 -2.78691365e-03
   -2.61911890e-03 -2.47139740e-03]
  [ 4.90737101e-03  8.10048729e-03  1.01506319e-02  1.14426687e-02
    1.22359265e-02  1.27047608e-02  1.29658673e-02  1.30968681e-02
    1.31488722e-

(1, 6, 10)