# Serve an image classification model built in Deeplearning4j with Konduit Serving

This notebook illustrates a simple client-server interaction to perform inference on a Deeplearning4j image classification model using the Python SDK for Konduit Serving.


In [1]:
from konduit import ModelConfig, TensorDataTypesConfig, ModelConfigType, ModelPipelineStep, ParallelInferenceConfig, \
ServingConfig, InferenceConfiguration

from konduit.server import Server
from konduit.client import Client
import numpy as np 

import time
import os 

## Saving models in Deeplearning4j 
The following is a short Java program that loads a simple CNN model from Deeplearning4j's model zoo, initializes weights, then saves the model to a new file, "SimpleCNN.zip". 


```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.zoo.ZooModel;
import org.deeplearning4j.zoo.model.SimpleCNN;

import java.io.File;

public class SaveSimpleCNN {
    private static int nClasses = 5;
    private static boolean saveUpdater = false;

    public static void main(String[] args) throws Exception {
        ZooModel zooModel = SimpleCNN.builder()
            .numClasses(nClasses)
            .inputShape(new int[]{3, 224, 224})
            .build();
        MultiLayerConfiguration conf = ((SimpleCNN) zooModel).conf();
        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        System.out.println(net.summary());
        File locationToSave = new File("SimpleCNN.zip");
        net.save(locationToSave, saveUpdater);
    }
}

```

A reference Java project using Deeplearning4j 1.0.0-beta5 is provided in this repository with a Maven `pom.xml` dependencies file. If using the IntelliJ IDEA IDE, open the `java` folder as a Maven project and run the `main` function of the `SaveSimpleCNN` class. 

## Configuring the server 

### Configuring `ModelPipelineStep` 

Define the Deeplearning4j configuration as a `ModelConfig` object. 

- `tensor_data_types_config`: This argument requires a `TensorDataTypesConfig` object, which takes a dictionary `input_data_types`. Its keys represent column names, and its values represent data types as strings, e.g. `"INT32"`. See [here](https://github.com/KonduitAI/konduit-serving/blob/master/konduit-serving-api/src/main/java/ai/konduit/serving/model/TensorDataType.java) for a list of supported data types. 
- `model_config_type`: This argument requires a `ModelConfigType` object. In the Java program above, SimpleCNN is configured as a `MultiLayerNetwork`, in contrast with the `ComputationGraph` class, which is used for more complex networks. Specify `model_type` as `MULTI_LAYER_NETWORK`, and set `model_loading_path` to the location of the Deeplearning4j weights saved in the ZIP file format.


For the `ModelPipelineStep` object, specify the following parameters: 
- `model_config`: the `ModelConfig` object defined above
- `parallel_inference_config`: the number of workers to run in parallel. Here, we specify `workers=1`.
- `input_names`: names for the input data
- `output_names`: names for the output data

To find the names of the input and output nodes in Deeplearning4j:

- for `input_names`, print the first element of `net.getLayerNames()`.
- for `output_names`, check the last layer listed when printing `net.summary()`. 


In [2]:
input_data_types = {"image_array": "FLOAT"}
input_names = list(input_data_types.keys())
output_names = ["output"]
port = np.random.randint(1000, 65535)

dl4j_config = ModelConfig(
    tensor_data_types_config=TensorDataTypesConfig(
        input_data_types=input_data_types
    ), 
    model_config_type=ModelConfigType(
        model_type="MULTI_LAYER_NETWORK", 
        model_loading_path=os.path.abspath("../data/multilayernetwork/SimpleCNN.zip")
    )
)

dl4j_pipeline_step = ModelPipelineStep(
    model_config=dl4j_config,
    parallel_inference_config=ParallelInferenceConfig(workers=1),
    input_names=input_names,
    output_names=output_names
)

### Configuring the server

Specify the following:
- `http_port`: select a random port.
- `input_data_type`, `output_data_type`: Specify input and output data types as strings. 

<div class="alert alert-info">
ℹ Accepted input and output data types are as follows: 
    <ul>
        <li> Input: JSON, ARROW, IMAGE, ND4J (not yet implemented) and NUMPY. </li>
        <li> Output: NUMPY, JSON, ND4J (not yet implemented) and ARROW.</li>
    </ul>
</div>

Pass the `ServingConfig` to `InferenceConfiguration` together with the pipeline steps, which are given as a Python list. In this case, there is a single pipeline step: `dl4j_pipeline_step`. 

In [3]:
serving_config = ServingConfig(
    http_port=port,
    input_data_type='NUMPY',
    output_data_type='NUMPY'
)

server = Server(
    config=InferenceConfiguration(
        serving_config=serving_config,
        pipeline_steps=[dl4j_pipeline_step]
    )
)
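
A port drawn with `np.random.randint(1000, 65535)` may already be in use, and on many systems ports below 1024 require elevated privileges. As a minimal sketch, assuming it is acceptable to let the operating system pick the port, the standard library can be used instead. `find_free_port` is a hypothetical helper and is not part of the Konduit Serving SDK.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the operating system for a currently unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the operating system choose a free port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


port = find_free_port()
print(port)
```

The socket is closed before the server starts, so another process could in principle claim the port in between. For a local notebook this is usually an acceptable trade-off.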

We can use the `as_dict()` method of the `config` attribute of `server` to view the overall configuration:

In [4]:
server.config.as_dict()

{'@type': 'InferenceConfiguration',
 'pipelineSteps': [{'@type': 'ModelPipelineStep',
   'inputNames': ['image_array'],
   'outputNames': ['output'],
   'modelConfig': {'@type': 'ModelConfig',
    'tensorDataTypesConfig': {'@type': 'TensorDataTypesConfig',
     'inputDataTypes': {'image_array': 'FLOAT'}},
    'modelConfigType': {'@type': 'ModelConfigType',
     'modelType': 'MULTI_LAYER_NETWORK',
     'modelLoadingPath': 'C:\\Users\\Skymind AI Berhad\\Documents\\konduit-serving-examples\\data\\multilayernetwork\\SimpleCNN.zip'}},
   'parallelInferenceConfig': {'@type': 'ParallelInferenceConfig',
    'workers': 1}}],
 'servingConfig': {'@type': 'ServingConfig',
  'httpPort': 53083,
  'inputDataType': 'NUMPY',
  'outputDataType': 'NUMPY'}}
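
Because the configuration is a plain dictionary, it can be inspected with ordinary Python before the server is started. The sketch below works on a hand-copied version of the output above, with the model path replaced by a placeholder. It checks that every input name has a declared data type and that the dictionary survives a JSON round trip. `undeclared_inputs` is a hypothetical helper, not an SDK function.

```python
import json

# Hand-copied from the output above; the model path is a placeholder.
config = {
    "@type": "InferenceConfiguration",
    "pipelineSteps": [{
        "@type": "ModelPipelineStep",
        "inputNames": ["image_array"],
        "outputNames": ["output"],
        "modelConfig": {
            "@type": "ModelConfig",
            "tensorDataTypesConfig": {
                "@type": "TensorDataTypesConfig",
                "inputDataTypes": {"image_array": "FLOAT"},
            },
            "modelConfigType": {
                "@type": "ModelConfigType",
                "modelType": "MULTI_LAYER_NETWORK",
                "modelLoadingPath": "SimpleCNN.zip",  # placeholder path
            },
        },
        "parallelInferenceConfig": {"@type": "ParallelInferenceConfig", "workers": 1},
    }],
    "servingConfig": {
        "@type": "ServingConfig",
        "httpPort": 53083,
        "inputDataType": "NUMPY",
        "outputDataType": "NUMPY",
    },
}


def undeclared_inputs(cfg):
    """Return input names that have no entry in inputDataTypes (hypothetical check)."""
    missing = []
    for step in cfg["pipelineSteps"]:
        declared = step["modelConfig"]["tensorDataTypesConfig"]["inputDataTypes"]
        missing.extend(name for name in step["inputNames"] if name not in declared)
    return missing


missing = undeclared_inputs(config)
as_json = json.dumps(config, indent=2)
print(missing)
```

An empty list means that each name in `inputNames` has a matching key in `inputDataTypes`.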

## Configuring the client 

To configure the client, create a Client object with the following arguments: 
- `input_names`: names of the input data
- `output_names`: names of the output data
- `input_type`: data type passed to the server for inference
- `endpoint_output_type`: data type returned by the server endpoint 
- `return_output_type`: data type to be returned to the client. Note that this argument can be used to convert the output returned from the server to the client into a different format, e.g. NUMPY to JSON.


<div class="alert alert-warning">
    ⚠ Future versions of the Python SDK may remove the <code>input_names</code> and <code>output_names</code> arguments in <code>Client()</code>, since these are already specified in <code>ModelPipelineStep()</code>. 
</div>

In [5]:
client = Client(
    input_names=input_names,
    output_names=output_names,
    input_type='NUMPY',
    endpoint_output_type='NUMPY',
    return_output_type="NUMPY",
    url='http://localhost:' + str(port)
)

## Running the server

We generate a (3, 224, 224) array of random numbers between 0 and 255 as input to the model for prediction. 

Before requesting a prediction, we normalize the image to be between 0 and 1: 

In [6]:
# The upper bound of randint is exclusive, so 256 yields integers in [0, 255].
rand_image = np.random.randint(256, size=(3, 224, 224)) / 255

In [7]:
server.start()
time.sleep(30)

prediction = client.predict({"image_array": rand_image})
print(prediction)

server.stop()

Wrote config.json to path C:\Users\Skymind AI Berhad\Documents\konduit-serving-examples\python\config.json
Running with args
java -cp konduit.jar ai.konduit.serving.configprovider.KonduitServingMain --configPath C:\Users\Skymind AI Berhad\Documents\konduit-serving-examples\python\config.json --verticleClassName ai.konduit.serving.verticles.inference.InferenceVerticle
[[4.1586593e-02 3.2325742e-01 2.5140349e-02 3.9734459e-05 6.0997599e-01]]
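
The fixed `time.sleep(30)` above may be longer than needed on a fast machine and too short on a slow one. As a rough sketch, assuming that an accepted TCP connection is a good enough signal that the server is up, the wait can be replaced by polling the port. `wait_for_port` is a hypothetical helper, not part of the Konduit Serving SDK. The demonstration uses a throwaway local listener rather than a real server.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is accepting connections yet; wait and retry.
            time.sleep(interval)
    return False


# Demonstration against a local listener standing in for the server.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]

ready = wait_for_port("127.0.0.1", demo_port, timeout=5.0)
listener.close()
print(ready)
```

In the notebook, the helper would be called as `wait_for_port("localhost", port)` after `server.start()`. An open port does not guarantee that the model has finished loading, so a short additional delay or a retry around `client.predict` may still be needed.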
