# Serve an image classification model built in Deeplearning4j with Konduit Serving

This notebook illustrates a simple client-server interaction to perform inference on a Deeplearning4j image classification model using the Python SDK for Konduit Serving.


In [1]:
from konduit import ModelConfig, TensorDataTypesConfig, ModelConfigType, ModelStep, ParallelInferenceConfig, \
ServingConfig, InferenceConfiguration

from konduit.server import Server
from konduit.client import Client
import numpy as np 

import os 

## Saving models in Deeplearning4j 
The following is a short Java program that loads a simple CNN model from Deeplearning4j's model zoo, initializes weights, then saves the model to a new file, "SimpleCNN.zip". 


```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.zoo.ZooModel;
import org.deeplearning4j.zoo.model.SimpleCNN;

import java.io.File;

public class SaveSimpleCNN {
    private static int nClasses = 5;
    private static boolean saveUpdater = false;

    public static void main(String[] args) throws Exception {
        ZooModel zooModel = SimpleCNN.builder()
            .numClasses(nClasses)
            .inputShape(new int[]{3, 224, 224})
            .build();
        MultiLayerConfiguration conf = ((SimpleCNN) zooModel).conf();
        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        System.out.println(net.summary());
        File locationToSave = new File("SimpleCNN.zip");
        net.save(locationToSave, saveUpdater);
    }
}

```

A reference Java project using Deeplearning4j 1.0.0-beta5 is provided in this repository with a Maven `pom.xml` dependencies file. If using the IntelliJ IDEA IDE, open the `java` folder as a Maven project and run the `main` function of the `SaveSimpleCNN` class. 

# Overview

Konduit Serving works by defining a series of **steps**. These include operations such as:
1. Pre- or post-processing steps
2. One or more machine learning models
3. Transforming the output in a way that can be understood by humans

If deploying your model requires neither pre- nor post-processing, only one step, a machine learning model, is required. This configuration is defined using a single `ModelStep`.
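
The idea of a pipeline as an ordered list of steps can be sketched in plain Python. This is only a conceptual illustration and not the Konduit Serving API: `run_pipeline`, `scale_pixels`, `fake_model`, `to_label`, and the class labels are hypothetical names, and the "model" is a stand-in that returns fixed scores.

```python
from typing import Any, Callable, List

# A step is any callable that takes the previous step's output.
Step = Callable[[Any], Any]


def run_pipeline(steps: List[Step], data: Any) -> Any:
    """Feed the output of each step into the next one, in order."""
    for step in steps:
        data = step(data)
    return data


def scale_pixels(pixels):
    # Pre-processing step: map integer pixel values from 0..255 to 0..1.
    return [p / 255 for p in pixels]


def fake_model(features):
    # Stand-in for a model step: ignores its input and returns fixed scores.
    return [0.2, 0.8]


def to_label(scores):
    # Post-processing step: turn scores into a human-readable label.
    labels = ["cat", "dog"]  # hypothetical class names
    return labels[scores.index(max(scores))]


result = run_pipeline([scale_pixels, fake_model, to_label], [0, 128, 255])
print(result)
```

A pipeline with no pre- or post-processing corresponds to a list containing only the model step.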

Before running this notebook, run the `build_jar.py` script and copy the JAR (`konduit.jar`) to this folder. Refer to the [Python SDK README](https://github.com/KonduitAI/konduit-serving/blob/master/python/README.md) for details. 

# Configure the step 

Define the Deeplearning4j configuration as a `ModelConfig` object. It takes two arguments:

- `tensor_data_types_config`: a `TensorDataTypesConfig` object, which requires a dictionary `input_data_types`. Its keys should represent column names, and the values should represent data types as strings, e.g. `"INT32"`. See [here](https://github.com/KonduitAI/konduit-serving/blob/master/konduit-serving-api/src/main/java/ai/konduit/serving/model/TensorDataType.java) for a list of supported data types.
- `model_config_type`: a `ModelConfigType` object. In the Java program above, SimpleCNN is built as a `MultiLayerNetwork`, in contrast with the `ComputationGraph` class, which is used for more complex networks. Specify `model_type` as `MULTI_LAYER_NETWORK`, and set `model_loading_path` to the location of the Deeplearning4j model saved in the ZIP file format.


For the `ModelStep` object, the following parameters are specified:
- `model_config`: pass the ModelConfig object here 
- `parallel_inference_config`: specify the number of workers to run in parallel. Here, we specify `workers=1`.
- `input_names`:  names for the input data  
- `output_names`: names for the output data

To find the names of input and output nodes in Deeplearning4j:

- for `input_names`, print the first element of `net.getLayerNames()`.
- for `output_names`, check the last layer when printing `net.summary()`. 


In [2]:
input_data_types = {"image_array": "FLOAT"}
input_names = list(input_data_types.keys())
output_names = ["output"]
port = np.random.randint(1000, 65535)

dl4j_config = ModelConfig(
    tensor_data_types_config=TensorDataTypesConfig(
        input_data_types=input_data_types
    ), 
    model_config_type=ModelConfigType(
        model_type="MULTI_LAYER_NETWORK", 
        model_loading_path=os.path.abspath("../data/multilayernetwork/SimpleCNN.zip")
    )
)

dl4j_step = ModelStep(
    model_config=dl4j_config,
    parallel_inference_config=ParallelInferenceConfig(workers=1),
    input_names=input_names,
    output_names=output_names
)
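
Two mistakes are easy to make in this cell: listing an input name that has no declared data type, and pointing `model_loading_path` at a file that does not exist. The following is a minimal sketch of a check that could be run before building the step. The helper `check_step_inputs` is hypothetical and not part of the Konduit SDK.

```python
import os


def check_step_inputs(input_data_types, input_names, model_path):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []

    # Every input name should have a declared data type.
    missing = [name for name in input_names if name not in input_data_types]
    if missing:
        problems.append("no data type declared for inputs: " + ", ".join(missing))

    # The saved model must exist on disk.
    if not os.path.isfile(model_path):
        problems.append("model file not found: " + model_path)

    return problems


problems = check_step_inputs(
    input_data_types={"image_array": "FLOAT"},
    input_names=["image_array"],
    model_path=os.path.abspath("../data/multilayernetwork/SimpleCNN.zip"),
)
for problem in problems:
    print("Configuration problem:", problem)
```

Returning a list rather than raising on the first problem lets all issues be reported in one pass.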

# Configure the server
Specify the following:
- `http_port`: the port the server listens on. Here we reuse the random `port` chosen in the previous cell.
- `input_data_format`, `output_data_format`: Specify input and output data formats as strings. 

{% hint style="info" %}
Accepted input and output data formats are as follows:

*  Input: JSON, ARROW, IMAGE, ND4J \(not yet implemented\) and NUMPY.
*  Output: NUMPY, JSON, ND4J \(not yet implemented\) and ARROW.
{% endhint %}
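
Because the formats are passed as plain strings, a typo is only discovered once the server is running. A small check against the lists in the hint above can catch it earlier. This is a sketch: `check_data_formats` is a hypothetical helper, the sets are copied from the hint rather than read from the SDK, and ND4J is left out because it is listed as not yet implemented.

```python
# Formats copied from the hint above; ND4J is omitted (not yet implemented).
INPUT_FORMATS = {"JSON", "ARROW", "IMAGE", "NUMPY"}
OUTPUT_FORMATS = {"NUMPY", "JSON", "ARROW"}


def check_data_formats(input_data_format, output_data_format):
    """Raise ValueError if either format string is not in the accepted lists."""
    if input_data_format not in INPUT_FORMATS:
        raise ValueError(
            "unsupported input format %r, expected one of %s"
            % (input_data_format, sorted(INPUT_FORMATS))
        )
    if output_data_format not in OUTPUT_FORMATS:
        raise ValueError(
            "unsupported output format %r, expected one of %s"
            % (output_data_format, sorted(OUTPUT_FORMATS))
        )


check_data_formats("NUMPY", "NUMPY")  # passes silently
```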

Pass the `ServingConfig` to `Server` together with the steps, which are given as a Python list. In this case, there is a single step: `dl4j_step`.

In [3]:
serving_config = ServingConfig(
    http_port=port,
    input_data_format='NUMPY',
    output_data_format='NUMPY'
)

server = Server(
    serving_config=serving_config,
    steps=[dl4j_step]
)
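
A port drawn at random may already be in use, and on many Unix-like systems ports below 1024 require elevated privileges. As an alternative, the operating system can be asked for a free port by binding to port 0. This is a minimal sketch using only the standard library; `find_free_port` is a hypothetical helper, not part of the Konduit SDK.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the operating system for a TCP port that is currently unused."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS pick any free port
        return sock.getsockname()[1]


port = find_free_port()
print(port)
```

The socket is closed before the function returns, so another process could in principle claim the port before the server binds to it. For a notebook run on a single machine this is usually an acceptable risk.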

# Start the server

In [4]:
server.start()

Starting server.......

Server has started successfully.


<subprocess.Popen at 0x232fd7a7748>
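
The output above shows that `server.start()` reports when the server is up. In scripts where that message is not inspected, an additional check that the port accepts TCP connections can be useful. The following sketch uses only the standard library; `wait_for_port` is a hypothetical helper, and the default timeout values are arbitrary assumptions.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connection means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; wait briefly and try again.
            time.sleep(interval)
    return False
```

For example, `wait_for_port(port)` could be called right after `server.start()`, and the client created only if it returns `True`. A successful connection shows that the port is open, not that the model has finished loading.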

# Configure the client 

To configure the client, create a `Client` object that points to the port the server is listening on:


In [5]:
client = Client(port=port)

# Inference

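We generate a (3, 224, 224) array of random integers from 0 to 254 (the upper bound of `np.random.randint` is exclusive) as input to the model for prediction. The shape matches the `inputShape` used when the model was saved.

Before requesting a prediction, we normalize the image so that its values lie between 0 and 1: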

In [6]:
rand_image = np.random.randint(255, size=(3, 224, 224)) / 255
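
Before sending the array, it can help to confirm that it has the shape, value range, and data type the model expects. The sketch below builds a comparable input that covers the full 0 to 255 range and casts it to `float32`. The cast is an assumption made here to match the `"FLOAT"` type declared for `image_array`; the cell above sends NumPy's default `float64` result.

```python
import numpy as np

# The upper bound of randint is exclusive, so 256 is needed to include 255.
rand_image = np.random.randint(0, 256, size=(3, 224, 224)).astype(np.float32) / 255

# Sanity checks: channels-first shape, values in [0, 1], 32-bit floats.
assert rand_image.shape == (3, 224, 224)
assert rand_image.min() >= 0.0 and rand_image.max() <= 1.0
print(rand_image.shape, rand_image.dtype)
```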

In [7]:
prediction = client.predict({"image_array": rand_image})
print(prediction)

server.stop()

[[4.1421477e-02 3.2405543e-01 2.4740364e-02 3.9147784e-05 6.0974360e-01]]
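
The output contains five values, one per class (`nClasses = 5` in the Java program), and they sum to approximately 1, which is consistent with class probabilities. The sketch below picks the most likely class from the printed values. The numbers are copied by hand from the output above, since the exact type returned by `client.predict` is not shown here.

```python
# Values copied by hand from the printed output above.
probabilities = [4.1421477e-02, 3.2405543e-01, 2.4740364e-02, 3.9147784e-05, 6.0974360e-01]

# The scores should sum to roughly 1 if they are class probabilities.
total = sum(probabilities)

# Index of the largest score, i.e. the predicted class.
best_index = max(range(len(probabilities)), key=probabilities.__getitem__)

print("sum of scores:", round(total, 6))
print("predicted class index:", best_index)
print("score:", probabilities[best_index])
```

Because the model was saved with freshly initialized weights and the input is random noise, the predicted class carries no meaning beyond showing that the round trip works.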
