Note: Do not run the Python cells. They are there to explain things, not to do the actual work (in fact, the deployment does not need any Jupyter notebooks at all). Running them will raise errors and may ruin the outputs.

# Importing the models
The first step is to copy the CNN model architectures and their weights from my DELE CA1 into this project. To keep things organised, I create a directory called "models" and store the code for initialising the models and loading their respective weights there. From the root directory, or when testing, I can then simply call the initialisation function to get a model, for example (see the Python cell below):

In [3]:
import models
model_128 = models.create_model_128()   # this will call the function to initialise the model, load the weights, and return the model

This makes the code cleaner, especially in the root directory and in the tests. I also name the Python file that initialises the models `__init__.py`, so the directory can be imported as a package for creating the models. Lastly, in the models' `__init__.py`, I added a function that returns the labels, so I don't have to keep copying and pasting them when several Python files need them, which will happen in the tests. These are the labels it returns:

In [4]:
labels = models.get_labels()
labels

['Bean',
 'Bitter_Gourd',
 'Bottle_Gourd',
 'Brinjal',
 'Broccoli',
 'Cabbage',
 'Capsicum',
 'Carrot',
 'Cauliflower',
 'Cucumber',
 'Papaya',
 'Potato',
 'Pumpkin',
 'Radish',
 'Tomato']
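
For illustration, the label helper in the models package could look roughly like the sketch below. This is not the actual file: decode_prediction is a hypothetical helper name, the label order is assumed to match the order of the models' output units, and the model-building functions are left out because they need the CA1 architecture and weights.

```python
# Hypothetical sketch of the label helpers in models/__init__.py.
# Assumption: the list order matches the order of the models' output units.
LABELS = [
    'Bean', 'Bitter_Gourd', 'Bottle_Gourd', 'Brinjal', 'Broccoli',
    'Cabbage', 'Capsicum', 'Carrot', 'Cauliflower', 'Cucumber',
    'Papaya', 'Potato', 'Pumpkin', 'Radish', 'Tomato',
]


def get_labels():
    """Return a copy of the class labels so callers cannot mutate the original."""
    return list(LABELS)


def decode_prediction(probabilities):
    """Map one row of class probabilities to its label (hypothetical helper)."""
    labels = get_labels()
    if len(probabilities) != len(labels):
        raise ValueError("expected one probability per label")
    # Index of the largest probability, i.e. a plain-Python argmax.
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best_index]
```

Keeping the labels in one place means the tests and the deployment code always agree on which index corresponds to which vegetable.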

# Test the models
Before I can move on to deploying the models locally, I need to run some tests on the models to ensure that the predictions work as intended. First, I copy a few images from the test directory of the DELE CA1 dataset into the test_images directory, which is under the tests directory. I picked a random image each of a bean, a carrot, and a pumpkin, and renamed the images after the vegetable type.

Then I set up a conftest.py with a few fixtures that load the images and preprocess the data for the tests. The first fixture, get_images(), opens all the images under the test_images directory, gets their labels from their filenames, and returns the data in the following format:

In [None]:
[[Image, Image, Image], ['label_1', 'label_2', 'label_3']] 

This is basically a nested list. The first inner list holds the Image objects, which are the test image data, and the second inner list holds strings, which are the labels of those Image objects. Each Image object is matched to its label by position, so for example:

In [None]:
# Assume the first list (the Image objects) holds the bean, carrot, and pumpkin images in that order, as below
data = [[Image, Image, Image], ['bean', 'carrot', 'pumpkin']] 
data[0][0]  # first list, first item: I expect the bean image data
data[1][0]  # second list, first item: I expect the matching 'bean' label

So this is what I meant when I said each Image object will be mapped to their label.
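
The collection step can be sketched as below. This is a minimal illustration, not the code in conftest.py: collect_images and its loader argument are hypothetical names, the loader defaults to reading raw bytes so the sketch runs without an imaging library, and filenames are assumed to look like bean.jpg.

```python
import tempfile
from pathlib import Path


def collect_images(directory, loader=Path.read_bytes):
    """Return [[image, ...], [label, ...]] for every file in `directory`.

    `loader` is a stand-in for a real image opener; the label is the
    filename without its extension.
    """
    images, labels = [], []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        images.append(loader(path))
        labels.append(path.stem.lower())
    return [images, labels]


# Small demonstration with three dummy files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("bean", "carrot", "pumpkin"):
        (Path(tmp) / f"{name}.jpg").write_bytes(name.encode())
    data = collect_images(tmp)
```

Because both inner lists are filled in the same loop, data[0][i] and data[1][i] always refer to the same file.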

The next two fixtures transform the images into the models' input. There are two of them, one for 128 pixels and one for 31 pixels, because my original idea was a single fixture that takes the get_images() fixture as an input and also receives an indirect parameter from the test cases. Unfortunately, I could not get an indirect parameter to work alongside another fixture argument when written like this:

In [None]:
# This does not work as written; it is what I originally intended to have.
# The function below would go in conftest.py
@pytest.fixture
def transform_images(get_images, pixel):
    return [[transformer(img, pixel) for img in get_images[0]], get_images[1]]  # transformer() resizes and preprocesses each image

# The function below would go in test_model.py
@pytest.mark.parametrize("transform_images", [128, 31], indirect=True)
def test_prediction(transform_images):
    pass    # the code here would test the prediction

Unfortunately, written this way I am unable to tell pytest that [128, 31] should be the values for the pixel argument of the transform_images() fixture. Because pixel is declared as an ordinary argument, pytest looks for another fixture named pixel; an indirect value is only exposed as request.param on pytest's built-in request fixture. So this plan does not work as written, and I ended up with two fixtures, as you can see in conftest.py right now. Lastly, to preprocess the data I convert the images to grayscale, resize them to the appropriate size, scale the pixel values to range from 0 to 1, and reshape the data into the shape the models expect.
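
The preprocessing steps can be sketched in plain Python as below. This is only an illustration under assumptions: the real transformer presumably uses an imaging library, the function names are hypothetical, the image is assumed to be rows of (r, g, b) tuples with values from 0 to 255, and the input shape (1, pixel, pixel, 1) is my assumption about what the models expect. The steps run in a different order from the sentence above, but the result is the same.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples into rows of luminance values."""
    # ITU-R BT.601 luma weights, a common choice for grayscale conversion.
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def resize_nearest(image, size):
    """Resize a 2D image to size x size using nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    return [
        [image[y * height // size][x * width // size] for x in range(size)]
        for y in range(size)
    ]


def preprocess(rgb_image, pixel):
    """Grayscale, resize, scale to [0, 1], and reshape to (1, pixel, pixel, 1)."""
    gray = to_grayscale(rgb_image)
    resized = resize_nearest(gray, pixel)
    # Each value becomes a one-element channel list; the outer list is the batch.
    return [[[[value / 255.0] for value in row] for row in resized]]
```

For example, preprocess(image, 128) and preprocess(image, 31) would give the inputs for the two models from the same source image.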

In test_model.py, I use the two fixtures to get the 128 pixel and 31 pixel scaled images. The test simply iterates over the list of 128 pixel images, makes a prediction for each, and asserts that the predicted label is the same as the actual label. The same process is repeated for the 31 pixel images. If the tests pass, it means the models load correctly and so do the predictions.

# Writing the Dockerfile for deployment
My original plan for deploying the two models is to have both of them share the same container while residing in different directories. Instead of building the models on my host and then sending them to GitLab and on to the cloud deployment, I save the source code and the models' weights in GitLab, send the code to the cloud deployment, and build the models on-site. To switch between models, I specify the model in the URL: when I ask for model_128, TensorFlow Serving uses model_128, and vice versa. This is set up in a configuration file that TensorFlow Serving reads.

So first, I have a Python file that is executed inside the Docker container. This file is called model.py and sits in the root directory of this project folder. The script first reads a variable called MODELS_BASE_PATH, which dictates where the models will be stored. It comes from an environment variable that is set in the Dockerfile. Then I change the working directory so that the script can access the models module that I built earlier, call the create-model functions in that module, and save the models in their respective directories. I use the current timestamp as the version so that each new build of a model gets a unique version. Both models get the same version number: they are served as separate models within the same container, so sharing a version does not matter, and I use the same timestamp for both. The 128 pixel model is saved under the model_128 directory and the 31 pixel model under model_31.
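
The path handling in model.py can be sketched as below. This is a minimal illustration, not the real script: export_paths and read_base_path are hypothetical names, the /models default is an assumption, and the actual saving of the Keras models is left as a comment because it needs TensorFlow and the weights.

```python
import os
import time
from pathlib import Path


def read_base_path(environ=os.environ):
    """Read MODELS_BASE_PATH; the '/models' fallback is an assumption."""
    return Path(environ.get("MODELS_BASE_PATH", "/models"))


def export_paths(base_path, version, names=("model_128", "model_31")):
    """Return {model name: base_path/name/version} for each model."""
    # TensorFlow Serving expects a numeric version directory under each model.
    return {name: Path(base_path) / name / str(version) for name in names}


# One integer timestamp shared by both models, as described above.
version = int(time.time())
paths = export_paths(read_base_path(), version)

# In the real script each model would then be saved into its directory,
# for example by saving models.create_model_128() under paths["model_128"].
```

Using an integer timestamp keeps every rebuild distinct while still being a valid version directory name.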

The trickiest part is figuring out how to write a Dockerfile that lets me run model.py to create the models and then set up TensorFlow Serving. Unfortunately, it is not straightforward, because pip is not installed by default, so I need to install pip first. Then I can install the required packages listed in requirements.txt and finally run the file. After that, I create a short bash script that starts TensorFlow Serving with the configuration provided under the config directory. I also need to make sure the environment variables are set appropriately, namely the configuration file path and the models base path.

Lastly, I write the config file to match the plan described above. This is what the config file looks like:
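
For reference, a TensorFlow Serving model config for two models generally has the shape rendered by the sketch below. The base path /models and the helper name render_model_config are assumptions for illustration, not values taken from my actual file.

```python
def render_model_config(base_path, names=("model_128", "model_31")):
    """Render a TensorFlow Serving model_config_list in protobuf text format."""
    entries = []
    for name in names:
        entries.append(
            "  config {\n"
            f"    name: '{name}'\n"
            f"    base_path: '{base_path}/{name}'\n"
            "    model_platform: 'tensorflow'\n"
            "  }"
        )
    return "model_config_list {\n" + "\n".join(entries) + "\n}\n"


# Assumed base path; the real value comes from MODELS_BASE_PATH.
config_text = render_model_config("/models")
print(config_text)
```

The name of each entry is what appears in the URL, which is why model_128 and model_31 can be selected by path.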

# Deploy the models locally
Before I deploy on render.com, I first need to see if I can deploy the models at all, so I try it locally. Once the code is finished, I build the Docker image with the following command:

In [None]:
sudo docker build -t veg_server_image:latest .

Note that I am running Linux on my host, hence the sudo command. The command is run from this project's root directory, which is why it ends with a '.' (dot): the dot refers to the current directory. After I run this command, Docker updates the package lists, installs pip, installs the needed Python packages, runs the file that builds and saves the models, and creates the script that starts TensorFlow Serving with its configuration.

Once it's done building, I get the image ID from the output of the command. This ID identifies the local image that I built, so I use it to run a container. In this case the image ID is 9e638465556c, so the command to run the container is:

In [None]:
sudo docker run -it -d -p 8501:8501 --name veg_server 9e638465556c

Running this creates a container called veg_server from the image I built and starts it. At this point, the models should be able to receive data from outside the container, make predictions, and send them back. So it is time to test.

# Test the locally deployed models
Once the models are successfully deployed, the first thing I do is visit the URLs to check that they are available, one for the 128 pixel model and one for the 31 pixel model. I visited http://localhost:8501/v1/models/model_128 and http://localhost:8501/v1/models/model_31, and both return the available state with error_code OK. If I try to visit something else, like http://localhost:8501/v1/models/model_69, it returns an error.

Next, I write another Python test file to automate two checks: making a GET request to the URLs above to see that I get the OK error_code, and testing the predictions from the container. The test_url_prediction test function is similar to test_model.py, except that instead of instantiating the model, I send a POST request to the deployed model to make a prediction. The good thing about naming the directories "model_128" and "model_31" is that I can conveniently specify which model I want by inserting the pixel number into the URL, like this:

In [None]:
get_url = lambda pixel: f'http://localhost:8501/v1/models/model_{pixel}:predict'

So if I want to use the 128 pixel model, I just pass 128 into the get_url function and get the prediction URL for that model.

In the test_connection test function, I simply send a GET request to http://localhost:8501/v1/models/model_{pixel}, where {pixel} is either 128 or 31. I then extract the error_code value from the response and assert that it is OK, which confirms that the connection works.
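
The status check can be sketched as below. This is an illustration rather than the actual test: status_url and is_serving are hypothetical names, and the sample payload only mirrors the shape of a TensorFlow Serving model status response, with a made-up version number.

```python
def status_url(pixel, base_url="http://localhost:8501"):
    """Build the model status URL for the given pixel size."""
    return f"{base_url}/v1/models/model_{pixel}"


def is_serving(status_payload):
    """Return True if any model version is AVAILABLE with error_code OK."""
    versions = status_payload.get("model_version_status", [])
    return any(
        entry.get("state") == "AVAILABLE"
        and entry.get("status", {}).get("error_code") == "OK"
        for entry in versions
    )


# Sample payload in the shape of a status response (version is made up).
sample = {
    "model_version_status": [
        {
            "version": "1700000000",
            "state": "AVAILABLE",
            "status": {"error_code": "OK", "error_message": ""},
        }
    ]
}
print(status_url(128), is_serving(sample))
```

In the real test, the payload would be the parsed JSON body of the GET response instead of a hand-written dictionary.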

I also need a function that sends the POST request, receives the prediction, and converts it into a label. I do all of this in one function so I can use it for both the 128 pixel images and the 31 pixel images. The assertions are exactly the same as in test_model.py.
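
A rough sketch of that helper is shown below, split into small functions so the request body and the response parsing can be checked without a running server. The function names are hypothetical, and urllib from the standard library is used here only to keep the sketch dependency-free; it is not necessarily the HTTP client used in the actual tests.

```python
import json
import urllib.request


def build_predict_body(batch):
    """Wrap a batch of preprocessed images in the REST predict request format."""
    return json.dumps({"instances": batch})


def post_json(url, body, timeout=30):
    """POST a JSON string to `url` and return the response body as text."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def labels_from_response(response_body, labels):
    """Turn the 'predictions' rows of a response into label strings."""
    predictions = json.loads(response_body)["predictions"]
    # For each row, pick the label at the index of the highest probability.
    return [labels[max(range(len(row)), key=row.__getitem__)]
            for row in predictions]
```

A test would then call post_json(get_url(pixel), build_predict_body(batch)) and compare labels_from_response(...) against the expected labels.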

In the end, the locally deployed models passed the tests, so I can move on to deploying the models on render.com.

# Deploy the models on render.com

Now I can move on to deployment on the internet. I have pushed my commit to GitLab, so I simply go to render.com and deploy my models with Docker, then let Render build the models. Once the build is done, the models should start serving. The model_128 will be at https://twob01-2239745-shawnlim-ca2-models.onrender.com/v1/models/model_128 and the model_31 will be at https://twob01-2239745-shawnlim-ca2-models.onrender.com/v1/models/model_31

# Test the externally deployed models

Finally, to check that the models are deployed properly and are working, I need to test the connection and also the prediction. Luckily, I can reuse the test file for the locally deployed models: all I need to do is change the URL, and it will test the externally deployed models. So I switch the URL and run the exact same tests. In the end, they also pass, so the externally deployed models are working and can predict the images as expected.
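
One way to make the URL switch painless is to read the base URL from a single place, as in the sketch below. This is only an idea under assumptions: MODEL_SERVER_URL is a hypothetical environment variable name, and in my test file the URL may simply be edited by hand.

```python
import os

LOCAL_URL = "http://localhost:8501"


def server_url(environ=os.environ):
    """Return the base URL, falling back to the local container."""
    # MODEL_SERVER_URL is a hypothetical variable name used for illustration.
    return environ.get("MODEL_SERVER_URL", LOCAL_URL).rstrip("/")


def status_url(pixel, environ=os.environ):
    """URL for checking whether model_{pixel} is available."""
    return f"{server_url(environ)}/v1/models/model_{pixel}"


def predict_url(pixel, environ=os.environ):
    """URL for sending prediction requests to model_{pixel}."""
    return f"{status_url(pixel, environ)}:predict"
```

With this layout, the same tests run against the local container by default and against Render when the variable points at the onrender.com address.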

# CICD the models

Lastly, so that the Keras models can be safely modified later, I want to implement CI/CD. If I ever improve the models to make better predictions, the tests run automatically, and if the models fail to build or predict, I will know immediately just by checking the automated testing in GitLab.

To ensure this can work, I first create a YAML file called .gitlab-ci.yml to configure my testing such that every time I commit changes to gitlab, it will test the building and predicting of the models to ensure it can still predict the images I place in the test_images folder.

There are two stages: one tests the models, which is the continuous integration part, and the other deploys them, which is the continuous deployment part. In the first stage, I pull the Python 3.11 image, install the needed modules, run the tests with the results written to a JUnit XML file, and store that file as an artifact regardless of whether the tests fail. The second stage deploys the models by triggering the "Deploy Hook" URL that came from my models deployment.