
MLOps


Introduction

MLOps is an emerging field of ML research that aims to enable and automate the deployment of ML models into production. According to sig-mlops, MLOps is defined as:

the extension of the DevOps methodology to include Machine Learning and Data Science assets as first class citizens within the DevOps ecology

In this repository we won't discuss the benefits and limitations of MLOps, but we provide some references for those who are interested in using it.

Note: AutoML is a technology that enables non-expert ML practitioners to build and deploy ML models. It can be used in conjunction with MLOps. However, it is still in its early stages and we're not going to discuss it here.

Pipeline

Based on our research and the requirements of the project, we decided to use the following pipeline:

  1. Model and dataset versioning: As ML-based software is fundamentally different from traditional software, model and dataset versioning is an issue that cannot be handled by git alone (the amount of data is too large).
  2. Automatic model training: We will automate the training of a face detection/recognition model.
  3. Automatic build: The packaging process will be automated (creation of Docker images, building of Docker containers, etc.).
  4. Automatic deployment: The Docker images will be deployed to a local server automatically.
  5. Model monitoring: We will provide simple logging and monitoring tools to monitor the performance of the model.
  6. Metadata gathering: Throughout pipeline execution, metadata will be gathered and stored in the database.
  7. Triggering mechanisms: Pipeline runs will be triggered either by pipeline changes or manually.
  8. Choosing edge devices: Learning about various edge devices, their limitations, and the models that can be used with them.
  9. Testing datasets: Examining and evaluating a few datasets that have been processed by edge devices.

A discussion of currently available tools for each stage of the pipeline is provided below.

Model and dataset versioning

As mentioned briefly above, ML-based software differs from traditional software in that code alone is not enough: one also needs the entire dataset to reproduce the exact model. Moreover, the explicit relationship between input and output is not known. So, versioning requires special attention.

Git is widely used for versioning and source control in traditional software. However, it is not suitable for ML-based software on its own: datasets are too large to be feasibly indexed in git, and models are binary files, so switching between different versions of a model is not easy. There are further reasons why git alone is not suitable for ML-based software; you can refer to this for more information.

Tools for versioning ML-based software:

  1. DVC: An open-source, Git-based version control system for ML projects. It is by far the most popular version control system for ML in the wild.
  2. dotmesh: According to dotmesh

    dotmesh (dm) is like git for your data volumes (databases, files etc) in Docker and Kubernetes

dotmesh does not have an active community, and its latest release was in 2020. So, DVC is the best, and practically the only, solution for version control. Some important features of DVC are (according to DVC features):

  • Git-compatible
  • Storage agnostic
  • Reproducible
  • Low friction branching
  • Metric tracking
  • ML pipeline framework
  • Language and framework agnostic
  • Track failures

So, we are considering DVC both for versioning and as a CI/CD tool.
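
As a quick illustration, DVC also exposes a Python API for fetching versioned artifacts. The following is a minimal sketch, assuming a hypothetical DVC-tracked repository with a data/train.csv file and a configured remote:

import dvc.api

# Read a specific version of a DVC-tracked dataset.
# The repo URL, file path, and tag below are hypothetical.
data = dvc.api.read(
    "data/train.csv",                              # path tracked by DVC
    repo="https://github.com/example/edge-mlops",  # hypothetical repo
    rev="v1.0",                                    # git tag/commit selecting the dataset version
)
print(len(data))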

Tools for CI/CD

Another option for us is Jenkins. This open-source software can provide a pipeline that runs right after the version-control step, but unlike GitLab and DVC, it does not provide any version control itself.

Docker Compose is another option, but it has the same problem as Jenkins: it does not provide any version control either.

We are weighing the choice between DVC and GitLab, as both of these tools are very useful in our case.

Our choice of tools

We are focusing on developing MLOps techniques for edge devices. Edge devices are quite diverse and are designed by different manufacturers, so we need a tool that is compatible with different edge devices, i.e., device-agnostic. Another important requirement is to be framework-agnostic: we should be able to use models trained with different frameworks without any modification. To tackle these two issues, we use the following pipeline:

pipeline

The ONNX standard helps us be framework-agnostic. Almost all training frameworks support ONNX: one can convert the final model to the .onnx format and later use it in inference frameworks that support this format (such as OpenVINO and ONNXRuntime). On the inference side, we are going to use ONNXRuntime. It is a cross-platform inference engine that supports multiple frameworks and hardware accelerators, which makes it a great choice for edge devices.
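
As a minimal sketch of what the inference side looks like (the model path and input shape below are assumptions; they depend on the exported model):

import numpy as np
import onnxruntime as ort

# Load the converted model and run a single inference pass.
session = ort.InferenceSession("convert_model/output.onnx",
                               providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
dummy = np.zeros((1, 256), dtype=np.float32)   # assumed input shape/dtype
outputs = session.run(None, {input_name: dummy})
print(outputs[0])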

In addition, we are using Docker to package our application and its dependencies. It also helps us create a CI/CD pipeline, which is essential for MLOps. As Docker itself can be inefficient for edge devices, we are using balenaOS, a lightweight OS tailored to each hardware target with the capability to run Docker containers. Under the hood, balenaOS uses Yocto to build the image file. As of this writing, it supports more than 80 devices. More information about balenaOS can be found here.

Model Training

For the model in this project, we decided to use the IMDB movie reviews dataset. This dataset contains user reviews of movies, each labeled as either positive or negative. Each entry in the dataset is an array of integers, where each integer represents a word in the dataset's dictionary.
For example, the word "film" is indexed as the integer 13 in this dataset. An example:

print(data[0])


[1, 14, 22, 16, 43, 530, 973, 1622, 1385, 65, 458, 4468, 66, 3941, 4, 173, 36, 256, 5, 25, 100, 43, 838, 112, 50, 670, 2, 9, 35, 480, 284, 5, 150, 4, 172, 112, 167, 2, 336, 385, 39, 4, 172, 4536, 1111, 17, 546, 38, 13, 447, 4, 192, 50, 16, 6, 147, 2025, 19, 14, 22, 4, 1920, 4613, 469, 4, 22, 71, 87, 12, 16, 43, 530, 38, 76, 15, 13, 1247, 4, 22, 17, 515, 17, 12, 16, 626, 18, 2, 5, 62, 386, 12, 8, 316, 8, 106, 5, 4, 2223, 5244, 16, 480, 66, 3785, 33, 4, 130, 12, 16, 38, 619, 5, 25, 124, 51, 36, 135, 48, 25, 1415, 33, 6, 22, 12, 215, 28, 77, 52, 5, 14, 407, 16, 82, 2, 8, 4, 107, 117, 5952, 15, 256, 4, 2, 7, 3766, 5, 723, 36, 71, 43, 530, 476, 26, 400, 317, 46, 7, 4, 2, 1029, 13, 104, 88, 4, 381, 15, 297, 98, 32, 2071, 56, 26, 141, 6, 194, 7486, 18, 4, 226, 22, 21, 134, 476, 26, 480, 5, 144, 30, 5535, 18, 51, 36, 28, 224, 92, 25, 104, 4, 226, 65, 16, 38, 1334, 88, 12, 16, 283, 5, 16, 4472, 113, 103, 32, 15, 16, 5345, 19, 178, 32]

If we translate the entry above, we get this:


# this film was just brilliant casting location scenery story direction everyone's really suited the part they played and you could just imagine being there robert # is an amazing actor and now the same being director # father came from the same scottish island as myself so i loved the fact there was a real connection with this film the witty remarks throughout the film were great it was just brilliant so much that i bought the film as soon as it was released for # and would recommend it to everyone to watch and the fly fishing was amazing really cried at the end it was so sad and you know what they say if you cry at a film it must have been good and this definitely was also # to the two little boy's that played the # of norman and paul they were just brilliant children are often left out of the # list i think because the stars that play them all grown up are such a big profile for the whole film but these children are amazing and should be praised for what they have done don't you think the whole story was so lovely because it was true and was someone's life after all that was shared with us all

The "#" characters are the ones that are not available in model's dictionary.

The directory saved_model contains the saved TensorFlow model.
The directory convert_model contains the ONNX model.
To get the ONNX output, use the command below:

$ python -m tf2onnx.convert --saved-model ./saved_model/ --opset 12 --output ./convert_model/output.onnx

ONNXRuntime Cross Compiling

We have cross-compiled ORT for the armv7 architecture and tested it on a Raspberry Pi 400. First, clone the onnxruntime repository and a custom protoc version for cross compiling (refer to the ORT documentation for more details). You can either follow these steps to compile ORT manually or use the Dockerfile provided in this repository. The manual steps are explained first, and then the Docker approach is introduced.

1) Manual compilation

I have used the following tool.cmake file for cross compiling:

SET(CMAKE_SYSTEM_NAME Linux)
SET(CMAKE_SYSTEM_VERSION 1)
SET(CMAKE_SYSTEM_PROCESSOR armv7-a)

SET(CMAKE_SYSROOT <path to sysroot>)

SET(CMAKE_C_COMPILER arm-none-linux-gnueabihf-gcc)
SET(CMAKE_CXX_COMPILER arm-none-linux-gnueabihf-g++)

SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-psabi")

SET(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
SET(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
SET(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
SET(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)

Make sure that the ARM toolchain is accessible from PATH (otherwise, provide absolute paths). You might also need to copy the linker and some executables to your /usr path (such as ld-linux-armhf.so.3 to /usr/arm-linux-gnueabihf/).

I compiled v1.12.1 of ORT. It seems that v1.13 has some issues with cmake, so make sure to use v1.12.1:

$ git checkout v1.12.1

Next, run the following command to cross compile ORT:

./build.sh --config Release --parallel --arm --update --build --build_shared_lib --cmake_extra_defines ONNX_CUSTOM_PROTOC_EXECUTABLE=<path to bin/protoc> CMAKE_TOOLCHAIN_FILE=<path to tool.cmake>

After a long wait, the dynamic and static libraries will be generated; you can find them in the build/Linux/Release/ directory. Set this path in CMakeLists.txt to compile this project. Also set the include directory path in CMakeLists.txt and change TC-arm.cmake accordingly. Finally, build the project (from the build folder):

$ cmake -DCMAKE_TOOLCHAIN_FILE=<path to TC-arm.cmake> -DCMAKE_INSTALL_PREFIX=<install prefix> ..
$ make 
$ make install

And you're all set!

2) Docker image

You can use the Dockerfile provided in this repository to build a Docker image that cross-compiles your project with the ORT libraries. The final image can be used either manually or as a base image for your project. The image is built on the Ubuntu 22.04 base image and is around 4 GB. You can build it with the following command (make sure that you are in this repository's root directory):

$ docker build . -t edgemlops:1.0.0

The image build can take about two hours (depending on your machine and internet speed). After building the image, the ORT libraries are in the /ORT/onnxruntime/ directory.


MQTT Managers

In the mqtt folder, there are two programs: one for the host machine, which manages the connected devices that are supposed to run the model, and one for the clients on the edge. These programs need a broker to communicate with each other; you can use software such as mosquitto to set up your own broker.
To use these programs, you need to compile them against the Eclipse Paho MQTT C library:
https://github.com/eclipse/paho.mqtt.c
Make sure to edit its CMakeLists file to build the static libraries as well.
To connect, simply pass the broker's IP to the program as an argument:

$ ./hostManager "192.168.1.110"

The host manager program must be in the same folder as the other folders, such as scripts and inference. To build the inference program, you need to already have the Docker image so that the program can use it.
The scripts folder contains simple implementations of the operations we want to run from the host, such as moving files to the edge device, instructions for compiling the inference program, and other similar scripts.
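
For illustration only: the repository's managers are C programs built against the Paho MQTT C library, but the same subscribe/receive loop can be sketched with the Paho Python client (paho-mqtt, v1.x callback API); the topic name below is hypothetical:

import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    client.subscribe("edge/status")          # hypothetical topic

def on_message(client, userdata, msg):
    print(msg.topic, msg.payload.decode())

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("192.168.1.110", 1883)        # broker IP, as in the example above
client.loop_forever()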


Edge Devices

When it comes to choosing the right edge device, it's important to consider our specific use case. There are several options available on the market, but two popular choices are the Raspberry Pi and the NVIDIA Jetson. In the following, we provide a comparative analysis of these devices.

Raspberry Pi

The Raspberry Pi is a popular choice for edge computing due to its low cost and versatility. It is a credit-card-sized computer that can run various operating systems, including Linux and Windows.

Running an ML program on a Raspberry Pi requires a significant amount of memory (RAM) to process calculations. The latest and preferred model for ML applications is the Raspberry Pi 4 Model B.

Typical ML projects for the Raspberry Pi involve classifying items, including different visual, vocal, or statistical patterns. The backbone of every ML model is a software library and its dependencies. A variety of free ML frameworks are currently available; some of the most well-known platforms include the following:

  • TensorFlow: A flexible platform for building general ML models.
  • OpenCV: A library dedicated to computer vision and related object detection tasks.
  • Google Assistant: A library dedicated to voice recognition tasks.
  • Edge Impulse: A cloud-based platform that simplifies ML app development.

The Raspberry Pi can be used to train and run ML models for image classification. For example, you can use TensorFlow to train a model on a dataset of images and then use it on a Raspberry Pi to classify new images in real time.
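
On-device inference is commonly done with the TensorFlow Lite interpreter (pip install tflite-runtime); below is a minimal sketch, assuming a classifier exported as model.tflite (the path and input are placeholders):

import numpy as np
import tflite_runtime.interpreter as tflite

interpreter = tflite.Interpreter(model_path="model.tflite")   # hypothetical path
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

frame = np.zeros(inp["shape"], dtype=inp["dtype"])            # stand-in for a camera frame
interpreter.set_tensor(inp["index"], frame)
interpreter.invoke()
print(interpreter.get_tensor(out["index"]))                   # class scores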
Tutorials on building real-time object recognition on the Raspberry Pi using TensorFlow and OpenCV are available here.

Other varied uses, such as voice recognition and anomaly detection, are covered in the tutorials and examples here.

For more information about the Raspberry Pi, click here.

Jetson

Jetson is a line of embedded systems designed by NVIDIA specifically for edge computing applications. Jetson devices are equipped with a powerful GPU, which makes them ideal for tasks such as image and video processing, machine learning, and deep learning. Jetson devices are more expensive than Raspberry Pi, but they offer better performance and capabilities for demanding edge computing tasks.

As was the case with the Raspberry Pi, ML applications require a sizable amount of memory (RAM); the Jetson Nano is therefore the device most commonly used for ML applications.

Several ML frameworks are compatible with Jetson, just like Raspberry Pi. In addition to the frameworks listed in the Raspberry Pi section, Jetson also supports PyTorch. PyTorch is known for its ease of use and flexibility, and is widely used in computer vision and natural language processing applications.

Like the Raspberry Pi, the Jetson can run many kinds of ML models. Models for object detection, facial recognition, audio recognition, natural language processing, and several other applications are just a few examples.

A few Jetson ML model examples and tutorials are available here.

For more information about the Jetson Nano, click here.

Challenges and Solutions

When building ML models on resource-limited devices such as the Raspberry Pi or Jetson Nano, one of the main challenges that can arise is a lack of available RAM. ML models often require a significant amount of memory to operate, and if there isn't enough RAM available, the models may not be able to run properly or may even crash.

There are several strategies that can be employed to mitigate RAM problems when building ML models on these devices. One approach is to use a smaller model architecture that requires less memory. This can be achieved by reducing the number of layers or neurons in the model.

Another strategy is to reduce the batch size used during training. By using a smaller batch size, less memory is required to store the intermediate activations of the model during training. However, this can also result in longer training times and reduced training accuracy.
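
A sketch of both mitigations in Keras (the layer sizes and batch size below are arbitrary placeholders):

import tensorflow as tf

# A deliberately small architecture: few layers, few neurons.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# A reduced batch size lowers peak memory during training:
# model.fit(x_train, y_train, batch_size=8, epochs=5)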

One possible solution is to use a swapfile. A swapfile is a file on the system's hard drive that is used as virtual memory when the system runs out of physical RAM. When the system needs more memory than what is available in RAM, it swaps out the least-used memory pages to the swapfile, freeing up space in RAM for more important processes. However, it's important to note that using a swapfile can slow down the system's performance, as accessing the hard drive is slower than accessing RAM. Therefore, it's recommended to use a swapfile only as a temporary solution when running memory-intensive processes on these devices.

Datasets

In this discussion, we'll look at some of the interesting datasets that have been, or could be, analyzed using the devices we've talked about, as well as the conclusions drawn from such investigations.

CIFAR Dataset

CIFAR (Canadian Institute For Advanced Research) is a collection of datasets that are commonly used for image recognition. The most popular is CIFAR-10, which consists of 60,000 32×32 color images in 10 classes, with 6,000 images per class.

The CIFAR-10 dataset can be downloaded from the official website here.

The CIFAR-10 dataset can be used on a Raspberry Pi for various image recognition tasks, such as object recognition and image classification. The small size of images in this dataset makes it easy to work with.
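
For reference, CIFAR-10 can also be loaded directly through Keras (one convenient route when TensorFlow is installed on the device):

from tensorflow.keras.datasets import cifar10

# Downloads and caches the dataset on first use.
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print(x_train.shape)   # (50000, 32, 32, 3)
print(x_test.shape)    # (10000, 32, 32, 3)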

Numerous studies have been conducted in this direction, and one excellent thorough study with time and memory usage results is available here.

MNIST Dataset

The MNIST dataset is a classic dataset of handwritten digits, often used as a benchmark for image classification tasks. It consists of 70,000 grayscale images of size 28×28 pixels, each representing a single digit from 0 to 9. The dataset is split into 60,000 training images and 10,000 test images.

The MNIST dataset can be downloaded from the official website here.

Several repositories have used this dataset in interesting ways. For example, there is a TensorFlow Lite version that utilizes a camera; the setup procedure is available here.

Speech Commands Dataset

This dataset is a collection of short audio clips, each containing a spoken command. It is often used for speech recognition tasks, where the goal is to identify the spoken command from the audio clip. The dataset contains different spoken commands such as "yes", "no", "up", "down", and "stop".

This dataset can be downloaded from here.

Source code that uses TensorFlow for classification and data processing on this dataset is available here.
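
As a sketch of how such labeled audio can be loaded (assuming the dataset is extracted into a speech_commands/ directory with one sub-folder per command word, and TensorFlow >= 2.10):

import tensorflow as tf

train_ds = tf.keras.utils.audio_dataset_from_directory(
    "speech_commands/",              # hypothetical extraction path
    batch_size=32,
    output_sequence_length=16000,    # pad/trim clips to 1 s at 16 kHz
)
print(train_ds.class_names)          # labels inferred from sub-folder names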

UrbanSound8K Dataset

The UrbanSound8K dataset is a popular dataset used for sound classification tasks. It consists of 8,732 labeled sound clips, each up to 4 seconds long, classified into 10 classes of urban sounds: air conditioner, car horn, children playing, dog bark, drilling, engine idling, gun shot, jackhammer, siren, and street music.
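
A sketch of exploring the dataset's metadata (assuming the standard metadata/UrbanSound8K.csv file with "class" and "fold" columns):

import pandas as pd

meta = pd.read_csv("UrbanSound8K/metadata/UrbanSound8K.csv")
print(meta["class"].value_counts())   # clips per urban-sound class
print(meta.groupby("fold").size())    # the predefined 10-fold split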

Visit this page for additional details about this dataset and to download it.

One instance of classification on this dataset can be found here. Besides classification, that repository also includes a guide to creating a Docker image.

IMDB movie reviews dataset

This dataset contains 50,000 movie reviews, split into 25,000 for training and 25,000 for testing. Each review is labeled as either positive or negative based on its overall sentiment. The dataset is often used for sentiment analysis, building recommendation systems, and product research, as it provides valuable insights into customer opinions and preferences.

To learn more about this dataset and to download it, visit this website.

Stanford Question Answering Dataset (SQuAD)

This dataset contains over 100,000 question-answer pairs based on Wikipedia articles. The dataset is designed to test the ability of machine learning models to answer human-generated questions by providing a large corpus of text and a set of associated questions. This dataset can be used to train question-answering models and build chatbots.
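
A sketch of iterating over the question-answer pairs (assuming the SQuAD v1.1 JSON layout of data -> paragraphs -> qas, with the file downloaded as train-v1.1.json):

import json

with open("train-v1.1.json") as f:
    squad = json.load(f)

for article in squad["data"][:1]:
    for paragraph in article["paragraphs"][:1]:
        print(paragraph["context"][:80])                 # supporting passage
        for qa in paragraph["qas"][:3]:
            answers = [a["text"] for a in qa["answers"]]
            print(qa["question"], "->", answers)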

This dataset can be downloaded from here.

One instance of question answering of this dataset can be found here.


How to guides

We provide documentation on common "How to" questions. You can refer to one of the following docs for more information: