
EfficientBioAI

This package mainly focuses on the efficiency (latency improvement, energy saving, ...) of BioImage AI tasks. For the moment, we have implemented quantization and pruning algorithms.

1. Introduction:

Fig. 1: Overview of the toolbox.

As illustrated in the figure above, the whole workflow consists of two steps: compression and inference. In the first step, we prune the pretrained model, quantize it into int8 precision, and then transform it into a format suitable for the inference engine. In the second step, we run the model on the inference engine to make predictions. We use OpenVINO as the inference engine for Intel CPUs and TensorRT for NVIDIA GPUs.
We support several popular bioimage AI tools, such as mmv_im2im and cellpose, as well as user-defined PyTorch models (see our paper for details on the restrictions).

2. System requirements:

Hardware:

  • CPU inference: processors with AVX-512 support (check here; normally the Intel Xeon® Processor Family)
  • GPU inference: NVIDIA GPU with int8 support (check here); a quick check sketch follows this list
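For a quick sanity check of your hardware, here is a hedged sketch (not part of the toolbox; the AVX-512 check reads /proc/cpuinfo and is Linux-only, and the compute-capability threshold of 6.1 for int8 is an assumption based on NVIDIA's DP4A requirement):

import torch

# CPU: look for AVX-512 flags in the kernel's CPU report (Linux only)
with open("/proc/cpuinfo") as f:
    print("AVX-512 support:", "avx512" in f.read())

# GPU: int8 (DP4A) kernels generally require compute capability >= 6.1
if torch.cuda.is_available():
    print("int8-capable GPU:", torch.cuda.get_device_capability() >= (6, 1))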

Operating System:

Linux (Ubuntu 20.04, Debian 10) and Windows 10 have been tested. At the moment, we do not support macOS.

Dependencies:

The dependencies will be installed automatically.

Versions:

The stable version is 0.0.6.

3. Installation:

Typical installation time:

Around 5 minutes for Linux users and 20 minutes for Windows users.

Prerequisite for Windows users:

If you want to use GPU inference, several things should be checked (a verification sketch follows the list):

  1. Make sure CUDA and cuDNN are installed properly. CUDA 11.3 and cuDNN 8.9.0 have been tested successfully.
  2. Currently, TensorRT cannot be installed via pip on Windows. Users have to install it from the zip file; version 8.4.3.1 has been tested successfully by the author.
  3. To properly install PyCUDA, MS Build Tools may be required.
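A quick way to verify these versions from Python (a hedged sketch; it assumes torch and tensorrt are already importable in your environment):

import torch
import tensorrt

print("CUDA:", torch.version.cuda)               # tested: 11.3
print("cuDNN:", torch.backends.cudnn.version())  # tested: 8.9.0 (reported as e.g. 8900)
print("TensorRT:", tensorrt.__version__)         # tested: 8.4.3.1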

Install from scratch:

Installation videos are here: Linux, Windows

First create a virtual environment using conda:

conda config --add channels conda-forge
conda create -n efficientbioai python=3.8 setuptools=59.5.0
conda activate efficientbioai

Then we can install the package:

git clone https://github.com/MMV-Lab/EfficientBioAI.git
# optional: give users the right to read/write/execute the setup script.
chmod 777 ./EfficientBioAI/installation/setup.sh
# for cpu: (Intel, AMD that supports AVX-512)
./EfficientBioAI/installation/setup.sh cpu
# for gpu: (NVIDIA)
./EfficientBioAI/installation/setup.sh gpu
# for both CPU and GPU:
./EfficientBioAI/installation/setup.sh all

For Windows users, please substitute ./EfficientBioAI/installation/setup.sh with .\EfficientBioAI\installation\setup.bat
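To verify the installation afterwards, a minimal hedged check is to run the following inside the efficientbioai environment:

# should finish without an ImportError
import efficientbioai  # noqa: F401
print("EfficientBioAI is installed")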

Install via Docker:

We use different Docker images for CPU and GPU. To install Docker, please check: desktop, command line. The GPU version also requires nvidia-docker (or from here) to communicate with the GPU hardware.

We recommend using the VS Code Docker extension to run our tutorial.

Here is a video demonstrating how to install and use our tool via Docker: video tutorials for the Docker versions.

  • for CPU:
# 1. Pull the image
docker pull mmvlab/efficientbioai:cpu
# 2. Start the container. Your current folder is mounted to /workspace in the container.
docker run -it --rm --name efficientbioai_cpu --shm-size=2gb -v ./:/workspace/ mmvlab/efficientbioai:cpu /bin/bash
  • for CPU+GPU:
# 1. Pull the image
docker pull mmvlab/efficientbioai:all
# 2. Start the container. Your current folder is mounted to /workspace/tmp in the container.
docker run -it --rm --gpus all --name efficientbioai_all --shm-size=2gb -v ./:/workspace/tmp mmvlab/efficientbioai:all /bin/bash

4. Quick Start:

There is an example from ZeroCostDL4Mic.

5. Tricks:

Suppose you already have a pretrained model and want to compress it using our toolkit. Several conditions need to be satisfied:

  • The model should be a torch.nn.Module.
  • The model should contain no dynamic control flow (see here for more details and here for examples); a minimal illustration follows this list.
  • Avoid calling self-defined model class members explicitly outside the class during quantization (see here for descriptions and cases).
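To illustrate the dynamic-flow restriction, here is a minimal sketch with hypothetical modules (not from the toolbox): tracing-based quantization records a single execution path, so branches that depend on input values cannot be captured and should be rewritten statically.

import torch.nn as nn

class DynamicNet(nn.Module):
    def forward(self, x):
        # data-dependent branch ("dynamic flow"): breaks tracing
        if x.sum() > 0:
            return x.relu()
        return -x

class StaticNet(nn.Module):
    def forward(self, x):
        # a single static path: trace-friendly
        return x.relu()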

If these conditions are satisfied, just check the 6. Instructions for use section to see how to run the code. There is also an example from ZeroCostDL4Mic.

If not, check the examples linked above to see how to resolve these problems.

Expected run time:

  • Compression:
    • quantization: post-training quantization (PTQ) takes several minutes.
    • pruning: the pruning itself takes several minutes; fine-tuning takes longer, depending on the number of iterations and the size of the training data.
  • Inference: latencies for several different tasks can be found in Table 1 of our preprint.

6. Instructions for use:

There are two ways to run the code: using the provided scripts or using the API. Both require a config YAML file; you can find an example here: config file example.

Use script:

  • compression:
python efficientbioai/compress.py --config path/to/the/config.yaml --exp_path experiment/save_path

All the intermediate files will be saved in the exp_path folder.

  • inference:
python efficientbioai/inference.py --config path/to/the/config.yaml

Use API:

A minimal code snippet is listed below. Basically, you need to define a calibration function for quantization and a fine-tuning function for pruning. Then you can use the API to do the compression and inference; hedged, fleshed-out sketches of these functions and of the final inference call follow the snippets below.

  • to compress:
from pathlib import Path
from copy import deepcopy
import yaml
from efficientbioai.compress_ppl import Pipeline
from efficientbioai.utils.misc import Dict2ObjParser

# read in the data; it should be iterable (a dataloader, a list, ...)
data = ...
# the pretrained model to compress (a torch.nn.Module)
net = ...

def calibrate(model, data, device):
    # define how inference is done, e.g. 'result = model(data)'
    ...
    return model

def fine_tune(model, data, device):
    # define how training is done, e.g. 'pl.Trainer(max_epochs=100).fit(model, data)'
    ...
    return model

# read in the config file
cfg_path = Path("./custom_config.yaml")
with open(cfg_path, "r") as stream:
    config_yml = yaml.safe_load(stream)
exp_path = Path("./exp")
exp_path.mkdir(exist_ok=True)

# do the compression:
pipeline = Pipeline.setup(config_yml)
pipeline(deepcopy(net), data, fine_tune, calibrate, exp_path)
pipeline.network2ir()  # convert the compressed network to the inference-engine IR format
  • to inference:
from efficientbioai.infer.backend.openvino import create_opv_model, create_trt_model
# for cpu:
infer_path = exp_path / "academic_deploy_model.xml"
quantized_model = create_opv_model(infer_path)
# for gpu:
infer_path = exp_path / "academic_deploy_model.trt"
quantized_model = create_trt_model(infer_path)
# Then do the inference as normal
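For illustration, here are hedged, fleshed-out versions of the two user-defined functions. These are sketches under assumptions (a model with a standard forward pass, and a dataloader yielding either plain batches or (input, target) pairs; the loss and optimizer choices are illustrative, not prescribed by the toolbox):

import torch

def calibrate(model, data, device):
    # run a few forward passes so the quantizer can observe activation ranges
    model.to(device).eval()
    with torch.no_grad():
        for batch in data:
            model(batch.to(device))
    return model

def fine_tune(model, data, device):
    # a minimal training loop to recover accuracy after pruning
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    for inputs, targets in data:
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(inputs.to(device)), targets.to(device))
        loss.backward()
        optimizer.step()
    return model

And a minimal usage sketch for the quantized model, assuming it is called like a regular PyTorch module as the comment above suggests (the input shape here is hypothetical):

import torch

dummy_input = torch.randn(1, 1, 256, 256)  # hypothetical NCHW input
with torch.no_grad():
    prediction = quantized_model(dummy_input)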

Reproduction instructions

All the supplementary data can be downloaded from [anonymous] (under review; upon paper acceptance, the data will be released on Zenodo). It includes all the model checkpoints, the data for training and testing, and the files for the experiments, so you can use our pretrained models to test the performance of our toolbox on the provided test data for each specific task.
