# 2D classification task with FLaVor inference service

This guide will walk you through tailoring the FLaVor inference service for 2D classification tasks using the model from [cft-chexpert](https://github.com/maxium0526/cft-chexpert).

## Prerequisite

Please ensure the following dependencies are installed in your working environment:

```txt
python > 3.8
torch > 1.13
```

or simply run:

```sh
poetry install --with examples --extras infer 
```

Next, clone [cft-chexpert](https://github.com/maxium0526/cft-chexpert) into a local `chexpert` directory:

```sh
# pwd: examples/inference
git clone https://github.com/maxium0526/cft-chexpert.git chexpert
```


## Implementation

### Setup imports

```python
import os
from typing import Any, List, Sequence, Tuple

import cv2
import numpy as np

from chexpert.utils.wrappers import Wrapper
from flavor.serve.apps import InferAPP
from flavor.serve.inference import (
    BaseAiCOCOInferenceModel,
    BaseAiCOCOInputDataModel,
    BaseAiCOCOOutputDataModel,
)
from flavor.serve.models import AiImage, InferCategory
from flavor.serve.strategies import AiCOCOClassificationOutputStrategy
```

### Setup inference model

In this section, we create `ClassificationInferenceModel`, which inherits from `BaseAiCOCOInferenceModel`. A few abstract methods must be overridden: `define_inference_network`, `set_categories`, `set_regressions`, `data_reader` and `output_formatter`. Overriding `preprocess`, `inference` and `postprocess` is optional; here we override only `inference`, because the 2D CheXpert network is invoked through the wrapper's `predict` method.

First, we implement `define_inference_network`, `set_categories` and `set_regressions`. These are invoked in the `__init__()` constructor of the parent class `BaseAiCOCOInferenceModel`. `define_inference_network` defines the inference network and loads its pre-trained weights. `set_categories` and `set_regressions` define the category and regression information. For example, a classification output contains `c` channels, and `set_categories` specifies the exact meaning of each channel. Refer to the `ClassificationInferenceModel` definition in this section for more detail.
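
To make the channel-to-category correspondence concrete, here is a minimal, framework-free sketch. It assumes the output is a flat sequence with one score per channel and that the list returned by `set_categories` is in the same order; the helper `label_channels` and the scores are hypothetical and not part of FLaVor.

```python
from typing import Dict, List, Sequence

# Same structure as the list returned by `set_categories` (shortened here).
categories: List[Dict[str, str]] = [
    {"name": "Atelectasis"},
    {"name": "Cardiomegaly"},
    {"name": "Consolidation"},
]

# Made-up scores, one per output channel, in the same order as `categories`.
channel_scores = [0.10, 0.72, 0.05]


def label_channels(
    categories: Sequence[Dict[str, str]], scores: Sequence[float]
) -> Dict[str, float]:
    """Pair the i-th channel score with the i-th category name."""
    if len(categories) != len(scores):
        raise ValueError("expected exactly one score per category")
    return {category["name"]: score for category, score in zip(categories, scores)}


named_scores = label_channels(categories, channel_scores)
```

The order of the category list is what gives each channel its meaning, so reordering the list without reordering the model output would silently mislabel the predictions.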

Next, we implement the remaining methods, which are used in the `__call__` function of our inference model. The workflow is shown below.

### `__call__` function workflow for the inference model
![__call__](images/call.png "inference workflow")

```python
class ClassificationInferenceModel(BaseAiCOCOInferenceModel):
    def __init__(self):
        self.formatter = AiCOCOClassificationOutputStrategy()

        # Per-category decision thresholds applied in `output_formatter`.
        self.thresholds = {
            "Atelectasis": 0.3,
            "Cardiomegaly": 0.04,
            "Consolidation": 0.17,
            "Edema": 0.12,
            "Enlarged Cardiomediastinum": 0.09,
            "Fracture": 0.07,
            "Lung Lesion": 0.05,
            "Lung Opacity": 0.26,
            "No Finding": 0.06,
            "Pleural Effusion": 0.14,
            "Pleural Other": 0.02,
            "Pneumonia": 0.04,
            "Pneumothorax": 0.31,
            "Support Devices": 0.49,
        }
        super().__init__()

    def define_inference_network(self):
        # Load the pre-trained weights from the cloned `chexpert` repository.
        return Wrapper(os.path.join(os.getcwd(), "chexpert/instances/optimized_model.h5"))

    def set_categories(self):
        categories = [
            {"name": "Atelectasis"},
            {"name": "Cardiomegaly"},
            {"name": "Consolidation"},
            {"name": "Edema"},
            {"name": "Enlarged Cardiomediastinum"},
            {"name": "Fracture"},
            {"name": "Lung Lesion"},
            {"name": "Lung Opacity"},
            {"name": "No Finding"},
            {"name": "Pleural Effusion"},
            {"name": "Pleural Other"},
            {"name": "Pneumonia"},
            {"name": "Pneumothorax"},
            {"name": "Support Devices"},
        ]
        return categories

    def set_regressions(self):
        return None

    def data_reader(self, files: Sequence[str], **kwargs) -> Tuple[np.ndarray, None, None]:
        # Read as grayscale, expand to 3 channels, and resize to 224x224.
        img = cv2.imread(files[0], cv2.IMREAD_GRAYSCALE)
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
        img = cv2.resize(img, (224, 224), interpolation=cv2.INTER_AREA)
        img = img.astype(np.float32)

        return img, None, None

    def inference(self, x: np.ndarray) -> np.ndarray:
        return self.network.predict(x)

    def output_formatter(
        self,
        model_out: np.ndarray,
        images: Sequence[AiImage],
        categories: List[InferCategory],
        **kwargs
    ) -> Any:
        # Binarize each category score with its own threshold.
        format_output = np.zeros(len(categories))
        for i, category in enumerate(categories):
            name = category["name"]
            format_output[i] = int(model_out[name] > self.thresholds[name])

        output = self.formatter(model_out=format_output, images=images, categories=categories)
        return output
```
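
The thresholding step in `output_formatter` can be tried in isolation. The sketch below assumes, as the indexing `model_out[name]` suggests, that the network returns a mapping from category name to score; the function `binarize` and the scores are made up for illustration, and plain lists stand in for the NumPy array.

```python
from typing import Dict, List

# A subset of the thresholds used in `ClassificationInferenceModel`.
thresholds: Dict[str, float] = {
    "Atelectasis": 0.3,
    "Cardiomegaly": 0.04,
    "Support Devices": 0.49,
}

# Made-up scores standing in for the network output.
scores: Dict[str, float] = {
    "Atelectasis": 0.25,
    "Cardiomegaly": 0.10,
    "Support Devices": 0.49,
}


def binarize(
    scores: Dict[str, float], thresholds: Dict[str, float], names: List[str]
) -> List[int]:
    """Return 1 where the score is strictly above its threshold, else 0."""
    return [int(scores[name] > thresholds[name]) for name in names]


labels = binarize(scores, thresholds, list(thresholds))
# labels == [0, 1, 0]: a score equal to its threshold is not a positive.
```

Because the comparison is strict, a score exactly at the threshold (here `Support Devices`) is reported as negative.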


### Integration with InferAPP
We can integrate the inference model with FLaVor `InferAPP`, a FastAPI application. To initiate the application, users have to define `input_data_model` and `output_data_model`, which are the standard input and output structures for the service, and then provide `infer_function` as the main inference operation. After the service is initiated, the `/invocations` API endpoint is available to process inference requests. We encourage users to implement a stand-alone Python script based on this Jupyter notebook tutorial.

#### (Optional) To initiate the application in a Jupyter notebook, run the following block.

```python
# This block is only for jupyter notebook. You don't need this in stand-alone script.
import nest_asyncio
nest_asyncio.apply()
```

#### Initiate the service

```python
app = InferAPP(
    infer_function=ClassificationInferenceModel(),
    input_data_model=BaseAiCOCOInputDataModel,
    output_data_model=BaseAiCOCOOutputDataModel,
)
```

```python
app.run(port=int(os.getenv("PORT", 9111)))
```

### Send request
We can send a request to the running server with `send_request.py`, which opens the input files and the corresponding JSON file and sends them as form data. The response is expected to be in AiCOCO format.

```bash
# pwd: examples/inference
python send_request.py -f chexpert/demo_img.jpg -d test_data/cls/input.json
```
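
`send_request.py` itself is not reproduced in this guide. As a rough sketch of what such a client might look like, the code below assumes that the server accepts the image under a form field named `files` and each top-level key of the JSON file as a JSON-encoded form field. Both are assumptions rather than confirmed FLaVor behavior. The functions `to_form_fields` and `send_request`, the `localhost` host name, and the example payload are hypothetical; the port and path follow the defaults shown above. The third-party `requests` package is needed only when the request is actually sent.

```python
import json
from typing import Any, Dict


def to_form_fields(payload: Dict[str, Any]) -> Dict[str, str]:
    """JSON-encode every top-level value so it can be sent as a form field."""
    return {key: json.dumps(value) for key, value in payload.items()}


def send_request(
    image_path: str,
    json_path: str,
    url: str = "http://localhost:9111/invocations",
    file_field: str = "files",  # assumed field name, not confirmed by this guide
) -> Any:
    """POST one image and its JSON description as multipart form data."""
    import requests  # third-party; imported here so to_form_fields works without it

    with open(json_path) as f:
        fields = to_form_fields(json.load(f))
    with open(image_path, "rb") as image:
        response = requests.post(url, data=fields, files=[(file_field, image)])
    response.raise_for_status()
    return response.json()


# Hypothetical payload, only to show the encoding step (no network needed).
example_fields = to_form_fields({"images": [{"id": "0"}], "count": 1})
```

With the server running, a call such as `send_request("chexpert/demo_img.jpg", "test_data/cls/input.json")` would mirror the command above, provided the assumed field names match what the service expects.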

## Setup Dockerfile
To interact with other services, we have to wrap the inference model in a Docker container. Here is an example Dockerfile.

```dockerfile
FROM nvidia/cuda:12.2.2-runtime-ubuntu20.04

RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        python3.9 \
        python3-pip \
        wget \
        git \
    && ln -sf /usr/bin/python3.9 /usr/bin/python

RUN pip install torch==2.1.0+cu121 --extra-index-url https://download.pytorch.org/whl/cu121 --default-timeout=1000
RUN pip install https://github.com/ailabstw/FLaVor/archive/refs/heads/release/stable.zip -U && pip install "flavor[infer]"

WORKDIR /app

RUN git clone https://github.com/maxium0526/cft-chexpert.git /app/chexpert

COPY your_script.py  /app/

CMD ["python", "your_script.py"]

```