# ML4CV project work

Summary: Improving and explaining instance segmentation on a litter detection dataset

Members:
- Dell'Olio Domenico
- Delvecchio Giovanni Pio
- Disabato Raffaele

This project aims to improve instance segmentation results on the [TACO Dataset](http://tacodataset.org/).

We implemented and tested several architectures, chosen among the highest scoring on COCO instance segmentation, in order to compare their performance.
We also tested some explainability methods on these models to try to explain their predictions.

## This notebook contains:
- SOLOv2 model training

In [None]:
# Installing Detectron2 and required libraries
!pip install 'git+https://github.com/facebookresearch/detectron2.git@5aeb252b194b93dc2879b4ac34bc51a31b5aee13'
!pip install rapidfuzz==2.15.1
!python -m pip install numpy==1.23.1
import numpy as np
np.bool = np.bool_

In [None]:
# Repository cloning and requirements installation
%cd /content/
!git clone https://github.com/DomMcOyle/TACO-expl.git
%cd /content/TACO-expl
!git checkout solov2
!git pull origin solov2
%cd /content/TACO-expl/AdelaiDet/
!python setup.py build develop

In [None]:
# loading drive
from google.colab import drive
drive.mount("/content/MyDrive/", force_remount = True)

Mounted at /content/MyDrive/


In [None]:
# checking detectron2 installation and nvcc version
import torch, detectron2
!nvcc --version
TORCH_VERSION = ".".join(torch.__version__.split(".")[:2])
CUDA_VERSION = torch.__version__.split("+")[-1]
print("torch: ", TORCH_VERSION, "; cuda: ", CUDA_VERSION)
print("detectron2:", detectron2.__version__)

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
torch:  2.2 ; cuda:  cu121
detectron2: 0.6


In this notebook we explore the capabilities of the SOLOv2 architecture for instance segmentation, presented by Wang et al. in this [paper](https://arxiv.org/pdf/2003.10152).

![](./res/solov2.png)

The architecture is an improved version of [SOLO](https://arxiv.org/pdf/1912.04488) (Segmenting Objects by LOcations) by the same authors. The original work proposes a single-stage instance segmentation architecture based on dividing the input into an SxS grid: for each cell, the network produces in parallel a class distribution (for the object whose center falls in the cell) and a class-agnostic segmentation mask. Both predictions are conditioned on the position of the object (normalized pixel coordinates are concatenated to the input) and are produced at different scales, since the network uses a backbone and an FPN to extract multi-scale features. The final instances are selected with NMS.
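The two ideas in the paragraph above (assigning an object to a grid cell and appending normalized coordinates) can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `grid_cell` and `coord_channels` are hypothetical, the coordinate range [-1, 1] is an assumed normalization, and neither function is taken from the AdelaiDet code.

```python
def grid_cell(center_x, center_y, img_w, img_h, S):
    """Return (row, col) of the S x S grid cell that contains an object's center."""
    # Scale the center to grid units and clamp to the last cell on the border.
    col = min(int(center_x / img_w * S), S - 1)
    row = min(int(center_y / img_h * S), S - 1)
    return row, col


def coord_channels(h, w):
    """Build two h x w channels holding x and y coordinates normalized to [-1, 1]."""
    xs = [(-1.0 + 2.0 * j / (w - 1)) if w > 1 else 0.0 for j in range(w)]
    ys = [(-1.0 + 2.0 * i / (h - 1)) if h > 1 else 0.0 for i in range(h)]
    x_chan = [xs[:] for _ in range(h)]          # every row repeats the x ramp
    y_chan = [[ys[i]] * w for i in range(h)]    # every column repeats the y ramp
    return x_chan, y_chan


# An object centered in an 800x600 image lands in the middle cell of a 5x5 grid.
row, col = grid_cell(400, 300, img_w=800, img_h=600, S=5)
x_chan, y_chan = coord_channels(3, 4)
```

In a real network the two coordinate channels would be concatenated to the feature tensor along the channel dimension before the prediction heads.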

The second version of this architecture improves the efficiency of the NMS technique (Matrix NMS) and the time and space efficiency of the network itself. Since objects are sparsely distributed over the image, it is highly inefficient to produce a mask for every location. For this reason, locations are pre-filtered based on their features, and their masks are then obtained by convolving a feature map that condenses information from all the FPN scales with a learned kernel (also conditioned on the location), which avoids keeping a separate final convolution layer for each scale.
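A minimal pure-Python sketch of the dynamic convolution described above follows. It assumes a 1x1 kernel, so each mask logit is a dot product between the predicted kernel and the feature vector at that pixel. The names `predict_masks` and `score_thr`, and the threshold value, are hypothetical and chosen for illustration; this is not the AdelaiDet implementation.

```python
import math


def dynamic_mask(features, kernel):
    """Apply a 1x1 dynamic kernel to a C x H x W feature map, then a sigmoid."""
    C, H, W = len(features), len(features[0]), len(features[0][0])
    mask = []
    for y in range(H):
        row = []
        for x in range(W):
            logit = sum(kernel[c] * features[c][y][x] for c in range(C))
            row.append(1.0 / (1.0 + math.exp(-logit)))
        mask.append(row)
    return mask


def predict_masks(features, kernels, scores, score_thr=0.1):
    """Produce masks only for grid locations whose class score passes the filter."""
    kept = [(k, s) for k, s in zip(kernels, scores) if s > score_thr]
    return [(dynamic_mask(features, k), s) for k, s in kept]


# Toy example: 2 channels, 1x2 spatial map, 3 candidate locations.
features = [[[1.0, 0.0]], [[0.0, 1.0]]]
kernels = [[2.0, -2.0], [0.5, 0.5], [-2.0, 2.0]]
scores = [0.9, 0.05, 0.4]
results = predict_masks(features, kernels, scores)
```

Because the low-scoring location is dropped before any convolution is computed, the cost scales with the number of likely objects rather than with the number of grid cells.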

SOLOv2 is implemented in AdelaiDet, a toolbox built on top of [Detectron 2](https://github.com/facebookresearch/detectron2), and we rely on these libraries to perform our experiments.

In [None]:
%cd /content/
# Some basic setup:
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import numpy as np
import os, json, cv2, random
from google.colab.patches import cv2_imshow

# import some common detectron2 utilities
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog

from detectron2.modeling import build_model
%cd /content/TACO-expl/AdelaiDet/adet/modeling/solov2/
from solov2 import SOLOv2

/content
/content/TACO-expl/AdelaiDet/adet/modeling/solov2


We did not train the model from scratch; instead, we fine-tuned the weights already provided within the repository.

The chosen backbone for the model was ResNet50.

In [None]:
import sys

cfg_solo_base_path = '/content/TACO-expl/AdelaiDet/configs/SOLOv2/Base-SOLOv2.yaml'
cfg_solo_r50_path = '/content/TACO-expl/AdelaiDet/configs/SOLOv2/R50_3x.yaml'

The training is driven by a configuration file in YAML format, which has to be modified and loaded beforehand.

In [None]:
sys.path.append('/content/TACO-expl/AdelaiDet/adet/config')
from defaults import _C

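def setup_cfg(cfg_base, cfg_backbone):
  """
  Loads the configuration by merging two YAML files into the AdelaiDet defaults.
  :param cfg_base: path of the base config file
  :param cfg_backbone: path of the backbone-specific config file (overrides the base)
  :return: the merged config node
  """
  # clone the defaults so that the module-level _C is not modified in place
  cfg = _C.clone()
  cfg.merge_from_file(cfg_base)
  cfg.merge_from_file(cfg_backbone)
  return cfg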

cfg_solov2 = setup_cfg(cfg_solo_base_path, cfg_solo_r50_path)
print(cfg_solov2)

CUDNN_BENCHMARK: False
DATALOADER:
  ASPECT_RATIO_GROUPING: True
  FILTER_EMPTY_ANNOTATIONS: True
  NUM_WORKERS: 4
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
DATASETS:
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: ()
  PROPOSAL_FILES_TRAIN: ()
  TEST: ('coco_2017_val',)
  TRAIN: ('coco_2017_train',)
GLOBAL:
  HACK: 1.0
INPUT:
  CROP:
    CROP_INSTANCE: True
    ENABLED: False
    SIZE: [0.9, 0.9]
    TYPE: relative_range
  FORMAT: BGR
  HFLIP_TRAIN: True
  IS_ROTATE: False
  MASK_FORMAT: bitmask
  MAX_SIZE_TEST: 1333
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MIN_SIZE_TRAIN: (640, 672, 704, 736, 768, 800)
  MIN_SIZE_TRAIN_SAMPLING: choice
  RANDOM_FLIP: horizontal
MODEL:
  ANCHOR_GENERATOR:
    ANGLES: [[-90, 0, 90]]
    ASPECT_RATIOS: [[0.5, 1.0, 2.0]]
    NAME: DefaultAnchorGenerator
    OFFSET: 0.0
    SIZES: [[32, 64, 128, 256, 512]]
  BACKBONE:
    ANTI_ALIAS: False
    FREEZE_AT: 2
    NAME: build_resnet_fpn

The following cell reports the number of trainable parameters of the SOLOv2 model.

In [None]:
model = build_model(cfg_solov2)
num_trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(num_trainable_params)

46317392


In [None]:
# sets the random seed
DEFAULT_RANDOM_SEED = 42
# basic random seed
def seedBasic(seed=DEFAULT_RANDOM_SEED):
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
# torch random seed
def seedTorch(seed=DEFAULT_RANDOM_SEED):
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
# combine
def seedEverything(seed=DEFAULT_RANDOM_SEED):
    seedBasic(seed)
    seedTorch(seed)

seedEverything()

In [None]:
#!mkdir /content/official/
!cp -r /content/MyDrive/MyDrive/official/ /content/
!mkdir /content/output/
!cp -r /content/MyDrive/MyDrive/solo_models/chkpt/ /content/output/

We also load and register the dataset, generated in "MFDETR Training Notebook.ipynb" and modified in "validation resizing.ipynb", as a COCO dataset instance.

Unlike the one employed in the Mask-Frozen DETR training, the images in the validation and test sets are already rotated and resized to a maximum size of 800x1333 because of resource constraints. The images in the training set are also already rotated. This was done to simplify the Detectron2 training pipeline, which is already quite convoluted.
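To make the 800x1333 limit concrete, the sketch below computes an output size with a shortest-edge rule in the style of Detectron2's `ResizeShortestEdge`. This is an assumption for illustration only: the actual procedure is the one in "validation resizing.ipynb" and may differ (for example, it may never upscale small images). The function name `shortest_edge_size` is hypothetical.

```python
def shortest_edge_size(h, w, short_edge=800, max_size=1333):
    """Scale so the shorter side equals short_edge, capping the longer side at max_size."""
    scale = short_edge / min(h, w)
    if h < w:
        new_h, new_w = short_edge, scale * w
    else:
        new_h, new_w = scale * h, short_edge
    # If the longer side is still too big, shrink both sides by the same factor.
    if max(new_h, new_w) > max_size:
        shrink = max_size / max(new_h, new_w)
        new_h, new_w = new_h * shrink, new_w * shrink
    return int(new_h + 0.5), int(new_w + 0.5)


print(shortest_edge_size(3000, 4000))  # (800, 1067)
print(shortest_edge_size(1000, 3000))  # (444, 1333)
```

Both sides are scaled by the same factor, so the aspect ratio is preserved up to rounding.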

In [None]:
from detectron2.data.datasets import register_coco_instances
from detectron2.data import MetadataCatalog

with open("/content/TACO-expl/data/annotations_off_0_train.json", "r") as f:
    dataset = json.loads(f.read())
classes = [elem["name"] for elem in dataset["categories"]]

train_annotation_file = '/content/TACO-expl/data/annotations_off_0_train.json'
val_annotation_file = '/content/TACO-expl/data/annotations_off_0_val.json'

img_dir = '/content/official/'


register_coco_instances("TACO_train", {}, train_annotation_file, img_dir)
MetadataCatalog.get("TACO_train").set(thing_classes = classes)
dataset_dicts_train = DatasetCatalog.get("TACO_train")

register_coco_instances("TACO_val", {}, val_annotation_file, img_dir)
MetadataCatalog.get("TACO_val").set(thing_classes = classes)
dataset_dicts_val = DatasetCatalog.get("TACO_val")

[05/10 06:07:46 d2.data.datasets.coco]: Loaded 1200 images in COCO format from /content/TACO-expl/data/annotations_off_0_train.json
[05/10 06:07:46 d2.data.datasets.coco]: Loaded 150 images in COCO format from /content/TACO-expl/data/annotations_off_0_val.json
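For readers unfamiliar with the layout of the annotation files loaded above, here is a tiny COCO-style dictionary and the same class-name extraction used in the cell. The file name, ids, and geometry are made up for illustration; only the top-level keys (`images`, `annotations`, `categories`) and the category names "Bottle" and "Bottle cap" reflect what appears in this notebook.

```python
import json

# Hypothetical miniature annotation file in COCO format.
coco_like = {
    "images": [
        {"id": 1, "file_name": "example_0001.jpg", "width": 1333, "height": 800},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 0,
            "bbox": [100.0, 150.0, 50.0, 80.0],  # [x, y, width, height]
            "area": 4000.0,
            "iscrowd": 0,
            # polygon given as a flat list of x, y pairs
            "segmentation": [[100.0, 150.0, 150.0, 150.0, 150.0, 230.0, 100.0, 230.0]],
        },
    ],
    "categories": [
        {"id": 0, "name": "Bottle"},
        {"id": 1, "name": "Bottle cap"},
    ],
}

# Round-trip through JSON to mimic reading the file from disk.
dataset = json.loads(json.dumps(coco_like))
classes = [elem["name"] for elem in dataset["categories"]]

# Count annotated instances per class name.
id_to_name = {c["id"]: c["name"] for c in dataset["categories"]}
counts = {name: 0 for name in classes}
for ann in dataset["annotations"]:
    counts[id_to_name[ann["category_id"]]] += 1
```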




Here we check some of the configuration values employed for training. We used the training recipe provided in the repository as the starting point for the one we ultimately employed, which consists of:

- Image normalization.
- Data augmentation: random horizontal flip, random resizing with maximum size 1333x800, and random crop.
- AdamW optimizer with 1e-4 weight decay and gradient clipping to 1.
- Learning rate of 1e-3, scheduled with a warm-up cosine scheduler; the minimum learning rate was 1e-4.
- Batch size of 8.
- Maximum number of epochs set to 12.
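The learning-rate schedule in the recipe can be illustrated with a generic warm-up plus cosine decay function. This is a sketch under assumptions, not the scheduler used for training: `warmup_iters`, `total_iters`, and `warmup_factor` are hypothetical values picked for the example, and only the peak (1e-3) and floor (1e-4) learning rates come from the recipe.

```python
import math


def warmup_cosine_lr(it, base_lr=1e-3, min_lr=1e-4,
                     warmup_iters=100, total_iters=1000, warmup_factor=0.001):
    """Linear warm-up to base_lr, then cosine decay down to min_lr."""
    if it < warmup_iters:
        # Interpolate linearly from base_lr * warmup_factor up to base_lr.
        alpha = it / warmup_iters
        return base_lr * (warmup_factor * (1.0 - alpha) + alpha)
    # Fraction of the post-warm-up schedule completed, clamped to [0, 1].
    progress = min((it - warmup_iters) / max(1, total_iters - warmup_iters), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


schedule = [warmup_cosine_lr(i) for i in range(0, 1001, 100)]
```

The rate starts close to zero, reaches 1e-3 at the end of the warm-up, and then decreases smoothly until it settles at 1e-4.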

In [None]:
train_cfg = cfg_solov2.clone()
with open("/content/TACO-expl/taco_train_solov2.yaml", "w") as f:
  f.write(train_cfg.dump())

In [None]:
train_cfg_loaded = get_cfg()
train_cfg_loaded.set_new_allowed(True)
train_cfg_loaded.merge_from_file("/content/TACO-expl/solov2_config/taco_train_solov2.yaml")
print(train_cfg_loaded.SOLVER.IMS_PER_BATCH)
print(train_cfg_loaded.DATALOADER.NUM_WORKERS)
print(train_cfg_loaded.OUTPUT_DIR)
print(train_cfg_loaded.MODEL.WEIGHTS)

8
4
/content/output/chkpt/
/content/output/chkpt/model_0001949.pth


By running the following cells, the model can be fine-tuned. We ran only a few experiments on this model, since the initial results were quite poor even when compared with those in the TACO paper.

We tried to change the recipe by modifying the learning rate and the number of frozen backbone stages, but the results always stayed around 12-13% mAP. We therefore abandoned the idea of adopting this model: although it is indisputably faster than MaskDINO and Mask-Frozen DETR, the loss in accuracy is too high.

In [None]:
%cd /content/TACO-expl/AdelaiDet/tools/
from train_net import Trainer
%cd /content/

In [None]:
trainer = Trainer(train_cfg_loaded)
trainer.build_hooks()

[05/10 06:29:47 d2.engine.defaults]: Model:
SOLOv2(
  (backbone): FPN(
    (fpn_lateral2): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral3): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral4): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral5): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output5): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (top_block): LastLevelMaxPool()
    (bottom_up): ResNet(
      (stem): BasicStem(
        (conv1): Conv2d(
          3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
        )
      )
      (res2): Seque



[<detectron2.engine.hooks.IterationTimer at 0x7b40e0ab1d80>,
 <detectron2.engine.hooks.LRScheduler at 0x7b40e0a8a050>,
 None,
 <detectron2.engine.hooks.PeriodicCheckpointer at 0x7b40e0a88cd0>,
 <detectron2.engine.hooks.EvalHook at 0x7b40e0a89ea0>,
 <detectron2.engine.hooks.PeriodicWriter at 0x7b40e0a88e20>]

In [None]:
trainer.resume_or_load(resume = True)  # load last checkpoint or MODEL.WEIGHTS
trainer.train()

[05/10 07:19:26 d2.checkpoint.c2_model_loading]: Following weights matched with model:
| Names in Model                              | Names in Checkpoint                                                                                  | Shapes                                          |
|:--------------------------------------------|:-----------------------------------------------------------------------------------------------------|:------------------------------------------------|
| backbone.bottom_up.res2.0.conv1.*           | backbone.bottom_up.res2.0.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight}    | (64,) (64,) (64,) (64,) (64,64,1,1)             |
| backbone.bottom_up.res2.0.conv2.*           | backbone.bottom_up.res2.0.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight}    | (64,) (64,) (64,) (64,) (64,64,3,3)             |
| backbone.bottom_up.res2.0.conv3.*           | backbone.bottom_up.res2.0.conv3.{norm.bias,norm.running_mean,

  self.pid = os.fork()


[05/10 07:20:08 d2.utils.events]:  eta: 3:27:05  iter: 1959  total_loss: 1.25  loss_ins: 0.9572  loss_cate: 0.2914  time: 3.7122  data_time: 2.2751  lr: 0.00076578  max_mem: 8286M
[05/10 07:21:13 d2.utils.events]:  eta: 2:58:10  iter: 1979  total_loss: 1.426  loss_ins: 1.122  loss_cate: 0.2944  time: 3.3850  data_time: 1.6352  lr: 0.00076145  max_mem: 8419M
[05/10 07:22:24 d2.data.datasets.coco]: Loaded 150 images in COCO format from /content/TACO-expl/data/annotations_off_0_val.json
[05/10 07:22:24 d2.data.build]: Distribution of instances among all 10 categories:
|  category  | #instances   |   category    | #instances   |  category  | #instances   |
|:----------:|:-------------|:-------------:|:-------------|:----------:|:-------------|
|   Bottle   | 50           |  Bottle cap   | 30           |    Can     | 21           |
| Cigarette  | 42           |      Cup      | 21           |    Lid     | 8            |
|   Other    | 135          | Plastic bag.. | 77           |  Pop tab   


[05/10 07:22:58 d2.evaluation.evaluator]: Inference done 11/150. Dataloading: 0.0019 s/iter. Inference: 0.2404 s/iter. Eval: 4.2811 s/iter. Total: 4.5234 s/iter. ETA=0:10:28
[05/10 07:23:04 d2.evaluation.evaluator]: Inference done 12/150. Dataloading: 0.0021 s/iter. Inference: 0.2417 s/iter. Eval: 4.4006 s/iter. Total: 4.6453 s/iter. ETA=0:10:41
[05/10 07:23:15 d2.evaluation.evaluator]: Inference done 15/150. Dataloading: 0.0036 s/iter. Inference: 0.2469 s/iter. Eval: 4.1582 s/iter. Total: 4.4099 s/iter. ETA=0:09:55
[05/10 07:23:23 d2.evaluation.evaluator]: Inference done 18/150. Dataloading: 0.0033 s/iter. Inference: 0.2304 s/iter. Eval: 3.7105 s/iter. Total: 3.9452 s/iter. ETA=0:08:40
[05/10 07:23:33 d2.evaluation.evaluator]: Inference done 19/150. Dataloading: 0.0033 s/iter. Inference: 0.2293 s/iter. Eval: 4.1802 s/iter. Total: 4.4139 s/iter. ETA=0:09:38
[05/10 07:23:48 d2.evaluation.evaluator]: Inference done 21/150. Dataloading: 0.0033 s/iter. Inference: 0.2462 s/iter. Eval: 4.558


[05/10 07:33:46 d2.evaluation.evaluator]: Total inference time: 0:11:15.079947 (4.655724 s / iter per device, on 1 devices)
[05/10 07:33:46 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:30 (0.207239 s / iter per device, on 1 devices)
[05/10 07:33:47 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[05/10 07:33:47 d2.evaluation.coco_evaluation]: Saving results to /content/output/chkpt/inference/coco_instances_results.json
[05/10 07:33:47 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
[05/10 07:33:47 d2.evaluation.fast_eval_api]: Evaluate annotation type *bbox*
[05/10 07:33:47 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 0.10 seconds.
[05/10 07:33:47 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[05/10 07:33:47 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 0.04 seconds.
 


[05/10 08:01:56 d2.evaluation.evaluator]: Inference done 11/150. Dataloading: 0.0028 s/iter. Inference: 0.2040 s/iter. Eval: 4.3886 s/iter. Total: 4.5955 s/iter. ETA=0:10:38
[05/10 08:02:01 d2.evaluation.evaluator]: Inference done 13/150. Dataloading: 0.0027 s/iter. Inference: 0.2028 s/iter. Eval: 3.8762 s/iter. Total: 4.0821 s/iter. ETA=0:09:19
[05/10 08:02:16 d2.evaluation.evaluator]: Inference done 15/150. Dataloading: 0.0038 s/iter. Inference: 0.2079 s/iter. Eval: 4.5357 s/iter. Total: 4.7480 s/iter. ETA=0:10:40
[05/10 08:02:22 d2.evaluation.evaluator]: Inference done 18/150. Dataloading: 0.0046 s/iter. Inference: 0.2041 s/iter. Eval: 3.8998 s/iter. Total: 4.1092 s/iter. ETA=0:09:02
[05/10 08:02:27 d2.evaluation.evaluator]: Inference done 20/150. Dataloading: 0.0044 s/iter. Inference: 0.1992 s/iter. Eval: 3.7039 s/iter. Total: 3.9081 s/iter. ETA=0:08:28
[05/10 08:02:33 d2.evaluation.evaluator]: Inference done 21/150. Dataloading: 0.0044 s/iter. Inference: 0.1964 s/iter. Eval: 3.837


[05/10 08:37:21 d2.evaluation.evaluator]: Inference done 11/150. Dataloading: 0.0023 s/iter. Inference: 0.2129 s/iter. Eval: 4.2606 s/iter. Total: 4.4759 s/iter. ETA=0:10:22
[05/10 08:37:26 d2.evaluation.evaluator]: Inference done 12/150. Dataloading: 0.0025 s/iter. Inference: 0.2134 s/iter. Eval: 4.3381 s/iter. Total: 4.5550 s/iter. ETA=0:10:28
[05/10 08:37:40 d2.evaluation.evaluator]: Inference done 15/150. Dataloading: 0.0042 s/iter. Inference: 0.2162 s/iter. Eval: 4.3072 s/iter. Total: 4.5287 s/iter. ETA=0:10:11
[05/10 08:37:49 d2.evaluation.evaluator]: Inference done 19/150. Dataloading: 0.0040 s/iter. Inference: 0.2142 s/iter. Eval: 3.6855 s/iter. Total: 3.9048 s/iter. ETA=0:08:31
[05/10 08:37:59 d2.evaluation.evaluator]: Inference done 21/150. Dataloading: 0.0038 s/iter. Inference: 0.2106 s/iter. Eval: 3.7991 s/iter. Total: 4.0145 s/iter. ETA=0:08:37
[05/10 08:38:07 d2.evaluation.evaluator]: Inference done 22/150. Dataloading: 0.0039 s/iter. Inference: 0.2118 s/iter. Eval: 4.060


[05/10 09:14:20 d2.evaluation.evaluator]: Inference done 11/150. Dataloading: 0.0024 s/iter. Inference: 0.1846 s/iter. Eval: 4.1170 s/iter. Total: 4.3040 s/iter. ETA=0:09:58
[05/10 09:14:25 d2.evaluation.evaluator]: Inference done 13/150. Dataloading: 0.0024 s/iter. Inference: 0.1871 s/iter. Eval: 3.7135 s/iter. Total: 3.9034 s/iter. ETA=0:08:54
[05/10 09:14:34 d2.evaluation.evaluator]: Inference done 15/150. Dataloading: 0.0024 s/iter. Inference: 0.1938 s/iter. Eval: 3.7927 s/iter. Total: 3.9894 s/iter. ETA=0:08:58
[05/10 09:14:40 d2.evaluation.evaluator]: Inference done 18/150. Dataloading: 0.0031 s/iter. Inference: 0.1982 s/iter. Eval: 3.2827 s/iter. Total: 3.4848 s/iter. ETA=0:07:39
[05/10 09:14:46 d2.evaluation.evaluator]: Inference done 19/150. Dataloading: 0.0032 s/iter. Inference: 0.1999 s/iter. Eval: 3.4561 s/iter. Total: 3.6604 s/iter. ETA=0:07:59
[05/10 09:14:54 d2.evaluation.evaluator]: Inference done 21/150. Dataloading: 0.0031 s/iter. Inference: 0.1963 s/iter. Eval: 3.543


[05/10 09:22:47 d2.evaluation.evaluator]: Total inference time: 0:08:52.923383 (3.675334 s / iter per device, on 1 devices)
[05/10 09:22:47 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:28 (0.196062 s / iter per device, on 1 devices)
[05/10 09:22:47 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[05/10 09:22:47 d2.evaluation.coco_evaluation]: Saving results to /content/output/chkpt/inference/coco_instances_results.json
[05/10 09:22:48 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.02s)
creating index...
index created!
[05/10 09:22:48 d2.evaluation.fast_eval_api]: Evaluate annotation type *bbox*
[05/10 09:22:48 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 0.07 seconds.
[05/10 09:22:48 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[05/10 09:22:48 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 0.04 seconds.
 


[05/10 09:50:53 d2.evaluation.evaluator]: Inference done 11/150. Dataloading: 0.0046 s/iter. Inference: 0.1952 s/iter. Eval: 3.4714 s/iter. Total: 3.6712 s/iter. ETA=0:08:30
[05/10 09:51:03 d2.evaluation.evaluator]: Inference done 15/150. Dataloading: 0.0084 s/iter. Inference: 0.1992 s/iter. Eval: 3.0137 s/iter. Total: 3.2216 s/iter. ETA=0:07:14
[05/10 09:51:15 d2.evaluation.evaluator]: Inference done 19/150. Dataloading: 0.0082 s/iter. Inference: 0.2074 s/iter. Eval: 2.9169 s/iter. Total: 3.1329 s/iter. ETA=0:06:50
[05/10 09:51:23 d2.evaluation.evaluator]: Inference done 21/150. Dataloading: 0.0078 s/iter. Inference: 0.2057 s/iter. Eval: 3.0314 s/iter. Total: 3.2456 s/iter. ETA=0:06:58
[05/10 09:51:28 d2.evaluation.evaluator]: Inference done 22/150. Dataloading: 0.0075 s/iter. Inference: 0.2060 s/iter. Eval: 3.1352 s/iter. Total: 3.3495 s/iter. ETA=0:07:08
[05/10 09:51:37 d2.evaluation.evaluator]: Inference done 24/150. Dataloading: 0.0073 s/iter. Inference: 0.2079 s/iter. Eval: 3.240


[05/10 09:57:31 d2.evaluation.evaluator]: Total inference time: 0:06:59.909442 (2.895927 s / iter per device, on 1 devices)
[05/10 09:57:31 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:29 (0.203233 s / iter per device, on 1 devices)
[05/10 09:57:31 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[05/10 09:57:31 d2.evaluation.coco_evaluation]: Saving results to /content/output/chkpt/inference/coco_instances_results.json
[05/10 09:57:32 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
[05/10 09:57:32 d2.evaluation.fast_eval_api]: Evaluate annotation type *bbox*
[05/10 09:57:32 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 0.11 seconds.
[05/10 09:57:32 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[05/10 09:57:32 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 0.06 seconds.
 


[05/10 10:25:34 d2.evaluation.evaluator]: Inference done 11/150. Dataloading: 0.0058 s/iter. Inference: 0.2458 s/iter. Eval: 3.0899 s/iter. Total: 3.3414 s/iter. ETA=0:07:44
[05/10 10:25:46 d2.evaluation.evaluator]: Inference done 15/150. Dataloading: 0.0055 s/iter. Inference: 0.2135 s/iter. Eval: 3.0403 s/iter. Total: 3.2597 s/iter. ETA=0:07:20
[05/10 10:25:52 d2.evaluation.evaluator]: Inference done 18/150. Dataloading: 0.0073 s/iter. Inference: 0.2270 s/iter. Eval: 2.6651 s/iter. Total: 2.9000 s/iter. ETA=0:06:22
[05/10 10:25:59 d2.evaluation.evaluator]: Inference done 20/150. Dataloading: 0.0071 s/iter. Inference: 0.2320 s/iter. Eval: 2.7407 s/iter. Total: 2.9805 s/iter. ETA=0:06:27
[05/10 10:26:07 d2.evaluation.evaluator]: Inference done 22/150. Dataloading: 0.0069 s/iter. Inference: 0.2285 s/iter. Eval: 2.8779 s/iter. Total: 3.1139 s/iter. ETA=0:06:38
[05/10 10:26:16 d2.evaluation.evaluator]: Inference done 24/150. Dataloading: 0.0065 s/iter. Inference: 0.2276 s/iter. Eval: 3.035


[05/10 11:00:05 d2.evaluation.evaluator]: Inference done 11/150. Dataloading: 0.0026 s/iter. Inference: 0.2203 s/iter. Eval: 3.3308 s/iter. Total: 3.5538 s/iter. ETA=0:08:13
[05/10 11:00:11 d2.evaluation.evaluator]: Inference done 14/150. Dataloading: 0.0037 s/iter. Inference: 0.2196 s/iter. Eval: 2.7652 s/iter. Total: 2.9890 s/iter. ETA=0:06:46
[05/10 11:00:18 d2.evaluation.evaluator]: Inference done 15/150. Dataloading: 0.0036 s/iter. Inference: 0.2317 s/iter. Eval: 3.2287 s/iter. Total: 3.4645 s/iter. ETA=0:07:47
[05/10 11:00:24 d2.evaluation.evaluator]: Inference done 18/150. Dataloading: 0.0042 s/iter. Inference: 0.2350 s/iter. Eval: 2.8718 s/iter. Total: 3.1119 s/iter. ETA=0:06:50
[05/10 11:00:30 d2.evaluation.evaluator]: Inference done 19/150. Dataloading: 0.0041 s/iter. Inference: 0.2338 s/iter. Eval: 3.0646 s/iter. Total: 3.3038 s/iter. ETA=0:07:12
[05/10 11:00:38 d2.evaluation.evaluator]: Inference done 21/150. Dataloading: 0.0040 s/iter. Inference: 0.2265 s/iter. Eval: 3.139


[05/10 11:07:07 d2.evaluation.evaluator]: Total inference time: 0:07:23.646504 (3.059631 s / iter per device, on 1 devices)
[05/10 11:07:07 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:30 (0.212418 s / iter per device, on 1 devices)
[05/10 11:07:08 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[05/10 11:07:08 d2.evaluation.coco_evaluation]: Saving results to /content/output/chkpt/inference/coco_instances_results.json
[05/10 11:07:08 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
[05/10 11:07:08 d2.evaluation.fast_eval_api]: Evaluate annotation type *bbox*
[05/10 11:07:08 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 0.07 seconds.
[05/10 11:07:08 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[05/10 11:07:08 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 0.04 seconds.
 


[05/10 11:35:05 d2.evaluation.evaluator]: Inference done 11/150. Dataloading: 0.0043 s/iter. Inference: 0.2676 s/iter. Eval: 3.9940 s/iter. Total: 4.2659 s/iter. ETA=0:09:52
[05/10 11:35:11 d2.evaluation.evaluator]: Inference done 14/150. Dataloading: 0.0043 s/iter. Inference: 0.2349 s/iter. Eval: 3.1709 s/iter. Total: 3.4105 s/iter. ETA=0:07:43
[05/10 11:35:16 d2.evaluation.evaluator]: Inference done 15/150. Dataloading: 0.0050 s/iter. Inference: 0.2322 s/iter. Eval: 3.3601 s/iter. Total: 3.5978 s/iter. ETA=0:08:05
[05/10 11:35:21 d2.evaluation.evaluator]: Inference done 18/150. Dataloading: 0.0062 s/iter. Inference: 0.2368 s/iter. Eval: 2.9312 s/iter. Total: 3.1749 s/iter. ETA=0:06:59
[05/10 11:35:26 d2.evaluation.evaluator]: Inference done 19/150. Dataloading: 0.0061 s/iter. Inference: 0.2349 s/iter. Eval: 3.0712 s/iter. Total: 3.3132 s/iter. ETA=0:07:14
[05/10 11:35:34 d2.evaluation.evaluator]: Inference done 21/150. Dataloading: 0.0057 s/iter. Inference: 0.2353 s/iter. Eval: 3.146


[05/10 11:55:56 d2.evaluation.evaluator]: Inference done 11/150. Dataloading: 0.0076 s/iter. Inference: 0.2225 s/iter. Eval: 3.1479 s/iter. Total: 3.3780 s/iter. ETA=0:07:49
[05/10 11:56:06 d2.evaluation.evaluator]: Inference done 15/150. Dataloading: 0.0074 s/iter. Inference: 0.2088 s/iter. Eval: 2.7651 s/iter. Total: 2.9817 s/iter. ETA=0:06:42
[05/10 11:56:14 d2.evaluation.evaluator]: Inference done 19/150. Dataloading: 0.0077 s/iter. Inference: 0.2241 s/iter. Eval: 2.5038 s/iter. Total: 2.7363 s/iter. ETA=0:05:58
[05/10 11:56:20 d2.evaluation.evaluator]: Inference done 21/150. Dataloading: 0.0079 s/iter. Inference: 0.2178 s/iter. Eval: 2.5584 s/iter. Total: 2.7850 s/iter. ETA=0:05:59
[05/10 11:56:27 d2.evaluation.evaluator]: Inference done 23/150. Dataloading: 0.0077 s/iter. Inference: 0.2163 s/iter. Eval: 2.6286 s/iter. Total: 2.8535 s/iter. ETA=0:06:02
[05/10 11:56:37 d2.evaluation.evaluator]: Inference done 25/150. Dataloading: 0.0074 s/iter. Inference: 0.2198 s/iter. Eval: 2.816


[05/10 12:01:51 d2.evaluation.evaluator]: Total inference time: 0:06:14.999883 (2.586206 s / iter per device, on 1 devices)
[05/10 12:01:51 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:31 (0.215733 s / iter per device, on 1 devices)
[05/10 12:01:51 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[05/10 12:01:51 d2.evaluation.coco_evaluation]: Saving results to /content/output/chkpt/inference/coco_instances_results.json
[05/10 12:01:51 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
[05/10 12:01:51 d2.evaluation.fast_eval_api]: Evaluate annotation type *bbox*
[05/10 12:01:51 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 0.10 seconds.
[05/10 12:01:51 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[05/10 12:01:52 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 0.06 seconds.
 

OrderedDict([('bbox',
              {'AP': 0.0,
               'AP50': 0.0,
               'AP75': 0.0,
               'APs': 0.0,
               'APm': 0.0,
               'APl': 0.0,
               'AP-Bottle': 0.0,
               'AP-Bottle cap': 0.0,
               'AP-Can': 0.0,
               'AP-Cigarette': 0.0,
               'AP-Cup': 0.0,
               'AP-Lid': 0.0,
               'AP-Other': 0.0,
               'AP-Plastic bag & wrapper': 0.0,
               'AP-Pop tab': 0.0,
               'AP-Straw': 0.0}),
             ('segm',
              {'AP': 12.38762151415335,
               'AP50': 21.96755226726416,
               'AP75': 10.452759541690543,
               'APs': 0.00825082508250825,
               'APm': 4.278477916381564,
               'APl': 16.71542642024696,
               'AP-Bottle': 27.679730827190497,
               'AP-Bottle cap': 24.59293810272514,
               'AP-Can': 8.017971522106816,
               'AP-Cigarette': 1.595264536529435,
      