These are the steps to train Faster R-CNN on the given garments-object data. First, upload `simple_clothes.tar` to `/content`. The archive contains the images, the labels generated by `convert_to_coco_format.py`, the extra config files, and `paths_catalog.py`. Then run the following cells in order.
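
For reference, a COCO-style label file is a JSON object with `images`, `annotations`, and `categories` lists, where each annotation carries a `bbox` in `[x, y, width, height]` form. The sketch below is a minimal sanity check one might run on the label files before training. The helper names `check_coco_labels` and `check_coco_file` and the toy record are illustrative assumptions, not part of `convert_to_coco_format.py`.

```python
import json


def check_coco_labels(coco):
    """Return a list of problems found in a COCO-style dict (empty if none)."""
    problems = []
    for key in ("images", "annotations", "categories"):
        if key not in coco:
            problems.append("missing top-level key: " + key)
    if problems:
        return problems

    image_ids = {img["id"] for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}
    for ann in coco["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append("annotation {} has unknown image_id".format(ann["id"]))
        if ann["category_id"] not in category_ids:
            problems.append("annotation {} has unknown category_id".format(ann["id"]))
        # COCO boxes are [x, y, width, height]; width and height must be positive
        _, _, width, height = ann["bbox"]
        if width <= 0 or height <= 0:
            problems.append("annotation {} has an empty box".format(ann["id"]))
    return problems


def check_coco_file(path):
    """Load a label file from disk and run the same checks."""
    with open(path) as f:
        return check_coco_labels(json.load(f))


# Toy record (made up for illustration, not taken from the real dataset)
toy = {
    "images": [{"id": 1, "file_name": "1.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "garment"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [10, 20, 100, 200], "area": 20000, "iscrowd": 0}
    ],
}
print(check_coco_labels(toy))
```

Once the archive has been extracted, the same check could be applied to a real file with `check_coco_file("datasets/simple_clothes/labels/train.json")`.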

In [0]:
!git clone https://github.com/facebookresearch/maskrcnn-benchmark.git
!pip install ninja yacs cython matplotlib

Cloning into 'maskrcnn-benchmark'...
remote: Enumerating objects: 842, done.
remote: Total 842 (delta 0), reused 0 (delta 0), pack-reused 842
Receiving objects: 100% (842/842), 3.87 MiB | 2.67 MiB/s, done.
Resolving deltas: 100% (461/461), done.
Collecting ninja
  Downloading https://files.pythonhosted.org/packages/20/6c/8bf281550ca984d673f76a6f59e5f2566cf21d5a8c7b2936e2cbc75488d3/ninja-1.8.2.post2-cp36-cp36m-manylinux1_x86_64.whl (98kB)
    100% |████████████████████████████████| 102kB 2.6MB/s 
Collecting yacs
  Downloading https://files.pythonhosted.org/packages/54/d4/7b12a88a06adef912f95d1e08edbdb70ee63e2160244bc311b6e51ef3842/yacs-0.1.5-py3-none-any.whl
Installing collected packages: ninja, yacs
Successfully installed ninja-1.8.2.post2 yacs-0.1.5


In [0]:
cd maskrcnn-benchmark

/content/maskrcnn-benchmark


In [0]:
!python setup.py build develop
!mkdir -p datasets

running build
running build_py
creating build
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/maskrcnn_benchmark
copying maskrcnn_benchmark/__init__.py -> build/lib.linux-x86_64-3.6/maskrcnn_benchmark
creating build/lib.linux-x86_64-3.6/maskrcnn_benchmark/structures
copying maskrcnn_benchmark/structures/bounding_box.py -> build/lib.linux-x86_64-3.6/maskrcnn_benchmark/structures
copying maskrcnn_benchmark/structures/__init__.py -> build/lib.linux-x86_64-3.6/maskrcnn_benchmark/structures
copying maskrcnn_benchmark/structures/segmentation_mask.py -> build/lib.linux-x86_64-3.6/maskrcnn_benchmark/structures
copying maskrcnn_benchmark/structures/boxlist_ops.py -> build/lib.linux-x86_64-3.6/maskrcnn_benchmark/structures
copying maskrcnn_benchmark/structures/image_list.py -> build/lib.linux-x86_64-3.6/maskrcnn_benchmark/structures
creating build/lib.linux-x86_64-3.6/maskrcnn_benchmark/utils
copying maskrcnn_benchmark/utils/collect_env.py -> build/lib.linux-x86_64-3.6/ma

In [0]:
cd datasets

/content/maskrcnn-benchmark/datasets


In [0]:
!mv /content/simple_clothes.tar .
!tar xvf simple_clothes.tar

./simple_clothes/
./simple_clothes/__pycache__/
./simple_clothes/e2e_faster-rcnn_colab.yaml
./simple_clothes/e2e_faster-rcnn_inference.yaml
./simple_clothes/images/
./simple_clothes/labels/
./simple_clothes/paths_catalog.py
./simple_clothes/labels/test.json
./simple_clothes/labels/train.json
./simple_clothes/labels/val.json
./simple_clothes/images/1.jpg
./simple_clothes/images/10.jpg
./simple_clothes/images/100.jpg
./simple_clothes/images/101.jpg
./simple_clothes/images/102.jpg
./simple_clothes/images/103.jpg
./simple_clothes/images/104.jpg
./simple_clothes/images/105.jpg
./simple_clothes/images/106.jpg
./simple_clothes/images/107.jpg
./simple_clothes/images/108.jpg
./simple_clothes/images/109.jpg
./simple_clothes/images/11.jpg
./simple_clothes/images/110.jpg
./simple_clothes/images/111.jpg
./simple_clothes/images/112.jpg
./simple_clothes/images/113.jpg
./simple_clothes/images/114.jpg
./simple_clothes/images/115.jpg
./simple_clothes/images/116.jpg
./simple_clothes/images/117.jpg
./simp
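
The `paths_catalog.py` shipped in the archive is what lets the config refer to the dataset by name. Its contents are not shown in this notebook, so the sketch below is only an assumption about its general shape, modeled on the `DatasetCatalog` class in maskrcnn-benchmark: a dictionary maps each dataset name to an image directory and an annotation file relative to `datasets/`. The dataset names `simple_clothes_train`, `simple_clothes_val`, and `simple_clothes_test` are hypothetical.

```python
import os


class DatasetCatalog(object):
    # Hypothetical sketch; the real paths_catalog.py in the archive may differ
    DATA_DIR = "datasets"
    DATASETS = {
        "simple_clothes_train": {
            "img_dir": "simple_clothes/images",
            "ann_file": "simple_clothes/labels/train.json",
        },
        "simple_clothes_val": {
            "img_dir": "simple_clothes/images",
            "ann_file": "simple_clothes/labels/val.json",
        },
        "simple_clothes_test": {
            "img_dir": "simple_clothes/images",
            "ann_file": "simple_clothes/labels/test.json",
        },
    }

    @staticmethod
    def get(name):
        """Resolve a dataset name to a dataset factory name and its arguments."""
        if name not in DatasetCatalog.DATASETS:
            raise RuntimeError("Dataset not available: {}".format(name))
        attrs = DatasetCatalog.DATASETS[name]
        args = dict(
            root=os.path.join(DatasetCatalog.DATA_DIR, attrs["img_dir"]),
            ann_file=os.path.join(DatasetCatalog.DATA_DIR, attrs["ann_file"]),
        )
        return dict(factory="COCODataset", args=args)


entry = DatasetCatalog.get("simple_clothes_train")
print(entry["args"]["ann_file"])
```

Under this assumption, the `DATASETS.TRAIN` and `DATASETS.TEST` entries of the YAML config would list these names, and all three splits would share one image directory, as the archive listing suggests.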

In [0]:
cd /content/maskrcnn-benchmark

/content/maskrcnn-benchmark


In [0]:
# modified from maskrcnn-benchmark/tools/train_net.py

from maskrcnn_benchmark.utils.env import setup_environment  # noqa F401 isort:skip

import os

import torch
from maskrcnn_benchmark.config import cfg
from maskrcnn_benchmark.data import make_data_loader
from maskrcnn_benchmark.solver import make_lr_scheduler
from maskrcnn_benchmark.solver import make_optimizer
from maskrcnn_benchmark.engine.inference import inference
from maskrcnn_benchmark.engine.trainer import do_train
from maskrcnn_benchmark.modeling.detector import build_detection_model
from maskrcnn_benchmark.utils.checkpoint import DetectronCheckpointer
from maskrcnn_benchmark.utils.collect_env import collect_env_info
from maskrcnn_benchmark.utils.comm import synchronize, get_rank
from maskrcnn_benchmark.utils.imports import import_file
from maskrcnn_benchmark.utils.logger import setup_logger
from maskrcnn_benchmark.utils.miscellaneous import mkdir


def train(cfg, local_rank, distributed):
    model = build_detection_model(cfg)
    device = torch.device(cfg.MODEL.DEVICE)
    model.to(device)

    optimizer = make_optimizer(cfg, model)
    scheduler = make_lr_scheduler(cfg, optimizer)

    if distributed:
        model = torch.nn.parallel.DistributedDataParallel(
            model, device_ids=[local_rank], output_device=local_rank,
            # this should be removed if we update BatchNorm stats
            broadcast_buffers=False,
        )

    arguments = {}
    arguments["iteration"] = 0

    output_dir = cfg.OUTPUT_DIR

    save_to_disk = get_rank() == 0
    checkpointer = DetectronCheckpointer(
        cfg, model, optimizer, scheduler, output_dir, save_to_disk
    )
    extra_checkpoint_data = checkpointer.load(cfg.MODEL.WEIGHT)
    arguments.update(extra_checkpoint_data)

    data_loader = make_data_loader(
        cfg,
        is_train=True,
        is_distributed=distributed,
        start_iter=arguments["iteration"],
    )

    checkpoint_period = cfg.SOLVER.CHECKPOINT_PERIOD

    do_train(
        model,
        data_loader,
        optimizer,
        scheduler,
        checkpointer,
        device,
        checkpoint_period,
        arguments,
    )

    return model


def test(cfg, model, distributed):
    if distributed:
        model = model.module
    torch.cuda.empty_cache()  # TODO check if it helps
    iou_types = ("bbox",)
    if cfg.MODEL.MASK_ON:
        iou_types = iou_types + ("segm",)
    output_folders = [None] * len(cfg.DATASETS.TEST)
    dataset_names = cfg.DATASETS.TEST
    if cfg.OUTPUT_DIR:
        for idx, dataset_name in enumerate(dataset_names):
            output_folder = os.path.join(cfg.OUTPUT_DIR, "inference", dataset_name)
            mkdir(output_folder)
            output_folders[idx] = output_folder
    data_loaders_val = make_data_loader(cfg, is_train=False, is_distributed=distributed)
    for output_folder, dataset_name, data_loader_val in zip(output_folders, dataset_names, data_loaders_val):
        inference(
            model,
            data_loader_val,
            dataset_name=dataset_name,
            iou_types=iou_types,
            box_only=cfg.MODEL.RPN_ONLY,
            device=cfg.MODEL.DEVICE,
            expected_results=cfg.TEST.EXPECTED_RESULTS,
            expected_results_sigma_tol=cfg.TEST.EXPECTED_RESULTS_SIGMA_TOL,
            output_folder=output_folder,
        )
        synchronize()


def main():
    # These values replace the command-line arguments of tools/train_net.py
    config_file = "./datasets/simple_clothes/e2e_faster-rcnn_colab.yaml"
    # Alternative location if the config was uploaded on its own:
    # config_file = "/content/e2e_faster-rcnn_colab.yaml"
    skip_test = False
    local_rank = 0

    # WORLD_SIZE is set by the distributed launcher; default to a single GPU
    num_gpus = int(os.environ.get("WORLD_SIZE", 1))
    distributed = num_gpus > 1

    if distributed:
        torch.cuda.set_device(local_rank)
        torch.distributed.init_process_group(
            backend="nccl", init_method="env://"
        )
        synchronize()

    cfg.merge_from_file(config_file)
    cfg.freeze()

    output_dir = cfg.OUTPUT_DIR
    if output_dir:
        mkdir(output_dir)

    logger = setup_logger("maskrcnn_benchmark", output_dir, get_rank())
    logger.info("Using {} GPUs".format(num_gpus))

    logger.info("Collecting env info (might take some time)")
    logger.info("\n" + collect_env_info())

    logger.info("Loaded configuration file {}".format(config_file))
    with open(config_file, "r") as cf:
        config_str = "\n" + cf.read()
        logger.info(config_str)
    logger.info("Running with config:\n{}".format(cfg))

    model = train(cfg, local_rank, distributed)

    if not skip_test:
        test(cfg, model, distributed)


if __name__ == "__main__":
    main()

2019-02-03 23:40:48,239 maskrcnn_benchmark.trainer INFO: eta: 1:27:57  iter: 20  loss: 1.2624 (1.5991)  loss_classifier: 0.4663 (0.7930)  loss_box_reg: 0.1230 (0.1308)  loss_objectness: 0.5927 (0.5922)  loss_rpn_box_reg: 0.0629 (0.0831)  time: 1.9984 (2.0454)  data: 0.0033 (0.0313)  lr: 0.000359  max mem: 3639
2019-02-03 23:41:29,064 maskrcnn_benchmark.trainer INFO: eta: 1:27:10  iter: 40  loss: 0.8061 (1.2615)  loss_classifier: 0.2619 (0.5555)  loss_box_reg: 0.1177 (0.1260)  loss_objectness: 0.3859 (0.4943)  loss_rpn_box_reg: 0.0746 (0.0858)  time: 2.0410 (2.0434)  data: 0.0034 (0.0176)  lr: 0.000385  max mem: 3639
2019-02-03 23:42:09,865 maskrcnn_benchmark.trainer INFO: eta: 1:26:27  iter: 60  loss: 0.6592 (1.1172)  loss_classifier: 0.2333 (0.4629)  loss_box_reg: 0.1230 (0.1296)  loss_objectness: 0.2476 (0.4282)  loss_rpn_box_reg: 0.0551 (0.0965)  time: 2.0410 (2.0422)  data: 0.0034 (0.0130)  lr: 0.000412  max mem: 3639
2019-02-03 23:42:50,727 maskrcnn_benchmark.trainer INFO: eta: 1:

100%|██████████| 67/67 [00:32<00:00,  2.08it/s]

2019-02-04 01:08:20,712 maskrcnn_benchmark.inference INFO: Total inference time: 0:00:32.859985 (0.4904475354436618 s / img per device, on 1 devices)
2019-02-04 01:08:20,725 maskrcnn_benchmark.inference INFO: Preparing results for COCO format
2019-02-04 01:08:20,726 maskrcnn_benchmark.inference INFO: Preparing bbox results
2019-02-04 01:08:20,740 maskrcnn_benchmark.inference INFO: Evaluating predictions





Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.13s).
Accumulating evaluation results...
DONE (t=0.05s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.152
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.405
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.075
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.095
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.152
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.246
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.321
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.321
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=10
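
The metrics above are reported at IoU thresholds: roughly speaking, a predicted box can only match a ground-truth box at `IoU=0.50` if their intersection-over-union is at least 0.5. As a minimal sketch of that quantity (the function name `box_iou` is illustrative, boxes are assumed to be `[x, y, width, height]` as in COCO annotations, and this is not the pycocotools implementation):

```python
def box_iou(box_a, box_b):
    """Intersection-over-union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the overlap region, clamped at zero when disjoint
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h

    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


# Two 10x10 boxes offset by half a width overlap in a 5x10 strip,
# giving 50 / 150, which falls below the 0.5 threshold
print(box_iou([0, 0, 10, 10], [5, 0, 10, 10]))
```

This helps when reading the table: the gap between AP at `IoU=0.50` (0.405) and at `IoU=0.75` (0.075) suggests that many detections overlap the right object but are not tightly localized.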