# **Satellite Imagery Object Detection on the FAIR1M Dataset Using MMRotate**
- **Author: 임새란**
- **Date: 2022/08/20**
- **Description: Code that runs object detection on the FAIR1M dataset with MMRotate.**
- **Table of Contents**
  1. Prepare Required Libraries
  2. Prepare Customized Evaluation Metric
  3. Prepare Customized Dataset
  4. Prepare Customized Models
  5. Model Training


## **1. Prepare Required Libraries**
---

In [None]:
######### Install the MM libraries ##############

!pip3 install -U openmim  # openmim: installer that resolves compatibility between the MM packages
!mim install mmcv-full==1.6.0
!mim install mmdet
!mim install mmengine

!git clone https://github.com/open-mmlab/mmrotate.git
%cd /content/mmrotate
!pip install -r requirements/build.txt
!pip install -e .

######### Connect wandb ##############

!pip install wandb -qU
import wandb
wandb.login()


######### Verify the MM library installation ##############

# Check mmcv installation
from mmcv import collect_env
collect_env()
from mmcv.ops import get_compiling_cuda_version, get_compiler_version
from tqdm import tqdm 

print(get_compiling_cuda_version())
print(get_compiler_version())

# Check MMDetection installation
import mmdet
print(mmdet.__version__)

# Check MMengine installation
import mmengine
print(mmengine.__version__)

# Check MMRotate installation
import mmrotate
print(mmrotate.__version__)

# Give permission to scripts
!chmod 775 /content/mmrotate


import os
os.environ['CUDA_LAUNCH_BLOCKING'] = "1"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"



11.3
GCC 9.3
2.25.2
0.1.0
0.3.2


In [30]:
os.chdir('/content/mmrotate/configs/oriented_rcnn')
!mim download mmrotate --config oriented_rcnn_r50_fpn_1x_dota_le90 --dest .
os.chdir('/content')

processing oriented_rcnn_r50_fpn_1x_dota_le90...
oriented_rcnn_r50_fpn_1x_dota_le90-6d2b2ce0.pth exists in /content/mmrotate/configs/oriented_rcnn
Successfully dumped oriented_rcnn_r50_fpn_1x_dota_le90.py to /content/mmrotate/configs/oriented_rcnn


## **2. Prepare Customized Evaluation Metric**
---

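- Define a new evaluation metric: write code that computes mAP and F1 score and save it as `eval_map_f1score.py` (the cell in 2-1 actually writes it over MMRotate's existing `eval_map.py`).
- Import the module: edit `__init__.py` so that the new metric can be used.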
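
As a rough illustration of what the metric involves, the sketch below derives precision, recall and F1 along a ranked detection list in plain Python, then averages F1 over every point of the curve, which is how the `eval_map` code in this notebook defines its per-class F1. The helper name `pr_f1_curve` and the sample flags are hypothetical and are not part of MMRotate.

```python
def pr_f1_curve(tp_flags, num_gts, eps=1e-12):
    """Return (precisions, recalls, f1s) along a ranked detection list.

    tp_flags: 1 for a true positive, 0 for a false positive, already sorted
    by descending confidence score.
    num_gts: number of ground-truth boxes of the class.
    """
    precisions, recalls, f1s = [], [], []
    tp = fp = 0
    for flag in tp_flags:
        # cumulative counts, like np.cumsum over the sorted tp/fp arrays
        tp += flag
        fp += 1 - flag
        p = tp / max(tp + fp, eps)
        r = tp / max(num_gts, eps)
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / max(p + r, eps))
    return precisions, recalls, f1s


# Hypothetical ranking: 4 detections against 3 ground-truth boxes.
precisions, recalls, f1s = pr_f1_curve([1, 0, 1, 1], num_gts=3)
mean_f1 = sum(f1s) / len(f1s)  # F1 averaged over the whole curve
print([round(v, 3) for v in precisions])
print([round(v, 3) for v in recalls])
print(round(mean_f1, 3))
```

Averaging over the curve is a design choice of this notebook; a more common convention reports F1 at a single score threshold, so the two numbers are not directly comparable.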


### **2-1. Define a new Evaluation Metric**


In [32]:
########## eval_map ############

eval_map = '''

# Copyright (c) OpenMMLab. All rights reserved.
from multiprocessing import get_context

import numpy as np
import torch
from mmcv.ops import box_iou_rotated
from mmcv.utils import print_log
from mmdet.core import average_precision
from terminaltables import AsciiTable

# Decide TP/FP for every detection of one image; called once per class further below
def tpfp_default(det_bboxes,
                 gt_bboxes,
                 gt_bboxes_ignore=None,
                 iou_thr=0.5,
                 area_ranges=None):
    """Check if detected bboxes are true positive or false positive.

    Args:
        det_bboxes (ndarray): Detected bboxes of this image, of shape (m, 6).
        gt_bboxes (ndarray): GT bboxes of this image, of shape (n, 5).
        gt_bboxes_ignore (ndarray): Ignored gt bboxes of this image,
            of shape (k, 5). Default: None
        iou_thr (float): IoU threshold to be considered as matched.
            Default: 0.5.
        area_ranges (list[tuple] | None): Range of bbox areas to be evaluated,
            in the format [(min1, max1), (min2, max2), ...]. Default: None.

    Returns:
        tuple[np.ndarray]: (tp, fp) whose elements are 0 and 1. The shape of
            each array is (num_scales, m).
    """

    # an indicator of ignored gts
    det_bboxes = np.array(det_bboxes)
    gt_ignore_inds = np.concatenate(
        # regular gts start as False (0); ignored gts are flagged True (1) up front.
        # np.bool was removed from recent NumPy releases, so the builtin bool is used
        (np.zeros(gt_bboxes.shape[0], dtype=bool),
         np.ones(gt_bboxes_ignore.shape[0], dtype=bool)))
    # stack gt_bboxes and gt_bboxes_ignore for convenience
    gt_bboxes = np.vstack((gt_bboxes, gt_bboxes_ignore))

    num_dets = det_bboxes.shape[0]
    num_gts = gt_bboxes.shape[0]
    if area_ranges is None:
        area_ranges = [(None, None)]
    num_scales = len(area_ranges)  # 1 when area_ranges was None
    # tp and fp are of shape (num_scales, num_dets), each row is tp or fp of
    # a certain scale, initialised to num_dets zeros
    tp = np.zeros((num_scales, num_dets), dtype=np.float32)
    fp = np.zeros((num_scales, num_dets), dtype=np.float32)

####################### The image has no gt ########################################
    # if this image has no gt, every detection becomes a fp
    if gt_bboxes.shape[0] == 0:
        if area_ranges == [(None, None)]:
            # set all of fp to 1
            fp[...] = 1
        else:
            raise NotImplementedError
        return tp, fp

######################## The image has gts #########################################
    # compute tp and fp from the IoU between detections and gts
    ious = box_iou_rotated(
        torch.from_numpy(det_bboxes).float(),
        torch.from_numpy(gt_bboxes).float()).numpy()

    # for each det, the max iou with all gts
    # (maximum along axis=1, i.e. across the gt columns)
    ious_max = ious.max(axis=1)
    # for each det, which gt overlaps most with it
    # (index of that maximum along axis=1)
    ious_argmax = ious.argmax(axis=1)
    # sort all dets in descending order by scores
    sort_inds = np.argsort(-det_bboxes[:, -1])  # det_bboxes has shape (m, 6)
    for k, (min_area, max_area) in enumerate(area_ranges):
        gt_covered = np.zeros(num_gts, dtype=bool)
        # when area_ranges is not given, gt_area_ignore is all False
        if min_area is None:
            gt_area_ignore = np.zeros_like(gt_ignore_inds, dtype=bool)
        else:
            raise NotImplementedError
        for i in sort_inds:
            # take detections from the highest score down; one whose best IoU
            # reaches iou_thr is matched to that gt
            if ious_max[i] >= iou_thr:
                matched_gt = ious_argmax[i]
                if not (gt_ignore_inds[matched_gt]
                        or gt_area_ignore[matched_gt]):
                    # the matched gt is not ignored: tp if it is still
                    # unclaimed, otherwise fp (duplicate detection)
                    if not gt_covered[matched_gt]:
                        gt_covered[matched_gt] = True
                        tp[k, i] = 1
                    else:
                        fp[k, i] = 1
                # otherwise ignore this detected bbox, tp = 0, fp = 0
            elif min_area is None:
                fp[k, i] = 1
            else:
                bbox = det_bboxes[i, :5]
                area = bbox[2] * bbox[3]
                if area >= min_area and area < max_area:
                    fp[k, i] = 1
    return tp, fp


# Collect the detections and ground truths that belong to one class
def get_cls_results(det_results, annotations, class_id):
    """Get det results and gt information of a certain class.

    Args:
        det_results (list[list]): [[cls1_det, cls2_det, ...], ...].
            The outer list indicates images,
            and the inner list indicates per-class detected bboxes.
            => len(det_results) == number of images,
               len(det_results[0]) == number of classes
        annotations (list[dict]): Same as `eval_map()`.
        class_id (int): ID of a specific class.

    Returns:
        tuple[list[np.ndarray]]: detected bboxes, gt bboxes, ignored gt bboxes
    """
    # per image, pull this class's detections out of the results
    cls_dets = [img_res[class_id] for img_res in det_results]

    cls_gts = []
    cls_gts_ignore = []
    for ann in annotations:
        # keep the gt boxes whose label equals class_id
        gt_inds = ann['labels'] == class_id
        cls_gts.append(ann['bboxes'][gt_inds, :])
        # if ignored labels exist, collect the ignored boxes of this class too
        if ann.get('labels_ignore', None) is not None:
            ignore_inds = ann['labels_ignore'] == class_id
            cls_gts_ignore.append(ann['bboxes_ignore'][ignore_inds, :])

        else:
            cls_gts_ignore.append(torch.zeros((0, 5), dtype=torch.float64))

    # cls_dets: detections of this class, cls_gts: gt boxes of this class, cls_gts_ignore: ignored gt boxes
    return cls_dets, cls_gts, cls_gts_ignore


def eval_rbbox_map(det_results,
                   annotations,
                   scale_ranges=None,
                   iou_thr=0.5,
                   use_07_metric=True,
                   dataset=None,
                   logger=None,
                   nproc=4):
    """Evaluate mAP of a rotated dataset.

    Args:
        det_results (list[list]): Per-class detected bboxes of every image,
            [[cls1_det, cls2_det, ...], ...].
            The outer list indicates images, and the inner list indicates
            per-class detected bboxes.
        annotations (list[dict]): Ground truth annotations where each item
            of the list indicates an image. Keys of annotations are:
            - `bboxes`: numpy array of shape (n, 5), n rotated boxes with 5 values each
            - `labels`: numpy array of shape (n, ), the class of each box
            - `bboxes_ignore` (optional): numpy array of shape (k, 5), boxes to ignore
            - `labels_ignore` (optional): numpy array of shape (k, ), classes of the ignored boxes

        scale_ranges (list[tuple] | None): Range of scales to be evaluated,
            in the format [(min1, max1), (min2, max2), ...].
            A range of (32, 64) means the area range between (32**2, 64**2).
            Default: None.

        iou_thr (float): IoU threshold to be considered as matched.
            Default: 0.5.

        use_07_metric (bool): Whether to use the voc07 metric.
        dataset (list[str] | str | None): Dataset name or dataset classes,
            there are minor differences in metrics for different datasets, e.g.
            "voc07", "imagenet_det", etc. Default: None.

        logger (logging.Logger | str | None): The way to print the mAP
            summary. See `mmcv.utils.print_log()` for details. Default: None.
        nproc (int): Processes used for computing TP and FP.
            Default: 4.

    Returns:
        tuple: (mAP, mF1, [dict, dict, ...])
    """
    # the number of images must match the number of annotations
    assert len(det_results) == len(annotations)


    num_imgs = len(det_results)
    num_scales = len(scale_ranges) if scale_ranges is not None else 1
    num_classes = len(det_results[0])  # positive class num (one entry per class in each image's result)
    area_ranges = ([(rg[0]**2, rg[1]**2) for rg in scale_ranges]
                   if scale_ranges is not None else None)

    pool = get_context('spawn').Pool(nproc)
    eval_results = []
    # evaluate class by class
    for i in range(num_classes):
        # get gt and det bboxes of this class
        # cls_dets: detections, cls_gts: gt boxes, cls_gts_ignore: ignored gt boxes
        cls_dets, cls_gts, cls_gts_ignore = get_cls_results(
            det_results, annotations, i)

        # compute tp and fp for each image with multiple processes
        tpfp = pool.starmap(
            tpfp_default,
            zip(cls_dets, cls_gts, cls_gts_ignore,
                [iou_thr for _ in range(num_imgs)],
                [area_ranges for _ in range(num_imgs)]))
        tp, fp = tuple(zip(*tpfp))

        # calculate gt number of each scale
        # ignored gts or gts beyond the specific scale are not counted
        # num_gts: all gts of this class when area_ranges is None, otherwise only the gts inside each area range
        num_gts = np.zeros(num_scales, dtype=int)
        for _, bbox in enumerate(cls_gts):
            if area_ranges is None:
                num_gts[0] += bbox.shape[0]
            else:
                gt_areas = bbox[:, 2] * bbox[:, 3]
                for k, (min_area, max_area) in enumerate(area_ranges):
                    num_gts[k] += np.sum((gt_areas >= min_area)
                                         & (gt_areas < max_area))
        # sort all det bboxes by score, also sort tp and fp
        cls_dets = np.vstack(cls_dets)
        num_dets = cls_dets.shape[0]
        sort_inds = np.argsort(-cls_dets[:, -1])
        tp = np.hstack(tp)[:, sort_inds]
        fp = np.hstack(fp)[:, sort_inds]

        # calculate recall and precision with tp and fp
        tp = np.cumsum(tp, axis=1)
        fp = np.cumsum(fp, axis=1)
        eps = np.finfo(np.float32).eps
        recalls = tp / np.maximum(num_gts[:, np.newaxis], eps)  # num_gts is the gt count
        precisions = tp / np.maximum((tp + fp), eps)

        # calculate AP
        # without scale_ranges, drop the scale axis from recalls, precisions and num_gts
        if scale_ranges is None:
            recalls = recalls[0, :]
            precisions = precisions[0, :]
            num_gts = num_gts.item()
        mode = 'area' if not use_07_metric else '11points'
        ap = average_precision(recalls, precisions, mode)
        # added: F1 at every point of the PR curve, averaged over the curve
        f1_score = 2 * precisions * recalls / np.maximum((precisions+recalls),eps)
        # a class without detections has an empty curve; use 0.0 instead of nan
        f1_score = f1_score.mean() if f1_score.size > 0 else 0.0

        eval_results.append({
            'num_gts': num_gts,
            'num_dets': num_dets,
            'recall': recalls,
            'precision': precisions,
            'ap': ap,
            'F1' : f1_score
        })
    pool.close()

    # per-class evaluation is done; eval_results holds one entry per class

    # if scale_ranges is given,
    if scale_ranges is not None:
        # shape (num_classes, num_scales)
        # added: mean F1 alongside mAP
        all_ap = np.vstack([cls_result['ap'] for cls_result in eval_results])
        all_f1 = np.vstack([cls_result['F1'] for cls_result in eval_results])
        all_num_gts = np.vstack(
            [cls_result['num_gts'] for cls_result in eval_results])
        mean_ap = []
        mean_f1 = []
        # per scale, average over the classes that have gts: [scale 1 mean, scale 2 mean, ...]
        for i in range(num_scales):
            if np.any(all_num_gts[:, i] > 0):
                mean_ap.append(all_ap[all_num_gts[:, i] > 0, i].mean())
                mean_f1.append(all_f1[all_num_gts[:, i] > 0, i].mean())
            else:
                mean_ap.append(0.0)
                mean_f1.append(0.0)

    # without scale_ranges, collect the per-class values in lists and average them
    else:
        aps = []
        f1s = []

        for cls_result in eval_results:
            # for classes that have gts, collect ap and F1: aps = [class 1 ap, class 2 ap, ...]
            if cls_result['num_gts'] > 0:
                aps.append(cls_result['ap'])
                f1s.append(cls_result['F1'])

        mean_ap = np.array(aps).mean() if aps else 0.0
        mean_f1 = np.array(f1s).mean() if f1s else 0.0

    print_map_summary(
        mean_ap, mean_f1, eval_results, dataset, area_ranges, logger=logger)

    return mean_ap, mean_f1, eval_results
    
    # eval_results holds the per-class values

def print_map_summary(mean_ap,
                      mean_f1,
                      results,
                      dataset=None,
                      scale_ranges=None,
                      logger=None):

    """Print mAP and results of each class.

    A table will be printed to show the gts/dets/recall/precision/AP/F1 of
    each class and the mAP, mF1.

    Args:
        mean_ap (float): Calculated from `eval_map()`.
        mean_f1 (float): Calculated from `eval_map()`; the mean of the
            per-class F1 values.
        results (list[dict]): Calculated from `eval_map()`.
        dataset (list[str] | str | None): Dataset name or dataset classes.
        scale_ranges (list[tuple] | None): Range of scales to be evaluated.
        logger (logging.Logger | str | None): The way to print the mAP
            summary. See `mmcv.utils.print_log()` for details. Default: None.
    """

    if logger == 'silent':
        return

    # a per-class 'ap' array means scale_ranges was used
    if isinstance(results[0]['ap'], np.ndarray):
        num_scales = len(results[0]['ap'])  # one ap per scale for the first class
    else:
        num_scales = 1

    # if scale_ranges is given, check that it matches what was evaluated
    if scale_ranges is not None:
        assert len(scale_ranges) == num_scales

    num_classes = len(results)

    # initialise every table to zeros
    recalls = np.zeros((num_scales, num_classes), dtype=np.float32) 
    precisions = np.zeros((num_scales, num_classes), dtype=np.float32) 
    aps = np.zeros((num_scales, num_classes), dtype=np.float32)
    f1s = np.zeros((num_scales, num_classes), dtype=np.float32)
    num_gts = np.zeros((num_scales, num_classes), dtype=int)
    for i, cls_result in enumerate(results):
        # per class, fill in the final recall/precision when any exist
        if cls_result['recall'].size > 0:
            recalls[:, i] = np.array(cls_result['recall'], ndmin=2)[:, -1]
            precisions[:, i] = np.array(cls_result['precision'], ndmin=2)[:, -1]
        aps[:, i] = cls_result['ap']
        f1s[:, i] = cls_result['F1']
        num_gts[:, i] = cls_result['num_gts']

    if dataset is None:
        label_names = [str(i) for i in range(num_classes)]
    else:
        label_names = dataset

    if not isinstance(mean_ap, list):
        mean_ap = [mean_ap]

    if not isinstance(mean_f1, list):
        mean_f1 = [mean_f1]

    header = ['class', 'gts', 'dets', 'recall', 'precision', 'ap', 'F1']
    for i in range(num_scales):
        if scale_ranges is not None:
            print_log(f'Scale range {scale_ranges[i]}', logger=logger)
        table_data = [header]
        for j in range(num_classes):
            row_data = [
                label_names[j], num_gts[i, j], results[j]['num_dets'],
                f'{recalls[i, j]:.3f}', f'{precisions[i, j]:.3f}', f'{aps[i, j]:.3f}', f'{f1s[i, j]:.3f}'
            ]
            table_data.append(row_data)
        append_data = ['mAP, mF1', '', '', '','', f'{mean_ap[i]:.3f}', f'{mean_f1[i]:.3f}']
        table_data.append(append_data)
        table = AsciiTable(table_data)
        table.inner_footing_row_border = True
        print_log('\\n' + table.table, logger=logger)

'''

eval_map_path = '/content/mmrotate/mmrotate/core/evaluation/eval_map.py'

with open(eval_map_path, "w") as f:
    f.write(eval_map)
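
The matching rule in `tpfp_default` can be tried out without a compiled mmcv build. The sketch below is a simplified stand-in, not the code above: it uses axis-aligned boxes with a hand-written IoU instead of `box_iou_rotated`, and it leaves out ignored ground truths and area ranges. `iou_xyxy`, `match_detections` and the sample boxes are hypothetical.

```python
def iou_xyxy(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def match_detections(dets, gts, iou_thr=0.5):
    """Greedy TP/FP assignment for one class in one image.

    dets: list of (x1, y1, x2, y2, score); gts: list of (x1, y1, x2, y2).
    Returns per-detection tp and fp flags in the input order.
    """
    tp = [0] * len(dets)
    fp = [0] * len(dets)
    covered = [False] * len(gts)
    # visit detections from the highest score down
    order = sorted(range(len(dets)), key=lambda i: -dets[i][4])
    for i in order:
        ious = [iou_xyxy(dets[i][:4], g) for g in gts]
        best = max(range(len(gts)), key=lambda j: ious[j]) if gts else None
        if best is not None and ious[best] >= iou_thr and not covered[best]:
            covered[best] = True  # the first detection to claim a gt is the tp
            tp[i] = 1
        else:
            fp[i] = 1  # no overlap above the threshold, or a duplicate
    return tp, fp


# One ground truth; a perfect but low-scored box, a good high-scored box,
# and a box far away from everything.
gts = [(0, 0, 10, 10)]
dets = [(0, 0, 10, 10, 0.6), (1, 1, 10, 10, 0.9), (50, 50, 60, 60, 0.8)]
tp, fp = match_detections(dets, gts)
print(tp, fp)
```

The perfect box ends up as a false positive because the higher-scored box claims the ground truth first, which is the behaviour of score-ordered greedy matching.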

## **3. Prepare Customized Dataset**
---


To use MMRotate, it is best to convert the dataset to the DOTA format and arrange the directories as shown below.


```
mmrotate
├── mmrotate
├── tools
├── configs
├── data
│   ├── fair1m
│   │   ├── train
│   │   │   ├── images
│   │   │   ├── annfiles
│   │   ├── val
│   │   │   ├── images
│   │   │   ├── annfiles

```
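
A minimal sketch of creating that layout with `pathlib`, assuming nothing beyond the tree above. It builds the folders under a temporary directory so that it is safe to run anywhere; `make_fair1m_layout` is a hypothetical helper, and in practice the root would be the `mmrotate` checkout.

```python
import tempfile
from pathlib import Path


def make_fair1m_layout(root):
    """Create data/fair1m/{train,val}/{images,annfiles} under root."""
    created = []
    for split in ("train", "val"):
        for sub in ("images", "annfiles"):
            d = Path(root) / "data" / "fair1m" / split / sub
            d.mkdir(parents=True, exist_ok=True)  # safe to call repeatedly
            created.append(d)
    return created


with tempfile.TemporaryDirectory() as tmp:
    dirs = make_fair1m_layout(tmp)
    # paths relative to the root, with forward slashes on every platform
    rel = sorted(d.relative_to(tmp).as_posix() for d in dirs)

print("\n".join(rel))
```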
To set this up, run the cells below in order.

**If you have already unzipped the original FAIR1M dataset, converted the files to the DOTA format, and set up the directory structure, skip this part and go on to 2-4. Define a new custom dataset.**





### **3-1. Unzip Fair1m Dataset**






In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
############# Define the unzip function #################

from zipfile import ZipFile
import os
import time

def extract_zip(zip_path, output_dir=None, del_zip=True):
    """
    Extract a .zip file.
    Args
            - zip_path(str) : The path of the zip file
            - output_dir(str) : The directory where the extracted files will be stored. (Default: None)
                                If None, the parent directory of the zip file is used.
            - del_zip(bool): If True, the zip file is deleted after extraction. (Default: True)
    """
    output_dir = os.path.dirname(zip_path) if output_dir is None else output_dir
    os.makedirs(output_dir, exist_ok=True)

    try:
        # the context manager closes the archive, so no explicit close() is needed
        with ZipFile(zip_path, 'r') as zf:
            print('=======', zip_path, ': Unzip Start! ======')
            start_time = time.time()
            zf.extractall(output_dir)
            print(zip_path, ': Unzip Complete')
            print(" Unzip Time: {:.4f}sec".format((time.time() - start_time)))

        # delete only after the archive has been closed
        if del_zip:
            os.remove(zip_path)
            print(zip_path, ': Delete Complete')

    except Exception as e:
        print(zip_path, ": unzip error", e)



In [None]:
############# Run the unzip ###############



dataset_path = '/content/drive/MyDrive/Colab Notebooks/data/fair1m'
train1_img_path = os.path.join(dataset_path,'train_part1_images.zip')
train1_ann_path = os.path.join(dataset_path,'train_part1_labelXml.zip')
train2_img_path = os.path.join(dataset_path,'train_part2_images.zip')
train2_ann_path = os.path.join(dataset_path,'train_part2_labelXml.zip')
validation_img_path = os.path.join(dataset_path,'validation_images.zip')
validation_ann_path = os.path.join(dataset_path,'validation_labelXml.zip')

fair1m_dir = '/content/mmrotate/data/fair1m'

zip_file_list = [train1_img_path, train1_ann_path, train2_img_path, train2_ann_path, validation_img_path, validation_ann_path]


# zip_file_list = [train1_img_path, train1_ann_path]
for file in tqdm(zip_file_list):
# for file in tqdm(os.listdir(dataset_path)):
  output_dir = os.path.join(fair1m_dir,'train' if 'train' in file else 'val')
  extract_zip(file,output_dir= output_dir, del_zip = False)



/content/drive/MyDrive/Colab Notebooks/data/fair1m/train_part1_images.zip : Unzip Complete
 Unzip Time: 87.2952sec
/content/drive/MyDrive/Colab Notebooks/data/fair1m/train_part1_labelXml.zip : Unzip Complete
 Unzip Time: 1.4274sec
/content/drive/MyDrive/Colab Notebooks/data/fair1m/train_part2_images.zip : Unzip Complete
 Unzip Time: 485.2872sec
/content/drive/MyDrive/Colab Notebooks/data/fair1m/train_part2_labelXml.zip : Unzip Complete
 Unzip Time: 2.8344sec
/content/drive/MyDrive/Colab Notebooks/data/fair1m/validation_images.zip : Unzip Complete
 Unzip Time: 492.2228sec
/content/drive/MyDrive/Colab Notebooks/data/fair1m/validation_labelXml.zip : Unzip Complete
 Unzip Time: 1.9036sec

100%|██████████| 6/6 [18:09<00:00, 181.61s/it]





### **3-2. Convert Fair1m to Dota Format**

#### **3-2-1. Convert image extension: tif to png**

In [None]:
############# Define the tif-to-png function #################

import os
from pathlib import Path
from xml.etree import ElementTree as ET


def convert_tif_to_png(tif_path, output_dir=None):
    """
    Give a .tif file a .png extension. The file is renamed (moved) with
    os.replace; its pixel data is not re-encoded.
    Args
            - tif_path   : path of tif file (ex. './images/test.tif')
            - output_dir : directory where the renamed file (.png) will be stored. (Default: None)
                           If None, the file is renamed in its existing directory.
    """
    tif_file_name = os.path.basename(tif_path)
    output_dir = os.path.dirname(tif_path) if output_dir is None else output_dir
    os.makedirs(output_dir, exist_ok=True)
    output_file_name = tif_file_name.replace('.tif', '.png')
    output_path = os.path.join(output_dir, output_file_name)

    try:
        if tif_file_name.endswith('.tif'):
            os.replace(tif_path, output_path)

    except Exception as e:
        print("TIF -> PNG rename error", e)
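
Because the function above moves the file with `os.replace`, only the extension changes and the bytes stay TIFF-encoded. The sketch below illustrates this with a fake TIFF header and a check of the file signature. `sniff_format` is a hypothetical helper, and whether a given image loader decodes by content rather than by extension is an assumption to verify for your pipeline.

```python
import os
import tempfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
TIFF_SIGNATURES = (b"II*\x00", b"MM\x00*")  # little-endian and big-endian TIFF


def sniff_format(path):
    """Guess the image format from the first bytes of the file."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(PNG_SIGNATURE):
        return "png"
    if head[:4] in TIFF_SIGNATURES:
        return "tiff"
    return "unknown"


with tempfile.TemporaryDirectory() as tmp:
    tif_path = os.path.join(tmp, "demo.tif")
    with open(tif_path, "wb") as f:
        # header bytes only; this is not a decodable image
        f.write(b"II*\x00" + b"\x00" * 16)
    png_path = os.path.splitext(tif_path)[0] + ".png"
    os.replace(tif_path, png_path)  # same operation as the function above
    renamed_format = sniff_format(png_path)

print(renamed_format)
```

If true PNG files are needed, the images would have to be decoded and re-saved with an imaging library instead of renamed.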



In [None]:
############# Run convert_tif_to_png #################



fair1m_dir = '/content/mmrotate/data/fair1m'

for dir in ['train','val']:
    tif_dir = f'{fair1m_dir}/{dir}/images'
    tif_files = os.listdir(tif_dir)

    for tif_file in tqdm(tif_files):
        tif_path = os.path.join(tif_dir,tif_file)
        convert_tif_to_png(tif_path=tif_path)




100%|██████████| 16488/16488 [00:00<00:00, 23578.16it/s]
100%|██████████| 8287/8287 [00:00<00:00, 23993.46it/s]


#### **3-2-2. Convert annotation: xml to txt(dota format)**

In [None]:
############# Define the function that converts xml to a DOTA-format txt annotation #################

from pathlib import Path
from xml.etree import ElementTree as ET
def convert_xml_to_txt(xml_path, output_dir= None, convet_cls= True):
    """
    Args
            - xml_path   : path of xml file (ex. './labelXml/test.xml')
            - output_dir : directory where the converted file (.txt) will be stored. (Default: None)
                           If None, an 'annfiles' folder next to the xml folder is used.
            - convet_cls : (bool) if True, each class is converted to its parent class. (Default: True)
    """
    
    classes = {
        "Passenger Ship": {"id": 0, "category": "Ship"},
        "Motorboat": {"id": 1, "category": "Ship"},
        "Fishing Boat": {"id": 2, "category": "Ship"},
        "Tugboat": {"id": 3, "category": "Ship"},
        "other-ship": {"id": 4, "category": "Ship"},
        "Engineering Ship": {"id": 5, "category": "Ship"},
        "Liquid Cargo Ship": {"id": 6, "category": "Ship"},
        "Dry Cargo Ship": {"id": 7, "category": "Ship"},
        "Warship": {"id": 8, "category": "Ship"},
        "Small Car": {"id": 9, "category": "Vehicle"},
        "Bus": {"id": 10, "category": "Vehicle"},
        "Cargo Truck": {"id": 11, "category": "Vehicle"},
        "Dump Truck": {"id": 12, "category": "Vehicle"},
        "other-vehicle": {"id": 13, "category": "Vehicle"},
        "Van": {"id": 14, "category": "Vehicle"},
        "Trailer": {"id": 15, "category": "Vehicle"},
        "Tractor": {"id": 16, "category": "Vehicle"},
        "Excavator": {"id": 17, "category": "Vehicle"},
        "Truck Tractor": {"id": 18, "category": "Vehicle"},
        "Boeing737": {"id": 19, "category": "Airplane"},
        "Boeing747": {"id": 20, "category": "Airplane"},
        "Boeing777": {"id": 21, "category": "Airplane"},
        "Boeing787": {"id": 22, "category": "Airplane"},
        "ARJ21": {"id": 23, "category": "Airplane"},
        "C919": {"id": 24, "category": "Airplane"},
        "A220": {"id": 25, "category": "Airplane"},
        "A321": {"id": 26, "category": "Airplane"},
        "A330": {"id": 27, "category": "Airplane"},
        "A350": {"id": 28, "category": "Airplane"},
        "other-airplane": {"id": 29, "category": "Airplane"},
        "Baseball Field": {"id": 30, "category": "Court"},
        "Basketball Court": {"id": 31, "category": "Court"},
        "Football Field": {"id": 32, "category": "Court"},
        "Tennis Court": {"id": 33, "category": "Court"},
        "Roundabout": {"id": 34, "category": "Road"},
        "Intersection": {"id": 35, "category": "Road"},
        "Bridge": {"id": 36, "category": "Road"},
    }


    output_dir = os.path.join(Path(xml_path).parent.parent,'annfiles')  if output_dir == None else output_dir
    os.makedirs(output_dir, exist_ok=True)
    file_name = os.path.splitext(os.path.split(xml_path)[-1])[0]
    output_file = file_name + '.txt'
    output_path = os.path.join(output_dir,output_file)
    
    mydoc = ET.parse(xml_path)
    root = mydoc.getroot()
    objects = root.find('objects')
    items = objects.findall('object')
    label_list = []
    with open(output_path, 'w') as f:
        ann_list = []
        
        for item in items:
            try:
              label = item.find('possibleresult').find('name').text

              # NOTE: with convet_cls=False, labels that contain spaces (e.g. "Passenger Ship")
              # span several whitespace-separated fields in the DOTA line
              cat_label = classes[label]['category']  if convet_cls  else  label.replace('-',' ')
              label_list.append(cat_label)
              points = item.find('points')
              points = [[float(v) for v in point.text.split(',')] for point in points.findall('point')]
              x1, y1 = points[0]
              x2, y2 = points[1]
              x3, y3 = points[2]
              x4, y4 = points[3]

              # DOTA line: 8 polygon coordinates, class name, difficulty (always 0 here)
              ann = [x1, y1, x2, y2, x3, y3, x4, y4, cat_label, 0]
              ann = [str(v) for v in ann]
              ann_list.append(' '.join(ann))

            except Exception as e:
              print(output_path,item, ": annotation conversion error: ", e)

        f.write('\n'.join(ann_list))
    return list(set(label_list))
        # print(file_name, 'conversion complete')
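
To make the output format concrete, here is a small sketch that runs the same parsing steps on an in-memory XML string and prints the resulting DOTA line. The sample XML is hypothetical and modeled only on the tags the function above reads (`objects`, `object`, `possibleresult/name`, `points/point`); `xml_to_dota_lines` and the one-entry `PARENT` map are illustrative names.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample; real FAIR1M files carry more fields than this.
SAMPLE_XML = """<annotation>
  <objects>
    <object>
      <possibleresult><name>Boeing737</name></possibleresult>
      <points>
        <point>10.0,20.0</point>
        <point>110.0,20.0</point>
        <point>110.0,70.0</point>
        <point>10.0,70.0</point>
        <point>10.0,20.0</point>
      </points>
    </object>
  </objects>
</annotation>"""

PARENT = {"Boeing737": "Airplane"}  # fine class -> parent class


def xml_to_dota_lines(xml_text, parent_of):
    """Return one 'x1 y1 x2 y2 x3 y3 x4 y4 class difficulty' line per object."""
    root = ET.fromstring(xml_text)
    lines = []
    for obj in root.find("objects").findall("object"):
        name = obj.find("possibleresult").find("name").text
        # only the first four points are used as polygon corners
        pts = [p.text.split(",") for p in obj.find("points").findall("point")][:4]
        coords = [str(float(v)) for pt in pts for v in pt]
        lines.append(" ".join(coords + [parent_of[name], "0"]))
    return lines


dota_lines = xml_to_dota_lines(SAMPLE_XML, PARENT)
print(dota_lines[0])
```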

In [None]:
############# Run convert_xml_to_txt #################
# convert to parent classes
fair1m_dir = r'/content/mmrotate/data/fair1m'

for dir in ['train','val']:
    xml_dir = f'{fair1m_dir}/{dir}/labelXml'
    xml_files= os.listdir(xml_dir)
    print(f'\n{"-" * 15} Processing the {dir} folder {"-" * 15}')
    for xml_file in tqdm(xml_files):

        xml_path = os.path.join(xml_dir,xml_file)
        convert_xml_to_txt(xml_path)



--------------- Processing the train folder ---------------


100%|██████████| 16488/16488 [00:21<00:00, 781.56it/s] 



--------------- Processing the validation folder ---------------


100%|██████████| 8287/8287 [00:06<00:00, 1271.14it/s]


### **3-3. Crop Fair1m dataset**
The images are too large to feed to the model directly, so they are cropped first.
- It would be worth cropping at several sizes and comparing model performance per crop size.

The steps are as follows.
- Write and save the split_config.
- Run img_split.py.
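
To get a feel for how many crops a given size and gap produce, the sketch below computes window start offsets along one image axis. It assumes the rule that, to my understanding, upstream MMRotate's `img_split.py` follows: the step is `size - gap`, and the last window is shifted back so that it ends at the image border. `window_starts` and the 3000-pixel length are hypothetical.

```python
from math import ceil


def window_starts(length, size, gap):
    """Start offsets of sliding windows along one axis of an image.

    Assumes step = size - gap, i.e. `gap` is the overlap between
    neighbouring windows.
    """
    assert size > gap, "the window must be larger than the overlap"
    step = size - gap
    num = 1 if length <= size else ceil((length - size) / step + 1)
    starts = [step * i for i in range(num)]
    # pull the last window back inside the image
    if len(starts) > 1 and starts[-1] + size > length:
        starts[-1] = length - size
    return starts


starts = window_starts(length=3000, size=1024, gap=256)
print(starts)
print(len(starts) ** 2, "crops for a 3000 x 3000 image")
```

A larger gap gives more overlap and more crops, which helps objects cut by a window edge at the cost of extra training images.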

In [None]:

########################## Create the split_config ############################
# The crop settings tried this time are:
  #   - size: 1024
  #   - gaps: 256 (in upstream img_split.py the gap is the overlap between windows, so the step is size - gap)
  #   - rates (scales): 1.0

# Create a new split_config file,
# written with reference to /content/mmrotate/tools/data/dota/split/split_configs/ss_trainval.json.
# https://github.com/open-mmlab/mmrotate/blob/6eb7a277ed62f64952587c9e41498f8e3c0cfe63/tools/data/dota/README.md


size = 1024
gap = 256
rate = 1.0
img_rate_thr = 0.6
iof_thr = 0.7

train_img_dir = r'/content/mmrotate/data/fair1m/train/images/'
train_ann_dir = r'/content/mmrotate/data/fair1m/train/annfiles/'

save_dir = f"/content/mmrotate/data/fair1m/split_{size}_{str(rate).replace('.','_')}/"
save_ext = '.tif'
split_config_text =  f'''



  "img_dirs": [
    "{train_img_dir}"
  ],
  "ann_dirs": [
    "{train_ann_dir}"
  ],
  "sizes": [
    {size}
  ],
  "gaps": [
    {gap}
  ],
  "rates": [
    {rate}
  ],
  "img_rate_thr": {img_rate_thr},
  "iof_thr": {iof_thr},
  "no_padding": false,
  "padding_value": [
    104,
    116,
    124
  ],
  "save_dir": "{save_dir}",
  "save_ext": "{save_ext}"


'''

split_config_dir = '/content/mmrotate/tools/data/fair1m/split/split_configs'
os.makedirs(split_config_dir, exist_ok=True)
split_config_name =  f"ss_train_{size}_{str(rate).replace('.','_')}.json"
split_config_path =  os.path.join(split_config_dir,split_config_name)
with open(split_config_path, "w") as f:
    f.write("{")
    f.write(split_config_text)
    f.write("}")
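
An alternative sketch of the same step: build the config as a dictionary and let `json.dump` handle the braces, quoting and booleans, then read it back to confirm that it is valid JSON. The keys and values mirror the cell above; `build_split_config` is a hypothetical helper, and the file is written to a temporary directory here instead of the MMRotate tree.

```python
import json
import os
import tempfile


def build_split_config(img_dir, ann_dir, save_dir, size=1024, gap=256,
                       rate=1.0, img_rate_thr=0.6, iof_thr=0.7,
                       save_ext=".tif"):
    """Return the split config as a plain dict (same keys as the cell above)."""
    return {
        "img_dirs": [img_dir],
        "ann_dirs": [ann_dir],
        "sizes": [size],
        "gaps": [gap],
        "rates": [rate],
        "img_rate_thr": img_rate_thr,
        "iof_thr": iof_thr,
        "no_padding": False,  # serialised as JSON false
        "padding_value": [104, 116, 124],
        "save_dir": save_dir,
        "save_ext": save_ext,
    }


config = build_split_config(
    img_dir="/content/mmrotate/data/fair1m/train/images/",
    ann_dir="/content/mmrotate/data/fair1m/train/annfiles/",
    save_dir="/content/mmrotate/data/fair1m/split_1024_1_0/",
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ss_train_1024_1_0.json")
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
    with open(path) as f:
        loaded = json.load(f)  # raises if the file is not valid JSON

print(loaded["sizes"], loaded["gaps"], loaded["save_ext"])
```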

In [None]:
########################## Create img_split.py ############################


img_split_txt = '''
# Copyright (c) OpenMMLab. All rights reserved.
# Written by jbwang1997
# Reference: https://github.com/jbwang1997/BboxToolkit

import argparse
import codecs
import datetime
import itertools
import json
import logging
import os
import os.path as osp
import time
from functools import partial, reduce
from math import ceil
from multiprocessing import Manager, Pool

import cv2
import numpy as np
from PIL import Image

Image.MAX_IMAGE_PIXELS = None

try:
    import shapely.geometry as shgeo
except ImportError:
    shgeo = None


def add_parser(parser):
    """Add arguments."""
    parser.add_argument(
        '--base-json',
        type=str,
        default=None,
        help='json config file for split images')
    parser.add_argument(
        '--nproc', type=int, default=10, help='the procession number')

    # argument for loading data
    parser.add_argument(
        '--img-dirs',
        nargs='+',
        type=str,
        default=None,
        help='images dirs, must give a value')
    parser.add_argument(
        '--ann-dirs',
        nargs='+',
        type=str,
        default=None,
        help='annotations dirs, optional')

    # argument for splitting image
    parser.add_argument(
        '--sizes',
        nargs='+',
        type=int,
        default=[1024],
        help='the sizes of sliding windows')
    parser.add_argument(
        '--gaps',
        nargs='+',
        type=int,
        default=[512],
        help='the steps of sliding widnows')
    parser.add_argument(
        '--rates',
        nargs='+',
        type=float,
        default=[1.],
        help='same as DOTA devkit rate, but only change windows size')
    parser.add_argument(
        '--img-rate-thr',
        type=float,
        default=0.6,
        help='the minimal rate of image in window and window')
    parser.add_argument(
        '--iof-thr',
        type=float,
        default=0.7,
        help='the minimal iof between a object and a window')
    parser.add_argument(
        '--no-padding',
        action='store_true',
        help='not padding patches in regular size')
    parser.add_argument(
        '--padding-value',
        nargs='+',
        type=int,
        default=[0],
        help='padding value, 1 or channel number')

    # argument for saving
    parser.add_argument(
        '--save-dir',
        type=str,
        default='.',
        help='to save pkl and split images')
    parser.add_argument(
        '--save-ext',
        type=str,
        default='.tif',
        help='the extension of saving images')


def parse_args():
    """Parse arguments."""
    parser = argparse.ArgumentParser(description='Splitting images')
    add_parser(parser)
    args = parser.parse_args()

    if args.base_json is not None:
        with open(args.base_json, 'r') as f:
            prior_config = json.load(f)

        for action in parser._actions:
            if action.dest not in prior_config or \
                    not hasattr(action, 'default'):
                continue
            action.default = prior_config[action.dest]
        args = parser.parse_args()

    # assert arguments
    assert args.img_dirs is not None, "argument img_dirs can't be None"
    assert args.ann_dirs is None or len(args.ann_dirs) == len(args.img_dirs)
    assert len(args.sizes) == len(args.gaps)
    assert len(args.sizes) == 1 or len(args.rates) == 1
    assert args.save_ext in ['.png', '.jpg', '.bmp', '.tif']
    assert args.iof_thr >= 0 and args.iof_thr < 1
    assert not osp.exists(args.save_dir), \
        f'{osp.join(args.save_dir)} already exists'
    return args


def get_sliding_window(info, sizes, gaps, img_rate_thr):
    """Get sliding windows.
    Args:
        info (dict): Dict of image's width and height.
        sizes (list): List of window's sizes.
        gaps (list): List of window's gaps.
        img_rate_thr (float): Threshold of window area divided by image area.
    Returns:
        list[np.array]: Information of valid windows.
    """
    eps = 0.01
    windows = []
    width, height = info['width'], info['height']
    for size, gap in zip(sizes, gaps):
        assert size > gap, f'invaild size gap pair [{size} {gap}]'
        step = size - gap

        x_num = 1 if width <= size else ceil((width - size) / step + 1)
        x_start = [step * i for i in range(x_num)]
        if len(x_start) > 1 and x_start[-1] + size > width:
            x_start[-1] = width - size

        y_num = 1 if height <= size else ceil((height - size) / step + 1)
        y_start = [step * i for i in range(y_num)]
        if len(y_start) > 1 and y_start[-1] + size > height:
            y_start[-1] = height - size

        start = np.array(
            list(itertools.product(x_start, y_start)), dtype=np.int64)
        stop = start + size
        windows.append(np.concatenate([start, stop], axis=1))
    windows = np.concatenate(windows, axis=0)

    img_in_wins = windows.copy()
    img_in_wins[:, 0::2] = np.clip(img_in_wins[:, 0::2], 0, width)
    img_in_wins[:, 1::2] = np.clip(img_in_wins[:, 1::2], 0, height)
    img_areas = (img_in_wins[:, 2] - img_in_wins[:, 0]) * \
                (img_in_wins[:, 3] - img_in_wins[:, 1])
    win_areas = (windows[:, 2] - windows[:, 0]) * \
                (windows[:, 3] - windows[:, 1])
    img_rates = img_areas / win_areas
    if not (img_rates > img_rate_thr).any():
        max_rate = img_rates.max()
        img_rates[abs(img_rates - max_rate) < eps] = 1
    return windows[img_rates > img_rate_thr]


def poly2hbb(polys):
    """Convert polygons to horizontal bboxes.
    Args:
        polys (np.array): Polygons with shape (N, 8)
    Returns:
        np.array: Horizontal bboxes.
    """
    shape = polys.shape
    polys = polys.reshape(*shape[:-1], shape[-1] // 2, 2)
    lt_point = np.min(polys, axis=-2)
    rb_point = np.max(polys, axis=-2)
    return np.concatenate([lt_point, rb_point], axis=-1)


def bbox_overlaps_iof(bboxes1, bboxes2, eps=1e-6):
    """Compute bbox overlaps (iof).
    Args:
        bboxes1 (np.array): Horizontal bboxes1.
        bboxes2 (np.array): Horizontal bboxes2.
        eps (float, optional): Defaults to 1e-6.
    Returns:
        np.array: Overlaps.
    """
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]

    if rows * cols == 0:
        return np.zeros((rows, cols), dtype=np.float32)

    hbboxes1 = poly2hbb(bboxes1)
    hbboxes2 = bboxes2
    hbboxes1 = hbboxes1[:, None, :]
    lt = np.maximum(hbboxes1[..., :2], hbboxes2[..., :2])
    rb = np.minimum(hbboxes1[..., 2:], hbboxes2[..., 2:])
    wh = np.clip(rb - lt, 0, np.inf)
    h_overlaps = wh[..., 0] * wh[..., 1]

    l, t, r, b = [bboxes2[..., i] for i in range(4)]
    polys2 = np.stack([l, t, r, t, r, b, l, b], axis=-1)
    if shgeo is None:
        raise ImportError('Please run "pip install shapely" '
                          'to install shapely first.')
    sg_polys1 = [shgeo.Polygon(p) for p in bboxes1.reshape(rows, -1, 2)]
    sg_polys2 = [shgeo.Polygon(p) for p in polys2.reshape(cols, -1, 2)]
    overlaps = np.zeros(h_overlaps.shape)
    for p in zip(*np.nonzero(h_overlaps)):
        overlaps[p] = sg_polys1[p[0]].intersection(sg_polys2[p[-1]]).area
    unions = np.array([p.area for p in sg_polys1], dtype=np.float32)
    unions = unions[..., None]

    unions = np.clip(unions, eps, np.inf)
    outputs = overlaps / unions
    if outputs.ndim == 1:
        outputs = outputs[..., None]
    return outputs


def get_window_obj(info, windows, iof_thr):
    """
    Args:
        info (dict): Dict of bbox annotations.
        windows (np.array): information of sliding windows.
        iof_thr (float): Threshold of overlaps between bbox and window.
    Returns:
        list[dict]: List of bbox annotations of every window.
    """
    bboxes = info['ann']['bboxes']
    iofs = bbox_overlaps_iof(bboxes, windows)

    window_anns = []
    for i in range(windows.shape[0]):
        win_iofs = iofs[:, i]
        pos_inds = np.nonzero(win_iofs >= iof_thr)[0].tolist()

        win_ann = dict()
        for k, v in info['ann'].items():
            try:
                win_ann[k] = v[pos_inds]
            except TypeError:
                win_ann[k] = [v[i] for i in pos_inds]
        win_ann['trunc'] = win_iofs[pos_inds] < 1
        window_anns.append(win_ann)
    return window_anns


def crop_and_save_img(info, windows, window_anns, img_dir, no_padding,
                      padding_value, save_dir, anno_dir, img_ext):
    """
    Args:
        info (dict): Image's information.
        windows (np.array): information of sliding windows.
        window_anns (list[dict]): List of bbox annotations of every window.
        img_dir (str): Path of images.
        no_padding (bool): If True, no padding.
        padding_value (tuple[int|float]): Padding value.
        save_dir (str): Save filename.
        anno_dir (str): Annotation filename.
        img_ext (str): Picture suffix.
    Returns:
        list[dict]: Information of paths.
    """
    img = cv2.imread(osp.join(img_dir, info['filename']))
    patch_infos = []
    for i in range(windows.shape[0]):
        patch_info = dict()
        for k, v in info.items():
            if k not in ['id', 'filename', 'width', 'height', 'ann']:
                patch_info[k] = v

        window = windows[i]
        x_start, y_start, x_stop, y_stop = window.tolist()
        patch_info['x_start'] = x_start
        patch_info['y_start'] = y_start
        patch_info['id'] = info['id'] + '__' + str(x_stop - x_start) + \
            '__' + str(x_start) + '___' + str(y_start)
        patch_info['ori_id'] = info['id']

        ann = window_anns[i]
        ann['bboxes'] = translate(ann['bboxes'], -x_start, -y_start)
        patch_info['ann'] = ann

        patch = img[y_start:y_stop, x_start:x_stop]
        if not no_padding:
            height = y_stop - y_start
            width = x_stop - x_start
            if height > patch.shape[0] or width > patch.shape[1]:
                padding_patch = np.empty((height, width, patch.shape[-1]),
                                         dtype=np.uint8)
                if not isinstance(padding_value, (int, float)):
                    assert len(padding_value) == patch.shape[-1]
                padding_patch[...] = padding_value
                padding_patch[:patch.shape[0], :patch.shape[1], ...] = patch
                patch = padding_patch
        patch_info['height'] = patch.shape[0]
        patch_info['width'] = patch.shape[1]

        cv2.imwrite(osp.join(save_dir, patch_info['id'] + img_ext), patch)
        patch_info['filename'] = patch_info['id'] + img_ext
        patch_infos.append(patch_info)

        bboxes_num = patch_info['ann']['bboxes'].shape[0]
        outdir = os.path.join(anno_dir, patch_info['id'] + '.txt')

        with codecs.open(outdir, 'w', 'utf-8') as f_out:
            if bboxes_num == 0:
                pass
            else:
                for idx in range(bboxes_num):
                    obj = patch_info['ann']
                    outline = ' '.join(list(map(str, obj['bboxes'][idx])))
                    diffs = str(
                        obj['diffs'][idx]) if not obj['trunc'][idx] else '2'
                    outline = outline + ' ' + obj['labels'][idx] + ' ' + diffs
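                    # Terminate each annotation with a newline; chr(10) avoids
                    # a backslash escape inside the enclosing triple-quoted string.
                    f_out.write(outline + chr(10))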

    return patch_infos


def single_split(arguments, sizes, gaps, img_rate_thr, iof_thr, no_padding,
                 padding_value, save_dir, anno_dir, img_ext, lock, prog, total,
                 logger):
    """
    Args:
        arguments (object): Parameters.
        sizes (list): List of window's sizes.
        gaps (list): List of window's gaps.
        img_rate_thr (float): Threshold of window area divided by image area.
        iof_thr (float): Threshold of overlaps between bbox and window.
        no_padding (bool): If True, no padding.
        padding_value (tuple[int|float]): Padding value.
        save_dir (str): Save filename.
        anno_dir (str): Annotation filename.
        img_ext (str): Picture suffix.
        lock (object): Lock of Manager.
        prog (object): Progress of Manager.
        total (object): Length of infos.
        logger (object): Logger.
    Returns:
        list[dict]: Information of paths.
    """
    info, img_dir = arguments
    windows = get_sliding_window(info, sizes, gaps, img_rate_thr)
    window_anns = get_window_obj(info, windows, iof_thr)
    patch_infos = crop_and_save_img(info, windows, window_anns, img_dir,
                                    no_padding, padding_value, save_dir,
                                    anno_dir, img_ext)
    assert patch_infos

    lock.acquire()
    prog.value += 1
    msg = f'({prog.value / total:3.1%} {prog.value}:{total})'
    msg += ' - ' + f"Filename: {info['filename']}"
    msg += ' - ' + f"width: {info['width']:<5d}"
    msg += ' - ' + f"height: {info['height']:<5d}"
    msg += ' - ' + f"Objects: {len(info['ann']['bboxes']):<5d}"
    msg += ' - ' + f'Patches: {len(patch_infos)}'
    logger.info(msg)
    lock.release()

    return patch_infos


def setup_logger(log_path):
    """Setup logger.
    Args:
        log_path (str): Path of log.
    Returns:
        object: Logger.
    """
    logger = logging.getLogger('img split')
    formatter = logging.Formatter('%(asctime)s - %(message)s')
    now = datetime.datetime.now().strftime('%Y%m%d_%H%M%S')
    log_path = osp.join(log_path, now + '.log')
    handlers = [logging.StreamHandler(), logging.FileHandler(log_path, 'w')]

    for handler in handlers:
        handler.setFormatter(formatter)
        handler.setLevel(logging.INFO)
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def translate(bboxes, x, y):
    """Map bboxes from window coordinate back to original coordinate.
    Args:
        bboxes (np.array): bboxes with window coordinate.
        x (float): Deviation value of x-axis.
        y (float): Deviation value of y-axis
    Returns:
        np.array: bboxes with original coordinate.
    """
    dim = bboxes.shape[-1]
    translated = bboxes + np.array([x, y] * int(dim / 2), dtype=np.float32)
    return translated


def load_dota(img_dir, ann_dir=None, nproc=10):
    """Load DOTA dataset.
    Args:
        img_dir (str): Path of images.
        ann_dir (str): Path of annotations.
        nproc (int): number of processes.
    Returns:
        list: Dataset's contents.
    """
    assert osp.isdir(img_dir), f'The {img_dir} is not an existing dir!'
    assert ann_dir is None or osp.isdir(
        ann_dir), f'The {ann_dir} is not an existing dir!'

    print('Starting loading DOTA dataset information.')
    start_time = time.time()
    _load_func = partial(_load_dota_single, img_dir=img_dir, ann_dir=ann_dir)
    if nproc > 1:
        pool = Pool(nproc)
        contents = pool.map(_load_func, os.listdir(img_dir))
        pool.close()
    else:
        contents = list(map(_load_func, os.listdir(img_dir)))
    contents = [c for c in contents if c is not None]
    end_time = time.time()
    print(f'Finishing loading DOTA, get {len(contents)} iamges,',
          f'using {end_time - start_time:.3f}s.')

    return contents


def _load_dota_single(imgfile, img_dir, ann_dir):
    """Load DOTA's single image.
    Args:
        imgfile (str): Filename of single image.
        img_dir (str): Path of images.
        ann_dir (str): Path of annotations.
    Returns:
        dict: Content of single image.
    """
    img_id, ext = osp.splitext(imgfile)
    if ext not in ['.jpg', '.JPG', '.png', '.tif', '.bmp']:
        return None

    imgpath = osp.join(img_dir, imgfile)
    size = Image.open(imgpath).size
    txtfile = None if ann_dir is None else osp.join(ann_dir, img_id + '.txt')
    content = _load_dota_txt(txtfile)

    content.update(
        dict(width=size[0], height=size[1], filename=imgfile, id=img_id))
    return content


def _load_dota_txt(txtfile):
    """Load DOTA's txt annotation.
    Args:
        txtfile (str): Filename of single txt annotation.
    Returns:
        dict: Annotation of single image.
    """
    gsd, bboxes, labels, diffs = None, [], [], []
    if txtfile is None:
        pass
    elif not osp.isfile(txtfile):
        print(f"Can't find {txtfile}, treated as empty txtfile")
    else:
        with open(txtfile, 'r') as f:
            for line in f:
                if line.startswith('gsd'):
                    num = line.split(':')[-1]
                    try:
                        gsd = float(num)
                    except ValueError:
                        gsd = None
                    continue

                items = line.split(' ')
                if len(items) >= 9:
                    bboxes.append([float(i) for i in items[:8]])
                    labels.append(items[8])
                    diffs.append(int(items[9]) if len(items) == 10 else 0)

    bboxes = np.array(bboxes, dtype=np.float32) if bboxes else \
        np.zeros((0, 8), dtype=np.float32)
    diffs = np.array(diffs, dtype=np.int64) if diffs else \
        np.zeros((0,), dtype=np.int64)
    ann = dict(bboxes=bboxes, labels=labels, diffs=diffs)
    return dict(gsd=gsd, ann=ann)


def main():
    """Main function of image split."""
    args = parse_args()

    if args.ann_dirs is None:
        args.ann_dirs = [None for _ in range(len(args.img_dirs))]
    padding_value = args.padding_value[0] \
        if len(args.padding_value) == 1 else args.padding_value
    sizes, gaps = [], []
    for rate in args.rates:
        sizes += [int(size / rate) for size in args.sizes]
        gaps += [int(gap / rate) for gap in args.gaps]
    save_imgs = osp.join(args.save_dir, 'images')
    # save_files = osp.join(args.save_dir, 'annfiles')
    save_files = osp.join(args.save_dir, 'labelTxt')  # save annotation files under labelTxt instead of annfiles
    os.makedirs(save_imgs, exist_ok=True)
    os.makedirs(save_files, exist_ok=True)
    logger = setup_logger(args.save_dir)

    print('Loading original data!!!')
    infos, img_dirs = [], []
    for img_dir, ann_dir in zip(args.img_dirs, args.ann_dirs):
        _infos = load_dota(img_dir=img_dir, ann_dir=ann_dir, nproc=args.nproc)
        _img_dirs = [img_dir for _ in range(len(_infos))]
        infos.extend(_infos)
        img_dirs.extend(_img_dirs)

    print('Start splitting images!!!')
    start = time.time()
    manager = Manager()
    worker = partial(
        single_split,
        sizes=sizes,
        gaps=gaps,
        img_rate_thr=args.img_rate_thr,
        iof_thr=args.iof_thr,
        no_padding=args.no_padding,
        padding_value=padding_value,
        save_dir=save_imgs,
        anno_dir=save_files,
        img_ext=args.save_ext,
        lock=manager.Lock(),
        prog=manager.Value('i', 0),
        total=len(infos),
        logger=logger)


    if args.nproc > 1:
        pool = Pool(args.nproc)
        patch_infos = pool.map(worker, zip(infos, img_dirs))
        pool.close()
    else:
        patch_infos = list(map(worker, zip(infos, img_dirs)))

    patch_infos = reduce(lambda x, y: x + y, patch_infos)
    stop = time.time()
    print(f'Finish splitting images in {int(stop - start)} second!!!')
    print(f'Total images number: {len(patch_infos)}')


if __name__ == '__main__':
    main()
'''


img_split_dir = '/content/mmrotate/tools/data/fair1m/split' 
img_split_path = os.path.join(img_split_dir,'img_split.py')
os.makedirs(img_split_dir, exist_ok=True)

with open(img_split_path, "w") as f:
    f.write(img_split_txt)
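
To make the window placement in `get_sliding_window` easier to follow, here is a small standalone sketch of its per-axis logic. The function name `window_starts` is hypothetical; the arithmetic mirrors the `x_start` / `y_start` computation in the script. The example sizes are assumptions chosen for illustration.

```python
from math import ceil


def window_starts(length, size, gap):
    """Start offsets of sliding windows along one image axis.

    length: image width or height in pixels
    size:   window size in pixels
    gap:    overlap between neighbouring windows in pixels
    """
    assert size > gap, f'invalid size gap pair [{size} {gap}]'
    step = size - gap

    # An image that fits inside one window produces a single window at 0.
    num = 1 if length <= size else ceil((length - size) / step + 1)
    starts = [step * i for i in range(num)]

    # Shift the last window back so it ends exactly at the image border.
    if len(starts) > 1 and starts[-1] + size > length:
        starts[-1] = length - size
    return starts


print(window_starts(1000, 1024, 512))  # → [0]
print(window_starts(2000, 1024, 512))  # → [0, 512, 976]
```

An image no larger than the window yields exactly one patch per axis, while larger images yield overlapping patches with the last one aligned to the border.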


In [None]:
################# Run the crop ####################

# Move the working directory to the folder that holds the generated file so it can be run.

print('working directory:',os.getcwd())

dest_path = r'/content/mmrotate/tools/data/fair1m/split'


os.chdir(dest_path)

print('working directory changed to ',os.getcwd())

# Set the path of the JSON file created above.
# Note that the working directory is already dest_path, so the JSON path must be relative to dest_path.
split_config_path =  f"split_configs/ss_train_{size}_{str(rate).replace('.','_')}.json"

try:
  fi = 'img_split.py'
  %run {fi} --base-json {split_config_path}
  print('crop complete!!')

except Exception as e:   
  print('crop fail!!')
  print(e)

# Change the working directory back to /content.
os.chdir('/content')
print('working directory changed to ',os.getcwd())


            



working directory: /content/mmrotate
working directory changed to  /content/mmrotate/tools/data/fair1m/split
Loading original data!!!
Starting loading DOTA dataset information.
Finishing loading DOTA, get 16488 iamges, using 27.237s.
Start splitting images!!!


Streaming output truncated to the last 5000 lines.
2022-09-25 01:48:20,360 - (84.8% 13989:16488) - Filename: 12063.tif - width: 1000  - height: 1000  - Objects: 114   - Patches: 1
INFO:img split:(84.8% 13989:16488) - Filename: 12063.tif - width: 1000  - height: 1000  - Objects: 114   - Patches: 1
2022-09-25 01:48:20,414 - (84.8% 13990:16488) - Filename: 12025.tif - width: 1000  - height: 1000  - Objects: 18    - Patches: 1
INFO:img split:(84.8% 13990:16488) - Filename: 12025.tif - width: 1000  - height: 1000  - Objects: 18    - Patches: 1
2022-09-25 01:48:20,471 - (84.9% 13991:16488) - Filename: 1359.tif - width: 1000  - height: 1000  - Objects: 40    - Patches: 1
INFO:img split:(84.9% 13991:16488) - Filename: 1359.tif - width: 1000  - height: 1000  - Objects: 40    - Patches: 1
2022-09-25 01:48:20,511 - (84.9% 13992:16488) - Filename: 5605.tif - width: 600   - height: 800   - Objects: 4     - Patches: 1
INFO:img split:(84.9% 13992:16488) - Filename: 5605.tif - width: 600   - height: 80

Finish splitting images in 950 second!!!
Total images number: 23061
crop complete!!
working directory changed to  /content
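
Each patch written by `crop_and_save_img` is named `<original id>__<window size>__<x_start>___<y_start>`. The sketch below recovers those fields from a patch id, which can be handy when mapping detections on a patch back to the source image. The function name `parse_patch_id` and the example id are hypothetical.

```python
import re

# <ori_id>__<size>__<x_start>___<y_start>, as built in crop_and_save_img.
_PATCH_ID = re.compile(
    r'^(?P<ori_id>.+)__(?P<size>\d+)__(?P<x>-?\d+)___(?P<y>-?\d+)$')


def parse_patch_id(patch_id):
    """Split a patch id into (ori_id, size, x_start, y_start)."""
    match = _PATCH_ID.match(patch_id)
    if match is None:
        raise ValueError(f'not a patch id: {patch_id!r}')
    return (match.group('ori_id'), int(match.group('size')),
            int(match.group('x')), int(match.group('y')))


# Hypothetical id for illustration.
print(parse_patch_id('12063__1024__0___0'))
```

Adding `x_start` and `y_start` to patch-level box coordinates gives coordinates in the original image, the inverse of the `translate` call in the script.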


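### **3-4. Define a new dataset: FAIR1MDataset**
Write the code that defines the `FAIR1MDataset` class and save it as `fair1m.py`.  
The class inherits from `DOTADataset` and keeps its basic structure, changing only the following:
- Class names and color palette
- Evaluation metrics (DOTA's original metric, mAP, is replaced by per-IoU mAP and F1Score)
  - IoU thresholds are passed as a list and looped over to produce the metric values for each IoU.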
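
The per-IoU loop described in the list can be sketched as follows. This is a minimal illustration, not the notebook's evaluation code: `evaluate_per_iou` and `dummy_eval` are hypothetical names, `eval_fn` stands in for whatever function computes the metrics at one threshold, and the numbers produced by `dummy_eval` are made up.

```python
def f1_score(precision, recall, eps=1e-12):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / max(precision + recall, eps)


def evaluate_per_iou(eval_fn, iou_thrs):
    """Call eval_fn once per IoU threshold and collect the results.

    eval_fn is a hypothetical callable: iou_thr -> (mean_ap, mean_f1).
    """
    results = {}
    for thr in iou_thrs:
        mean_ap, mean_f1 = eval_fn(thr)
        results[f'mAP@{thr:.2f}'] = mean_ap
        results[f'F1Score@{thr:.2f}'] = mean_f1
    return results


def dummy_eval(iou_thr):
    """Stand-in evaluator with made-up values, for illustration only."""
    precision, recall = 1.0 - iou_thr / 2, 1.0 - iou_thr
    return precision * recall, f1_score(precision, recall)


per_iou = evaluate_per_iou(dummy_eval, [0.5, 0.75])
for key, value in per_iou.items():
    print(f'{key}: {value:.3f}')
```

Keying the results by threshold keeps each value distinguishable when it is logged.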



#### 3-4-1. Parent class Dataset
- Build a dataset defined by the parent classes ('Airplane', 'Ship', 'Vehicle')

In [33]:

########################## Define the new dataset: create the FAIR1MDataset class and write fair1m.py ############################

FAIR1MDataset_text = '''

from .builder import ROTATED_DATASETS
from .dota import DOTADataset
import os
import os.path as osp
import xml.etree.ElementTree as ET
from collections import OrderedDict
import numpy as np
from PIL import Image
from mmrotate.core import eval_rbbox_map


@ROTATED_DATASETS.register_module()
class FAIR1MDataset(DOTADataset):
    """fair1m dataset for detection """
    CLASSES = ('Vehicle', 'Airplane', 'Ship')
    PALETTE = [(165, 42, 42), (0, 225, 0), (0, 0, 225)]

    def __init__(self, **kwargs):
        super(FAIR1MDataset, self).__init__(**kwargs)

    def evaluate(self,
                 results,
                 metric= ['mAP','F1Score'],
                 logger=None,
                 proposal_nums=(100, 300, 1000),
                 iou_thr=0.5,
                 scale_ranges=None,
                 nproc=4):

        """Evaluate the dataset.
        Args:
            results (list): Testing results of the dataset.
            metric (str | list[str]): Metrics to be evaluated.
            logger (logging.Logger | None | str): Logger used for printing
                related information during evaluation. Default: None.
            proposal_nums (Sequence[int]): Proposal number used for evaluating
                recalls, such as recall@100, recall@1000.
                Default: (100, 300, 1000).
            iou_thr (float | list[float]): IoU threshold. It must be a float
                when evaluating mAP, and can be a list when evaluating recall.
                Default: 0.5.
            scale_ranges (list[tuple] | None): Scale ranges for evaluating mAP.
                Default: None.
            use_07_metric (bool): Whether to use the voc07 metric.
            nproc (int): Processes used for computing TP and FP.
                Default: 4.
        """
        
        nproc = min(nproc, os.cpu_count())

        # Accept either a single metric name or a list of names.
        metrics = [metric] if isinstance(metric, str) else list(metric)

        # Reject any metric that is not listed in allowed_metrics.
        allowed_metrics = ['mAP', 'F1Score']
        for name in metrics:
            if name not in allowed_metrics:
                raise KeyError(f'metric {name} is not supported')

        assert isinstance(iou_thr, float)
        annotations = [self.get_ann_info(i) for i in range(len(self))]
        eval_results = {}

        # The customized eval_rbbox_map is unpacked into three values, the
        # first two being mAP and F1Score. Both come from the same call, so
        # both are reported for any valid metric setting.
        mean_ap, mean_f1, _ = eval_rbbox_map(
            results,
            annotations,
            scale_ranges=scale_ranges,
            iou_thr=iou_thr,
            dataset=self.CLASSES,
            logger=logger,
            nproc=nproc)
        eval_results['mAP'] = mean_ap
        eval_results['F1Score'] = mean_f1

        return eval_results
        
'''

FAIR1MDataset_path = '/content/mmrotate/mmrotate/datasets/fair1m.py'

with open(FAIR1MDataset_path, "w") as f:
    f.write(FAIR1MDataset_text)
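
Because `metric` may arrive as a single string or as a list, it helps to normalize it before checking it. The sketch below shows one way to do that; `normalize_metrics` is a hypothetical helper, not part of MMRotate. It also shows a Python detail that is easy to trip over: `'mAP' and 'F1Score'` is not a pair of metrics, because `and` returns its last operand when every operand is truthy.

```python
def normalize_metrics(metric, allowed=('mAP', 'F1Score')):
    """Return the requested metric names as a list, validating each one."""
    names = [metric] if isinstance(metric, str) else list(metric)
    for name in names:
        if name not in allowed:
            raise KeyError(f'metric {name} is not supported')
    return names


# `and` evaluates to its last operand when all operands are truthy.
print('mAP' and 'F1Score')

print(normalize_metrics('mAP'))
print(normalize_metrics(['mAP', 'F1Score']))
```

To request both metrics explicitly, pass a list such as `['mAP', 'F1Score']` to code that normalizes its input this way.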

### **3-5. Import the FAIR1MDataset Module**

- Import the module: modify `__init__.py` so that the newly created dataset can be used.


In [34]:

########################## Modify the datasets __init__.py ############################

FAIR1MDataset_init = '''
# Copyright (c) OpenMMLab. All rights reserved.
from .builder import build_dataset  # noqa: F401, F403
from .dota import DOTADataset  # noqa: F401, F403
from .hrsc import HRSCDataset  # noqa: F401, F403
from .pipelines import *  # noqa: F401, F403
from .sar import SARDataset  # noqa: F401, F403
from .fair1m import FAIR1MDataset

__all__ = ['SARDataset', 'FAIR1MDataset', 'DOTADataset', 'build_dataset', 'HRSCDataset']
'''

FAIR1MDataset_init_path = '/content/mmrotate/mmrotate/datasets/__init__.py'

with open(FAIR1MDataset_init_path, "w") as f:
    f.write(FAIR1MDataset_init)
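
Overwriting `__init__.py` replaces whatever the installed MMRotate version exports there. A gentler alternative, sketched below under the assumption that importing the module is enough to trigger `@ROTATED_DATASETS.register_module()`, is to append the import line only when it is missing. `ensure_line` is a hypothetical helper, and the demo runs on a temporary file rather than on the real package.

```python
import os
import tempfile


def ensure_line(path, line):
    """Append `line` to the file at `path` unless it is already present.

    Returns True when the file was modified.
    """
    with open(path, 'r', encoding='utf-8') as f:
        content = f.read()
    if line in content.splitlines():
        return False
    with open(path, 'a', encoding='utf-8') as f:
        if content and not content.endswith('\n'):
            f.write('\n')
        f.write(line + '\n')
    return True


# Demo on a temporary stand-in for mmrotate/datasets/__init__.py.
demo_path = os.path.join(tempfile.mkdtemp(), '__init__.py')
with open(demo_path, 'w', encoding='utf-8') as f:
    f.write('from .dota import DOTADataset  # noqa: F401, F403')

import_line = 'from .fair1m import FAIR1MDataset  # noqa: F401'
first = ensure_line(demo_path, import_line)
second = ensure_line(demo_path, import_line)  # already there, no change
print(first, second)
```

This keeps the cell safe to re-run and leaves the other exports in the file untouched. Note that it does not add the class to `__all__`.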

## **4. Prepare Customized Model**
---


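- Before customizing a model, you need to understand how config files are structured. Once the basic config structure is clear, you can build models with a wide range of architectures.
- The config files are laid out as follows.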
    ```
    mmrotate
    ├── configs
    │   ├── _base_
    │   │   ├── datasets
    |   │   │   ├── dotav1.py
    |   │   │   ├── ...
    │   │   ├── schedules
    |   │   │   ├── schedule_1x.py
    |   │   │   ├── ...
    │   │   ├── default_runtime.py
    |   |
    │   ├── oriented_rcnn
    |   │   ├── oriented_rcnn_r50_fpn_1x_fair1m_le90.py
    |   │   ├── ...
    │   ├── oriented_reppoints
    │   ├── ...

    ```

  - The base config files live in the `configs/_base_/` directory.
  - That directory has four components (dataset, model, schedule, default_runtime), and the configs in actual use build on them.
  - A config composed only of configs from `_base_` is called a primitive.
  - The config you actually train with can inherit from the base configs in `_base_` or from another config.
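
The inheritance described above boils down to a recursive dictionary merge in which the child config overrides only the keys it sets. The sketch below illustrates that idea in plain Python. It is not mmcv's implementation (in mmcv 1.x the real work happens when `mmcv.Config.fromfile` resolves the `_base_` entries), and `merge_config` and the example values are hypothetical.

```python
def merge_config(base, child):
    """Recursively merge `child` into a copy of `base`; child values win."""
    merged = dict(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Illustrative values only.
base = dict(
    optimizer=dict(type='SGD', lr=0.0025, momentum=0.9),
    runner=dict(type='EpochBasedRunner', max_epochs=12))
child = dict(optimizer=dict(lr=0.005))

merged = merge_config(base, child)
print(merged['optimizer'])
print(merged['runner'])
```

Only `lr` changes in the merged result; every other key is inherited from the base, which is why a derived config can stay short.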

### **4-1. Define a new Model config: oriented_rcnn_r50_fpn_1x_fair1m_le90**






- Now let's build a model based on oriented_rcnn, the base model chosen here. Its structure is as follows.
  - model: oriented_rcnn
  - backbone: r50 (ResNet-50)
  - neck: fpn
  - rpn_head: OrientedRPNHead
  - roi_head: OrientedStandardRoIHead
  - schedule: 1x (12 epochs)
  - dataset: fair1m
  - angle version: le90 (Long Edge Definition (90°))
- For now, the model is built with `fair1m.py`, the dataset config created from the original data.
- To build the model on the split data created above rather than the original data, use that dataset config instead, e.g. `fair1m_split_1024_1_0.py`.
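
The config name encodes most of the components listed above. As a small illustration, the sketch below splits such a name into its parts. `parse_config_name` is a hypothetical helper that assumes the name follows exactly the six-field pattern `{model}_{backbone}_{neck}_{schedule}_{dataset}_{angle}`, where only the model part may itself contain underscores.

```python
def parse_config_name(name):
    """Split a config name into its six components, reading from the right."""
    fields = ('model', 'backbone', 'neck', 'schedule', 'dataset',
              'angle_version')
    # Split from the right so that underscores in the model name survive.
    parts = name.rsplit('_', len(fields) - 1)
    if len(parts) != len(fields):
        raise ValueError(f'unexpected config name: {name!r}')
    return dict(zip(fields, parts))


info = parse_config_name('oriented_rcnn_r50_fpn_1x_fair1m_le90')
for key, value in info.items():
    print(f'{key}: {value}')
```

Names with extra or missing fields would need their own handling, so treat this as a reading aid rather than a general parser.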


In [47]:
wandb_name = 'oriented_rcnn_r50_fpn_1x_fair1m_le90'
data_root = '/content/mmrotate/data/fair1m/'

model_config_txt = f"""

dataset_type = 'FAIR1MDataset'
data_root = '{data_root}'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RResize', img_scale=(1024, 1024)),
    dict(
        type='RRandomFlip',
        flip_ratio=[0.25, 0.25, 0.25],
        direction=['horizontal', 'vertical', 'diagonal'],
        version='le90'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1024, 1024),
        flip=False,
        transforms=[
            dict(type='RResize'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'train/annfiles/',
        img_prefix=data_root + 'train/images/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'val/annfiles/',
        img_prefix=data_root + 'val/images/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'val/annfiles/',
        img_prefix=data_root + 'val/images/',
        pipeline=test_pipeline))


evaluation = dict(interval=1, metric='F1Score')  # FAIR1MDataset.evaluate reports mAP alongside F1Score

optimizer = dict(type='SGD', lr=0.005, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.3333333333333333,
    step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=1)
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook', interval=50),
        # dict(type='WandbLoggerHook',interval=100,
        #     init_kwargs=dict(
        #         project='SIA_Project',
        #         entity = 'jhgsia2',
        #         name = '{wandb_name}' 
        #     )
        #     )
    ])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
opencv_num_threads = 0
mp_start_method = 'fork'



angle_version = 'le90'
model = dict(
    type='OrientedRCNN',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='OrientedRPNHead',
        in_channels=256,
        feat_channels=256,
        version='le90',
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='MidpointOffsetCoder',
            angle_range='le90',
            target_means=[0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0, 0.5, 0.5]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(
            type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
    roi_head=dict(
        type='OrientedStandardRoIHead',
        bbox_roi_extractor=dict(
            type='RotatedSingleRoIExtractor',
            roi_layer=dict(
                type='RoIAlignRotated',
                out_size=7,
                sample_num=2,
                clockwise=True),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='RotatedShared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=3,
            bbox_coder=dict(
                type='DeltaXYWHAOBBoxCoder',
                angle_range='le90',
                norm_factor=None,
                edge_swap=True,
                proj_xy=True,
                target_means=(.0, .0, .0, .0, .0),
                target_stds=(0.1, 0.1, 0.2, 0.2, 0.1)),
            reg_class_agnostic=True,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))),
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                match_low_quality=True,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=0,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_pre=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.8),
            min_bbox_size=0),
        rcnn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                match_low_quality=False,
                iou_calculator=dict(type='RBboxOverlaps2D'),
                ignore_iof_thr=-1),
            sampler=dict(
                type='RRandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False)),
    test_cfg=dict(
        rpn=dict(
            nms_pre=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.8),
            min_bbox_size=0),
        rcnn=dict(
            nms_pre=2000,
            min_bbox_size=0,
            score_thr=0.05,
            nms=dict(iou_thr=0.1),
            max_per_img=2000)))

"""

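# Note: this path is the Oriented R-CNN DOTA config shipped in the mmrotate clone;
# opening it with "w" below overwrites that file with the text above.
model_config_path = '/content/mmrotate/configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_dota_le90.py'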

# Write the config text to disk; the context manager closes the file even if the write fails.
with open(model_config_path, "w") as f:
    f.write(model_config_txt)
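
Before training, it can help to confirm that the config text is valid Python and that the angle convention is the same everywhere it appears. The following is a minimal sketch, not part of MMRotate: `sample_config_txt` is a shortened stand-in for `model_config_txt`, and `load_config_text` and `collect_angle_versions` are hypothetical helper names. It assumes the config uses only plain Python (`dict(...)` calls and literals), as the model config above does.

```python
# Shortened stand-in for the notebook's model_config_txt (assumption for illustration).
sample_config_txt = """
model = dict(
    type='OrientedRCNN',
    rpn_head=dict(
        version='le90',
        bbox_coder=dict(type='MidpointOffsetCoder', angle_range='le90')),
    roi_head=dict(
        bbox_head=dict(
            num_classes=3,
            bbox_coder=dict(type='DeltaXYWHAOBBoxCoder', angle_range='le90'))))
"""


def load_config_text(text):
    """Execute config text as plain Python and return its top-level names."""
    namespace = {}
    exec(compile(text, "<config>", "exec"), namespace)
    namespace.pop("__builtins__", None)
    return namespace


def collect_angle_versions(node):
    """Recursively gather every 'version' / 'angle_range' value in nested dicts."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key in ("version", "angle_range"):
                found.add(value)
            else:
                found |= collect_angle_versions(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            found |= collect_angle_versions(item)
    return found


cfg = load_config_text(sample_config_txt)
versions = collect_angle_versions(cfg["model"])
num_classes = cfg["model"]["roi_head"]["bbox_head"]["num_classes"]

print("angle conventions found:", versions)
print("num_classes:", num_classes)
if len(versions) > 1:
    print("warning: mixed angle conventions in the config")
```

Passing `model_config_txt` instead of the stand-in runs the same check on the full config. A syntax error in the string surfaces here rather than a few minutes into the training job.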

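The RPN in the config above uses `scales=[8]`, `ratios=[0.5, 1.0, 2.0]`, and `strides=[4, 8, 16, 32, 64]`, which gives three anchors per location on each of the five FPN levels. The sketch below works out the resulting anchor sizes in pixels. It assumes the usual MMDetection `AnchorGenerator` convention (base size equal to the stride, ratio defined as height / width); `anchor_sizes` is a hypothetical helper written for this illustration, not a library function.

```python
import math


def anchor_sizes(strides, scales, ratios):
    """Return {stride: [(width, height), ...]} for the axis-aligned base anchors."""
    sizes = {}
    for stride in strides:
        per_level = []
        for ratio in ratios:  # ratio = height / width (assumed convention)
            for scale in scales:
                side = stride * scale  # side length of the square (ratio 1.0) anchor
                width = side / math.sqrt(ratio)
                height = side * math.sqrt(ratio)
                per_level.append((width, height))
        sizes[stride] = per_level
    return sizes


sizes = anchor_sizes(
    strides=[4, 8, 16, 32, 64],
    scales=[8],
    ratios=[0.5, 1.0, 2.0],
)

for stride, anchors in sizes.items():
    formatted = ", ".join(f"{w:.1f}x{h:.1f}" for w, h in anchors)
    print(f"stride {stride:>2}: {formatted}")
```

Under these assumptions the square anchors range from 32 px on the finest level to 512 px on the coarsest. If the objects in the dataset are mostly smaller or larger than that range, `scales` is the parameter to revisit.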
## **5. Model Training**
- We now launch training with the model config created above.

In [48]:
import wandb
wandb.login(relogin=True)

wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc


True

In [None]:
%cd /content/mmrotate
!python ./tools/train.py ./configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_fair1m_le90.py
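
The training call above can also be assembled in Python, which makes it easier to check that the config file exists and to attach optional arguments. This is a sketch under assumptions: `build_train_command` is a hypothetical helper, the work directory name is made up, and the `runner.max_epochs` override assumes the config uses an epoch-based runner with that key. `--work-dir` and `--cfg-options` are options of MMRotate's `tools/train.py`.

```python
import shlex
from pathlib import Path


def build_train_command(config_path, work_dir=None, cfg_options=None):
    """Build the argv list for tools/train.py (hypothetical helper)."""
    cmd = ["python", "./tools/train.py", config_path]
    if work_dir is not None:
        cmd += ["--work-dir", work_dir]
    if cfg_options:
        # --cfg-options takes key=value pairs that override entries in the config.
        cmd.append("--cfg-options")
        cmd += [f"{key}={value}" for key, value in cfg_options.items()]
    return cmd


config = "./configs/oriented_rcnn/oriented_rcnn_r50_fpn_1x_fair1m_le90.py"
cmd = build_train_command(
    config,
    work_dir="./work_dirs/fair1m_demo",        # hypothetical output directory
    cfg_options={"runner.max_epochs": 12},     # assumed key, see the note above
)

# Report a missing config before the training process starts.
if not Path(config).exists():
    print(f"config not found (run from /content/mmrotate?): {config}")

print(shlex.join(cmd))
```

In the notebook, the printed line can be pasted after `!`, or the list can be passed to `subprocess.run(cmd, check=True)` from `/content/mmrotate`.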