# Introduction

Thanks for a great competition!! Our team split its two final submissions between two models: (1) FRCNN-ResNest and (2) EffDet. FRCNN-ResNest reached 0.7702 on the Public LB by the end but did not perform well on the Private LB, while EffDet saved us and earned a relatively good Private LB position :D

This notebook covers FRCNN-ResNest and shows how to improve it by combining the pseudo-labeling technique with ensembling, in the following steps:

0. Train models on K folds (see the training notebook in [Colab here](https://colab.research.google.com/drive/1ckIi9A8npT3tazlixfQiHdhWWhLh1lQI?usp=sharing))
1. Ensemble all K models to predict the pseudo labels
2. Retrain each of the K models with the new pseudo labels, together with its corresponding training fold
3. Combine the predictions of the K models again once all of them have finished training
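
As a rough outline, steps 1 and 2 can be written as one small function. This is only a schematic sketch: `predict`, `fuse`, and `retrain` are hypothetical callables standing in for the notebook's inference, box fusion, and training code, and none of these names come from the repo.

```python
from typing import Callable, Dict, List, Sequence

Boxes = List[tuple]  # hypothetical container for the boxes of one image


def pseudo_label_round(
    models: Sequence[object],
    train_folds: Sequence[List[str]],
    test_ids: List[str],
    predict: Callable[[object, str], Boxes],
    fuse: Callable[[List[Boxes]], Boxes],
    retrain: Callable[[object, List[str], Dict[str, Boxes]], object],
) -> List[object]:
    """One round: ensemble -> pseudo labels -> per-fold retraining."""
    # Step 1: fuse the K models' predictions on every test image.
    pseudo = {img: fuse([predict(m, img) for m in models]) for img in test_ids}
    # Images without any fused box are skipped, as in the dataframe cell later on.
    pseudo = {img: boxes for img, boxes in pseudo.items() if boxes}
    # Step 2: retrain each model on its own training fold plus the pseudo-labeled images.
    return [
        retrain(model, list(fold) + list(pseudo), pseudo)
        for model, fold in zip(models, train_folds)
    ]
```

Step 3 then reuses the same fusion on the retrained models to produce the final predictions. Note that the validation fold of each model is never extended with pseudo labels.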

## Proper Credits
- The optimized version of this notebook can reach 0.770+ thanks to the dedicated efforts of my teammates @nitindatta and @datahobbit, while @yashchoudhary completed the K-fold training.
- @kyoshioka47 (or arutema47) provided the "effective" EffDet responsible for our final position, along with many insightful discussions, which you can see in his own thread / kernel to be published soon :D
- The original FRCNN-ResNest work is due to the amazing @whurobin (https://github.com/wuxinwang1997/wheatdetection) and to @shonenkov, who provided a solid starter for everyone in this competition (no links needed)

## What went wrong on the Private LB for this model?
Looking at the boxes in the 10 sample test images (see Version 12 for a real pseudo-labeling run), I believe FRCNN-ResNest is too good at detecting every blurry and small wheat head near the edges, which do not count as valid labels according to the original paper. So the model produces many false positives on the leaderboard, even though in my opinion the detections themselves are fine.
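
One way to check this impression is to measure how many predicted boxes are both small and close to the image border. The sketch below is only an illustration: `near_edge_small` and `edge_fraction` are hypothetical helpers, the boxes are assumed to be in `xyxy` pixel format on a 1024x1024 image, and the `margin` and `min_area` defaults are assumptions (400 is the lower area bound observed in the training data, as noted in the parameter cell).

```python
def near_edge_small(box, img_size=1024, margin=10, min_area=400):
    """True if an xyxy box lies within `margin` pixels of the border and is smaller than `min_area`."""
    x1, y1, x2, y2 = box
    limit = img_size - 1 - margin
    touches_border = x1 <= margin or y1 <= margin or x2 >= limit or y2 >= limit
    area = (x2 - x1) * (y2 - y1)
    return touches_border and area < min_area


def edge_fraction(boxes, **kwargs):
    """Fraction of boxes that are small and near the border (0.0 for an empty list)."""
    if not boxes:
        return 0.0
    return sum(near_edge_small(b, **kwargs) for b in boxes) / len(boxes)
```

A high fraction on the test predictions compared with the training labels would support the explanation above, but it would not prove it.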

In [None]:
'''
Logs.
- v6[7638] original Ensemble --> K-Models Pseudo --> final ensemble
- v7[7636] add SatTTA again , with stricter post-wbf filter
- v8[QUICK] scale ACCUM before loss.backward() / fix train-val loaders for KFolds
- v9[QUICK] Disable SatTTA / identify bug when each checkpoints have different last epochs
- v10[7652] Run on the real Fold2!! fix bug when each checkpoints have different last epochs
- v12[7662] 4x4
'''

# Inference Kernel on Test Data

# 1. Prepare ResNest and Best Weights Data
Jung has already packaged the ResNest repo as a dataset. We copy it to Kaggle's working directory so that we can modify it (`/kaggle/input/` is read-only).

In [None]:
from object_detection_utils import show_Nimages

In [None]:
!cp -rf /kaggle/input/wheatdetection-resnest-develop-branch-july9/wheatdetection/wheatdetection /kaggle/working/
!ls /kaggle/working

In [None]:
CODE_PATH = '/kaggle/working/wheatdetection/'
!ls {CODE_PATH}
%cd {CODE_PATH}

Next, we define the paths to our best checkpoints. Please modify the following cells to point to your own dataset of trained weights.

In [None]:
!ls /kaggle/input/5fold-68-clear/
!ls /kaggle/input/best-models-frcnn/

In [None]:
import torch
import random
import numpy as np
import os

def seed_everything(seed=42):
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.deterministic = True


# 2. Inference on Test (to get Pseudo Labels)

Define hyperparameters here. If we don't use NMS, we simply filter out boxes using `SCORE_THRESHOLD`; if we use NMS, we combine `SCORE_THRESHOLD` and `NMS_IOU_THRESHOLD` to filter boxes according to the NMS logic.

For the pseudo-labeling kernel, several extra parameters are added below, with comments.
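
To make the two filtering modes concrete, here is a minimal pure-Python sketch. It is not the notebook's implementation (which delegates NMS to the `ensemble_boxes` package); `iou` and `filter_boxes` are hypothetical helpers implementing plain greedy NMS, and the default thresholds simply mirror `SCORE_THRESHOLD` and `NMS_IOU_THRESHOLD` from the cell below.

```python
def iou(a, b):
    """Intersection over union of two xyxy boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(boxes, scores, score_thr=0.65, iou_thr=0.5, use_nms=False):
    """Return the indices of the boxes that survive filtering."""
    # Mode 1: keep every box whose confidence exceeds the score threshold.
    keep = [i for i, s in enumerate(scores) if s > score_thr]
    if not use_nms:
        return keep
    # Mode 2: greedy NMS, visiting boxes from highest to lowest score and
    # dropping any box that overlaps an already selected one too much.
    keep.sort(key=lambda i: scores[i], reverse=True)
    selected = []
    for i in keep:
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in selected):
            selected.append(i)
    return selected
```

With `use_nms=False` overlapping boxes are all kept, which is acceptable here because the predictions are fused by WBF afterwards.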

In [None]:

# For simple implementations and best performance, 
# BEST_PATHS should be sorted from Fold0, Fold1, Fold2, ..., FoldK respectively [use this structure in Kfolds split]
BEST_PATHS = ["/kaggle/input/best-models-frcnn/F0_68_nofinetune_clear_best.bin", 
              "/kaggle/input/best-models-frcnn/F1_68_nofinetune_clear_best.bin",
              "/kaggle/input/5fold-68-clear/F2_68_nofinetune_clear_best.bin",
              "/kaggle/input/5fold-68-clear/F3_68_nofinetune_clear_best.bin"]

# print each checkpoint's metadata; map_location='cpu' keeps this inspection off the GPU
for BEST_PATH in BEST_PATHS:
    ckp = torch.load(BEST_PATH, map_location='cpu')
    print(ckp.keys())
    print(ckp['epoch'], ckp['best_valid_loss'])

In [None]:
SEED=42
seed_everything(SEED)

APEX = False # does NOT work at the moment
ACCUM = 2 # gradient accumulation steps, 1 = no accumulation

# VALID_FOLD = 0 # Not specific on use-case of KFolds ensemble -- use corresponding valid fold for each fold model
SWAP_VALID_AND_TRAIN = False
N_VIZ = 10 # How many pictures you want to visualize (0-10)

'''POST-PROCESS PARAMETERS'''
USE_NMS = False
SCORE_THRESHOLD = 0.65
NMS_IOU_THRESHOLD = 0.5
IMG_SIZE = 1024
WBF_IOU, WBF_SKIP_BOX = 0.44, 0.38
PP_SHRINK = [-1,0] # [ Shrink for pseudo labeling, Shrink for the final prediction ]


WBF_SCORE_THRESHOLD = 0.265 # 0=UNUSED=BAD_CV , 0.3=BEST_CV, 0.25=BEST_LB

USE_BOUNDS_FILTER = True 
LOWER_BOUND, UPPER_BOUND = 70, 175000 # observed bounds from Train data (400, 150000)

'''PSEUDO-LABELING PARAMETERS'''
PSEUDO_EPOCHS = 4 # number of pseudo-labeling epochs when the kernel is submitted
PSEUDO_EPOCHS_COMMIT = 0 # number of epochs when you only commit and do not submit (set to 0 or 1 for fast experiments)

HSV_H = 0.03
HSV_S = 0.68
HSV_V = 0.36
BC_B = 0.1
BC_C  = 0.1
BASE_LR = 0.00135
BIAS_LR_FACTOR = 0.5 # DEFAULT is 1, I don't really know the true effect of this parameter
MOMENTUM = 0.75
WARMUP_EPOCHS = 200

In [None]:
if APEX:
    ! pip install --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" /kaggle/input/nvidiaapex/.


In [None]:
if APEX:
    from apex import amp

In the repo, the `WheatDetector` class always needs an internet connection to download the `pretrained` weights, so below we rewrite the file with `pretrained=False`.

In [None]:
%%writefile ./modeling/wheat_detector.py

import torch
from torch import nn
from layers import FasterRCNN
from layers.backbone_utils import resnest_fpn_backbone

class WheatDetector(nn.Module):
    def __init__(self, cfg, **kwargs):
        super(WheatDetector, self).__init__()
        self.backbone = resnest_fpn_backbone(pretrained=False) #change here
        self.base = FasterRCNN(self.backbone, num_classes=cfg.MODEL.NUM_CLASSES, **kwargs)

    def forward(self, images, targets=None):
        return self.base(images, targets)

The next two cells, which contain the TTA and Tester code, are hidden.
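
The TTA classes predict on flipped or rotated copies of each image and then map the predicted boxes back to the original frame. Below is a minimal sketch of that mapping for a flip along the x axis, assuming `xyxy` boxes and a square image of side `size`. `flip_x_boxes` is a hypothetical helper, written to mirror the `image_size - boxes[:, [2, 0]]` index swap used by the flip classes.

```python
def flip_x_boxes(boxes, size=1024):
    """Map xyxy boxes through a flip along the x axis.

    x1 and x2 swap roles so that x1 <= x2 still holds after the flip.
    """
    return [(size - x2, y1, size - x1, y2) for x1, y1, x2, y2 in boxes]


original = [(100, 200, 300, 400)]
flipped = flip_x_boxes(original)
restored = flip_x_boxes(flipped)  # a flip is its own inverse
```

Because a flip is its own inverse, the same function serves for both augmenting and de-augmenting; a 90-degree rotation, by contrast, needs a distinct inverse mapping.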

In [None]:
import matplotlib.pyplot as plt
import cv2

import warnings

from tqdm import tqdm
import pandas as pd
from itertools import product
import sys
sys.path.insert(0, "./external/wbf")
# sys.path.insert(0, "/kaggle/input/weighted-boxes-fusion-104")
import ensemble_boxes
from torchvision.transforms import functional_tensor as TF
from torchvision import transforms
warnings.filterwarnings("ignore")


class BaseWheatTTA:
    """ author: @shonenkov """
    image_size = IMG_SIZE

    def augment(self, image):
        raise NotImplementedError

    def batch_augment(self, images):
        raise NotImplementedError

    def deaugment_boxes(self, boxes):
        raise NotImplementedError

class TTAReduceSaturation(BaseWheatTTA):
    """ author: @ratthachat """

    def augment(self, image):
        image = TF.adjust_saturation(image, 0.9)
        return image

    def batch_augment(self, images):
        images_new = []
        for image in images:
            images_new.append(self.augment(image))
        return torch.stack(images_new)

    def deaugment_boxes(self, boxes):
        return boxes

class TTAHorizontalFlip(BaseWheatTTA):
    """ author: @shonenkov """

    def augment(self, image):
        return image.flip(1)

    def batch_augment(self, images):
        return images.flip(2)

    def deaugment_boxes(self, boxes):
        boxes[:, [1, 3]] = self.image_size - boxes[:, [3, 1]]
        return boxes


class TTAVerticalFlip(BaseWheatTTA):
    """ author: @shonenkov """

    def augment(self, image):
        return image.flip(2)

    def batch_augment(self, images):
        return images.flip(3)

    def deaugment_boxes(self, boxes):
        boxes[:, [0, 2]] = self.image_size - boxes[:, [2, 0]]
        return boxes


class TTARotate90(BaseWheatTTA):
    """ author: @shonenkov """

    def augment(self, image):
        return torch.rot90(image, 1, (1, 2))

    def batch_augment(self, images):
        return torch.rot90(images, 1, (2, 3))

    def deaugment_boxes(self, boxes):
        res_boxes = boxes.copy()
        res_boxes[:, [0, 2]] = self.image_size - boxes[:, [3, 1]]
        res_boxes[:, [1, 3]] = boxes[:, [0, 2]]
        return res_boxes


class TTACompose(BaseWheatTTA):
    """ author: @shonenkov """

    def __init__(self, transforms):
        self.transforms = transforms

    def augment(self, image):
        for transform in self.transforms:
            image = transform.augment(image)
        return image

    def batch_augment(self, images):
        for transform in self.transforms:
            images = transform.batch_augment(images)
        return images

    def prepare_boxes(self, boxes):
        result_boxes = boxes.copy()
        result_boxes[:, 0] = np.min(boxes[:, [0, 2]], axis=1)
        result_boxes[:, 2] = np.max(boxes[:, [0, 2]], axis=1)
        result_boxes[:, 1] = np.min(boxes[:, [1, 3]], axis=1)
        result_boxes[:, 3] = np.max(boxes[:, [1, 3]], axis=1)
        return result_boxes

    def deaugment_boxes(self, boxes):
        for transform in self.transforms[::-1]:
            boxes = transform.deaugment_boxes(boxes)
        return self.prepare_boxes(boxes)

In [None]:
# Tester is modified for multi-models
class Tester:
    def __init__(self, models, device, cfg, test_loader, n_viz=N_VIZ):
        self.config = cfg
        self.test_loader = test_loader

        self.base_dir = f'{self.config.OUTPUT_DIR}'
        if not os.path.exists(self.base_dir):
            os.makedirs(self.base_dir)

        self.log_path = f'{self.base_dir}/log.txt'
        self.score_threshold = SCORE_THRESHOLD
        self.iou_threshold = NMS_IOU_THRESHOLD
        self.use_nms = USE_NMS
        self.n_viz = n_viz
        
        self.device = device
        self.models = []
        for i,model in enumerate(models):
            self.models.append(model)
            self.models[-1].eval()
            self.models[-1].to(self.device)

        self.log(f'Tester prepared. Device is {self.device}')

    def test(self, pp_shrink=0):
        results,all_predictions = self.infer(pp_shrink)
        self.save_predictions(results)
        return all_predictions

    def process_det(self, index, outputs):
        boxes = outputs[index]['boxes'].data.cpu().numpy()
        scores = outputs[index]['scores'].data.cpu().numpy()
        boxes = (boxes).clip(min=0, max=1023).astype(int)
        indexes = np.where(scores > self.score_threshold)
        boxes = boxes[indexes]
        scores = scores[indexes]
        return boxes, scores

    def make_tta_predictions(self, tta_transforms, images, model_id):
        with torch.no_grad():
            images = torch.stack(images).float().cuda()
            predictions = []
            for tta_transform in tta_transforms:
                result = []
                outputs = self.models[model_id](tta_transform.batch_augment(images.clone()))
                
                
                for i, image in enumerate(images):
                    boxes = outputs[i]['boxes'].data.cpu().numpy()
                    scores = outputs[i]['scores'].data.cpu().numpy()
                    indexes = np.where(scores > self.score_threshold)[0]
                    boxes = tta_transform.deaugment_boxes(boxes.copy())
                    
                    if self.use_nms: 
                        labels = np.ones(scores.shape[0]).astype(int).tolist()
                        boxes, scores, labels = ensemble_boxes.ensemble_boxes_nms.nms_method([boxes], [scores], [labels], method=3,
                                                                                        weights=None, iou_thr=self.iou_threshold,
                                                                                        thresh=self.score_threshold)
                    else: # not use NMS, just filter by confidence score
                        boxes = boxes[indexes]
                        scores = scores[indexes]
                    result.append({
                        'boxes': boxes,
                        'scores': scores,
                    })
                predictions.append(result)
        return predictions # [TTA_NUM , BATCH_IMG_NUM, DICT[BOXES, SCORES]]
    
    '''
    run_wbf
    - predictions is in [TTA_NUM , BATCH_IMG_NUM, DICT[BOXES, SCORES]] format
    - BOXES are xyxy format 
    - BOXES are unnormalized -- and will get WRONG result for non-square non-1024 images!! (hidden BUG FOUND)
    '''
    def run_wbf(self, predictions, image_index, image_shape=(IMG_SIZE,IMG_SIZE), iou_thr=WBF_IOU, skip_box_thr=WBF_SKIP_BOX, weights=None):
        # image_shape is (height, width): x coordinates scale with width, y coordinates with height
        height, width = image_shape[0], image_shape[1]
        scale = np.array([width - 1, height - 1, width - 1, height - 1], dtype=np.float32)

        # normalize the boxes of each (model, TTA) prediction to [0, 1]; the stored arrays are left untouched
        boxes = [(np.asarray(prediction[image_index]['boxes'], dtype=np.float32).reshape(-1, 4) / scale).tolist()
                 for prediction in predictions]
        scores = [prediction[image_index]['scores'].tolist() for prediction in predictions]
        labels = [np.ones(prediction[image_index]['scores'].shape[0]).astype(int).tolist() for prediction in
                  predictions]
        boxes, scores, labels = ensemble_boxes.ensemble_boxes_wbf.weighted_boxes_fusion(boxes, scores, labels,
                                                                                        weights=None, iou_thr=iou_thr,
                                                                                        skip_box_thr=skip_box_thr)
        # map the fused boxes back to pixel coordinates
        boxes = np.asarray(boxes).reshape(-1, 4) * scale
        return boxes, scores, labels

    def format_prediction_string(self, boxes, scores):
        pred_strings = []
        for j in zip(scores, boxes):
            pred_strings.append("{0:.4f} {1} {2} {3} {4}".format(j[0], j[1][0], j[1][1], j[1][2], j[1][3]))
        return " ".join(pred_strings)

    def infer(self, pp_shrink=0):
        for i in range(len(self.models)):
            self.models[i].eval()
        torch.cuda.empty_cache()

        tta_transforms = []
        for tta_combination in product([TTAHorizontalFlip(), None],
                                       [TTAVerticalFlip(), None],
#                                        [TTAReduceSaturation(), None],
                                       [TTARotate90(), None]):
            tta_transforms.append(TTACompose([tta_transform for tta_transform in tta_combination if tta_transform]))
        test_loader = tqdm(self.test_loader, total=len(self.test_loader), desc="Testing")
        results = []
        all_predictions = []
        boxes10 = []
        
        for images, image_ids in test_loader:
            predictions=[]
            for ii in range(len(self.models)):
                predictions.append(self.make_tta_predictions(tta_transforms, images,model_id=ii)) # [TTA_NUM , BATCH_IMG_NUM, DICT[BOXES, SCORES]]
            
            predictions = np.vstack(predictions)
            
            for i, image in enumerate(images):
#                 print('shape : ',image.shape)
                boxes, scores, labels = self.run_wbf(predictions, image_index=i,image_shape=[image.shape[1],image.shape[2]])

                #round and clip seems to be better before rather than after pp_shrink
                boxes = boxes.round().astype(np.int32).clip(min=0, max=1023)
                
                image_id = image_ids[i]
                
                if len(boxes10) < self.n_viz:
                    print('writing ... ',i,image_id)
                    sample = image.permute(1,2,0).cpu().numpy()

                    fig, ax = plt.subplots(1, 1, figsize=(16, 8))
                    boxes10.append((sample, boxes))
                    
                    for box, score in zip(boxes,scores):
                        cv2.rectangle(sample, (box[0], box[1]), (box[2], box[3]), (1, 0, 0), 5)
                        cv2.putText(sample, '%.2f'%(score), (box[0], box[1]), cv2.FONT_HERSHEY_SIMPLEX ,  
                   1, (255,255,255), 3, cv2.LINE_AA)
                    
                    ax.set_axis_off()
                    ax.imshow(sample);
                    plt.show()
                                
                #post-processing box size adjusting (host advised boxes in test are tight to image)
                boxes[:, 0] = boxes[:, 0] + pp_shrink
                boxes[:, 1] = boxes[:, 1] + pp_shrink
                boxes[:, 2] = boxes[:, 2] - pp_shrink
                boxes[:, 3] = boxes[:, 3] - pp_shrink
                
                img_shape = image.cpu().numpy().shape  # (channels, height, width)
                boxes[:, [0, 2]] = boxes[:, [0, 2]].clip(0, img_shape[2] - 1)  # x is bounded by the width
                boxes[:, [1, 3]] = boxes[:, [1, 3]].clip(0, img_shape[1] - 1)  # y is bounded by the height
                
                boxes[:, 2] = boxes[:, 2] - boxes[:, 0]
                boxes[:, 3] = boxes[:, 3] - boxes[:, 1]
                
                areas = boxes[:, 2]*boxes[:, 3]
                
                if USE_BOUNDS_FILTER:
                    # boxes and scores are filtered together so that each score stays aligned with its box
                    print("Filter by area bounds")
                    print("Length of unfiltered boxes: " + str(len(boxes)))
                    print("Length of unfiltered scores: " + str(len(scores)))
                    filtered_boxes = []
                    filtered_scores = []
                    for k in range(len(boxes)):  # k, not i, to avoid shadowing the image index
                        if LOWER_BOUND < areas[k] < UPPER_BOUND and scores[k] >= WBF_SCORE_THRESHOLD:
                            filtered_boxes.append(boxes[k])
                            filtered_scores.append(scores[k])
                    print("Length of filtered boxes: " + str(len(filtered_boxes)))
                    print("Length of filtered scores: " + str(len(filtered_scores)))
                    boxes = filtered_boxes
                    scores = filtered_scores
                else:
                    print("area filtering not applied")
                        
                
                result = {
                    'image_id': image_id,
                    'PredictionString': self.format_prediction_string(boxes, scores),
                }
                
                all_prediction = {
                    'image_id': image_id,
                    'pred_boxes': boxes,
                    'scores': scores,
                    'img_shape' : img_shape
                }
                
                results.append(result)
                all_predictions.append(all_prediction)
        return results, all_predictions

    def save_predictions(self, results):
        test_df = pd.DataFrame(results, columns=['image_id', 'PredictionString'])
        test_df.to_csv(f'{self.config.OUTPUT_DIR}/submission.csv', index=False)

    def load(self, paths):
        print(paths)
        for i in range(len(paths)):
            checkpoint = torch.load(paths[i])
            self.models[i].load_state_dict(checkpoint['model_state_dict'])
            print('finish loading ',paths[i])

    def log(self, message):
        if self.config.VERBOSE:
            print(message)
        with open(self.log_path, 'a+') as logger:
            logger.write(f'{message}\n')

In [None]:
from config import cfg
cfg['OUTPUT_DIR'] = "/kaggle/working/"
cfg['DATASETS']['ROOT_DIR'] = "/kaggle/input/global-wheat-detection"
cfg['TEST']['IMS_PER_BATCH'] = 1
cfg['TEST']['WEIGHT'] = BEST_PATHS
# cfg.DATASETS.VALID_FOLD = VALID_FOLD
cfg

In [None]:
import os
import sys

from os import mkdir
sys.path.append('.')
from data import make_test_data_loader
from modeling import build_model
from utils.logger import setup_logger

# start here!!
if True:
#     cfg.freeze()

    output_dir = cfg.OUTPUT_DIR
    if output_dir and not os.path.exists(output_dir):
        print('creating ',cfg.OUTPUT_DIR)
        mkdir(output_dir)
    
    models=[]
    for i, best in enumerate(BEST_PATHS):
        models.append(build_model(cfg)) # build_model uses cfg only on cfg.num_classes

    test_loader = make_test_data_loader(cfg)

# 3. Pseudo labeling and Training

## 3.1 Create the new marking for Train+Test Pseudo Data

In [None]:
device = cfg.MODEL.DEVICE
tester = Tester(models=models, device=device, cfg=cfg, test_loader=test_loader, n_viz=N_VIZ)
tester.load(cfg['TEST']['WEIGHT'])
    
all_preds = tester.test(pp_shrink=PP_SHRINK[0])
!ls /kaggle/working
!rm -f /kaggle/working/submission.csv
!ls -lh /kaggle/working

### create new directory and test dataframe

In [None]:
%%time
NEW_INPUT_PATH = '/kaggle/working/imgs/'
# -p creates the parent directory too and does not fail when the cell is re-run
!mkdir -p {NEW_INPUT_PATH}train
!cp -rf /kaggle/input/global-wheat-detection/train/* {NEW_INPUT_PATH}train
!cp -rf /kaggle/input/global-wheat-detection/test/* {NEW_INPUT_PATH}train

!ls /kaggle/input/global-wheat-detection/train/ | wc
!ls {NEW_INPUT_PATH}train | wc

In [None]:
df_dict = {}
df_dict['x'],df_dict['y'],df_dict['w'],df_dict['h'],df_dict['image_id'] = [],[],[],[],[]
for i in range(len(all_preds)):
    if len(all_preds[i]['pred_boxes']) == 0: # handle empty-box case
        print('pass')
        continue
    
#     print(all_preds[i]['img_shape'])
    
    if all_preds[i]['img_shape'][1] != 1024 or all_preds[i]['img_shape'][2] != 1024: # handle non-1024 cases
        print('pass')
        continue
    
    for box in all_preds[i]['pred_boxes']:
        df_dict['image_id'].append(all_preds[i]['image_id'])
        df_dict['x'].append(box[0])
        df_dict['y'].append(box[1])
        df_dict['w'].append(box[2])
        df_dict['h'].append(box[3])
        
df = pd.DataFrame(df_dict)
df['source'] = 'test'
df['width'] = 1024
df['height'] = 1024
df['area'] = df['w']*df['h']

print(df.shape)
df.head()

In [None]:
from data.build import split_dataset

marking_list, train_ids_list, valid_ids_list = [], [], []

for ii in range(len(BEST_PATHS)):
    print('\n** weights -- #%d' % ii)
    cfg.DATASETS.VALID_FOLD = ii # THIS is WHY we recommend SORTED-by-fold weights in BEST_PATHS
    marking, train_ids0, valid_ids0 = split_dataset(cfg)

    if SWAP_VALID_AND_TRAIN:
        print('swap!!')
        valid_ids, train_ids =train_ids0, valid_ids0
    else:
        train_ids, valid_ids =train_ids0, valid_ids0

    print(marking.head(3))  # a bare expression is not displayed inside a loop
    print(ii, marking.shape, df.shape)
    marking = pd.concat([marking, df]).reset_index(drop=True)
    print(ii, marking.shape, df.shape)
    print(marking.tail(3))

    print('#train before adding test',len(train_ids))
    train_ids = np.concatenate([train_ids, df.image_id.unique()])
    print('#train after  adding test',len(train_ids))
    marking_list.append(marking)
    train_ids_list.append(train_ids)
    valid_ids_list.append(valid_ids)

# marking is always the same, so marking_list is in fact redundant!
# what really matters is train_ids_list / valid_ids_list
print(len(marking_list), len(train_ids_list), len(valid_ids_list))

In [None]:
for ii in range(len(BEST_PATHS)):
    print(valid_ids_list[ii][:5], valid_ids_list[ii].shape)
print(675*5)

In [None]:
'''
NOTE: to my understanding, this train.csv is only used by split_dataset,
but in the cells above we already split the original train.csv into K folds.
Therefore, this train.csv is in fact not used.
I confirmed this by searching for train.csv in the original repo:
https://github.com/wuxinwang1997/wheatdetection/search?q=train.csv&unscoped_q=train.csv
'''

# marking.to_csv(NEW_INPUT_PATH+'train.csv',index=False)
# pd.read_csv(NEW_INPUT_PATH+'train.csv').shape

## 3.2 Training with Pseudo Labels on Test
In the first hidden cell, I had to hack several functions:

- redefine the data loader for the new marking / new ids
- disable `self.best_score_threshold` and `self.best_final_score`
- remove the Colab dependency
- print a message whenever the best weights are updated

We also hide the next two cells (DataLoader and Fitter).
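
One detail of the Fitter worth spelling out is its linear warm-up: while `epoch < WARMUP_EPOCHS`, the learning rate is `min(1, (epoch + 1) / WARMUP_EPOCHS) * BASE_LR`. The sketch below just evaluates that formula. It assumes `cfg.SOLVER.WARMUP_EPOCHS` and `cfg.SOLVER.BASE_LR` are set from `WARMUP_EPOCHS = 200` and `BASE_LR = 0.00135`, and that pseudo-label training runs for `PSEUDO_EPOCHS = 4` epochs; the cell that wires these values into `cfg` is not shown here, and `warmup_lr` is a hypothetical helper.

```python
def warmup_lr(epoch, base_lr=0.00135, warmup_epochs=200):
    """Learning rate of the linear warm-up branch for a zero-based epoch index."""
    return min(1.0, (epoch + 1) / warmup_epochs) * base_lr


# Learning rates for the four pseudo-labeling epochs under the assumptions above.
lrs = [warmup_lr(epoch) for epoch in range(4)]
for epoch, lr in enumerate(lrs):
    print(f"epoch {epoch}: lr = {lr:.3e}")
```

Under these assumptions the whole pseudo-labeling run stays inside the warm-up phase, so the models are fine-tuned with a learning rate far below `BASE_LR`, which suits retraining from already converged checkpoints.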

In [None]:
%%writefile ./data/transforms/build.py

import albumentations as A
from albumentations.pytorch.transforms import ToTensorV2
from .transforms import RandomErasing

def get_train_transforms(cfg):
    return A.Compose(
        [
            A.Resize(1024, 1024, p=1.0),
            A.RandomSizedCrop(min_max_height=cfg.INPUT.RSC_MIN_MAX_HEIGHT, height=cfg.INPUT.RSC_HEIGHT,
                              width=cfg.INPUT.RSC_WIDTH, p=cfg.INPUT.RSC_PROB),
            A.OneOf([
                A.HueSaturationValue(hue_shift_limit=cfg.INPUT.HSV_H, sat_shift_limit=cfg.INPUT.HSV_S,
                                     val_shift_limit=cfg.INPUT.HSV_V, p=cfg.INPUT.HSV_PROB),
                A.RandomBrightnessContrast(brightness_limit=cfg.INPUT.BC_B,
                                           contrast_limit=cfg.INPUT.BC_C, p=cfg.INPUT.BC_PROB),
            ],p=cfg.INPUT.COLOR_PROB),
            A.ToGray(p=cfg.INPUT.TOFGRAY_PROB),
            A.HorizontalFlip(p=cfg.INPUT.HFLIP_PROB),
            A.VerticalFlip(p=cfg.INPUT.VFLIP_PROB),
            # A.Resize(height=512, width=512, p=1),
            A.Cutout(num_holes=cfg.INPUT.COTOUT_NUM_HOLES, max_h_size=cfg.INPUT.COTOUT_MAX_H_SIZE,
                     max_w_size=cfg.INPUT.COTOUT_MAX_W_SIZE, fill_value=cfg.INPUT.COTOUT_FILL_VALUE, p=cfg.INPUT.COTOUT_PROB),
            ToTensorV2(p=1.0),
        ],
        p=1.0,
        bbox_params=A.BboxParams(
            format='pascal_voc',
            min_area=0,
            min_visibility=0,
            label_fields=['labels']
        )
    )

def get_valid_transforms(cfg):
    return A.Compose(
        [
            # A.Resize(height=512, width=512, p=1.0),
            ToTensorV2(p=1.0),
        ],
        p=1.0,
        bbox_params=A.BboxParams(
            format='pascal_voc',
            min_area=0,
            min_visibility=0,
            label_fields=['labels']
        )
    )

def get_test_transform():
    return A.Compose([
        # A.Resize(512, 512),
        ToTensorV2(p=1.0)
    ])

def build_transforms(cfg, is_train=True):
    if is_train:
        transform = get_train_transforms(cfg)
    else:
        transform = get_valid_transforms(cfg)

    return transform


In [None]:
!ls -lh ./data/transforms/
!cat ./data/transforms/build.py

In [None]:

import time
import warnings
from datetime import datetime
from engine.average import AverageMeter
from evaluate.inference import inference
from evaluate.evaluate import evaluate
import data
from data.transforms import build_transforms
from solver.build import make_optimizer
from solver.lr_scheduler import make_scheduler
import logging
warnings.filterwarnings("ignore")
from data.collate_batch import  collate_batch
from data.datasets.train_wheat import train_wheat

def build_dataset(cfg, marking,train_ids, valid_ids):
#     marking, train_ids, valid_ids = split_dataset(cfg)
    train_dataset = train_wheat(
        root = cfg.DATASETS.ROOT_DIR,
        image_ids=train_ids,
        marking=marking,
        transforms=build_transforms(cfg, is_train=True),
        test=False,
    )

    validation_dataset = train_wheat(
        root=cfg.DATASETS.ROOT_DIR,
        image_ids=valid_ids,
        marking=marking,
        transforms=build_transforms(cfg, is_train=False),
        test=True,
    )

    return train_dataset, validation_dataset

def make_data_loader(cfg, marking,train_ids, valid_ids, is_train=True):
    if is_train:
        batch_size = cfg.SOLVER.IMS_PER_BATCH
    else:
        batch_size = cfg.TEST.IMS_PER_BATCH

    train_dataset, validation_dataset = build_dataset(cfg, marking,train_ids, valid_ids)

    num_workers = cfg.DATALOADER.NUM_WORKERS
    train_loader = torch.utils.data.DataLoader(
        train_dataset,
        batch_size=batch_size,
        sampler=torch.utils.data.sampler.RandomSampler(train_dataset),
        pin_memory=False,
        drop_last=True,
        num_workers=num_workers,
        collate_fn=collate_batch,
    )
    val_loader = torch.utils.data.DataLoader(
        validation_dataset,
        batch_size=batch_size,
        num_workers=num_workers,
        shuffle=False,
        sampler=torch.utils.data.sampler.SequentialSampler(validation_dataset),
        pin_memory=False,
        collate_fn=collate_batch,
    )

    return train_loader, val_loader


In [None]:
class Fitter:
    def __init__(self, model, device, cfg, train_loader, val_loader, logger, mixed_precision=APEX, accum=ACCUM):
        self.config = cfg
        self.epoch = 0
        self.train_loader = train_loader
        self.val_loader = val_loader

        self.base_dir = f'{self.config.OUTPUT_DIR}'
        if not os.path.exists(self.base_dir):
            os.makedirs(self.base_dir)

        self.logger = logger
        self.best_final_score = 0.0
        self.best_score_threshold = SCORE_THRESHOLD
        
        self.mixed_precision = mixed_precision
        self.accumulate = accum
        
        self.model = model
        self.device = device
        self.model.to(self.device)
        
        self.optimizer = make_optimizer(cfg, model)
        
        if self.mixed_precision:
            self.model, self.optimizer = amp.initialize(self.model, self.optimizer, opt_level="O1", verbosity=0)
        
        self.scheduler = make_scheduler(cfg, self.optimizer, train_loader)

        self.logger.info(f'Fitter prepared. Device is {self.device}')
        self.all_predictions = []
        self.early_stop_epochs = 0
        self.early_stop_patience = self.config.SOLVER.EARLY_STOP_PATIENCE
        self.do_scheduler = True
        self.logger.info("Start training")

    def fit(self):
        for epoch in range(self.epoch, self.config.SOLVER.MAX_EPOCHS ):
            if epoch < self.config.SOLVER.WARMUP_EPOCHS:
                lr_scale = min(1., float(epoch + 1) / float(self.config.SOLVER.WARMUP_EPOCHS))
                for pg in self.optimizer.param_groups:
                    pg['lr'] = lr_scale * self.config.SOLVER.BASE_LR
                self.do_scheduler = False
            else:
                self.do_scheduler = True
            if self.config.VERBOSE:
                lr = self.optimizer.param_groups[0]['lr']
                timestamp = datetime.utcnow().isoformat()
                self.logger.info(f'\n{timestamp}\nLR: {lr}')

            t = time.time()
            summary_loss = self.train_one_epoch()

            self.logger.info(f'[RESULT]: Train. Epoch: {self.epoch}, summary_loss: {summary_loss.avg:.5f}, time: {(time.time() - t):.5f}')
            self.save(f'{self.base_dir}/last-checkpoint.bin')

            t = time.time()
            best_score_threshold, best_final_score = self.validation()

            self.logger.info( f'[RESULT]: Val. Epoch: {self.epoch}, Best Score Threshold: {best_score_threshold:.2f}, Best Score: {best_final_score:.5f}, time: {(time.time() - t):.5f}')
            if best_final_score > self.best_final_score:
                self.logger.info('** UPDATE best weights **!')
                self.best_final_score = best_final_score
                self.best_score_threshold = best_score_threshold
                self.model.eval()
                self.save(f'{self.base_dir}/best-checkpoint.bin')
                self.save_model(f'{self.base_dir}/best-model.bin')
                self.save_predictions(f'{self.base_dir}/all_predictions.csv')

            self.early_stop(best_final_score)
            if self.early_stop_epochs > self.config.SOLVER.EARLY_STOP_PATIENCE:
                self.logger.info('Early Stopping!')
                break

            if self.epoch % self.config.SOLVER.CLEAR_OUTPUT == 0:
                pass # CHANGE ONLY ONE LINE
#                 output.clear()

            self.epoch += 1

    def validation(self):
        self.model.eval()
        t = time.time()
        self.all_predictions = []
        torch.cuda.empty_cache()
        valid_loader = tqdm(self.val_loader, total=len(self.val_loader), desc="Validating")
        with torch.no_grad():
            for step, (images, targets, image_ids) in enumerate(valid_loader):
                images = [image.to(self.device) for image in images]  # respect the configured device
                outputs = self.model(images)
                inference(self.all_predictions, images, outputs, targets, image_ids)
                valid_loader.set_description(f'Validate Step {step}/{len(self.val_loader)}, ' + \
                                             f'time: {(time.time() - t):.5f}')
        best_score_threshold, best_final_score = evaluate(self.all_predictions)

        return best_score_threshold, best_final_score

    def train_one_epoch(self):
        self.model.train()
        summary_loss = AverageMeter()
        loss_box_reg = AverageMeter()
        loss_classifier = AverageMeter()
        loss_objectness = AverageMeter()
        loss_rpn_box_reg = AverageMeter()
        t = time.time()
        train_loader = tqdm(self.train_loader, total=len(self.train_loader), desc="Training")
        for step, (images, targets, image_ids) in enumerate(train_loader):
            images = torch.stack(images)
            
            # Match the input dtype to the training precision
            dtype = torch.half if self.mixed_precision else torch.float
            images = images.to(self.device, dtype=dtype)

            batch_size = images.shape[0]
            targets = [{k: v.to(self.device) for k, v in target.items()} for target in targets]
            for target in targets:
                target['boxes'] = target['boxes'].to(dtype)
                
            loss_dict = self.model(images, targets)
            loss = sum(loss for loss in loss_dict.values())
            box_reg = loss_dict['loss_box_reg']
            classifier = loss_dict['loss_classifier']
            objectness = loss_dict['loss_objectness']
            rpn_box_reg = loss_dict['loss_rpn_box_reg']

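            # Record the unscaled loss so the running average is comparable
            # between the mixed-precision and full-precision paths
            loss_value = loss.item()

            if self.mixed_precision:
                with amp.scale_loss(loss, self.optimizer) as scaled_loss:
                    scaled_loss = scaled_loss / self.accumulate
                    scaled_loss.backward()
            else:
                loss = loss / self.accumulate
                loss.backward()

            summary_loss.update(loss_value, batch_size)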
            loss_box_reg.update(box_reg.item(), batch_size)
            loss_classifier.update(classifier.item(), batch_size)
            loss_objectness.update(objectness.item(), batch_size)
            loss_rpn_box_reg.update(rpn_box_reg.item(), batch_size)
            
            # Step only after a full group of `accumulate` batches has been accumulated
            if (step + 1) % self.accumulate == 0:
                self.optimizer.step()
                self.optimizer.zero_grad()
            
            if self.do_scheduler:
                self.scheduler.step()
            train_loader.set_description(f'Train Step {step}/{len(self.train_loader)}, ' + \
                                         f'Learning rate {self.optimizer.param_groups[0]["lr"]}, ' + \
                                         f'summary_loss: {summary_loss.avg:.5f}, ' + \
                                         f'loss_box_reg: {loss_box_reg.avg:.5f}, ' + \
                                         f'loss_classifier: {loss_classifier.avg:.5f}, ' + \
                                         f'loss_objectness: {loss_objectness.avg:.5f}, ' + \
                                         f'loss_rpn_box_reg: {loss_rpn_box_reg.avg:.5f}, ' + \
                                         f'time: {(time.time() - t):.5f}')

        return summary_loss

    def save(self, path):
        self.model.eval()
        torch.save({
            'model_state_dict': self.model.state_dict(),
            'optimizer_state_dict': self.optimizer.state_dict(),
            'scheduler_state_dict': self.scheduler.state_dict(),
            'best_score_threshold': self.best_score_threshold,
            'best_final_score': self.best_final_score,
            'epoch': self.epoch,
        }, path)

    def save_model(self, path):
        self.model.eval()
        torch.save({
            'model_state_dict': self.model.state_dict(),
            'best_score_threshold': self.best_score_threshold,
            'best_final_score': self.best_final_score,
        }, path)

    def save_predictions(self, path):
        df = pd.DataFrame(self.all_predictions)
        df.to_csv(path, index=False)

    def load(self, path):
        checkpoint = torch.load(path)
        self.model.load_state_dict(checkpoint['model_state_dict'])
        self.optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
        self.scheduler.load_state_dict(checkpoint['scheduler_state_dict'])
        self.best_score_threshold = SCORE_THRESHOLD
        self.best_final_score = 0.69  # ad hoc baseline: new weights are saved only if validation beats this score
        self.epoch = checkpoint['epoch'] + 1

    def early_stop(self, score):
        if score < self.best_final_score:
            self.early_stop_epochs += 1
        else:
            self.early_stop_epochs = 0
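
The `accum` argument of the fitter controls gradient accumulation. As a minimal, framework-free sketch of why each loss is divided by the accumulation count before `backward()`: summing the gradients of `accum` scaled micro-batch losses gives the gradient of their mean, so one optimizer step behaves roughly like a step on a batch `accum` times larger. The toy model (`w * x`), the data, and the helper names are hypothetical, and the sketch assumes one update after every full group of `accum` batches.

```python
def grad_squared_error(w, x, y):
    """Gradient of (w * x - y) ** 2 with respect to w."""
    return 2.0 * (w * x - y) * x


def accumulate_and_step(w, batches, accum, lr):
    """Run one pass over `batches`, updating w once per `accum` batches."""
    grad_sum = 0.0   # plays the role of the .grad buffer
    updates = []
    for step, (x, y) in enumerate(batches):
        # Scale each micro-batch gradient, as loss / accum would before backward()
        grad_sum += grad_squared_error(w, x, y) / accum
        if (step + 1) % accum == 0:
            w -= lr * grad_sum      # optimizer.step()
            grad_sum = 0.0          # optimizer.zero_grad()
            updates.append(w)
    return w, updates


# Hypothetical (x, y) micro-batches of size 1
batches = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]
w_final, updates = accumulate_and_step(w=0.0, batches=batches, accum=2, lr=0.01)
print(len(updates), w_final)
```

With four batches and `accum=2` the sketch performs two updates, each using the mean gradient of its two batches.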

In [None]:
n_test = len(os.listdir('/kaggle/input/global-wheat-detection/test/'))
print(n_test)

In [None]:
cfg.defrost()

cfg['DATASETS']['ROOT_DIR'] = NEW_INPUT_PATH
cfg.INPUT.HSV_H = HSV_H
cfg.INPUT.HSV_S = HSV_S
cfg.INPUT.HSV_V = HSV_V
cfg.INPUT.BC_B = BC_B
cfg.INPUT.BC_C = BC_C
cfg.INPUT.COTOUT_NUM_HOLES=0 
cfg.SOLVER.BASE_LR = BASE_LR
cfg.SOLVER.BIAS_LR_FACTOR = BIAS_LR_FACTOR
cfg.SOLVER.MOMENTUM=MOMENTUM
cfg.SOLVER.WARMUP_EPOCHS=WARMUP_EPOCHS

OUTPUT_DIRS = []
for ii in range(len(models)):
    OUTPUT_DIRS.append("/kaggle/working/weights_%d/"%ii)
    if not os.path.exists(OUTPUT_DIRS[-1]):
        print('create ', OUTPUT_DIRS[-1])
        os.makedirs(OUTPUT_DIRS[-1])

!ls /kaggle/working/

In [None]:
from utils.logger import setup_logger

fitters=[]
for ii,path in enumerate(cfg['TEST']['WEIGHT']):
    
    cfg['OUTPUT_DIR'] = OUTPUT_DIRS[ii]
    
    checkpoint = torch.load(path)
    # n_test < 11 means a commit run on the sample test images:
    # train for fewer epochs there to save kernel committing time
    n_pseudo_epochs = PSEUDO_EPOCHS_COMMIT if n_test < 11 else PSEUDO_EPOCHS
    cfg.SOLVER.MAX_EPOCHS = checkpoint['epoch'] + n_pseudo_epochs + 1
    print('epochs = %d+%d+%d' % (checkpoint['epoch'], n_pseudo_epochs, 1))
    print(cfg)
    
    print('*** weight path ***', path)
    model = build_model(cfg)
    model.load_state_dict(checkpoint['model_state_dict'])
    
    train_loader, val_loader = make_data_loader(cfg,marking_list[ii],train_ids_list[ii], valid_ids_list[ii])
    logger = setup_logger("logger", cfg['OUTPUT_DIR'], 0)
    
    
    
    fitter = Fitter(model=model, device="cuda", cfg=cfg, train_loader=train_loader, val_loader=val_loader, logger=logger)
    fitter.load(path)
    fitter.fit()
    
    !rm -f {OUTPUT_DIRS[ii]+'last-checkpoint.bin'} # remove last checkpoint
    
    seed_everything(ii+1) # change random states of each model
    cfg['SEED'] = ii+1
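
As a worked check of the epoch arithmetic in the loop above: `Fitter.load` resumes at `checkpoint['epoch'] + 1`, and `MAX_EPOCHS` is set to the checkpoint epoch plus the pseudo-labeling epochs plus one, so `fit()` should run exactly the requested number of extra epochs. The helper `pseudo_epoch_range` and the epoch counts below are hypothetical and only illustrate the arithmetic.

```python
def pseudo_epoch_range(checkpoint_epoch, extra_epochs):
    """Epochs that fit() iterates over after resuming from a checkpoint."""
    start_epoch = checkpoint_epoch + 1                 # set in Fitter.load
    max_epochs = checkpoint_epoch + extra_epochs + 1   # cfg.SOLVER.MAX_EPOCHS
    return range(start_epoch, max_epochs)


# Hypothetical values: a checkpoint saved at epoch 40, 5 pseudo-labeling epochs
epochs = pseudo_epoch_range(checkpoint_epoch=40, extra_epochs=5)
print(list(epochs))  # [41, 42, 43, 44, 45]
```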

### Now we have new best weights

In [None]:
!date
!ls -lh {cfg['OUTPUT_DIR']} # check the new weights: last-checkpoint.bin was removed after training, and best-checkpoint.bin exists only if the validation score improved

In [None]:
!ls -lh {cfg['OUTPUT_DIR']+'/last-checkpoint.bin'}

In [None]:
best_paths = []

for ii in range(len(OUTPUT_DIRS)):
    if os.path.exists(OUTPUT_DIRS[ii]+'best-checkpoint.bin'):
        best_path = OUTPUT_DIRS[ii]+'best-checkpoint.bin'
    elif os.path.exists(OUTPUT_DIRS[ii]+'last-checkpoint.bin'):
        best_path = OUTPUT_DIRS[ii]+'last-checkpoint.bin'
    else:
        best_path = BEST_PATHS[ii] # use the non-pseudo-labeling path
    best_paths.append(best_path)
    
    !ls -lh {OUTPUT_DIRS[ii]}
    print(' ---- \n')
    
print('best paths are ', best_paths)
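
The fallback order used above (best checkpoint, then last checkpoint, then the original non-pseudo-labeling weights) can be sketched as a small helper. This is only an illustration: `pick_checkpoint` and the placeholder `original.bin` are hypothetical names, and the demo runs on a temporary directory rather than on the real output folders.

```python
import tempfile
from pathlib import Path


def pick_checkpoint(output_dir, fallback):
    """Return the first existing checkpoint in priority order, else `fallback`."""
    for name in ("best-checkpoint.bin", "last-checkpoint.bin"):
        candidate = Path(output_dir) / name
        if candidate.exists():
            return str(candidate)
    return fallback


# Demo on a temporary directory (file names follow the ones used above)
with tempfile.TemporaryDirectory() as tmp:
    before = pick_checkpoint(tmp, fallback="original.bin")
    (Path(tmp) / "last-checkpoint.bin").touch()
    with_last = pick_checkpoint(tmp, fallback="original.bin")
    (Path(tmp) / "best-checkpoint.bin").touch()
    with_best = pick_checkpoint(tmp, fallback="original.bin")

print(before, Path(with_last).name, Path(with_best).name)
```

Keeping the priority list in one place makes it easy to see which weights each fold will contribute to the final ensemble.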

## 3.3 Re-predict the test data with new weights

In [None]:
if True:
    cfg['OUTPUT_DIR'] = "/kaggle/working/"
    cfg['DATASETS']['ROOT_DIR'] = "/kaggle/input/global-wheat-detection"
    cfg['TEST']['WEIGHT'] = best_paths
    cfg['TEST']['IMS_PER_BATCH'] = 1
    print(cfg)
    test_loader = make_test_data_loader(cfg)
    
    # I load checkpoints twice to ensure that correct weights are loaded
#     checkpoint = torch.load(best_path)
#     model.load_state_dict(checkpoint['model_state_dict']) 
    tester = Tester(models=models, device=device, cfg=cfg, test_loader=test_loader, n_viz=N_VIZ)
    tester.load(best_paths)
    print('success load weights!', best_paths)
    
    tester.test(pp_shrink=PP_SHRINK[1])

In [None]:
# check whether submission.csv is re-created
!date
!ls -lh {cfg['OUTPUT_DIR']}

# Delete the repo when finished

In [None]:

%cd ..
!rm -rf {CODE_PATH}

In [None]:
!rm -rf {NEW_INPUT_PATH}

In [None]:
!ls /kaggle/working
!ls /kaggle/working/wheatdetection
