# Ensemble Notebook (Catalyst)

* Experiment on Test Time Augmentation

### Experiments

* [x] Efficientnet b5 **PL:0.652**
* [x] FP16, following this [catalyst tutorial](https://github.com/catalyst-team/catalyst/blob/master/examples/notebooks/segmentation-tutorial.ipynb)
    * Gradients start to overflow after the 5th epoch; everything else is okay
* [x] Saving & Loading from JIT **PL:0.655**
* [x] Ensemble
* [x] 384x576
* [x] polygon convex
* [x] Test the funnel network again ==> it doesn't really work
* [x] Ranger optimizer 
    * [x] RADAM
    * [x] Look Ahead
* [x] Find threshold on a portion of train dataset
* [x] Test Time Augmentation (TTA)
* [ ] Use dataleak info

### Installing Apex for FP16

```shell
git clone https://github.com/NVIDIA/apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./apex
# Then, in the notebook (Python, not shell): is_fp16_used = True
```

### Other Installations

```
pip install catalyst
pip install pretrainedmodels
pip install git+https://github.com/qubvel/segmentation_models.pytorch
pip install pytorch-toolbelt
pip install torchvision==0.4
```

Our starter kernel is from [this open kernel](https://www.kaggle.com/artgor/segmentation-in-pytorch-using-convenient-tools)

## ======== Below is from the original notebook ===========

## General information

In this kernel I work with the data from the Understanding Clouds from Satellite Images competition.

> Shallow clouds play a huge role in determining the Earth's climate. They're also difficult to understand and to represent in climate models. By classifying different types of cloud organization, researchers at Max Planck hope to improve our physical understanding of these clouds, which in turn will help us build better climate models.

So in this competition we are tasked with multiclass segmentation: finding 4 different cloud patterns in the images. However, predictions are made for each image-label pair separately, so this can also be treated as 4 binary segmentation tasks.
It is important to note that images (and masks) are `1400 x 2100`, but predicted masks should be `350 x 525`.

In this kernel I'll use (or will use in next versions) the following notable libraries:
- [albumentations](https://github.com/albu/albumentations): this is a great library for image augmentation which makes it easier and more convenient
- [catalyst](https://github.com/catalyst-team/catalyst): this is a great library which makes using PyTorch easier, helps with reproducibility and contains a lot of useful utils
- [segmentation_models_pytorch](https://github.com/qubvel/segmentation_models.pytorch): this is a great library with convenient wrappers for models, losses and other useful things
- [pytorch-toolbelt](https://github.com/BloodAxe/pytorch-toolbelt): this is a great library with many useful shortcuts for building pytorch models


UPD: Version 35 - changed calculation of optimal threshold and min size
![](https://i.imgur.com/EOvz5kd.png)

## Importing libraries

In [1]:
import os
import cv2
import collections
import time 
import tqdm
from PIL import Image
from functools import partial
train_on_gpu = True

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

import torchvision
import torchvision.transforms as transforms
import torch
from torch.utils.data import TensorDataset, DataLoader,Dataset
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.optim import lr_scheduler
from torch.utils.data.sampler import SubsetRandomSampler
from torch.optim.lr_scheduler import StepLR, ReduceLROnPlateau, CosineAnnealingLR

import albumentations as albu
from albumentations import pytorch as AT

from catalyst.data import Augmentor
from catalyst.dl import utils
from catalyst.data.reader import ImageReader, ScalarReader, ReaderCompose, LambdaReader
from catalyst.dl.runner import SupervisedRunner
from catalyst.contrib.models.segmentation import Unet
from catalyst.dl.callbacks import DiceCallback, EarlyStoppingCallback, InferCallback, CheckpointCallback

import segmentation_models_pytorch as smp

from utils_google import download_blob,upload_blob,list_blobs_with_prefix

## Helper functions and classes

In [2]:
CUDA = torch.cuda.is_available()

def reverseDim(x, dim):
    """
    Reverse a tensor along the given dim
    """
    dimlen = x.size(dim)
    # Build the index on the same device as x, so CPU tensors also work on a CUDA machine
    rev_idx = torch.arange(dimlen - 1, -1, -1, device=x.device)
    return x.index_select(dim, rev_idx)

#### Configs and Hyper-Params

In [3]:
from utils_ucsi import *

In [4]:
TRAIN = False # Do we run training?
num_epochs = 40 # How many epochs are we going to train?

FP16 = False # Do we use half precision?
fp16_params = dict(opt_level = "O2") if FP16 else None

LOAD = False # Do we load trained weights at the beginning?
LOAD_PATH = "cata-eff-b5.pth" # Path to the model weights, if we load trained weights at the beginning

ENCODER = 'efficientnet-b5' # Encoder model name
ENCODER_WEIGHTS = 'imagenet' # Encoder pretrained weights
DEVICE = 'cuda' 

ACTIVATION = None

TH_FIND = True
#class_params = {0: (0.50, 9200), 1: (0.60, 9200), 2: (0.65, 9200), 3: (0.40, 9200)}
#class_params = {0: (0.55, 10000), 1: (0.65, 10000), 2: (0.7, 10000), 3: (0.45, 10000)}

MIN_SIZE_RANGE = 3
MIN_SIZE = [8000, 8500, 9000, 9500, 10000][-MIN_SIZE_RANGE:]
#MIN_SIZE = [0, 100, 1200, 5000,8000, 10000][-MIN_SIZE_RANGE:]

INPUT_SIZE = (384,576)

# Do we use the train dataset to find the threshold?
FIND_TRAIN = True
# What fraction of the train dataset do we use?
SAMPLE_RATIO = .4

In [5]:
list_blobs_with_prefix("milkyway","pth/")

Blobs:
pth/
pth/1109_b5_onecycle_0.01_r2_best.pth
pth/1110_b6_onecycle_0.01_r2_best.pth
pth/b5_0.72_best.pth
pth/b6_fold/
pth/b6_fold/b6_f0.pth
pth/b6_fold/b6_f1.pth
pth/b6_fold/b6_f2.pth
pth/b6_fold/b6_f3.pth
pth/b6_fold/b6_f4.pth
pth/dpn131_0.73_best.pth
pth/fpn_se_resnext.pth
pth/se_resnext101_32x4d_5_folds/
pth/se_resnext101_32x4d_5_folds/resnext_f0_best.pth
pth/se_resnext101_32x4d_5_folds/resnext_f1_best.pth
pth/se_resnext101_32x4d_5_folds/resnext_f2_best.pth
pth/se_resnext101_32x4d_5_folds/resnext_f3_best.pth
pth/se_resnext101_32x4d_5_folds/resnext_f4_best.pth
pth/se_resnext50_32x4d_256x384_3-folds/
pth/se_resnext50_32x4d_256x384_3-folds/seg_se_resnext50_32x4d_f0.pth
pth/se_resnext50_32x4d_256x384_3-folds/seg_se_resnext50_32x4d_f1.pth
pth/se_resnext50_32x4d_256x384_3-folds/seg_se_resnext50_32x4d_f2.pth
pth/se_resnext50_32x4d_384x576_3-folds/
pth/se_resnext50_32x4d_384x576_3-folds/se_resnext50_32x4d_f0.pth
pth/se_resnext50_32x4d_384x576_3-folds/se_resnext50_32x4d_f1.pth
pth/se_res

In [6]:
from pathlib import Path
HOME = Path(os.environ["HOME"])

path = HOME/'ucsi'
# os.listdir(path)

In [9]:
MODEL_PRED = True
# Ensemble Model Path List

MODEL_PATHS = [
    #384x576
    
    {"path":"pth/fpn_se_resnext.pth","encoder":"se_resnext101_32x4d"},#0.6596
    {"path":"pth/senet154_384x576.pth","encoder":"senet154"},  #0.6511
#   {"path":"pth/1109_b5_onecycle_0.01_r2_best.pth", "encoder":"efficientnet-b5"}, 0.6428
#   {"path":"pth/1110_b6_onecycle_0.01_r2_best.pth","encoder":"efficientnet-b6"}, 0.6552
    
    #se_resnext101_32x4d 0.6584
    {"path":"pth/se_resnext101_32x4d_5_folds/resnext_f0_best.pth","encoder":"se_resnext101_32x4d"},
    {"path":"pth/se_resnext101_32x4d_5_folds/resnext_f1_best.pth","encoder":"se_resnext101_32x4d"},
    {"path":"pth/se_resnext101_32x4d_5_folds/resnext_f2_best.pth","encoder":"se_resnext101_32x4d"},
    {"path":"pth/se_resnext101_32x4d_5_folds/resnext_f3_best.pth","encoder":"se_resnext101_32x4d"},
    {"path":"pth/se_resnext101_32x4d_5_folds/resnext_f4_best.pth","encoder":"se_resnext101_32x4d"},
    # efficientnet-b6 0.6598
    {"path":"pth/b6_fold/b6_f0.pth","encoder":"efficientnet-b6"},
    {"path":"pth/b6_fold/b6_f1.pth","encoder":"efficientnet-b6"},
    {"path":"pth/b6_fold/b6_f2.pth","encoder":"efficientnet-b6"},
    {"path":"pth/b6_fold/b6_f3.pth","encoder":"efficientnet-b6"},
    {"path":"pth/b6_fold/b6_f4.pth","encoder":"efficientnet-b6"},
    #se_resnext50_32x4d None
    {"path":"pth/se_resnext50_32x4d_384x576_3-folds/se_resnext50_32x4d_f0.pth","encoder":"se_resnext50_32x4d"},
    {"path":"pth/se_resnext50_32x4d_384x576_3-folds/se_resnext50_32x4d_f1.pth","encoder":"se_resnext50_32x4d"},
    {"path":"pth/se_resnext50_32x4d_384x576_3-folds/se_resnext50_32x4d_f2.pth","encoder":"se_resnext50_32x4d"},


]


# INFER_BS = 28

Download each file from Google Cloud Storage if it does not exist locally

In [10]:
def makeSureFile(gcs_path):
    if os.path.exists(path/gcs_path):
        print("File already exists:\t%s"%(gcs_path))
    else:
        download_blob("milkyway",gcs_path,path/gcs_path)
    

_ = list(makeSureFile(m["path"]) for m in MODEL_PATHS)

File already exists:	pth/fpn_se_resnext.pth
File already exists:	pth/senet154_384x576.pth
File already exists:	pth/se_resnext101_32x4d_5_folds/resnext_f0_best.pth
File already exists:	pth/se_resnext101_32x4d_5_folds/resnext_f1_best.pth
File already exists:	pth/se_resnext101_32x4d_5_folds/resnext_f2_best.pth
File already exists:	pth/se_resnext101_32x4d_5_folds/resnext_f3_best.pth
File already exists:	pth/se_resnext101_32x4d_5_folds/resnext_f4_best.pth
File already exists:	pth/b6_fold/b6_f0.pth
File already exists:	pth/b6_fold/b6_f1.pth
File already exists:	pth/b6_fold/b6_f2.pth
File already exists:	pth/b6_fold/b6_f3.pth
File already exists:	pth/b6_fold/b6_f4.pth
File already exists:	pth/se_resnext50_32x4d_384x576_3-folds/se_resnext50_32x4d_f0.pth
File already exists:	pth/se_resnext50_32x4d_384x576_3-folds/se_resnext50_32x4d_f1.pth
File already exists:	pth/se_resnext50_32x4d_384x576_3-folds/se_resnext50_32x4d_f2.pth
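
The check-then-download step can be sketched without the Google Cloud dependency. The `fetch` callable below is a hypothetical stand-in for `download_blob` (its real signature lives in `utils_google`, which is not shown here), and creating the parent folders is an assumption about what the real helper may need.

```python
from pathlib import Path
from typing import Callable


def ensure_file(root: Path, rel_path: str, fetch: Callable[[str, Path], None]) -> bool:
    """Fetch `rel_path` into `root` unless it is already there.

    Returns True when a download was triggered, False when the file was cached.
    `fetch(rel_path, target)` is a hypothetical stand-in for the real downloader.
    """
    target = root / rel_path
    if target.exists():
        return False
    # Nested keys such as "pth/b6_fold/b6_f0.pth" need their folders first.
    target.parent.mkdir(parents=True, exist_ok=True)
    fetch(rel_path, target)
    return True
```

Returning a flag instead of printing makes it easy to count how many weights were actually downloaded on a fresh machine.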


## Encoders
Encoder backbone of fpn/unet
#### VGG
vgg11, vgg13, vgg16, vgg19, vgg11bn, vgg13bn, vgg16bn, vgg19bn,
#### DenseNet
densenet121, densenet169, densenet201, densenet161, dpn68, dpn98, dpn131,
inceptionresnetv2,
#### ResNet
resnet18, resnet34, resnet50, resnet101, resnet152,
resnext50_32x4d, resnext101_32x8d,
#### Se ResNet
se_resnet50, se_resnet101, se_resnet152,
#### Se ResNext
se_resnext50_32x4d, se_resnext101_32x4d,
#### Se Net
senet154,
#### Efficient Net
efficientnet-b0, efficientnet-b1, efficientnet-b2, efficientnet-b3, efficientnet-b4, efficientnet-b5, efficientnet-b6, efficientnet-b7

## Data overview

Let's have a look at the data first.

We have folders with train and test images, a file with train image ids and masks, and a sample submission.

In [11]:
train = pd.read_csv(f'{path}/train.csv')
sub = pd.read_csv(f'{path}/sample_submission.csv')

In [12]:
train.head()

| | Image_Label | EncodedPixels |
|---|---|---|
| 0 | 0011165.jpg_Fish | 264918 937 266318 937 267718 937 269118 937 27... |
| 1 | 0011165.jpg_Flower | 1355565 1002 1356965 1002 1358365 1002 1359765... |
| 2 | 0011165.jpg_Gravel | |
| 3 | 0011165.jpg_Sugar | |
| 4 | 002be4f.jpg_Fish | 233813 878 235213 878 236613 878 238010 881 23... |
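
The `EncodedPixels` strings are run-length encodings: alternating start positions and run lengths. Below is a minimal decoding sketch, assuming the usual Kaggle convention of 1-indexed, column-major pixel numbering (this is consistent with the rows above, where successive starts differ by 1400, the image height). The notebook's own `rle_decode` lives in `utils_ucsi` and may differ; `rle_to_mask` is an illustrative name.

```python
import numpy as np


def rle_to_mask(rle, shape=(1400, 2100)) -> np.ndarray:
    """Decode 'start length start length ...' into a binary (height, width) mask.

    Anything that is not a non-empty string (e.g. NaN for a missing label)
    yields an all-zero mask.
    """
    height, width = shape
    flat = np.zeros(height * width, dtype=np.uint8)
    if isinstance(rle, str) and rle.strip():
        values = [int(v) for v in rle.split()]
        starts, lengths = values[0::2], values[1::2]
        for start, length in zip(starts, lengths):
            begin = start - 1  # starts are 1-indexed
            flat[begin:begin + length] = 1
    # Pixels are numbered down each column first, hence Fortran order.
    return flat.reshape((height, width), order="F")


# Toy 3x4 image: a run of 2 pixels starting at pixel 2, and 1 pixel at pixel 7.
toy = rle_to_mask("2 2 7 1", shape=(3, 4))
print(toy)
```

In the toy case the first run fills the lower two pixels of the first column, and the single pixel lands at the top of the third column.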


In [13]:
n_train = len(os.listdir(f'{path}/train_images'))
n_test = len(os.listdir(f'{path}/test_images'))
print(f'There are {n_train} images in train dataset')
print(f'There are {n_test} images in test dataset')

There are 5546 images in train dataset
There are 3698 images in test dataset


In [14]:
train['Image_Label'].apply(lambda x: x.split('_')[1]).value_counts()

Flower    5546
Fish      5546
Gravel    5546
Sugar     5546
Name: Image_Label, dtype: int64

So we have ~5.5k images in train dataset and they can have up to 4 masks: Fish, Flower, Gravel and Sugar.

In [15]:
train.loc[train['EncodedPixels'].isnull() == False, 'Image_Label'].apply(lambda x: x.split('_')[1]).value_counts()

Sugar     3751
Gravel    2939
Fish      2781
Flower    2365
Name: Image_Label, dtype: int64

In [16]:
train.loc[train['EncodedPixels'].isnull() == False, 'Image_Label'].apply(lambda x: x.split('_')[0]).value_counts().value_counts()

2    2372
3    1560
1    1348
4     266
Name: Image_Label, dtype: int64

But there are a lot of empty masks. In fact only 266 images have all four masks. It is important to remember this.

In [17]:
train['label'] = train['Image_Label'].apply(lambda x: x.split('_')[1])
train['im_id'] = train['Image_Label'].apply(lambda x: x.split('_')[0])


sub['label'] = sub['Image_Label'].apply(lambda x: x.split('_')[1])
sub['im_id'] = sub['Image_Label'].apply(lambda x: x.split('_')[0])

Let's have a look at the images and the masks.

In [18]:
#fig = plt.figure(figsize=(25, 16))
#for j, im_id in enumerate(np.random.choice(train['im_id'].unique(), 4)):
#    for i, (idx, row) in enumerate(train.loc[train['im_id'] == im_id].iterrows()):
#        ax = fig.add_subplot(5, 4, j * 4 + i + 1, xticks=[], yticks=[])
#        im = Image.open(f"{path}/train_images/{row['Image_Label'].split('_')[0]}")
#        plt.imshow(im)
#       mask_rle = row['EncodedPixels']
#        try: # label might not be there!
#            mask = rle_decode(mask_rle)
#        except:
#            mask = np.zeros((1400, 2100))
#        plt.imshow(mask, alpha=0.5, cmap='gray')
#        ax.set_title(f"Image: {row['Image_Label'].split('_')[0]}. Label: {row['label']}")

We can see that masks can overlap. Also we can see that clouds are really similar to fish, flower and so on. Another important point: masks are often quite big and can have seemingly empty areas.

## Preparing data for modelling

First, let's create a list of unique image ids and the count of masks for each image. This will allow us to make a stratified split based on this count.

In [19]:
id_mask_count = train.loc[train['EncodedPixels'].isnull() == False, 'Image_Label'].apply(lambda x: x.split('_')[0]).value_counts().\
reset_index().rename(columns={'index': 'img_id', 'Image_Label': 'count'})
train_ids, valid_ids = train_test_split(id_mask_count['img_id'].values, random_state=42, stratify=id_mask_count['count'], test_size=0.1)
test_ids = sub['Image_Label'].apply(lambda x: x.split('_')[0]).drop_duplicates().values

## Setting up data for training in Catalyst

Now we define model and training parameters

In [20]:
model = smp.FPN(
    encoder_name=ENCODER, 
    encoder_weights=ENCODER_WEIGHTS, 
    classes=4, 
    activation=ACTIVATION,
)
preprocessing_fn = smp.encoders.get_preprocessing_fn(ENCODER, ENCODER_WEIGHTS)

In [21]:
num_workers = 0
bs = 10 if FP16 else 5
train_dataset = CloudDataset(df=train, datatype='train', img_ids=train_ids, transforms = get_training_augmentation(), preprocessing=get_preprocessing(preprocessing_fn))
valid_dataset = CloudDataset(df=train, datatype='valid', img_ids=valid_ids, transforms = get_validation_augmentation(), preprocessing=get_preprocessing(preprocessing_fn))

train_loader = DataLoader(train_dataset, batch_size=bs, shuffle=True, num_workers=num_workers)
valid_loader = DataLoader(valid_dataset, batch_size=bs, shuffle=False, num_workers=num_workers)

loaders = {
    "train": train_loader,
    "valid": valid_loader
}

SAMPLE_NUMBER = int(SAMPLE_RATIO * len(train_dataset))
print("sampling %s for threshold finding"%(SAMPLE_NUMBER))

sampling 1996 for threshold finding



Using lambda is incompatible with multiprocessing. Consider using regular functions or partial().



In [22]:

logdir = "./logs/segmentation"

# model, criterion, optimizer
# optimizer = torch.optim.Adam([
#     {'params': model.decoder.parameters(), 'lr': 1e-3}, 
#     {'params': model.encoder.parameters(), 'lr': 5e-4},  # Pretrained section of the model using smaller lr
# ], 
#     weight_decay=3e-4)
# scheduler = ReduceLROnPlateau(optimizer, factor=0.25, patience=2)
# criterion = smp.utils.losses.BCEDiceLoss(eps=1.)
runner = SupervisedRunner()

In [23]:
if LOAD:
    model.load_state_dict(torch.load(LOAD_PATH))

## Exploring predictions
Let's make predictions on validation dataset.

First we need to optimize the thresholds.

In [24]:
from catalyst.dl.core import Callback, CallbackOrder, RunnerState
from collections import defaultdict

# A modified version that saves memory during inference
class InferCallback(Callback):
    def __init__(self, out_dir=None, out_prefix=None):
        super().__init__(CallbackOrder.Internal)
        self.out_dir = out_dir
        self.out_prefix = out_prefix
        self.predictions = defaultdict(lambda: [])
        self._keys_from_state = ["out_dir", "out_prefix"]

    def on_stage_start(self, state: RunnerState):
        for key in self._keys_from_state:
            value = getattr(state, key, None)
            if value is not None:
                setattr(self, key, value)
        # assert self.out_prefix is not None
        if self.out_dir is not None:
            self.out_prefix = str(self.out_dir) + "/" + str(self.out_prefix)
        if self.out_prefix is not None:
            os.makedirs(os.path.dirname(self.out_prefix), exist_ok=True)

    def on_loader_start(self, state: RunnerState):
        self.predictions = {"logits":list()}
    
    def on_batch_end(self, state: RunnerState):
        dct = state.output
        dct = {key: value.detach().cpu().numpy() for key, value in dct.items()}
        for key, value in dct.items():
            # Keep predictions at submission resolution and in half precision to save memory
            pred = np.zeros((len(value)*4, 350, 525), dtype = np.float16)
            for i,output in enumerate(value):
                for j, probability in enumerate(output):
                    probability = cv2.resize(probability, dsize=(525, 350), interpolation=cv2.INTER_LINEAR)
                    pred[i * 4 + j, :, :] = sigmoid(probability)
            # Despite the key name, these are sigmoid probabilities, not raw logits
            self.predictions["logits"].append(pred)
        print(">",end = "")

    def on_loader_end(self, state: RunnerState):
        self.predictions = {
            key: np.concatenate(value, axis=0)
            for key, value in self.predictions.items()
        }

In [25]:
infer_cb = []

In [26]:
class ensModel(nn.Module):
    def __init__(self, models):
        super().__init__()
        self.models = models
    
    def forward(self, x):
        res = []
        x = x.cuda()
        # Test Time Augmentation on an (N, C, H, W) batch
        x2 = reverseDim(x,2) # flip the 3rd dimension (height)
        x3 = reverseDim(x,3) # flip the 4th dimension (width)
        with torch.no_grad():
            for m in self.models:
                y1_ = m(x)
                # flip back the prediction
                y2_ = reverseDim(m(x2),2)
                y3_  = reverseDim(m(x3),3)
                # average the three views for this model
                y_ = torch.mean(torch.stack([y1_,y2_,y3_]), dim = 0)
                res.append(y_)
        # average over all models in the ensemble
        res = torch.stack(res)
        return torch.mean(res, dim=0)
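
The flip, predict, flip back, average pattern used by `ensModel` can be tried in isolation. The sketch below uses NumPy so that it runs without a GPU; `flip_tta` and `toy_model` are illustrative names, and the toy model is only a stand-in for a segmentation network.

```python
import numpy as np


def flip_tta(predict, batch: np.ndarray) -> np.ndarray:
    """Average predictions over identity, vertical flip and horizontal flip.

    `batch` has shape (N, C, H, W); `predict` maps such an array to an array
    with the same spatial layout (e.g. per-pixel logits).
    """
    views = [
        predict(batch),
        # Flip the input, predict, then flip the prediction back so that
        # every view is aligned with the original image before averaging.
        np.flip(predict(np.flip(batch, axis=2)), axis=2),
        np.flip(predict(np.flip(batch, axis=3)), axis=3),
    ]
    return np.mean(np.stack(views), axis=0)


def toy_model(x: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a network: shifts every response one pixel
    to the right, so it is deliberately not flip-symmetric."""
    return np.roll(x, shift=1, axis=3)


x = np.arange(2 * 1 * 3 * 4, dtype=np.float32).reshape(2, 1, 3, 4)
out = flip_tta(toy_model, x)
print(out.shape)  # same shape as the input batch
```

For a model that is already flip-symmetric, the three views agree and the average changes nothing; the benefit comes from models whose errors differ between orientations.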

In [27]:
if MODEL_PRED:
    print("Loading ensemble models")
    print("="*70)
    models = list(loadModel(path = p["path"],encoder = p["encoder"]).eval() for p in MODEL_PATHS)
    if torch.cuda.is_available():
        models = list(m.cuda() for m in models)
    model = ensModel(models)
else:
    infer_cb.append(CheckpointCallback(resume=f"{logdir}/checkpoints/best.pth"),)
infer_cb.append(InferCallback())

Loading ensemble models
loading se_resnext101_32x4d from path 'pth/fpn_se_resnext.pth'
loading senet154 from path 'pth/senet154_384x576.pth'
loading se_resnext101_32x4d from path 'pth/se_resnext101_32x4d_5_folds/resnext_f0_best.pth'
loading se_resnext101_32x4d from path 'pth/se_resnext101_32x4d_5_folds/resnext_f1_best.pth'
loading se_resnext101_32x4d from path 'pth/se_resnext101_32x4d_5_folds/resnext_f2_best.pth'
loading se_resnext101_32x4d from path 'pth/se_resnext101_32x4d_5_folds/resnext_f3_best.pth'
loading se_resnext101_32x4d from path 'pth/se_resnext101_32x4d_5_folds/resnext_f4_best.pth'
loading efficientnet-b6 from path 'pth/b6_fold/b6_f0.pth'
loading efficientnet-b6 from path 'pth/b6_fold/b6_f1.pth'
loading efficientnet-b6 from path 'pth/b6_fold/b6_f2.pth'
loading efficientnet-b6 from path 'pth/b6_fold/b6_f3.pth'
loading efficientnet-b6 from path 'pth/b6_fold/b6_f4.pth'
loading se_resnext50_32x4d from path 'pth/se_resnext50_32x4d_384x576_3-folds/se_resnext50_32x4d_f0.pth'
loadi

In [28]:
encoded_pixels = []

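# Rebuild data loader
# NOTE: the dataset is read twice (once for inference, once to rebuild the masks below).
# If get_training_augmentation() applies random transforms, the two passes may not line up;
# get_validation_augmentation() would avoid that.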

train_dataset = CloudDataset(df=train, datatype='train', img_ids=train_ids[:SAMPLE_NUMBER], transforms = get_training_augmentation(), preprocessing=get_preprocessing(preprocessing_fn))
train_loader = DataLoader(train_dataset, batch_size=bs*4, shuffle=False, num_workers=num_workers)
valid_loader = DataLoader(valid_dataset, batch_size=bs*4, shuffle=False, num_workers=num_workers)
if TH_FIND:    
    loaders = {"infer": train_loader if FIND_TRAIN else valid_loader}
    # Run inference through model
    print("Running inference:")
    print("="*(len(train_dataset if FIND_TRAIN else valid_dataset)//(bs*8)))
    runner.infer(
        model=model,
        loaders=loaders,
        callbacks=infer_cb,
    )
    valid_masks = []
    print("Build valid mask on :\t%s"%("train data" if FIND_TRAIN else "valid data"))
    for i, batch in enumerate(tqdm.tqdm(train_dataset if FIND_TRAIN else valid_dataset)):
        image, mask = batch
        for m in mask: # for each segmentation class
            if m.shape != (350, 525):
                m = cv2.resize(m, dsize=(525, 350), interpolation=cv2.INTER_LINEAR)
            valid_masks.append(m)
    probabilities  = runner.callbacks[0].predictions["logits"]
else:
    print("Not running infer for threshold finding")

Running inference:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

  0%|          | 1/1996 [00:00<05:34,  5.96it/s]

Build valid mask on :	train data


100%|██████████| 1996/1996 [04:52<00:00,  6.82it/s]


## Find optimal values on train dataset

We found that when the validation set is used for threshold finding, the resulting threshold and min_size are volatile. Hence we use a larger dataset (a sample of the training set) to find them.

This means we have to store a much larger array of predictions in the process. I had to write a customized callback class to prevent running out of memory, by not storing the raw predicted logits at the intermediate step.
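
To get a feel for the sizes involved, here is a back-of-the-envelope sketch. It compares keeping network-resolution outputs as float32 (an assumed baseline layout, not something measured in this notebook) against the downsized float16 arrays the customized callback stores. The sample size of 1996 is the one printed earlier.

```python
def prediction_bytes(n_images, n_classes, height, width, bytes_per_value):
    """Size in bytes of a dense (n_images * n_classes, height, width) array."""
    return n_images * n_classes * height * width * bytes_per_value


N_IMAGES, N_CLASSES = 1996, 4  # sampled train images, four cloud classes

# Network-resolution outputs kept as float32 (assumed baseline).
raw = prediction_bytes(N_IMAGES, N_CLASSES, 384, 576, 4)
# Downsized to the 350 x 525 submission resolution and stored as float16.
compact = prediction_bytes(N_IMAGES, N_CLASSES, 350, 525, 2)

print(f"raw float32 @ 384x576:     {raw / 1e9:.2f} GB")
print(f"compact float16 @ 350x525: {compact / 1e9:.2f} GB")
```

Even the compact form is several gigabytes, which is why the sample ratio matters as much as the dtype.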

The following is from the notebook we forked


```
========================================
```

## Find optimal values

First of all, my thanks to @samusram for finding a mistake in my validation
https://www.kaggle.com/c/understanding_cloud_organization/discussion/107711#622412

And now I find optimal values separately for each class.

In [29]:
def post_process(probability, threshold, min_size):
    """
    Post processing of each predicted mask, components with lesser number of pixels
    than `min_size` are ignored
    """
    # don't remember where I saw it
    mask = cv2.threshold(np.float32(probability), threshold, 1, cv2.THRESH_BINARY)[1]
    num_component, component = cv2.connectedComponents(mask.astype(np.uint8))
    predictions = np.zeros((350, 525), np.float32)
    num = 0
    for c in range(1, num_component):
        p = (component == c)
        if p.sum() > min_size:
            predictions[p] = 1
            num += 1
    return predictions, num

In [30]:
if TH_FIND:
    class_params = {}
    for class_id in range(4):
        print(class_id)
        attempts = []
        for t in range(30, 75, 5):
            t /= 100
            for ms in MIN_SIZE:
                masks = []
                for i in range(class_id, len(probabilities), 4):
                    probability = probabilities[i]
#                     predict, num_predict = post_process(sigmoid(probability), t, ms)
                    predict, num_predict = post_process(probability, t, ms)
                    masks.append(predict)
    
                d = []
                for i, j in zip(masks, valid_masks[class_id::4]):
                    if (i.sum() == 0) & (j.sum() == 0):
                        d.append(1)
                    else:
                        d.append(dice(i, j))
    
                attempts.append((t, ms, np.mean(d)))
    
        attempts_df = pd.DataFrame(attempts, columns=['threshold', 'size', 'dice'])
        attempts_df = attempts_df.sort_values('dice', ascending=False)
        print(attempts_df.head())
        best_threshold = attempts_df['threshold'].values[0]
        best_size = attempts_df['size'].values[0]
        
        class_params[class_id] = (best_threshold, best_size)
else:
    print("Not running threshold finding")

0
    threshold   size      dice
19       0.60   9500  0.547078
20       0.60  10000  0.547045
23       0.65  10000  0.546517
18       0.60   9000  0.546435
17       0.55  10000  0.546405
1
    threshold   size      dice
25       0.70   9500  0.651841
23       0.65  10000  0.651756
26       0.70  10000  0.651571
24       0.70   9000  0.650690
22       0.65   9500  0.650301
2
    threshold   size      dice
20       0.60  10000  0.525753
19       0.60   9500  0.525180
23       0.65  10000  0.525157
21       0.65   9000  0.524941
22       0.65   9500  0.524727
3
    threshold   size      dice
7        0.40   9500  0.387231
8        0.40  10000  0.387005
5        0.35  10000  0.386709
11       0.45  10000  0.386640
10       0.45   9500  0.386302
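
The `dice` helper used in the search comes from `utils_ucsi` and is not shown in this notebook. A minimal sketch of the metric follows, assuming binary masks and the same "two empty masks are a perfect match" convention as the loop above; `dice_score` is an illustrative name and the real helper may differ (for example in how it binarizes resized masks).

```python
import numpy as np


def dice_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice = 2 * |A and B| / (|A| + |B|) for binary masks.

    Inputs are binarized first (any non-zero value counts as foreground),
    which is an assumption of this sketch.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks empty: treated as a perfect match.
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / total)


a = np.array([[1, 1, 0], [0, 0, 0]])
b = np.array([[1, 0, 0], [0, 1, 0]])
print(dice_score(a, b))  # overlap 1, sizes 2 and 2 -> 0.5
```

Because empty-versus-empty pairs score 1, the mean dice rewards settings that correctly predict "no mask", which is one reason large `min_size` values do well here.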


```
{0: (0.7, 10000), 1: (0.7, 10000), 2: (0.65, 10000), 3: (0.7, 10000)}
{0: (0.55, 10000), 1: (0.7, 10000), 2: (0.65, 10000), 3: (0.5, 10000)}
{0: (0.55, 10000), 1: (0.7, 10000), 2: (0.5, 10000), 3: (0.55, 10000)}
```

In [31]:
print(class_params)

{0: (0.6, 9500), 1: (0.7, 9500), 2: (0.6, 10000), 3: (0.4, 9500)}
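
The search space behind these numbers is small. The sketch below rebuilds it from the values used in the configuration and the search loop, and shows how a row label in the result tables maps back to a (threshold, min size) pair.

```python
from itertools import product

thresholds = [t / 100 for t in range(30, 75, 5)]   # 0.30, 0.35, ..., 0.70
min_sizes = [8000, 8500, 9000, 9500, 10000][-3:]   # MIN_SIZE_RANGE = 3

# Same nesting as the search loop: thresholds outside, min sizes inside.
grid = list(product(thresholds, min_sizes))
print(len(thresholds), len(min_sizes), len(grid))   # 9 3 27

# The position in the grid matches the row label printed in the tables.
print(grid.index((0.6, 9500)))                      # 19
```

With only 27 candidates per class, the top rows differ in the fourth decimal place of dice, so the chosen pair is sensitive to the sample used.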


In [32]:
#sns.lineplot(x='threshold', y='dice', hue='size', data=attempts_df);
#plt.title('Threshold and min size vs dice for one of the classes');

Now let's have a look at our masks.


## Predicting

Clear unused GPU and CPU memory

In [33]:
import gc
torch.cuda.empty_cache()
gc.collect()

60

In [34]:
test_dataset = CloudDataset(df=sub, datatype='test', img_ids=test_ids, transforms = get_validation_augmentation(), preprocessing=get_preprocessing(preprocessing_fn))
test_loader = DataLoader(test_dataset, batch_size=36, shuffle=False, num_workers=0)

loaders = {"test": test_loader}

In [35]:
def predImg(batch, discount,image_id):
    """
    Post-process the 4 class maps of one image.
    Each discount step lowers the threshold by 0.05 and the min size by 800.
    image_id is a running counter over (image, class) rows, so image_id % 4 is the class.
    """
    rlist = []
    ctlist = []
    for probability in batch:
        probability = probability.cpu().detach().numpy()
        if probability.shape != (350, 525):
            probability = cv2.resize(probability, dsize=(525, 350), interpolation=cv2.INTER_LINEAR)
        th = class_params[image_id % 4][0]-discount*0.05
        ms = class_params[image_id % 4][1]-discount*800
        predict, num_predict = post_process(sigmoid(probability), 
                                            th, 
                                            ms, )
        if num_predict == 0:
            r = ''
            ct = 0
        else:
            r = mask2rle(predict)
            ct = 1
        rlist.append(r)
        ctlist.append(ct)
        image_id += 1
    return rlist,int(sum(ctlist)), discount,image_id

In [36]:
encoded_pixels = []
image_id = 0
runner.model = model
for i, test_batch in enumerate(tqdm.tqdm(loaders['test'])):
#     if i==1: break
    runner_out = runner.predict_batch({"features": test_batch[0].cuda()})['logits']
    for batch in runner_out: # for each image
        discount = 0.
        rlist,ct,discount,image_id = predImg(batch,discount,image_id)
        # If all 4 masks are empty, relax threshold and min size step by step (at most 7 times)
        while (ct==0) and (discount<7.):
            discount+=1.
            image_id -= 4 # rewind the counter to redo the same image
            rlist,ct,discount,image_id = predImg(batch,discount,image_id)
        for r in rlist:
            encoded_pixels.append(r)

100%|██████████| 103/103 [1:58:12<00:00, 68.86s/it]
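
The retry logic in the loop above (relax the settings until at least one mask survives) can be isolated as a small sketch. `relax_until_found` and `toy_predict` are illustrative names, and for brevity the sketch uses a single threshold and min size instead of the per-class values in `class_params`.

```python
def relax_until_found(predict_fn, threshold, min_size,
                      max_steps=7, threshold_step=0.05, size_step=800):
    """Retry `predict_fn` with looser settings until it returns a mask.

    `predict_fn(threshold, min_size)` is a hypothetical callable returning the
    number of non-empty class masks for one image. The step sizes and the cap
    of 7 retries follow the prediction loop above.
    """
    step = 0
    found = predict_fn(threshold, min_size)
    while found == 0 and step < max_steps:
        step += 1
        found = predict_fn(threshold - step * threshold_step,
                           min_size - step * size_step)
    return found, step


# Toy image whose only blob has 7000 pixels at probability 0.52.
def toy_predict(threshold, min_size):
    return int(0.52 > threshold and 7000 > min_size)


print(relax_until_found(toy_predict, threshold=0.6, min_size=9500))
```

In the toy case the blob only passes once both the threshold and the minimum size have been relaxed far enough, which takes four steps. An image that still has no mask after seven steps is submitted empty.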


Saving to CSV file

In [37]:
from datetime import datetime

In [38]:
sub['EncodedPixels'] = encoded_pixels
sub.to_csv('%s_submission.csv'%(datetime.now().strftime("%m%d_%H%M%S")), columns=['Image_Label', 'EncodedPixels'], index=False)