# ※ We did not use any hand-labeling in this competition
### In this notebook, we share our findings and the solution that gave us our best public LB score (this model placed 2nd on the former public LB).

# Agenda
## ・Magic 1: Head-Shot Post Processing Stage 1
## ・Head-Shot Post Processing Stage 2
## ・Magic 2: Brightness-based Preprocessing
## ・Training Information
## ・Pseudo Labeling
## ・Idea for further improvement: Rotation-based Head-Shot Stage 2

# Magic 1: Head-Shot Post Processing System
### The whole system is shown below.

In [None]:
from PIL import Image
im = Image.open('../input/headshotpp-diagram/hubmap_diagram0.png')
im

# Stage 1
### In this system, we first predict every tile of the raw image.
### If the image size is (X, Y), we predict (X//1024) * (Y//1024) tiles of 1024x1024 pixels with EfficientUnet-B5 and map them back into a single prediction for the whole TIFF. We repeat this 4 times with shifted tile grids (x-shift in {0, 512}, y-shift in {0, 512}, 2x2 = 4 in total) and add the results.
### This lets us roughly locate each glomerulus in the TIFF image, as shown below.

In [None]:
im = Image.open('../input/stage-1/stage_1.png')
im
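The shifted tiling of Stage 1 can be sketched as follows. This is a minimal NumPy-only illustration that follows the description above rather than reproducing the inference code line by line. `predict_tile` is a hypothetical stand-in for the EfficientUnet-B5 ensemble, the helper names `shifted_windows` and `stage1_predict` are made up for this example, and averaging by a per-pixel hit count is just one straightforward way to combine the four passes.

```python
import numpy as np


def shifted_windows(shape, tile=1024, shifts=(0, 512)):
    """Yield (x1, x2, y1, y2) for every full tile of every shifted grid."""
    height, width = shape
    for sx in shifts:          # x-shift: 0 or 512
        for sy in shifts:      # y-shift: 0 or 512 (2 x 2 = 4 passes in total)
            for x1 in range(sx, height - tile + 1, tile):
                for y1 in range(sy, width - tile + 1, tile):
                    yield x1, x1 + tile, y1, y1 + tile


def stage1_predict(image, predict_tile, tile=1024, shifts=(0, 512)):
    """Average tile predictions from all shifted grids into one full-size map."""
    height, width = image.shape[:2]
    prob_sum = np.zeros((height, width), dtype=np.float32)
    hits = np.zeros((height, width), dtype=np.float32)
    for x1, x2, y1, y2 in shifted_windows((height, width), tile, shifts):
        # Assumption: predict_tile returns a (tile, tile) probability map.
        prob_sum[x1:x2, y1:y2] += predict_tile(image[x1:x2, y1:y2])
        hits[x1:x2, y1:y2] += 1
    # Pixels never covered by a full tile keep probability 0.
    return np.divide(prob_sum, hits, out=np.zeros_like(prob_sum), where=hits > 0)


# Toy usage: a 12 x 8 "slide", 4 x 4 tiles shifted by 2, and a dummy model
# that predicts 0.5 everywhere.
toy_image = np.zeros((12, 8, 3), dtype=np.uint8)
toy_windows = list(shifted_windows(toy_image.shape[:2], tile=4, shifts=(0, 2)))
toy_prob = stage1_predict(
    toy_image,
    lambda t: np.full(t.shape[:2], 0.5, dtype=np.float32),
    tile=4,
    shifts=(0, 2),
)
```

Because the shifted grids place tile borders in different positions, a glomerulus that is cut by a border in one pass lies inside a tile in another pass.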

# Stage 2
### Next, we extract the center position of each glomerulus using a pseudo-morphological transformation.
### We then create a tile centered on each glomerulus and predict that tile again.
### This system gave us a significant improvement (+0.005 or more) on the previous public LB (before it was updated).

In [None]:
im = Image.open('../input/stage2/stage2.png')
im
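The centering step of Stage 2 can be illustrated with a small sketch. The inference code later in this notebook finds centers with `cv2.findContours` on a downscaled mask; the version below swaps in a dependency-free flood fill purely for illustration, so the centers are blob centroids rather than contour means. `component_centers` and `centered_window` are hypothetical helper names.

```python
import numpy as np


def component_centers(mask):
    """Return the (row, col) centroid of every 4-connected blob in a binary mask."""
    mask = np.asarray(mask).astype(bool)
    seen = np.zeros_like(mask, dtype=bool)
    centers = []
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        # Flood fill one blob (stand-in for cv2.findContours in the real code).
        stack, pixels = [(r0, c0)], []
        seen[r0, c0] = True
        while stack:
            r, c = stack.pop()
            pixels.append((r, c))
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                inside = 0 <= nr < mask.shape[0] and 0 <= nc < mask.shape[1]
                if inside and mask[nr, nc] and not seen[nr, nc]:
                    seen[nr, nc] = True
                    stack.append((nr, nc))
        rows, cols = zip(*pixels)
        centers.append((int(np.mean(rows)), int(np.mean(cols))))
    return centers


def centered_window(center, shape, half=512):
    """Clamp a (2 * half)-sized window around `center` so it stays inside the image."""
    c0 = min(max(center[0], half), shape[0] - half)
    c1 = min(max(center[1], half), shape[1] - half)
    return c0 - half, c0 + half, c1 - half, c1 + half


# Toy usage: two blobs on a small mask, windows of size 8 (half=4).
toy_mask = np.zeros((20, 20), dtype=np.uint8)
toy_mask[1:4, 1:4] = 1      # blob near the top-left corner
toy_mask[10:15, 12:17] = 1  # blob in the interior
toy_centers = component_centers(toy_mask)
toy_windows = [centered_window(c, toy_mask.shape, half=4) for c in toy_centers]
```

The window around the corner blob is pushed back inside the image, while the interior blob sits exactly in the middle of its window. In the inference code, only the central part of each Stage 2 prediction (a 128-pixel margin is dropped on every side) is written back to the full-size mask.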

### Adding Stage 2 fixes the odd predictions that arise because the Stage 1 tile grid is placed without regard to where the glomeruli are.

In [None]:
im = Image.open('../input/pseudo-c/pseudo_comparison.png')
im

# Magic 2: Brightness-based Preprocessing
### The image below illustrates this preprocessing.

In [None]:
im = Image.open('../input/bright-01/bright_01.png')
im

### When we checked the model predictions, we found two types of areas that are really hard to predict: dark regions and bright regions, as shown below.

In [None]:
im = Image.open('../input/bright-dark/bd_00.png')
im

### We therefore built classifiers that judge whether a tile is too bright or too dark, based on the per-channel (R, G, B) mean and standard deviation of the tile, and then made bright tiles darker and dark tiles brighter.
### The classifier predictions are shown below (black areas are predicted to be too bright or too dark).
### This preprocessing gave us a huge jump (+0.015 or more) on the previous public LB.

In [None]:
im = Image.open('../input/bd-cls/bd_cls.png')
im
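The rule-based classifier and the brightness correction can be condensed into two small functions. The thresholds and scaling factors are copied from the inference code later in this notebook; the function names `classify_tile` and `adjust_brightness` are hypothetical and only serve this sketch.

```python
import numpy as np


def classify_tile(image):
    """Return (dark, light) flags from the per-channel mean and std of an RGB tile."""
    pixels = np.asarray(image, dtype=np.float64).reshape(-1, 3)
    m, st = pixels.mean(0), pixels.std(0)
    dark = bool(m.mean() < 100)
    light = bool(
        195 < m[0] < 215 and 160 < m[1] < 205 and 185 < m[2] < 205
        and 8 < st[0] < 20 and 13 < st[1] < 25 and 8 < st[2] < 20
    )
    return dark, light


def adjust_brightness(image):
    """Darken tiles flagged as too bright and brighten tiles flagged as too dark."""
    image = np.asarray(image, dtype=np.float32)
    dark, light = classify_tile(image)
    if light:
        image = (image - 100.0) * 1.2
    if dark:
        image = np.clip(image * 2.5, 0, 255)
    return image, dark, light


# Toy tiles: one uniformly dark, one whose statistics fall inside the "bright" ranges
# (channel means 205/180/195, channel stds 10/15/10).
dark_tile = np.full((4, 4, 3), 40, dtype=np.uint8)
bright_tile = np.empty((4, 4, 3), dtype=np.uint8)
bright_tile[:2] = (195, 165, 185)
bright_tile[2:] = (215, 195, 205)

dark_out, dark_flag, _ = adjust_brightness(dark_tile)
bright_out, _, bright_flag = adjust_brightness(bright_tile)
```

The flags are returned together with the corrected tile because the inference code also uses them to pick a different binarization threshold for bright, dark, and normal tiles.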

### Below are the predictions before and after this preprocessing.

In [None]:
im = Image.open('../input/prep-ba/prep_before_after.png')
im

### Below is the inference code (Brightness Preprocessing + Head-Shot Post Processing).

In [None]:
'''
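import glob
import gc
import pathlib

import albumentations
import cv2
import numpy as np
import rasterio
from rasterio.windows import Window
from tqdm import tqdm

# fold_models, mean, std, THRESHOLD, BTH, DTH, S2_TH, make_grid and
# rle_encode_less_memory are assumed to be defined in earlier cells.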

identity = rasterio.Affine(1, 0, 0, 0, 1, 0)

p = pathlib.Path('../input/')
subm = {}
test_transform = albumentations.Compose([])

for i, filename in enumerate(p.glob('test/*.tiff')):
    
    print(f'{i+1} Predicting {filename.stem}')
    
    dataset = rasterio.open(filename.as_posix(), transform=identity)
    slices = make_grid(dataset.shape, window=1024)
    preds = np.zeros(dataset.shape, dtype=np.uint8)
    shape = dataset.shape
    
    for (x1, x2, y1, y2) in tqdm(slices):
        
        # shifted ensemble
        shift_x = [-1, 1]
        shift_y = [-1, 1]
        flags_x = [True, True]
        flags_y = [True, True]
        
        if x1//1024 == 0:
            flags_x[0] = False
        if y1//1024 == 0:
            flags_y[0] = False
        if x2//1024 == shape[0]//1024:
            flags_x[1] = False
        if y2//1024 == shape[1]//1024:
            flags_y[1] = False
        
        pred = np.zeros((1024, 1024)).astype(float)
        devide = np.ones((1024, 1024)).astype(float)
        
        raw_image = dataset.read([1, 2, 3], window=Window.from_slices((x1,x2),(y1,y2)))
        raw_image = np.moveaxis(raw_image, 0, -1)
        
        image = cv2.resize(raw_image, (512, 512), interpolation=cv2.INTER_AREA)
        image_mean = image.mean(-1)
        
        if ((image_mean==0).sum()>1000):
            continue
        
        image = image.astype(np.float32)
        
        # Dark-Bright CLS
        
        m = image.mean(0).mean(0)
        st = image.reshape(-1, 3).std(0)
        
        dark = m.mean()<100
        light = (((195<m[0])&(m[0]<215))&((160<m[1])&(m[1]<205))&((185<m[2])&(m[2]<205)))&(((8<st[0])&(st[0]<20))&((13<st[1])&(st[1]<25))&((8<st[2])&(st[2]<20)))
        
        if light:
            image = (image - 100.) * 1.2
        if dark:
            image = np.clip((image * 2.5), 0, 255)
        
        image = (image/255.0 - mean) / std
        image = np.expand_dims(image, 0)
        
        for fold_model in fold_models:
            pred += cv2.resize(fold_model.predict(image).reshape(512, 512), (1024, 1024)) / len(fold_models)
            
        if light:
            preds[x1:x2, y1:y2] += (pred > BTH).astype(np.uint8)
        if dark:
            preds[x1:x2, y1:y2] += (pred > DTH).astype(np.uint8)
        if (not light) & (not dark):
            preds[x1:x2, y1:y2] += (pred > THRESHOLD).astype(np.uint8)
        
        
        pred = np.zeros((1024, 1024)).astype(float)
        
        
        for n_, f1 in enumerate(flags_x):
            
            if f1==True:
                if n_==0:
                    raw_image = dataset.read([1, 2, 3], window=Window.from_slices((x1-512, x2-512), (y1, y2)))
                if n_==1:
                    raw_image = dataset.read([1, 2, 3], window=Window.from_slices((x1+512, x2+512), (y1, y2)))
                raw_image = np.moveaxis(raw_image, 0, -1)
                
                image = cv2.resize(raw_image, (512, 512), interpolation=cv2.INTER_AREA)
                image_mean = image.mean(-1)

                if ((image_mean==0).sum()>1000):
                    continue

                image = image.astype(np.float32)
                
                # Dark-Bright CLS

                m = image.mean(0).mean(0)
                st = image.reshape(-1, 3).std(0)

                dark = m.mean()<100
                light = (((195<m[0])&(m[0]<215))&((160<m[1])&(m[1]<205))&((185<m[2])&(m[2]<205)))&(((8<st[0])&(st[0]<20))&((13<st[1])&(st[1]<25))&((8<st[2])&(st[2]<20)))

                if light:
                    image = (image - 100.) * 1.2
                if dark:
                    image = np.clip((image * 2.5), 0, 255)

                image = (image/255.0 - mean) / std
                image = np.expand_dims(image, 0)
                
                for fold_model in fold_models:
                    
                    if f1 & (n_==0):
                        pred[:512, :] += (cv2.resize(fold_model.predict(image).reshape(512, 512), (1024, 1024)) / len(fold_models))[512:, :]
                        devide[:512, :] += 1
                    if f1 & (n_==1):
                        pred[512:, :] += (cv2.resize(fold_model.predict(image).reshape(512, 512), (1024, 1024)) / len(fold_models))[:512, :]
                        devide[512:, :] += 1

        if light:
            preds[x1:x2, y1:y2] += (pred > BTH).astype(np.uint8)
        if dark:
            preds[x1:x2, y1:y2] += (pred > DTH).astype(np.uint8)
        if (not light) & (not dark):
            preds[x1:x2, y1:y2] += (pred > THRESHOLD).astype(np.uint8)

            
        pred = np.zeros((1024, 1024)).astype(float)
                
        for n_, f1 in enumerate(flags_y):
            
            if f1==True:
                if n_==0:
                    raw_image = dataset.read([1, 2, 3], window=Window.from_slices((x1, x2),(y1-512, y2-512)))
                if n_==1:
                    raw_image = dataset.read([1, 2, 3], window=Window.from_slices((x1, x2),(y1+512, y2+512)))
                raw_image = np.moveaxis(raw_image, 0, -1)
                
                image = cv2.resize(raw_image, (512, 512), interpolation=cv2.INTER_AREA)
                image_mean = image.mean(-1)

                if ((image_mean==0).sum()>1000):
                    continue

                image = image.astype(np.float32)
                
                # Dark-Bright CLS

                m = image.mean(0).mean(0)
                st = image.reshape(-1,3).std(0)

                dark = m.mean()<100
                light = (((195<m[0])&(m[0]<215))&((160<m[1])&(m[1]<205))&((185<m[2])&(m[2]<205)))&(((8<st[0])&(st[0]<20))&((13<st[1])&(st[1]<25))&((8<st[2])&(st[2]<20)))

                if light:
                    image = (image - 100.)*1.2
                if dark:
                    image = np.clip((image * 2.5), 0, 255)

                image = (image/255.0 - mean) / std
                image = np.expand_dims(image, 0)
                
                for fold_model in fold_models:
                    
                    if f1 & (n_==0):
                        pred[:, :512] += (cv2.resize(fold_model.predict(image).reshape(512, 512), (1024, 1024)) / len(fold_models))[:, 512:]
                        devide[:, :512] += 1
                    if f1 & (n_==1):
                        pred[:, 512:] += (cv2.resize(fold_model.predict(image).reshape(512, 512),(1024, 1024)) / len(fold_models))[:, :512]
                        devide[:, 512:] += 1
                
        if light:
            preds[x1:x2, y1:y2] += (pred > BTH).astype(np.uint8)
        if dark:
            preds[x1:x2, y1:y2] += (pred > DTH).astype(np.uint8)
        if (not light) & (not dark):
            preds[x1:x2, y1:y2] += (pred > THRESHOLD).astype(np.uint8)
            
            
    # headshot post processing
    
    scale_factor = 8
    preds_small = cv2.resize(preds, (shape[1]//scale_factor, shape[0]//scale_factor))
    preds = np.zeros(dataset.shape, dtype=np.uint8)
    ret, preds_small= cv2.threshold((preds_small.astype(int) * 255.).astype(np.uint8), 127, 255, 0)
    contours, hierarchy = cv2.findContours(preds_small, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    
    centers = []
    for j, c in enumerate(contours):
        center = c.reshape(-1, 2).mean(0)
        centers.append([int(center[1]*scale_factor), int(center[0]*scale_factor)])
        
    for c0, c1 in tqdm(centers):
        
        # Clamp the centre so the 1024x1024 window stays inside the image.
        if c0 < 512:
            c0 = 512
        if shape[0] < c0 + 512:
            c0 = shape[0] - 512
        if c1 < 512:
            c1 = 512
        if shape[1] < c1 + 512:
            c1 = shape[1] - 512
        
        x1, y1, x2, y2 = c0-512, c1-512, c0+512, c1+512
        
        pred = np.zeros((1024, 1024)).astype(float)
        pred_2 = np.zeros((1024, 1024)).astype(float)
        raw_image = dataset.read([1, 2, 3], window=Window.from_slices((x1, x2), (y1, y2)))
        raw_image = np.moveaxis(raw_image, 0, -1)
        
        image = cv2.resize(raw_image, (512, 512), interpolation=cv2.INTER_AREA)
        image_mean = image.mean(-1)
        
        if ((image_mean==0).sum()>1000):
            continue
        
        image = image.astype(np.float32)
        
        # Dark-Bright CLS
        
        m = image.mean(0).mean(0)
        st = image.reshape(-1, 3).std(0)
        
        dark = m.mean()<100
        light = (((195<m[0])&(m[0]<215))&((160<m[1])&(m[1]<205))&((185<m[2])&(m[2]<205)))&(((8<st[0])&(st[0]<20))&((13<st[1])&(st[1]<25))&((8<st[2])&(st[2]<20)))
        
        if light:
            image = (image - 100.) * 1.2
        if dark:
            image = np.clip((image * 2.5), 0, 255)
        
        image = (image/255.0 - mean) / std
        image = np.expand_dims(image, 0)
        
        for fold_model in fold_models:
            pred += cv2.resize(fold_model.predict(image).reshape(512, 512), (1024, 1024)) / len(fold_models)
            
        if light:
            pred = (pred > BTH).astype(np.uint8)
            #pred_2 = (pred_2 > BTH).astype(np.uint8)
        if dark:
            pred = (pred > DTH).astype(np.uint8)
            #pred_2 = (pred_2 > DTH).astype(np.uint8)
        if (not light) & (not dark):
            pred = (pred > S2_TH).astype(np.uint8)
            #pred_2 = (pred_2 > S2_TH).astype(np.uint8)
            
        preds[x1+128:x2-128, y1+128:y2-128] += pred[128:-128, 128:-128]
        
    # clip duplicate
    preds = np.clip(preds, 0, 1)
    
    subm[i] = {'id':filename.stem, 'predicted': rle_encode_less_memory(preds)}
    del preds
    gc.collect()
'''
0
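The loop above ends by calling `rle_encode_less_memory`, which is not defined in this notebook. As a rough guide only, a run-length encoder of the kind commonly used for segmentation submissions might look like the sketch below. The column-major pixel order and 1-indexed starts are assumptions about the expected format, and `rle_encode` is a hypothetical stand-in, not the actual helper.

```python
import numpy as np


def rle_encode(mask):
    """Encode a binary mask as 'start length' pairs.

    Assumptions: pixels are numbered column by column (column-major)
    and the first pixel has index 1.
    """
    pixels = np.asarray(mask, dtype=np.uint8).T.flatten()
    # Pad with zeros so runs touching the first or last pixel are closed.
    padded = np.concatenate([[0], pixels, [0]])
    changes = np.where(padded[1:] != padded[:-1])[0] + 1
    starts, ends = changes[::2], changes[1::2]
    return " ".join(f"{s} {e - s}" for s, e in zip(starts, ends))


toy = np.array([[0, 1, 0],
                [1, 1, 0],
                [0, 0, 1]], dtype=np.uint8)
encoded = rle_encode(toy)
```

Read column by column, the toy mask is `0 1 0 | 1 1 0 | 0 0 1`, so the runs start at pixels 2, 4 and 9.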

# Training Information

## Model
### TF EfficientNet-B5 U-Net
### For every model we tried, the TF version was consistently better than the PyTorch version (average +0.04~6).

## Augmentation
### For training, we applied the following data augmentations:
### ・Vertical/Horizontal flip
### ・Converting a small percentage of images to grayscale
### ・Random saturation
### ・Random contrast
### ・Brightness augmentation
### Together they gave us about +0.001~0.005 on the LB.
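The augmentations listed above can be sketched with plain NumPy. This is an illustration only, not the training pipeline itself: the probabilities and value ranges (`p_gray`, the 0.8 to 1.2 factors, the ±25 brightness offset) are placeholder assumptions, and `augment` is a hypothetical helper name.

```python
import numpy as np


def augment(image, mask, rng, p_gray=0.1):
    """Apply flips, occasional grayscale, and random saturation/contrast/brightness."""
    image = image.astype(np.float32)
    # Geometric transforms must be applied to the image and the mask together.
    if rng.random() < 0.5:
        image, mask = image[::-1], mask[::-1]        # vertical flip
    if rng.random() < 0.5:
        image, mask = image[:, ::-1], mask[:, ::-1]  # horizontal flip
    gray = image.mean(axis=-1, keepdims=True)
    if rng.random() < p_gray:
        image = np.repeat(gray, 3, axis=-1)          # grayscale for a small share of images
    else:
        image = gray + (image - gray) * rng.uniform(0.8, 1.2)  # random saturation
    mean = image.mean()
    image = mean + (image - mean) * rng.uniform(0.8, 1.2)      # random contrast
    image = image + rng.uniform(-25, 25)                       # brightness
    return np.clip(image, 0, 255).astype(np.uint8), np.ascontiguousarray(mask)


# Toy usage on a random 8 x 8 tile with a matching binary mask.
rng = np.random.default_rng(0)
toy_image = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
toy_mask = (toy_image[..., 0] > 127).astype(np.uint8)
aug_image, aug_mask = augment(toy_image, toy_mask, rng)
```

Only the flips touch the mask; the color augmentations change pixel values but not the position of the glomeruli.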

# Pseudo Labeling
### We used hard-label pseudo labeling: we threshold the predicted masks for the public test set, add them to the training dataset, and train on the combined data.
### Instead of hand-labeling, once all fold models were trained, we created pseudo labels for the public test dataset. These pseudo labels gave us a small jump on the public LB.

In [None]:
im = Image.open('../input/pseudo-00/pseudo_00.png')
im.resize((int(321*1.5),int(261*1.5)))
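In code, hard-label pseudo labeling amounts to thresholding the ensemble predictions and appending the result to the training data. The sketch below uses in-memory lists and a threshold of 0.5 as assumptions; `make_pseudo_labels` and `extend_training_set` are hypothetical helper names.

```python
import numpy as np


def make_pseudo_labels(prob_maps, threshold=0.5):
    """Turn ensemble probability maps into hard 0/1 masks."""
    return [(p > threshold).astype(np.uint8) for p in prob_maps]


def extend_training_set(train_images, train_masks, test_images, test_probs, threshold=0.5):
    """Append pseudo-labelled public test tiles to the labelled training data."""
    pseudo_masks = make_pseudo_labels(test_probs, threshold)
    images = list(train_images) + list(test_images)
    masks = list(train_masks) + pseudo_masks
    return images, masks


# Toy usage with 2 labelled tiles and 1 public test tile.
train_images = [np.zeros((2, 2, 3), np.uint8), np.ones((2, 2, 3), np.uint8)]
train_masks = [np.zeros((2, 2), np.uint8), np.ones((2, 2), np.uint8)]
test_images = [np.full((2, 2, 3), 7, np.uint8)]
test_probs = [np.array([[0.1, 0.9], [0.6, 0.4]])]  # averaged fold predictions
all_images, all_masks = extend_training_set(train_images, train_masks, test_images, test_probs)
```

The models are then retrained on `all_images` and `all_masks` exactly as they were on the original training data.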

# ・Idea for further improvement: Rotation-based Head-Shot Stage 2
### We could not include this in our final submission because of inference time, but we found it to be an effective test-time augmentation.
### For Stage 2 of the Head-Shot prediction, we rotate each tile by 90 x N degrees (N = 0, 1, 2, 3), predict each rotated tile with the model, and rotate the predictions back so that they align with the unrotated tile.

In [None]:
im = Image.open('../input/rotation-00/rotation_00.png')
im
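A minimal sketch of this rotation-based test-time augmentation, assuming square tiles and a hypothetical `predict` callable that maps an (H, W, 3) tile to an (H, W) probability map:

```python
import numpy as np


def rot90_tta(tile, predict):
    """Average predictions over 0/90/180/270-degree rotations of a square tile."""
    total = np.zeros(tile.shape[:2], dtype=np.float32)
    for k in range(4):
        rotated = np.rot90(tile, k, axes=(0, 1))
        pred = predict(rotated)
        # Rotate the prediction back so it aligns with the unrotated tile.
        total += np.rot90(pred, -k, axes=(0, 1))
    return total / 4.0


# Toy check with a "model" that simply returns the first channel:
# after rotating back, every pass must line up with the original tile.
toy_tile = np.arange(16 * 3, dtype=np.float32).reshape(4, 4, 3)
toy_pred = rot90_tta(toy_tile, lambda t: t[..., 0])
```

The four forward passes are what make this expensive: Stage 2 inference time grows by roughly a factor of four.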

# Conclusion
## In this notebook, we presented our "magic" tricks and solutions for this competition.
## We hope you enjoyed our solution.