# Hacking the Kidney: Inference of Unet model

References:

- [256x256 images](https://www.kaggle.com/iafoss/256x256-images)

- [Making a successful submission](https://www.kaggle.com/igor14497/making-a-successful-submission)

- [HuBMAP: TF with TPU EfficientUNet 512x512[subm]](https://www.kaggle.com/wrrosa/hubmap-tf-with-tpu-efficientunet-512x512-subm)

- [Global Mask Shift](https://www.kaggle.com/tivfrvqhs5/global-mask-shift)

Many thanks to the authors of those notebooks.

This notebook runs inference for a Unet model with an EfficientNet backbone (ensembled here with an FPN model), built with [this library](https://github.com/qubvel/segmentation_models). The code for training the models is [here](https://github.com/vgarshin/kaggle_kidney/blob/master/kidney_train.ipynb).

In [1]:
!pip install ../input/keras-applications/Keras_Applications-1.0.8/ -f ./ --no-index
!pip install ../input/image-classifiers/image_classifiers-1.0.0/ -f ./ --no-index
!pip install ../input/efficientnet-1-0-0/efficientnet-1.0.0/ -f ./ --no-index
!pip install ../input/segmentation-models/segmentation_models-1.0.1/ -f ./ --no-index

Looking in links: ./
Processing /kaggle/input/keras-applications/Keras_Applications-1.0.8
Building wheels for collected packages: Keras-Applications
  Building wheel for Keras-Applications (setup.py) ... done
  Created wheel for Keras-Applications: filename=Keras_Applications-1.0.8-py3-none-any.whl size=50704 sha256=1f68372b925206b4b23a7ef591c3bc2fb325fa97c5784348b1fef1c88d95f382
  Stored in directory: /root/.cache/pip/wheels/b4/02/72/232b9e664a461886108d09707ad8804690edba42586c42afec
Successfully built Keras-Applications
Installing collected packages: Keras-Applications
Successfully installed Keras-Applications-1.0.8
Looking in links: ./
Processing /kaggle/input/image-classifiers/image_classifiers-1.0.0
Building wheels for collected packages: image-classifiers
  Building wheel for image-classifiers (setup.py) ... done
  Created wheel for image-classifiers: filename=image_classifiers-1.0.0-py3-none-any.whl size=19950 sha256=f8b5bb54b

In [2]:
%env SM_FRAMEWORK=tf.keras

env: SM_FRAMEWORK=tf.keras


In [3]:
import warnings
warnings.filterwarnings('ignore')
import os
import gc
import cv2
import sys
import json
import time
import pickle
import shutil
import numba
import numpy as np
import pandas as pd 
import tifffile as tiff
import rasterio
from rasterio.windows import Window
import tensorflow as tf
import tensorflow_addons as tfa
import matplotlib.pyplot as plt
import tensorflow.keras.backend as K
from tensorflow.keras import Model, Sequential
from tensorflow.keras.models import load_model
from tensorflow.keras.utils import Sequence
from tensorflow.keras.losses import binary_crossentropy
from tensorflow.keras.layers import *
from tensorflow.keras.optimizers import Adam, SGD
from tensorflow.keras.callbacks import *
import segmentation_models as sm
from segmentation_models import Unet, FPN, Linknet
from segmentation_models.losses import bce_jaccard_loss
from tqdm import tqdm
print('tensorflow version:', tf.__version__)
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
gpu_devices = tf.config.experimental.list_physical_devices('GPU')
if gpu_devices:
    for gpu_device in gpu_devices:
        print('device available:', gpu_device)
pd.set_option('display.max_columns', None)

Segmentation Models: using `tf.keras` framework.
tensorflow version: 2.3.1
device available: PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')


In [4]:
'../input/nuetb1-model'
'../input/linkb2-model'
'../input/fpnb0-0234'

'../input/fpnb0-0234'

In [5]:
TEST = True
KAGGLE = True
MDLS_FOLDS = {'nuetb1-model':[0,4], 'fpnb0-0234':[0]}
if KAGGLE:
    DATA_PATH = '../input/hubmap-kidney-segmentation'
    MDLS_PATHS = {ver: f'../input/{ver}' 
                  for ver, _ in MDLS_FOLDS.items()}
else:
    DATA_PATH = './data2'
    MDLS_PATHS = {ver: f'./models_{ver}' 
                  for ver, _ in MDLS_FOLDS.items()}
THRESHOLD = .4
VOTERS = 1
TTAS = [0, 1, 2]
EXPAND = 4
MIN_OVERLAP = 1 / 16
IDNT = rasterio.Affine(1, 0, 0, 0, 1, 0)
STRATEGY = tf.distribute.get_strategy() 
SUB_PATH = f'{DATA_PATH}/test' if TEST else f'{DATA_PATH}/train'

start_time = time.time()
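
`THRESHOLD` and `VOTERS` are applied per tile in the inference loop further down: each tile prediction is binarised and added to a vote counter, and a pixel is kept if it collects at least `VOTERS` votes. The following is a minimal sketch of that idea on a toy 1-D "image"; the tile values and the `tiles` list are made up for illustration and are not taken from the notebook's data.

```python
import numpy as np

THRESHOLD = 0.4  # same value as in the config cell
VOTERS = 1       # minimum number of positive tile votes per pixel

# Toy 1-D "image" of 6 pixels covered by two overlapping tiles (pixels 2-3 overlap).
votes = np.zeros(6, dtype=np.uint8)
tiles = [
    (slice(0, 4), np.array([0.9, 0.5, 0.7, 0.3], dtype=np.float32)),
    (slice(2, 6), np.array([0.6, 0.2, 0.8, 0.1], dtype=np.float32)),
]
for window, pred in tiles:
    # Each tile casts a 0/1 vote for every pixel it covers.
    votes[window] += (pred > THRESHOLD).astype(np.uint8)

mask = (votes >= VOTERS).astype(np.uint8)
print(votes)  # [1 1 2 0 1 0]
print(mask)   # [1 1 1 0 1 0]
```

With `VOTERS = 1` the final mask is effectively the union of the per-tile masks; a higher value would require agreement between overlapping tiles.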

In [6]:
params_dict = {}
for ver, _ in MDLS_FOLDS.items():
    with open(f'{MDLS_PATHS[ver]}/params.json') as file:
        params_dict[ver] = json.load(file)
for ver, params in params_dict.items():
    print('version:', ver, '| loaded params:', params, '\n')

version: nuetb1-model | loaded params: {'version': 'version274', 'folds': 5, 'img_size': 256, 'resize': 4, 'batch_size': 24, 'epochs': 115, 'patience': 20, 'decay': False, 'backbone': 'efficientnetb1', 'bce_weight': 1, 'loss': 'bce_jaccard_loss', 'seed': 234234, 'split': 'kfold', 'mirror': False, 'aughard': True, 'umodel': 'unet', 'pseudo': False, 'lr': 0.0005, 'shift': True, 'external': 'ext', 'comments': 'new data'} 

version: fpnb0-0234 | loaded params: {'version': 'version274', 'folds': 5, 'img_size': 256, 'resize': 4, 'batch_size': 24, 'epochs': 115, 'patience': 20, 'decay': False, 'backbone': 'efficientnetb0', 'bce_weight': 1, 'loss': 'bce_jaccard_loss', 'seed': 234234, 'split': 'kfold', 'mirror': False, 'aughard': True, 'umodel': 'fpn', 'pseudo': False, 'lr': 0.0005, 'shift': True, 'external': 'ext', 'comments': 'new data'} 

In [7]:
def enc2mask(encs, shape):
    img = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for m, enc in enumerate(encs):
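        # `np.float` was removed in recent NumPy releases; the builtin `float` is equivalent here.
        if isinstance(enc, float) and np.isnan(enc):
            continue  # skip missing (NaN) encodings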
        s = enc.split()
        for i in range(len(s) // 2):
            start = int(s[2 * i]) - 1
            length = int(s[2 * i + 1])
            img[start : start + length] = 1 + m
    return img.reshape(shape).T

def rle_encode_less_memory(img):
    pixels = img.T.flatten()
    pixels[0] = 0
    pixels[-1] = 0
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 2
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)
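
To make the run-length format concrete, here is a small round-trip sketch. `rle_encode` repeats the logic of `rle_encode_less_memory` above; `rle_decode` is a simplified, hypothetical single-mask counterpart of `enc2mask` that takes `shape` as `(height, width)`. Runs are 1-based and follow column-major pixel order.

```python
import numpy as np

def rle_encode(img):
    # Column-major flattening with 1-based run starts, as in rle_encode_less_memory.
    pixels = img.T.flatten()  # flatten() returns a copy, so img is not modified
    pixels[0] = 0
    pixels[-1] = 0
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 2
    runs[1::2] -= runs[::2]   # turn run ends into run lengths
    return ' '.join(str(x) for x in runs)

def rle_decode(enc, shape):
    # Hypothetical inverse for a single binary mask; shape is (height, width).
    flat = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    s = enc.split()
    for start, length in zip(s[0::2], s[1::2]):
        begin = int(start) - 1
        flat[begin : begin + int(length)] = 1
    return flat.reshape(shape[1], shape[0]).T

mask = np.zeros((4, 5), dtype=np.uint8)
mask[1:3, 1:3] = 1            # a 2x2 square
enc = rle_encode(mask)
print(enc)                    # 6 2 10 2
print(np.array_equal(rle_decode(enc, mask.shape), mask))  # True
```

Note that the encoder forces the first and last pixels to zero, so a mask touching those two corner pixels would not round-trip exactly.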

In [8]:
def dice_coef(y_true, y_pred, smooth=1):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2 * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_loss(y_true, y_pred, smooth=1):
    return (1 - dice_coef(y_true, y_pred, smooth))

def bce_dice_loss(y_true, y_pred):
    # Relies on the global `params` dict left over from the params-loading loop.
    return params['bce_weight'] * binary_crossentropy(y_true, y_pred) + \
        (1 - params['bce_weight']) * dice_loss(y_true, y_pred)
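
# Directory with the pretrained encoder weights
EFF = '../input/efficientnetb0b7-keras-weights'

def get_model(backbone, input_shape, path, 
              loss_type='bce_dice', umodel='unet', 
              classes=1, lr=.001):
    # NOTE: the `path` argument is overridden here, so encoder weights
    # are always read from EFF regardless of what the caller passes.
    path = EFF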
    if backbone == 'efficientnetb0':
        weights = f'{path}/efficientnet-b0_weights_tf_dim_ordering_tf_kernels_autoaugment_notop.h5'
    elif backbone == 'efficientnetb1':
        weights = f'{path}/efficientnet-b1_weights_tf_dim_ordering_tf_kernels_autoaugment_notop.h5'
    elif backbone == 'efficientnetb2':
        weights = f'{path}/efficientnet-b2_weights_tf_dim_ordering_tf_kernels_autoaugment_notop.h5'
    elif backbone == 'efficientnetb3':
        weights = f'{path}/efficientnet-b3_weights_tf_dim_ordering_tf_kernels_autoaugment_notop.h5'
    elif backbone == 'resnet34':
        weights = f'{path}/resnet34_imagenet_1000_no_top.h5'
    else:
        raise AttributeError('mode parameter error')
    with STRATEGY.scope():
        if loss_type == 'bce_dice': 
            loss = bce_dice_loss
        elif loss_type == 'bce_jaccard_loss':
            loss = bce_jaccard_loss
        else:
            raise AttributeError('loss mode parameter error')
        if umodel == 'unet':
            model = Unet(backbone_name=backbone, encoder_weights=weights,
                         input_shape=input_shape,
                         classes=classes, activation='sigmoid')
        elif umodel == 'fpn':
            model = FPN(backbone_name=backbone, encoder_weights=weights,
                        input_shape=input_shape,
                        classes=classes, activation='sigmoid')
        elif umodel == 'link':
            model = Linknet(backbone_name=backbone, encoder_weights=weights,
                            input_shape=input_shape,
                            classes=classes, activation='sigmoid')
        else:
            raise AttributeError('umodel mode parameter error')
        model.compile(
            optimizer=tfa.optimizers.Lookahead(
                tf.keras.optimizers.Adam(learning_rate=lr)
            ),
            loss=loss, 
            metrics=[dice_coef]
        )
    return model
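
As a quick sanity check of the Dice formula used in `dice_coef`, here is a NumPy re-statement with a worked example. `dice_coef_np` is an illustrative helper, not part of the notebook; it mirrors the Keras-backend version term by term.

```python
import numpy as np

def dice_coef_np(y_true, y_pred, smooth=1):
    # Same formula as dice_coef above, written with NumPy instead of the Keras backend.
    y_true_f = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred_f = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true_f * y_pred_f)
    return (2 * intersection + smooth) / (y_true_f.sum() + y_pred_f.sum() + smooth)

y_true = np.array([[1, 1], [0, 0]])
y_pred = np.array([[1, 0], [0, 0]])

# intersection = 1, sum(y_true) = 2, sum(y_pred) = 1
print(dice_coef_np(y_true, y_pred))   # (2*1 + 1) / (2 + 1 + 1) = 0.75
print(dice_coef_np(y_true, y_true))   # (2*2 + 1) / (2 + 2 + 1) = 1.0

empty = np.zeros((2, 2))
print(dice_coef_np(empty, empty))     # 1.0: the smooth term avoids 0/0
```

The `smooth` term keeps the ratio defined when both masks are empty, which is common for tiles without any glomeruli.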

In [9]:
def make_grid(shape, window=256, min_overlap=32):
    x, y = shape
    nx = x // (window - min_overlap) + 1
    x1 = np.linspace(0, x, num=nx, endpoint=False, dtype=np.int64)
    x1[-1] = x - window
    x2 = (x1 + window).clip(0, x)
    ny = y // (window - min_overlap) + 1
    y1 = np.linspace(0, y, num=ny, endpoint=False, dtype=np.int64)
    y1[-1] = y - window
    y2 = (y1 + window).clip(0, y)
    slices = np.zeros((nx, ny, 4), dtype=np.int64) 
    for i in range(nx):
        for j in range(ny):
            slices[i, j] = x1[i], x2[i], y1[j], y2[j]    
    return slices.reshape(nx * ny, 4)

def flip(img, axis=0):
    if axis == 1:
        return img[::-1, :, ]
    elif axis == 2:
        return img[:, ::-1, ]
    elif axis == 3:
        return img[::-1, ::-1, ]
    else:
        return img
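
The `flip` helper is used for test-time augmentation (TTA): each tile is flipped, predicted, and the prediction is flipped back with the same mode before averaging. The sketch below illustrates that flip / un-flip pattern. `toy_model` is a made-up stand-in for `model.predict` whose output depends on orientation; it is not the real network.

```python
import numpy as np

def flip(img, axis=0):
    # Same modes as the notebook's flip(): 1 reverses rows, 2 reverses columns, 3 both.
    if axis == 1:
        return img[::-1, :]
    elif axis == 2:
        return img[:, ::-1]
    elif axis == 3:
        return img[::-1, ::-1]
    return img

def toy_model(img):
    # Hypothetical predictor: adds 1 to the left half only.
    out = img.astype(np.float32).copy()
    out[:, : img.shape[1] // 2] += 1.0
    return out

img = np.arange(16, dtype=np.float32).reshape(4, 4)
ttas = [0, 1, 2]                      # same modes as TTAS in the config cell

pred = np.zeros_like(img)
for mode in ttas:
    pred_aug = toy_model(flip(img, axis=mode))
    pred += flip(pred_aug, axis=mode)  # each flip is its own inverse
pred /= len(ttas)

print(pred - img)  # left two columns: 2/3, right two columns: 1/3
```

Because every flip mode is its own inverse, applying the same mode to the prediction puts it back into the original orientation, so the averaged predictions are pixel-aligned.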

In [10]:
img_files = [x for x in os.listdir(SUB_PATH) if '.tiff' in x]
print('images idxs:', img_files)

images idxs: ['aa05346ff.tiff', '2ec3f1bb9.tiff', '3589adb90.tiff', 'd488c759a.tiff', '57512b7f1.tiff']


In [11]:
subm = {}
for i_img, img_file in enumerate(img_files):
    THR = 0.2 if 'd488c759a' in img_file else THRESHOLD
    print('-' * 20, img_file, '-' * 20)
    img_data = rasterio.open(os.path.join(SUB_PATH, img_file), transform=IDNT)
    print('img shape:', img_data.shape)
    if img_data.count != 3:
        print('img file with subdatasets as channels')
        layers = [rasterio.open(subd) for subd in img_data.subdatasets]
    img_preds = np.zeros(img_data.shape, dtype=np.uint8)
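    # `params` is the loop variable left over from the params-loading cell (the last
    # model version); both versions loaded here share img_size=256 and resize=4.
    tile_size = int(params['img_size'] * EXPAND)
    tile_resized = int(tile_size * params['resize'])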
    slices = make_grid(
        img_data.shape, 
        window=tile_resized, 
        min_overlap=int(tile_resized * MIN_OVERLAP)
    )
    models = []
    for ver, folds in MDLS_FOLDS.items():
        for n_fold in folds:
            checkpoint_path = f'{MDLS_PATHS[ver]}/model_{n_fold}.hdf5'
            model = get_model(
                params_dict[ver]['backbone'], 
                input_shape=(tile_size, tile_size, 3),
                path=MDLS_PATHS[ver],
                loss_type=params_dict[ver]['loss'],
                umodel=params_dict[ver]['umodel']
            )
            model.load_weights(checkpoint_path)
            models.append(model)
            print('ver:', ver, '| model loaded:', checkpoint_path)
    for (x1, x2, y1, y2) in tqdm(slices, desc=f'{img_file}'):
        if img_data.count == 3: # normal
            img = img_data.read(
                [1, 2, 3], 
                window=Window.from_slices((x1, x2), (y1, y2))
            )
            img = np.moveaxis(img, 0, -1)
        else: # with subdatasets/layers
            img = np.zeros((tile_resized, tile_resized, 3), dtype=np.uint8)
            for fl in range(3):
                img[:, :, fl] = layers[fl].read(
                    window=Window.from_slices((x1, x2), (y1, y2))
                )
        img = cv2.resize(img, (tile_size, tile_size))
        img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
        pred = np.zeros((tile_size, tile_size), dtype=np.float32)
        for tta_mode in TTAS:
            img_aug = flip(img, axis=tta_mode)
            img_aug = np.expand_dims(img_aug, 0)
            img_aug = img_aug.astype(np.float32) / 255
            for model in models:
                pred_aug = np.squeeze(model.predict(img_aug))
                pred += flip(pred_aug, axis=tta_mode)
        pred /= (len(models) * len(TTAS))
        pred = cv2.resize(pred, (tile_resized, tile_resized))
        img_preds[x1:x2, y1:y2] = img_preds[x1:x2, y1:y2] + \
            (pred > THR).astype(np.uint8)
    del model, models, img, pred, img_aug, pred_aug; gc.collect()
    print('img max:', np.max(img_preds), '| voters:', VOTERS)
    img_preds = (img_preds >= VOTERS).astype(np.uint8)
    rle_pred = rle_encode_less_memory(img_preds)
    subm[i_img] = {'id':img_file.replace('.tiff', ''), 'predicted': rle_pred}
    del img_preds, img_data, rle_pred; gc.collect()

elapsed_time = time.time() - start_time
print(f'time elapsed: {elapsed_time // 60:.0f} min {elapsed_time % 60:.0f} sec')

-------------------- aa05346ff.tiff --------------------
img shape: (30720, 47340)
img file with subdatasets as channels
ver: nuetb1-model | model loaded: ../input/nuetb1-model/model_0.hdf5
ver: nuetb1-model | model loaded: ../input/nuetb1-model/model_4.hdf5


aa05346ff.tiff:   0%|          | 0/117 [00:00<?, ?it/s]

ver: fpnb0-0234 | model loaded: ../input/fpnb0-0234/model_0.hdf5


aa05346ff.tiff: 100%|██████████| 117/117 [03:09<00:00,  1.62s/it]


img max: 4 | voters: 1
-------------------- 2ec3f1bb9.tiff --------------------
img shape: (23990, 47723)
ver: nuetb1-model | model loaded: ../input/nuetb1-model/model_0.hdf5
ver: nuetb1-model | model loaded: ../input/nuetb1-model/model_4.hdf5


2ec3f1bb9.tiff:   0%|          | 0/91 [00:00<?, ?it/s]

ver: fpnb0-0234 | model loaded: ../input/fpnb0-0234/model_0.hdf5


2ec3f1bb9.tiff: 100%|██████████| 91/91 [02:29<00:00,  1.64s/it]


img max: 4 | voters: 1
-------------------- 3589adb90.tiff --------------------
img shape: (29433, 22165)
ver: nuetb1-model | model loaded: ../input/nuetb1-model/model_0.hdf5
ver: nuetb1-model | model loaded: ../input/nuetb1-model/model_4.hdf5


3589adb90.tiff:   0%|          | 0/48 [00:00<?, ?it/s]

ver: fpnb0-0234 | model loaded: ../input/fpnb0-0234/model_0.hdf5


3589adb90.tiff: 100%|██████████| 48/48 [01:20<00:00,  1.68s/it]


img max: 4 | voters: 1
-------------------- d488c759a.tiff --------------------
img shape: (46660, 29020)
img file with subdatasets as channels
ver: nuetb1-model | model loaded: ../input/nuetb1-model/model_0.hdf5
ver: nuetb1-model | model loaded: ../input/nuetb1-model/model_4.hdf5


d488c759a.tiff:   0%|          | 0/104 [00:00<?, ?it/s]

ver: fpnb0-0234 | model loaded: ../input/fpnb0-0234/model_0.hdf5


d488c759a.tiff: 100%|██████████| 104/104 [02:46<00:00,  1.60s/it]


img max: 4 | voters: 1
-------------------- 57512b7f1.tiff --------------------
img shape: (33240, 43160)
img file with subdatasets as channels
ver: nuetb1-model | model loaded: ../input/nuetb1-model/model_0.hdf5
ver: nuetb1-model | model loaded: ../input/nuetb1-model/model_4.hdf5


57512b7f1.tiff:   0%|          | 0/108 [00:00<?, ?it/s]

ver: fpnb0-0234 | model loaded: ../input/fpnb0-0234/model_0.hdf5


57512b7f1.tiff: 100%|██████████| 108/108 [02:45<00:00,  1.53s/it]


img max: 4 | voters: 1
time elapsed: 14 min 44 sec
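
The tile counts shown in the progress bars above (117, 91, 48, 104, 108) follow directly from `make_grid` and the tile geometry. The sketch below repeats `make_grid` and recomputes the counts from the image shapes printed in the log, assuming `img_size=256`, `resize=4`, `EXPAND=4` and `MIN_OVERLAP=1/16` as in the loaded params and config; the upper-case constants and the `shapes` dict are illustrative names.

```python
import numpy as np

def make_grid(shape, window=256, min_overlap=32):
    # Copy of the notebook's make_grid: evenly spaced tile origins per axis,
    # with the last tile shifted back so it ends exactly at the image border.
    x, y = shape
    nx = x // (window - min_overlap) + 1
    x1 = np.linspace(0, x, num=nx, endpoint=False, dtype=np.int64)
    x1[-1] = x - window
    x2 = (x1 + window).clip(0, x)
    ny = y // (window - min_overlap) + 1
    y1 = np.linspace(0, y, num=ny, endpoint=False, dtype=np.int64)
    y1[-1] = y - window
    y2 = (y1 + window).clip(0, y)
    slices = np.zeros((nx, ny, 4), dtype=np.int64)
    for i in range(nx):
        for j in range(ny):
            slices[i, j] = x1[i], x2[i], y1[j], y2[j]
    return slices.reshape(nx * ny, 4)

IMG_SIZE, RESIZE, EXPAND, MIN_OVERLAP = 256, 4, 4, 1 / 16
tile_size = int(IMG_SIZE * EXPAND)               # 1024: model input size
tile_resized = int(tile_size * RESIZE)           # 4096: tile size in the full-resolution image
min_overlap = int(tile_resized * MIN_OVERLAP)    # 256

# Image shapes as printed in the log above.
shapes = {
    'aa05346ff': (30720, 47340),
    '2ec3f1bb9': (23990, 47723),
    '3589adb90': (29433, 22165),
    'd488c759a': (46660, 29020),
    '57512b7f1': (33240, 43160),
}
counts = {name: len(make_grid(shape, tile_resized, min_overlap))
          for name, shape in shapes.items()}
print(counts)
# {'aa05346ff': 117, '2ec3f1bb9': 91, '3589adb90': 48, 'd488c759a': 104, '57512b7f1': 108}
```

For example, the first image needs `30720 // 3840 + 1 = 9` tiles along one axis and `47340 // 3840 + 1 = 13` along the other, giving 117 tiles.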


In [12]:
df_sub = pd.DataFrame(subm).T
df_sub

          id                                          predicted
0  aa05346ff  52856695 19 52887409 31 52918126 37 52948837 5...
1  2ec3f1bb9  60690334 11 60714318 22 60738300 36 60762280 5...
2  3589adb90  68600124 15 68629529 56 68658950 73 68688381 7...
3  d488c759a  534483952 12 534530598 48 534577249 64 5346238...
4  57512b7f1  328985799 19 329019036 26 329052274 30 3290855...


In [13]:
df_sub.to_csv('submission.csv', index=False)