This code needs a high-RAM runtime; otherwise, the session will crash. If you don't have a Colab Pro subscription, please switch to [Kaggle Kernels](https://www.kaggle.com/kernels).

## Setup and imports

In [None]:
from pydensecrf.utils import unary_from_labels, create_pairwise_bilateral
import pydensecrf.densecrf as dcrf
from skimage.color import gray2rgb
from skimage.color import rgb2gray

from tqdm import tqdm_notebook
import numpy as np
import subprocess

import warnings
warnings.filterwarnings("ignore")

In [None]:
import psutil 
num_cpus = psutil.cpu_count(logical=False)

import ray
ray.shutdown()
ray.init(num_cpus=num_cpus)

print(f"CPUs: {num_cpus}")

## Load the submission file

Here, we load the submission file generated by the `Ensemble_Inference.ipynb` notebook.

In [None]:
submissions = np.load("submission.npy")
submissions.shape
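
Before running the CRF, it can help to confirm that the loaded array has the layout the CRF utility defined later in this notebook expects. The sketch below assumes the file holds a stack of binary masks of shape `(N, 256, 256)` with an integer dtype; the notebook does not state this explicitly, and `check_masks` is a hypothetical helper rather than part of the original pipeline. It runs on a synthetic array standing in for `submissions`.

```python
import numpy as np


def check_masks(masks, shape=(256, 256), n_labels=2):
    """Hypothetical sanity check for a stack of predicted masks."""
    # Assumption: one 2-D mask per entry, stacked along the first axis.
    if masks.ndim != 3 or masks.shape[1:] != tuple(shape):
        raise ValueError(f"expected (N, {shape[0]}, {shape[1]}), got {masks.shape}")
    # The label encoding relies on integer bit shifts, so an integer dtype is expected.
    if not np.issubdtype(masks.dtype, np.integer):
        raise TypeError(f"expected an integer dtype, got {masks.dtype}")
    # Conservative check over the whole stack: each mask should contain
    # no more distinct values than there are labels.
    n_values = len(np.unique(masks))
    if n_values > n_labels:
        raise ValueError(f"found {n_values} distinct values, expected at most {n_labels}")
    return masks.shape[0]


# Synthetic stand-in for the real submission array.
demo = np.zeros((4, 256, 256), dtype=np.uint8)
demo[:, 64:192, 64:192] = 1
print(check_masks(demo))
```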

## Define CRF utility

The function below is taken from [this Kaggle Kernel](https://www.kaggle.com/meaninglesslives/apply-crf). We tuned the following hyperparameters:

* `sxy`
* `compat`
* `inference` steps
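
For some intuition about `sxy` and `compat`: the smoothness term of a dense CRF weights each pair of pixels by a Gaussian of the distance between them, with `sxy` acting as the standard deviation in pixels and `compat` as the strength of the penalty for giving the two pixels different labels. The NumPy sketch below illustrates only that weighting, before any normalization; `gaussian_weight` is a hypothetical helper and not part of pydensecrf.

```python
import numpy as np


def gaussian_weight(distance_px, sxy, compat):
    """Unnormalized pairwise weight between two pixels `distance_px` apart."""
    return compat * np.exp(-(distance_px ** 2) / (2.0 * sxy ** 2))


# Compare a narrow and a wide kernel at a few pixel distances.
distances = np.array([0.0, 6.0, 12.0, 24.0])
for sxy in (3, 12):
    weights = gaussian_weight(distances, sxy=sxy, compat=4)
    print(f"sxy={sxy}: {np.round(weights, 3)}")
```

A larger `sxy` lets pixels that are farther apart influence each other, while a larger `compat` scales the whole term relative to the unary potentials.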

In [None]:
@ray.remote
def custom_crf(mask_img, shape=(256, 256)):
    # Convert the annotated image to RGB if it is grayscale
    if mask_img.ndim < 3:
        mask_img = gray2rgb(mask_img)

    # Pack the RGB channels into a single 32-bit integer per pixel.
    # Casting first keeps the shifts from overflowing narrow dtypes such as uint8.
    mask_img = mask_img.astype(np.uint32)
    annotated_label = mask_img[:,:,0] + (mask_img[:,:,1]<<8) + (mask_img[:,:,2]<<16)

    # Convert the 32-bit integer colors to labels 0, 1, 2, ...
    colors, labels = np.unique(annotated_label, return_inverse=True)

    n_labels = 2
    
    # Setting up the CRF model
    d = dcrf.DenseCRF2D(shape[1], shape[0], n_labels)

    # Get unary potentials (neg log probability)
    U = unary_from_labels(labels, n_labels, gt_prob=0.7, zero_unsure=False)
    d.setUnaryEnergy(U)

    # This adds the color-independent term, features are the locations only.
    d.addPairwiseGaussian(sxy=(12, 12), compat=4, kernel=dcrf.DIAG_KERNEL,
                      normalization=dcrf.NORMALIZE_SYMMETRIC)
        
    # Run Inference for 20 steps 
    Q = d.inference(20)

    # Find out the most probable class for each pixel.
    MAP = np.argmax(Q, axis=0)

    return MAP.reshape((shape[0], shape[1]))
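
For reference, the unary term built with `gt_prob=0.7` and `zero_unsure=False` amounts to the following, to the best of our understanding: each pixel gets energy `-log(gt_prob)` for its predicted label and `-log((1 - gt_prob) / (n_labels - 1))` for every other label. The NumPy sketch below mirrors that calculation for illustration; `unary_sketch` is a hypothetical name, and `unary_from_labels` remains the function used in the pipeline.

```python
import numpy as np


def unary_sketch(labels, n_labels=2, gt_prob=0.7):
    """Illustrative negative log-probability table of shape (n_labels, n_pixels)."""
    labels = np.asarray(labels).ravel()  # integer labels in [0, n_labels)
    wrong = -np.log((1.0 - gt_prob) / (n_labels - 1))  # energy of a non-predicted label
    right = -np.log(gt_prob)                           # energy of the predicted label
    U = np.full((n_labels, labels.size), wrong, dtype=np.float32)
    U[labels, np.arange(labels.size)] = right
    return U


U = unary_sketch([0, 1, 1, 0])
print(U.shape)
print(np.argmin(U, axis=0))  # lowest-energy label per pixel
```

With `gt_prob=0.7`, the predicted label is only mildly preferred, which leaves room for the pairwise term to overrule isolated pixels.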

In our experiments, higher values of `sxy`, `compat`, and the number of `inference` steps gave better results. Pushing them higher is not practical in a resource-constrained environment, however, so it is better to settle on values that keep the trade-off between result quality and compute cost well balanced.
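
To make the role of the inference steps more concrete, here is a toy mean-field loop on a 1-D label sequence. It is a minimal sketch under simplifying assumptions (a dense, unnormalized Gaussian kernel, a Potts-style agreement term, and small hypothetical defaults for `sxy` and `compat`), not the efficient filtering that pydensecrf performs; `mean_field_1d` is a made-up name.

```python
import numpy as np


def mean_field_1d(labels, n_labels=2, gt_prob=0.7, sxy=2.0, compat=1.0, steps=5):
    """Toy dense-CRF mean-field inference on a 1-D label sequence."""
    labels = np.asarray(labels)
    n = labels.size

    # Unary energies: low for the observed label, high for the others.
    U = np.full((n_labels, n), -np.log((1.0 - gt_prob) / (n_labels - 1)))
    U[labels, np.arange(n)] = -np.log(gt_prob)

    # Dense Gaussian kernel over positions; a pixel sends no message to itself.
    pos = np.arange(n, dtype=float)
    K = np.exp(-((pos[:, None] - pos[None, :]) ** 2) / (2.0 * sxy ** 2))
    np.fill_diagonal(K, 0.0)

    def softmax(E):
        E = E - E.max(axis=0, keepdims=True)
        P = np.exp(E)
        return P / P.sum(axis=0, keepdims=True)

    Q = softmax(-U)  # start from the unary beliefs
    for _ in range(steps):
        messages = Q @ K  # (n_labels, n): kernel-weighted support for each label
        Q = softmax(-U + compat * messages)  # reward agreement with neighbours
    return Q.argmax(axis=0)


noisy = np.array([0] * 10 + [1] * 10)
noisy[4] = 1  # an isolated wrong pixel
print(mean_field_1d(noisy))
```

Each step recomputes every pixel's belief from its neighbours, so cost grows with the number of steps; this toy version is also quadratic in the number of pixels, which the real library avoids.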

## Apply CRF and prepare the submission file

In [None]:
crf_ids = [] 
for submission in tqdm_notebook(submissions):
    submission_id = ray.put(submission)
    crf_ids.append(custom_crf.remote(submission_id))

crfs = ray.get(crf_ids)
crfs = np.array(crfs).astype("uint8")
crfs.shape, crfs.dtype

In [None]:
save_path = "submission_crf.npy"
np.save(save_path, crfs, fix_imports=True, allow_pickle=False)
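
A quick round trip can confirm that a file written this way loads back unchanged. The sketch below uses a synthetic `uint8` array and a temporary directory instead of the real `crfs` array, so it does not touch the actual submission file.

```python
import os
import tempfile

import numpy as np

# Synthetic stand-in for the CRF-refined masks.
masks = np.random.default_rng(0).integers(0, 2, size=(3, 256, 256)).astype("uint8")

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "submission_crf.npy")
    np.save(path, masks, allow_pickle=False)
    restored = np.load(path, allow_pickle=False)

print(restored.shape, restored.dtype, np.array_equal(masks, restored))
```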