# Kaggle GI Tract Inference

Using code snippets from:
* https://www.kaggle.com/code/yiheng/3d-solution-with-monai-produce-3d-data/notebook
* https://www.kaggle.com/code/israrahmed919/createmasksopencv
* https://www.kaggle.com/code/clemchris/gi-seg-pytorch-train-infer




**Notes**
* See this discussion post for how to submit: https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/discussion/320541#1764874 The rows of the submission must be in the same order as the `sample_submission.csv` file that appears during Kaggle's evaluation.


### Submission packages
Submissions must run with internet access turned off, so any package that is not in the standard Kaggle image has to be supplied through a Kaggle Dataset.

Example:

1. On a local machine, download the packages

    `pip download einops -d ./einops/`

    `pip download segmentation-models-pytorch -d ./segmentation-models-pytorch/`

2. Pick through the packages and select the parts you need

3. Add the files to a Kaggle dataset

4. Add the dataset to your notebook

5. Install the packages from the files

    `!pip install '../input/einops041/einops-0.4.1-py3-none-any.whl'`
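
When a dataset folder holds several wheels, pip can be pointed at the whole folder instead of at one file. The sketch below only builds the command; `offline_install_command` is a hypothetical helper, and it assumes every dependency of the package is also present in the folder.

```python
def offline_install_command(wheel_dir, package):
    """Build a pip command that resolves `package` only from files in `wheel_dir`."""
    return [
        "pip", "install",
        "--no-index",               # never contact PyPI (internet is off)
        "--find-links", wheel_dir,  # look for wheels in the dataset folder
        package,
    ]

cmd = offline_install_command("../input/einops041", "einops")
print(" ".join(cmd))
# pip install --no-index --find-links ../input/einops041 einops
```

In a notebook cell the printed line can be run with a leading `!`.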

Note: the installs below fail with an error in the latest Kaggle image. As a workaround I made a copy of the original notebook, https://www.kaggle.com/code/awsaf49/uwmgi-unet-infer-pytorch/notebook, which uses an image from Oct 2021 where the installs work.

In [None]:
!pip install -q '../input/gitractsegmentationmodels/pretrainedmodels-0.7.4/pretrainedmodels-0.7.4'
!pip install -q '../input/gitractsegmentationmodels/efficientnet_pytorch-0.6.3/efficientnet_pytorch-0.6.3'
!pip install -q '../input/gitractsegmentationmodels/timm-0.4.12-py3-none-any.whl'
!pip install -q '../input/gitractsegmentationmodels/segmentation_models_pytorch-0.2.1-py3-none-any.whl'




In [None]:
!pip install -q '../input/einops041/einops-0.4.1-py3-none-any.whl'

In [None]:
import os, glob
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import torch
from PIL import Image
from einops import rearrange, reduce, repeat
import segmentation_models_pytorch as smp
from tqdm import tqdm

from torchvision.transforms import PILToTensor
from torchvision import transforms
from torch.nn import functional as F

ROOT_FOLDER = '../input/uw-madison-gi-tract-image-segmentation/'
#ROOT_FOLDER = '/media/SSD/gi-tract/uw-madison-gi-tract-image-segmentation/'

MODEL_FOLDER = '../input/gi-tract-models'
#MODEL_FOLDER = '/media/SSD/gi-tract/uw-madison-gi-tract-image-segmentation/kaggle models'

model_file_base = 'Unet-1-1.pth'
n_folds = 5

DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
#DEVICE = "cpu"
#DEVICE = "cuda:0"

# Data

## Process the Test files

#### Set the debug mode
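`debug = True` runs inference on training images (the first 3,000 files found) instead of the test images. This also makes it possible to check the resize + pad transform needed to invert the basic training transform (crop + resize).


`debug = False` runs inference on the test images. The flag is set automatically: it is `True` when `sample_submission.csv` has no rows.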

In [None]:
sub_df = pd.read_csv(ROOT_FOLDER+'sample_submission.csv')
print("sub length:", len(sub_df))
if not len(sub_df):
    debug = True
else:
    debug = False
    
print("debug:", debug)

In [None]:
if debug:  # Use the training files as input
    test_fnames = glob.glob("{}train/*/*/scans/*png".format(ROOT_FOLDER))
    file_df = pd.DataFrame(test_fnames)
    file_df = file_df[:1000*3]
else:  # Use the test files as input
    test_fnames = glob.glob("{}test/*/*/scans/*png".format(ROOT_FOLDER))
    file_df = pd.DataFrame(test_fnames)
    
    


In [None]:
print("Samples to predict:",len(file_df))

In [None]:
file_df.columns = ["path"]

We need to submit a CSV with the following columns:

`['id', 'class', 'predicted']`


We need a way to get the `id` for the submission from the file paths, so we build a function that does the string manipulation.

In [None]:
def id_from_path(p):
    # Example input:
    # p = '/media/SSD/gi-tract/uw-madison-gi-tract-image-segmentation/train/case24/case24_day25/scans/slice_0046_266_266_1.50_1.50.png'
    s1 = p.split('/')
    filename = s1[-1] # 'slice_0046_266_266_1.50_1.50.png'
    case_day_str = s1[-3] # 'case24_day25'
    # the first two underscore-separated fields are 'slice' and the slice number
    slice_str = '_'.join(filename.split('_')[:2]) # 'slice_0046'
    id_str = case_day_str + '_' + slice_str # 'case24_day25_slice_0046'
    return id_str
    
    

In [None]:
file_df["id"] = file_df["path"].apply(id_from_path)

In [None]:
file_df
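
The `id` extraction can also be written with a regular expression, which fails loudly on a path that does not have the expected layout. This is a sketch rather than the notebook's method: `ID_PATTERN` and `id_from_path_regex` are hypothetical names, and the pattern assumes the `caseN_dayM/scans/slice_XXXX_...` layout shown in the example path.

```python
import re

# Assumed layout: .../<case>_<day>/scans/slice_<number>_<rest>.png
ID_PATTERN = re.compile(r"(case\d+_day\d+)/scans/(slice_\d+)_")

def id_from_path_regex(path):
    """Return '<case>_<day>_slice_<number>' or raise if the path looks wrong."""
    match = ID_PATTERN.search(path.replace("\\", "/"))  # tolerate Windows separators
    if match is None:
        raise ValueError(f"Unexpected scan path: {path}")
    return f"{match.group(1)}_{match.group(2)}"

example = ('/media/SSD/gi-tract/uw-madison-gi-tract-image-segmentation/train/'
           'case24/case24_day25/scans/slice_0046_266_266_1.50_1.50.png')
print(id_from_path_regex(example))  # case24_day25_slice_0046
```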

## Get the NN Models

For each fold of the data we have a model, so we need to load all of them.  They are not huge, so we can keep all of them on the GPU at once.

In [None]:
models = []
for fold in range(n_folds):

    model = smp.Unet(
        encoder_name="efficientnet-b0",  # choose encoder, e.g. mobilenet_v2 or efficientnet-b7
        encoder_weights=None,            # don't need initialization since we will load our models
        in_channels=1,                   # model input channels (1 for gray-scale images, 3 for RGB, etc.)
        classes=3)                       # model output channels (number of classes in your dataset)

    model_in_str = MODEL_FOLDER + "/" + "fold-" + str(fold) + '-' + model_file_base
    print(model_in_str)
    # map_location lets weights saved on a GPU load on a CPU-only machine too
    model.load_state_dict(torch.load(model_in_str, map_location=DEVICE))
    model.to(DEVICE)
    model.eval()
    models.append(model)
    

## Create a Pytorch Dataset for inference

The main difference relative to training is that we don't have the ground-truth run-length-encoded mask.
We also return the `id` used in the submission because it is the key for each row.

In [None]:
class Dataset_from_df_inference(torch.utils.data.Dataset):
    def __init__(self, df, transform=None):
        self.df = df
        self.transform = transform
        self.pil_to_tensor = PILToTensor()

        
    def __len__(self):
        return self.df.shape[0]
        
    
    
    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        
        
        img_path = row.path
        # Use PIL Image to read the image files since it handles 32 bit images
        img = self.pil_to_tensor(Image.open(img_path))
        #print(img.shape)
        
        
        
        # capture the shape of the original image because we want our final mask
        # used for RLE to match this shape
        mask_shape = img.shape

        
        
        # resize our image to a consistent size to provide as input to model
        if self.transform:
            img = self.transform(img)
    
        
        
        return {
            "image": img,
            "mask_shape": mask_shape,
            "id": row.id #return the row id used in the submission
        }
        
        
    
    


### Define our transform
We use the same basic image sizing as in training because that recreates the conditions under which the models were trained.

In [None]:
test_transforms = transforms.Compose(
    [transforms.CenterCrop((266,266)),
    transforms.ConvertImageDtype(torch.float32),
    transforms.Resize((288,288),interpolation=transforms.InterpolationMode.BICUBIC)])  # multiple of 32 for U-Net

### Create the Dataset
And ensure it returns what we expect

In [None]:
test_dataset = Dataset_from_df_inference(file_df,test_transforms)

In [None]:
test_dataset[0]['mask_shape'], test_dataset[0]['id']

In [None]:
plt.imshow(test_dataset[0]['image'].squeeze().numpy(),cmap='gray')

## Inference Processing

Create the dataloader. Kaggle recommends `num_workers=2` when using the GPU.  The `batch_size` and `pin_memory` settings may need to change for the Kaggle virtual machine.

In [None]:
test_dataloader = torch.utils.data.DataLoader(dataset=test_dataset,
                                               batch_size=16,
                                               num_workers=2,
                                               pin_memory=True, #pagelock the memory for faster loads to GPU RAM
                                               shuffle=False)

In [None]:
# ref.: https://www.kaggle.com/stainsby/fast-tested-rle
def rle_encode(img):
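    """ Run-length encode a binary mask.
    
    Args:
        img (np.array): 
            - 1 indicating mask
            - 0 indicating background
    
    Returns: 
        run lengths formatted as a string of space-separated "start length" pairs,
        with starts 1-indexed into the flattened (row-major) mask
    """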
    
    pixels = img.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)
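
A decoder makes it easy to sanity-check the encoder with a round trip. The sketch below repeats `rle_encode` so that it runs on its own; `rle_decode` is a hypothetical helper that is not part of the original notebook, and it assumes the same 1-indexed, row-major convention.

```python
import numpy as np

def rle_encode(img):
    """Same encoder as above: 1-indexed 'start length' pairs over the flattened mask."""
    pixels = img.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)

def rle_decode(rle, shape):
    """Rebuild a 0/1 mask of the given (height, width) from an RLE string."""
    flat = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    tokens = [int(t) for t in rle.split()]
    for start, length in zip(tokens[0::2], tokens[1::2]):
        flat[start - 1:start - 1 + length] = 1  # starts are 1-indexed
    return flat.reshape(shape)

mask = np.zeros((4, 5), dtype=np.uint8)
mask[1, 1:4] = 1   # a run of 3 pixels in the second row
mask[3, 4] = 1     # a single pixel in the last position

encoded = rle_encode(mask)
print(encoded)  # 7 3 20 1
assert np.array_equal(rle_decode(encoded, mask.shape), mask)
```

An all-zero mask encodes to an empty string, which is what the submission expects for "no prediction".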

The threshold turns the predicted mask into an image with pixels of 0 or 1 per channel.
It reflects how high the predicted probability must be before we count the pixel as part of the mask.
The image resizing also affects the mask edges (we use bicubic interpolation to limit this), so a lower threshold might be needed for the 0/1 masks to register edges better.

It should be between 0 and 1.0, on the higher side.

In [None]:

threshold = .8
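
To get a feel for what the threshold does, the sketch below binarizes a small made-up probability map at two thresholds. The values in `probs` are invented for illustration and `binarize` is a hypothetical helper.

```python
import numpy as np

# Made-up averaged probabilities for one organ channel
probs = np.array([[0.95, 0.85, 0.60],
                  [0.82, 0.55, 0.30],
                  [0.45, 0.20, 0.05]])

def binarize(prob_map, threshold):
    """Pixels strictly above the threshold become 1, everything else 0."""
    return (prob_map > threshold).astype(np.uint8)

for t in (0.5, 0.8):
    print(t, int(binarize(probs, t).sum()))
# 0.5 5
# 0.8 3
```

A higher threshold keeps only the confident core of the mask, so the soft edge pixels are the first to drop out.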

Loop through our test data and create the submission

In [None]:
sub = {'id':[], 'class':[], 'predicted':[]} # a dict to store our submission predictions
pbar = tqdm(total=len(test_dataloader))


for batch in test_dataloader:
    
    images = batch['image'].to(DEVICE)
    mask_shape = batch['mask_shape']
    ids = batch['id']
    

    #Take each of our fold models and average the outputs together
    with torch.no_grad():
        output = models[0](images)
        output = torch.nn.Sigmoid()(output)/n_folds
        mask = output
        for i in range(1,n_folds):
            output = models[i](images)
            output = torch.nn.Sigmoid()(output)/n_folds
            mask = mask + output
    
    #print("1", mask.shape)


    #From here we need to apply the inverse of the basic image crop and resize to the mask image
    #and then apply run length encoding to that image
    
    mask = transforms.Resize((266,266),interpolation=transforms.InterpolationMode.BICUBIC)(mask) # undo resize to 288x288
    #print("3", mask.shape)
    
    #print(4,mask_shape)
    
    a0 = torch.div(mask_shape[1] - 266,2,rounding_mode = 'floor') # (mask_shape[1] - 266)//2
    a1 = torch.div(mask_shape[2] - 266,2,rounding_mode = 'floor')
    #print(a0,a1)
    
    # Since we are processing a batch, the original image shape may differ between batch items,
    # so we can't keep the padded masks as a single tensor per batch.
    # Each batch item has to be processed individually from here

    #single_mask = torch.zeros_like(mask[0])
    batch_size = mask.shape[0]  # get this for each batch since the last batch may be smaller
    for b in range(0,batch_size):
        # pad order is (left, right, top, bottom); int() turns the 0-d tensors into plain ints
        single_mask = F.pad(mask[b],(int(a1[b]), int(a1[b]), int(a0[b]), int(a0[b])),  "constant", 0) #padding param order confirmed
        #print("5", single_mask.shape)
            
        single_mask = (single_mask > threshold)*1.0  # Run Length encoding requires a mask with 0 or 1
        single_mask = single_mask.cpu().detach().numpy() # go from a tensor on the GPU to a numpy on CPU

        # create 3 submission rows, one for each organ
        sub_id = ids[b] #get the submission id from the batch
        
        large_bowel = rle_encode(single_mask[0])
        sub['id'].append(sub_id)
        sub['class'].append('large_bowel')
        sub['predicted'].append(large_bowel)

        small_bowel = rle_encode(single_mask[1])
        sub['id'].append(sub_id)
        sub['class'].append('small_bowel')
        sub['predicted'].append(small_bowel)

        stomach = rle_encode(single_mask[2])
        sub['id'].append(sub_id)
        sub['class'].append('stomach')
        sub['predicted'].append(stomach)    

    

        
   
    pbar.update(1)
    #break
pbar.close()
torch.cuda.empty_cache()
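
The padding arithmetic used in the loop above (`a0`, `a1`) can be checked without any tensors. The sketch below assumes a symmetric center crop to 266 pixels; `inverse_pad` and `restored_size` are hypothetical helpers, and the image sizes are illustrative only.

```python
CROP = 266  # crop size used by the transforms in this notebook

def inverse_pad(height, width, crop=CROP):
    """Per-side padding (top/bottom, left/right) that undoes a center crop.

    Negative values mean the original was smaller than the crop, so the
    crop step padded it and the inverse has to trim instead.
    """
    return (height - crop) // 2, (width - crop) // 2

def restored_size(height, width, crop=CROP):
    """Size obtained after padding a crop x crop mask symmetrically."""
    pad_h, pad_w = inverse_pad(height, width, crop)
    return crop + 2 * pad_h, crop + 2 * pad_w

# Illustrative sizes; (267, 266) is a made-up odd case
for size in [(266, 266), (310, 360), (234, 234), (267, 266)]:
    print(size, inverse_pad(*size), restored_size(*size))
```

The last case shows the limit of the approach: when the difference between the original size and the crop is odd, symmetric padding comes out one pixel short, so the restored mask no longer matches the original shape.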


In [None]:
pred_df = pd.DataFrame(sub)

In [None]:
# replace NaNs with an empty string for the 'predicted' column
pred_df.predicted = pred_df.predicted.fillna('')

### Create submission file

We need to follow the id and class row order of the sample_submission file exactly to get a score

In [None]:
if not debug:
    sub_df = pd.read_csv(ROOT_FOLDER+'sample_submission.csv')
    del sub_df['predicted']
    sub_df = sub_df.merge(pred_df, on=['id','class'])
    sub_df.to_csv('./submission.csv',index=False)
else:
    # no sample rows to follow in debug mode, so write the predictions as they are
    sub_df = pred_df.copy()
    sub_df.to_csv('./submission.csv',index=False)
    display(sub_df)

In [None]:
pred_df
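
Because the row order matters, it can be worth checking it before submitting. The sketch below uses small made-up frames and a left merge, a variation on the merge used above that keeps every sample row even when a prediction is missing. All ids and RLE strings here are invented.

```python
import pandas as pd

classes = ["large_bowel", "small_bowel", "stomach"]
s1, s2 = "case1_day1_slice_0001", "case1_day1_slice_0002"  # made-up ids

# Stand-in for sample_submission.csv: defines the required row order
sample = pd.DataFrame({"id": [s1] * 3 + [s2] * 3,
                       "class": classes * 2,
                       "predicted": [float("nan")] * 6})

# Stand-in for pred_df: different order, one row missing
preds = pd.DataFrame({"id": [s2, s2, s1, s1, s1],
                      "class": ["small_bowel", "large_bowel",
                                "stomach", "small_bowel", "large_bowel"],
                      "predicted": ["5 2", "", "1 4", "9 1", "7 3"]})

# how="left" keeps the left frame's keys and their order
merged = sample.drop(columns="predicted").merge(preds, on=["id", "class"], how="left")
merged["predicted"] = merged["predicted"].fillna("")  # missing rows become empty masks

assert merged["id"].tolist() == sample["id"].tolist()
assert merged["class"].tolist() == sample["class"].tolist()
print(merged["predicted"].tolist())  # ['7 3', '9 1', '1 4', '', '5 2', '']
```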