In this notebook I use the [Weights and Biases](https://wandb.ai/site) overlay visualization tool to interactively inspect the segmentation masks generated by the provided HPA Cell Segmentation tool.

Here's the W&B report summarizing the results: http://bit.ly/play-with-segmentation-masks

![](https://i.imgur.com/9kNLI4L.gif)

In this competition, we are provided with image-level labels, and the task is to classify each cell in a given image into one or multiple labels.

* Each image therefore contains multiple cells.

* Each cell consists of multiple [organelles](https://www.genome.gov/genetics-glossary/Organelle). In the previous [HPA competition](https://www.kaggle.com/c/human-protein-atlas-image-classification), 28 organelles were used as labels, and the task was to predict image-level labels (given an input image, predict the organelles).

* In this competition, the task is to predict cell-level labels using signals from image-level labels. That's what makes it a more challenging problem statement. 

* **But what are we predicting?** There's a specific _protein of interest_ (the signal in the green channel) that can be present in one or more organelles in each cell. The image-level labels indicate where that protein is present in the cells *in general*. Thus, for an individual cell, the protein might not be present in the organelle given by the image-level ground truth. Interesting!

* This calls for cell segmentation. We have to know where the cells are in an image, and we also need to tell one cell apart from another. Thus it's instance segmentation.

* The authors have provided the [HPA-Cell-Segmentation](https://github.com/CellProfiling/HPA-Cell-Segmentation) tool. **Is it good?** In a discussion thread, I read that it can accurately segment cells in ~90% of test set images. That's an excellent baseline to start with, and it lets us focus on cell-level classification.
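
To make the instance-segmentation point concrete, here is a minimal sketch on a made-up toy array (not output from the HPA tool). It assumes an instance mask stores 0 for background and a distinct positive integer for each cell, so a per-cell binary mask is just an equality test. The helper name `split_instances` is hypothetical.

```python
import numpy as np

def split_instances(label_mask):
    """Return {cell_id: boolean mask} for every non-zero id in an instance mask."""
    cell_ids = np.unique(label_mask)
    cell_ids = cell_ids[cell_ids != 0]  # assumption: 0 marks background
    return {int(cell_id): label_mask == cell_id for cell_id in cell_ids}

# Toy 4x4 mask with two "cells" (ids 1 and 2), made up for illustration.
toy_mask = np.array([
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [2, 0, 0, 0],
    [2, 2, 0, 0],
])
per_cell = split_instances(toy_mask)
# Print the pixel count of each cell.
print({cell_id: int(mask.sum()) for cell_id, mask in per_cell.items()})
```

A semantic mask would only say "cell or not cell"; the integer ids are what let us crop or classify each cell separately later on.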

# Imports and Setup

In [None]:
%%capture
# Install Weights and Biases.
!pip install wandb -q

# Install HPA Cell Segmentation tool.
!pip install https://github.com/CellProfiling/HPA-Cell-Segmentation/archive/master.zip

In [None]:
import os
# Silence W&B logs
os.environ['WANDB_SILENT'] = 'true'

import re
import cv2
import glob
import imageio
import numpy as np
import pandas as pd
from PIL import Image
from tqdm.notebook import tqdm
import matplotlib.pyplot as plt
from skimage.transform import resize

%matplotlib inline

# HPA Segmentation tool related imports
import hpacellseg.cellsegmentator as cellsegmentator
from hpacellseg.utils import label_cell, label_nuclei

### W&B Setup

* Install wandb. ✔️
* Create an account on https://wandb.ai (it's free)
* Input your personal authentication token key. You can get your auth key [here](https://wandb.ai/authorize).

In [None]:
import wandb

from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
wandb_api = user_secrets.get_secret("wandb_api")

wandb.login(key=wandb_api)

### (Hyper)parameters

In [None]:
WORKING_DIR_PATH = '../input/hpa-single-cell-image-classification/'
VISUALIZE_SAMPLES = 32

# Prepare Dataset

### Get `train.csv`

In [None]:
df_train = pd.read_csv(WORKING_DIR_PATH+'train.csv')
df_train.head()

In [None]:
label_names = {
0: "Nucleoplasm",
1: "Nuclear membrane",
2: "Nucleoli",
3: "Nucleoli fibrillar center",
4: "Nuclear speckles",
5: "Nuclear bodies",
6: "Endoplasmic reticulum",
7: "Golgi apparatus",
8: "Intermediate filaments",
9: "Actin filaments",
10: "Microtubules",
11: "Mitotic spindle",
12: "Centrosome",
13: "Plasma membrane",
14: "Mitochondria",
15: "Aggresome",
16: "Cytosol",
17: "Vesicles and punctate cytosolic patterns",
18: "Negative"
}
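
The `Label` column in `train.csv` stores the image-level labels as a `|`-separated string of class indices. As a small sketch of how such a string maps to names, the hypothetical helper `decode_labels` below uses a three-entry subset of the mapping above so that it runs on its own.

```python
def decode_labels(label_string, names):
    """Turn a '|'-separated label string such as '14|0' into a list of names."""
    return [names[int(token)] for token in label_string.split("|")]

# Subset of the notebook's `label_names`, repeated so the snippet is self-contained.
demo_names = {0: "Nucleoplasm", 14: "Mitochondria", 16: "Cytosol"}
print(decode_labels("14|0", demo_names))
```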

### Path to images

In [None]:
red_images = sorted(glob.glob(WORKING_DIR_PATH+'train/*_red.png'))
green_images = sorted(glob.glob(WORKING_DIR_PATH+'train/*_green.png'))
blue_images = sorted(glob.glob(WORKING_DIR_PATH+'train/*_blue.png'))
yellow_images = sorted(glob.glob(WORKING_DIR_PATH+'train/*_yellow.png'))

print(len(red_images), len(green_images), len(blue_images), len(yellow_images))

In [None]:
mt = red_images[:VISUALIZE_SAMPLES] 
er = yellow_images[:VISUALIZE_SAMPLES]
nu = blue_images[:VISUALIZE_SAMPLES]
pr = green_images[:VISUALIZE_SAMPLES]

* We do not need the green channel to get the segmentation mask, as shown in [this kernel](https://www.kaggle.com/lnhtrang/hpa-public-data-download-and-hpacellseg).

* There are a total of 21,806 training images. We will visualize only a small fraction of them. Feel free to use the Weights and Biases overlay tool to visualize more images.
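
The slices above pair the channels by position, which only works if the four sorted lists contain the same image IDs in the same order. A quick sanity check could look like the sketch below; the helper names and the demo file names are hypothetical, and it assumes every file is named `<image_id>_<channel>.png`.

```python
import os

def image_id_from_path(path):
    """Strip the directory and the trailing '_<channel>.png' from a channel file path."""
    return os.path.basename(path).rsplit("_", 1)[0]

def channels_aligned(*channel_lists):
    """Return True if every list holds the same image IDs in the same order."""
    id_lists = [[image_id_from_path(p) for p in paths] for paths in channel_lists]
    return all(ids == id_lists[0] for ids in id_lists)

# Hypothetical file names, for illustration only.
demo_red = ["train/aaa_red.png", "train/bbb_red.png"]
demo_blue = ["train/aaa_blue.png", "train/bbb_blue.png"]
print(channels_aligned(demo_red, demo_blue))
```

In the notebook this would be called as `channels_aligned(mt, er, nu, pr)` before starting the logging loop.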

# Initialize Segmentation Model

In [None]:
NUC_MODEL = "./nuclei-model.pth"
CELL_MODEL = "./cell-model.pth"

segmentator = cellsegmentator.CellSegmentator(
    NUC_MODEL,
    CELL_MODEL,
    scale_factor=0.25,
    device="cuda",
    padding=False,
    multi_channel_model=True,
)

# Log Segmentation Masks

In [None]:
# Utility function that returns a W&B Image with the mask as an overlay.
# Learn more about overlay logging here: https://docs.wandb.ai/library/log#images-and-overlays
def wandb_mask(bg_img, pred_mask):
  return wandb.Image(bg_img, masks={
      "prediction" : {
          "mask_data" : pred_mask,
      }
    }
  )
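
The overlay can be easier to read when mask values have names. As a sketch, assuming the instance mask uses 0 for background and positive integers for cells, the mask can be collapsed to two classes and paired with a label dictionary, which `wandb.Image` masks accept under the `class_labels` key. The helper `to_binary_overlay` and the class names are my own choices for illustration. Note that this trades the per-cell ids for a simple legend.

```python
import numpy as np

# Hypothetical class names for the overlay legend.
CLASS_LABELS = {0: "background", 1: "cell"}

def to_binary_overlay(instance_mask):
    """Collapse an instance mask (0 = background, k > 0 = cell k) to 0/1 classes."""
    return (instance_mask > 0).astype(np.uint8)

toy_mask = np.array([[0, 3], [7, 0]])  # made-up instance ids
overlay = to_binary_overlay(toy_mask)

# Same structure as the `masks=` argument in `wandb_mask`, with `class_labels` added.
masks_arg = {"prediction": {"mask_data": overlay, "class_labels": CLASS_LABELS}}
print(overlay.tolist())
```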

In [None]:
for i in tqdm(range(VISUALIZE_SAMPLES)):
    # Image ID 
    image_id = re.findall(r'[^\/]+(?=\_.)', mt[i])[0]
    
    # Initialize W&B
    run = wandb.init(project='hpa-segmentation-mask', name=image_id)

    # Load the red, yellow, blue and green channel images.
    microtubule = np.array(Image.open(mt[i]))
    endoplasmic_reticulum = np.array(Image.open(er[i]))
    nuclei = np.array(Image.open(nu[i]))
    protein = np.array(Image.open(pr[i]))
    # Stack the three reference channels (red, yellow, blue) to form the image.
    image = np.dstack((microtubule, endoplasmic_reticulum, nuclei))
    
    # Get the label
    labels = df_train.loc[df_train['ID'] == image_id].Label.values[0]
    labels = labels.replace("|", " ").split()
    labels = '-'.join([label_names[int(label)] for label in labels])
    
    # For nuclei segmentation, only the blue channel is required.
    nuc_segmentation = segmentator.pred_nuclei([nu[i]])
    # For full cells, all three reference channels (everything except green) are required.
    cell_segmentation = segmentator.pred_cells([[mt[i]], [er[i]], [nu[i]]])
    # Get the labeled nuclei and cell masks.
    nuclei_mask, cell_mask = label_cell(nuc_segmentation[0], cell_segmentation[0])
    
    # resize mask and image 
    image = cv2.resize(image, (512,512), interpolation=cv2.INTER_NEAREST)
    cell_mask = cv2.resize(cell_mask, (512,512), interpolation=cv2.INTER_NEAREST)
    protein = cv2.resize(protein, (512,512), interpolation=cv2.INTER_NEAREST)
        
    # log the image as well as the mask
    wandb.log({f"mask_{image_id}" : [wandb_mask(image, cell_mask)]})
    
    # log green channel 
    wandb.log({f"protein_{image_id}": [wandb.Image(protein, caption=f"{labels}")]})
    
    # Close W&B run
    run.finish()

After the above cell is executed, head over to the W&B project to explore the segmentation masks and other visualizations.

**Note**: Since we have silenced the W&B logs, head over to your W&B profile page and open the `hpa-segmentation-mask` project.

Here's the link to my W&B project: https://wandb.ai/ayush-thakur/hpa-segmentation-mask

![](https://i.imgur.com/iw3rETK.png)