<a href="https://colab.research.google.com/github/wps0/deep4life/blob/main/Project04_CellSegmentation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Project: Cell image segmentation

Contact: Elena Casiraghi (University Milano elena.casiraghi@unimi.it)

Cell segmentation is usually the first step for downstream single-cell analysis in microscopy image-based biology and biomedical research. Deep learning has been widely used for cell-image segmentation.
The CellSeg competition aims to benchmark cell segmentation methods that can be applied to various microscopy images across multiple imaging platforms and tissue types. The challenge organizers provide both labeled and unlabeled images.
The “2018 Data Science Bowl” Kaggle competition provides cell images and their masks for training cell/nuclei segmentation models.

In 2022, another [cell segmentation challenge was proposed at NeurIPS](https://neurips22-cellseg.grand-challenge.org/).
For interested readers, the competition proceedings have been published on [PMLR](https://proceedings.mlr.press/v212/).

### Project Description

In the field of (bio-medical) image processing, segmentation of images is typically performed via U-Nets [1,2].

A U-Net consists of an encoder, a series of convolution and pooling layers that reduce the spatial resolution of the input, followed by a decoder, a series of transposed-convolution or upsampling layers that restore the spatial resolution. The encoder and decoder are connected by a bottleneck layer, which operates at the lowest spatial resolution and, in the original architecture [1], has the largest number of feature channels.
The key innovation of U-Net is the addition of skip connections that connect the contracting path to the corresponding layers in the expanding path, allowing the network to recover fine-grained details lost during downsampling.

<img src='https://production-media.paperswithcode.com/methods/Screen_Shot_2020-07-07_at_9.08.00_PM_rpNArED.png' width="400"/>
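The encoder, bottleneck, decoder and skip connections described above can be illustrated with a deliberately small PyTorch sketch. It is not the original architecture from [1] nor the model used later in this notebook: the name `TinyUNet`, the two resolution levels, the `base` channel width and the use of padded convolutions (so that the output has the same size as the input) are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # Two 3x3 convolutions; padding=1 keeps the spatial size unchanged
    # (an assumption: the original U-Net uses unpadded convolutions).
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """Hypothetical two-level U-Net, for illustration only."""

    def __init__(self, in_ch: int = 1, out_ch: int = 1, base: int = 8):
        super().__init__()
        self.enc1 = conv_block(in_ch, base)
        self.enc2 = conv_block(base, base * 2)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(base * 2, base * 4)
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, kernel_size=2, stride=2)
        self.dec2 = conv_block(base * 4, base * 2)  # input: upsampled + skip
        self.up1 = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec1 = conv_block(base * 2, base)      # input: upsampled + skip
        self.head = nn.Conv2d(base, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)                    # full resolution
        e2 = self.enc2(self.pool(e1))        # 1/2 resolution
        b = self.bottleneck(self.pool(e2))   # 1/4 resolution, most channels
        # Skip connections: concatenate encoder features along the channel axis.
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                 # per-pixel logits


model = TinyUNet()
x = torch.randn(2, 1, 64, 64)  # batch of two single-channel 64x64 images
logits = model(x)
```

With two pooling steps, the input height and width must be divisible by 4 for the concatenations to line up.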


At this [link](https://rpubs.com/eR_ic/unet) you can find an R implementation of a basic U-Net, and at this [link](https://github.com/zhixuhao/unet) a Keras implementation.  
Implementations of more advanced U-Nets are also available: [2] points to [UNet++](https://github.com/MrGiovanni/UNetPlusPlus),
and the CellSeg organizers provide baseline models at [https://neurips22-cellseg.grand-challenge.org/baseline-and-tutorial/](https://neurips22-cellseg.grand-challenge.org/baseline-and-tutorial/).


### Project aim

The aim of the project is to download the *gray-level* (.tiff or .tif files) cell images from the [CellSeg](https://neurips22-cellseg.grand-challenge.org/dataset/) competition and assess the performance of a U-Net or any other deep model for cell segmentation.
We suggest using gray-level images to obtain a model that is specialized on one subclass of images.
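A file extension alone does not guarantee that an image is gray-level, so it may be worth checking the pixel data once it is loaded as an array. The helper below is a minimal sketch; `is_gray_level` is a hypothetical name, and the rule it applies (a 2-D array, a single channel, or identical R, G and B channels) is an assumption about what should count as gray-level.

```python
import numpy as np


def is_gray_level(arr: np.ndarray) -> bool:
    """Return True if `arr` (H x W or H x W x C) carries no color information."""
    if arr.ndim == 2:
        return True
    if arr.ndim == 3 and arr.shape[-1] == 1:
        return True
    if arr.ndim == 3 and arr.shape[-1] in (3, 4):
        # Ignore a possible alpha channel and compare R, G and B.
        r, g, b = arr[..., 0], arr[..., 1], arr[..., 2]
        return bool(np.array_equal(r, g) and np.array_equal(g, b))
    return False


plane = np.arange(16, dtype=np.uint8).reshape(4, 4)
gray_rgb = np.stack([plane, plane, plane], axis=-1)  # gray stored as RGB
color = gray_rgb.copy()
color[0, 0, 0] = 255                                 # one red pixel
```

Here `is_gray_level(plane)` and `is_gray_level(gray_rgb)` are true, while `is_gray_level(color)` is false.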

Students are not restricted to U-Nets; any other model is welcome, e.g., even the transformer-based models on the [leaderboard](https://neurips22-cellseg.grand-challenge.org/evaluation/testing/leaderboard/) may be tested.
Students are free to choose any model, as long as they can explain their rationale and the model's architecture, strengths, and weaknesses.
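Whichever model is chosen, assessing its performance requires comparing predicted masks with the ground truth. The sketch below computes the Dice coefficient and the intersection over union (IoU) for binary masks. These are two common overlap measures and are shown only as an illustration; they are not claimed to be the official challenge metric, and the function name `dice_and_iou` and the `eps` smoothing term are assumptions.

```python
import numpy as np
from typing import Tuple


def dice_and_iou(pred, target, eps: float = 1e-7) -> Tuple[float, float]:
    """Dice and IoU between two binary masks of the same shape."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    # eps avoids division by zero when both masks are empty.
    dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (union + eps)
    return float(dice), float(iou)


pred = [[1, 1], [0, 0]]    # two predicted foreground pixels
target = [[1, 0], [0, 0]]  # one true foreground pixel
dice, iou = dice_and_iou(pred, target)  # dice is about 2/3, iou about 1/2
```

Both scores lie in [0, 1], and for a single pair of masks they are related by Dice = 2·IoU / (1 + IoU).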



### References

[1] Ronneberger, O., Fischer, P., Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Navab, N., Hornegger, J., Wells, W., Frangi, A. (eds) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Lecture Notes in Computer Science, vol 9351. Springer, Cham. https://doi.org/10.1007/978-3-319-24574-4_28

[2] Long, F. Microscopy cell nuclei segmentation with enhanced U-Net. BMC Bioinformatics 21, 8 (2020). https://doi.org/10.1186/s12859-019-3332-1


## Initialization

In [9]:
!pip install --upgrade gdown



In [20]:
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import os
import tempfile
from typing import Callable, List, Tuple
from torchvision import transforms
from PIL import Image

In [25]:
TRAIN_PATH = 'data_train/Training-labeled/images/'
TRAIN_LABELS_PATH = 'data_train/Training-labeled/labels/'

TEST_PATH = 'data_test/Testing/Public/images/'
TEST_LABELS_PATH = 'data_test/Testing/Public/labels/'

VAL_PATH = 'data_val/Tuning/images/'
VAL_LABELS_PATH = 'data_val/Tuning/labels/'

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
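Because these paths are plain strings, a typo such as a stray space only surfaces when a file is first read. A quick existence check run once the archives are unpacked can catch this early. The helper `missing_dirs` below is a hypothetical name, and the demo uses a temporary directory rather than the real dataset.

```python
import os
import tempfile
from typing import Iterable, List


def missing_dirs(paths: Iterable[str]) -> List[str]:
    """Return the entries of `paths` that are not existing directories."""
    return [p for p in paths if not os.path.isdir(p)]


# Self-contained demo on a temporary directory (not the real dataset).
with tempfile.TemporaryDirectory() as root:
    good = os.path.join(root, "images")
    os.mkdir(good)
    bad = " " + good  # stray leading space, a typical typo
    demo_missing = missing_dirs([good, bad])  # only `bad` is reported

# In the notebook one would call, after unzipping the data:
# missing_dirs([TRAIN_PATH, TRAIN_LABELS_PATH, TEST_PATH,
#               TEST_LABELS_PATH, VAL_PATH, VAL_LABELS_PATH])
```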

### Data preparation
[Browse the data](https://drive.google.com/drive/folders/1MaJibsHYitCPOltxVzYjr3rm5s9Vpjpv)

In [4]:
!curl -o data_test.zip https://zenodo.org/records/10719375/files/Testing.zip?download=1
!unzip -d data_test data_test.zip

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2793M  100 2793M    0     0  78.4M      0  0:00:35  0:00:35 --:--:-- 87.2M
Archive:  data_test.zip
   creating: data_test/Testing/
   creating: data_test/Testing/Hidden/
   creating: data_test/Testing/Hidden/images/
  inflating: data_test/Testing/Hidden/images/TestHidden_214.tif  
  inflating: data_test/Testing/Hidden/images/TestHidden_224.tif  
  inflating: data_test/Testing/Hidden/images/TestHidden_222.tif  
  inflating: data_test/Testing/Hidden/images/TestHidden_049.png  
  inflating: data_test/Testing/Hidden/images/TestHidden_161.tif  
  inflating: data_test/Testing/Hidden/images/TestHidden_040.tiff  
  inflating: data_test/Testing/Hidden/images/TestHidden_003.tif  
  inflating: data_test/Testing/Hidden/images/TestHidden_038.tif  
  inflating: data_test/Testing/Hidden/images/TestHidden_073.tif  
  inflating: data_test/Test

In [3]:
!curl -o data_train.zip https://zenodo.org/records/10719375/files/Training-labeled.zip?download=1
!unzip -d data_train data_train.zip

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1926M  100 1926M    0     0   106M      0  0:00:18  0:00:18 --:--:--  132M
Archive:  data_train.zip
   creating: data_train/Training-labeled/
   creating: data_train/Training-labeled/labels/
  inflating: data_train/Training-labeled/labels/cell_00504_label.tiff  
  inflating: data_train/Training-labeled/labels/cell_00530_label.tiff  
  inflating: data_train/Training-labeled/labels/cell_00515_label.tiff  
  inflating: data_train/Training-labeled/labels/cell_00552_label.tiff  
  inflating: data_train/Training-labeled/labels/cell_00517_label.tiff  
  inflating: data_train/Training-labeled/labels/cell_00642_label.tiff  
  inflating: data_train/Training-labeled/labels/cell_00630_label.tiff  
  inflating: data_train/Training-labeled/labels/cell_00511_label.tiff  
  inflating: data_train/Training-labeled/labels/cell_00557_label.tiff  

In [2]:
!curl -o data_val.zip https://zenodo.org/records/10719375/files/Tuning.zip?download=1
!unzip -d data_val data_val.zip

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  595M  100  595M    0     0  6064k      0  0:01:40  0:01:40 --:--:-- 6217k
Archive:  data_val.zip
   creating: data_val/Tuning/
   creating: data_val/Tuning/images/
  inflating: data_val/Tuning/images/cell_00101.tif  
  inflating: data_val/Tuning/images/cell_00026.png  
  inflating: data_val/Tuning/images/cell_00063.png  
  inflating: data_val/Tuning/images/cell_00065.png  
  inflating: data_val/Tuning/images/cell_00099.tif  
  inflating: data_val/Tuning/images/cell_00054.png  
  inflating: data_val/Tuning/images/cell_00064.png  
  inflating: data_val/Tuning/images/cell_00047.png  
  inflating: data_val/Tuning/images/cell_00066.png  
  inflating: data_val/Tuning/images/cell_00058.png  
  inflating: data_val/Tuning/images/cell_00045.png  
  inflating: data_val/Tuning/images/cell_00048.png  
  inflating: data_val/Tuning/images/c

In [None]:
# Partially adapted from https://colab.research.google.com/github/mim-ml-teaching/public-dnn-2024-25/blob/master/docs/DNN-Lab-7-UNet-in-Pytorch-student-version.ipynb
class ImageTiffDataset(torch.utils.data.Dataset):
  """Pairs each TIFF image with its label mask; caches the transformed tensors on disk."""

  def __init__(self,
               image_dir: str,
               target_dir: str,
               cache_dir: str,
               filenames: List[str],
               transform: Callable = transforms.ToTensor(),
               target_transform: Callable = transforms.ToTensor()):
    self.image_dir = image_dir
    self.target_dir = target_dir
    self.cache_dir = cache_dir
    self.filenames = filenames
    self.transform = transform
    self.target_transform = target_transform

    # cache_dir may already exist (e.g. when created by tempfile.mkdtemp()),
    # so always make sure both subdirectories are present.
    os.makedirs(os.path.join(self.cache_dir, "images"), exist_ok=True)
    os.makedirs(os.path.join(self.cache_dir, "target"), exist_ok=True)

  def __getitem__(self, idx: int) -> Tuple[torch.Tensor, torch.Tensor]:
    img_filename = self.filenames[idx]
    # Labels are named "<image stem>_label.tiff" (e.g. cell_00504_label.tiff
    # in the training archive listing above).
    stem = os.path.splitext(os.path.basename(img_filename))[0]
    target_filename = stem + '_label.tiff'
    img_path = os.path.join(self.image_dir, img_filename)
    target_path = os.path.join(self.target_dir, target_filename)
    # Cached tensors are stored as .pt files in separate subdirectories.
    img_cache = os.path.join(self.cache_dir, "images", stem + '.pt')
    target_cache = os.path.join(self.cache_dir, "target", stem + '.pt')

    if not os.path.exists(img_cache):
      img = Image.open(img_path)  # PIL has Image.open, not Image.load
      img = self.transform(img)
      torch.save(img, img_cache)
    else:
      img = torch.load(img_cache)

    if not os.path.exists(target_cache):
      target = Image.open(target_path)
      target = self.target_transform(target)
      torch.save(target, target_cache)
    else:
      target = torch.load(target_cache)

    return img, target

  def __len__(self) -> int:
    return len(self.filenames)

def make_tiff_dataset(image_dir: str, target_dir: str, cache_dir: str):
  filenames = []
  with os.scandir(image_dir) as it:
    for entry in it:
      if not entry.is_file():
        continue
      lower_name = entry.name.lower()
      if lower_name.endswith('.tiff') or lower_name.endswith('.tif'):
        filenames.append(entry.name)
  return ImageTiffDataset(image_dir, target_dir, cache_dir, filenames)

In [27]:
train_dataset = make_tiff_dataset(TRAIN_PATH, TRAIN_LABELS_PATH, tempfile.mkdtemp())
test_dataset = make_tiff_dataset(TEST_PATH, TEST_LABELS_PATH, tempfile.mkdtemp())
val_dataset = make_tiff_dataset(VAL_PATH, VAL_LABELS_PATH, tempfile.mkdtemp())
print('Train:', len(train_dataset))
print('Test:', len(test_dataset))
print('Val:', len(val_dataset))

Train: 491
Test: 30
Val: 58
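Before these datasets can be fed to a model in batches, all samples in a batch need the same spatial size, which microscopy images from different sources may not have. One common option is to cut the same random window out of each image and its mask. The sketch below assumes `(C, H, W)` tensors; `random_crop_pair` and the crop size of 64 are hypothetical choices, and the random tensors stand in for real samples.

```python
import torch
from typing import Tuple


def random_crop_pair(img: torch.Tensor, target: torch.Tensor,
                     size: int) -> Tuple[torch.Tensor, torch.Tensor]:
    """Cut the same random size x size window from an image and its mask."""
    h, w = img.shape[-2], img.shape[-1]
    if tuple(target.shape[-2:]) != (h, w):
        raise ValueError("image and target must have the same height and width")
    if h < size or w < size:
        raise ValueError("image is smaller than the requested crop")
    top = int(torch.randint(0, h - size + 1, (1,)))
    left = int(torch.randint(0, w - size + 1, (1,)))
    return (img[..., top:top + size, left:left + size],
            target[..., top:top + size, left:left + size])


# Stand-in samples of different sizes; the mask is a threshold of the image
# so that alignment between the two crops can be verified.
images = [torch.rand(1, 100, 120), torch.rand(1, 80, 90)]
crops = [random_crop_pair(im, (im > 0.5).float(), 64) for im in images]
batch_img = torch.stack([c[0] for c in crops])  # shape (2, 1, 64, 64)
batch_tgt = torch.stack([c[1] for c in crops])  # shape (2, 1, 64, 64)
```

Applying the crop to the image and the mask with the same offsets keeps every pixel aligned with its label. A function like this could be called inside a custom `collate_fn` passed to `torch.utils.data.DataLoader`.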


## Basic U-Nets