# Cell segmentation and image generation

Segmenting the cells poses several difficulties: cell intensity and size are inconsistent, and the cytoplasm surrounding neighboring cells cannot be told apart by color.

## Installations and imports

Check the Python version, the CUDA version, and the GPU first.

In [None]:
!python --version

Python 3.7.12


In [None]:
!nvcc --version
!nvidia-smi

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
Fri Jan  7 14:42:28 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.44       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P0    26W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+------------

Install cellpose. By default, the torch GPU version is installed in a Colab notebook.

In [None]:
#!pip install folium==0.2.1
#!pip install imgaug==0.2.5
#!pip install --upgrade numpy
!pip install cellpose 
#!pip install numpy==1.19.5



Import libraries and check the GPU (the first time you import cellpose, the models will download).

In [None]:
import numpy as np
import time, os, sys
from urllib.parse import urlparse
import skimage.io 
import matplotlib.pyplot as plt
import matplotlib as mpl

!pip install opencv-python-headless==4.1.2.30
import cv2

%matplotlib inline
mpl.rcParams['figure.dpi'] = 300

from cellpose import models, io, core

use_GPU = core.use_gpu()
print('>>> GPU activated? %d'%use_GPU)

creating new log file
2022-01-07 14:42:41,502 [INFO] WRITING LOG OUTPUT TO /root/.cellpose/run.log
2022-01-07 14:42:50,232 [INFO] ** TORCH CUDA version installed and working. **
>>> GPU activated? 1


If an error occurs in the cell above, press Ctrl+M and run the cell again.

In [None]:
import zipfile
from google_drive_downloader import GoogleDriveDownloader as gdd
import glob
import cv2
import time

import matplotlib
import matplotlib.pyplot as plt
import numpy as np

from skimage import data, img_as_float
from skimage import exposure

Connect Drive

In [None]:
# import library
from google.colab import drive

#mount the drive
drive.mount('/content/drive')
# open the URL that is printed and copy the authorization code for the drive

Mounted at /content/drive


For each class, go through its files, run the segmentation code, then apply the image treatment to the results.

Note: we created two different types of treatment:
- One normalizes and pretreats the images before they go through cellpose.
- The other pretreats only the cells that cellpose generates from the original images.
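
The difference between the two treatments is mostly a question of order. The sketch below is only an illustration: it uses a plain min-max stretch as a stand-in for the real pretreatment (the notebook uses CLAHE) and a hypothetical bounding box in place of a cellpose mask. It shows that normalizing the whole image and then cropping a cell does not give the same pixels as cropping first and normalizing the crop.

```python
import numpy as np

def minmax_stretch(a):
    # Stand-in for the real pretreatment: rescale values to the 0..255 range.
    a = a.astype(np.float64)
    span = a.max() - a.min()
    if span == 0:
        return np.zeros(a.shape, dtype=np.uint8)
    return np.round((a - a.min()) / span * 255).astype(np.uint8)

# Toy 4x4 "image" with one dim cell (top-left) and one bright cell (bottom-right).
image = np.array([[10, 20,   0,   0],
                  [30, 40,   0,   0],
                  [ 0,  0, 200, 220],
                  [ 0,  0, 240, 250]], dtype=np.uint8)

# Hypothetical bounding box of the dim cell (rows 0..1, columns 0..1).
box = (slice(0, 2), slice(0, 2))

# Treatment 1: normalize the whole image, then crop the cell.
pre_then_crop = minmax_stretch(image)[box]

# Treatment 2: crop the cell from the original image, then normalize the crop.
crop_then_pre = minmax_stretch(image[box])

print(pre_then_crop)
print(crop_then_pre)
```

In the first case the dim cell stays dim because the bright cell sets the scale; in the second case the crop uses its own full range.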

## Image treatment functions 

In [None]:
def apply_mask(img, mask):
  # crop every labelled cell to its bounding box (green channel only)
  # and zero the pixels inside the box that belong to other labels
  cells = []
  cell_numbers = np.unique(mask)
  cell_numbers = cell_numbers[cell_numbers != 0]  # 0 is background
  for i in cell_numbers:
    rows, cols = np.where(mask == i)
    r0, r1 = np.amin(rows), np.amax(rows) + 1
    c0, c1 = np.amin(cols), np.amax(cols) + 1
    cell = np.copy(img[r0:r1, c0:c1, 1])
    # vectorized replacement for the per-pixel loop
    cell[mask[r0:r1, c0:c1] != i] = 0
    cells.append(cell)
  # note: this is the highest label + 1, not len(cells); the two differ
  # once border cells have been removed from the mask
  return np.amax(mask)+1, cells

def exclude_borders(mask):
  m = np.copy(mask)
  is_border = m != 0
  is_border[1:m.shape[0]-1, 1:m.shape[1]-1] = False
  border_cells = np.unique(m[is_border])
  for i in border_cells:
    m[m==i] = 0
  return m
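
To make the behavior of these two helpers concrete, here is a small sketch on a synthetic label mask. It repeats the `exclude_borders` logic so that it runs on its own, and it uses a single-channel toy image for brevity (the notebook's `apply_mask` takes the green channel of a 3-channel image). The mask and image values are made up for illustration.

```python
import numpy as np

def exclude_borders(mask):
    # Same logic as the notebook: zero every label that touches the image edge.
    m = np.copy(mask)
    is_border = m != 0
    is_border[1:m.shape[0]-1, 1:m.shape[1]-1] = False
    for label in np.unique(m[is_border]):
        m[m == label] = 0
    return m

# Toy label mask: 0 is background, 1 touches the left edge, 2 is interior.
mask = np.array([[0, 0, 0, 0, 0],
                 [1, 1, 0, 0, 0],
                 [1, 0, 2, 2, 0],
                 [0, 0, 2, 0, 0],
                 [0, 0, 0, 0, 0]])

kept = exclude_borders(mask)
remaining = np.unique(kept[kept != 0])

# Crop the surviving cell to its bounding box and blank the pixels outside it.
rows, cols = np.where(kept == 2)
window = (slice(rows.min(), rows.max() + 1), slice(cols.min(), cols.max() + 1))
image = np.arange(25).reshape(5, 5)
cell = np.where(kept[window] == 2, image[window], 0)

print(remaining)
print(cell)
```

Only label 2 survives, and the corner of its bounding box that is not part of the cell is set to 0.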

In [None]:
def create_images(class_path, class_name):
    # go through all the images in a class 
    # return images and their names
    count = 0
    images = []
    names = []
    for cell_path in sorted(glob.glob(class_path + '/*')):
        if cell_path[-3:]=='jpg':
            count += 1
            img = cv2.imread(cell_path)
            img = clahe_img(img)
            name = class_name + '_' + str(count)
            images.append(img)
            names.append(name)
    return images, names 

def clahe_img(img):
  clahe = cv2.createCLAHE(clipLimit=10, tileGridSize=(8,8))
  out = np.copy(img)
  # apply CLAHE to the green channel (index 1 in OpenCV's BGR order)
  out[:,:,1] = clahe.apply(out[:,:,1])
  return out

In [None]:
def rename_images(class_path, class_name):
    # go through all the images in a class
    # and rename each .jpg file to <class_name>_<index>.jpg
    count = 0
    for cell_path in sorted(glob.glob(class_path + '/*')):
        if cell_path[-3:]=='jpg':
            count += 1
            name = class_name + '_' + str(count)
            os.rename(cell_path, class_path+'/'+name+'.jpg')
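
The naming scheme can be tried safely in a temporary directory. The sketch below repeats the function so that it runs on its own; the file names and the class name are hypothetical.

```python
import glob
import os
import tempfile

def rename_images(class_path, class_name):
    # Same scheme as the notebook: <class_name>_<running index>.jpg
    count = 0
    for cell_path in sorted(glob.glob(class_path + '/*')):
        if cell_path[-3:] == 'jpg':
            count += 1
            name = class_name + '_' + str(count)
            os.rename(cell_path, class_path + '/' + name + '.jpg')

with tempfile.TemporaryDirectory() as class_dir:
    # Hypothetical file names; only the .jpg files should be renamed.
    for fname in ['scan_b.jpg', 'scan_a.jpg', 'notes.txt']:
        open(os.path.join(class_dir, fname), 'w').close()

    rename_images(class_dir, 'Centromere')
    result = sorted(os.listdir(class_dir))

print(result)
```

One caveat, stated as an assumption about how the folders are used: because the files are sorted lexicographically, running the function a second time on a folder with ten or more already renamed images would visit `_10` before `_2`, and on POSIX systems `os.rename` can silently replace an existing target file.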

In [None]:
import shutil  # needed for shutil.rmtree below

def delete_folder_content(folder):
    # for a given folder, delete its contents
    for filename in os.listdir(folder):
        file_path = os.path.join(folder, filename)
        try:
            if os.path.isfile(file_path) or os.path.islink(file_path):
                os.unlink(file_path)
            elif os.path.isdir(file_path):
                shutil.rmtree(file_path)
        except Exception as e:
            print('Failed to delete %s. Reason: %s' % (file_path, e))

## Cellpose functions

In [None]:
def run_cellpose(images, names, dest_path, class_path):
    model = models.Cellpose(gpu=use_GPU, model_type='nuclei')
    # masks must be saved as .npy in case there are more than 255 cells in an image

    # define CHANNELS to run segmentation on
    # grayscale=0, R=1, G=2, B=3
    # channels = [cytoplasm, nucleus]
    # if NUCLEUS channel does not exist, set the second channel to 0
    # channels = [0,0]
    # IF ALL YOUR IMAGES ARE THE SAME TYPE, you can give a list with 2 elements
    # channels = [0,0] # IF YOU HAVE GRAYSCALE
    # channels = [2,3] # IF YOU HAVE G=cytoplasm and B=nucleus
    # channels = [2,1] # IF YOU HAVE G=cytoplasm and R=nucleus

    # here every image is segmented on the green channel, with no second channel
    channels = [2, 0]
    # channels = [1,1]

    # if diameter is set to None, the size of the cells is estimated on a per-image basis
    # you can set the average cell `diameter` in pixels yourself (recommended)
    # diameter can be a list or a single number for all images

    masks, flows, styles, diams = model.eval(images, diameter=60, channels=channels)
   
    # delete_folder_content(class_path+"/Masks")
    os.makedirs(class_path+"/Masks", exist_ok=True)
    os.makedirs(class_path+"/CytoMasks", exist_ok=True)
    # BATCHES ARE DETERMINED BY CLASS
    print("Saving images")
    n_cells = 0
    # for every image: save the raw mask, exclude border cells,
    # apply the cells' mask and save the cells
    # (len(masks) also covers the single-image case)
    for i in range(len(masks)):
        np.save(class_path+"/Masks/"+names[i]+'.npy', masks[i])
        print(i, '-', len(masks))
        m = exclude_borders(masks[i])
        np.save(class_path+"/Masks/"+names[i]+'_no_borders.npy', m)
        aux, cells = apply_mask(images[i], m)
        filename = dest_path + names[i] + '.npy'
        np.save(filename, np.asarray(cells))
        n_cells += aux
    print(f'Total number of cells: {n_cells}')
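
The comment in `run_cellpose` about saving masks as `.npy` refers to label overflow: a label mask stored as an 8-bit image can hold only the values 0 to 255. A minimal sketch with a made-up mask of 300 labels shows what a cast to 8 bits would do.

```python
import numpy as np

# Hypothetical label mask with 300 cells laid out along one row.
mask = np.arange(1, 301, dtype=np.uint16).reshape(1, 300)

# Casting to 8 bits wraps the labels modulo 256.
as_8bit = mask.astype(np.uint8)

labels_before = len(np.unique(mask))
labels_after = len(np.unique(as_8bit))
print(labels_before, labels_after)
print(as_8bit[0, 255], as_8bit[0, 299])
```

After the cast, label 256 becomes 0 (background) and label 300 collides with label 44, so distinct cells can no longer be separated. Saving the array with `np.save` keeps the original integer type.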
      

In [None]:
def apply_cellpose(path, dest, interrupt=0):
    # for every class folder in the given path, apply cellpose to its images
    i=0
    n_class = len(glob.glob(path + '/*'))
    for class_path in sorted(glob.glob(path + '/*')): 
        i+=1
        class_name = class_path[len(path)+1:]

        c = dest + class_name
        #rename_images(class_path=class_path, class_name=class_name)
        images, names = create_images(class_path=class_path, class_name=class_name)
        #os.mkdir(c)
        print("Running for:", class_name," - ",  i, "/" ,n_class)
        run_cellpose(images, names, dest+class_name+'/', class_path)
        
        if interrupt:
          break


## Run cellpose on 2D sample images
Here we run the cellpose algorithm on all the images of every class to generate their cells, and save the cells, untreated, to the drive.

In [None]:
# do not run again: it takes a long time and is unnecessary
path = "/content/drive/MyDrive/Images/Images by class"
dest = "/content/drive/MyDrive/Images/Cells/Cells_ images/"

start = time.time()
apply_cellpose(path, dest, 0)

Running for: Centromere  -  1 / 47
2022-01-07 13:13:31,345 [INFO] ** TORCH CUDA version installed and working. **
2022-01-07 13:13:31,347 [INFO] >>>> using GPU
2022-01-07 13:13:31,450 [INFO] ~~~ FINDING MASKS ~~~
2022-01-07 13:13:32,877 [INFO] >>>> TOTAL TIME 1.43 sec
Saving images
0 - 1




Total number of cells: 95
Running for: Controle neg  -  2 / 47
2022-01-07 13:13:35,816 [INFO] ** TORCH CUDA version installed and working. **
2022-01-07 13:13:35,818 [INFO] >>>> using GPU
2022-01-07 13:13:35,920 [INFO] ~~~ FINDING MASKS ~~~
2022-01-07 13:13:37,371 [INFO] >>>> TOTAL TIME 1.45 sec
Saving images
0 - 1
Total number of cells: 171
Running for: Cyto APL  -  3 / 47
2022-01-07 13:13:44,148 [INFO] ** TORCH CUDA version installed and working. **
2022-01-07 13:13:44,150 [INFO] >>>> using GPU
2022-01-07 13:13:44,250 [INFO] ~~~ FINDING MASKS ~~~
2022-01-07 13:13:44,252 [INFO] 0%|          | 0/7 [00:00<?, ?it/s]
2022-01-07 13:13:45,635 [INFO] 14%|#4        | 1/7 [00:01<00:08,  1.38s/it]
2022-01-07 13:13:47,016 [INFO] 29%|##8       | 2/7 [00:02<00:06,  1.38s/it]
2022-01-07 13:13:48,382 [INFO] 43%|####2     | 3/7 [00:04<00:05,  1.37s/it]
2022-01-07 13:13:49,760 [INFO] 57%|#####7    | 4/7 [00:05<00:04,  1.38s/it]
2022-01-07 13:13:51,145 [INFO] 71%|#######1  | 5/7 [00:06<00:02,  1.38s/it



1 - 7
2 - 7
3 - 7
4 - 7
5 - 7
6 - 7
Total number of cells: 426
Running for: Cyto Fibreux  -  4 / 47
2022-01-07 13:14:10,880 [INFO] ** TORCH CUDA version installed and working. **
2022-01-07 13:14:10,882 [INFO] >>>> using GPU
2022-01-07 13:14:10,981 [INFO] ~~~ FINDING MASKS ~~~
2022-01-07 13:14:10,983 [INFO] 0%|          | 0/3 [00:00<?, ?it/s]
2022-01-07 13:14:12,322 [INFO] 33%|###3      | 1/3 [00:01<00:02,  1.33s/it]
2022-01-07 13:14:13,711 [INFO] 67%|######6   | 2/3 [00:02<00:01,  1.37s/it]
2022-01-07 13:14:15,106 [INFO] 100%|##########| 3/3 [00:04<00:00,  1.38s/it]
2022-01-07 13:14:15,109 [INFO] 100%|##########| 3/3 [00:04<00:00,  1.37s/it]
2022-01-07 13:14:15,115 [INFO] >>>> TOTAL TIME 4.13 sec
Saving images
0 - 3
1 - 3
2 - 3
Total number of cells: 155
Running for: Cyto Golgi  -  5 / 47
2022-01-07 13:14:24,952 [INFO] ** TORCH CUDA version installed and working. **
2022-01-07 13:14:24,954 [INFO] >>>> using GPU
2022-01-07 13:14:25,070 [INFO] ~~~ FINDING MASKS ~~~
2022-01-07 13:14:25,0

In [None]:

end = time.time()
print(f"Classified images time: {(end - start)/60} min" )

Classified images time: 41.073148095607756 min
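
The per-image cell files written by `run_cellpose` hold a list of crops, and the crops generally have different bounding-box sizes. Assuming that is the case, the saved array is an object array, which NumPy pickles on save and which needs `allow_pickle=True` on load. The sketch below uses two made-up crops and a temporary file. It builds the object array explicitly because newer NumPy versions refuse to create a ragged array without `dtype=object`.

```python
import os
import tempfile
import numpy as np

# Two hypothetical cell crops with different bounding-box sizes.
cells = [np.ones((3, 4), dtype=np.uint8), np.ones((5, 2), dtype=np.uint8)]

# Build the object array explicitly so the ragged list is stored as-is.
ragged = np.empty(len(cells), dtype=object)
for k, cell in enumerate(cells):
    ragged[k] = cell

with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, 'Centromere_1.npy')
    np.save(filename, ragged)
    # Object arrays are pickled, so loading needs allow_pickle=True.
    loaded = np.load(filename, allow_pickle=True)

shapes = [c.shape for c in loaded]
print(shapes)
```

Each element of `loaded` is one cell crop and can be passed to the treatment step on its own.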


Then we run the same pipeline on the provided pattern images.

In [None]:

path = "/content/drive/MyDrive/Images/Patterns"
dest = "/content/drive/MyDrive/Images/Cells/Pattern cells/"
   
    
# this run is not slow
start = time.time()
apply_cellpose(path, dest, 0)

Running for: pat_10  -  1 / 14
2022-01-07 14:43:17,916 [INFO] ** TORCH CUDA version installed and working. **
2022-01-07 14:43:17,917 [INFO] >>>> using GPU
2022-01-07 14:43:17,923 [INFO] Downloading: "https://www.cellpose.org/models/nucleitorch_0" to /root/.cellpose/models/nucleitorch_0



100%|██████████| 25.3M/25.3M [00:00<00:00, 28.0MB/s]

2022-01-07 14:43:19,264 [INFO] Downloading: "https://www.cellpose.org/models/nucleitorch_1" to /root/.cellpose/models/nucleitorch_1




100%|██████████| 25.3M/25.3M [00:00<00:00, 28.3MB/s]

2022-01-07 14:43:20,462 [INFO] Downloading: "https://www.cellpose.org/models/nucleitorch_2" to /root/.cellpose/models/nucleitorch_2




100%|██████████| 25.3M/25.3M [00:00<00:00, 28.1MB/s]

2022-01-07 14:43:21,674 [INFO] Downloading: "https://www.cellpose.org/models/nucleitorch_3" to /root/.cellpose/models/nucleitorch_3




100%|██████████| 25.3M/25.3M [00:00<00:00, 28.0MB/s]


2022-01-07 14:43:23,007 [INFO] Downloading: "https://www.cellpose.org/models/size_nucleitorch_0.npy" to /root/.cellpose/models/size_nucleitorch_0.npy



100%|██████████| 3.54k/3.54k [00:00<00:00, 3.69MB/s]

2022-01-07 14:43:23,317 [INFO] ~~~ FINDING MASKS ~~~
2022-01-07 14:43:23,319 [INFO] 0%|          | 0/4 [00:00<?, ?it/s]





2022-01-07 14:43:24,663 [INFO] 25%|##5       | 1/4 [00:01<00:04,  1.34s/it]
2022-01-07 14:43:25,822 [INFO] 50%|#####     | 2/4 [00:02<00:02,  1.23s/it]
2022-01-07 14:43:26,988 [INFO] 75%|#######5  | 3/4 [00:03<00:01,  1.20s/it]
2022-01-07 14:43:28,132 [INFO] 100%|##########| 4/4 [00:04<00:00,  1.18s/it]
2022-01-07 14:43:28,134 [INFO] 100%|##########| 4/4 [00:04<00:00,  1.20s/it]
2022-01-07 14:43:28,140 [INFO] >>>> TOTAL TIME 4.82 sec
Saving images
0 - 4




1 - 4
2 - 4
3 - 4
Total number of cells: 301
Running for: pat_11  -  2 / 14
2022-01-07 14:43:35,247 [INFO] ** TORCH CUDA version installed and working. **
2022-01-07 14:43:35,248 [INFO] >>>> using GPU
2022-01-07 14:43:35,338 [INFO] ~~~ FINDING MASKS ~~~
2022-01-07 14:43:35,342 [INFO] 0%|          | 0/4 [00:00<?, ?it/s]
2022-01-07 14:43:36,497 [INFO] 25%|##5       | 1/4 [00:01<00:03,  1.15s/it]
2022-01-07 14:43:37,629 [INFO] 50%|#####     | 2/4 [00:02<00:02,  1.14s/it]
2022-01-07 14:43:38,805 [INFO] 75%|#######5  | 3/4 [00:03<00:01,  1.16s/it]
2022-01-07 14:43:39,930 [INFO] 100%|##########| 4/4 [00:04<00:00,  1.14s/it]
2022-01-07 14:43:39,931 [INFO] 100%|##########| 4/4 [00:04<00:00,  1.15s/it]
2022-01-07 14:43:39,938 [INFO] >>>> TOTAL TIME 4.60 sec
Saving images
0 - 4
1 - 4
2 - 4
3 - 4
Total number of cells: 261
Running for: pat_12  -  3 / 14
2022-01-07 14:43:47,136 [INFO] ** TORCH CUDA version installed and working. **
2022-01-07 14:43:47,138 [INFO] >>>> using GPU
2022-01-07 14:43:47,

In [1]:
end = time.time()
print(f"Patterns time: {(end - start)/60} min" )

NameError: ignored
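
A `NameError` like the one above is what Python raises when a name such as `start` or `time` is not defined in the current session, for example when this cell is run on its own after a runtime restart (the output does not show which name was missing, so this is an assumption). One way to avoid the dependency between cells is to keep the start and end of the measurement in one scope. The `timed` helper below is hypothetical and not part of the notebook, and the summed range is a placeholder for the real call.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    # Hypothetical helper: start and end are measured in the same scope.
    start = time.perf_counter()
    try:
        yield
    finally:
        minutes = (time.perf_counter() - start) / 60
        print(f"{label} time: {minutes:.2f} min")

with timed("Patterns"):
    # Placeholder workload; in the notebook this would be
    # apply_cellpose(path, dest, 0)
    total = sum(range(1000))
```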