# Quality control notebook

Cell shape and population numbers vary widely between images of the same classification, so I have created a notebook that loads an image together with its single-cell tracking data for inspection before it is approved for downstream analysis.

The order of this notebook is as follows:

1. Import the modules needed to load and visualise the tracking data (across Z).
2. Define the image file name.
3. Load the image.
4. Define the single-cell segmentation/tracking data file name (using the image file name).
5. Load the single-cell data.
6. Launch the Napari viewer with images, segmentation and tracks.

In [1]:
import os
from glob import glob
import numpy as np
from skimage import io
from skimage.measure import label, regionprops
from tqdm.auto import tqdm
import napari
import btrack

# Define image filename (fn) as a string

I have first defined a `root_dir` (i.e. the address of the folder where the images are kept) and then joined it (using `os.path.join`) with the image basename (i.e. the filename of the image on its own). Keeping the two separate makes the final, identifying part of the full image path easier to read.
The `root_dir` string will look different on a Mac because our respective computers connect to NEMO in different ways.
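
One way to keep the platform difference in a single place is sketched below. Both mount points are hypothetical placeholders rather than the real NEMO locations, and `get_root_dir` is an illustrative helper that is not used elsewhere in this notebook.

```python
import os
import sys

# Hypothetical mount points: replace with wherever the share is mounted on each machine.
ROOT_DIRS = {
    "linux": "/path/to/linux/mount/images",
    "darwin": "/Volumes/path/to/mac/mount/images",
}

def get_root_dir(platform=sys.platform):
    """Return the image folder configured for the given operating system."""
    for prefix, root in ROOT_DIRS.items():
        if platform.startswith(prefix):
            return root
    raise ValueError(f"No root_dir configured for platform {platform!r}")

# Join the folder and the basename, as in the cell below.
example_path = os.path.join(get_root_dir("linux"), "example.tif")
print(os.path.basename(example_path))
```

Only the dictionary needs editing when the notebook moves between machines; the rest of the notebook can keep using `os.path.join(root_dir, image_file_name)`.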

In [2]:
root_dir = '/run/user/30046150/gvfs/smb-share:server=data2.thecrick.org,share=lab-gutierrezm/home/shared/Lung on Chip/homuncu_loc_image_analysis/iAT1_iAT2_iVEC_macrophage_experiments/DAPI_ZO1_CD16_MTB/images'

In [31]:
# folder where the images are kept
root_dir = '/run/user/30046150/gvfs/smb-share:server=data2.thecrick.org,share=lab-gutierrezm/home/shared/Lung on Chip/homuncu_loc_image_analysis/iAT1_iAT2_iVEC_macrophage_experiments/DAPI_ZO1_LDO_MTB/images'
# now define the image file name
image_file_name = '20230802_20X_23-03-001A2_DAPI_ZO-1_LDO_Mtb_Multichannel Z-Stack_20230802_1504.tif'
image_path = os.path.join(root_dir, image_file_name)
print(f'Loading image path: {image_file_name}')
image = io.imread(image_path)
print(f'Loaded image: {image_file_name}')

Loading image path: 20230802_20X_23-03-001A2_DAPI_ZO-1_LDO_Mtb_Multichannel Z-Stack_20230802_1504.tif
Loaded image: 20230802_20X_23-03-001A2_DAPI_ZO-1_LDO_Mtb_Multichannel Z-Stack_20230802_1504.tif


# Launch napari with image

In [32]:
# this opens the napari viewer
viewer = napari.Viewer(title = os.path.basename(image_path))
# and this adds the image, splitting it into one layer per channel (the channels are on the final axis)
viewer.add_image(image, channel_axis=-1)

[<Image layer 'Image' at 0x7f602d139700>,
 <Image layer 'Image [1]' at 0x7f6296399ee0>,
 <Image layer 'Image [2]' at 0x7f6296353fd0>,
 <Image layer 'Image [3]' at 0x7f6296310e80>]
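
Four layers are returned because `channel_axis=-1` asks napari to split the array along its last axis and add one layer per channel. The sketch below shows the equivalent split in plain NumPy on a small stand-in array; the `(Z, Y, X, channel)` ordering is an assumption based on how the image is added above.

```python
import numpy as np

# Stand-in for the loaded stack, assumed to be ordered (Z, Y, X, channel).
image_demo = np.zeros((5, 8, 8, 4), dtype=np.uint16)

# Move the channel axis to the front and split into one array per channel.
channels = list(np.moveaxis(image_demo, -1, 0))

print(len(channels), channels[0].shape)
```

Checking `image.shape` before adding the image is a quick way to confirm that the channels really are on the final axis.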

# Load corresponding single-cell data

In [20]:
# a couple of things are going on here: I am using glob to find all files in the sc_analyses folder
# whose filenames start with the image filename (minus .tif) and end with the .h5 extension
sc_paths = glob(os.path.join(root_dir.replace('images','sc_analyses'), image_file_name.replace('.tif', '*.h5')))
# show results 
print(f'The following single-cell files have been found:\n {sc_paths}')

The following single-cell files have been found:
 ['/run/user/30046150/gvfs/smb-share:server=data2.thecrick.org,share=lab-gutierrezm/home/shared/Lung on Chip/homuncu_loc_image_analysis/iAT1_iAT2_iVEC_macrophage_experiments/DAPI_ZO1_CD16_MTB/sc_analyses/20230707_40X_23-01-001A3_Multichannel Z-Stack_20230707_1313_mphi.h5', '/run/user/30046150/gvfs/smb-share:server=data2.thecrick.org,share=lab-gutierrezm/home/shared/Lung on Chip/homuncu_loc_image_analysis/iAT1_iAT2_iVEC_macrophage_experiments/DAPI_ZO1_CD16_MTB/sc_analyses/20230707_40X_23-01-001A3_Multichannel Z-Stack_20230707_1313_iat2.h5', '/run/user/30046150/gvfs/smb-share:server=data2.thecrick.org,share=lab-gutierrezm/home/shared/Lung on Chip/homuncu_loc_image_analysis/iAT1_iAT2_iVEC_macrophage_experiments/DAPI_ZO1_CD16_MTB/sc_analyses/20230707_40X_23-01-001A3_Multichannel Z-Stack_20230707_1313_iat1.h5']
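
The search above relies on two string replacements. A slightly more defensive variant is sketched below, assuming the `sc_analyses` folder always sits next to the `images` folder; `find_sc_files` is a hypothetical helper, demonstrated on a throwaway folder rather than the real data.

```python
import os
import tempfile
from glob import escape, glob

def find_sc_files(image_path):
    """Find .h5 files in a sibling 'sc_analyses' folder that share the image's stem."""
    image_dir, image_name = os.path.split(image_path)
    # Swap only the final folder, rather than every occurrence of 'images' in the path.
    sc_dir = os.path.join(os.path.dirname(image_dir), "sc_analyses")
    stem = os.path.splitext(image_name)[0]
    # Escape the literal parts so characters such as '[' are not treated as wildcards.
    pattern = os.path.join(escape(sc_dir), escape(stem) + "*.h5")
    return sorted(glob(pattern))

# Demonstration on a temporary folder structure with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "images"))
    os.makedirs(os.path.join(tmp, "sc_analyses"))
    for suffix in ("_iat1.h5", "_iat2.h5", "_mphi.h5"):
        open(os.path.join(tmp, "sc_analyses", "example" + suffix), "w").close()
    demo_image = os.path.join(tmp, "images", "example.tif")
    found = [os.path.basename(p) for p in find_sc_files(demo_image)]

print(found)
```

Printing the file names found, as the cell above does, also makes it easy to spot when the results belong to a different image than the one currently loaded.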


# Add single-cell data to napari

In [29]:
# iterate over the different types of single-cell files
for n, fn in enumerate(sc_paths):
    # extract cell type information from filename
    if 'iat1' in fn:
        obj_type = 'obj_type_3'
        name = 'iat1'
    elif 'iat2' in fn:
        obj_type = 'obj_type_2'
        name = 'iat2'
    elif 'mphi' in fn:
        obj_type = 'obj_type_1'
        name = 'mphi'
    else:
        # skip unrecognised files rather than reusing the previous obj_type
        continue
    # use btrack to load single-cell data
    with btrack.io.HDF5FileHandler(fn, 'r', obj_type=obj_type) as reader:
        # load segmentation
        segmentation = reader.segmentation
        # load tracks and keep only those with at least 3 points
        tracks = [t for t in reader.tracks if len(t) >= 3]
    # convert tracks to a napari-compatible format
    napari_tracks, _, _ = btrack.utils.tracks_to_napari(tracks, ndim=2)
    # colour segmentation according to tracks (so that the masks stay the same colour over Z)
    recolored_segmentation = btrack.utils.update_segmentation(segmentation, tracks, color_by='ID')
    # add all single-cell data to napari
    viewer.add_labels(segmentation, visible=False, name=f'{name} original unfiltered segmentation')
    viewer.add_labels(recolored_segmentation, name=f'{name} filtered segmentation')
    viewer.add_tracks(napari_tracks, name=f'{name} tracks', visible=False)

[INFO][2023/10/03 02:08:46 pm] Opening HDF file: /run/user/30046150/gvfs/smb-share:server=data2.thecrick.org,share=lab-gutierrezm/home/shared/Lung on Chip/homuncu_loc_image_analysis/iAT1_iAT2_iVEC_macrophage_experiments/DAPI_ZO1_CD16_MTB/sc_analyses/20230707_40X_23-01-001A3_Multichannel Z-Stack_20230707_1313_iat1.h5...
[ERROR][2023/10/03 02:08:47 pm] Segmentation not found in /run/user/30046150/gvfs/smb-share:server=data2.thecrick.org,share=lab-gutierrezm/home/shared/Lung on Chip/homuncu_loc_image_analysis/iAT1_iAT2_iVEC_macrophage_experiments/DAPI_ZO1_CD16_MTB/sc_analyses/20230707_40X_23-01-001A3_Multichannel Z-Stack_20230707_1313_iat1.h5
[ERROR][2023/10/03 02:08:47 pm] Tracks not found in /run/user/30046150/gvfs/smb-share:server=data2.thecrick.org,share=lab-gutierrezm/home/shared/Lung on Chip/homuncu_loc_image_analysis/iAT1_iAT2_iVEC_macrophage_experiments/DAPI_ZO1_CD16_MTB/sc_analyses/20230707_40X_23-01-001A3_Multichannel Z-Stack_20230707_1313_iat1.h5
[INFO][2023/10/03 02:08:47 pm] 

TypeError: 'NoneType' object is not iterable
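
The log above reports that neither segmentation nor tracks were found for the requested object type, and the `TypeError` follows from iterating over the missing tracks. The sketch below shows one way to guard against this, assuming the reader returns `None` when the data is absent (as the log and traceback suggest). `CELL_TYPES`, `cell_type_from_filename` and `filter_tracks` are hypothetical helpers, not part of btrack.

```python
import os

# Filename suffix -> (obj_type, layer name), mirroring the mapping used in the loop above.
CELL_TYPES = {
    "iat1": ("obj_type_3", "iat1"),
    "iat2": ("obj_type_2", "iat2"),
    "mphi": ("obj_type_1", "mphi"),
}

def cell_type_from_filename(fn):
    """Return (obj_type, name) for a single-cell file, or None if unrecognised."""
    stem = os.path.splitext(os.path.basename(fn))[0]
    for suffix, info in CELL_TYPES.items():
        if stem.endswith(suffix):
            return info
    return None

def filter_tracks(tracks, min_length=3):
    """Keep tracks with at least min_length points; treat missing tracks as empty."""
    if tracks is None:
        return []
    return [t for t in tracks if len(t) >= min_length]

print(cell_type_from_filename("/some/folder/example_iat1.h5"))
print(filter_tracks(None))
```

Inside the loop, a file could then be skipped whenever `cell_type_from_filename(fn)` returns `None` or the filtered track list is empty, instead of stopping the whole cell.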

In [33]:
fn

'/run/user/30046150/gvfs/smb-share:server=data2.thecrick.org,share=lab-gutierrezm/home/shared/Lung on Chip/homuncu_loc_image_analysis/iAT1_iAT2_iVEC_macrophage_experiments/DAPI_ZO1_CD16_MTB/sc_analyses/20230707_40X_23-01-001A3_Multichannel Z-Stack_20230707_1313_iat1.h5'

In [None]:
# sometimes when you give napari a lot to think about it needs a gentle nudge to finish completing the task, run this cell to do so
print()

# Misc: example of inspecting cell data

In [None]:
# look at the first track
tracks[0]

In [None]:
# look at a specific cell ID
cell_ID = 17
# print that track
[t for t in tracks if t.ID == cell_ID][0]
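
Indexing the filtered list with `[0]` raises an `IndexError` when no track has the requested ID, for example because that track was shorter than the length filter. A minimal sketch of a lookup that returns `None` instead is shown below; `Track` is a stand-in used only for this demonstration, and `get_track` is a hypothetical helper.

```python
from collections import namedtuple

# Stand-in for a track object: only the ID attribute matters for the lookup.
Track = namedtuple("Track", ["ID", "points"])

def get_track(tracks, cell_id):
    """Return the first track with the given ID, or None if there is no such track."""
    return next((t for t in tracks if t.ID == cell_id), None)

tracks_demo = [Track(ID=3, points=[0, 1, 2]), Track(ID=17, points=[4, 5, 6])]
print(get_track(tracks_demo, 17))
print(get_track(tracks_demo, 99))
```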

In [35]:
mask_obj_type_2[i, j]

array([0, 0, 0, ..., 0, 0, 0], dtype=uint16)

In [43]:
combined_mask.shape

(61, 2304, 2304)

In [44]:
# first combine the iAT1 and iAT2 masks (obj_type_3 and obj_type_2)
# Load the mask arrays from segmentation_dict
mask_obj_type_2 = segmentation_dict['obj_type_2']
mask_obj_type_3 = segmentation_dict['obj_type_3']

# Create a new mask with the same shape as the input masks
combined_mask = np.zeros_like(mask_obj_type_2)

for frame in tqdm(range(len(combined_mask))):
    # Iterate through each pixel and assign values based on priority
    for i in range(combined_mask.shape[1]):
        for j in range(combined_mask.shape[2]):
            if mask_obj_type_2[frame, i, j] > 0:
                combined_mask[frame, i, j] = mask_obj_type_2[frame, i, j]
            else:
                combined_mask[frame, i, j] = mask_obj_type_3[frame, i, j]

viewer.add_labels(combined_mask)

  0%|          | 0/61 [00:00<?, ?it/s]

KeyboardInterrupt: 

In [49]:
# Load the mask arrays from segmentation_dict
mask_obj_type_2 = segmentation_dict['obj_type_2']
mask_obj_type_3 = segmentation_dict['obj_type_3']

# Create a new mask with the same shape as the input masks
combined_mask = np.zeros_like(mask_obj_type_2)

# Find pixels where object type 2 masks are non-zero
obj_type_2_pixels = mask_obj_type_2 > 0

# Use object type 2 masks where they are non-zero, and object type 3 masks elsewhere
combined_mask[obj_type_2_pixels] = mask_obj_type_2[obj_type_2_pixels]
combined_mask[~obj_type_2_pixels] = mask_obj_type_3[~obj_type_2_pixels]

# Extract the unique label IDs present in the object type 2 masks
unique_pixel_ids_obj_type_2 = np.unique(mask_obj_type_2)

# Find the label IDs in the combined mask that do not occur in the object type 2 masks
final_pixel_ids_obj_type_3 = np.unique(combined_mask)
final_pixel_ids_obj_type_3 = np.setdiff1d(final_pixel_ids_obj_type_3, unique_pixel_ids_obj_type_2)

# Create a binary mask for final_pixel_ids_obj_type_3
binary_mask_final_obj_type_3 = np.zeros_like(mask_obj_type_2, dtype=np.uint8)
for pixel_id in tqdm(final_pixel_ids_obj_type_3):
    binary_mask_final_obj_type_3[combined_mask == pixel_id] = 1

viewer.add_labels(combined_mask)
viewer.add_labels(binary_mask_final_obj_type_3)

  0%|          | 0/75 [00:00<?, ?it/s]

<Labels layer 'binary_mask_final_obj_type_3' at 0x7f79d7ca1ee0>
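
The loop over label IDs can be written as a single `np.isin` call. The sketch below uses small stand-in arrays rather than the real masks. It also illustrates a caveat: IDs are compared by value, so if the two segmentations were labelled independently (an assumption), an object type 3 cell that happens to share an ID with an object type 2 cell is left out. A mask based on pixel positions avoids this.

```python
import numpy as np

mask_2 = np.array([[1, 1, 0, 0]], dtype=np.uint16)  # stand-in for obj_type_2 labels
mask_3 = np.array([[0, 4, 4, 1]], dtype=np.uint16)  # stand-in for obj_type_3 labels

# Same priority rule as above: obj_type_2 where present, obj_type_3 elsewhere.
combined = np.where(mask_2 > 0, mask_2, mask_3)

# ID-based mask, equivalent to looping over the IDs one at a time.
ids_3_only = np.setdiff1d(np.unique(combined), np.unique(mask_2))
by_id = np.isin(combined, ids_3_only).astype(np.uint8)

# Position-based alternative: obj_type_3 pixels not covered by obj_type_2.
by_position = ((mask_3 > 0) & (mask_2 == 0)).astype(np.uint8)

print(by_id)
print(by_position)
```

In this toy example the two masks differ at the last pixel, where the object type 3 cell carries the same ID as the object type 2 cell.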

In [54]:
threshold = 2500

# Calculate connected components in the binary mask
labeled_mask = label(binary_mask_final_obj_type_3)

# Filter masks based on area threshold
filtered_mask = np.zeros_like(labeled_mask)
for region in tqdm(regionprops(labeled_mask)):
    if region.area > threshold:
        filtered_mask[labeled_mask == region.label] = 1


viewer.add_labels(filtered_mask)

  0%|          | 0/427 [00:00<?, ?it/s]

<Labels layer 'filtered_mask' at 0x7f79d4aa09d0>
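
Comparing the whole labelled array against each region label in turn becomes slow for several hundred regions. A vectorised sketch using `np.bincount` is shown below, assuming that the size of a region means its pixel (or voxel) count, as `region.area` reports; `filter_by_size` is a hypothetical helper demonstrated on a small stand-in array.

```python
import numpy as np

def filter_by_size(labeled, threshold):
    """Return a binary mask of the labelled regions larger than threshold pixels."""
    # Count the pixels belonging to each label; index 0 is the background.
    sizes = np.bincount(labeled.ravel())
    keep = sizes > threshold
    keep[0] = False  # never keep the background
    # Look up every pixel's label in the keep table.
    return keep[labeled].astype(labeled.dtype)

labeled_demo = np.array([[1, 1, 1, 0],
                         [2, 0, 3, 3]])
filtered_demo = filter_by_size(labeled_demo, threshold=1)
print(filtered_demo)
```

The same function works unchanged on a 3D labelled stack, because the counting and the lookup do not depend on the number of dimensions.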

In [55]:
viewer.add_labels(labeled_mask)

<Labels layer 'labeled_mask' at 0x7f79b252a8e0>

In [48]:
final_pixel_ids_obj_type_3

array([  4,   5,   8,  10,  12,  13,  16,  17,  19,  24,  25,  26,  27,
        31,  54,  59,  62,  72,  94, 101, 103, 116, 146, 147, 158, 160,
       164, 165, 169, 171, 177, 197, 213, 219, 240, 249, 264, 293, 304,
       318, 329, 330, 359, 367, 368, 369, 380, 384, 389, 393, 395, 397,
       402, 403, 405, 412, 425, 435, 439, 452, 498, 510, 521, 523, 539,
       540, 550, 567, 607, 627, 653, 657, 676, 680, 686], dtype=uint16)