#last modified (yyyy/mm/dd): 2024/08/22

Author: Alessandro Ulivi (ale.ulivi@gmail.com)

The present pipeline performs a semantic segmentation of 1 channel. It was conceptualized for CYK1::GFP. Refer to the ACR005 transgenic C. elegans line for background information on the transgenes.

The pipeline has been developed and tested on time-lapse images of the C. elegans embryo cortex. The embryo was imaged between the 2-cell and 6-cell stages. Images were acquired live on a Nikon CSU-X1 spinning disk microscope at IGMBC Strasbourg, using a 100x, 1.4 NA oil immersion objective (xy pixel size 0.11 um).

The pipeline works on 3D arrays. It was conceptualized for processing a single plane (the embryonic cortical plane) imaged over time. Thus, the segmentation masks are obtained by iterating 2D segmentation methods along one of the 3 axes of the input array. When the pipeline was created, this axis corresponded to the time dimension of the time-lapse.

To work properly, the pipeline requires an existing input folder (input_folder) containing a single 3D array to process. The input folder must not contain additional files. The pipeline was conceptualized so that the array is the cyk1::gfp time-lapse.

When an ROI is used for segmentation, the iteration axis must be at position 0 of the input file shape. For example, when the pipeline was created this corresponded to the time axis and the files had dimension order TYX. Different dimension orders are NOT possible at the moment if an ROI is provided for segmentation, because the reference image passed to form_mask_from_roi when creating embryo_roi_mask is the 2D array at position 0 of ch1_timecourse (ch1_timecourse[0,...]).
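
If a stack is stored with the iteration axis at another position, one possible workaround is to move that axis to position 0 before saving it as input for the pipeline. The following is a minimal sketch, assuming a plain NumPy array; to_iteration_first is a hypothetical helper and is not part of the pipeline.

```python
import numpy as np

def to_iteration_first(stack, iteration_axis):
    """Return a view of a 3D stack with the iteration axis moved to position 0.

    Hypothetical helper for illustration only, not part of the pipeline.
    """
    if stack.ndim != 3:
        raise ValueError(f"expected a 3D array, got {stack.ndim} dimensions")
    return np.moveaxis(stack, iteration_axis, 0)

# Example: a YXT stack (time on the last axis) becomes TYX
yxt_stack = np.zeros((64, 48, 10), dtype=np.uint16)
tyx_stack = to_iteration_first(yxt_stack, iteration_axis=2)
print(tyx_stack.shape)  # (10, 64, 48)
```

With the rearranged array saved as the input file, time_axis can stay at 0.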

It is required to provide an output folder. The output folder must exist. The output folder can contain files. The outputs of the pipeline are saved in the output folder.
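
The folder requirements above can be checked before running the pipeline. The following is a minimal sketch using only the standard library; check_pipeline_folders is a hypothetical helper, and ignoring files whose name starts with "." is an assumption about how hidden files should be treated.

```python
import os
import tempfile

def check_pipeline_folders(input_folder, output_folder):
    """Return the name of the single visible file in input_folder, or raise.

    Hypothetical helper for illustration only, not part of the pipeline.
    """
    if not os.path.isdir(input_folder):
        raise FileNotFoundError(f"input folder not found: {input_folder}")
    if not os.path.isdir(output_folder):
        raise FileNotFoundError(f"output folder not found: {output_folder}")
    # Assumption: hidden files (names starting with ".") do not count
    visible = sorted(f for f in os.listdir(input_folder) if not f.startswith("."))
    if len(visible) != 1:
        raise ValueError(f"expected exactly 1 file in the input folder, found {len(visible)}")
    return visible[0]

# Example on temporary folders holding one placeholder file
with tempfile.TemporaryDirectory() as in_dir, tempfile.TemporaryDirectory() as out_dir:
    open(os.path.join(in_dir, "embryo.tif"), "w").close()
    found_name = check_pipeline_folders(in_dir, out_dir)
print(found_name)  # embryo.tif
```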

The outputs of the pipeline are:
1) A binary segmentation mask for the ch1 3D array. Refer to section SEGMENTATION OF CH1 - CYK1GFP ENRICHED STRUCTURES.
2) A binary segmentation mask corresponding to the embryo cortex. Refer to section SEGMENTATION OF EMBRYO CORTEX.

When creating output 2, it is possible to restrict the segmentation to one or more regions of interest (ROIs) of the input 3D array. In this case, the ROIs must be provided as a single file (.roi or .zip) contained in a folder (indicated by the roi_path variable). The folder must contain nothing but this file. The procedure is conceptualized for ROI files created in ImageJ/Fiji.



The following cell imports the packages and functions required for the pipeline.

The cell should be run.

The cell should not be modified.

In [7]:
#Import packages
import os
import numpy as np
import matplotlib.pyplot as plt
import tifffile
from utils import listdirNHF, form_mask_from_roi
from segment_cyk1gfp import mask_embryo, segment_cyk1_enriched_domains



DEFINE INPUT AND OUTPUT FOLDERS - DEFINE COMMON VARIABLES FOR DIFFERENT PROCESSING PARTS - OPEN THE INPUT FILE

The following cell must be run.

Some parts of the following cell must be changed according to the processing to do.

In [8]:
#indicate the directories of the input_folder and the output_folder.
input_folder = r"/Users/ulivia/Desktop/Alessandro/projects/filopodia_paper/fig2_contact_filopodia/step_03/210709-GM-ACR005/210709-GM-ACR005-7"
output_folder = r"/Users/ulivia/Desktop/Alessandro/projects/filopodia_paper/fig2_contact_filopodia/step_04/210709-GM-ACR005/210709-GM-ACR005-7"

#indicate saving names for the outputs - indicate the names without extension - these names will be included in the file names of the output segmentation masks
ch1_sav_ing_name = "cyk1_enriched_domains" #Note: when the pipeline was created, channel-1 was cyk1::gfp signal
embryo_mask_sav_ing_name = "embryo_mask"

#indicate in which dimension to iterate the 2D segmentation methods - when the pipeline was created this axis corresponded to time in the time-lapse image - when input files are created using
#open_rearrange_nikon_files this dimension is 0
time_axis = 0

#=== OPEN THE INPUT FILE - CREATE THE PREFIX FOR THE SAVING NAME === DON'T MODIFY THE LINES BELOW
#Open files in the input_folder as a list
input_files_l = listdirNHF(input_folder)

#Get file extension
extension = ".tif"
for e in ['.TIF', '.ome.tif']:
    if e in input_files_l[0]:
        extension=e

#Get the part of the file name before the extension and use it as the initial part of the saving name
extension_index = input_files_l[0].index(extension)
prefix_name = input_files_l[0][:extension_index]

#Open the input file (the single file in input_folder)
ch1_timecourse = tifffile.imread(os.path.join(input_folder,input_files_l[0]))
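
The prefix extraction in the cell above can also be written as a small function that matches the extension only at the end of the file name. This is a sketch of an alternative, not the code used by the pipeline; split_prefix_and_extension and the example file names are hypothetical.

```python
def split_prefix_and_extension(file_name, extensions=(".ome.tif", ".TIF", ".tif")):
    """Split file_name into (prefix, extension), trying longer extensions first.

    Hypothetical helper for illustration only, not part of the pipeline.
    """
    for ext in sorted(extensions, key=len, reverse=True):
        if file_name.endswith(ext):
            return file_name[: -len(ext)], ext
    raise ValueError(f"unsupported file extension: {file_name}")

# Example file names (hypothetical)
print(split_prefix_and_extension("210709-GM-ACR005-7.ome.tif"))  # ('210709-GM-ACR005-7', '.ome.tif')
print(split_prefix_and_extension("stack.TIF"))                   # ('stack', '.TIF')
```

Trying the longest extension first keeps ".ome.tif" from being split as ".tif" with a prefix ending in ".ome".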


SEGMENTATION OF EMBRYO CORTEX

Run the following 2 cells to obtain and save a binary segmentation of the time-lapse. When the pipeline was created, this segmentation separated the embryo cortex from the background.

To do the segmentation, the next two cells must be run. The first of the next two cells could be modified. The second of the next two cells should not be modified.





The segmentation consists of the following steps:
1) Each 2D array (i) along the iteration axis of the input array is smoothed using a Gaussian filter (sigma 20). Output name GAU(i).
2) The histogram distribution of intensity values is calculated (bins=100) for each GAU(i) along the iteration axis of the input array. Output name HIS(i).
3) For each HIS(i) along the iteration axis of the input array, the position on the x axis of the mode value of the histogram distribution is calculated. Output name MOD(i).
4) For each HIS(i) along the iteration axis of the input array, an intensity value IV1(i) is calculated. IV1(i) corresponds to the minimum of HIS(i) when only values higher than MOD(i) are considered.
5) For each HIS(i) along the iteration axis of the input array, an intensity value IV2(i) is calculated. IV2(i) corresponds to the maximum of HIS(i) when only values higher than MOD(i) are considered.
6) For each HIS(i) along the iteration axis of the input array, an intensity value IV3(i) is calculated. IV3(i) corresponds to the minimum of HIS(i) when only values higher than MOD(i) and smaller than IV2(i) are considered.
7) IV1 and IV2 are pooled for all 2D arrays along the iteration axis. Output name POOL.
8) The median M and standard deviation S of POOL are calculated.
9) An intensity value THRESHOLD(i) is calculated for each 2D array (i) along the iteration axis of the input array. THRESHOLD(i) corresponds to IV1(i) if IV1(i) is within 1 S from M. Otherwise, THRESHOLD(i) corresponds to IV2(i) if IV2(i) is within 1 S from M. Otherwise, THRESHOLD(i) corresponds to M.
10) Each Gaussian-smoothed 2D array GAU(i) along the iteration axis is binarized using THRESHOLD(i) as a highpass filter. Pixels whose intensity values are >THRESHOLD(i) are set to positive. Output THRESH_IMG(i).
11) Individual regions of THRESH_IMG(i) are groups of connected pixels with the same intensity value that are completely surrounded by background pixels. Refer to https://scikit-image.org/docs/stable/api/skimage.measure.html#skimage.measure.label . Individual regions of each 2D THRESH_IMG(i) along the iteration axis are filtered using a highpass filter. The value of the highpass filter (in number of pixels) is defined in the variable embryo_highpass_area_threshold. Only regions whose area is >embryo_highpass_area_threshold are maintained. Output FINAL_THRESH_IMG(i).
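
The threshold selection of steps 7 to 9 can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the implementation inside mask_embryo: select_thresholds is a hypothetical function, "within 1 S from M" is read as an absolute difference smaller than or equal to S, and the IV1 and IV2 values in the example are made up.

```python
import numpy as np

def select_thresholds(iv1, iv2):
    """Pick one threshold per 2D array from the candidates IV1(i) and IV2(i).

    Hypothetical sketch of steps 7-9, not the code used by mask_embryo.
    """
    iv1 = np.asarray(iv1, dtype=float)
    iv2 = np.asarray(iv2, dtype=float)
    pool = np.concatenate([iv1, iv2])   # step 7: POOL
    m = np.median(pool)                 # step 8: median M
    s = np.std(pool)                    # step 8: standard deviation S
    thresholds = np.full(iv1.shape, m)  # step 9: fall back to M
    iv2_ok = np.abs(iv2 - m) <= s       # assumption: "within 1 S" is inclusive
    thresholds[iv2_ok] = iv2[iv2_ok]
    iv1_ok = np.abs(iv1 - m) <= s
    thresholds[iv1_ok] = iv1[iv1_ok]    # IV1 takes precedence over IV2
    return thresholds

# Made-up candidate values for 4 timepoints
thresholds = select_thresholds(iv1=[10, 11, 50, 55], iv2=[12, 13, 12, 60])
print(thresholds.tolist())  # [10.0, 11.0, 12.0, 12.5]
```

In the example, the first two timepoints keep IV1, the third falls back to IV2 because its IV1 is an outlier, and the fourth falls back to M because both candidates are outliers.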


In [9]:
#THESE PARAMETERS COULD BE CHANGED TO REFINE THE SEGMENTATION

#Highpass filter area - most likely the following line should not be modified
embryo_highpass_area_threshold = 250 #Segmented structures smaller than this value (the unit is number of pixels) are removed. Suggested value is 250.

#Indicate if an roi should be used for the embryo segmentation
#NOTE:
# 1) the roi is meant to restrict the segmentation procedure to a part of the time-course. This means that regions outside the provided roi will be automatically
# excluded from the segmentation result. This could be useful if, for example, there are bright structures near the target segmentation structure (supposedly the embryo cortex)
# which might interfere with the segmentation.
# 2) if use_roi_for_segmentation is set to True, the folder where the roi is stored must be indicated in roi_path
use_roi_for_segmentation = False
roi_path = r"/Users/ulivia/Desktop/Alessandro/projects/filopodia_paper/fig2_contact_filopodia/step_03/210709-GM-ACR005_roi/210709-GM-ACR005-7" #Ignore this line if use_roi_for_segmentation=False



In [10]:
#GET (EMBRYO CORTEX) SEGMENTATION MASK FOR ALL TIMEPOINTS - THE PRESENT CELL SHOULD NOT BE MODIFIED

#Get the roi mask for segmentation if provided
if use_roi_for_segmentation:
    embryo_roi_file_path = os.path.join(roi_path, listdirNHF(roi_path)[0])
    embryo_roi_mask = form_mask_from_roi(embryo_roi_file_path, ch1_timecourse[0,...])


#Use mask_embryo on ch1 (cyk1::gfp, cortex plane) to get the mask of the embryo.
if use_roi_for_segmentation:
    embryo_thresholded_timecourse = mask_embryo(ch1_timecourse,
                                                area_threshold4embryo=embryo_highpass_area_threshold,
                                                sigma_gaussian_smt=20, #Note that this is different from all the rest of the "segment_structure..." notebooks
                                                int_hist_bins=100,
                                                roi_mask=embryo_roi_mask,
                                                time_axis=time_axis,
                                                output_lowval=0,
                                                output_highval=255,
                                                output_dtype=np.uint8)
else:
    embryo_thresholded_timecourse = mask_embryo(ch1_timecourse,
                                                area_threshold4embryo=embryo_highpass_area_threshold,
                                                sigma_gaussian_smt=20, #Note that this is different from all the rest of the "segment_structure..." notebooks
                                                int_hist_bins=100,
                                                time_axis=time_axis,
                                                output_lowval=0,
                                                output_highval=255,
                                                output_dtype=np.uint8)

#Save the result with the correct data structure
embryo_thresholded_timecourse_saving_name = prefix_name + embryo_mask_sav_ing_name + ".ome.tif"




if time_axis==0:
    tifffile.imwrite(os.path.join(output_folder,embryo_thresholded_timecourse_saving_name), embryo_thresholded_timecourse, photometric='minisblack', metadata={"axes":'TYX'})
elif time_axis==1:
    tifffile.imwrite(os.path.join(output_folder,embryo_thresholded_timecourse_saving_name), embryo_thresholded_timecourse, photometric='minisblack', metadata={"axes":'YTX'})
elif time_axis==2:
    #time on the last axis: the array has dimension order YXT
    tifffile.imwrite(os.path.join(output_folder,embryo_thresholded_timecourse_saving_name), embryo_thresholded_timecourse, photometric='minisblack', metadata={"axes":'YXT'})
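
The axes label written in the metadata depends only on the position of the time dimension. The following sketch derives it from time_axis, assuming the two remaining dimensions are always Y followed by X; axes_for_time_axis is a hypothetical helper and is not part of the pipeline.

```python
def axes_for_time_axis(time_axis):
    """Return the axes label of a 3D stack whose time dimension is at time_axis.

    Hypothetical helper; assumes the two other dimensions are Y then X.
    """
    if time_axis not in (0, 1, 2):
        raise ValueError(f"time_axis must be 0, 1 or 2, got {time_axis}")
    axes = ["Y", "X"]
    axes.insert(time_axis, "T")  # place T at the position of the time dimension
    return "".join(axes)

for position in (0, 1, 2):
    print(position, axes_for_time_axis(position))
# 0 TYX
# 1 YTX
# 2 YXT
```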


SEGMENTATION OF CH1 - CYK1GFP ENRICHED STRUCTURES

Run the following 2 cells to obtain and save a binary segmentation of the channel 1 time-lapse based on hysteresis filtering.

When the pipeline was created, this segmentation separated the cyk1-enriched domains from the background using the signal from cyk1::gfp.

IMPORTANT NOTE: the present segmentation requires a binary mask restricting the processing to a ROI. This binary mask is loaded in the next cell in a variable called embryo_segmented_timecourse_1. The binary mask must have the same shape as the array which undergoes the present segmentation process. Positive pixels in the binary mask are interpreted as pixels of interest. The code is built so that the segmentation output of the section "SEGMENTATION OF EMBRYO CORTEX" is used as the binary mask. This means that either the "SEGMENTATION OF EMBRYO CORTEX" section has been run in the current session, or its output must be present in the output_folder. When the pipeline was written, the binary mask provided the shape of the embryo cortex during the time-lapse.
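
The shape requirement on the binary mask can be verified before starting the segmentation. A minimal sketch, assuming both arrays are NumPy arrays; check_mask_matches is a hypothetical helper and the arrays in the example are synthetic.

```python
import numpy as np

def check_mask_matches(stack, mask):
    """Return the boolean pixels of interest, or raise if the shapes differ.

    Hypothetical helper for illustration only, not part of the pipeline.
    """
    if stack.shape != mask.shape:
        raise ValueError(f"mask shape {mask.shape} differs from stack shape {stack.shape}")
    return mask > 0  # positive pixels are the pixels of interest

# Synthetic example: 3 timepoints, 4 x 5 pixels, a 2 x 3 region of interest
stack = np.zeros((3, 4, 5), dtype=np.uint16)
mask = np.zeros((3, 4, 5), dtype=np.uint8)
mask[:, 1:3, 1:4] = 255
pixels_of_interest = check_mask_matches(stack, mask)
print(pixels_of_interest.sum())  # 18
```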

To do the segmentation, the next two cells must be run. The first of the next two cells could be modified. The second of the next two cells should not be modified.





The segmentation follows the following steps:
I will name 2DR (i) the i-th 2D array of the array to process, along the iteration axis. I will name 2DM (i) the corresponding i-th 2D array of the binary mask array.
1) Each 2DR (i) is smoothed using a median filter. Output name MED(i).
2) The histogram distribution of intensity values is calculated (bins=100) for each MED(i). The histogram is calculated only on the pixels of MED(i) corresponding to positive pixels in 2DM (i). Output name HIS(i).
3) Per each HIS(i), an intensity value IV1(i) is calculated. IV1(i) corresponds to the intensity values for the percentile of HIS(i) indicated by variable ch1_top_percentile_hyst_filt.
4) Per each HIS(i), an intensity value IV2(i) is calculated. IV2(i) corresponds to the intensity values for the percentile of HIS(i) indicated by variable ch1_bot_percentile_hyst_filt.
5) Per each HIS(i), the mode MOD(i) and standard deviation STD(i) are calculated.
6) Per each HIS(i), an intensity value IV3(i) is calculated. IV3(i) corresponds to the intensity values of HIS(i) calculated using the formula: IV3(i)=MOD(i)+(N1*STD(i)). Where N1 is indicated by variable ch1_top_stdv_multfactor_hyst_filt.
7) Per each HIS(i), an intensity value IV4(i) is calculated. IV4(i) corresponds to the intensity values of HIS(i) calculated using the formula: IV4(i)=MOD(i)+(N2*STD(i)). Where N2 is indicated by variable ch1_bot_stdv_multfactor_hyst_filt.
8) An intensity value THRESHOLD1(i) is calculated for each MED(i). THRESHOLD1(i) corresponds to the larger of IV1(i) and IV3(i).
9) An intensity value THRESHOLD2(i) is calculated for each MED(i). THRESHOLD2(i) corresponds to the larger of IV2(i) and IV4(i).
10) Each MED(i) is binarized using a hysteresis-based method (ref https://scikit-image.org/docs/stable/api/skimage.filters.html#skimage.filters.apply_hysteresis_threshold). THRESHOLD1(i) and THRESHOLD2(i) are used, respectively, as high and low values for the hysteresis thresholding process. Output name FINAL_2D.
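
Steps 2 to 9 can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the implementation inside segment_cyk1_enriched_domains: hysteresis_thresholds is a hypothetical function, the percentiles and the standard deviation are computed directly on the masked pixel values, the mode is taken as the centre of the most populated histogram bin, and the example frame is synthetic.

```python
import numpy as np

def hysteresis_thresholds(med, mask, top_percentile=99, bot_percentile=98.7,
                          top_std_factor=5, bot_std_factor=4, bins=100):
    """Return (low, high) hysteresis thresholds for one median-filtered 2D frame.

    Hypothetical sketch of steps 2-9, not the code used by the pipeline.
    """
    values = med[mask > 0]                               # pixels inside the mask only
    iv1 = np.percentile(values, top_percentile)          # step 3
    iv2 = np.percentile(values, bot_percentile)          # step 4
    counts, edges = np.histogram(values, bins=bins)      # step 2
    mode_bin = int(np.argmax(counts))
    mode = (edges[mode_bin] + edges[mode_bin + 1]) / 2   # step 5 (bin centre, assumption)
    std = values.std()                                   # step 5
    iv3 = mode + top_std_factor * std                    # step 6
    iv4 = mode + bot_std_factor * std                    # step 7
    high = max(iv1, iv3)                                 # step 8: THRESHOLD1
    low = max(iv2, iv4)                                  # step 9: THRESHOLD2
    return low, high

# Synthetic frame: noisy background plus a small bright domain
rng = np.random.default_rng(0)
frame = rng.normal(100, 5, size=(64, 64))
frame[30:34, 30:34] += 80
embryo = np.full(frame.shape, 255, dtype=np.uint8)
low, high = hysteresis_thresholds(frame, embryo)
print(low <= high)  # True
```

The pair (low, high) is what skimage.filters.apply_hysteresis_threshold(image, low, high) expects for step 10.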


In [11]:
#DEFINE PARAMETERS FOR SEGMENTATION

# Parameters regulating hysteresis based segmentation - refer to get_hysteresis_based_segmentation in image_segmentation.py for documentation and to the description of the segmentation steps
# in the previous cell. In general, the higher the percentiles, the stricter the segmentation of ch1 structures.
ch1_top_percentile_hyst_filt = 99 #percentile to use for calculating the high value to be passed to hysteresis filtering - suggested 99
ch1_bot_percentile_hyst_filt = 98.7 #percentile to use for calculating the low value to be passed to hysteresis filtering - suggested 98.7
ch1_top_stdv_multfactor_hyst_filt = 5 #how many standard deviations to sum to the mode of the intensity distribution, for calculating the high value to be passed to hysteresis filtering - suggested 5
ch1_bot_stdv_multfactor_hyst_filt = 4 #how many standard deviations to sum to the mode of the intensity distribution, for calculating the low value to be passed to hysteresis filtering - suggested 4




#OPEN THE BINARY MASK SUPPORTING THE PRESENT SEGMENTATION (when the pipeline was created this was the output of the
# "SEGMENTATION OF EMBRYO CORTEX" section: the segmented embryo cortex)
#Don't modify the lines below.
#If the "SEGMENTATION OF EMBRYO CORTEX" section has been run, use the result
try:
    embryo_segmented_timecourse_1 = embryo_thresholded_timecourse

#If the "SEGMENTATION OF EMBRYO CORTEX" section has not been run, embryo_thresholded_timecourse is not defined: open the mask saved in the output folder
except NameError:
    #Get a list of files in the output folder
    list_output_files = listdirNHF(output_folder)
    #Select the file which contains the embryo segmentation saving name
    embryo_segmentation_file_name_1 = [f for f in list_output_files if embryo_mask_sav_ing_name in f][0]
    #Open the embryo segmentation timecourse
    embryo_segmented_timecourse_1 = tifffile.imread(os.path.join(output_folder, embryo_segmentation_file_name_1))



In [12]:
#Iterate segmentation on the timecourse - don't modify the present cell
ch1_segmented_timecourse = segment_cyk1_enriched_domains(ch1_timecourse,
                                                         embryo_segmented_timecourse_1,
                                                         hyst_filt_low_percentil_e=ch1_bot_percentile_hyst_filt,
                                                         hyst_filt_high_percentil_e=ch1_top_percentile_hyst_filt,
                                                         hyst_filt_low_stdfacto_r=ch1_bot_stdv_multfactor_hyst_filt,
                                                         hyst_filt_high_stdfacto_r=ch1_top_stdv_multfactor_hyst_filt,
                                                         output_low_va_l=0,
                                                         output_high_va_l=255,
                                                         output_d_typ_e=np.uint8,
                                                         iteration_axi_s=time_axis)

#Save the result with the correct data structure
ch1_thresholded_timecourse_saving_name = prefix_name + ch1_sav_ing_name + ".ome.tif"

if time_axis==0:
    tifffile.imwrite(os.path.join(output_folder,ch1_thresholded_timecourse_saving_name), ch1_segmented_timecourse, photometric='minisblack', metadata={"axes":'TYX'})
elif time_axis==1:
    tifffile.imwrite(os.path.join(output_folder,ch1_thresholded_timecourse_saving_name), ch1_segmented_timecourse, photometric='minisblack', metadata={"axes":'YTX'})
elif time_axis==2:
    #time on the last axis: the array has dimension order YXT
    tifffile.imwrite(os.path.join(output_folder,ch1_thresholded_timecourse_saving_name), ch1_segmented_timecourse, photometric='minisblack', metadata={"axes":'YXT'})


(89, 928, 684)
(89, 928, 684)
