
This repository contains CellProfiler pipelines, ImageJ macros and other files generated during my NEUBIAS Short Term Scientific Mission at the scientific platforms for BioSciences Screening and Advanced Light Microscopy, Instituto de Investigação e Inovação da Universidade do Porto (Porto, Portugal)


paucabar/stsm-i3s


Short Term Scientific Mission - Deployment of bioimage analysis pipelines for high content screening assays

Home Institution: ERI BIOTECMED - Estructura de Recerca Interdisciplinar en Biotecnologia i Biomedicina, Universitat de València (Valencia, Spain)
Host Institution: i3S - Instituto de Investigação e Inovação em Saúde, Universidade do Porto (Porto, Portugal)
Host Labs: Advanced Light Microscopy / BioSciences Screening
Research Advisors: Paula Sampaio / André Maia
Funding: COST Action CA15124 (NEUBIAS)

Description of the work carried out during the STSM

I brought an image dataset acquired with the IN Cell Analyzer 2000 high content microscope. The dataset is a cell proliferation and apoptosis assay. During the STSM I used this dataset to learn and practise a series of methods commonly applied in microscopy-based screening. First, we worked on the use of pixel classification to improve the segmentation of primary objects (e.g. nuclei) or secondary objects (e.g. cells) in CellProfiler. Since CellProfiler does not include a pixel classification module, we performed the classification in ilastik as a preliminary step and imported its output into a CellProfiler pipeline by setting up the Metadata and NamesAndTypes modules. As a first step of the pipeline, we worked on field-of-view quality control using machine learning in CellProfiler Analyst. This step was immediately followed by illumination correction, in our case based on both retrospective single-image and multi-image methods. After that, we worked on the segmentation settings to identify objects in raw images or in probability maps obtained by pixel classification. Then we worked on cell-level quality control with different approaches: plotting the data and looking for outliers and/or inspecting the segmentation output for poorly segmented objects. For segmentation visualization we used Fiji, merging the raw images with the outlines of the nuclear and cellular segmentation and displaying measurements over the objects (e.g. form factor, solidity…). The last CellProfiler steps consisted of exporting the data for further analysis or for training classifiers. For the latter, we also practised using CellProfiler Analyst to identify cellular phenotypes by means of machine learning methods. Finally, we worked with shinyHTM, an R-based web tool to interactively inspect, plot and visualise data and images generated by microscopy-based screening.

Description of the main results obtained

The pixel classification was performed in ilastik to facilitate the subsequent segmentation of nuclei and cells. To this end, three classifiers were trained: one for the nuclei, using the DAPI channel, and two for the cells, using the brightfield images and the cells’ autofluorescence, respectively. After that, a CellProfiler pipeline was configured to measure the image quality of the entire dataset. The resulting data were used to establish a series of quality control rules based on blurring and saturation metrics. Two pipelines were developed for the bioimage analysis of the dataset, differing only in the illumination correction method applied. As a first step, both pipelines apply the quality control rules in order to label poor-quality fields-of-view. Then, both pipelines apply a retrospective method for illumination correction, one of them based on single images and the other on groups of images. The multi-image method requires an additional pipeline, which was built to generate the reference images for the illumination correction. Both pipelines then offer two options for nuclei segmentation, one for the raw images and another for the probability maps. The segmentation of the cells is based on the probability maps obtained from the autofluorescence images. With the nuclei and cell masks available, the cytoplasm segmentation is straightforward to obtain. Using the shape descriptors of the nuclei, a series of cell-level quality control rules were established and applied in the pipelines to filter out poorly segmented objects. To facilitate the work, the pipelines generate images outlining the segmentation results and displaying different shape descriptors over the objects. A Fiji macro was developed to merge the raw data with the outlines in order to inspect the segmentation output and help define the quality control rules, which is particularly useful when combined with outlier information.
The data obtained through the final pipeline were used to classify the cells into different classes. The material generated during the STSM is available on GitHub (https://github.com/paucabar/stsm_i3s). More information about the files available in the repository can be found in the Popular Report.

Popular Report

Introduction

The data generated by cutting-edge light microscopes are growing in size and complexity, which has required advances in the way researchers analyse such information. Indeed, automated microscopes have enabled the collection of images on such a scale that visual inspection is unfeasible. To overcome this, specific image acquisition, image analysis and data analysis methods have been deployed for image-based screening experiments, which can be divided into two classes. On the one hand, high throughput screens are generally limited to a few measurements, the read-out of each being the average over a whole microplate well. On the other hand, high content screens use a larger number of measurements to identify different phenotypes in individual cells [1].

Since the main aim of the STSM was to become familiar with the common bioimage analysis methods used in microscopy-based screens, I worked on the development of several bioimage analysis workflows applying those methods. I generated an image dataset to practise on, consisting of a cell proliferation and apoptosis (programmed cell death) assay. Cell proliferation and apoptosis are routinely assessed parameters. The dataset was generated using methods widely used in fields such as cancer drug discovery [2, 3]: i) EdU (5-ethynyl-2′-deoxyuridine) pulse-chase to label the genomic DNA of cells undergoing S-phase and ii) caspase3 immunocytochemistry. Nevertheless, the aim was not to generate a bioimage analysis workflow specific to such assays, but to deploy a series of bioimage analysis tools capable of analysing a wide spectrum of image-based high content screens in order to identify different cellular phenotypes.

Results

Pixel-based classification with ilastik

Pixel classification is a machine learning method that generates probability maps, in which each pixel takes the value of its likelihood of belonging to each of a series of user-defined classes (e.g. nuclei, cytoplasm, mitochondria…). This may facilitate the subsequent segmentation of images that would otherwise be difficult to segment with a more classical, filter-based image processing approach. We used ilastik [4] to deploy a series of pixel classification workflows and combined this tool with CellProfiler, a bioimage analysis software developed with the specific aim of enabling the scoring of cellular phenotypes when combined with CellProfiler Analyst [5, 6]. When scoring cellular phenotypes, a common step is the segmentation of both nuclei and cells, which may be improved using pixel classification. Therefore, we trained a series of classifiers for this purpose using three different channels: i) DAPI for the nuclei and ii) brightfield or FITC (autofluorescence) for the cells. The autofluorescence seemed to perform better for the segmentation of the cells. The raw images used to train the classifiers are available on GitHub [7], in the ‘ilastik_training’ folder.
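To make the idea of a probability map concrete, the following is a minimal sketch in plain Python (lists instead of image arrays, to stay dependency-free). It assumes one probability map per class, as in a typical pixel classification export, and assigns each pixel to its most likely class. The function name, class names and values are hypothetical; the example is not taken from the repository.

```python
def most_likely_class(prob_maps):
    """Return, for each pixel, the name of the class with the highest probability.

    prob_maps maps a class name to a 2D list of probabilities; all maps
    are assumed to have the same shape.
    """
    names = list(prob_maps)
    first = prob_maps[names[0]]
    height, width = len(first), len(first[0])
    return [
        [max(names, key=lambda name: prob_maps[name][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Toy 2x2 maps for two hypothetical classes; real maps would come from ilastik.
prob_maps = {
    "background": [[0.9, 0.2], [0.1, 0.7]],
    "nucleus": [[0.1, 0.8], [0.9, 0.3]],
}
labels = most_likely_class(prob_maps)
print(labels)
```

In the workflow described here the maps are not collapsed this way: the per-class map (e.g. nuclei) is passed on to CellProfiler, where a threshold is applied during object identification.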

Field-of-view quality control

During the acquisition of an image dataset in a microscopy-based screen, human supervision of the whole process is not feasible. Therefore, the image quality of some fields-of-view may be degraded by inaccurate acquisition (e.g. autofocus errors) or sample artifacts. This may compromise the segmentation step, the linchpin of the bioimage analysis procedure, and corrupt the results. The metrics used to assess the quality of the images to be analysed fall into two main groups: metrics to detect blurring (e.g. out-of-focus fields-of-view) and metrics to detect saturated pixels (e.g. saturated debris) [8]. There are many blurring metrics, and each has its limitations. To get the best out of them, we used a machine learning classifier in CellProfiler Analyst. First, the ‘cell_proliferation_apoptosis_qc.cppipe’ [7] pipeline was used to measure several image quality metrics. Then, a classifier was trained using the Fast Gentle Boosting algorithm to distinguish between: i) good quality images, ii) blurred images and iii) images without cells. Thirty rules to classify images into those three groups were obtained and saved in ‘qc_rules.txt’ [7]. The saturation metrics are simpler, so a threshold on the percentage of saturated pixels was established simply by plotting the data and inspecting outliers. The rules are applied to label poor-quality images in both the ‘cell_proliferation_apoptosis_icAll.cppipe’ and ‘cell_proliferation_apoptosis_icEach.cppipe’ [7] pipelines, which differ only in the illumination correction method.
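The saturation rule can be illustrated with a short sketch. It assumes 8-bit images stored as nested lists and a hypothetical cut-off of 1% saturated pixels; the field names, the cut-off and the function names are illustrative only and are not the values used in the repository pipelines.

```python
def percent_saturated(image, max_value=255):
    """Percentage of pixels at (or above) the maximum grey level."""
    pixels = [p for row in image for p in row]
    saturated = sum(1 for p in pixels if p >= max_value)
    return 100.0 * saturated / len(pixels)


def flag_saturated_fields(images, max_percent=1.0, max_value=255):
    """Map each field-of-view name to True if it exceeds the saturation cut-off."""
    return {
        name: percent_saturated(image, max_value) > max_percent
        for name, image in images.items()
    }


# Two toy 2x2 fields of view (hypothetical names): one clean, one mostly saturated.
images = {
    "A01_fld1": [[10, 20], [30, 40]],
    "A01_fld2": [[255, 255], [12, 255]],
}
flags = flag_saturated_fields(images)
print(flags)
```

In practice the cut-off would be chosen as described above, by plotting the percentage for the whole dataset and inspecting the outliers.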

Illumination correction

The illumination of images acquired by light microscopy is usually heterogeneous and therefore needs to be corrected, both for better segmentation and for quantitative intensity measurements. There are several illumination correction methods, but the most widely used in high content analysis is the retrospective multi-image method [8], which consists of building an illumination correction function from the images of the experimental dataset (conversely, prospective methods are based on reference images acquired without a sample in the foreground). However, it can sometimes be useful to apply retrospective single-image methods, which calculate an illumination correction function for each image. The multi-image method is generally advisable, since it applies the same correction to all the images of the dataset (grouped by channel), whereas the single-image method may be useful to correct more complex illumination patterns or when working with non-high content datasets. For the multi-image method, we deployed the ‘cell_proliferation_apoptosis_icAll.cppipe’ [7] pipeline, which needs the illumination correction functions previously generated by the ‘illumination_correction.cppipe’ [7] pipeline. A series of generated functions are stored in the ‘illumination_images’ folder [7]. Conversely, the ‘cell_proliferation_apoptosis_icEach.cppipe’ [7] pipeline performs a single-image method and does not need any additional pipeline to perform the illumination correction.
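A simplified sketch of the retrospective multi-image idea follows. It assumes the illumination function is the per-pixel median over all images of one channel, rescaled to a mean of 1, and that correction is done by division. The smoothing step that is normally applied to the function is omitted for brevity, and the choice of median and mean rescaling is an assumption made for illustration, not necessarily the settings of ‘illumination_correction.cppipe’.

```python
from statistics import median


def illumination_function(images):
    """Per-pixel median across same-channel images, rescaled to a mean of 1."""
    height, width = len(images[0]), len(images[0][0])
    func = [
        [median(img[y][x] for img in images) for x in range(width)]
        for y in range(height)
    ]
    mean_value = sum(sum(row) for row in func) / (height * width)
    return [[value / mean_value for value in row] for row in func]


def correct_image(image, func):
    """Divide an image by the illumination function, pixel by pixel.

    Assumes every value of the function is positive.
    """
    return [
        [pixel / gain for pixel, gain in zip(img_row, func_row)]
        for img_row, func_row in zip(image, func)
    ]


# Toy 2x2 fields of view from one channel: the left column is lit twice as brightly.
images = [
    [[100, 50], [100, 50]],
    [[120, 60], [120, 60]],
    [[80, 40], [80, 40]],
]
func = illumination_function(images)
corrected = correct_image(images[0], func)
print(corrected)
```

After correction the first toy image becomes flat, since its left/right difference was due only to the shared illumination pattern. A single-image method would instead estimate the function from each image on its own.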

Segmentation

There are two options to identify primary objects (i.e. nuclei segmentation): i) use the raw images or ii) use the probability map. A manual threshold is set when segmenting the probability map, whereas the Otsu algorithm is applied when using the raw images. In our case, working with cells counterstained with DAPI, pixel classification is not actually needed, and it may even impair the performance of some watershed methods. The secondary objects (cells) are identified by applying a propagation method to the autofluorescence (FITC) probability map, again with a manual threshold. The nuclei and cells can then be used to obtain the tertiary objects (cytoplasm).
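For readers unfamiliar with Otsu thresholding, the sketch below shows the textbook two-class version: it picks the grey level that maximises the between-class variance of the intensity histogram. It assumes integer grey levels in a flat list and is not meant to reproduce CellProfiler's implementation or its options.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance.

    Pixels with a value greater than t are treated as foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg = 0
    sum_bg = 0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        weight_fg = total - weight_bg
        if weight_bg == 0:
            continue  # no background pixels yet
        if weight_fg == 0:
            break  # every pixel is already in the background class
        mean_bg = sum_bg / weight_bg
        mean_fg = (total_sum - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Toy intensities: four dim background pixels and four bright "nuclear" pixels.
pixels = [10, 12, 11, 13, 200, 210, 205, 198]
t = otsu_threshold(pixels)
nuclei_mask = [p > t for p in pixels]
print(t, sum(nuclei_mask))
```

On a probability map the same masking step applies, except that the threshold is set by hand (for example 0.5) instead of being computed from the histogram.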

Cell-level quality control

From sample preparation to image segmentation, errors at any step may lead to incorrectly segmented cells. To exclude these poor-quality objects, it is advisable to apply a cell-level quality control. In this pipeline we focused on detecting badly segmented cell pairs and clustered cells. In addition to plotting shape factors and looking for outliers, it is sometimes useful to inspect the segmentation output over the raw data. To do so, both analysis pipelines include modules that generate overlays of the segmented nuclei and cells and display per-object data, such as form factor or solidity measurements. To facilitate the visualization by merging the outlines and the data with the raw images, a Fiji [9] macro was scripted (Merge_outlines.ijm [7]). Instructions for the installation and usage of the macro are available in the Wiki page of [7], together with other information related to the tools developed during the STSM. A series of cell-level quality control rules were established and applied in a FilterObjects module, placed before the feature extraction and data export modules.
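As an illustration of shape-based filtering, the sketch below computes the form factor, defined as 4π·Area/Perimeter² (1 for a perfect circle), and keeps only objects that pass a form factor and a solidity cut-off. The object measurements and both cut-off values are hypothetical and do not correspond to the rules stored in the repository.

```python
import math


def form_factor(area, perimeter):
    """4*pi*area / perimeter^2: equals 1 for a circle, lower for irregular shapes."""
    return 4.0 * math.pi * area / perimeter ** 2


def passes_qc(obj, min_form_factor=0.8, min_solidity=0.9):
    """Keep an object only if it is both round enough and convex enough."""
    return (
        form_factor(obj["area"], obj["perimeter"]) >= min_form_factor
        and obj["solidity"] >= min_solidity
    )


# Hypothetical per-nucleus measurements: a round nucleus and a merged nuclear pair.
nuclei = [
    {"id": 1, "area": math.pi * 100, "perimeter": 2 * math.pi * 10, "solidity": 0.98},
    {"id": 2, "area": 600.0, "perimeter": 120.0, "solidity": 0.82},
]
kept_ids = [obj["id"] for obj in nuclei if passes_qc(obj)]
print(kept_ids)
```

Merged pairs and clusters tend to have long, concave outlines, which lowers both measurements; this is why inspecting these values over the raw images helps when choosing the cut-offs.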

Identification of cells undergoing S-phase and apoptotic cells using machine learning

A classifier model was trained using the Random Forest algorithm in CellProfiler Analyst. Training such a classifier also helps to identify the features that are useful for a specific assay, which makes it possible to avoid measuring superfluous features in future assays, reducing both the computation time and the data size.

References

  1. Boutros M., Heigwer F., Laufer C. (2015) Microscopy-Based High-Content Screening. Cell 163(6):1314-1325.
  2. Mandavilli B.S., Yan M., Clarke S. (2018) Cell-Based High Content Analysis of Cell Proliferation and Apoptosis. In: Johnston P., Trask O. (eds) High Content Screening. Methods in Molecular Biology, vol 1683. Humana Press, New York, NY.
  3. Carrillo-Barberà P., Morante-Redolat J.M., Pertusa J.F. (2019) Cell Proliferation High-Content Screening on Adherent Cell Cultures. In: Rebollo E., Bosch M. (eds) Computer Optimized Microscopy. Methods in Molecular Biology, vol 2040. Humana, New York, NY.
  4. Sommer C., Strähle C., Köthe U., Hamprecht F.A. (2011) ilastik: Interactive Learning and Segmentation Toolkit. In: Eighth IEEE International Symposium on Biomedical Imaging (ISBI). Proceedings, 230-233.
  5. McQuin C., Goodman A., Chernyshev V., Kamentsky L., Cimini B.A., Karhohs K.W., Doan M., Ding L., Rafelski S.M., Thirstrup D., Wiegraebe W., Singh S., Becker T., Caicedo J.C., Carpenter A.E. (2018) CellProfiler 3.0: Next-generation image processing for biology. PLoS Biol. 16(7):e2005970.
  6. Jones T.R., Carpenter A.E., Lamprecht M.R., Moffat J., Silver S., Grenier J., Root D., Golland P., Sabatini D.M. (2009) Scoring diverse cellular morphologies in image-based screens with iterative feedback and machine learning. PNAS 106(6):1826-1831.
  7. https://github.com/paucabar/stsm_i3s
  8. Caicedo J.C., Cooper S., Heigwer F., Warchal S., Qiu P., Molnar C., Vasilevich A.S., Barry J.D., Bansal H.S. et al (2017) Data-analysis strategies for image-based cell profiling. Nat Methods 14(9):849-863.
  9. Schindelin J., Arganda-Carreras I., Frise E., Kaynig V., Longair M., Pietzsch T., Preibisch S., Rueden C., Saalfeld S. et al (2012) Fiji: an open-source platform for biological-image analysis. Nat Methods 9(7):676-682.
