
BodenmillerGroup/ImcSegmentationSnakemake


A flexible image segmentation pipeline for heterogeneous multiplexed tissue images, based on pixel classification and implemented in Snakemake

Deprecation note: This repository is not actively maintained. For a dockerized version of the IMC segmentation pipeline, please refer to steinbock.


The pipeline is based on CellProfiler (http://cellprofiler.org/) for segmentation and Ilastik (http://ilastik.org/) for pixel classification. It is streamlined by the specially developed imctools Python package (https://github.com/BodenmillerGroup/imctools) as well as custom CellProfiler modules (https://github.com/BodenmillerGroup/ImcPluginsCP).

This pipeline was developed in the Bodenmiller laboratory of the University of Zurich (http://www.bodenmillerlab.org/) to segment hundreds of highly multiplexed imaging mass cytometry (IMC) images. However, it has also already been successfully applied to other multiplexed imaging modalities.

The PDF at 'Documentation/201709_imctools_guide.pdf' describes the conceptual basis. While still conceptually valid, the installation procedures it describes are outdated.

Usage

If you use this workflow in a paper, don't forget to give credit to the authors by citing the URL of this (original) repository and, if available, its DOI (see above).

Step 0: Install system requirements

To run the pipeline, the following software needs to be installed:

Make sure these software packages work.

Step 1: Obtain a copy of this workflow

  1. Create a new github repository using this workflow as a template.
  2. Clone the newly created repository to your local system, into the place where you want to perform the data analysis.
  3. Initialize the ImcPluginsCP submodule:
git submodule update --init --recursive

Step 2: Install Snakemake

Install Snakemake using conda:

conda create -c bioconda -c conda-forge -n snakemake snakemake

For installation details, see the instructions in the Snakemake documentation.

Step 3: Configure workflow

Configure the workflow according to your needs via editing the files in the config/ folder. Adjust config.yaml to configure the workflow execution.

The schema at workflow/schemas/config_pipeline.schema.yml explains all the options.
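As a minimal sketch of what such a configuration entry looks like — only fn_cell_classifier is referenced elsewhere in this README, the value shown is a hypothetical path, and the authoritative set of keys is defined by the schema:

```yaml
# config/config.yaml (sketch — consult workflow/schemas/config_pipeline.schema.yml
# for the real option names and types)
fn_cell_classifier: "classifiers/cell_trained.ilp"   # hypothetical path to the trained Ilastik classifier
```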

To enable the 'compensation' workflow according to Chevrier et al. 2018, you need to configure either:

Step 4: Execute workflow

Activate the conda environment:

conda activate snakemake

Test your configuration by performing a dry-run via

snakemake --use-conda -n --use-singularity

Execute the workflow locally via

snakemake --use-conda --cores $N --use-singularity

using $N cores or run it in a cluster environment via

snakemake --use-conda --cluster qsub --jobs 100 --use-singularity

or

snakemake --use-conda --drmaa --jobs 100 --use-singularity

The CellProfiler output will be in results/cpout. All other folders should be considered temporary output.

See the section 'UZH slurm cluster' for more details on how to run this workflow on the cluster of the University of Zurich.

Optional:

Step: Download the example data

snakemake download_example_data --use-singularity --cores 32

Step: Run the pipeline up to the Ilastik classifier

snakemake get_untrained_ilastik --use-singularity --cores 32

This will generate random crops for training the Ilastik cell pixel classifier in results/ilastik_training_data and produce an untrained classifier at untrained.ilp.

Open the classifier in Ilastik and save the trained classifier under the filename specified as fn_cell_classifier in the configuration file.

Step: Open the CellProfiler pipeline in the GUI

To open the CellProfiler GUI at any step, run:

snakemake results/cp_{batchname}_open.sh --use-singularity --cores 32

replacing {batchname} with the CellProfiler step name you want to inspect. E.g., running

snakemake results/cp_segmasks_open.sh --use-singularity --cores 32

will open the segmentation step.

This will generate a script results/cp_segmasks_open.sh that opens CellProfiler with all paths, plugins and the pipeline set as they would be when running the Snakemake workflow.

Note: this requires the cellprofiler command to be installed and working.
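The target name is assembled from the step name, so the same pattern works for any batch. A small sketch — only 'segmasks' is a step name confirmed by this README; other names depend on the workflow's CellProfiler rules:

```shell
#!/bin/sh
# Build the per-step "open" script target from a batch step name.
batchname=segmasks
target="results/cp_${batchname}_open.sh"
echo "$target"   # → results/cp_segmasks_open.sh
# snakemake "$target" --use-singularity --cores 32   # generate the script
# sh "$target"                                       # then launch the CellProfiler GUI
```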

Step: How to run this workflow on a slurm cluster (UZH science cluster)

First retrieve the GitHub repository and install the conda environment as described above.

Make a default configuration

Generate the following file at ~/.config/snakemake/cluster_config.yml:

__default__:
  time: "00:15:00"

This defines the default batch run parameters.
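Per-rule resources can be added alongside __default__ using the standard Snakemake cluster-config pattern; the rule name below is a hypothetical example — substitute real rule names from the workflow/ directory:

```yaml
# ~/.config/snakemake/cluster_config.yml
__default__:
  time: "00:15:00"

# Hypothetical override: give a long-running rule more wall time.
# Replace 'cell_segmentation' with an actual rule name from the workflow.
cell_segmentation:
  time: "04:00:00"
```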

Install the slurm cluster profile

Follow the instructions from:
https://github.com/Snakemake-Profiles/slurm

Use the following settings:
profile_name: slurm
sbatch_defaults:
cluster_config: ../cluster_config.yml
advanced_argument_conversion: 1 (untested, but might be worth a try)

To run the pipeline, the following modules are required and need to be loaded in this order:

module load generic
module load anaconda3 
module load singularity
conda activate snakemake_imc

To run the snakemake command on the cluster, the following flags are needed:

  • --profile slurm flag to specify the profile
  • --use-singularity to use singularity
  • --singularity-args "\-u" to use non-privileged singularity mode
  • --jobs # to allow at most # concurrent jobs submitted (e.g. --jobs 50)

After the example data has been downloaded (see above), the following command runs the full pipeline:

snakemake --profile slurm --use-singularity --singularity-args "\-u" --jobs 50

About

The repository for the Snakemake implementation of the ImcSegmentationPipeline
