Michele Bortolomeazzi edited this page Apr 20, 2023 · 16 revisions

Input

SIMPLI execution is controlled through configuration files in .csv tabular format, which can be edited with a spreadsheet editor (such as Excel) or a plain text editor. For examples, see the test dataset folder at: metadata

Sample Metadata

In the analysis performed by SIMPLI, each multiplexed image or ROI is associated with a unique sample identifier. This identifier can then be associated with a color or a category for visualising comparisons between samples.

Sample metadata file

This file provides the metadata for all the samples. Each ROI is considered a sample and is associated with a row containing the following required fields:

  • sample_name: Identifier to be used to refer to this sample (ROI) in the analysis.
  • color: Color used to represent this sample in plots. It can be a color name or a hexadecimal RGB or RGBA code ("#RRGGBB" or "#RRGGBBAA").
  • comparison column: Each sample is associated with a category name. To exclude a sample from the comparison, set its field to "NA". Pairwise comparisons will be made only if the column contains exactly two category names ("NA" excluded).
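
As a rough illustration of the comparison rule above, the following Python sketch parses a small, made-up sample metadata table and checks whether a pairwise comparison would be possible. The header row, the column name comparison, the sample names and the category names are all assumptions chosen for this example; they are not taken from the SIMPLI test dataset.

```python
import csv
import io

# Hypothetical sample metadata: the "comparison" column name and all values
# are illustrative assumptions, not SIMPLI defaults.
SAMPLE_METADATA = """sample_name,color,comparison
ROI_1,#1B9E77,tumour
ROI_2,#D95F02,normal
ROI_3,grey,NA
"""


def comparison_categories(csv_text, column):
    """Return the sorted category names found in `column`, ignoring "NA"."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sorted({row[column] for row in reader if row[column] != "NA"})


categories = comparison_categories(SAMPLE_METADATA, "comparison")
# A pairwise comparison needs two category names once "NA" is excluded.
pairwise_possible = len(categories) == 2
print(categories, pairwise_possible)
```

Here ROI_3 is excluded from the comparison because its field is "NA", which leaves two categories.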

Image input

SIMPLI works on highly multiplexed imaging data and can process input files in the following formats:

  • Single-channel TIFF files (several files per sample).
  • Multi-channel OME-TIFF files (one file per sample).
  • TXT acquisition files from Imaging Mass Cytometry experiments (1 file = 1 ROI per sample).
  • MCD acquisition files from Imaging Mass Cytometry experiments (1 ROI per sample).

The input metadata files required for the analysis depend on the type of starting data:

  • Single- or multi-channel tiff images
  • Imaging Mass Cytometry acquisition data

Metadata file for single- or multi-channel TIFF images

If the input consists of single- or multi-channel TIFF images, then a single metadata file is used to associate each input image with the corresponding sample.

TIFF image metadata file

  • sample_name: Identifier to be used to refer to this sample (ROI) in the analysis.
  • marker: Marker associated with the channel.
  • label: Label used to name the channel in the analysis.
  • file_name: Path to the input file.
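
A minimal sketch of how such a file could be assembled with Python's csv module is shown below. It assumes one row per channel for single-channel TIFF input and a header row made of the four field names; the sample name, markers, labels and paths are hypothetical.

```python
import csv
import io

# The four fields listed above, assumed here to be the header row.
FIELDS = ["sample_name", "marker", "label", "file_name"]


def tiff_metadata_csv(rows):
    """Serialise a list of row dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical rows: two single-channel images belonging to the same sample.
rows = [
    {"sample_name": "ROI_1", "marker": "CD45", "label": "CD45",
     "file_name": "/data/images/ROI_1-CD45.tiff"},
    {"sample_name": "ROI_1", "marker": "DNA1", "label": "DNA",
     "file_name": "/data/images/ROI_1-DNA.tiff"},
]

csv_text = tiff_metadata_csv(rows)
print(csv_text)
```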

Metadata files for Imaging Mass Cytometry acquisition data

If the input consists of Imaging Mass Cytometry acquisition data, then two files are required:

  • Channel metadata file
  • Raw metadata file

Channel Metadata file

This file provides the metadata for all the channels that need to be extracted from the raw data. These channels must be present in all the samples included in the analysis. Each channel is associated with a row containing two required fields:

  • channel_marker: Metal associated with the channel. It must match the metal name used in the raw acquisition data.
  • channel_label: Label used to name the channel in the analysis.

Raw IMC Metadata file

This file provides the metadata for all the images that need to be extracted from the raw imaging mass cytometry data (.mcd or .txt format). Each ROI is associated with a row containing the following required fields:

  • sample_name: Identifier to be used to refer to this sample (ROI) in the analysis.
  • roi_name: Name of the ROI in the MCD file. This field can be left blank if the input is in .txt format.
  • file_name: Path to the MCD or TXT input file.
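
The following Python sketch shows one way to sanity-check a raw IMC metadata table before launching an analysis. It assumes that roi_name is needed whenever file_name points to an .mcd file (the text above only states that it may be left blank for .txt input) and that the file has a header row with the three field names; the helper name and all values are hypothetical.

```python
import csv
import io

# Hypothetical raw IMC metadata: sample names, ROI names and paths are made up.
RAW_METADATA = """sample_name,roi_name,file_name
ROI_1,ROI_001,/data/raw/slide_A.mcd
ROI_2,,/data/raw/ROI_2.txt
ROI_3,,/data/raw/slide_B.mcd
"""


def samples_missing_roi_name(csv_text):
    """Return the sample names whose .mcd input has an empty roi_name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["sample_name"]
        for row in reader
        if row["file_name"].lower().endswith(".mcd")
        and not row["roi_name"].strip()
    ]


missing = samples_missing_roi_name(RAW_METADATA)
print(missing)
```

In this example ROI_2 is accepted with a blank roi_name because its input is a .txt file, while ROI_3 is flagged.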

Additional note on Metadata files format

Input file locations can be specified as absolute paths or as URIs (e.g. file://, http://, s3://).

The Singularity containers used by SIMPLI mount the directory from which SIMPLI is launched as /data. As a result, when a user provides custom metadata files, the file paths specified in the file_name fields should be rewritten as follows:

/dirA/dirB/.../launch_dir/dirC/.../file -> /data/dirC/.../file

where launch_dir stands for the SIMPLI launching directory.
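
The path rewriting described above can be sketched in Python as follows. The function name to_container_path and the example paths are hypothetical; the only behaviour taken from the text is that the launch directory is mounted as /data.

```python
from pathlib import PurePosixPath


def to_container_path(host_path, launch_dir):
    """Rewrite a host path under `launch_dir` so that it starts with /data.

    Raises ValueError if `host_path` is not located inside `launch_dir`.
    """
    relative = PurePosixPath(host_path).relative_to(PurePosixPath(launch_dir))
    return str(PurePosixPath("/data") / relative)


# Hypothetical example paths.
container_path = to_container_path(
    "/home/user/project/images/ROI_1.tiff", "/home/user/project"
)
print(container_path)
```

Paths outside the launch directory raise a ValueError in this sketch, since they would not be visible under /data.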