SLAW development is continuing in the zamboni-lab GitHub. Please report bugs there.

SLAW

SLAW is a scalable, containerized workflow for untargeted LC-MS processing. It was developed by Alexis Delabriere in the Zamboni Lab at ETH Zurich. An explanation of the advantages of SLAW and the motivation behind its development can be found in this blog post. In brief, the core advantages of SLAW are:

  • Complete processing including peak picking, sample alignment, grouping of isotopologues and adducts, gap-filling by data recursion, extraction of consolidated MS2 spectra and isotopic data.
  • Scalability: SLAW can process thousands of samples efficiently
  • Wrapping of three main peak picking algorithms: Centwave, FeatureFinderMetabo, ADAP
  • Automated parameter optimization for picking, alignment, gap-filling

If you want to use SLAW, please cite the following paper:

Delabriere A, Warmer P, Brennsteiner V and Zamboni N, SLAW: A scalable and self-optimizing processing workflow for untargeted LC-MS, 2021 (https://doi.org/10.1021/acs.analchem.1c02687)

This repository contains the current stable version (1.0.0).

The latest development version can be found on adelabriere/slaw:dev. It notably includes a fix for low memory/processor settings.

Installation

The source code provided here is meant for developers. For an average user, setting up an environment with R, python, mzMine, etc. is a cumbersome process. Instead, the recommended way to use SLAW is to pull the container from DockerHub, which comes preconfigured with all components and can be used as a black box:

docker pull adelabriere/slaw:latest

An equivalent container is available on SingularityHub for operating on an HPC cluster.

Running SLAW

Example data are provided in the test_data folder of this repository. These data have been heavily filtered to allow quick testing. An example input folder is given in test_data/mzML, an example parameters file (which SLAW generates if you run it on an empty folder) is given in test_data/parameters.txt, and an example of the complete output of SLAW without optimization is given in test_data/output. A zipped file containing the inputs, the only thing needed to run SLAW, can be downloaded.

Once this folder has been downloaded, extract it to the location of your choice (PATH_FOLDER) and create an empty folder anywhere to store the output (PATH_OUTPUT). The workflow can then be run by opening a terminal, or on Windows a PowerShell (NOT PowerShell ISE), and running:

docker run --rm -v PATH_FOLDER\mzML:/input -v PATH_OUTPUT:/output adelabriere/slaw:latest

If you specified the path correctly, you should see the following text:

2020-12-02|12:39:28|INFO: Total memory available: 7556 and 6 cores. The workflow will use 1257 Mb by core on 5 cores.
....
2020-12-02|12:39:31|INFO: Parameters file generated please check the parameters values/ranges and toggle optimization if needed.

A parameters.txt file, which stores all the parameters of the processing, has now been generated. The processing can then be started by rerunning the command line. SLAW takes the parameters.txt file as input and processes the data.

docker run --rm -v PATH_FOLDER\mzML:/input -v PATH_OUTPUT:/output adelabriere/slaw:latest
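The command above is written with a Windows-style backslash. On Linux or macOS the same call would use forward slashes, and Docker generally expects the host side of -v to be an absolute path. The following is a minimal sketch under those assumptions: slaw_command is a hypothetical helper that is not part of SLAW, and the paths in the example call are placeholders. It only prints the command, so it can be inspected before being run.

```shell
# Hypothetical helper (not part of SLAW): print the docker command for a
# data folder that contains mzML/ and for an output folder.
slaw_command() {
  folder=$1
  output=$2
  # Assumption: Linux/macOS paths. A host path given to -v that does not
  # start with / is not interpreted by Docker as a folder on the host.
  case $folder in
    /*) ;;
    *) echo "PATH_FOLDER must be an absolute path: $folder" >&2; return 1 ;;
  esac
  case $output in
    /*) ;;
    *) echo "PATH_OUTPUT must be an absolute path: $output" >&2; return 1 ;;
  esac
  # Same image and mount points as in the command above.
  printf 'docker run --rm -v %s/mzML:/input -v %s:/output adelabriere/slaw:latest\n' \
    "$folder" "$output"
}

# Example with placeholder paths: prints the command instead of running it.
slaw_command /data/slaw_test /data/slaw_output
```

Because the first run generates the parameters file and the second run processes the data, the printed command can be reused unchanged for both runs. Paths containing spaces would additionally need quoting when the command is pasted into a terminal.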

The optimization is switched off by default to avoid a long processing time. You can turn it on in the parameters.txt file by setting the optimization/need_optimization parameter to "True". If you choose to do so, the processing should take less than 1 hour to finish. If not, it should take less than 5 minutes. If the workflow finished correctly, you should see output similar to the following:

2020-12-02|12:39:37|INFO: Total memory available: 7553 and 6 cores. The workflow will use 1257 Mb by core on 5 cores.
2020-12-02|12:39:37|INFO: Guessing polarity from file:DDA1.mzML
2020-12-02|12:39:38|INFO: Polarity detected: positive
2020-12-02|12:39:39|INFO: STEP: initialisation TOTAL_TIME:2.05s LAST_STEP:2.05s
...
2020-12-02|12:41:04|INFO: STEP: gap-filling TOTAL_TIME:86.86s LAST_STEP:15.17s
2020-12-02|12:41:30|INFO: Annotation finished
2020-12-02|12:41:30|INFO: STEP: annotation TOTAL_TIME:112.74s LAST_STEP:25.87s
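The STEP lines in the log above carry the duration of each step (LAST_STEP) and the cumulative time (TOTAL_TIME). As a rough sketch, assuming the terminal output keeps exactly the layout shown above, those lines can be turned into a small table with awk. step_times is a hypothetical helper name, not something SLAW provides.

```shell
# Hypothetical helper: print "step<TAB>step seconds<TAB>total seconds" for
# every STEP line read from the given files, or from standard input.
step_times() {
  awk '$2 == "STEP:" {
    total = $4; sub(/^TOTAL_TIME:/, "", total); sub(/s$/, "", total)
    last = $5;  sub(/^LAST_STEP:/, "", last);   sub(/s$/, "", last)
    printf "%s\t%s\t%s\n", $3, last, total
  }' "$@"
}

# Example using three lines copied from the log above.
step_times <<'EOF'
2020-12-02|12:39:39|INFO: STEP: initialisation TOTAL_TIME:2.05s LAST_STEP:2.05s
2020-12-02|12:41:04|INFO: STEP: gap-filling TOTAL_TIME:86.86s LAST_STEP:15.17s
2020-12-02|12:41:30|INFO: Annotation finished
EOF
```

Lines without a STEP marker, such as "Annotation finished", are ignored. If the terminal output was saved to a file, the file name can be passed as an argument instead of using standard input.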

The outputs are generated in PATH_OUTPUT. The complete outputs are:

  • datamatrices: The complete table, with rows corresponding to features or ions and columns corresponding to samples. Three flavors of data matrices are generated.
  • fused_mgf: The consensus mgf spectra obtained, storing one ms-ms spectrum per feature in the data matrices.
  • OPENMS/CENTWAVE/ADAP: Stores the individual peak tables and ms-ms spectra for each sample.
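A quick way to confirm that a run produced these outputs is to look for them in PATH_OUTPUT. The sketch below assumes that the entries are named exactly as in the list above and that only one of OPENMS, CENTWAVE or ADAP is present, depending on the peak picker used. Both points are assumptions, and check_slaw_output is a hypothetical helper rather than part of SLAW.

```shell
# Hypothetical helper: report which expected outputs exist in a folder.
# Returns 0 when everything was found and 1 otherwise.
check_slaw_output() {
  out_dir=$1
  status=0
  for name in datamatrices fused_mgf; do
    if [ -e "$out_dir/$name" ]; then
      echo "found: $name"
    else
      echo "missing: $name"
      status=1
    fi
  done
  # Assumption: the per-sample results are stored under the name of the
  # peak picker that was used, so one match is enough.
  picker=""
  for name in OPENMS CENTWAVE ADAP; do
    if [ -e "$out_dir/$name" ]; then
      picker=$name
    fi
  done
  if [ -n "$picker" ]; then
    echo "found: $picker"
  else
    echo "missing: OPENMS, CENTWAVE or ADAP"
    status=1
  fi
  return $status
}
```

Once the function is defined, it can be called as check_slaw_output followed by the output folder chosen for PATH_OUTPUT.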

More information about the parameters and the workflow can be found in the wiki.