A Standardized Dataset-Based Algorithm for Efficient Variation Reduction in Large-Scale DIA MS Data.


STAVER: A Standardized Dataset-Based Algorithm for Efficient Variation Reduction


Introduction

STAVER is a Python library that implements a standardized dataset-based algorithm for reducing variation in large-scale data-independent acquisition (DIA) mass spectrometry (MS) data. By using a reference dataset to standardize mass spectrometry signals, STAVER reduces noise and improves protein quantification accuracy, especially in the context of hybrid spectral library searches. Its effectiveness has been demonstrated on multiple large-scale DIA datasets from different platforms and laboratories, where it improved the precision and reproducibility of protein quantification. With its modular design, STAVER is compatible with existing DIA-MS data analysis pipelines. The project aims to eliminate non-biological noise and variability in large-scale DIA-MS studies, improving the quality and reliability of DIA proteomics data through an open-source software package. The research workflow and the STAVER algorithm architecture are summarized in the following figure:

[Figure: overview of the research workflow and the STAVER algorithm architecture]

Installation

You can install the staver package from PyPI by running the following command:

pip install staver

You can also install from source by cloning the STAVER repository, navigating to its root directory, and running either pip install . or pip install -e . (editable mode):

# clone the source repo and enter its root directory
git clone https://github.com/Ran485/STAVER.git
cd STAVER

# install the package
pip install .

# or install it in editable mode
pip install -e .

You can also install the additional dependencies listed in the requirements files:

pip install -r requirements_dev.txt
pip install -r requirements.txt

Installing within a conda environment is recommended.
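
After installing, it can be useful to confirm that the package is visible in the active environment. The snippet below is a minimal sketch using only the Python standard library; it assumes the distribution is registered under the name staver (the name used in the pip command above), and the helper name installed_version is made up for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "staver") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Assumption: the distribution name matches the name used with pip ("staver").
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("staver is not installed in this environment")
    else:
        print(f"staver {version} is installed")
```

If the function returns None inside a conda environment, the most likely cause is that pip was run from a different environment than the one currently active.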

Getting Started

To get started with STAVER, see the guided installation walkthrough here. For example code and an introduction to the library, please refer to the detailed descriptions in the tutorials. The following block is a quick-start guide to running the STAVER workflow from the command-line interface (CLI).

python ./staver_pipeline.py \
        --thread_numbers < The CPU worker numbers, Default to [nmax-2] > \
        --input < The DIA data input directory > \
        --output_peptide < The processed DIA peptide data output directory > \
        --output_protein < The processed DIA protein data output directory > \
        --fdr_threshold < Default to 0.01 > \
        --count_cutoff_same_libs < Default to 2 > \
        --count_cutoff_diff_libs < Default to 1 > \
        --peptides_cv_thresh < Default to 0.3 > \
        --na_threshold < Default to 0.3 > \
        --top_precursor_ions < Default to 6 > \
        --file_suffix < Default to "_F1_R1" >
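
When the pipeline is launched from a Python script or a workflow manager, the same command can be assembled programmatically. The following is a sketch rather than part of STAVER itself: the function names build_staver_command and default_thread_numbers are hypothetical, the flag names and default values are copied from the command above, and reading "nmax" as the number of logical CPUs is an assumption.

```python
import os
import shlex
import sys
from typing import List, Optional


def default_thread_numbers() -> int:
    # Assumption: "nmax" in the CLI help means the number of logical CPUs.
    return max(1, (os.cpu_count() or 1) - 2)


def build_staver_command(
    input_dir: str,
    output_peptide: str,
    output_protein: str,
    script: str = "./staver_pipeline.py",
    thread_numbers: Optional[int] = None,
    fdr_threshold: float = 0.01,
    count_cutoff_same_libs: int = 2,
    count_cutoff_diff_libs: int = 1,
    peptides_cv_thresh: float = 0.3,
    na_threshold: float = 0.3,
    top_precursor_ions: int = 6,
    file_suffix: str = "_F1_R1",
) -> List[str]:
    """Build the argument list for staver_pipeline.py (hypothetical helper)."""
    if thread_numbers is None:
        thread_numbers = default_thread_numbers()
    # Flag names and defaults mirror the quick-start command in this README.
    options = {
        "thread_numbers": thread_numbers,
        "input": input_dir,
        "output_peptide": output_peptide,
        "output_protein": output_protein,
        "fdr_threshold": fdr_threshold,
        "count_cutoff_same_libs": count_cutoff_same_libs,
        "count_cutoff_diff_libs": count_cutoff_diff_libs,
        "peptides_cv_thresh": peptides_cv_thresh,
        "na_threshold": na_threshold,
        "top_precursor_ions": top_precursor_ions,
        "file_suffix": file_suffix,
    }
    command = [sys.executable, script]
    for flag, value in options.items():
        command += [f"--{flag}", str(value)]
    return command


if __name__ == "__main__":
    # Placeholder directories; replace them with real paths before running.
    cmd = build_staver_command("data/dia_input", "results/peptide", "results/protein")
    print(shlex.join(cmd))
```

The function only builds the argument list, so it can be inspected or logged before anything runs. To execute it, pass the list to subprocess.run(cmd, check=True) from the directory that contains staver_pipeline.py.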

Documentation

For a full description of STAVER's functionality and the parameters available in the software, we highly recommend reading the STAVER documentation. It provides a step-by-step guide with detailed instructions, and each feature is illustrated with practical examples and concise explanations.
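
As background for the parameter reference, the sketch below illustrates what a coefficient-of-variation (CV) cutoff and a missing-value cutoff conventionally mean, using the default values 0.3 and 0.3 from the quick-start command. It is not STAVER's implementation: the function names are hypothetical, and the assumption that --peptides_cv_thresh bounds the CV of a peptide's intensities while --na_threshold bounds the fraction of missing values should be checked against the STAVER documentation.

```python
import math
import statistics
from typing import List, Optional

Intensities = List[Optional[float]]


def is_missing(value: Optional[float]) -> bool:
    # Treat both None and NaN as missing measurements.
    return value is None or math.isnan(value)


def missing_fraction(values: Intensities) -> float:
    """Fraction of missing measurements; an empty list counts as fully missing."""
    if not values:
        return 1.0
    return sum(1 for v in values if is_missing(v)) / len(values)


def coefficient_of_variation(values: Intensities) -> Optional[float]:
    """CV = sample standard deviation / mean over the observed values."""
    observed = [v for v in values if not is_missing(v)]
    if len(observed) < 2:
        return None  # the sample standard deviation needs two or more values
    mean = statistics.mean(observed)
    if mean == 0:
        return None  # CV is undefined for a zero mean
    return statistics.stdev(observed) / mean


def passes_cutoffs(
    values: Intensities, cv_thresh: float = 0.3, na_thresh: float = 0.3
) -> bool:
    """Toy filter: keep a peptide if it is neither too sparse nor too variable."""
    if missing_fraction(values) > na_thresh:
        return False
    cv = coefficient_of_variation(values)
    return cv is not None and cv <= cv_thresh


if __name__ == "__main__":
    # Made-up intensities for three peptides across four runs.
    peptides = {
        "stable": [100.0, 105.0, 95.0, 102.0],
        "noisy": [10.0, 100.0, 50.0, 200.0],
        "sparse": [100.0, None, None, 101.0],
    }
    for name, values in peptides.items():
        print(name, passes_cutoffs(values))
```

The sample standard deviation (statistics.stdev) is used here; a population standard deviation would give slightly smaller CV values, which is one more detail to confirm in the documentation.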

How to Contribute

We welcome contributions from the open-source community to improve the library!

To add a new method or feature to the library, please follow the template and steps described in the contribution guidelines.

Contact Us

If you have any questions, comments or suggestions, please do not hesitate to contact us at 21112030023@m.fudan.edu.cn

License

The STAVER project is licensed under the MIT License, which grants users open access and the freedom to use, adapt, and share the software as needed, provided the original copyright and license notices are preserved.
