dpSWATH is a deep-learning-based tool for building high-quality libraries for SWATH-MS.
- Preprocess high-quality identified fragmentation and retention time data.
- Train models that predict retention time and mass spectra, combined with dpMC, and fine-tune them for specific local experiments.
- Build high-quality libraries for SWATH-MS analysis using the trained models.
- Both Windows and Linux platforms are supported.
- An NVIDIA Graphics Processing Unit (GPU) is highly recommended; Central Processing Unit (CPU) computation is also available but discouraged.
- NVIDIA CUDA 10.0+ and cuDNN 7.6.5+ are recommended.
- Keras with the TensorFlow backend. TensorFlow 2.0 was used for the development of dpSWATH.
dpSWATH was developed under Python 3.6.5 (Anaconda3 5.2.0 64-bit) with Keras and the tensorflow-gpu backend. The hardware used included an NVIDIA GeForce 1080Ti GPU, an i7-8086K CPU, and 128 GB of RAM.
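Before training, it can help to confirm that TensorFlow is importable and whether it sees a GPU. The following is a minimal sketch and not part of dpSWATH; the function name `describe_tensorflow` is hypothetical. It falls back to the `tf.config.experimental` namespace because older TensorFlow releases expose `list_physical_devices` there.

```python
import sys


def describe_tensorflow():
    """Return a small report on the Python/TensorFlow environment."""
    report = {
        "python": "%d.%d.%d" % sys.version_info[:3],
        "tensorflow": None,  # stays None if TensorFlow is not installed
        "gpus": [],
    }
    try:
        import tensorflow as tf
    except ImportError:
        return report

    report["tensorflow"] = tf.__version__
    # Newer releases provide tf.config.list_physical_devices; older ones
    # only provide it under tf.config.experimental.
    list_devices = getattr(tf.config, "list_physical_devices", None)
    if list_devices is None:
        list_devices = tf.config.experimental.list_physical_devices
    report["gpus"] = [device.name for device in list_devices("GPU")]
    return report


if __name__ == "__main__":
    for key, value in describe_tensorflow().items():
        print(key, "=", value)
```

An empty `gpus` list means TensorFlow would run on the CPU.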
1. Installation of Python (Anaconda is recommended)
- The Anaconda installer can be downloaded from the official Anaconda site.
- The official Python installer can be downloaded from the official Python site.
2. Installation of associated packages
- Install TensorFlow using `conda install` (recommended) or pip: `conda install -c conda-forge tensorflow` or `pip install --upgrade tensorflow`.
- Install TensorFlow with GPU support using `conda install` (recommended) or pip: `conda install -c anaconda tensorflow-gpu` or `pip install --upgrade tensorflow-gpu`.
- Install Keras using `conda install` (recommended) or pip: `conda install -c conda-forge keras` or `pip install keras`.
- Other associated packages are `os`, `re`, `datetime`, `Bio`, `pandas`, `numpy`, `random`, and `fnmatch`. Of these, `os`, `re`, `datetime`, `random`, and `fnmatch` are part of the Python standard library and need no installation; `Bio` (provided by the Biopython package), `pandas`, and `numpy` can be installed using `conda install` (recommended) or pip.
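To see which of the third-party packages are still missing from an environment, a short check such as the following may be useful. It is a sketch only; the helper name `missing_packages` and the `THIRD_PARTY` list are illustrative and not part of dpSWATH.

```python
import importlib.util

# Third-party import names mentioned in this README. The standard-library
# modules (os, re, datetime, random, fnmatch) need no installation.
THIRD_PARTY = ["tensorflow", "keras", "Bio", "pandas", "numpy"]


def missing_packages(names=THIRD_PARTY):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```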
- For training the dpRT model, retention time files from the following search software are supported:
  - SpectroMine/Spectronaut: the experimental library (.xls) file built by Pulsar in Spectronaut, or the search file from SpectroMine, is supported.
  - ProteinPilot: the identifications from the ".mzid" file generated by ProteinPilot are supported.
- Pretrained model for fine-tuning. Fine-tuning is available when training the dpRT model, and the pretrained models are provided in the models folder.
- Trained model for prediction of retention time. This file is needed once you have trained your dpRT model and are ready to build the dpSWATH library.
- For training the dpMS model, mass spectra files from the following search software are supported:
  - SpectroMine/Spectronaut: the experimental library (.xls) file built by Pulsar in Spectronaut, or the search file from SpectroMine, is supported.
  - ProteinPilot: the identifications from the ".mzid" file generated by ProteinPilot are supported.
- Pretrained model for fine-tuning. Fine-tuning is available when training the dpMS model, and the pretrained models are provided in the models folder.
- Trained model for prediction of mass spectra. This file is needed once you have trained your dpMS model and are ready to build the dpSWATH library.
Only high-quality data are used to train both dpRT and dpMS.
- For the datasets used for dpRT, only the retention times of high-confidence peptides are selected.
- For the datasets used for dpMS, dpMScore is applied to obtain consistent mass spectra for training dpMS.
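As an illustration of the retention time selection idea, the sketch below keeps only identifications above a confidence threshold and reports one retention time per peptide. Everything specific here is an assumption made for the example: the keys `peptide`, `rt`, and `confidence`, the 0.99 threshold, and the use of the median are hypothetical and do not describe dpSWATH's actual criteria.

```python
from collections import defaultdict
from statistics import median


def select_high_confidence_rt(rows, min_confidence=0.99):
    """Map each peptide to the median RT of its high-confidence identifications.

    `rows` is an iterable of dicts with the hypothetical keys
    "peptide", "rt" and "confidence".
    """
    kept = defaultdict(list)
    for row in rows:
        if float(row["confidence"]) >= min_confidence:
            kept[row["peptide"]].append(float(row["rt"]))
    return {peptide: median(values) for peptide, values in kept.items()}


rows = [
    {"peptide": "PEPTIDEK", "rt": 20.0, "confidence": 0.999},
    {"peptide": "PEPTIDEK", "rt": 22.0, "confidence": 0.995},
    {"peptide": "PEPTIDEK", "rt": 35.0, "confidence": 0.50},  # dropped
    {"peptide": "SAMPLER", "rt": 41.5, "confidence": 0.90},   # dropped
]
print(select_high_confidence_rt(rows))  # {'PEPTIDEK': 21.0}
```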
- Start dpSWATH by opening a command interpreter: `cmd.exe` on Windows or a shell on Linux.
- Run dpSWATH with Python: `python dpSWATH_main.py`.
- After entering the command, follow the prompt and enter `1` to select training models.
- Then select `1` to train dpRT.
- Next, set your working directory at the prompt; all your trained dpRT models will be stored there.
- Enter the absolute paths of your pretrained dpRT model and DDA library at the corresponding prompts.
- The trained dpRT models can be found under the folder `./working directory/dpSWATH/md/dpRT/XXX-XX-XX_XX_XX_XX_XXXXXX/`. Keep the best model, based on the validation loss, for building the library.
- Examples for training the dpRT model:
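Picking the best model by validation loss can be automated from the file names alone, because this README gives the format `dpRT_XXX_Y.YYYYY_Z.ZZZZZ.h5` (epoch, training loss, validation loss). The following is a hedged sketch, not part of dpSWATH: the function `best_model` is hypothetical, and the regular expression assumes the losses are written as plain decimal numbers.

```python
import os
import re

# Matches names such as dpRT_012_0.01234_0.02345.h5
# (epoch, training loss, validation loss); plain decimals are assumed.
MODEL_NAME = re.compile(r"^(dpRT|dpMS)_(\d+)_(\d+\.\d+)_(\d+\.\d+)\.h5$")


def best_model(run_dir):
    """Return the path of the model with the lowest validation loss, or None."""
    best_path, best_loss = None, float("inf")
    for name in sorted(os.listdir(run_dir)):
        match = MODEL_NAME.match(name)
        if match is None:
            continue  # skip files that do not follow the naming scheme
        val_loss = float(match.group(4))
        if val_loss < best_loss:
            best_path, best_loss = os.path.join(run_dir, name), val_loss
    return best_path
```

Calling `best_model` on a run folder returns the file to keep; the same pattern also matches dpMS model files.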
- Start dpSWATH by opening a command interpreter: `cmd.exe` on Windows or a shell on Linux.
- Run dpSWATH with Python: `python dpSWATH_main.py`.
- After entering the command, follow the prompt and enter `1` to select training models.
- Then select `2` to train dpMS.
- Next, set your working directory at the prompt; all your trained dpMS models will be stored there.
- Enter the absolute paths of your pretrained dpMS model and the DDA fragmentation file from ProteinPilot at the corresponding prompts.
- The trained dpMS models can be found under the folder `./working directory/dpSWATH/md/dpMS/XXX-XX-XX_XX_XX_XX_XXXXXX/`. Keep the best model, based on the validation loss, for building the library.
- Examples for training the dpMS model:
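Each training run writes its models into its own sub-folder, so after several runs it can be handy to locate the most recent one. The sketch below is an illustration only and not part of dpSWATH; `latest_run_dir` is a hypothetical helper, and it relies on file-system modification times rather than on parsing the folder name, whose exact format is not specified here.

```python
import os


def latest_run_dir(models_root):
    """Return the most recently modified sub-folder of `models_root`, or None."""
    candidates = [
        entry.path for entry in os.scandir(models_root) if entry.is_dir()
    ]
    return max(candidates, key=os.path.getmtime, default=None)
```

For example, `models_root` could be the `dpSWATH/md/dpMS` folder inside your working directory.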
- Start dpSWATH by opening a command interpreter: `cmd.exe` on Windows or a shell on Linux.
- Run dpSWATH with Python: `python dpSWATH_main.py`.
- After entering the command, follow the prompt and enter `2` to select build library.
- Next, set your working directory at the prompt; the built dpSWATH library will be stored there.
- Enter the absolute paths of your precursor file and of the directory containing the dpRT and dpMS models at the corresponding prompts.
- The result can be found at `./working directory/dpSWATH/Library/dpSWATH-Lib.txt`. Progress information is shown in a progress bar in the command window.
- Examples for building a library with dpSWATH:
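Since the build step asks for absolute paths interactively, a quick pre-flight check can catch a wrong path before the program starts. This is a sketch under stated assumptions, not dpSWATH's own validation: `check_library_inputs` is a hypothetical helper, and it assumes the dpRT and dpMS `.h5` files sit directly in one model directory.

```python
import fnmatch
import os


def check_library_inputs(precursor_file, model_dir):
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    if not os.path.isfile(precursor_file):
        problems.append("precursor file not found: " + precursor_file)
    if not os.path.isdir(model_dir):
        problems.append("model directory not found: " + model_dir)
        return problems
    names = os.listdir(model_dir)
    # Assumption: one model of each kind is stored directly in model_dir.
    for prefix in ("dpRT", "dpMS"):
        if not fnmatch.filter(names, prefix + "_*.h5"):
            problems.append("no %s model (.h5) in %s" % (prefix, model_dir))
    return problems
```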
- The model files. The model files are generated during training by selecting the train function. All model files follow the format `./dpRT_XXX_Y.YYYYY_Z.ZZZZZ.h5/` or `./dpMS_XXX_Y.YYYYY_Z.ZZZZZ.h5/`, where `XXX` denotes the epoch of the model, `Y.YYYYY` the training loss for that epoch, and `Z.ZZZZZ` the validation loss for that epoch.
  - For the training function, the model files can be found under the directory `./working directory/dpSWATH/md/dpRT/XXX-XX-XX_XX_XX_XX_XXXXXX/` or `./working directory/dpSWATH/md/dpMS/XXX-XX-XX_XX_XX_XX_XXXXXX/`.
- The precursor file. The precursor file used for building the dpSWATH library has two columns: 1) the peptides; 2) the precursor charge. We recommend preparing the precursor lists with dpMC (https://github.com/dpMC-sun/dpMC).
- The library files. Please refer to the Supplementary Note of our paper.
  - For the build library function, the digested file can be found in `./working directory/dpSWATH/Library/dpSWATH-Lib.txt`.
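For the two-column precursor file (peptide, precursor charge), the sketch below writes and reads such a list. It is illustrative only: the tab delimiter, the absence of a header row, and the helper names `write_precursors` and `read_precursors` are assumptions, so check them against a file produced by dpMC before relying on this layout.

```python
import csv


def write_precursors(path, precursors):
    """Write (peptide, charge) pairs as a two-column, tab-separated file."""
    rows = []
    for peptide, charge in precursors:
        charge = int(charge)
        if not peptide or charge < 1:
            raise ValueError("invalid precursor: %r, %r" % (peptide, charge))
        rows.append((peptide, charge))
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)


def read_precursors(path):
    """Read the file back as a list of (peptide, charge) tuples."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        return [(row[0], int(row[1])) for row in reader if row]
```

All rows are validated before the file is opened, so an invalid entry does not leave a partial file behind.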
Please report any issues with dpSWATH in the issues section, or send an email to dpSWATH.sun@gmail.com.
