GitHub - anne04/PointIso: This is the implementation of PointIso model presented in the paper "Deep neural network for detecting arbitrary precision peptide features through attention based segmentation" (https://www.nature.com/articles/s41598-021-97669-7 and https://arxiv.org/abs/2009.07250). For any inquiry about running the codes, contact me (Fatema): fzohora@uwaterloo.ca

This research project implements PointIso, a deep neural network based model for peptide feature detection from 3D and 4D LC-MS maps. The paper was published in Scientific Reports, 2021: https://www.nature.com/articles/s41598-021-97669-7. For any further queries, please contact me at: fzohora@uwaterloo.ca

This PointIso model is trained on a dataset generated by an LTQ Orbitrap XL ETD. The model should work on any test dataset generated by a similar instrument.

The *.raw files are downloaded from the ProteomeXchange database, and ProteoWizard 3.0.1817 is used to convert each raw file into .ms1 format. Trained models can be found in the zipped folders; unzip them into a directory of your choice. Also download all the Python scripts into one directory. Open a Linux (Ubuntu) terminal and change to the directory where you saved the Python scripts. Then run the scripts as explained below.

The syntax for running the Python scripts from a Linux (Ubuntu) terminal is provided below, in the order of execution:

read_pointCloud.py: This script reads the *.ms1 file named 'sample_name' from location 'filepath' and converts the LC-MS map in that file to a hash table holding the datapoint triplets (RT, m/z, I). The hash table is saved at location 'topath'. The syntax and an example command to run in the terminal are provided below:
$ nohup python -u read_pointCloud.py [filepath] [topath] [sample_name] > output.log &
Example:
$ nohup python -u read_pointCloud.py /data/anne/dilution_series_syn_pep/ /data/anne/dilution_series_syn_pep/hash_record/ 130124_dilA_1_01 > output.log &
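
To illustrate the idea of turning an .ms1 file into a table of (RT, m/z, I) triplets, here is a minimal sketch. It is not the code in read_pointCloud.py: the function name parse_ms1, the choice to key the table by retention time, and the tiny inline sample are all assumptions made for illustration. It relies only on the usual MS1 text layout, where 'S' lines start a scan, 'I RTime' lines carry the retention time, and the remaining lines are "m/z intensity" pairs.

```python
from collections import defaultdict


def parse_ms1(lines):
    """Collect peaks from MS1-format text lines into {RT: [(m/z, I), ...]}.

    Hypothetical helper for illustration; the real script may organise
    and serialise its hash table differently.
    """
    table = defaultdict(list)
    rt = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        tag = fields[0]
        if tag == "S":
            # A new scan begins; its retention time follows on an I line.
            rt = None
        elif tag == "I":
            if len(fields) >= 3 and fields[1] == "RTime":
                rt = float(fields[2])
        elif tag in ("H", "Z", "D"):
            # Header and other annotation lines carry no peaks.
            continue
        elif rt is not None and len(fields) >= 2:
            # Peak line: m/z followed by intensity.
            table[rt].append((float(fields[0]), float(fields[1])))
    return dict(table)


# Tiny made-up sample, not real instrument data.
sample = [
    "H\tExtractor\tdemo",
    "S\t1\t1",
    "I\tRTime\t0.5",
    "400.10 1500.0",
    "400.60 700.0",
    "S\t2\t2",
    "I\tRTime\t0.6",
    "401.10 300.0",
]
record = parse_ms1(sample)
```

Keying by retention time makes it cheap to fetch all peaks of one scan, which is one plausible reason to store the map as a hash table.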

isoDetecting_scan_MS1_pointNet.py: This script loads the LC-MS map of sample 'sample_name' saved at location 'recordpath' and scans it using the trained model saved at location 'modelpath'. The scanned result is saved at location 'scanpath'. The GPU index must be given as a parameter. Several instances of this script should be run in parallel, each scanning a different segment of the LC-MS map. For that purpose there is another parameter, 'start_mz': the IsoDetecting module starts scanning at that m/z value and covers the next 200 m/z. If the LC-MS map ranges from 400 to 2000 m/z, then the 'start_mz' values of the 8 parallel scripts should be: 400, 600, 800, 1000, 1200, 1400, 1600, 1800. The syntax and an example command to run in the terminal are provided below:
$ nohup python -u isoDetecting_scan_MS1_pointNet.py [recordpath] [sample_name] [modelpath] [gpu_index] [start_mz] [scanpath] > output.log &
Example:
$ nohup python -u isoDetecting_scan_MS1_pointNet.py /data/anne/dilution_series_syn_pep/hash_record/ 130124_dilA_1_01 /data/anne/pointIso/3D_model/ 0 400 /data/anne/dilution_series_syn_pep/scanned_result/ > output.log &
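
The parallel launch can be prepared programmatically. The sketch below tiles an m/z range with consecutive 200 m/z windows and builds one argument list per window, following the parameter order of the syntax line above. The helper name scan_commands and its defaults (range 400 to 2000, a single GPU index for every instance) are assumptions for illustration; the paths are the ones from the example.

```python
def scan_commands(recordpath, sample_name, modelpath, scanpath,
                  mz_min=400, mz_max=2000, window=200, gpu_index=0):
    """Build one isoDetecting command per m/z window (hypothetical helper)."""
    commands = []
    for start_mz in range(mz_min, mz_max, window):
        commands.append([
            "python", "-u", "isoDetecting_scan_MS1_pointNet.py",
            recordpath, sample_name, modelpath,
            str(gpu_index), str(start_mz), scanpath,
        ])
    return commands


cmds = scan_commands(
    "/data/anne/dilution_series_syn_pep/hash_record/",
    "130124_dilA_1_01",
    "/data/anne/pointIso/3D_model/",
    "/data/anne/dilution_series_syn_pep/scanned_result/",
)
# start_mz is the eighth element of each argument list.
start_values = [int(cmd[7]) for cmd in cmds]
```

Each argument list can be handed to subprocess.Popen. Giving every instance its own log file, rather than a shared output.log, keeps the output of the parallel runs apart.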

makeCluster.py: This script clusters equidistant isotopes of the same charge together, combining the scanned results generated by the multiple parallel scripts mentioned above. The cluster list is saved at location 'scanpath'. The syntax and an example command to run in the terminal are provided below:
$ nohup python -u makeCluster.py [recordpath] [filename] [scanpath] > output.log &
Example:
$ nohup python -u makeCluster.py /data/anne/dilution_series_syn_pep/hash_record/ 130124_dilA_1_01 /data/anne/dilution_series_syn_pep/scanned_result/ > output.log &
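
"Equidistant isotopes of the same charge" refers to the fact that the isotopic peaks of a peptide with charge z are spaced roughly 1.00335/z apart on the m/z axis. The following sketch shows that spacing rule on a plain list of m/z values. It is only an illustration of the concept, not the algorithm in makeCluster.py, which works on the scan output of the model; the function name chain_isotopes, the tolerance, and the peak values are made up for the example.

```python
ISOTOPE_GAP = 1.00335  # approximate 13C - 12C mass difference in Da


def chain_isotopes(mz_values, charge, tol=0.01):
    """Group m/z values into chains whose neighbours differ by ISOTOPE_GAP / charge."""
    step = ISOTOPE_GAP / charge
    mzs = sorted(mz_values)
    used = set()
    chains = []
    for i, mz in enumerate(mzs):
        if i in used:
            continue
        chain = [mz]
        used.add(i)
        for j in range(i + 1, len(mzs)):
            if j in used:
                continue
            gap = mzs[j] - chain[-1]
            if abs(gap - step) <= tol:
                # Next isotope of the same charge state.
                chain.append(mzs[j])
                used.add(j)
            elif gap > step + tol:
                # Sorted input: every later peak is even further away.
                break
        if len(chain) > 1:
            chains.append(chain)
    return chains


# Made-up peak list: one doubly charged and one triply charged pattern.
peaks = [500.000, 500.502, 501.003, 600.000, 600.334, 700.0]
z2 = chain_isotopes(peaks, charge=2)
z3 = chain_isotopes(peaks, charge=3)
```

With these values, the charge 2 pass picks up the three peaks near 500 m/z and the charge 3 pass picks up the pair near 600 m/z; the isolated peak at 700 m/z joins no chain.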

IsoGrouping_reportFeature_ev2r4.py: This script processes the cluster list generated in the previous step using the IsoGrouping module and prints the feature table. The feature table is saved at location 'resultpath'. The syntax and an example command to run in the terminal are provided below:
$ nohup python -u IsoGrouping_reportFeature_ev2r4.py [recordpath] [scanpath] [modelpath] [filename] [resultpath] [gpu_index] > output.log &
Example:
$ nohup python -u IsoGrouping_reportFeature_ev2r4.py /data/anne/dilution_series_syn_pep/hash_record/ /data/anne/dilution_series_syn_pep/scanned_result/ /data/anne/pointIso/3D_model/ 130124_dilA_1_01 /data/anne/pointIso/3D_result/ 0 > output.log &

Syntax for 4D TimsTOF data:

For 4D data, we convert the raw file to *.mzML using ProteoWizard. DO NOT select the 'merge scan' option during conversion. Then the following scripts are run in the order shown.

$ nohup python -u read_pointcloud_4DtimsTOF.py [filepath] [filename] [sample_name] [topath] > output.log &
Example:
$ nohup python -u read_pointcloud_4DtimsTOF.py '/data/anne/timsTOF/' 20180924_50ngHeLa_1.0.25.1_Hystar5.0SR1_S2-A1_1_2042.mzML A1_1_2042 '/data/anne/timsTOF/hash_records/' > output.log &

$ nohup python -u k0_dict_write.py [recordpath] [sample_name] > output.log &
Example:
$ nohup python -u k0_dict_write.py '/data/anne/timsTOF/hash_records/' 'A1_1_2042' > output.log &

The isoDetecting_scan_4DtimsTOF_pointNet.py script is run for segments 1 to 12 in parallel.
$ nohup python -u isoDetecting_scan_4DtimsTOF_pointNet.py [recordpath] [sample_name] [modelpath] [gpu_index] [segment] [scanpath] > output.log &
Example:
$ nohup python -u isoDetecting_scan_4DtimsTOF_pointNet.py '/data/anne/timsTOF/hash_records/' 'A1_1_2042' /data/anne/pointIso/4D_model/ 0 1 /data/anne/timsTOF/scanned_result/ > output.log &

$ nohup python -u makeCluster.py [recordpath] [sample_name] [scanpath] > output.log &
Example:
$ nohup python -u makeCluster.py '/data/anne/timsTOF/hash_records/' 'A1_1_2042' '/data/anne/timsTOF/scanned_result/' > output.log &

$ nohup python -u IsoGrouping_reportFeature_4DtimsTOF.py [recordpath] [sample_name] [modelpath] [gpu_index] [scanpath] > output.log &
Example:
$ nohup python -u IsoGrouping_reportFeature_4DtimsTOF.py '/data/anne/timsTOF/hash_records/' 'A1_1_2042' /data/anne/pointIso/4D_model/ 0 /data/anne/timsTOF/scanned_result/ > output.log &

The k0_separation_timsTOF_parallel.py script is run for segments 1 to 12 in parallel.
$ nohup python -u k0_separation_timsTOF_parallel.py [recordpath] [sample_name] [segment] [scanpath] > output.log &
Example:
$ nohup python -u k0_separation_timsTOF_parallel.py '/data/anne/timsTOF/hash_records/' 'A1_1_2042' 1 /data/anne/timsTOF/scanned_result/ > output.log &

$ nohup python -u make_feature_table_4DtimsTOF.py [scanpath] [sample_name] [resultpath] > output.log &
Example:
$ nohup python -u make_feature_table_4DtimsTOF.py '/data/anne/timsTOF/scanned_result/' 'A1_1_2042' '/data/anne/timsTOF/4D_result/' > output.log &
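
The 4D steps above can be summarised as an ordered plan. In the sketch below, each stage is a list of commands that may run in parallel, and the stages themselves run one after another. The two per-segment scripts expand to 12 commands each. The helper name timstof_pipeline is hypothetical, using one GPU index for all segments is an assumption, and the paths are taken from the examples above.

```python
SEGMENTS = range(1, 13)  # segments 1 to 12, as stated in the text


def timstof_pipeline(filepath, filename, sample, recordpath, modelpath,
                     scanpath, resultpath, gpu_index=0):
    """Return the 4D workflow as a list of stages (hypothetical helper)."""
    py = ["python", "-u"]
    gpu = str(gpu_index)
    return [
        [py + ["read_pointcloud_4DtimsTOF.py", filepath, filename, sample, recordpath]],
        [py + ["k0_dict_write.py", recordpath, sample]],
        [py + ["isoDetecting_scan_4DtimsTOF_pointNet.py", recordpath, sample,
               modelpath, gpu, str(seg), scanpath] for seg in SEGMENTS],
        [py + ["makeCluster.py", recordpath, sample, scanpath]],
        [py + ["IsoGrouping_reportFeature_4DtimsTOF.py", recordpath, sample,
               modelpath, gpu, scanpath]],
        [py + ["k0_separation_timsTOF_parallel.py", recordpath, sample,
               str(seg), scanpath] for seg in SEGMENTS],
        [py + ["make_feature_table_4DtimsTOF.py", scanpath, sample, resultpath]],
    ]


stages = timstof_pipeline(
    "/data/anne/timsTOF/",
    "20180924_50ngHeLa_1.0.25.1_Hystar5.0SR1_S2-A1_1_2042.mzML",
    "A1_1_2042",
    "/data/anne/timsTOF/hash_records/",
    "/data/anne/pointIso/4D_model/",
    "/data/anne/timsTOF/scanned_result/",
    "/data/anne/timsTOF/4D_result/",
)
stage_sizes = [len(stage) for stage in stages]
```

A driver could start every command of a stage with subprocess.Popen, wait for all of them to finish, and only then move on to the next stage.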

Files in this repository:
3D_models_IsoDetecting.7z
3D_models_IsoGrouping.7z
4D_models.zip
IsoGrouping_reportFeature_4DtimsTOF.py
IsoGrouping_reportFeature_ev2r4.py
PointIso_quantative_accuracy_csvlist.py
README.md
annotate_RT_index.py
auc_calculation.py
create_common_set_for_training.py
database_search_result_3D.zip
distribution.py
feature_table_to_csv.py
isoDetecting_scan_4DtimsTOF_pointNet.py
isoDetecting_scan_MS1_pointNet.py
isoDetecting_scan_MS1_pointNet_smaller.py
k0_dict_write.py
k0_separation_timsTOF_parallel.py
makeCluster.py
makeCluster_4DtimsTOF.py
make_feature_table_4DtimsTOF.py
pointIso_quantitative_accuracy.py
pointIso_spiked_peptide_detection.py
read_ionMobility_mq.py
read_pointCloud.py
read_pointcloud_4DtimsTOF.py
screenshot_proteoWizard.png
spiked_peptides_in_the_dilution_series_3D.csv

* Please cite the paper if you are using this project *
