Scripts for looking up peptides with custom grouping for uniqueness
This is a folder for looking up peptides from fractionation-MS experiments using custom grouping for uniqueness

One part of a larger scheme that goes:

  1. Convert .RAW thermo files in /MS/submit to mzXML in /MS/processed using local Windows MSConvert
  2. Run MSblender on TACC (ls5) $scratch using the setup (using branch
    • Download output to /MS/processed
  3. Format proteome for peptide lookup
  4. Lookup peptides

Analysis steps

Define grouping of peptides

1. Run eggnog-mapper to assign proteins to groups and format output

Running hmmer directly on a full proteome will take several days to process

$ nohup python /project/eggnog-mapper-0.99.2/ -i /project/cmcwhite/orthology_proteomics/proteomes/human/uniprot-reviewed%3Ayes+AND+proteome%3Aup000005640.fasta --output human_hmmer_euNOG -d euNOG --override --scratch_dir /project/cmcwhite/orthology_proteomics/proteomes/human/ -m hmmer --output_dir /project/cmcwhite/orthology_proteomics/eggnog_mapper  &> /project/cmcwhite/orthology_proteomics/logs/nohup_human_euNOG.txt &

Alternatively, break up the proteome into chunks and process in parallel using
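As an illustration of the chunking idea only, here is a minimal Python sketch that reads FASTA records and groups them into fixed-size chunks, each of which could then be written to its own file and processed separately. The function names `read_fasta` and `chunk_records` and the chunk size are hypothetical and are not part of this repository.

```python
from itertools import islice


def read_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def chunk_records(records, chunk_size):
    """Yield lists of at most chunk_size records (hypothetical helper)."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Example: three toy records split into chunks of two
toy_fasta = [">a", "MK", ">b", "AA", "CC", ">c", "GG"]
for number, chunk in enumerate(chunk_records(read_fasta(toy_fasta), 2)):
    print(number, [header for header, _ in chunk])
```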

The output from eggnog-mapper needs to be formatted:
$ format_emapper_output.R -f human_hmmer_euNOG.emapper.annotations -o human_hmmer_euNOG.mapping -s hmmer -l euNOG

This creates a file with the format:
ProteinID	ID

2. Do an artificial trypsin digest on a proteome

$ python scripts/ --input proteomes/human/uniprot-proteome%3AUP000005640.fasta --output proteomes/human/uniprot-proteome%3AUP000005640_peptides.csv --miss 2
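For readers unfamiliar with in silico digestion, the sketch below shows one common way to do it. It assumes the standard trypsin rule (cleave after K or R unless the next residue is P) and treats `--miss 2` as "allow up to two missed cleavages". The script in this repository may use different rules or output columns, and the function name `trypsin_digest` is hypothetical.

```python
import re


def trypsin_digest(sequence, missed_cleavages=2, min_length=1):
    """Return peptides from a tryptic digest of one protein sequence.

    Assumed rule: cleave C-terminal to K or R, except before P.
    """
    # Zero-width split after K/R when the next residue is not P
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for start in range(len(fragments)):
        # Join the fragment with up to `missed_cleavages` following fragments
        last = min(start + missed_cleavages, len(fragments) - 1)
        for end in range(start, last + 1):
            peptide = "".join(fragments[start:end + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


print(trypsin_digest("MKAARPGKTTR", missed_cleavages=1))
```

Note that the R followed by P in the toy sequence is not cleaved, so `AARPGK` stays in one piece.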

4. Get group-unique peptides

Identify groups of proteins from peptides that are unique to the proteins in a group

$ python scripts/ --spec human --grouping_type euNOG --grouping eggnog_mapper/human_hmmer_euk.mapping  --peptides proteomes/human/working_proteome/uniprot-proteome_human_reviewed_peptides.csv --output_dir proteomes/human/working_proteome/
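The idea of a group-unique peptide can be sketched as follows: a peptide is kept only if every protein it occurs in belongs to the same group. This is an illustration, not the repository's script. The function name `group_unique_peptides` is hypothetical, and treating a protein without a group assignment as its own single-member group is an assumption made here.

```python
from collections import defaultdict


def group_unique_peptides(protein_peptides, protein_to_group):
    """Map each peptide to its group if it occurs in exactly one group.

    protein_peptides: iterable of (protein_id, peptide) pairs
    protein_to_group: dict of protein_id -> group_id
    """
    groups_by_peptide = defaultdict(set)
    for protein, peptide in protein_peptides:
        # Assumption: an unmapped protein forms its own group
        group = protein_to_group.get(protein, protein)
        groups_by_peptide[peptide].add(group)
    return {
        peptide: next(iter(groups))
        for peptide, groups in groups_by_peptide.items()
        if len(groups) == 1
    }


mapping = {"P1": "ENOG1", "P2": "ENOG1", "P3": "ENOG2"}
pairs = [("P1", "AAK"), ("P2", "AAK"), ("P1", "CCR"),
         ("P3", "CCR"), ("P3", "DDK"), ("P4", "EEK")]
print(group_unique_peptides(pairs, mapping))
```

Here `AAK` is shared by two proteins but both are in `ENOG1`, so it is still group-unique, while `CCR` spans two groups and is dropped.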

Identify proteins in an experiment

1. Consolidate identified peptides from multiple experiments into a single file

$ bash scripts/ /MS/processed/Fusion_data/ExperimentA/output ExperimentA elutions/


     ACDER 1
     ETIAJR 2

     GFEAR 1
     AYTQWER 3


ExperimentA_elution.csv
ExperimentA,fraction1,ACDER,1
ExperimentA,fraction1,ETIAJR,2
ExperimentA,fraction2,GFEAR,1
ExperimentA,fraction2,AYTQWER,3

These formatted files are stored in the elutions/ folder
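The consolidation above can be sketched in Python as follows, assuming each fraction provides whitespace-separated "peptide count" lines as in the example. The function names `consolidate` and `rows_to_csv` are hypothetical, and how the real script finds the per-fraction files is not shown here.

```python
import csv
import io


def consolidate(experiment, fraction_lines):
    """Build tidy (experiment, fraction, peptide, count) rows.

    fraction_lines: dict of fraction name -> iterable of 'PEPTIDE COUNT' lines
    """
    rows = []
    for fraction, lines in fraction_lines.items():
        for line in lines:
            parts = line.split()
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            peptide, count = parts
            rows.append((experiment, fraction, peptide, int(count)))
    return rows


def rows_to_csv(rows):
    """Serialize rows as comma-separated text, one row per line."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


fractions = {
    "fraction1": ["     ACDER 1", "     ETIAJR 2"],
    "fraction2": ["     GFEAR 1", "     AYTQWER 3"],
}
print(rows_to_csv(consolidate("ExperimentA", fractions)))
```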

Note: don't do the lookup by protein. Use the weighted peptide output instead.
2. Lookup peptides by protein

Do the lookup:

$ python scripts/ human protein ExperimentA elutions/ExperimentA_elution.csv proteomes/human/working_proteome/unique_peptides_human_protein.csv proteomes/contam/contam_benzo_peptides.csv

$ python scripts/ identified_elutions/human/ExperimentA_elution_human_protein.csv

Transform columns to a wide table, e.g.

   "tidy elution format"
   ExperimentA,fraction1,protein1,10
   ExperimentA,fraction2,protein1,30
   ExperimentA,fraction1,protein2,3
   ExperimentA,fraction2,protein2,2

   "wide elution format"

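A tidy-to-wide transformation of this kind can be sketched in plain Python as below. The layout chosen here (one row per ID, one column per fraction, zero for missing values) is an assumption. The actual wide format written by the script in this repository may differ, and the function name `tidy_to_wide` is hypothetical.

```python
def tidy_to_wide(rows):
    """Pivot tidy (experiment, fraction, id, count) rows to a wide table.

    Returns (header, table): header is ["ID", fraction names...],
    table has one row per ID with a count for each fraction.
    """
    fractions = []  # column order: first appearance
    ids = []        # row order: first appearance
    counts = {}
    for _experiment, fraction, ident, count in rows:
        if fraction not in fractions:
            fractions.append(fraction)
        if ident not in ids:
            ids.append(ident)
        counts[(ident, fraction)] = counts.get((ident, fraction), 0) + count
    header = ["ID"] + fractions
    # Assumption: a fraction with no observation gets a count of 0
    table = [[ident] + [counts.get((ident, f), 0) for f in fractions]
             for ident in ids]
    return header, table


tidy = [
    ("ExperimentA", "fraction1", "protein1", 10),
    ("ExperimentA", "fraction2", "protein1", 30),
    ("ExperimentA", "fraction1", "protein2", 3),
    ("ExperimentA", "fraction2", "protein2", 2),
]
header, table = tidy_to_wide(tidy)
print(header)
for row in table:
    print(row)
```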

3. Lookup peptides according to a grouping of proteins

3. Get protein-unique peptides

Identify proteins from peptides that are unique to single proteins

$ python scripts/ --spec human --grouping_type protein  --peptides proteomes/human/working_proteome/uniprot-proteome_human_reviewed_peptides.csv --output_dir proteomes/human/working_proteome/

$ python scripts/ human euNOG ExperimentA elutions/ExperimentA_elution.csv proteomes/human/working_proteome/unique_peptides_human_euNOG.csv proteomes/contam/contam_benzo_peptides.csv
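Conceptually, the lookup joins the identified peptides from the elution file against the table of group-unique peptides, after discarding contaminant peptides. The following sketch illustrates that idea under simple assumptions: counts for peptides of the same group are summed per fraction, and peptides that are not group-unique are ignored. The function name `lookup_peptides` is hypothetical, and the real script may weight or filter peptides differently.

```python
from collections import defaultdict


def lookup_peptides(elution_rows, peptide_to_id, contaminant_peptides=frozenset()):
    """Sum peptide counts per (experiment, fraction, ID).

    elution_rows: iterable of (experiment, fraction, peptide, count)
    peptide_to_id: dict of unique peptide -> protein or group ID
    contaminant_peptides: set of peptides to discard
    """
    totals = defaultdict(int)
    for experiment, fraction, peptide, count in elution_rows:
        if peptide in contaminant_peptides:
            continue  # drop peptides that match the contaminant set
        ident = peptide_to_id.get(peptide)
        if ident is None:
            continue  # peptide is not unique to any single group
        totals[(experiment, fraction, ident)] += count
    return [(e, f, i, c) for (e, f, i), c in totals.items()]


elution = [
    ("ExperimentA", "fraction1", "ACDER", 1),
    ("ExperimentA", "fraction1", "ETIAJR", 2),
    ("ExperimentA", "fraction2", "GFEAR", 1),
    ("ExperimentA", "fraction2", "AYTQWER", 3),
]
# Toy lookup tables, made up for this example
unique = {"ACDER": "ENOG1", "ETIAJR": "ENOG1", "GFEAR": "ENOG2"}
contaminants = {"AYTQWER"}
for row in lookup_peptides(elution, unique, contaminants):
    print(row)
```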

$ python scripts/ identified_elutions/human/ExperimentA_elution_human_euNOG.csv eggnog_mapper/human_hmmer_euNOG.mapping annotation_files/all_annotations.csv

Going to be removed, as this extra format is not very useful. Similar to, but also creates an alternate format that shows the proteins in a group.

