# Proteomics - Philosopher Pilot
```
pi:ababaian
files: ~/Crown/data2/proteomics_pilot_philosopher/
start: 2019 11 29
complete : YYYY MM DD
```
## Introduction

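Pilot run of Philosopher on a TMT-labelled experiment, following the [online TMT analysis tutorial](https://github.com/Nesvilab/philosopher/wiki/TMT-Analysis). A single CPTAC mzML file is searched with MSFragger, validated with PeptideProphet and ProteinProphet, filtered to 1% FDR, and quantified.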



## Materials and Methods

### Initialize project workspace

In [1]:
# Initialize a home folder for analysis
HOME='/home/artem/Desktop/Crown/data2/proteomics_pilot_philosopher'
cd $HOME
mkdir -p workdir

# ln -s ~/"CPTAC Colon Cancer Confirmatory Study/01CPTAC_COprospective_Proteome_PNNL_20170123/mzML_data" mzML




In [2]:
# Fragpipe software home
FRAG='/home/artem/Desktop/FragPipe/FragPipe-12.1/'
# CPTAC data dir
CPTAC="$HOME/mzML_data"

# inMZML
inMZML="$HOME/workdir/01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.mzML"

# Software
philosopher="$FRAG/lib/philosopher"
MSFragger="$FRAG/lib/MSFragger-2.2.jar"
TMTIntegrator="$FRAG/lib/TMTIntegrator_v1.0.8.jar"
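
Before running the later cells, it can help to confirm that the paths defined above point at real files. The following is a minimal sketch, assuming a hypothetical helper `require_file` (it is not part of Philosopher or FragPipe). The demonstration uses a temporary stand-in file so that it runs on any machine.

```shell
# require_file: hypothetical helper, not part of Philosopher or FragPipe.
# Succeeds if the path exists; otherwise reports it on stderr and fails.
require_file() {
  if [ -e "$1" ]; then
    return 0
  fi
  echo "missing: $1" >&2
  return 1
}

# Demonstration on a temporary stand-in file and a path that does not exist
demo_file=$(mktemp)
missing_count=0
for path in "$demo_file" "$demo_file.does_not_exist"; do
  require_file "$path" || missing_count=$((missing_count + 1))
done
echo "missing files: $missing_count"
```

In this notebook the loop would instead run over `"$philosopher" "$MSFragger" "$TMTIntegrator" "$inMZML"`.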




In [3]:
# Initialize workspace
cd $HOME/workdir
$philosopher workspace --init

INFO[14:05:29] Executing Workspace v2.0.0
INFO[14:05:29] Creating workspace
INFO[14:05:30] Done


In [4]:
# Download the UniProt human proteome (UP000005640) with common contaminants (--contam)
# Decoys are generated automatically
$philosopher database --id UP000005640 --contam
#$philosopher database --annotate 2019-11-20-td-rev-UP000005640.fas --prefix rev_

INFO[14:05:34] Executing Database v2.0.0
INFO[14:05:34] Fetching database
INFO[14:07:37] Processing decoys
INFO[14:07:40] Creating file
INFO[14:07:52] Processing decoys
INFO[14:07:54] Creating file
INFO[14:07:56] Done


In [5]:
# Initialize configuration files
java -jar "$MSFragger" --config

Creating configuration files
Writing file: /home/artem/Crown/data2/proteomics_pilot_philosopher/workdir/closed_fragger.params
Writing file: /home/artem/Crown/data2/proteomics_pilot_philosopher/workdir/open_fragger.params
Writing file: /home/artem/Crown/data2/proteomics_pilot_philosopher/workdir/nonspecific_fragger.params


In [6]:
# Update configuration file for TMT
sed -i "s/database_name = test.fasta/database_name = 2019-11-29-td-UP000005640.fas/g" closed_fragger.params

# Variable modification: TMT label on the peptide N-terminus
sed -i "s/variable_mod_02 = 42.01060 \[^/variable_mod_02 = 42.01060 [^\nvariable_mod_03 = 229.162932 n^/g" closed_fragger.params

# Lysine TMT label
sed -i "s/add_K_lysine = 0.000000/add_K_lysine = 229.162932/g" closed_fragger.params
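
Because `sed -i` edits silently, a pattern that does not match leaves the file unchanged without any warning. The sketch below replays the same three substitutions on a mock file and counts the settings that ended up with the expected values. The mock contains only the three default lines targeted by the patterns above; the full layout of `closed_fragger.params` is an assumption here. It relies on GNU `sed` (for `-i` and `\n` in the replacement), as on the Ubuntu system used in this notebook.

```shell
# Mock file holding only the three default lines the patterns above target
params=$(mktemp)
cat > "$params" <<'EOF'
database_name = test.fasta
variable_mod_02 = 42.01060 [^
add_K_lysine = 0.000000
EOF

# Same substitutions as in the cell above, applied to the mock file
sed -i "s/database_name = test.fasta/database_name = 2019-11-29-td-UP000005640.fas/g" "$params"
sed -i "s/variable_mod_02 = 42.01060 \[^/variable_mod_02 = 42.01060 [^\nvariable_mod_03 = 229.162932 n^/g" "$params"
sed -i "s/add_K_lysine = 0.000000/add_K_lysine = 229.162932/g" "$params"

# Count the lines that now carry the expected TMT settings
n_ok=$(grep -c -E '^(database_name = 2019-11-29-td-UP000005640\.fas|variable_mod_03 = 229\.162932 n\^|add_K_lysine = 229\.162932)$' "$params")
echo "settings updated: $n_ok of 3"
```

Running the same `grep` against the real `closed_fragger.params` after the `sed` calls gives a quick confirmation that all three edits took effect.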




In [8]:
cd $HOME/workdir
echo $inMZML
echo $MSFragger
echo 
echo
# Run MSFragger on TMT data
java -Xmx6g -jar "$MSFragger" closed_fragger.params "$inMZML"

/home/artem/Desktop/Crown/data2/proteomics_pilot_philosopher/workdir/01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.mzML
/home/artem/Desktop/FragPipe/FragPipe-12.1//lib/MSFragger-2.2.jar


MSFragger version MSFragger-2.2
Batmass-IO version 1.17.1
(c) University of Michigan
RawFileReader reading tool. Copyright (c) 2016 by Thermo Fisher Scientific, Inc. All rights reserved.
System OS: Linux, Architecture: amd64
Java Info: 11.0.4, OpenJDK 64-Bit Server VM, Ubuntu
JVM started with 6 GB memory

Checking /home/artem/Desktop/Crown/data2/proteomics_pilot_philosopher/workdir/01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.mzML...

***********************************FIRST SEARCH************************************
Parameters:
num_threads = 4
database_name = 2019-11-29-td-UP000005640.fas
decoy_prefix = rev_
precursor_mass_lower = -20.0
precursor_mass_upper = 20.0
precursor_mass_units = 1
precursor_true_tolerance = 20.0
precursor_true_units = 1
fragment_mass_tolerance = 20.
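
The downstream cells refer to the search results by file name. As a small sketch, based only on the file names visible in this notebook, those names can be derived from `$inMZML` instead of being typed out. The variable names `stem`, `pepxml`, and `interact` are illustrative.

```shell
# Input path as defined earlier in the notebook
inMZML="/home/artem/Desktop/Crown/data2/proteomics_pilot_philosopher/workdir/01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.mzML"

# Strip the directory and the .mzML extension
stem=$(basename "$inMZML" .mzML)

# File names as they appear in the PeptideProphet cell and its log
pepxml="$stem.pepXML"
interact="interact-$stem.pep.xml"

echo "$pepxml"
echo "$interact"
```

This becomes more useful once several fractions are processed in a loop, since each fraction then needs only its mzML path.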

In [9]:
# Validate PSMs with PeptideProphet
$philosopher peptideprophet --database 2019-11-29-td-UP000005640.fas \
  --ppm --accmass --expectscore --decoyprobs \
  --nonparam 01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.pepXML

INFO[14:18:46] Executing PeptideProphet v2.0.0
 file 1: /home/artem/Desktop/Crown/data2/proteomics_pilot_philosopher/workdir/01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.pepXML
Unknown file type. No file loaded./home/artem/Desktop/Crown/data2/proteomics_pilot_philosopher/workdir/01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.mzBIN_calibrated
SUCCESS: CORRECTED data file /home/artem/Desktop/Crown/data2/proteomics_pilot_philosopher/workdir/01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.mzML in msms_run_summary tag...
 processed altogether 34569 results
INFO: Results written to file: /home/artem/Desktop/Crown/data2/proteomics_pilot_philosopher/workdir/interact-01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.pep.xml

  - /home/artem/Desktop/Crown/data2/proteomics_pilot_philosopher/workdir/interact-01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.pep.xml
  - Building Commentz-Walter keyword tree...
  - Searching the tree...
  - Linking duplicate entries...
 

In [10]:
# Perform protein inference and write a protXML file
$philosopher proteinprophet interact-01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.pep.xml

INFO[14:20:16] Executing ProteinProphet v2.0.0
ProteinProphet (C++) by Insilicos LLC and LabKey Software, after the original Perl by A. Keller (TPP v5.2.1-dev Flammagenitus, Build 201906251008-exported (Linux-x86_64))
 (no FPKM) (using degen pep info)
Reading in /home/artem/Desktop/Crown/data2/proteomics_pilot_philosopher/workdir/interact-01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.pep.xml...
...read in 0 1+, 6282 2+, 8355 3+, 2283 4+, 367 5+, 38 6+, 0 7+ spectra with min prob 0.05

Initializing 12195 peptide weights: 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Calculating protein lengths and molecular weights from database /home/artem/Desktop/Crown/data2/proteomics_pilot_philosopher/workdir/2019-11-29-td-UP000005640.fas
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........1000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........2000
.....

In [11]:
# Filter matches and estimate FDR
$philosopher filter --razor \
  --pepxml  interact-01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01.pep.xml \
  --protxml interact.prot.xml

INFO[14:21:04] Executing Filter v2.0.0
INFO[14:21:04] Processing peptide identification files
INFO[14:21:07] 1+ Charge profile decoy=0 target=0
INFO[14:21:07] 2+ Charge profile decoy=322 target=5960
INFO[14:21:07] 3+ Charge profile decoy=148 target=8208
INFO[14:21:07] 4+ Charge profile decoy=26 target=2257
INFO[14:21:07] 5+ Charge profile decoy=9 target=358
INFO[14:21:07] 6+ Charge profile decoy=0 target=38
INFO[14:21:07] Database search results ions=12180 peptides=9100 psms=17326
INFO[14:21:07] Converged to 1.00 % FDR with 15801 PSMs
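
The per-charge decoy and target counts in the filter log above can be cross-checked against the reported PSM total. The sketch below sums them with `awk` and prints a plain decoy-to-target ratio before filtering. That ratio is only a rough orientation figure and is not claimed to be how Philosopher computes its FDR.

```shell
# Charge-profile lines copied from the filter log above (colour codes removed)
totals=$(awk '
  {
    for (i = 1; i <= NF; i++) {
      n = split($i, kv, "=")
      if (n == 2 && kv[1] == "decoy")  decoys  += kv[2]
      if (n == 2 && kv[1] == "target") targets += kv[2]
    }
  }
  END { printf "%d %d %.2f", decoys, targets, 100 * decoys / targets }
' <<'EOF'
1+ Charge profile decoy=0 target=0
2+ Charge profile decoy=322 target=5960
3+ Charge profile decoy=148 target=8208
4+ Charge profile decoy=26 target=2257
5+ Charge profile decoy=9 target=358
6+ Charge profile decoy=0 target=38
EOF
)

# Split the three space-separated values into named variables
set -- $totals
decoys=$1
targets=$2
ratio_pct=$3

echo "decoys=$decoys targets=$targets total=$((decoys + targets)) decoy/target=${ratio_pct}%"
```

The total of 17326 agrees with `psms=17326` on the "Database search results" line, so the charge profile accounts for every PSM that entered the filter.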

In [12]:
# Annotation File for TMT
echo "126 normal_1
127N normal_2
127C tumour_norm_1
128N tumour_mid_1
128C tumour_mid_2
129N tumour_na_1
129C tumour_na_2
130N tumour_hypo_1
130C tumour_hypo_2
131N pool" > annotation.txt
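
Since the quantification step is run with `--plex 10`, the annotation file should presumably carry one line per channel. The following is a quick sanity check, written as a sketch. It writes the same ten lines to a scratch directory (so the project file is untouched) and counts well-formed rows, distinct channel labels, and distinct sample names. The two-column, space-separated layout is taken from the cell above.

```shell
# Write the same annotation to a scratch directory so the project file is untouched
scratch=$(mktemp -d)
annotation="$scratch/annotation.txt"
echo "126 normal_1
127N normal_2
127C tumour_norm_1
128N tumour_mid_1
128C tumour_mid_2
129N tumour_na_1
129C tumour_na_2
130N tumour_hypo_1
130C tumour_hypo_2
131N pool" > "$annotation"

# Rows with exactly two whitespace-separated fields (channel, sample name)
n_rows=$(awk 'NF == 2 { n++ } END { print n + 0 }' "$annotation")

# Distinct channel labels and distinct sample names
n_channels=$(awk '{ print $1 }' "$annotation" | sort -u | awk 'END { print NR }')
n_samples=$(awk '{ print $2 }' "$annotation" | sort -u | awk 'END { print NR }')

echo "rows=$n_rows channels=$n_channels samples=$n_samples"
```

If any of the three counts differs from 10, a channel label or sample name is duplicated, or a line is malformed.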



In [13]:
# Quantify TMT 10-plex data
# output:
#  ion.tsv
$philosopher labelquant --plex 10 --dir .

## --uniqueonly

INFO[14:22:32] Executing Isobaric-label quantification v2.0.0
INFO[14:22:35] Calculating intensities and ion interference
INFO[14:22:35] Processing 01CPTAC_COprospective_W_PNNL_20170123_B1S1_f01
INFO[14:23:12] Filtering spectra for label quantification
INFO[14:23:12] Removing 0 PSMs from isobaric quantification
INFO[14:23:12] Calculating normalized protein levels
INFO[14:23:12] Saving
INFO[14:23:14] Done


In [15]:
# Quantify TMT 10-plex data with TMT-Integrator
# Requires psm.tsv, which is written by `philosopher report` (cell In [14]); run that cell first
# TMTIntegrator="$FRAG/lib/TMTIntegrator_v1.0.8.jar"
java -Xmx8g -jar "$TMTIntegrator" TMTIntegrator.config.yaml psm.tsv




In [14]:
# Create an output report
# output:
#   psm.tsv
#   peptide.tsv
#   protein.tsv
$philosopher report

INFO[14:23:18] Executing Report v2.0.0
INFO[14:23:21] Creating reports
INFO[14:23:22] Done


## Results


### 

## Discussion
