# Inferring phospho peptides

In the following, we use AlphaQuant's proteoform analysis and combine it with deep learning predictions that estimate how prone a given peptide sequence is to phosphorylation. We use the combination of the two to predict phosphorylated proteoforms.

As with the standard differential expression analysis, we need:

* an input file from a proteomics search engine. We support most common search engines. Specifications on input files are given in our [README](https://github.com/MannLabs/alphaquant/blob/master/README.md#preparing-input-files).
* a sample mapping file that maps each sample to a condition (e.g. sample 'brain_replicate_1' is mapped to condition 'brain').
* (optional) a results directory in which to save the output
* (optional) a list specifying which conditions to compare

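Additionally, we need to specify whether we want to perform phospho inference, which is controlled by the `perform_phospho_inference` argument of `run_pipeline`.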
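To make the sample mapping file from the list above concrete, here is a minimal sketch that writes one as tab-separated text. The sample names are made up, and the header names `sample` and `condition` are an assumption; please check the README linked above for the exact format that is expected.

```python
import csv
import io

# Hypothetical sample names; the header names "sample" and "condition" are assumed.
rows = [
    ("egf_treated_rep1", "egf_treated"),
    ("egf_treated_rep2", "egf_treated"),
    ("untreated_rep1", "untreated"),
    ("untreated_rep2", "untreated"),
]


def write_samplemap(rows, handle):
    """Write (sample, condition) pairs as a tab-separated table with a header."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample", "condition"])
    writer.writerows(rows)


# Write to an in-memory buffer; pass an open file handle instead to save to disk.
buffer = io.StringIO()
write_samplemap(rows, buffer)
samplemap_text = buffer.getvalue()
print(samplemap_text)
```

Each sample appears exactly once, and the condition names are the ones that are later used in the condition pairs.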




In [None]:
INPUT_FILE = "./data/phospho/proteome_subset.tsv"
SAMPLEMAP_FILE = "./data/phospho/samplemap_proteome.tsv"
RESULTS_DIRECTORY = "./data/phospho/results_phospho_inference"

CONDPAIRS_LIST = [("egf_treated", "untreated")]  # each fold change (fc) is computed as egf_treated/untreated
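The order within a condition pair determines the sign of the fold change. The following toy sketch only illustrates that direction convention with made-up intensities; it is not how AlphaQuant estimates fold changes, and the use of a log2 scale and of the median are assumptions made for the illustration.

```python
import math
from statistics import median

# Made-up intensities of one peptide in three replicates per condition.
egf_treated = [2000.0, 2200.0, 1800.0]
untreated = [1000.0, 1100.0, 900.0]


def log2_fold_change(numerator, denominator):
    """Toy log2 fold change: ratio of the median intensities."""
    return math.log2(median(numerator) / median(denominator))


# ("egf_treated", "untreated") means egf_treated is the numerator.
fc = log2_fold_change(egf_treated, untreated)
# Swapping the pair flips the sign.
fc_reversed = log2_fold_change(untreated, egf_treated)
print(fc, fc_reversed)
```

With this convention, a positive value means the peptide is more abundant in `egf_treated` than in `untreated`.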


In [None]:
import alphaquant.run_pipeline as aq_pipeline

aq_pipeline.run_pipeline(input_file=INPUT_FILE, samplemap_file=SAMPLEMAP_FILE, results_dir=RESULTS_DIRECTORY, condpairs_list=CONDPAIRS_LIST, organism="human", 
                         perform_phospho_inference=True, cluster_threshold_pval=0.00001, take_median_ion=False, fcdiff_cutoff_clustermerge=0)

In [None]:
import pandas as pd

proteoform_df = pd.read_csv(RESULTS_DIRECTORY + "/egf_treated_VS_untreated.proteoforms.tsv", sep='\t')
display(proteoform_df)
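The file name read above appears to follow the pattern `<condition1>_VS_<condition2>.proteoforms.tsv`. This pattern is inferred from the single file name used in this notebook, so treat it as an assumption. A small hypothetical helper (not part of AlphaQuant) that builds the path from a condition pair could look like this:

```python
from pathlib import Path


def proteoform_table_path(results_dir, condition1, condition2):
    """Build the path of the proteoform table for one condition pair.

    The naming pattern is inferred from this notebook, not from AlphaQuant's docs.
    """
    return Path(results_dir) / f"{condition1}_VS_{condition2}.proteoforms.tsv"


path = proteoform_table_path(
    "./data/phospho/results_phospho_inference", "egf_treated", "untreated"
)
print(path)
```

The resulting path can be passed to `pd.read_csv(path, sep="\t")` in the same way as the string used above.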

In [None]:
import alphaquant.utils.diffquant_utils as aq_diffquant_utils

proteoform_df_filtered = aq_diffquant_utils.filter_proteoform_df(proteoform_df=proteoform_df, min_num_peptides=1, likely_phospho=True, keep_reference_proteoform=True)
display(proteoform_df_filtered)

In [None]:
import alphaquant.plotting.fcviz as aq_fcviz

proteins_of_interest = proteoform_df_filtered['protein'].unique()

fc_visualizer = aq_fcviz.FoldChangeVisualizer(condition1="egf_treated", condition2="untreated", results_directory=RESULTS_DIRECTORY, samplemap_file=SAMPLEMAP_FILE)
fc_visualizer.plot_list_of_proteins(proteins_of_interest)

In [None]:
import alphaquant.plotting.alphamapviz as aq_alphamapviz

alphamap_visualizer = aq_alphamapviz.AlphaMapVisualizer(condition1="egf_treated", condition2="untreated", results_directory=RESULTS_DIRECTORY, samplemap_file=SAMPLEMAP_FILE, organism="Human")

In [None]:
fc_plot, alphamap_plot = alphamap_visualizer.visualize_protein("EGFR")
fc_plot.show()
alphamap_plot.show()