# Cis effects enrichment - g:Profiler

This notebook uses the [g:Profiler tool](https://biit.cs.ut.ee/gprofiler/gost) to find pathways enriched among proteins that showed cis effects in multiple cancers.

## Setup

In [1]:
import pandas as pd
import numpy as np
import gprofiler
import cptac.utils as ut
import IPython.display

In [2]:
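def run_gprofiler(input_file):
    """Run a g:Profiler enrichment query on the proteins listed in a TSV file.

    The file must be tab-separated and contain a "protein" column.
    Returns the enrichment results as a pandas DataFrame.
    """
    # Load the protein identifiers to query
    input_df = pd.read_csv(input_file, sep="\t")
    protein_list = input_df["protein"].tolist()

    # return_dataframe=True makes profile() return a DataFrame
    gp = gprofiler.GProfiler(return_dataframe=True)

    # Unordered query against GO biological process, KEGG, Reactome, and WikiPathways
    results = gp.profile(
        organism="hsapiens",
        query=protein_list,
        ordered=False,
        sources=["GO:BP", "KEGG", "REAC", "WP"]
    )

    return results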

### 8p cis effects

In [3]:
run_gprofiler("pancancer_summary_8p_cis.tsv")

index,source,native,name,p_value,significant,description,term_size,query_size,intersection_size,effective_domain_size,precision,recall,query,parents
0,REAC,REAC:R-HSA-8853336,Signaling by plasma membrane FGFR1 fusions,0.009989,True,Signaling by plasma membrane FGFR1 fusions,3,34,2,10588,0.058824,0.666667,query_1,[REAC:R-HSA-1839124]
1,GO:BP,GO:0031468,nuclear envelope reassembly,0.021981,True,"""The reformation of the nuclear envelope follo...",18,44,3,17916,0.068182,0.166667,query_1,[GO:0006998]
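
The `precision` and `recall` columns in the table can be reproduced from the size columns: the reported values match `intersection_size / query_size` and `intersection_size / term_size` respectively. The sketch below checks this for the first row; the helper name `precision_recall` is made up for illustration and is not part of the `gprofiler` package. Note that `query_size` differs between sources (34 for REAC, 44 for GO:BP), most likely because only query proteins annotated in a given source are counted.

```python
def precision_recall(intersection_size, query_size, term_size):
    """Return (precision, recall) as they appear in the g:Profiler output.

    precision: fraction of the query proteins that belong to the term
    recall:    fraction of the term's proteins that appear in the query
    """
    precision = intersection_size / query_size
    recall = intersection_size / term_size
    return precision, recall


# Values copied from the first row of the 8p table (REAC:R-HSA-8853336)
precision, recall = precision_recall(intersection_size=2, query_size=34, term_size=3)

print(round(precision, 6))  # 0.058824
print(round(recall, 6))     # 0.666667
```

A high recall with a tiny `term_size`, as here, means that two of only three annotated proteins were hit, so the result rests on very few genes.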


### 8q cis effects

In [4]:
run_gprofiler("pancancer_summary_8q_cis.tsv")

index,source,native,name,p_value,significant,description,term_size,query_size,intersection_size,effective_domain_size,precision,recall,query,parents
0,GO:BP,GO:0090501,RNA phosphodiester bond hydrolysis,0.001431,True,"""The RNA metabolic process in which the phosph...",155,117,9,17916,0.076923,0.058065,query_1,"[GO:0016070, GO:0090305]"
1,GO:BP,GO:0090502,"RNA phosphodiester bond hydrolysis, endonucleo...",0.001484,True,"""The chemical reactions and pathways involving...",79,117,7,17916,0.059829,0.088608,query_1,[GO:0090501]
2,GO:BP,GO:0090305,nucleic acid phosphodiester bond hydrolysis,0.009966,True,"""The nucleic acid metabolic process in which t...",308,117,11,17916,0.094017,0.035714,query_1,[GO:0090304]
3,WP,WP:WP2363,Gastric Cancer Network 2,0.015219,True,Gastric Cancer Network 2,31,60,4,6954,0.066667,0.129032,query_1,[WP:000000]
4,GO:BP,GO:0006259,DNA metabolic process,0.02165,True,"""Any cellular metabolic process involving deox...",950,117,19,17916,0.162393,0.02,query_1,"[GO:0044260, GO:0090304]"
5,GO:BP,GO:0034641,cellular nitrogen compound metabolic process,0.0229,True,"""The chemical reactions and pathways involving...",6463,117,65,17916,0.555556,0.010057,query_1,"[GO:0006807, GO:0044237]"
6,GO:BP,GO:0006725,cellular aromatic compound metabolic process,0.024193,True,"""The chemical reactions and pathways involving...",5905,117,61,17916,0.521368,0.01033,query_1,[GO:0044237]
7,GO:BP,GO:0006281,DNA repair,0.035411,True,"""The process of restoring DNA after damage. Ge...",567,117,14,17916,0.119658,0.024691,query_1,"[GO:0006259, GO:0006974]"
8,KEGG,KEGG:00190,Oxidative phosphorylation,0.038174,True,Oxidative phosphorylation,133,59,6,7747,0.101695,0.045113,query_1,[KEGG:00000]
9,GO:BP,GO:0044237,cellular metabolic process,0.043554,True,"""The chemical reactions and pathways by which ...",10857,117,92,17916,0.786325,0.008474,query_1,"[GO:0008152, GO:0009987]"
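
Several of the significant 8q terms are very broad: "cellular metabolic process", for example, covers 10857 of the 17916 genes in the GO:BP domain. One possible way to focus on more specific terms is to drop rows above a `term_size` cutoff. The sketch below does this on three rows copied from the table. The cutoff of 1000 and the helper name `drop_broad_terms` are assumptions for illustration, not part of the original analysis.

```python
# Three rows copied from the 8q results table; only the columns needed here.
results = [
    {"name": "RNA phosphodiester bond hydrolysis", "term_size": 155, "p_value": 0.001431},
    {"name": "DNA repair", "term_size": 567, "p_value": 0.035411},
    {"name": "cellular metabolic process", "term_size": 10857, "p_value": 0.043554},
]

MAX_TERM_SIZE = 1000  # assumed cutoff, not taken from the original analysis


def drop_broad_terms(rows, max_term_size=MAX_TERM_SIZE):
    """Keep only terms annotated to at most max_term_size genes."""
    return [row for row in rows if row["term_size"] <= max_term_size]


specific = drop_broad_terms(results)
for row in specific:
    print(row["name"], row["term_size"], row["p_value"])
```

On the DataFrame returned by `run_gprofiler`, the equivalent filter would be a boolean mask on the `term_size` column, such as `results[results["term_size"] <= 1000]`.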
