# Pathway Enrichment Analysis File #

This notebook tests which pathways are significantly enriched among high-synergy drug combinations within each toxicity category.

Steps:
- Retrieve the drug combinations along with their synergy scores and toxicity categories
- Retrieve the pathways targeted by each of these drug combinations, plus the pathway sets of interest to test (perhaps the higher-level pathways that group the lowest-level pathways)
- Break up the drug combination dataset into one subset per toxicity category (drugcombo_major, drugcombo_moderate, drugcombo_minor)
- For each of these drug combination toxicity datasets
    - Rank them by synergy score
    - Run a hypergeometric test on each pathway set to get an enrichment p-value (this is equivalent to a one-sided Fisher's exact test)
    - For each pathway set, create a 2x2 contingency table:
        - Rows: drug combinations with high synergy vs. not high synergy (this requires a synergy threshold)
        - Columns: hits the pathway set vs. doesn't hit the pathway set
        - Then run Fisher's exact test on this table
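
The contingency-table step above can be sketched as follows. This is a minimal illustration, not the final analysis code. The column names (`synergy_score`, `pathway_A`), the threshold of 10, and the toy values are all hypothetical placeholders, not taken from `syntoxtarg_allpw.csv`. It also checks that the one-sided Fisher's exact p-value matches the hypergeometric tail probability.

```python
import pandas as pd
from scipy import stats


def pathway_enrichment(df, pathway_col, synergy_col="synergy_score", threshold=10.0):
    """One-sided Fisher's exact test for one pathway set in one toxicity subset.

    Column names and the threshold are assumptions made for this sketch.
    """
    high = df[synergy_col] >= threshold   # high synergy vs. not high synergy
    hits = df[pathway_col].astype(bool)   # hits the pathway set vs. doesn't

    # 2x2 table: rows = high / not high synergy, columns = hits / doesn't hit
    table = [
        [int((high & hits).sum()), int((high & ~hits).sum())],
        [int((~high & hits).sum()), int((~high & ~hits).sum())],
    ]
    odds_ratio, p_value = stats.fisher_exact(table, alternative="greater")
    return table, odds_ratio, p_value


# Toy data for a single toxicity category (values are made up for illustration)
toy = pd.DataFrame({
    "synergy_score": [25, 18, 15, 12, 5, 3, 1, -2, -5, -8],
    "pathway_A": [1, 1, 1, 0, 0, 0, 1, 0, 0, 0],
})

table, odds_ratio, p_value = pathway_enrichment(toy, "pathway_A")

# Same p-value from the hypergeometric distribution:
# P(X >= a) with population size M, n hits in the population, N high-synergy draws
a = table[0][0]
M = len(toy)
n_hits = table[0][0] + table[1][0]
n_high = table[0][0] + table[0][1]
hypergeom_p = stats.hypergeom.sf(a - 1, M, n_hits, n_high)

print(table, odds_ratio, p_value, hypergeom_p)
```

In practice this function would be called once per pathway set inside a loop over the three toxicity subsets. Because many pathway sets are tested, the resulting p-values would likely need a multiple-testing correction before interpretation.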


In [2]:
# Import everything needed
from scipy import stats
from toxicity_ranking import *
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [3]:
# Read in the syntoxtarg_allpw.csv file
df = pd.read_csv("data_processed/syntoxtarg_allpw.csv")

Then create a heatmap visualization:
- Rows are enriched pathways
- Columns are toxicity categories
- Color intensity represents -log10(p-value)
- Size of squares could represent odds ratio
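
One possible way to draw the heatmap described above is a scatter plot with square markers, where color encodes -log10(p-value) and marker area encodes the odds ratio. This is only a sketch: the function name `plot_enrichment_heatmap`, the pathway labels, and every number in the toy tables are hypothetical placeholders, not results.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_enrichment_heatmap(p_values, odds_ratios, max_size=600):
    """Draw pathways (rows) x toxicity categories (columns) as colored squares.

    Both inputs are DataFrames with the same index and columns.
    """
    neg_log_p = -np.log10(p_values)
    n_rows, n_cols = p_values.shape
    xs, ys = np.meshgrid(np.arange(n_cols), np.arange(n_rows))

    # Scale odds ratios to marker areas; infinite odds ratios are clipped
    # to the largest finite value so they remain drawable.
    or_vals = odds_ratios.to_numpy(dtype=float)
    finite_max = np.nanmax(or_vals[np.isfinite(or_vals)])
    sizes = max_size * np.clip(or_vals, 0, finite_max) / finite_max

    fig, ax = plt.subplots(figsize=(5, 0.6 * n_rows + 1.5))
    sc = ax.scatter(
        xs.ravel(), ys.ravel(),
        c=neg_log_p.to_numpy().ravel(),
        s=sizes.ravel(),
        marker="s", cmap="viridis",
    )
    ax.set_xticks(np.arange(n_cols))
    ax.set_xticklabels(p_values.columns)
    ax.set_yticks(np.arange(n_rows))
    ax.set_yticklabels(p_values.index)
    ax.set_xlim(-0.5, n_cols - 0.5)
    ax.set_ylim(n_rows - 0.5, -0.5)  # first pathway at the top
    fig.colorbar(sc, ax=ax, label="-log10(p-value)")
    return fig, ax


# Toy inputs (made-up numbers, only to exercise the plotting code)
categories = ["Major", "Moderate", "Minor"]
pathways = ["pathway_A", "pathway_B", "pathway_C"]
toy_p = pd.DataFrame(
    [[0.001, 0.20, 0.50], [0.04, 0.03, 0.90], [0.60, 0.01, 0.002]],
    index=pathways, columns=categories,
)
toy_or = pd.DataFrame(
    [[8.0, 1.5, 1.0], [3.0, 3.5, 0.8], [0.9, 5.0, 7.0]],
    index=pathways, columns=categories,
)

fig, ax = plot_enrichment_heatmap(toy_p, toy_or)
```

Mapping odds ratio to marker area is one design choice among several. A log scale for the odds ratio may read better when a few pathway sets have very large values.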