
# Identify enriched gene sets associated with differentially spliced PTMs

As is commonly done for exon-centric analyses, we provide the ability to perform gene set enrichment analysis on the genes associated with spliced PTMs, using the EnrichR API through the gseapy module. By default, we include Gene Ontology terms, KEGG pathways, and Reactome pathways, but you can also specify any other gene set libraries available in EnrichR.


In [None]:
from ptm_pose import analyze
import pandas as pd

# Load spliced PTM and altered flank data
spliced_ptms = pd.read_csv('spliced_ptms.csv')
altered_flanks = pd.read_csv('altered_flanks.csv')

Using the function below, we can identify enriched gene sets associated with spliced PTMs, altered flanks, or both. We can also specify the alpha value for significance, the minimum change in PSI (dPSI) to consider, and whether to return only significant gene sets. Lastly (not shown here), we can provide a background dataframe that contains all spliced PTMs measured. Without one, the entire genome is used as the background, which may inflate the significance of certain PTM-centric gene sets (cell signaling pathways, for example).



In [None]:
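# Keep PTMs that pass the significance (alpha) and effect size (min_dPSI) cutoffs,
# then return only the gene sets that are significantly enriched
genesets = analyze.gene_set_enrichment(spliced_ptms, altered_flanks, alpha=0.05, min_dPSI=0.1, return_sig_only=True)
genesets.head()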
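
To see why the choice of background matters, the sketch below computes a one-sided hypergeometric p-value (equivalent to a one-sided Fisher's exact test, the kind of over-representation test EnrichR relies on) for the same overlap against two different backgrounds. This is not PTM-POSE's implementation: `hypergeom_sf` is a hypothetical helper written for illustration, and all gene counts are made-up numbers chosen only to show the effect.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(overlap >= k) when n genes are drawn from N background genes,
    K of which belong to the gene set (hypothetical helper)."""
    total = comb(N, n)
    # Sum exact integer counts first, divide once at the end
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return favorable / total

# Made-up scenario: 50 genes with differentially spliced PTMs,
# 4 of which fall in a signaling gene set
n_hits, overlap = 50, 4

# Background 1: whole genome (assumed 20,000 genes, 100 in the gene set)
p_genome = hypergeom_sf(overlap, N=20000, K=100, n=n_hits)

# Background 2: only genes with measured PTMs (assumed 2,000 genes).
# PTM-rich signaling genes are overrepresented here: 80 of the 100 remain.
p_measured = hypergeom_sf(overlap, N=2000, K=80, n=n_hits)

print(f"genome background:   p = {p_genome:.2e}")
print(f"measured background: p = {p_measured:.2e}")
```

With these illustrative counts, the same overlap looks significant against the whole genome but not against the measured background, because signaling genes were already more likely to appear among genes with quantified PTMs.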

You can then plot the enriched gene sets, including the proportion of genes associated with differentially included PTMs and the proportion associated with altered flanking sequences. Here, we restrict the plot to the top 5 enriched gene sets:



In [None]:
from ptm_pose import plots as pose_plots

pose_plots.plot_EnrichR_pies(genesets, top_terms = 5)