MAGMA GWAS Genetic Risk Coexpression Module Integration with Seyfried Pipeline Adaptation

edammer/MAGMA.SPA

MAGMA.SPA

MAGMA GWAS, TWAS, or PWAS Genetic Risk Coexpression Module Integration with Seyfried Pipeline Adaptation

Calculates a mean enrichment score for genetic risk in modules or clusters of gene products (proteins or transcripts), using any comprehensive genome-wide list of genes and their estimated significance of contribution to a trait, e.g. disease risk. The bootstrap-based permutation calculation is performed as we have published.

Requires the following R packages: WGCNA, statmod, doParallel, xlsx, ggplot2, gridBase, grid, gplots, calibrate
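Before sourcing the script, it can help to confirm that the packages listed above are installed. The following is a minimal sketch, not part of MAGMA.SPA; the helper name `missing_packages` is hypothetical.

```r
# Hypothetical helper (not part of MAGMA.SPA): report which required packages are missing.
required_pkgs <- c("WGCNA", "statmod", "doParallel", "xlsx", "ggplot2",
                   "gridBase", "grid", "gplots", "calibrate")

missing_packages <- function(pkgs) {
  # requireNamespace() returns FALSE (rather than erroring) when a package cannot be loaded.
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

missing <- missing_packages(required_pkgs)
if (length(missing) > 0) {
  message("Missing packages: ", paste(missing, collapse = ", "))
}
```

Any names reported as missing would need to be installed before calling the function.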

Sample Wrapper for MAGMA.SPA function:

#MAGMA-SPA (Seyfried Pipeline Adaptation for MAGMA)
#---------------------------------

# Required parameters, variables, and data must be set as shown below in .GlobalEnv before calling function; currently no defaults are automatic.
##################################
MAGMAinputDir= "E:/5.MAGMAinput/"

MAGMAinputs= c(	"AD_GWAS_ENSEMBLE_averageMinusLogP(Plt0.05).csv",
                "ALS_GWAS_ENSEMBLE_avgMinusLogP(Plt0.05).csv")     #These files must be in MAGMAinputDir

maxP=0.05                 #no genes with a MAGMA summarized p value greater than this will be considered even if in the MAGMA-derived input files.
FDR=0.10                  #FDR or q value (0 < FDR < 1); recommend 0.10, i.e. 10%
barcolors= c("darkslateblue","mediumorchid")  #specify one unique color for each of above MAGMAinputs
                          #common colors: "darkslateblue","mediumorchid", "seagreen3","hotpink","goldenrod","darkorange","darkmagenta", ...
relatednessOrderBar=TRUE  #Plot mean scaled enrichment bar plot in column order (relatedness) of MEs?  If FALSE, they will be plotted in size rank order M1, M2, ...

# Data created during the Seyfried Analysis Pipeline
##################################
NETcolors= net$colors     #module color assignments, vector of length equal to number of rows in cleanDat; should have all colors for modules from 1:minimumSizeRank as printed by WGCNA::labels2colors(1:nModules)
#MEs= MEs                 #Module Eigengenes (or Eigenproteins) with columns of MEs ordered in relatedness order
#cleanDat= cleanDat       #rownames must start with HUMAN gene symbols, separated by any other rowname information using ';' or '|' character

# Other variables
#################################
outFilePrefix="5"         #Filename prefix; step in the pipeline -- for file sorting by name.
outFileSuffix="AD_ALS_GWAS_ensembleAvg"
parallelThreads=8         #Each permutation analysis is run on a separate thread simultaneously, up to this many threads.
calculateMEs=TRUE         #Recalculate MEs and their relatedness order, even if the data already exists.
plotOnly=FALSE            #If plotOnly is TRUE, the variables created by MAGMA.SPA function holding plot data should already exist (xlabels, allBarData).
##################################

# Run the permutation analysis and generate all outputs
source("MAGMA.SPA.R")
MAGMAoutList <- MAGMA.SPA()
# Outputs XLSX, PDF, and list of barplot y values (allBarData), barplot labels (xlabels), and all permutation statistics and gene symbol hits (all_output)


# Rerun function, just to plot previously calculated statistics
allBarData<-MAGMAoutList$allBarData
xlabels<-MAGMAoutList$xlabels
all_output<-MAGMAoutList$all_output
plotOnly=TRUE
MAGMAoutList <- MAGMA.SPA()
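The wrapper comments require that rownames of cleanDat start with a human gene symbol followed by ';' or '|', and that NETcolors has one entry per row of cleanDat. A small sketch of how these two requirements could be checked on toy data follows; `extract_symbols` and the toy objects are hypothetical and are not taken from MAGMA.SPA.R.

```r
# Hypothetical helper: treat everything before the first ';' or '|' as the gene symbol.
extract_symbols <- function(row_names) {
  sub("[;|].*$", "", row_names)
}

# Toy stand-ins for cleanDat and NETcolors (illustration only).
toy_cleanDat <- matrix(rnorm(12), nrow = 3,
                       dimnames = list(c("APOE|P02649", "MAPT;P10636", "GFAP"), NULL))
toy_NETcolors <- c("turquoise", "blue", "turquoise")

toy_symbols <- extract_symbols(rownames(toy_cleanDat))
print(toy_symbols)  # "APOE" "MAPT" "GFAP"

# One module color per row of the abundance matrix.
stopifnot(length(toy_NETcolors) == nrow(toy_cleanDat))
```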

Additional Notes

The provided ensemble (multi-GWAS) Alzheimer's disease and ALS risk values are calculated as the mean -log(p) gene-level risk for all genes reaching nominal significance in any GWAS considered. SNP-level GWAS summary statistics were first rolled up to gene-level p values using MAGMA v1.09b command-line processing of each GWAS's SNP summary statistics.
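As an illustration of the ensemble averaging described above, here is a sketch on made-up gene-level p values. It assumes base-10 logarithms and assumes that, for a retained gene, the average is taken over every study reporting that gene; neither detail is confirmed here, and the function name `ensemble_minus_log_p` is hypothetical.

```r
# Made-up gene-level p values from two studies (NA = gene not reported by that study).
gene_p <- data.frame(
  symbol = c("APOE", "BIN1", "CLU", "GAPDH"),
  study1 = c(1e-10, 0.001, 0.20, 0.60),
  study2 = c(1e-8,  0.04,  0.01, NA),
  stringsAsFactors = FALSE
)

ensemble_minus_log_p <- function(p_table, max_p = 0.05) {
  p_mat <- as.matrix(p_table[, -1, drop = FALSE])
  rownames(p_mat) <- as.character(p_table$symbol)
  # Keep genes that reach nominal significance in at least one study.
  keep <- apply(p_mat, 1, function(p) any(p < max_p, na.rm = TRUE))
  # Assumption: average -log10(p) over all studies that report the gene.
  avg <- rowMeans(-log10(p_mat[keep, , drop = FALSE]), na.rm = TRUE)
  data.frame(symbol = rownames(p_mat)[keep], avgMinusLogP = unname(avg),
             stringsAsFactors = FALSE)
}

ensemble <- ensemble_minus_log_p(gene_p)
print(ensemble)
```

In this toy table GAPDH is dropped because it is not nominally significant in either study.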

MAGMA is only one possible source of comprehensive genome-wide gene-level p values. This bootstrap algorithm can use any such list as input to estimate the enrichment of significant risk conferred by multiple gene products in a particular systems biology coexpression module or cluster of related gene products.
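To make the idea of module-level enrichment concrete, the sketch below runs a generic resampling test on toy data: the mean score of a module's genes is compared with the means of random gene sets of the same size. This is not the published MAGMA.SPA algorithm; the function `module_enrichment`, its arguments, and the choice to score unlisted genes as 0 are all assumptions made for illustration.

```r
# Sketch only: generic resampling test for enrichment of gene-level scores in a module.
# All names here (module_enrichment, gene_scores, n_perm) are hypothetical.
module_enrichment <- function(gene_scores, module_genes, background_genes,
                              n_perm = 1000, seed = 1) {
  set.seed(seed)
  # Assumption: genes absent from the risk list contribute a score of 0.
  score_of <- function(genes) {
    s <- gene_scores[genes]
    s[is.na(s)] <- 0
    unname(s)
  }
  observed <- mean(score_of(module_genes))
  # Null distribution: mean score of random gene sets matching the module size.
  null_means <- replicate(n_perm, {
    random_genes <- sample(background_genes, length(module_genes))
    mean(score_of(random_genes))
  })
  # Add-one correction keeps the empirical p value above zero.
  p_value <- (sum(null_means >= observed) + 1) / (n_perm + 1)
  z_score <- (observed - mean(null_means)) / sd(null_means)
  list(observed = observed, p_value = p_value, z_score = z_score)
}

# Toy data: 200 background genes, 30 of which carry a risk score.
background <- paste0("G", 1:200)
scores <- setNames(c(rep(3, 10), rep(1.5, 20)), background[1:30])
module <- background[1:15]  # 10 genes scored 3 and 5 genes scored 1.5

res <- module_enrichment(scores, module, background, n_perm = 2000)
print(res$observed)  # 2.5
```

Here the random sets are drawn without replacement from the background; a different resampling scheme would change the null distribution and therefore the p value.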

GWAS used to generate the 2-column inputs, which contain human gene symbols and their associated p value or -log10(p) for disease risk, include:

Alzheimer Disease GWAS

  1. BW Kunkle et al (2019) AD GWAS (files for this GWAS contain 1834 genes with p<0.05).
  2. J-C Lambert et al (2013) AD GWAS (files for this GWAS contain 1234 genes with p<0.05). These files were originally provided as supplemental data with our publication, NT Seyfried et al, Cell Syst (2017).
  3. C Bellenguez et al (2022) AD GWAS, summary statistics available via FTP from EBI.

Amyotrophic Lateral Sclerosis (ALS) GWAS

  1. A Iacoangeli et al (2020) GWAS (files for this GWAS are labeled by last author "Al Chalibi"). Related informatics for the paper are on GitHub, and the statistics linked there were downloaded as input for MAGMA.
  2. W van Rheenen et al (2021) GWAS, summary statistics available via FTP from EBI.

Preprocessing scripts and Linux shell scripts that call MAGMA for each of the above studies are available on request.
