
Step 0a. Obtain gene level MAGMA association statistics

Huwenbo Shi edited this page Jun 11, 2026 · 4 revisions

Overview

We use MAGMA to prioritize disease-associated genes from GWAS summary statistics. MAGMA prioritizes genes that lie close to genetic variants strongly associated with the disease.

Running MAGMA typically takes 2 steps:

  1. Create an annotation file that maps SNPs to genes, using a window-based approach
  2. Calculate gene-level association statistics using the annotation file and SNP-level GWAS summary statistics data

More details on how to run MAGMA can be found on the MAGMA page here.

Input required

  • GWAS summary statistics data, containing SNP IDs and association p-values
  • An LD reference panel matching the GWAS population. This can be downloaded from the MAGMA page here.
  • A gene annotation file providing the Entrez ID, gene symbol, and start and end base-pair positions of each gene. This can be downloaded from the MAGMA page here.
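
Before running MAGMA, it can help to confirm that the summary statistics file actually contains the columns you plan to pass to it. The sketch below is a hypothetical helper (not part of MAGMA or scEPS) and assumes a whitespace-delimited file with a header line; the column names SNP, P, and N and the toy values are placeholders for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed only if every named column appears in the
# header line of a whitespace-delimited file.
has_columns() {
    local file=$1; shift
    local header col
    header=$(head -n 1 "$file")
    for col in "$@"; do
        # Split the header into one name per line and look for an exact match
        if ! printf '%s\n' "$header" | tr -s ' \t' '\n' | grep -qxF -- "$col"; then
            echo "missing column: $col" >&2
            return 1
        fi
    done
}

# Toy summary statistics file with made-up values, for illustration only
SUMSTATS=$(mktemp)
printf 'SNP\tP\tN\nrs1\t0.01\t1000\nrs2\t0.5\t1000\n' > "$SUMSTATS"

has_columns "$SUMSTATS" SNP P N && echo "header OK"
```

The same column names would then be the ones supplied through USE and NCOL in the script further down.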

Examples

Script to run MAGMA

We provide an example script to obtain gene-level MAGMA association statistics in misc/run_magma.sh (also see below). You will need to modify the relevant variables to run MAGMA on your GWAS of interest.

# Path to the MAGMA tool
MAGMA=<path to the MAGMA package>/magma


# Mapping SNP to genes using MAGMA
WINDOW=<upstream window size>,<downstream window size>            # Window sizes in kb. By default, WINDOW=10,10
SNP_LOC=<prefix to a reference panel plink file across all SNPs>
GENE_LOC=<a gene annotation file>                                 # Columns: Entrez ID, chromosome number, start base pair, stop base pair, strand, gene symbol
ANNOT_OUT=<MAGMA annotation output file name>

$MAGMA \
    --annotate window=$WINDOW \
    --snp-loc $SNP_LOC.bim \
    --gene-loc $GENE_LOC \
    --out $ANNOT_OUT


# Get gene-level association statistics
PVAL=<GWAS summary statistics data file>            # Path to the SNP-level GWAS summary statistics file
USE=<SNP ID column name>,<p-value column name>      # SNP ID and p-value column names in the GWAS summary statistics file
NCOL=<sample size column name>                      # Sample size column name in the GWAS summary statistics file
GS_OUT=<output file name>

$MAGMA \
    --bfile $SNP_LOC \
    --pval $PVAL use=$USE ncol=$NCOL \
    --gene-annot $ANNOT_OUT.genes.annot \
    --out $GS_OUT

MAGMA output

MAGMA outputs a table of gene-level association statistics (written to $GS_OUT.genes.out) with the following columns. scEPS uses the GENE and ZSTAT columns for downstream analysis.

GENE       CHR      START       STOP  NSNPS  NPARAM       N        ZSTAT            P
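
Since only GENE and ZSTAT are needed downstream, a minimal sketch for pulling out those two columns is shown here. The helper name extract_gene_zstat is hypothetical, the toy table contains made-up numbers, and the sketch assumes the whitespace-delimited layout above; it looks columns up by header name rather than by position.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the GENE and ZSTAT columns of a MAGMA gene-level
# table as tab-separated text, locating both columns by header name.
extract_gene_zstat() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("GENE" in col) || !("ZSTAT" in col)) {
                print "GENE or ZSTAT column not found" > "/dev/stderr"
                exit 1
            }
            print "GENE\tZSTAT"
            next
        }
        { print $(col["GENE"]) "\t" $(col["ZSTAT"]) }
    ' "$1"
}

# Toy gene-level table with made-up numbers, for illustration only
GENES_OUT=$(mktemp)
cat > "$GENES_OUT" <<'EOF'
GENE       CHR      START       STOP  NSNPS  NPARAM       N        ZSTAT            P
1           19       1000       2000     10       3  100000       1.2500      0.10565
10           8       5000       9000     25       5  100000      -0.5000      0.69146
EOF

extract_gene_zstat "$GENES_OUT"
```

On a real run, the argument would be the gene-level output file produced by the second MAGMA command.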

Notes

By default, MAGMA output uses Entrez IDs to represent genes. You will need to replace the Entrez IDs with gene symbols or Ensembl IDs, depending on which identifier the single-cell data uses.
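
One way to do the replacement with gene symbols is to reuse the gene annotation file from the annotation step, as sketched below. The helper name entrez_to_symbol is hypothetical, and the sketch assumes the annotation file has the Entrez ID in column 1 and the gene symbol in column 6 (the column order described for GENE_LOC above). The coordinates and statistics in the toy files are made up. Genes missing from the annotation file keep their Entrez ID, so you may want to check for leftovers afterwards.

```shell
#!/usr/bin/env bash

# Hypothetical helper: rewrite the first column (GENE) of a MAGMA gene-level
# table from Entrez IDs to gene symbols. Output is tab-separated.
entrez_to_symbol() {
    local gene_loc=$1 genes_out=$2
    awk '
        BEGIN { OFS = "\t" }
        NR == FNR { symbol[$1] = $6; next }   # first file: Entrez ID -> symbol map
        FNR == 1  { $1 = $1; print; next }    # second file: re-emit header with tabs
        { if ($1 in symbol) $1 = symbol[$1]; else $1 = $1; print }
    ' "$gene_loc" "$genes_out"
}

# Toy annotation file: Entrez ID, chromosome, start, stop, strand, symbol
# (coordinates are made up for illustration)
GENE_LOC_TOY=$(mktemp)
printf '1\t19\t1000\t2000\t-\tA1BG\n10\t8\t5000\t9000\t+\tNAT2\n' > "$GENE_LOC_TOY"

# Toy MAGMA gene-level table with made-up statistics; ID 999999 is
# deliberately absent from the annotation file
GENES_OUT_TOY=$(mktemp)
cat > "$GENES_OUT_TOY" <<'EOF'
GENE       CHR      START       STOP  NSNPS  NPARAM       N        ZSTAT            P
1           19       1000       2000     10       3  100000       1.2500      0.10565
10           8       5000       9000     25       5  100000      -0.5000      0.69146
999999       1        100        200      4       2  100000       0.0000      0.50000
EOF

entrez_to_symbol "$GENE_LOC_TOY" "$GENES_OUT_TOY"
```

Mapping to Ensembl IDs would need a separate Entrez-to-Ensembl lookup table, which the annotation file described here does not provide.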