Ensemble Antigen Prediction and Quality Analysis from DNA Variants and Proteins in R

antigen.garnish

Ensemble tumor neoantigen prediction and multi-parameter quality analysis from direct input, SNVs, indels, or gene fusion variants.

Detailed flowchart.

Description

An R package for neoantigen analysis that takes human or murine DNA missense mutations, insertions, deletions, or RNASeq-derived gene fusions and performs ensemble neoantigen prediction using 7 algorithms. Input is a VCF file, JAFFA output, or table of peptides or transcripts. Outputs are ranked and summarized by sample. Neoantigens are ranked by MHC I/II binding affinity, clonality, RNA expression, similarity to known immunogenic antigens, and dissimilarity to the normal peptidome.

Advantages

  1. Thoroughness:
    • missense mutations, insertions, deletions, and gene fusions
    • human and mouse
    • ensemble MHC class I/II binding prediction using mhcflurry, mhcnuggets, netMHC, netMHCII, netMHCpan and netMHCIIpan
    • ranked by
      • MHC I/II binding affinity
      • clonality
      • RNA expression
      • similarity to known immunogenic antigens
      • dissimilarity to the normal peptidome
  2. Speed and simplicity:
  3. Integration with R/Bioconductor
    • upstream/VCF processing
    • exploratory data analysis, visualization

Installation

Three methods exist to run antigen.garnish:

  1. Docker
  2. Linux
  3. Amazon Web Services

Docker

docker pull leeprichman/antigen_garnish

See the wiki for instructions to run the Docker container.

Linux

Requirements

  • R ≥ 3.4
  • python-pip
  • tcsh (required for netMHC)
  • sudo privileges (required for netMHC)
  • GNU Parallel (required for master branch development version only)

Installation script

The following line downloads and runs the initial installation script.

$ curl -fsSL http://get.rech.io/install_antigen.garnish.sh | sudo sh

Next, download the netMHC suite of tools for Linux, available under an academic license:

After downloading the files above, set the NET_MHC_DIR and ANTIGEN_GARNISH_DIR environment variables, then extract the archives into the antigen.garnish data directory, as shown here:

NET_MHC_DIR=/path/to/folder/containing/netMHC/downloads
ANTIGEN_GARNISH_DIR=/path/to/antigen.garnish/data/directory

cd "$NET_MHC_DIR" || return 1

mkdir -p "$ANTIGEN_GARNISH_DIR/netMHC" || return 1

find . -name "netMHC*.tar.gz" -exec tar xvzf {} -C "$ANTIGEN_GARNISH_DIR/netMHC" \;

chown "$USER" "$ANTIGEN_GARNISH_DIR/netMHC"
chmod 700 -R "$ANTIGEN_GARNISH_DIR/netMHC"
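
After running the commands above, it can be useful to confirm from R that the netMHC tools ended up where expected. The following is a minimal sketch, not part of antigen.garnish: `find_netmhc_tools` is a hypothetical helper, and it assumes that `ANTIGEN_GARNISH_DIR` is visible to the R session and that each archive unpacks into an entry whose name starts with `netMHC`.

```r
# Hypothetical helper (not part of antigen.garnish): list the netMHC tools
# found under "<ANTIGEN_GARNISH_DIR>/netMHC".
find_netmhc_tools <- function(ag_dir = Sys.getenv("ANTIGEN_GARNISH_DIR")) {
  netmhc_dir <- file.path(ag_dir, "netMHC")

  # return an empty vector if the variable is unset or the folder is missing
  if (!nzchar(ag_dir) || !dir.exists(netmhc_dir)) {
    return(character(0))
  }

  # assumption: extracted tools appear as entries named netMHC*
  list.files(netmhc_dir, pattern = "^netMHC")
}

tools <- find_netmhc_tools()

if (length(tools) == 0) {
  message("No netMHC tools found; check ANTIGEN_GARNISH_DIR and the extraction step.")
} else {
  print(tools)
}
```

An empty result usually means the environment variable was set only in the shell session used for installation and is not set in the session that launched R.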

Amazon Web Services

See the wiki for instructions to create an Amazon Web Services instance.

Development version from master

Follow the instructions under Installation script above to install dependencies, then run the following in R:

devtools::install_github("immune-health/antigen.garnish")

Package documentation

Package documentation is available as a website and as a PDF.

Workflow example

  1. Prepare input for MHC affinity prediction and quality analysis:

    • VCF input - garnish_variants
    • Fusions from RNASeq via JAFFA - garnish_jaffa
    • Prepare table of direct transcript or peptide input - see manual page in R (?garnish_affinity)
  2. Add MHC alleles of interest - see examples below.

  3. Run ensemble prediction method and perform antigen quality analysis including proteome-wide differential agretopicity, IEDB alignment score, and dissimilarity: garnish_affinity.

  4. Summarize output at the sample level with garnish_summary and garnish_plot, and prioritize the highest quality neoantigens per clone and sample with garnish_antigens.

Function examples

Predict neoantigens from missense mutations, insertions, and deletions

library(magrittr)
library(data.table)
library(antigen.garnish)

# load an example VCF
dir <- system.file(package = "antigen.garnish") %>%
  file.path(., "extdata/testdata")

dt <- "antigen.garnish_example.vcf" %>%
  file.path(dir, .) %>%

  # extract variants
  garnish_variants %>%

  # add space separated MHC types
  # see list_mhc() for nomenclature of supported alleles
  # MHC may also be set to "all_human" or "all_mouse" to use all supported alleles
  .[, MHC := c("HLA-A*01:47 HLA-A*02:01 HLA-DRB1*14:67")] %>%

  # predict neoantigens
  garnish_affinity

# summarize predictions
dt %>%
  garnish_summary %T>%
  print

# generate summary graphs
dt %>% garnish_plot
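
The MHC column in the example above holds several alleles in one space-separated string. The sketch below shows, in base R, how such a string breaks down into individual allele names. `split_mhc` is a hypothetical helper for illustration only and is not how antigen.garnish parses the column internally. The class check relies on the fact that human class II genes (DR, DQ, DP) all start with `HLA-D`; it does not apply to mouse alleles.

```r
# Hypothetical helper (not part of antigen.garnish): split a space-separated
# MHC string into unique allele names.
split_mhc <- function(mhc) {
  alleles <- strsplit(trimws(mhc), "[[:space:]]+")[[1]]
  unique(alleles)
}

mhc <- "HLA-A*01:47 HLA-A*02:01 HLA-DRB1*14:67"
alleles <- split_mhc(mhc)

# human class II genes are named HLA-D*; everything else here is class I
is_class_ii <- grepl("^HLA-D", alleles)

print(data.frame(allele = alleles, class_ii = is_class_ii))
```

Names built this way should still be checked against `list_mhc()` before they are passed to `garnish_affinity`.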

Predict neoantigens from gene fusions

library(magrittr)
library(data.table)
library(antigen.garnish)

# load example JAFFA output
dir <- system.file(package = "antigen.garnish") %>%
  file.path(., "extdata/testdata")

path <- "antigen.garnish_jaffa_results.csv" %>%
  file.path(dir, .)
fasta_path <- "antigen.garnish_jaffa_results.fasta" %>%
  file.path(dir, .)

# process gene fusions from the JAFFA output
dt <- garnish_jaffa(path, db = "GRCm38", fasta_path) %>%

  # add MHC info with list_mhc() compatible names
  .[, MHC := "H-2-Kb"] %>%

  # predict neoantigens
  garnish_affinity %>%

  # summarize predictions
  garnish_summary %T>%
  print

Get full MHC affinity output from a Microsoft Excel file of variants

library(magrittr)
library(data.table)
library(antigen.garnish)

# load example Microsoft Excel file
dir <- system.file(package = "antigen.garnish") %>%
  file.path(., "extdata/testdata")

path <- "antigen.garnish_test_input.xlsx" %>%
  file.path(dir, .)

# predict neoantigens
dt <- garnish_affinity(path = path) %T>%
  str

Directly calculate IEDB score and dissimilarity for a list of sequences

library(magrittr)
library(data.table)
library(antigen.garnish)

# generate our character vector of sequences
v <- c("SIINFEKL", "ILAKFLHWL", "GILGFVFTL")

# calculate IEDB score
v %>% iedb_score(db = "human") %>% print

# calculate dissimilarity
v %>% garnish_dissimilarity(db = "human") %>% print

Automated testing

From ./<Github repo>:

devtools::test(reporter = "summary")

How are peptides generated?

library(magrittr)
library(data.table)
library(antigen.garnish)

# generate a fake peptide
dt <- data.table::data.table(
  pep_base = "Y___*___THIS_IS_________*___A_CODE_TEST!______*__X",
  mutant_index = c(5, 25, 47, 50),
  pep_type = "test",
  var_uuid = c(
    "front_truncate",
    "middle",
    "back_truncate",
    "end")) %>%

  # create nmers
  make_nmers %T>% print
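
To make the idea behind the example above concrete, the sketch below enumerates every window of a fixed length that overlaps a mutated position, clipping at the ends of the sequence. `sliding_nmers` is a hypothetical base R helper written for illustration; it is not the implementation of `make_nmers`, which may handle edge cases differently.

```r
# Hypothetical helper (not part of antigen.garnish): return all substrings of
# length n from pep that contain the 1-based position idx.
sliding_nmers <- function(pep, idx, n) {
  len <- nchar(pep)
  stopifnot(idx >= 1, idx <= len, n >= 1)

  # no window of length n fits in a shorter sequence
  if (n > len) {
    return(character(0))
  }

  # earliest and latest start positions whose window still covers idx
  first_start <- max(1, idx - n + 1)
  last_start <- min(idx, len - n + 1)
  starts <- seq(first_start, last_start)

  substring(pep, starts, starts + n - 1)
}

# 3-mers covering position 4 (N) of SIINFEKL
print(sliding_nmers("SIINFEKL", idx = 4, n = 3))
# [1] "IIN" "INF" "NFE"

# a position at the start of the sequence yields a single, clipped window
print(sliding_nmers("SIINFEKL", idx = 1, n = 3))
# [1] "SII"
```

A mutation in the middle of a sequence yields up to n windows of length n, while one near either end yields fewer, which is the behavior the "front_truncate", "back_truncate" and "end" cases in the example are meant to exercise.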

Plots and summary tables

  • garnish_plot output:

  • garnish_antigens output:

Citation

Richman LP, Vonderheide RH, and Rech AJ. "Neoantigen dissimilarity to the self-proteome predicts immunogenicity and response to immune checkpoint blockade." Cell Systems 9, 375-382.E4, (2019).

Contributing

We welcome contributions and feedback via GitHub or email.

Acknowledgments

We thank the following individuals for contributions and helpful discussion:

License

Please see LICENSE.
