Lee Richman edited this page Dec 4, 2019 · 3 revisions

Docker instructions.

antigen.garnish can be run non-interactively in a local container from a public Docker image. A large wrapper script takes a VCF and an MHC table or JSON file as input and generates all possible output types. Customizability is limited when running non-interactively. Only direct VCF input is supported in this mode. A second Docker image is available for annotating VCFs with SnpEff.

Install Docker

Install Docker and start the Docker daemon. Set memory and CPU core limits appropriate for your machine in the Docker app, then open a terminal (on Mac) or the Linux virtual machine command line in the Docker Windows GUI.
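
Before pulling any images it can help to confirm that the Docker CLI is on the PATH and that the daemon answers. The following is a minimal sketch, not part of antigen.garnish: the function name `docker_ready` and its return codes are hypothetical, and it relies only on `command -v` and `docker info`.

```shell
# Sketch: return 0 only if the CLI exists and its daemon responds.
# docker_ready is a hypothetical helper name; the CLI defaults to "docker".
docker_ready() {
  cli="${1:-docker}"
  # 1 = the CLI is not installed or not on PATH
  command -v "$cli" >/dev/null 2>&1 || { echo "$cli not found on PATH" >&2; return 1; }
  # 2 = the CLI exists but "info" fails, which usually means the daemon is not running
  "$cli" info >/dev/null 2>&1 || { echo "$cli daemon is not responding" >&2; return 2; }
  return 0
}
```

Calling `docker_ready || exit 1` at the top of a script stops it early instead of failing later at `docker run`.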

Running antigen.garnish with Docker

Pull the antigen.garnish Docker image:

docker pull leeprichman/antigen_garnish

snpEff: Annotate your vcfs with SnpEff

# start up the container
cID=$(docker run -it -d leeprichman/snpeff /bin/bash)

# set the input file name; the file must be in the working directory (change the name as appropriate)
# annotate one single-sample vcf file at a time
VCF="myvcf.vcf"

# copy the input file onto the container
docker cp "$VCF" "$cID:/$VCF"

# run the annotation script against hg19 (substitute another genome here; see the SnpEff databases)
# GRCm38.86 and hg38 are preloaded on this docker image, snpEff will download others
docker exec "$cID" snpeff.sh hg19

# the output file name is the input name with _se inserted before .vcf
VCFO=$(echo "$VCF" | sed 's/\.vcf/_se.vcf/')
# copy the output back to the working directory
docker cp "$cID:/$VCFO" .

# clean up the container before the next sample
docker stop "$cID"
docker rm "$cID"
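
Because the steps above handle one VCF per container, they can be wrapped in a function and repeated per sample. This is a rough sketch under stated assumptions: `se_name` and `annotate_vcf` are hypothetical helper names, the input file is assumed to sit in the working directory, and the image name, `snpeff.sh` call, and `_se` naming are taken from the walkthrough above.

```shell
# Derive the annotated file name the same way the walkthrough does:
# insert "_se" before the first ".vcf".
se_name() {
  printf '%s\n' "$1" | sed 's/\.vcf/_se.vcf/'
}

# Annotate one VCF in a fresh container; the genome defaults to hg19.
annotate_vcf() {
  vcf="$1"
  genome="${2:-hg19}"
  out=$(se_name "$vcf")
  cid=$(docker run -it -d leeprichman/snpeff /bin/bash) || return 1
  # copy on, annotate, copy the result back; stop at the first failing step
  docker cp "$vcf" "$cid:/$vcf" &&
    docker exec "$cid" snpeff.sh "$genome" &&
    docker cp "$cid:/$out" .
  status=$?
  # always remove the container, even if a step above failed
  docker stop "$cid" >/dev/null
  docker rm "$cid" >/dev/null
  return "$status"
}
```

A loop such as `for f in *.vcf; do case "$f" in *_se.vcf) continue ;; esac; annotate_vcf "$f" hg19; done` would then process each input once and skip files that are already annotated.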

antigen.garnish: VCF level input

Start container, install and configure netMHC, move files on, execute wrapper script, and recover output

# start up our container
cID=$(docker run -it -d leeprichman/antigen_garnish /bin/bash)

# if at any point you want to interact with the container
# docker exec -it $cID bash

# copy netMHC tars on, see README for links, change version names as appropriate
netMHC="netMHC-4.0a.Linux.tar.gz"
netMHCII="netMHCII-2.3.Linux.tar.gz"
netMHCpan="netMHCpan-4.0a.Linux.tar.gz"
netMHCIIpan="netMHCIIpan-3.2.Linux.tar.gz"

docker cp "$netMHC" "$cID:/$netMHC"
docker cp "$netMHCII" "$cID:/$netMHCII"
docker cp "$netMHCpan" "$cID:/$netMHCpan"
docker cp "$netMHCIIpan" "$cID:/$netMHCIIpan"

# run the configuration script
docker exec "$cID" config_netMHC.sh

# set up and move our input files from the working directory (change names as appropriate here)
# one single-sample, snpEff-annotated vcf file at a time
# for tumor allelic fraction to be properly recognized, the vcf file name must match the vcf sample column
# for many vcfs, the column names are "TUMOR", "NORMAL"
# TUMOR.vcf, TUMOR_se.vcf, TUMOR.ann.vcf, and TUMOR.vcf.gz all work
#
# MHC may be a .json file, as output by xHLA, or a 2-row csv or tsv named "*mhc.txt",
# or an Excel file formatted with column names "sample_id" and "MHC" like so:
#       sample_id     MHC
#       mysample.vcf  H-2-Kb H-2-Db H-2-IAb
# The MHC alleles must be a quoted, space-separated string in the appropriate format (see ?list_MHC).
# For class I this is generally formatted like: HLA-A*02:01 HLA-B*07:02 or H-2-Kb H-2-Db.
# Errors referring to "pepv" at runtime often come from malformed HLA alleles;
# double-check your input, then try switching between Excel and .txt file input.
#
# optional RNA transcript-level count matrix to be matched against (minimum tpm = 1), named "*counts*";
# it must contain a first column of Ensembl transcript IDs and a column named "tpm".
VCFO="myvcf_se.vcf"
MHC="mhc.json"
# optional
RNA="rna_counts.txt"

# copy our files on
docker cp "$VCFO" "$cID:/$VCFO"
docker cp "$MHC" "$cID:/$MHC"
# optional
docker cp "$RNA" "$cID:/$RNA"

# run the wrapper script
docker exec "$cID" run_antigen.garnish.R

# copy the output folder back to the working directory
docker cp "$cID:/ag_docker_output/" .

# clean up the container before the next sample
docker stop "$cID"
docker rm "$cID"
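
The naming rule in the comments above (the VCF file name must match a sample column) can be checked before any files are copied on. This is a minimal sketch, assuming a tab-delimited `#CHROM` header line with sample names from column 10 onward, as in standard VCF. The helper names `vcf_samples`, `vcf_stem`, and `vcf_name_matches_sample` are hypothetical, and the suffix stripping is inferred from the example file names listed above rather than taken from antigen.garnish itself.

```shell
# Print the sample column names (columns 10 and up) from the #CHROM header line.
vcf_samples() {
  case "$1" in
    *.gz) gzip -dc "$1" ;;
    *)    cat "$1" ;;
  esac | awk -F '\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }'
}

# Reduce a file name to its sample stem by stripping .gz, .vcf, _se and .ann
# (assumption: these are the only suffixes that need to be ignored).
vcf_stem() {
  basename "$1" | sed -e 's/\.gz$//' -e 's/\.vcf$//' -e 's/_se$//' -e 's/\.ann$//'
}

# Succeed only if the stem is exactly one of the sample column names.
vcf_name_matches_sample() {
  stem=$(vcf_stem "$1")
  vcf_samples "$1" | grep -Fxq -- "$stem"
}
```

For example, `vcf_name_matches_sample "$VCFO" || echo "rename $VCFO to match a sample column" >&2` gives an early warning that tumor allelic fraction may not be recognized.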

antigen.garnish: Direct peptide or transcript level input

Start container, move files on, execute wrapper script, and recover output

# start up our container
cID=$(docker run -it -d leeprichman/antigen_garnish /bin/bash)

# if at any point you want to interact with the container
# docker exec -it $cID bash


# copy netMHC tars on, see README for links, change version names as appropriate
netMHC="netMHC-4.0a.Linux.tar.gz"
netMHCII="netMHCII-2.3.Linux.tar.gz"
netMHCpan="netMHCpan-4.0a.Linux.tar.gz"
netMHCIIpan="netMHCIIpan-3.2.Linux.tar.gz"

docker cp "$netMHC" "$cID:/$netMHC"
docker cp "$netMHCII" "$cID:/$netMHCII"
docker cp "$netMHCpan" "$cID:/$netMHCpan"
docker cp "$netMHCIIpan" "$cID:/$netMHCIIpan"

# run the configuration script
docker exec "$cID" config_netMHC.sh


# set up and move our input files from working directory (change names as appropriate here)
# One or more input files with the pattern "docker_input" in the file name
# must be a properly formatted input table as below
# dt with transcript id:
#     Column name                 Example input
#
#     sample_id                   sample_1
#     ensembl_transcript_id       ENST00000311936
#     cDNA_change                 c.718T>A
#     MHC                         HLA-A*02:01 HLA-A*03:01
#
# dt with peptide (standard amino-acid one-letter codes only):
#     sample_id                   <same as above>
#     pep_mut                     MTEYKLVVVDAGGVGKSALTIQLIQNHFV
#     mutant_index                25
#     MHC                         HLA-A*02:01 HLA-A*03:01

# name of the input table
DT="myantigens_direct_input.xlsx"

# copy our files on, repeat for all files of interest, or combine all into one large table
docker cp "$DT" "$cID:/$DT"

# run the wrapper script
docker exec "$cID" run_antigen.garnish_direct.R

# copy the output folder back to the working directory
docker cp "$cID:/ag_docker_output/" .

# clean up the container before the next sample
docker stop "$cID"
docker rm "$cID"
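
When several input tables are used, the "repeat for all files of interest" step above can be written as a loop. This is a minimal sketch, not part of the antigen.garnish image: `copy_inputs` is a hypothetical helper that copies each listed file to the container root with `docker cp`, skipping any path that does not exist.

```shell
# Copy every file given after the container ID to the root of that container.
copy_inputs() {
  cid="$1"
  shift
  for f in "$@"; do
    # warn about and skip paths that are not regular files
    [ -f "$f" ] || { echo "skipping missing file: $f" >&2; continue; }
    docker cp "$f" "$cid:/$(basename "$f")" || return 1
  done
  return 0
}
```

It would be called as `copy_inputs "$cID" first_table.xlsx second_table.xlsx` before running the wrapper script; the file names here are placeholders and still need to follow the naming pattern described above.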