VCF Requirements

Ryan Whaley edited this page May 21, 2018 · 13 revisions

General Requirements

PharmCAT expects the incoming VCF files to follow the official VCF spec.

In addition, PharmCAT expects the incoming VCF file to have the following properties:

  1. The build version must be aligned to the GRCh38 assembly (also referred to as b38 or hg38).
  2. Any position not in the input VCF is assumed to be a "no call". Missing positions will not be interpreted as reference. You must specify all positions in the input VCF that you want to be considered.
  3. The CHROM field must be in the format chr## (for example, chr10).
  4. The QUAL and FILTER columns are not interpreted. It is left to the user to remove data not meeting quality criteria before passing it to PharmCAT.
  5. The file should only have data for a single sample. If it's a multi-sample VCF file, only the first sample is used.
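
As a rough illustration of properties 3 and 5, the following Python sketch scans the lines of a VCF file and reports anything that looks out of line. It is not part of PharmCAT; the function name check_vcf_lines is hypothetical, and the pattern used for "chr##" (chr followed by 1-22, X, Y, or M) is an assumption about what that format covers.

```python
import re

# Assumption: "chr##" means "chr" followed by a chromosome name
# such as 1-22, X, Y or M. Adjust if your data uses other contigs.
CHROM_PATTERN = re.compile(r"^chr(\d{1,2}|X|Y|M)$")


def check_vcf_lines(lines):
    """Return a list of warnings for the given VCF lines (hypothetical helper)."""
    warnings = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and "##" meta-information lines.
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Columns after FORMAT (index 8) are sample columns.
            samples = fields[9:]
            if len(samples) > 1:
                warnings.append(
                    f"line {number}: {len(samples)} samples found; only the first is used"
                )
            continue
        if not CHROM_PATTERN.match(fields[0]):
            warnings.append(
                f"line {number}: CHROM '{fields[0]}' is not in chr## format"
            )
    return warnings
```

A typical use would be to pass it an open file handle, for example `check_vcf_lines(open("sample.vcf"))`, and review the returned warnings before running PharmCAT. The check does not cover the assembly version or missing positions, which cannot be verified from a single line.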

Insertions & Deletions

Deletions

PharmCAT expects deletions to be represented with an "anchoring" base at the beginning of the REF sequence, with the same anchoring base also appearing in the ALT sequence. For example, the following shows a deletion of AGAAATGGAA:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	SAMPLE
chr10	94942212	.	AAGAAATGGAA	A	.	PASS	desired-deletion-format	GT	0/1

as opposed to the unwanted format:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	SAMPLE
chr10	94942212	.	AGAAATGGAA	.	.	PASS	do-not-want	GT	0/1

If the REF is a single letter, no variant was found at that position, so it's safe to replace it with the appropriate nucleotide string.
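
The anchored deletion format can be recognized mechanically. The sketch below is an illustration only, not PharmCAT code: the function name deleted_sequence is hypothetical, and it handles only the simple case shown above, where ALT is exactly the single anchoring base.

```python
def deleted_sequence(ref, alt):
    """Return the bases removed by an anchored deletion, or None if the
    REF/ALT pair is not in the anchored form (hypothetical helper)."""
    # Anchored form (simplified): ALT is one base, REF is longer,
    # and REF starts with that same anchoring base.
    if len(alt) == 1 and alt != "." and len(ref) > 1 and ref[0] == alt:
        return ref[1:]
    return None


# The desired record: REF=AAGAAATGGAA, ALT=A
print(deleted_sequence("AAGAAATGGAA", "A"))  # AGAAATGGAA
# The unwanted record: REF=AGAAATGGAA, ALT=.
print(deleted_sequence("AGAAATGGAA", "."))   # None
```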

Insertions

Similarly, PharmCAT expects insertions to be represented with an anchoring reference base that starts both the REF and ALT sequences, e.g. REF="A" ALT="ATCT". For example, here's an insertion of A:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	SAMPLE
chr7	99652770	rs41303343	T	TA	.	PASS	desired-insertion-format	GT	0/1
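
The same idea applies to insertions. This Python sketch, again only an illustration with a hypothetical function name (inserted_sequence), extracts the inserted bases when REF is a single anchoring base and ALT begins with it.

```python
def inserted_sequence(ref, alt):
    """Return the bases added by an anchored insertion, or None if the
    REF/ALT pair is not in the anchored form (hypothetical helper)."""
    # Anchored form (simplified): REF is one base, ALT is longer,
    # and ALT starts with that same anchoring base.
    if len(ref) == 1 and ref != "." and len(alt) > 1 and alt[0] == ref:
        return alt[1:]
    return None


print(inserted_sequence("T", "TA"))    # A
print(inserted_sequence("A", "ATCT"))  # TCT
```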

More Information

PharmCAT has an example VCF file for you to reference that can be successfully run through PharmCAT.

For more details about fulfilling these requirements for PharmCAT, read the Preparing VCF Files page.