PharmCAT expects incoming VCF files to follow the official VCF spec.
In addition, PharmCAT expects incoming VCF files to have the following properties:
- Build version must be aligned to the GRCh38 assembly (aka
- Any position not in the input VCF is assumed to be a "no call". Missing positions will not be interpreted as reference. You must specify all positions in the input VCF that you want to be considered.
- Use a parsimonious, left-aligned variant representation format.
- Have insertions and deletions normalized to the expected representation.
- CHROM field must be in the format chr##.
- FILTER columns are not interpreted. It is left to the user to remove data not meeting quality criteria before passing it to PharmCAT.
- Should only have data for a single sample. If it's a multi-sample VCF file, only the first sample is used.
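Some of the properties above can be checked mechanically before running PharmCAT. The following is a minimal sketch, not part of PharmCAT itself: the function name `check_vcf_lines` is hypothetical, and the regular expression encodes one assumed reading of "chr##" (the prefix `chr` followed by a chromosome number, X, Y, M, or MT).

```python
import re

# Assumption: "chr##" means "chr" followed by a chromosome number or X, Y, M, MT.
CHROM_PATTERN = re.compile(r"^chr(\d{1,2}|X|Y|M|MT)$")


def check_vcf_lines(lines):
    """Return a list of warnings for an iterable of tab-delimited VCF lines."""
    warnings = []
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample columns start after the nine fixed columns.
            samples = fields[9:]
            if len(samples) > 1:
                warnings.append(
                    f"line {line_number}: {len(samples)} samples found; "
                    "only the first is used"
                )
            continue
        if not CHROM_PATTERN.match(fields[0]):
            warnings.append(
                f"line {line_number}: CHROM '{fields[0]}' is not in chr## format"
            )
    return warnings
```

The function only reports problems and does not modify the data, so it can be run on any open file handle or list of lines as a quick pre-flight check.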
Variant Representation Format
To avoid ambiguity in variant representation, PharmCAT uses a parsimonious, left-aligned variant representation format (as discussed in "Unified Representation of Genetic Variants" by Tan, Abecasis, and Kang).
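To illustrate what "parsimonious, left-aligned" means in practice, here is a minimal sketch of the trimming and left-shifting procedure for a single biallelic variant. It is an illustration under simplifying assumptions, not PharmCAT's own code: the function name `normalize` is hypothetical, and the reference is passed as a plain string standing in for a real FASTA lookup.

```python
def normalize(pos, ref, alt, reference):
    """Left-align and trim a biallelic variant.

    pos is 1-based; reference is the chromosome sequence as a string
    (a toy stand-in for a real reference genome lookup).
    """
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("REF and ALT are identical; nothing to normalize")

    changed = True
    while changed:
        changed = False
        # Drop a shared rightmost base.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both alleles one base to the left.
        if not ref or not alt:
            if pos <= 1:
                raise ValueError("cannot extend left of the first base")
            pos -= 1
            base = reference[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True

    # Drop shared leftmost bases while both alleles keep at least one base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# A deletion of "CA" inside a CA repeat shifts to the leftmost position.
print(normalize(6, "ACA", "A", "TGCACACAG"))  # (2, 'GCA', 'G')
```

For real data, an established tool is a safer choice than hand-rolled code; the sketch only shows why the same deletion can be written at several positions and why one canonical form is needed.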
Insertions & Deletions
PharmCAT expects deletions to be represented with an "anchoring" base at the beginning of the REF sequence, and expects that anchoring base to also appear in the ALT sequence. For example, the following record shows a deletion in the desired format:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
chr10 94942212 . AAGAAATGGAA A . PASS desired-deletion-format GT 0/1
as opposed to the unwanted format:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
chr10 94942212 . AGAAATGGAA . . PASS do-not-want GT 0/1
If the REF is a single letter it means no variant was found, so it's safe to replace it with the appropriate nucleotide string.
Similarly, PharmCAT expects insertions to include an anchoring reference base, e.g. REF="A" ALT="ATCT". For example, the following record shows an insertion in the desired format:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE chr7 99652770 rs41303343 T TA . PASS desired-insertion-format GT 0/1
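The anchoring-base convention for deletions and insertions can be checked with a few string comparisons. The sketch below is illustrative only; `describe_indel` is a hypothetical helper, not a PharmCAT function, and it looks at a single REF/ALT pair without consulting the reference genome.

```python
def describe_indel(ref, alt):
    """Classify a REF/ALT pair according to the anchoring-base convention."""
    if alt in {"", ".", "*"}:
        return "no ALT allele"
    if len(ref) == len(alt):
        return "not an indel"
    if ref[0] != alt[0]:
        return "missing anchoring base"
    if ref.startswith(alt):
        # ALT is a prefix of REF: the remaining REF bases were deleted.
        return f"deletion of {ref[len(alt):]}"
    if alt.startswith(ref):
        # REF is a prefix of ALT: the remaining ALT bases were inserted.
        return f"insertion of {alt[len(ref):]}"
    return "complex indel"


print(describe_indel("AAGAAATGGAA", "A"))  # deletion of AGAAATGGAA
print(describe_indel("T", "TA"))           # insertion of A
print(describe_indel("AGAAATGGAA", "."))   # no ALT allele
```

The three calls use the REF and ALT values from the example records above: the two desired formats are recognized as an anchored deletion and an anchored insertion, while the unwanted record is flagged because it has no ALT allele.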
Every PharmCAT release includes a pharmcat_positions.vcf file that contains all positions of interest to PharmCAT.
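Because any position absent from the input is treated as a "no call", it can be useful to list which positions of interest a sample VCF lacks. The following is a rough sketch under stated assumptions: both files are uncompressed, tab-delimited VCF text, matching is done on CHROM and POS only (alleles are not compared), and the helper names `read_positions` and `missing_positions` are hypothetical.

```python
def read_positions(lines):
    """Collect (CHROM, POS) pairs from VCF lines, skipping header lines."""
    positions = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos = line.split("\t")[:2]
        positions.add((chrom, int(pos)))
    return positions


def missing_positions(expected_lines, sample_lines):
    """Return positions present in expected_lines but absent from sample_lines."""
    missing = read_positions(expected_lines) - read_positions(sample_lines)
    return sorted(missing)
```

Both functions only iterate over lines, so they accept open file handles, for example one for pharmcat_positions.vcf and one for the sample VCF being checked. Compressed files would need to be opened with a suitable decompressing reader first.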
For more details about fulfilling these requirements for PharmCAT, read the Preparing VCF Files page.
See Preprocessing VCF Files for PharmCAT for a script to automate some of these steps.