We use SnpEff to annotate VCF files, then parse the results into regular tables with R scripts.
Download SnpEff via this link. Download the SARS-CoV-2 database with:
java -jar ~/softwares/snpEff/snpEff.jar download -c ~/softwares/snpEff/snpEff.config -v MN908947.3
Run SnpEff as:
# run snpeff annotation on vcf
java -jar ~/softwares/snpEff/snpEff.jar MN908947.3 -fastaProt ../results/example.snpeff.vcf.faa -csvStats ../results/example.snpeff.vcf.stats ../results/example.vcf > ../results/example.snpeff.vcf
# run bcftools csq to link consecutive SNPs on the same codon (BCSQ field)
bcftools csq --force --phase a -f reference.fasta -g ../data/genes.gff -Ov -o variants.snpeff.csq.vcf variants.snpeff.vcf
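To check what `bcftools csq` wrote, the BCSQ values can be pulled out of the INFO column. The following is a minimal sketch that uses only awk. The function names `list_bcsq` and `sample_csq_vcf` and the two sample records are made up for illustration. The sample assumes the BCSQ layout `consequence|gene|transcript|biotype|strand|amino_acid_change|dna_change`, with an `@POS` value pointing at the record that carries the joint consequence.

```shell
# Hypothetical helper: print POS, REF, ALT, and BCSQ for annotated records.
list_bcsq() {
    awk -F'\t' '
        /^#/ { next }                          # skip header lines
        {
            n = split($8, info, ";")           # INFO is the 8th VCF column
            for (i = 1; i <= n; i++)
                if (info[i] ~ /^BCSQ=/) {
                    bcsq = substr(info[i], 6)  # drop the "BCSQ=" prefix
                    # "@POS" means the consequence is stored at another record
                    kind = (bcsq ~ /^@/) ? "linked" : "primary"
                    printf "%s\t%s\t%s\t%s\t%s\n", $2, $4, $5, kind, bcsq
                }
        }'
}

# Invented two-record example: adjacent SNPs in one codon (not real output).
sample_csq_vcf() {
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf 'MN908947.3\t28881\t.\tG\tA\t.\tPASS\tBCSQ=missense|N|tx1|protein_coding|+|203R>203K|28881G>A+28882G>A\n'
    printf 'MN908947.3\t28882\t.\tG\tA\t.\tPASS\tBCSQ=@28881\n'
}

sample_csq_vcf | list_bcsq
```

On a real file, the same function could be fed with `list_bcsq < variants.snpeff.csq.vcf`.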
Run the R script:
Rscript ./Parse_SnpEff.r ../results/example.snpeff.vcf ../results/example.snpeff.csv
Please note that the summary CSV file contains only the closest match for each SNP rather than all matches.
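To illustrate what keeping a single match per SNP looks like, here is a minimal shell sketch. It is not the logic of `Parse_SnpEff.r`. It assumes that the closest match corresponds to the first entry in SnpEff's `ANN` field (SnpEff lists the highest-impact effect first) and that the entry follows the standard pipe-separated `ANN` layout. The function names `ann_first_to_csv` and `sample_vcf`, the CSV column names, and the sample record are hypothetical.

```shell
# Hypothetical helper: one CSV row per variant, built from the first ANN entry.
ann_first_to_csv() {
    awk -F'\t' '
        BEGIN { print "pos,ref,alt,effect,impact,gene,hgvs_c,hgvs_p" }
        /^#/ { next }                        # skip VCF header lines
        {
            ann = ""
            n = split($8, info, ";")         # INFO is the 8th VCF column
            for (i = 1; i <= n; i++)
                if (info[i] ~ /^ANN=/) ann = substr(info[i], 5)
            if (ann == "") next              # variant without annotation
            split(ann, entries, ",")         # one entry per effect
            split(entries[1], f, "[|]")      # keep only the first entry
            printf "%s,%s,%s,%s,%s,%s,%s,%s\n", $2, $4, $5, f[2], f[3], f[4], f[10], f[11]
        }'
}

# Invented record with two ANN entries (illustrative only, not real output).
sample_vcf() {
    printf '##fileformat=VCFv4.2\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf 'MN908947.3\t23403\t.\tA\tG\t.\tPASS\tANN=G|missense_variant|MODERATE|S|S|transcript|tx1|protein_coding|1/1|c.1841A>G|p.Asp614Gly|1841/3822|1841/3822|614/1273||,G|upstream_gene_variant|MODIFIER|ORF3a|ORF3a|transcript|tx2|protein_coding||c.-1990A>G|||||1990|\n'
}

sample_vcf | ann_first_to_csv
```

The second `ANN` entry (the upstream variant) is dropped, so the output has one row for the variant. To keep every match instead, the loop would have to emit a row for each element of `entries`.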