This script generates a distance matrix among individuals from a vcf file to generate a phylogenetic tree.

kiwoong-nam/VCFPhylo

VCFPhylo

This script generates Euclidean genetic distance matrices between pairs of individuals in a vcf file, which can then be used to build a phylogenetic tree. Transversions are given a weight of two.
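To make the distance definition concrete, here is a minimal Python sketch of a transversion-weighted Euclidean distance between two individuals. It is not taken from the Perl scripts. The genotype coding (alternate-allele counts 0, 1, 2), the skipping of missing genotypes, and the choice to apply the weight of two to the squared difference are all assumptions made for illustration, and the function names are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transversion(ref, alt):
    """True when a SNP swaps a purine for a pyrimidine or vice versa."""
    return (ref in PURINES and alt in PYRIMIDINES) or (
        ref in PYRIMIDINES and alt in PURINES
    )


def weighted_euclidean(geno_a, geno_b, sites):
    """Euclidean distance between two individuals over the sites both have called.

    geno_a, geno_b: alternate-allele counts (0, 1 or 2) per site, None if missing.
    sites: (ref, alt) base pairs, in the same order as the genotypes.
    """
    total = 0.0
    for a, b, (ref, alt) in zip(geno_a, geno_b, sites):
        if a is None or b is None:
            continue  # assumption: sites with a missing genotype are skipped
        # Assumption: the weight multiplies the squared genotype difference.
        weight = 2.0 if is_transversion(ref, alt) else 1.0
        total += weight * (a - b) ** 2
    return math.sqrt(total)


# Toy data: one transition, one transversion, one site with a missing call.
sites = [("A", "G"), ("A", "C"), ("C", "T")]
ind1 = [0, 2, 1]
ind2 = [2, 0, None]
print(weighted_euclidean(ind1, ind2, sites))
```

In this toy example the transition contributes 4 and the transversion 8 to the sum of squares, so the distance is the square root of 12.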

INSTALLATION: Not required

INPUT FILE: gz-compressed vcf file

HOW TO USE:

<1> Converting the gz-compressed vcf file to a genotype file

$ perl reducedvcf.pl VCF_FILE GENOTYPE_FILE , where VCF_FILE is the name of your gz-compressed vcf file and GENOTYPE_FILE is the name of the output genotype file. When the script finishes, GENOTYPE_FILE.gz will have been created.
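The format of the genotype file is not described here, so the following Python sketch only illustrates the general idea of reducing a gz-compressed vcf file to per-individual genotypes. It assumes standard VCF columns (samples start at the tenth column, GT is the first sample subfield) and keeps only biallelic SNPs. The function names and the output layout are hypothetical and may differ from what reducedvcf.pl writes.

```python
import gzip

BASES = set("ACGT")


def alt_allele_count(sample_field):
    """Convert a VCF sample field such as '0/1:35' to an alternate-allele count."""
    gt = sample_field.split(":")[0].replace("|", "/")
    alleles = gt.split("/")
    if "." in alleles:
        return None  # missing genotype
    return sum(1 for allele in alleles if allele != "0")


def read_genotypes(vcf_gz_path):
    """Yield (chrom, pos, ref, alt, counts) for biallelic SNPs in a gzipped VCF."""
    with gzip.open(vcf_gz_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and the header line
            fields = line.rstrip("\n").split("\t")
            chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
            if ref not in BASES or alt not in BASES:
                continue  # assumption: indels and multiallelic sites are dropped
            yield chrom, pos, ref, alt, [alt_allele_count(s) for s in fields[9:]]
```

Calling `list(read_genotypes("my_data.vcf.gz"))` (a hypothetical file name) would give one tuple per retained SNP, with one genotype count per individual in the order of the vcf header.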

<2> Creating distance matrices from the gz-compressed genotype file

$ perl VCF_FILE GENOTYPE_FILE OUTPUT_PREFIX NUMBER_OF_BOOTSTRAPPING_REPLICATES , where VCF_FILE is the name of the gz-compressed vcf file, GENOTYPE_FILE is the gz-compressed genotype file created in step <1>, OUTPUT_PREFIX is the prefix for the output files, and NUMBER_OF_BOOTSTRAPPING_REPLICATES is the number of bootstrap replicates.

When the script finishes, the following two files will have been created:

(a) OUTPUT_PREFIX.bg.tbl => Distance matrix showing the Euclidean distance between each pair of individuals

(b) OUTPUT_PREFIX.boot.tbl => Bootstrap distance matrices generated by resampling
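The resampling scheme is not spelled out here. As a rough illustration, the Python sketch below builds bootstrap distance matrices by drawing sites with replacement, which is the usual per-site bootstrap. Whether the script resamples single sites or larger blocks is an assumption on my part, and transversion weighting and missing data are left out to keep the example short. All function names are hypothetical.

```python
import math
import random


def euclidean_matrix(genotypes, site_indices):
    """Pairwise Euclidean distances using only the given site indices.

    genotypes: one list of alternate-allele counts per individual.
    site_indices: a list or range of site positions (may contain repeats).
    """
    n = len(genotypes)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = sum(
                (genotypes[i][s] - genotypes[j][s]) ** 2 for s in site_indices
            )
            matrix[i][j] = matrix[j][i] = math.sqrt(total)
    return matrix


def bootstrap_matrices(genotypes, replicates, seed=0):
    """Return one distance matrix per replicate, resampling sites with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_sites = len(genotypes[0])
    result = []
    for _ in range(replicates):
        sample = [rng.randrange(n_sites) for _ in range(n_sites)]
        result.append(euclidean_matrix(genotypes, sample))
    return result


# Toy data: three individuals, four sites.
genotypes = [[0, 1, 2, 0], [2, 1, 0, 0], [0, 0, 0, 0]]
observed = euclidean_matrix(genotypes, range(4))
replicate_matrices = bootstrap_matrices(genotypes, replicates=5)
```

Here `observed` plays the role of the single distance matrix and `replicate_matrices` the role of the resampled matrices. Each replicate then yields its own tree, and the set of trees is summarized in step <4>.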

<3> Generating phylogenetic tree

You can use external software to generate a phylogenetic tree. For example, you can use FastME (http://www.atgc-montpellier.fr/fastme/).
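Distance-based tree programs commonly read matrices in PHYLIP square format: a first line with the number of individuals, then one line per individual with its name followed by its distances. The layout of the .tbl files is not documented here, so it is unclear whether any conversion is needed. If it is, a minimal Python sketch could look like the following, where `to_phylip` is a hypothetical helper.

```python
def to_phylip(names, matrix):
    """Format a square distance matrix as PHYLIP text.

    Names are padded to 10 characters. Strict PHYLIP readers expect names of
    at most 10 characters, so longer names may need shortening first.
    """
    lines = [str(len(names))]
    for name, row in zip(names, matrix):
        values = " ".join(f"{distance:.6f}" for distance in row)
        lines.append(f"{name.ljust(10)} {values}")
    return "\n".join(lines) + "\n"


names = ["ind1", "ind2", "ind3"]
matrix = [
    [0.0, 2.5, 3.0],
    [2.5, 0.0, 1.5],
    [3.0, 1.5, 0.0],
]
print(to_phylip(names, matrix), end="")
```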

<4> Generating a bootstrap consensus tree

You can use consense from the PHYLIP package for this (http://evolution.genetics.washington.edu/phylip/).

Citation

Please cite this paper if you use these scripts: https://link.springer.com/article/10.1186/s12862-020-01715-3
