Tool to convert vcf file in DIYABC input file for SNP data
- python 2.x
- A version compatible with python 3.x is avaible on branch
- Windows users: This script is unable to run if python is not installed on your machine. I strongly recommand you do the necessary step to achieve this by following this link, since it's probably gonna be useful for many other bioinformatic purposes.
TODO: I'm currently trying to produce a self contained executable to run the script without the need for a python installation with py2exe.
Unix users can clone this repository if git is installed on their system with:
git clone https://github.com/loire/vcf2DIYABC.snp.git
A link to a zip archive of this repository is on the bottom right of this page.
To install, just unzip the archive in a local directory.
- Unix and mac user can use the terminal and type
- On windows, navigate in the file manager to find where you download and uncompressed the repository and then right click on the script file to execute it.
Once launched, the program will ask you to enter the path and name of your input files.
- a vcf file (tested on the 4.0 version of this file format)
- A text file which specify individuals sex and population of origin. This information is not present in a typical vcf file while being needed for DIYABC analysis. This file contains a line for each individual. First column is the individual ID (same as the name in your vcf file), second is the sex ("M" or "F" in uppercase, "9" if data is missing), and third is the population name.
NA00001 M pop1 NA00002 9 pop2 NA00003 F pop2
You can find an example of each file in example directory of this repo. A perfect match is required between names of individuals in the vcf file and those in the individuals informations file.
This script was written by Etienne Loire in the CBGP lab of the french institute for agronomical research. it's free to use or modify under the GPL license.