## ABBA-BABA
The ABBA-BABA test takes a user-supplied phylogeny of four taxa and searches for variant sites that conform to either an ABBA or a BABA inheritance pattern, where "A" represents the ancestral and "B" the derived allele state. Under the null model of incomplete lineage sorting with no gene flow between populations, we expect ABBA and BABA sites to occur with equal frequency. An excess of either pattern can indicate gene flow between the two taxa that share the excess of derived alleles.
These per-site patterns can then be combined using window-based approaches to calculate Patterson's D statistic, which measures an excess of shared derived alleles in either the ABBA (positive D values) or BABA (negative D values) tree topology. Under the null assumption, Patterson's D should be zero.
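To make the statistic concrete, here is a minimal Python sketch of Patterson's D in its standard form, D = (nABBA - nBABA) / (nABBA + nBABA), computed per window. The function names `patterson_d` and `windowed_d` are hypothetical, and the use of non-overlapping windows is an assumption made for illustration; the `smoother` tool may define its windows differently.

```python
from collections import defaultdict


def patterson_d(n_abba, n_baba):
    """Return (ABBA - BABA) / (ABBA + BABA), or None if there are no informative sites."""
    total = n_abba + n_baba
    if total == 0:
        return None
    return (n_abba - n_baba) / total


def windowed_d(sites, window_size):
    """Compute D in non-overlapping windows (an assumption of this sketch).

    sites: iterable of (position, abba, baba), with abba and baba given as 0 or 1.
    Returns a dict mapping each window start coordinate to its D value.
    """
    counts = defaultdict(lambda: [0, 0])
    for position, abba, baba in sites:
        start = (position // window_size) * window_size
        counts[start][0] += abba
        counts[start][1] += baba
    return {start: patterson_d(a, b) for start, (a, b) in sorted(counts.items())}
```

A window with three ABBA sites and one BABA site gives D = 0.5, while a window with equal counts gives D = 0, matching the null expectation.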
INFO: help
INFO: description:
abba-baba calculates the tree pattern for four indviduals.
This tool assumes reference is ancestral and ignores non abba-baba sites.
The output is a boolian value: 1 = true , 0 = false for abba and baba.
the tree argument should be specified from the most basal taxa to the most derived.
Example:
D   C  B   A
 \ /  /   /
  \  /   /
   \    /
    \  /
      /
     /
--tree A,B,C,D
Output : 4 columns :
1. seqid
2. position
3. abba
4. baba
INFO: usage: abba-baba-zabba --tree 0,1,2,3 --file my.vcf --type PL
INFO: required: t,tree -- a zero based comma seperated list of target individuals corrisponding to VCF columns
INFO: required: f,file -- a properly formatted VCF.
INFO: required: y,type -- genotype likelihood format ; genotypes: GP,GL or PL;
INFO: version 1.0.0 ; date: April 2014 ; author: Zev Kronenberg & EJ Osborne; email : zev.kronenberg@utah.edu
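The four output columns described above (seqid, position, abba, baba) are straightforward to load for downstream analysis. The sketch below assumes the columns are whitespace-delimited and that the file has no header line; neither detail is stated in the help text. `Site` and `parse_abba_baba` are hypothetical names introduced for this example.

```python
from typing import Iterable, Iterator, NamedTuple


class Site(NamedTuple):
    seqid: str
    position: int
    abba: int  # 1 if the site shows the ABBA pattern, else 0
    baba: int  # 1 if the site shows the BABA pattern, else 0


def parse_abba_baba(lines: Iterable[str]) -> Iterator[Site]:
    """Yield one Site per four-column line; skip blank or malformed lines."""
    for line in lines:
        fields = line.split()  # assumption: whitespace-delimited columns
        if len(fields) != 4:
            continue
        seqid, position, abba, baba = fields
        yield Site(seqid, int(position), int(abba), int(baba))
```

For example, `list(parse_abba_baba(open("scaffold612.abba-baba.txt")))` would read the file produced by the commands later in this section into memory.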
We start by finding positions in the VCF file that show ABBA or BABA inheritance patterns under the user-specified phylogeny (the --tree argument). Next, the smoother tool calculates Patterson's D statistic in windows of a specified size across the region. Finally, the results are visualized with the plotSmoothed.R script.
cd samples/
../bin/abba-baba --tree 0,1,2,3 --file scaffold612.vcf --type PL > scaffold612.abba-baba.txt
../bin/smoother --format abba-baba --file scaffold612.abba-baba.txt -w 10000 > scaffold612.d-stat.10kb.txt
R --vanilla < ../bin/plotSmoothed.R --args scaffold612.d-stat.10kb.txt abba-baba
The resulting output: