Citation: Bensasson, D. “Evidence for a High Mutation Rate at Rapidly Evolving Yeast Centromeres.” BMC Evol Biol 11 (2011): 211. doi:10.1186/1471-2148-11-211.
A simple perl script to annotate Saccharomyces centromeres: This script uses a regular expression to recognise the consensus sequence motifs for CDEI and CDEIII described in Baker and Rogers 2005, Genetics 171(4):1463-1475.
Citation: Bensasson, D. “Evidence for a High Mutation Rate at Rapidly Evolving Yeast Centromeres.” BMC Evol Biol 11 (2011): 211. doi:10.1186/1471-2148-11-211.
A perl script implementing the four-gamete test for recombination described in Hudson and Kaplan (1985) Genetics 111(1):147-164. This test will not tell you whether or not there is recombination in your data. The script only outputs a summary of the alleles at each site, an alignment of only the segregating sites for easier visualisation of pairs of sites, and a list of the pairs of sites detected by the four-gamete test. Where there are only 2 alleles at each site, the detected pairs are those where all 4 possible combinations of the alleles occur at the two sites. This implies either homoplasy (convergent mutation) or a recombination event somewhere between the 2 sites.
For example, the combination below can only be explained by homoplasy or recombination, and fourgamete.pl would report positions 1 and 37 as a pair of sites detected by the four-gamete test:
position1 position37
taxon1 T G
taxon2 T A
taxon3 G A
taxon4 G G
For DNA sequence data, homoplasy is a likely explanation for many of these sites. This script does not summarise the data or tell you how many recombination events there are or where.
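The pairwise check behind the four-gamete test can be sketched in a few lines of Python. This is only an illustration of the idea, not the code in fourgamete.pl: the function name `four_gamete_pairs` is hypothetical, and the choice to ignore any site containing gaps or ambiguity codes is an assumption rather than documented behaviour of the script.

```python
from itertools import combinations


def four_gamete_pairs(alignment):
    """Return pairs of 1-based positions at which all four gametes occur.

    `alignment` maps taxon name -> aligned sequence (all the same length).
    Assumption: only sites with exactly two alleles and no gaps or
    ambiguity codes are compared.
    """
    seqs = [s.upper() for s in alignment.values()]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # Collect biallelic sites made up only of A, C, G or T.
    biallelic = []
    for i in range(length):
        column = {s[i] for s in seqs}
        if column <= set("ACGT") and len(column) == 2:
            biallelic.append(i)

    # A pair of sites is reported when all 4 allele combinations are seen.
    pairs = []
    for i, j in combinations(biallelic, 2):
        gametes = {(s[i], s[j]) for s in seqs}
        if len(gametes) == 4:
            pairs.append((i + 1, j + 1))
    return pairs


# The example above, padded with invariant sites between positions 1 and 37.
example = {
    "taxon1": "T" + "C" * 35 + "G",
    "taxon2": "T" + "C" * 35 + "A",
    "taxon3": "G" + "C" * 35 + "A",
    "taxon4": "G" + "C" * 35 + "G",
}
print(four_gamete_pairs(example))  # [(1, 37)]
```

As with the script itself, a reported pair only shows that the four combinations exist; it does not distinguish homoplasy from recombination.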
Citation: Bensasson, D., Dicks, J., Ludwig, J.M., Bond, C.J., Elliston, A., Roberts, I.N., James, S.A., 2018. Diverse lineages of Candida albicans live on old oaks. bioRxiv 341032. https://doi.org/10.1101/341032
A perl script that uses a vcf file generated by samtools to draw B-allele frequency plots in R.
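As a rough illustration of the quantity being plotted, the sketch below computes a per-site frequency of reads supporting the non-reference allele from VCF records. It assumes that each record carries a `DP4` tag in the INFO column (counts of reference-forward, reference-reverse, alternate-forward and alternate-reverse reads, as written by samtools/bcftools mpileup); whether the perl script uses `DP4` or another field is not stated here, and the function name and example records are invented for illustration.

```python
def b_allele_frequencies(vcf_lines):
    """Yield (chromosome, position, frequency) for variant records with DP4.

    Assumption: frequency = reads supporting the alternate allele divided
    by all reads counted in DP4.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8 or fields[4] == ".":
            continue  # malformed record, or no alternate allele called
        info = dict(
            item.split("=", 1) for item in fields[7].split(";") if "=" in item
        )
        if "DP4" not in info:
            continue
        ref_fwd, ref_rev, alt_fwd, alt_rev = (
            int(x) for x in info["DP4"].split(",")
        )
        depth = ref_fwd + ref_rev + alt_fwd + alt_rev
        if depth == 0:
            continue
        yield fields[0], int(fields[1]), (alt_fwd + alt_rev) / depth


# Invented records for illustration only.
records = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\t.\tDP=20;DP4=5,5,6,4",
    "chr1\t200\t.\tC\t.\t50\t.\tDP=20;DP4=10,10,0,0",
]
for chrom, pos, freq in b_allele_frequencies(records):
    print(chrom, pos, freq)
```

In a heterozygous diploid region these frequencies would be expected to cluster near 0.5, which is the kind of pattern such plots are used to show.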
Citation: Tilakaratna, V., Bensasson, D., 2017. Habitat Predicts Levels of Genetic Admixture in Saccharomyces cerevisiae. G3 (Bethesda) 7, 2919–2929. https://doi.org/10.1534/g3.117.041806
Converts DNA sequence alignments in fasta format to STRUCTURE input files that summarise bases at variable sites.
Runs STRUCTURE one time for each value of K in a specified range (e.g. from 1 to 10).
Plots STRUCTURE results as barplots using R with user control of colors.
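The fasta-to-STRUCTURE conversion step described above amounts to finding the variable sites in an alignment and writing one row of integer-coded alleles per sequence. The sketch below shows that idea under stated assumptions: the coding A=1, C=2, G=3, T=4, the missing-data value -9, and the one-row-per-haploid-sequence layout are illustrative choices, not necessarily what the script in this repository writes.

```python
BASE_CODES = {"A": 1, "C": 2, "G": 3, "T": 4}  # assumed allele coding
MISSING = -9  # assumed missing-data value


def structure_rows(alignment):
    """Return (variable site indices, rows) for a dict of aligned sequences.

    A site counts as variable when at least two different bases from
    A, C, G, T are present; gaps and ambiguity codes are written as missing.
    """
    names = list(alignment)
    seqs = [alignment[name].upper() for name in names]
    sites = [
        i
        for i in range(len(seqs[0]))
        if len({s[i] for s in seqs} & set(BASE_CODES)) > 1
    ]
    rows = []
    for name, seq in zip(names, seqs):
        codes = [str(BASE_CODES.get(seq[i], MISSING)) for i in sites]
        rows.append(" ".join([name] + codes))
    return sites, rows


alignment = {"s1": "ACG-", "s2": "ACTA", "s3": "ATGC"}
sites, rows = structure_rows(alignment)
print(sites)  # 0-based indices of variable columns
for row in rows:
    print(row)
```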
This directory contains scripts used to manipulate DNA sequence in fasta format:
Citation: Tilakaratna, V., Bensasson, D., 2017. Habitat Predicts Levels of Genetic Admixture in Saccharomyces cerevisiae. G3 (Bethesda) 7, 2919–2929. https://doi.org/10.1534/g3.117.041806
A perl script that concatenates multiple alignment files in fasta format into a single large alignment file.
This perl script converts sequence alignments in fasta format into phylip format.
A perl script to choose a subset of sequences from a fasta file. Currently, the script matches the names provided by the user case-insensitively, and a name can match anywhere in the first word of the fasta descriptor line.
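The matching rule for choosing a subset can be illustrated as follows. This is a minimal sketch, assuming plain substring matching (the perl script may treat names as regular expressions); the function names and the strain names in the example are hypothetical.

```python
def read_fasta(lines):
    """Parse fasta-formatted lines into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def select_records(records, names):
    """Keep records whose first header word contains any name, ignoring case."""
    wanted = [name.lower() for name in names]
    selected = []
    for header, seq in records:
        words = header.split()
        first_word = words[0].lower() if words else ""
        if any(name in first_word for name in wanted):
            selected.append((header, seq))
    return selected


fasta = [
    ">YPS128_chr1 oak isolate",
    "ACGT",
    ">s288c_chr1 lab strain",
    "ACGA",
    ">DBVPG1106 yps128-like",
    "ACGG",
]
chosen = select_records(read_fasta(fasta), ["yps128"])
print([header for header, _ in chosen])
```

Only the first record is kept here: the third mentions the name, but not in the first word of its descriptor line.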
A perl script to summarise the length of DNA sequences in a fasta file. The -g option is useful for showing the ungapped length of DNA sequences in an alignment.
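The difference between aligned and ungapped length can be shown with a short sketch. It assumes that gaps are written as `-`; the function name and its `ungapped` parameter are hypothetical stand-ins for the behaviour that the -g option is described as providing.

```python
def sequence_lengths(records, ungapped=False):
    """Return {name: length} for (name, sequence) pairs.

    With ungapped=True, gap characters ("-") are not counted.
    """
    lengths = {}
    for name, seq in records:
        if ungapped:
            seq = seq.replace("-", "")
        lengths[name] = len(seq)
    return lengths


records = [("seq1", "ACG--T"), ("seq2", "ACGTTT")]
print(sequence_lengths(records))                 # aligned lengths
print(sequence_lengths(records, ungapped=True))  # lengths without gaps
```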
Citation: Bensasson, D., Dicks, J., Ludwig, J.M., Bond, C.J., Elliston, A., Roberts, I.N., James, S.A., 2018. Diverse lineages of Candida albicans live on old oaks. bioRxiv 341032. https://doi.org/10.1101/341032
A perl script that uses alignment(s) in fasta format to identify the most similar sequences to a study strain in sliding windows. It will optionally produce plots of chromosomes/alignments in R that are colored according to similarity to a panel of reference clades. See example data (in faChrompaintData/) and use in https://doi.org/10.1101/341032.
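The sliding-window comparison can be sketched as below. This is an illustration under assumptions, not the script's implementation: similarity is measured here as the proportion of differing sites among positions where both sequences have A, C, G or T, ties go to the first reference listed, and the function names, clade names and sequences are invented for the example.

```python
def proportion_different(a, b):
    """Proportion of differing sites where both sequences have A, C, G or T.

    Returns None when no positions can be compared.
    """
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    return diffs / compared if compared else None


def paint_windows(study, references, window, step=None):
    """Return (start, end, closest reference, distance) for each window.

    Coordinates are 1-based and inclusive; windows do not overlap unless
    `step` is smaller than `window`.
    """
    step = step or window
    results = []
    for start in range(0, len(study) - window + 1, step):
        end = start + window
        best_name, best_dist = None, None
        for name, seq in references.items():
            dist = proportion_different(study[start:end], seq[start:end])
            if dist is not None and (best_dist is None or dist < best_dist):
                best_name, best_dist = name, dist
        results.append((start + 1, end, best_name, best_dist))
    return results


study = "AAAAACCCCC"
references = {"cladeA": "AAAAAGGGGG", "cladeB": "TTTTTCCCCC"}
for row in paint_windows(study, references, window=5):
    print(row)
```

Each window's closest reference is what would determine the color of that stretch of the chromosome in a plot.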