badMIXTUREexample: checking a mixture decomposition from PLINK data

This example runs through the process of:

  0. Getting the data
  1. Converting plink data to mixPainter (chromopainter) format
  2. Running ADMIXTURE
  3. Running mixPainter in a way that allows comparisons to the ADMIXTURE run
  4. Getting the data out of mixPainter and into the badMIXTURE R package.

Step 0: Getting the data

From a Linux or Mac terminal, copy the example data from our repository:

## Get the data
## For linux:
alias mywget=wget
## For mac OSX:
if [ "$(uname)" = "Darwin" ]; then
   alias mywget="curl -O"
fi
files="Recent_admix.bim Marginalisation_admix.bim Remnants_admix.bim Remnants_admix.fam Recent_admix.fam Marginalisation_admix.fam Marginalisation_admix.bed Recent_admix.bed Remnants_admix.bed"
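for file in $files; do
   mywget "$dataurl/$file"   # dataurl is a placeholder: set it to the repository's data location first
done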

## Get the scripts to process the data
git clone
## Note that this stage uses git, but git is not strictly required: you can download the scripts manually if you prefer. Put them in a folder called "badMIXTUREexample" under your working directory so that all the paths work correctly.
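
Before moving on, it can help to confirm that every dataset arrived as a complete PLINK trio (.bed, .bim and .fam). The following is a minimal sketch, not part of the original scripts; the function name `check_plink_trios` is hypothetical, and it assumes the files sit in the directory you pass as the first argument.

```shell
# Hypothetical helper (not part of the badMIXTURE scripts).
# Reports every missing or empty .bed/.bim/.fam file for the given prefixes.
# Returns 0 when all trios are complete, otherwise the number of problems.
# Usage: check_plink_trios DIR PREFIX...
check_plink_trios() {
    dir=$1
    shift
    missing=0
    for prefix in "$@"; do
        for ext in bed bim fam; do
            # -s is true only if the file exists and is not empty
            if [ ! -s "$dir/$prefix.$ext" ]; then
                echo "missing or empty: $dir/$prefix.$ext"
                missing=$((missing + 1))
            fi
        done
    done
    return "$missing"
}

# Example call for the three example datasets in the current directory:
#   check_plink_trios . Recent_admix Marginalisation_admix Remnants_admix
```

A silent run with exit status 0 means all nine files are present and non-empty.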

Step 1a: Get all the tools you need


Follow the instructions at Specifically:


External tools:

## These are the tools we need. You can either put them all in your path, or update the variables with their full path
plink="plink1.9" # Available from
plink2chromopainter="" # included in the finestructure download:
convertrecfile="" # included in the finestructure download
makeuniformrecfile="" # included in the finestructure download
admixture="admixture" # available from
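
Since the later steps fail in confusing ways if a tool cannot be found, a quick check of the variables above may save time. This is a small sketch under the assumption that each variable holds either a command name on your path or a full path to an executable; `check_tools` is a hypothetical name, not something shipped with the example.

```shell
# Hypothetical helper: print each tool that the shell cannot locate.
# Returns 0 if every tool was found and 1 otherwise.
check_tools() {
    status=0
    for tool in "$@"; do
        # command -v succeeds for commands on the PATH and for executable paths
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found: $tool"
            status=1
        fi
    done
    return "$status"
}

# Example call, using the variables defined above:
#   check_tools "$plink" "$admixture"
```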

Step 1b: Creating the commands to do the conversion

./ # This generates calls to "" that will process the three datasets.
## AT THIS POINT YOU SHOULD READ AND CHECK!! Understanding that is the point of this exercise!
## In particular, if you just run the scripts, they will use 8 cores of your machine for several days. You might not want this!

Steps 2-4: Running everything

The included script "" explains how we did everything for the paper. The process is:

  1. Run plink to convert the file to ped/map
     1a. (Optionally, prune the data down to fewer SNPs)
  2. Run to convert to chromopainter format
  3. (Optionally, create a recombination map with either or
  4. Run mixPainter

Each of these steps is a single command.
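
As an illustration of steps 1 and 1a only, the sketch below prints (rather than runs) plink 1.9 commands for one dataset, so you can inspect them first. It is not the included script: the function name `print_plink_cmds`, the `_pruned` output suffix and the pruning thresholds (window of 50 SNPs, step of 10, r^2 of 0.1) are assumptions chosen for illustration, and the settings used for the paper may differ.

```shell
# Assumption: plink 1.9 is available under the name held in $plink.
plink="${plink:-plink1.9}"

# Hypothetical helper: print the plink commands for steps 1 and 1a.
# Usage: print_plink_cmds NAME [prune]
print_plink_cmds() {
    name=$1
    prune=${2:-no}
    input=$name
    if [ "$prune" = "prune" ]; then
        # Step 1a: LD-prune, then keep only the SNPs listed in NAME.prune.in
        echo "$plink --bfile $name --indep-pairwise 50 10 0.1 --out $name"
        echo "$plink --bfile $name --extract $name.prune.in --make-bed --out ${name}_pruned"
        input=${name}_pruned
    fi
    # Step 1: convert the binary fileset to ped/map
    echo "$plink --bfile $input --recode --out $input"
}

# Example calls:
#   print_plink_cmds Recent_admix
#   print_plink_cmds Recent_admix prune
```

Printing the commands first follows the same idea as the generated script: read and check what will run before committing cores to it.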

## Additional information

ADMIXTURE takes 147m (about 2.5 hours) using 8 cores to process the complete dataset. mixPainter takes 1962m (about 33 hours) using 8 cores. Both scale linearly with the number of SNPs; in the number of individuals, mixPainter is quadratic.
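
To get a rough feel for what a reduced run might cost, the scaling above can be turned into a back-of-envelope estimate. The sketch below assumes pure linear scaling in SNPs and pure quadratic scaling in individuals for mixPainter, with no fixed overhead, which is an idealisation; `mixpainter_minutes` is a hypothetical helper name, and the result is only as good as the 1962m baseline on comparable hardware.

```shell
# Hypothetical helper: scale the 1962-minute mixPainter baseline.
# Usage: mixpainter_minutes SNP_FRACTION INDIVIDUAL_FRACTION
# Both arguments are fractions of the full dataset, e.g. 0.5 for half.
mixpainter_minutes() {
    awk -v base=1962 -v snp="$1" -v ind="$2" \
        'BEGIN { printf "%.0f\n", base * snp * ind * ind }'
}

# Example: half the SNPs and half the individuals gives
#   1962 * 0.5 * 0.5^2 = 245.25, printed as 245
#   mixpainter_minutes 0.5 0.5
```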

