Paper: Identification and correction of sample mix-ups in expression genetic data

The full manuscript (with supplementary tables and figures) is here.

The paper is available at arXiv and as a formal journal article at G3:

Broman KW, Keller MP, Broman AT, Kendziorski C, Yandell BS, Sen Ś, Attie AD (2015) Identification and correction of sample mix-ups in expression genetic data: A case study. G3 5:2177-2186 PubMed pdf data R/lineup software doi

The data are available at the Mouse Phenome Database, though not in exactly the form used in this repository.

The primary manuscript files are samplemixups_nolegends.Rnw and samplemixups_supp_nolegends.tex.

The Perl script add_legends.pl adds all of the legends. The .Rnw file is then run through knitr to create a LaTeX file, and the two LaTeX files are compiled with pdflatex and xelatex, respectively, to create the PDFs.

The build is a bit tricky. In principle, the Makefile in the primary directory tells the full story, but the analyses themselves are in an asciidoc file in the Analysis/R subdirectory, which has its own Makefile. Cached intermediate results are available at figshare as samplemixups_rcache.zip, which contains a set of .RData files that go in Analysis/R/Rcache.

To compile everything, you can:

  1. Download the cached intermediate results, samplemixups_rcache.zip, and unzip the archive. This will populate Analysis/R/Rcache.

  2. In Analysis/R, run

    R CMD BATCH grab_data.R

    This downloads the primary data files. It is quite slow, since there are 2 GB of data to download.

  3. In the primary directory, run make.
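
The three steps above can be sketched as a small shell wrapper. This is only an illustration, not part of the repository: the `run` and `build_paper` function names and the `DRY_RUN` variable are hypothetical, the zip file is assumed to be already downloaded into the primary directory, and the archive is assumed to hold the .RData files at its top level (check with `unzip -l samplemixups_rcache.zip` and adjust the `-d` target if it does not).

```shell
#!/bin/sh
# Hypothetical wrapper around the three build steps; run from the primary directory.

# Print a command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

build_paper() {
    # Step 1: unpack the cached intermediate results.
    # Assumption: the archive holds the .RData files at its top level.
    run mkdir -p Analysis/R/Rcache || return 1
    run unzip -o samplemixups_rcache.zip -d Analysis/R/Rcache || return 1

    # Step 2: download the primary data files (about 2 GB, so this is slow).
    run sh -c 'cd Analysis/R && R CMD BATCH grab_data.R' || return 1

    # Step 3: build everything from the primary directory.
    run make
}

# Usage: preview the commands first, then run them for real.
#   DRY_RUN=1; build_paper
#   DRY_RUN=0; build_paper
```

Previewing with `DRY_RUN=1` is a cheap way to confirm the paths before starting the 2 GB download.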


Necessary tools

  • R
  • Perl
  • Python 2.7
  • GNU make
  • Asciidoc
  • R packages: knitr, qtl, broman, lineup, ascii, data.table, igraph, beeswarm, RColorBrewer
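
Before building, it may help to check that the tools listed above are installed. The following is a minimal sketch, not something shipped with the repository: the function names `missing_tools` and `missing_r_packages` are hypothetical, and the executable names passed in the usage example (for instance `python2.7`) are assumptions that may differ on your system.

```shell
#!/bin/sh
# Hypothetical helpers for checking the prerequisites listed above.

# Print each named command that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Print each required R package that is not installed (needs Rscript).
missing_r_packages() {
    Rscript -e '
      pkgs <- c("knitr", "qtl", "broman", "lineup", "ascii",
                "data.table", "igraph", "beeswarm", "RColorBrewer")
      for (p in pkgs) if (!requireNamespace(p, quietly = TRUE)) cat(p, "\n")
    '
}

# Usage (executable names are assumptions; adjust for your system):
#   missing_tools R perl python2.7 make asciidoc pdflatex xelatex
#   missing_r_packages
```

Note that finding `make` on the PATH does not confirm that it is GNU make; `make --version` reports which implementation is installed.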

To Do

  • Do clean tests, with and without the intermediate files

License

The content in this repository is licensed under CC BY.
