Source for paper, "Identification and correction of sample mix-ups in expression genetic data: A case study"

kbroman/Paper_SampleMixups

Paper: Identification and correction of sample mix-ups in expression genetic data

The full manuscript (with supplementary tables and figures) is here.

The paper is available at arXiv and as a formal journal article at G3:

Broman KW, Keller MP, Broman AT, Kendziorski C, Yandell BS, Sen Ś, Attie AD (2015) Identification and correction of sample mix-ups in expression genetic data: A case study. G3 5:2177-2186 PubMed pdf data R/lineup software doi

The data are available at the Mouse Phenome Database, though not in exactly the form used in this repository.

The primary manuscript files are samplemixups_nolegends.Rnw and samplemixups_supp_nolegends.tex.

The Perl script add_legends.pl adds all of the legends. The .Rnw file is then run through knitr to create a LaTeX file, and the two LaTeX files are compiled with pdflatex and xelatex, respectively, to create the PDFs.

The build is a bit tricky. In principle, the Makefile tells the full story, but the analyses themselves are written up in an asciidoc file in the Analysis/R subdirectory, which has its own Makefile. Cached intermediate results are available at figshare as samplemixups_rcache.zip, which contains a set of .RData files that go in Analysis/R/Rcache.

To compile everything, you can:

  1. Download the cached intermediate results, samplemixups_rcache.zip, and unzip the file. This will populate Analysis/R/Rcache.

  2. In Analysis/R, run

    R CMD BATCH grab_data.R

    This will download the primary data files. It is quite slow, as there is 2 GB of data to download.

  3. In the primary directory, run make.
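The three steps above can be sketched as a small shell script, run from the top of the repository. This is only an illustration, not part of the repository: it assumes samplemixups_rcache.zip has already been downloaded by hand into the current directory, and that the archive holds the .RData files at its top level so that they can be unpacked directly into Analysis/R/Rcache. The helper names `run` and `build_all` and the `DRY_RUN` variable are hypothetical.

```shell
#!/bin/sh
# Sketch of the three build steps; run from the top of the repository.
set -e

# Run a command, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_all() {
    # Step 1: unpack the cached intermediate results.
    # Assumption: the zip file is in the current directory and contains
    # the .RData files at its top level.
    run unzip -o samplemixups_rcache.zip -d Analysis/R/Rcache

    # Step 2: download the primary data files (slow; 2 GB).
    # The subshell keeps the directory change local to this step.
    (run cd Analysis/R && run R CMD BATCH grab_data.R)

    # Step 3: build everything from the primary directory.
    run make
}
```

Calling `build_all` runs the steps in order and stops at the first failure. `DRY_RUN=1 build_all` prints the commands without running them, which is a cheap way to review what would happen before starting the large download.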


Necessary tools

  • R
  • Perl
  • Python 2.7
  • GNU make
  • Asciidoc
  • R packages: knitr, qtl, broman, lineup, ascii, data.table, igraph, beeswarm, RColorBrewer
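
Before building, it may help to confirm that the tools listed above are installed. The following is a minimal sketch, not something shipped with the repository. The function names `missing_tools` and `missing_r_packages` are hypothetical, and the executable names are assumptions: Python 2.7 in particular may be installed as `python2.7`, `python2`, or `python`, depending on the system.

```shell
#!/bin/sh
# Print each named command that is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Print each required R package that is not installed, one per line.
missing_r_packages() {
    Rscript -e 'pkgs <- c("knitr", "qtl", "broman", "lineup", "ascii", "data.table", "igraph", "beeswarm", "RColorBrewer"); cat(setdiff(pkgs, rownames(installed.packages())), sep = "\n")'
}

# Executable names below are assumptions; adjust them for your system.
missing_tools R perl python2.7 make asciidoc

# Only check R packages when Rscript itself is available.
if command -v Rscript >/dev/null 2>&1; then
    missing_r_packages
fi
```

Empty output means nothing was found to be missing. `command -v` only confirms that a program is on the PATH; it does not check that `make` is GNU make or that the versions are compatible.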

To Do

  • Do clean tests, with and without the intermediate files

License

The content in this repository is licensed under CC BY.
