This package contains the basic Matlab scripts used in the paper: "Identification of Rare alleles And Their Carriers using compressed se(que)nsing" Noam Shental, Amnon Amir and Or Zuk. Nucleic Acid Research, 38(19):e179, 2010
The package enables simulations of different sequencing pooling experiments, including pooling design, modeling sequencing errors, reconstruction of genotype vectors and evaluating reconstruction accuracy. It may serve as a guideline for designing pooling sequencing experiments. One may also enter real allele counts collected in pooled experiments and reconstruct the genotype.
Please look at the following files for the different applications:
pooling_example.m - an introductory example of a `single' scenario, i.e., parameter setting: One reconstructs a certain genotype, given a certain number of individuals, with a certain number of pools, a specific number of reads, etc.
find_limiting_factor.m - Compute the maximal sample size we can reconstruct for a given number of lanes, or the minimal number of lanes sufficient to reconstruct a certain number of individuals.
reconstruct_exp.m - reconstructs a genotype from reads collected experimentally.
These scripts use a specific file taken from the GPSR package (relevant file included). We thank the Mario Figueiredo for allowing us to include this file in our package.
The package was tested on Matlab 2009b, but should work on most Matlab versions. In case one performs very large scale experiments, it is recommended to compile the GPSR file. This may be simply done by the Matlab compiler toolbox. If this toolbox is unavailable at your local Matlab installation, please send us an email stating the operating system, and we will send a compiled version.