fgvieira edited this page Apr 5, 2013 · 2 revisions

ngsF is a program to estimate per-individual inbreeding coefficients.

  • Input Data

As input, ngsF needs a genotype likelihood (GL) file, formatted as 3 * n_ind * n_sites doubles in binary, that is, three values per individual per site. Currently, all sites in the file must be variable, so a prior SNP calling step is needed.
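As a rough illustration of that layout, the sketch below writes and reads a file of raw doubles using only the Python standard library. The helper names `write_gl` and `read_gl` are hypothetical and not part of ngsF, and the ordering of values (site, then individual, then genotype) and the use of native byte order are assumptions that should be checked against the tool that produced your file.

```python
import os
import tempfile
from array import array


def write_gl(path, values):
    """Write a flat sequence of numbers as raw native-endian doubles."""
    with open(path, "wb") as fh:
        array("d", values).tofile(fh)


def read_gl(path, n_ind, n_sites):
    """Read 3 * n_ind * n_sites doubles and nest them as [site][ind][genotype].

    The site-major ordering is an assumption made for this sketch.
    """
    expected = 3 * n_ind * n_sites
    flat = array("d")
    size = os.path.getsize(path)
    if size != expected * flat.itemsize:
        raise ValueError(
            f"expected {expected * flat.itemsize} bytes, found {size}"
        )
    with open(path, "rb") as fh:
        flat.fromfile(fh, expected)
    nested = []
    for s in range(n_sites):
        site = []
        for i in range(n_ind):
            start = (s * n_ind + i) * 3
            site.append(list(flat[start:start + 3]))
        nested.append(site)
    return nested


# Round trip with made-up values: 2 individuals, 2 sites -> 12 doubles.
values = [0.0, -1.5, -3.0, -2.0, 0.0, -2.5,
          -0.5, -0.25, -4.0, -3.5, -1.0, 0.0]
tmp_path = os.path.join(tempfile.mkdtemp(), "example.glf")
write_gl(tmp_path, values)
gl = read_gl(tmp_path, n_ind=2, n_sites=2)
```

The size check is a cheap way to catch a wrong `n_ind` or `n_sites` before any estimation is attempted, since the file itself carries no header in this layout.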

  • Stopping Criteria

A common issue with iterative algorithms is the choice of stopping criteria. ngsF implements a dual-condition threshold: the relative difference in log-likelihood and the RMSD of the estimates (F and freq) between iterations. As for which threshold to use, simulations show that 1e-5 seems to be a reasonable value. However, if you are dealing with low-coverage data (2x-3x), it might be worth using lower thresholds (between 1e-6 and 1e-9).
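The dual condition described above can be sketched as follows. This is not ngsF's code: the function names are hypothetical, and the sketch assumes that both conditions must hold, that one threshold is shared by both, and that the F and freq estimates are passed as a single flat list.

```python
import math


def rmsd(new, old):
    """Root-mean-square deviation between two equal-length estimate vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(new, old)) / len(new))


def has_converged(lkl_new, lkl_old, est_new, est_old, threshold=1e-5):
    """Return True when both stopping conditions fall below the threshold.

    Assumption of this sketch: convergence requires the relative change in
    log-likelihood AND the RMSD of the estimates to be under `threshold`.
    """
    if lkl_old == 0.0:
        return False  # relative change is undefined; keep iterating
    rel_lkl = abs((lkl_new - lkl_old) / lkl_old)
    return rel_lkl < threshold and rmsd(est_new, est_old) < threshold


# A relative change of about 1e-6 passes the default threshold,
# but not a stricter one such as might be used for low-coverage data.
loose = has_converged(-1000.0, -1000.001, [0.1, 0.2], [0.1, 0.2])
strict = has_converged(-1000.0, -1000.001, [0.1, 0.2], [0.1, 0.2],
                       threshold=1e-9)
```

Lowering the threshold makes the check stricter, so the algorithm runs for more iterations before stopping.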

ngsF implements two methods: a true EM and an approximated EM. The latter was developed to reduce the computation time of the former by providing it with more accurate initial values.

Vieira FG, Fumagalli M, Albrechtsen A and Nielsen R (submitted). Estimating inbreeding coefficients from NGS data: impact on genotype calling and allele frequency estimation.
