
genFamily

Thomas P Spargo edited this page Oct 17, 2022 · 2 revisions

genFamily function documentation

Updated 17/10/2022

The repository is maintained by Thomas Spargo (thomas.spargo@kcl.ac.uk) - please reach out with any questions.


Description

The genFamily function generates a vector of disease states ("Unaffected", "Sporadic" or "Familial") assigned to a simulated family at successive time points (or at a single time point of maximum lifetime risk). States are assigned according to the parameters and disease characteristics of the simulated variant, as specified by varChars.

Usage examples

# Define the variant characteristics using the varChars function
var.Char <- varChars(f=0.75, g=0, numsteps=10, onsetRateDiff=1)

# Use lapply to run genFamily over a small example population of families in which one parent harbours the variant of penetrance f (group="var").
# 0:5 is supplied to the f_sibs argument, giving one family per sibship size from 0 to 5.
lapply(0:5, genFamily, group="var", var.Char=var.Char)

# Alternatively, simulate a single family, here with 5 sibs and no parent harbouring the variant (group="novar"):
genFamily(f_sibs=5, group="novar", var.Char=var.Char)

Arguments

f_sibs - numeric indicating number of siblings to generate within the family. If 0, the family will contain 2 individuals, representing two unrelated 'parents'.

group - character string, either "var" or "novar". Use "var" if one parent harbours a variant of penetrance 'f', as represented within the var.Char matrix. Use "novar" if neither parent harbours this variant, so that all family members have lifetime disease risk 'g', as represented within the var.Char matrix. See var.Char and Details.

var.Char - A matrix in the format output by the varChars function. The input should have the class 'var.Char.matrix', which is assigned by varChars.

final_time - Logical, defaults to FALSE. If TRUE, the family is generated at a single time point, at which all members are assigned their maximum risk according to their variant status.

return_indivs - Logical, defaults to FALSE. If FALSE, returns a plain vector (not a list) of family disease state assignments at each time. If TRUE, returns a more detailed output as a list, indicating (in addition to the overall family disease state) each family member's age at time 0 and affectedness (1 if affected, 0 if not) at each time of sampling.

eldestAt0 - Logical, defaults to FALSE. See Details.

stepHazard - Logical, defaults to FALSE. See Details.

Details

Intended use is within lapply to simulate a population of families of various sibship sizes (as indicated using f_sibs).
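Once a population has been simulated with lapply, the per-family state vectors can be summarised in base R. The sketch below does not call genFamily; it uses a small hand-written list (sim_families, a hypothetical stand-in for the lapply output when return_indivs=FALSE) and tabulates the state each family holds at its last time point. The values and vector lengths are invented for illustration only.

```r
# Hypothetical stand-in for the list returned by
# lapply(0:5, genFamily, group="var", var.Char=var.Char);
# the values are invented for illustration, not simulated.
sim_families <- list(
  c("Unaffected", "Unaffected", "Sporadic"),
  c("Unaffected", "Sporadic", "Familial"),
  c("Unaffected", "Unaffected", "Unaffected"),
  c("Unaffected", "Sporadic", "Sporadic")
)

# Take the final element of each family's state vector.
final_state <- vapply(sim_families, function(x) x[length(x)], character(1))

# Fix the factor levels so states held by zero families still appear.
final_state <- factor(final_state,
                      levels = c("Unaffected", "Sporadic", "Familial"))

# Proportion of families in each disease state at the last time point.
state_props <- prop.table(table(final_state))
print(state_props)
```

The same pattern should apply to real genFamily output, since each list element is described as a character vector of states over time.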

Further argument information:

eldestAt0: If TRUE, adjusts the family ages such that the eldest parent is assigned age 0 at the first time of sampling, where 0 is the last time point at which no family member could be affected by disease, as indicated by the risks stored within the var.Char matrix. If FALSE, the youngest family member is at age 0, and thus at the first time of sampling all other family members have some probability of being affected by disease, according to their variant status and the risk indicated in var.Char. Setting eldestAt0=TRUE is not representative of a real population, since all people have parents who may have been affected at an earlier time.
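The two anchoring choices can be pictured with a few lines of base R. This is a sketch of the idea only, not the package's internal code: rel_age is a hypothetical vector of family members' ages measured in sampling steps from an arbitrary origin, and the member names and age gaps are invented.

```r
# Hypothetical relative ages in sampling steps (invented values).
rel_age <- c(parent1 = 6, parent2 = 5, sib1 = 1, sib2 = 0)

# eldestAt0 = FALSE: the youngest member sits at age 0 at the first
# time of sampling, so every older member is already at some risk.
age_youngest0 <- rel_age - min(rel_age)

# eldestAt0 = TRUE: the eldest member sits at age 0 at the first time
# of sampling, so nobody has yet entered the at-risk ages.
age_eldest0 <- rel_age - max(rel_age)

print(rbind(age_youngest0, age_eldest0))
```

Under the second anchoring every age is zero or negative at the first time of sampling, which is why that setting starts from a population with no affected individuals.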

stepHazard: If FALSE, disease risks at each time point are determined using the affAtAge subfunction (see here). If TRUE, disease risk at the first time of sampling is determined using the affAtAge subfunction. Thereafter, cumulative risk at each subsequent time of sampling is determined from the additional risk accrued across the relevant ages of onset (determined according to numsteps, onsetRateDiff, and whether the individual has the variant). Setting stepHazard=TRUE is not recommended because the higher disease states ('familial'/'sporadic') are systematically underrepresented under this 'stepwise' approach.

Output

If return_indivs=FALSE: A character vector is returned detailing the disease states assigned to the simulated family over time, until the time at which the youngest sib has surpassed the final age at which they might be affected.

If return_indivs=TRUE: A list containing 2 elements:

$state: The disease states over time, as when return_indivs=FALSE

$family: A matrix with family members as rows (with their variant statuses, "novar", "var" or "possvar", as row names), giving each family member's affectedness (1 if affected, 0 if not) at each time of sampling.
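To show how the two list elements relate, the sketch below builds a small matrix by hand in the layout described for $family and derives a family state per time point from it. It assumes the common convention that a family with no affected members is "Unaffected", with exactly one is "Sporadic", and with two or more is "Familial". That rule, the helper name classify_family, and the column names are assumptions made for illustration and may not match genFamily's internals.

```r
# Hand-built affectedness matrix: rows are family members
# (row names give variant status), columns are sampling times.
# Column names are hypothetical.
family <- matrix(
  c(0, 0, 1,
    0, 0, 0,
    0, 1, 1,
    0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("var", "novar", "possvar", "possvar"),
                  c("t1", "t2", "t3"))
)

# Hypothetical helper: map the number of affected members at each
# time to a disease state (assumed rule: 0 / 1 / 2 or more).
classify_family <- function(fam) {
  n_affected <- colSums(fam)
  ifelse(n_affected == 0, "Unaffected",
         ifelse(n_affected == 1, "Sporadic", "Familial"))
}

state <- classify_family(family)
print(state)
```

With real output, the analogous check would compare a vector derived this way against the $state element of the returned list.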
