
4. Interpreting Output

caitiecollins edited this page Jun 11, 2024 · 26 revisions

Plots

If plot.tree is set to TRUE, the tree will be plotted, with the terminal phenotype indicated by tip colour. If plot.null.dist is set to TRUE, the null distribution and the significant findings will be plotted for each association score. If plot.dist is set to TRUE, the distribution of the empirical association score values will be plotted for each association score. If plot.manhattan is set to TRUE, a Manhattan plot will show the values of each association score, with a threshold separating the significant findings from the non-significant ones.

By default, plot.tree, plot.null.dist, and plot.manhattan are all set to TRUE.

Printing Output

In the example above, the output of the treeWAS function was assigned to an object called out. Because this object is large, we recommend using the print.treeWAS function to examine the set of results identified (NB: print.treeWAS is simply the print method for objects of class treeWAS, so calling print on out is sufficient):

## Example output:
library(treeWAS)                # provides the example data and print.treeWAS
data(treeWAS.example.out)
out <- treeWAS.example.out
class(out)                      # treeWAS
print(out, sort.by.p = FALSE)   # dispatches to print.treeWAS

Output Returned

The output of treeWAS contains the set of significant loci identified as well as all relevant information used by or generated within treeWAS. The treeWAS function returns a list object (in our example, called out), which takes on the following general structure (run str(out) to examine the structure of our example output):


$treeWAS.combined, the first element, is a list of length two containing the identities of significant findings:

$treeWAS.combined : The pooled set of significant loci identified by any association score.

$treeWAS : A list with the sets of significant loci identified by each association score individually.
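As a minimal sketch of how the pooled set relates to the per-score sets, the toy list below mimics only this first element of the output. The locus names are hypothetical, and the sketch assumes that each set is stored as a character vector of locus names; with real output, the toy list would be replaced by your own treeWAS result.

```r
# Toy stand-in for the first element of a treeWAS output list.
# Locus names are made up; sets are assumed to be character vectors.
toy.out <- list(
  treeWAS.combined = list(
    treeWAS.combined = c("locus_12", "locus_40", "locus_77"),
    treeWAS = list(
      terminal     = c("locus_12", "locus_40"),
      simultaneous = c("locus_40"),
      subsequent   = c("locus_40", "locus_77")
    )
  )
)

pooled   <- toy.out$treeWAS.combined$treeWAS.combined  # found by any score
by.score <- toy.out$treeWAS.combined$treeWAS           # found by each score

# Logical matrix: one row per pooled locus, one column per association score.
flagged <- sapply(by.score, function(x) pooled %in% x)
rownames(flagged) <- pooled

# Number of association scores that identified each pooled locus.
n.scores <- rowSums(flagged)
print(flagged)
print(n.scores)
```

A locus flagged by several scores at once (here, the hypothetical locus_40) can be picked out with names(n.scores)[n.scores > 1].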


$[SCORE], list elements for each association score, contain the original score values for each locus and additional information for significant loci. By default, there will be three such $[SCORE]-type elements called $terminal, $simultaneous, and $subsequent, each of which will have the following elements:

$corr.dat : The association score values for loci in the empirical genetic dataset.

$corr.sim : The association score values for loci in the simulated genetic dataset.

$p.vals : The p-values associated with the loci in the empirical genetic dataset for this association score.

$sig.thresh : The significance threshold for this association score.

$sig.snps : A data frame describing the genetic loci identified as significant. The last four columns will only be present if the data is binary, in which case they will contain the cell counts of a 2x2 table of genotypic and phenotypic states for each significant locus.
row.names: The column names of significant loci.
$SNP.locus: The column positions of significant loci in dat$snps (see below).
$p.value: The p-values for significant loci.
$score: The association score values for significant loci. The sign indicates the relative direction of the association, while the value indicates the strength of the association according to the metric in question.
$G1P1: The number of individuals with genotype = 1 and phenotype = 1 at this locus.
$G0P0: The number of individuals with genotype = 0 and phenotype = 0 at this locus.
$G1P0: The number of individuals with genotype = 1 and phenotype = 0 at this locus.
$G0P1: The number of individuals with genotype = 0 and phenotype = 1 at this locus.

$min.p.value : The minimum p-value. A p-value listed as zero can only truly be said to lie below this value.
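The elements above can be explored with a few lines of base R. The sketch below uses a toy data frame in place of a real $sig.snps element (all row names and values are invented for illustration) and a toy value in place of $min.p.value. It rebuilds the 2x2 table of genotypic and phenotypic states for one locus from the four cell-count columns, and relabels p-values of zero as lying below the minimum p-value. The helper cell.table is hypothetical and is not part of treeWAS.

```r
# Toy stand-in for out$terminal$sig.snps (binary data); values are made up.
sig.snps <- data.frame(
  SNP.locus = c(105, 2310),
  p.value   = c(0, 0.0004),
  score     = c(0.82, -0.61),
  G1P1      = c(45, 6),
  G0P0      = c(47, 8),
  G1P0      = c(3, 44),
  G0P1      = c(5, 42),
  row.names = c("locus_105", "locus_2310")
)
min.p.value <- 0.0001  # toy stand-in for out$terminal$min.p.value

# Hypothetical helper: rebuild the 2x2 genotype-by-phenotype table for one locus.
cell.table <- function(sig.snps, locus) {
  x <- sig.snps[locus, ]
  # matrix() fills column by column: phenotype 0 first, then phenotype 1.
  matrix(c(x$G0P0, x$G1P0, x$G0P1, x$G1P1), nrow = 2,
         dimnames = list(genotype = c("0", "1"), phenotype = c("0", "1")))
}
tab <- cell.table(sig.snps, "locus_105")
print(tab)

# A p-value listed as zero is only known to be below min.p.value.
p.label <- ifelse(sig.snps$p.value == 0,
                  paste("<", format(min.p.value, scientific = FALSE)),
                  format(sig.snps$p.value, scientific = FALSE))
names(p.label) <- rownames(sig.snps)
print(p.label)
```

With real output, the same steps would start from out$terminal$sig.snps and out$terminal$min.p.value (or the corresponding elements for another association score).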


$dat, the final element, contains all of the data either used by or generated within treeWAS. Objects that were provided as inputs to the treeWAS function will be returned here in the form in which they were analysed (i.e., after data cleaning within treeWAS).

$snps : The empirical genetic data matrix.

$snps.reconstruction : The ancestral state reconstruction of the empirical genetic data matrix.

$snps.sim : The simulated genetic data matrix.

$snps.sim.reconstruction : The ancestral state reconstruction of the simulated genetic data matrix.

$phen : The phenotypic variable.

$phen.reconstruction : The ancestral state reconstruction of the phenotype.

$tree : The phylogenetic tree.

$n.subs : The homoplasy distribution. The i-th element (for i from 1 to length(n.subs)) contains the number of loci that have been inferred to have undergone i substitutions.
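To make the indexing of the homoplasy distribution concrete, the sketch below summarises a toy n.subs vector. The counts are invented for illustration; with real output, the vector would be taken from out$dat$n.subs.

```r
# Toy homoplasy distribution: element i is the number of loci
# inferred to have undergone i substitutions (values are made up).
n.subs <- c(120, 45, 20, 10, 5)

subs <- seq_along(n.subs)         # substitution counts 1..length(n.subs)

n.loci     <- sum(n.subs)         # total number of loci counted: 200
total.subs <- sum(subs * n.subs)  # total substitutions across loci: 335
mean.subs  <- total.subs / n.loci # mean substitutions per locus: 1.675

# Proportion of loci inferred to have undergone more than one substitution.
prop.multi <- sum(n.subs[subs > 1]) / n.loci  # 80 / 200 = 0.4

print(c(n.loci = n.loci, total.subs = total.subs,
        mean.subs = mean.subs, prop.multi = prop.multi))
```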