-(** compute the E-step statistics for one site, given the leaves (as would be given to {{:PhyloModel}PhyloModel.infer}). Update the given [sufficient_statistics] and return the probability of the leaves under the model. This should be called for each site to collect the statistics for the whole alignment. *)
+(** compute the E-step statistics for one site, given the leaves (as would be given to {{:PhyloModel}PhyloModel.prepare_lik}). Update the given [sufficient_statistics] and return the probability of the leaves under the model. This should be called for each site to collect the statistics for the whole alignment. *)
(** any entry in the sufficient statistics less than [tol] is set to zero. useful if e.g. planning to compress the sufficient statistics for transport between compute nodes *)
val clean_sufficient_statistics : ?tol:float -> sufficient_statistics -> unit
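As a rough illustration of the thresholding described in the comment above, the sketch below zeroes small entries in place. The real `sufficient_statistics` type is not shown in this interface, so the `float array array` representation, the name `clean_matrix`, and the default tolerance `1e-6` are assumptions made only for this example.

```ocaml
(* Hypothetical stand-in: a plain matrix of floats is assumed in place of the
   library's [sufficient_statistics] type, which is not shown here. *)
let clean_matrix ?(tol = 1e-6) (stats : float array array) : unit =
  Array.iter
    (fun row ->
      (* Overwrite in place every entry strictly below the tolerance. *)
      Array.iteri (fun j x -> if x < tol then row.(j) <- 0.0) row)
    stats
```

Zeroing near-zero entries makes the matrix sparser, which is what helps a later compression step.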
-(** Probabilistic inference on phylogenetic trees given the substitution (P) matrices on each branch. These are low-level routines that should usually be used through the higher-level {{:PhyloModel}[PhyloModel]} abstractions. *)
+(** core phylogenetic likelihood calculations
-type intermediate
+These should usually be accessed through the higher-level {{:PhyloModel}[PhyloModel]} abstractions. *)
(** Specifying a leaf (extant) character. Usually the extant character is known with certainty; in this case, use [`Certain] with the index of the character, e.g. [`Certain (Code.Codon61.index ('A','T','G'))]. Alternatively, you can specify an arbitrary probability distribution over extant characters. Lastly, you can specify to marginalize a leaf out of the likelihood calculations entirely. *)
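The three ways of specifying a leaf can be pictured as per-character likelihood vectors. The sketch below is one possible reading, not the library's definition: only [`Certain] is named in the comment above, while the tag names [`Distribution] and [`Marginalize] and the helper `leaf_vector` are assumptions for illustration.

```ocaml
(* Assumed shape of the leaf type: only [`Certain] appears in the text above;
   the other two tag names are hypothetical. *)
type leaf = [ `Certain of int | `Distribution of float array | `Marginalize ]

(* Per-character likelihood vector for a leaf over an alphabet of size [k]. *)
let leaf_vector (k : int) (leaf : leaf) : float array =
  match leaf with
  | `Certain i ->
      if i < 0 || i >= k then invalid_arg "leaf_vector: index out of range";
      (* Indicator vector: all mass on the observed character. *)
      Array.init k (fun j -> if j = i then 1.0 else 0.0)
  | `Distribution d ->
      if Array.length d <> k then invalid_arg "leaf_vector: wrong length";
      Array.copy d
  | `Marginalize ->
      (* A vector of ones contributes a factor of 1 for every character, so
         the leaf drops out of the likelihood. *)
      Array.make k 1.0
```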
-(** The calculations use a workspace of [((2 * T.size tree - T.leaves tree) * k)] [float]s where [k] is the alphabet size. As a performance optimization, you can create a workspace with [new_workspace tree k] and use it across multiple calls to [prepare]; otherwise it will allocate by itself. *)
+(** The calculations use a workspace of [((2 * T.size tree - T.leaves tree) * k)] [float]s where [k] is the alphabet size. As a performance optimization, you can create a workspace with [new_workspace tree k] and use it across multiple calls to [prepare]; otherwise, it will allocate automatically. *)
val new_workspace : T.t -> int -> workspace
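To make the workspace formula concrete, here is a small worked calculation. It assumes that `T.size tree` counts all nodes of the tree and `T.leaves tree` counts its leaves; the function name `workspace_floats` and its labelled arguments are hypothetical.

```ocaml
(* Number of floats in the workspace, following the formula in the comment
   above. [size] and [leaves] stand in for [T.size tree] and [T.leaves tree]
   (assumed to be the node count and the leaf count, respectively). *)
let workspace_floats ~size ~leaves ~k = ((2 * size) - leaves) * k

let () =
  (* A rooted binary tree with 4 leaves has 7 nodes; with an alphabet of
     k = 61 characters this gives (14 - 4) * 61 = 610 floats. *)
  assert (workspace_floats ~size:7 ~leaves:4 ~k:61 = 610)
```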
-(** [prepare tree p_matrices root_prior leaves] infers ancestral states, given:
+type intermediate
+
+(** [prepare tree p_matrices root_prior leaves] initializes the likelihood calculations, given:
- [tree] the phylogenetic tree
-- [p_matrices.(i)] is the substitution matrix for the branch leading TO node [i] FROM its parent.
+- [p_matrices.(i)] is the substitution matrix for the branch leading {e to} node [i] {e from} its parent.
- [root_prior] the prior probability distribution over characters at the root.
- [leaves] is an array of [leaf]s (see above), the appropriate number for the tree
-- [workspace] is an appropriately sized workspace; it will be allocated if not given.
-@return an abstract value from which various information about ancestral states can be extracted (see below). If using a shared workspace, be sure to get all the results you need before the next call to [prepare].
+- [workspace] is an appropriately sized workspace; one will be allocated if not given.
+@return an abstract value from which various information about ancestral states can be extracted (see below). The time-consuming calculations are not actually performed until needed. If using a shared workspace, be sure to get all the results you need before the next call to [prepare].
(** calculate the probability of the leaves under the substitution model *)
val likelihood : intermediate -> float
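For readers unfamiliar with what `likelihood` returns, the following is a minimal, self-contained sketch of the standard pruning recursion for the probability of the leaves. It is not the library's implementation: the parent-array tree encoding, the node numbering, and the name `pruning_likelihood` are assumptions, and it skips the workspace reuse and lazy evaluation that the comments above describe.

```ocaml
(* Felsenstein pruning on a tree given as a parent array.
   Assumptions (not taken from the library): every child has a smaller index
   than its parent, the root is the last node, and the leaves are nodes
   [0 .. Array.length leaf_vecs - 1]. *)
let pruning_likelihood
    ~(parent : int array) (* parent.(i) for each non-root node i *)
    ~(p_matrices : float array array array) (* p_matrices.(i): parent -> i *)
    ~(root_prior : float array)
    ~(leaf_vecs : float array array) : float =
  let n = Array.length parent and k = Array.length root_prior in
  let root = n - 1 in
  (* alpha.(i).(a) = P(leaves below node i | node i has character a) *)
  let alpha =
    Array.init n (fun i ->
        if i < Array.length leaf_vecs then Array.copy leaf_vecs.(i)
        else Array.make k 1.0)
  in
  (* Children precede parents, so alpha.(i) is complete when node i is
     reached; fold its contribution into its parent's vector. *)
  for i = 0 to root - 1 do
    let p = p_matrices.(i) and up = alpha.(parent.(i)) in
    for a = 0 to k - 1 do
      let s = ref 0.0 in
      for b = 0 to k - 1 do
        s := !s +. (p.(a).(b) *. alpha.(i).(b))
      done;
      up.(a) <- up.(a) *. !s
    done
  done;
  (* Weight the root vector by the prior over root characters. *)
  let lik = ref 0.0 in
  Array.iteri (fun a pi -> lik := !lik +. (pi *. alpha.(root).(a))) root_prior;
  !lik
```

With two leaves attached to a root, a two-character alphabet, and both leaves observed as character 0, the result is the prior-weighted sum over root characters of the product of the two branch transition probabilities.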
-(** compute the posterior probability distribution over characters at the specified node *)
+(** compute the posterior probability distribution over characters at the specified node.*)
val node_posterior : intermediate -> int -> float array
(** [branch_posteriors intermediate k] computes the posterior probability of each possible substitution on the specified branch [k]. That is, entry [(i,j)] of the returned matrix is [P(parent(k) = i && k = j | Leaves)] *)
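One common way to write the quantity described above uses "inside" and "outside" probabilities. The notation here is introduced only for this note and is not taken from the library: $\alpha_k(j)$ is the probability of the leaves below node $k$ given that $k$ has character $j$, $\beta_k(i)$ is the joint probability of all other leaves and of $\mathrm{parent}(k)$ having character $i$, and $P_k$ is the substitution matrix on the branch leading to $k$.

```latex
\[
  \Pr\bigl(\mathrm{parent}(k) = i,\; k = j \,\bigm|\, \text{Leaves}\bigr)
  = \frac{\beta_k(i)\, P_k(i,j)\, \alpha_k(j)}
         {\sum_{i',j'} \beta_k(i')\, P_k(i',j')\, \alpha_k(j')}
\]
```

The denominator is the probability of the leaves under the model, so the entries of the matrix sum to 1. Summing over $j$ gives the posterior over characters at $\mathrm{parent}(k)$, and summing over $i$ gives the posterior at $k$.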