Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get the mutational spectrum of a specified node and its descendants #148

Open
szhan opened this issue Apr 12, 2023 · 2 comments
Open

Get the mutational spectrum of a specified node and its descendants #148

szhan opened this issue Apr 12, 2023 · 2 comments

Comments

@szhan
Copy link
Contributor

szhan commented Apr 12, 2023

Bloom et al. (2023) MBE examined clade-specific mutational spectra. In Figure 1, they compared VoC-specific mutational spectra and noticed changes in mutational spectra (e.g., Omicron versus early clades).

It should be useful to extend the get_mutation_spectrum function in sc2ts to get the mutation spectrum of a clade under a specified ancestral node. We may also want to allow for including and/or excluding nested clades under nodes with certain metadata (e.g., GISAID clade or Pango lineage).

@szhan
Copy link
Contributor Author

szhan commented Apr 12, 2023

Maybe something like this? Am I thinking about the descendants across trees correctly?

def get_mutation_spectrum(
    self,
    min_inheritors=1,
    focal_node=None,
    exclude_nodes=None,
):
    """
    Returns a counter of mutations in the ARG, representing a mutation spectrum.

    :param min_inheritors: Minimum number of inheritors of a mutation.
    :param focal_node: If not None, only mutations in the subtree of this node are considered.
    :param exclude_nodes: If not None, mutations in the subtree of these nodes are excluded.
    """
    if focal_node is not None:
        # Get the descendants of this node across trees.
        pass
    if exclude_nodes is not None:
        # Get the descendants of these nodes across trees.
        pass
    keep = self.mutations_num_inheritors >= min_inheritors
    # Keep mutations ocurring on the descendants of the focal node, but not on the focal node itself.
    # Exclude mutations ocurring on the descendants of the exclude nodes, but not on the exclude nodes themselves.
    inherited = self.mutations_inherited_state[keep]
    derived = self.mutations_derived_state[keep]
    sep = inherited.copy()
    sep[:] = ">"
    x = np.char.add(inherited, sep)
    x = np.char.add(x, derived)
    return collections.Counter(x)

@jeromekelleher
Copy link
Owner

Good idea, that should be doable. I guess the straightforward way to do this for a single node is to iterate over the trees, and then look at all mutations at sites on that tree, and keep them if they have the node in question as an ancestor.

This will work fine for the application here I think, since we only care about a few important nodes. keeping the mutation spectrum for all nodes is harder, but doable I think

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants