Change MutationTable.node to MutationTable.edge? #668

@jeromekelleher

Currently a mutation knows what node it happened over, but really it should know what edge it happened on. Conceptually, rather than storing the node ID, we should be storing the edge ID - then, you can look up both the child and parent node for a given mutation, as well as know the spatial extent of the edge.

I can't think of any immediate applications of this beyond checking that mutation times are valid and viz, so perhaps it's one to acknowledge as a suboptimal design and move on. It would cause a fair bit of breakage if we were to change the Tables API so that MutationTable.node became MutationTable.edge. The TreeSequence API could be updated in a non-breaking way, I'd imagine, as we'd fill out Mutation.node by looking up the edge table.
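To make the idea concrete, here is a minimal sketch of what an edge ID would give us. It uses plain Python containers rather than the real tskit tables; the `Edge` tuple, the toy data and the helper names are all hypothetical, and the time rule (mutation time between the child's and the parent's node time, inclusive) is an assumption for illustration only.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """Hypothetical stand-in for one row of an edge table."""
    left: float
    right: float
    parent: int
    child: int


# Toy data: nodes 0 and 1 are samples at time 0, node 2 is their parent at time 1.
node_time = [0.0, 0.0, 1.0]
edges = [Edge(0.0, 10.0, 2, 0), Edge(0.0, 10.0, 2, 1)]


def edge_span(edge_id):
    """Spatial extent of the edge a mutation sits on."""
    edge = edges[edge_id]
    return edge.right - edge.left


def mutation_time_is_valid(edge_id, mutation_time):
    """Assumed rule: the mutation time lies between the child and parent node times."""
    edge = edges[edge_id]
    return node_time[edge.child] <= mutation_time <= node_time[edge.parent]
```

With only a node ID, the parent and the span have to be recovered from the tree at the mutation's site; with an edge ID they are a single table lookup.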

There are a few ways we could go about doing this, if we did it. Here's the "ripping off the band-aid" way:

  • In the C Tables API, change any references to node to edge.
  • Change the file format to store the edge instead of node.
  • In the C mutation_t struct, add a new edge attribute. Fill in the old node item by looking up the edge table. Code that uses the mutation_t struct shouldn't be affected at all then.
  • Similarly, in Python, we change and break the Tables API, but keep the compatibility in the high-level TreeSequence/Tree API. The MutationTable could also provide the node column as a property computed on demand, which should really minimise downstream breakage.
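The "node column computed on demand" idea from the last bullet could look roughly like the sketch below. This is not the tskit API: `MutationTableSketch` and its constructor arguments are hypothetical, and it assumes every mutation has a valid edge ID, so that the node is simply the child of that edge.

```python
class MutationTableSketch:
    """Hypothetical mutation table that stores edge IDs and derives node IDs."""

    def __init__(self, edge, edge_child):
        # edge: one edge ID per mutation (the stored column).
        # edge_child: the child column of the edge table, indexed by edge ID.
        self.edge = list(edge)
        self._edge_child = list(edge_child)

    @property
    def node(self):
        # Computed on demand: the node a mutation sits above is the child
        # of the edge it sits on. Nothing extra is stored.
        return [self._edge_child[e] for e in self.edge]


# Three mutations on edges 1, 0 and 1; edge 0 has child 5 and edge 1 has child 7.
table = MutationTableSketch(edge=[1, 0, 1], edge_child=[5, 7])
node_column = table.node
```

Code that only reads the node column would keep working under this scheme; code that writes it would still need to change.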

I don't think there'd be that much downstream breakage if we change the tables API. Most things that use the Tables API are either within tskit, or closely related projects. Since we're already going through a major file version bump for adding the time field to mutations (#513), there's no additional pain caused there.
