Skip to content

Add Tree.generate_random() method #1033

@hyanwong

Description

@hyanwong

It's now trivial to add a tskit.Tree.generate_random() method for generating random bifurcating trees with equal probability for each topology. All we have to do is:

def generate_random(self, num_leaves, random_seed=None):
    return self.generate_star(num_leaves).split_polytomy(random_seed)

And we probably don't need another unit test for this, because we already test it here, so all we need to do (IMO) is change that function to generate_random() rather than split_polytomies(), and we should be covered.

The arguments for this (taken from #1030 (comment)) are

  1. It allows tskit to produce random trees without relying on an external dependency. That could be useful e.g. for SLiM testing, or what have you.
  2. It's the sort of thing that a phylogeneticist might expect in a phylogenetic library (if we are trying to appeal to that audience)
  3. The distribution of coalescence tree topologies is not equiprobable - see below - the top plot is msprime topologies, the bottom is from split_polytomy.
import matplotlib.pyplot as plt
import collections

import tqdm
import msprime
import tskit

msprime_counts = collections.defaultdict(int)
split_poly_counts = collections.defaultdict(int)
for s in tqdm.trange(1, 10000):
    msprime_counts[msprime.simulate(5, random_seed=s).first().rank()] += 1

for s in tqdm.trange(1, 10000):
    split_poly_counts[tskit.Tree.generate_star(5).split_polytomies(random_seed=s).rank()] += 1

fig, axes = plt.subplots(2)
axes[0].bar(list(range(len(msprime_counts))), msprime_counts.values())
axes[1].bar(list(range(len(split_poly_counts))), split_poly_counts.values())

image

I'm happy to submit this as a PR if the approach I describe is the right one.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions