Deprecate WrappedTreeSequence (#841)
* Deprecate WrappedTreeSequence.

* move function implementations to fwdpy11.tskit_tools
molpopgen committed Nov 1, 2021
1 parent 3dc94bf commit c5f682d
Showing 8 changed files with 177 additions and 81 deletions.
32 changes: 16 additions & 16 deletions doc/long_vignettes/tskit_metadata_vignette.md
@@ -164,7 +164,7 @@ fwdpy11.evolvets(rng, pop, params,
simplification_interval=100,
suppress_table_indexing=True)
ts = pop.dump_tables_to_tskit(demes_graph=graph, wrapped=True)
ts = pop.dump_tables_to_tskit(demes_graph=graph)
```

Now that we have some data, let's look at how the `fwdpy11` mutation and individual information got encoded as `tskit` metadata!
@@ -174,21 +174,21 @@ Now that we have some data, let's look at how the `fwdpy11` mutation and individ
By default, the metadata decode as a {class}`dict`:

```{code-cell} python
for m in ts.ts.mutations():
for m in ts.mutations():
print(m.metadata)
```

We can call {func}`fwdpy11.tskit_tools.WrappedTreeSequence.decode_mutation_metadata` to convert the `dict` to a {class}`fwdpy11.Mutation`. This function returns a {class}`list` because it can be accessed using a {class}`slice`:
We can call {func}`fwdpy11.tskit_tools.decode_mutation_metadata` to convert the `dict` to a {class}`fwdpy11.Mutation`. This function returns a {class}`list` because it can be accessed using a {class}`slice`:

```{code-cell}
m = ts.decode_mutation_metadata(0)
m = fwdpy11.tskit_tools.decode_mutation_metadata(ts, 0)
type(m[0])
```

With no arguments, all metadata are converted to the `Mutation` type:

```{code-cell} python
for m in ts.decode_mutation_metadata():
for m in fwdpy11.tskit_tools.decode_mutation_metadata(ts):
print(m.esizes)
```
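The list return type follows naturally once a single index and a slice are handled by the same code path. A minimal sketch of that idea, using a plain Python list in place of a mutation table; `decode_rows` and `records` are hypothetical names for illustration and are not part of `fwdpy11`:

```python
import typing


def decode_rows(
    records: typing.Sequence[dict],
    rows: typing.Optional[typing.Union[int, slice]] = None,
) -> typing.List[dict]:
    """Hypothetical helper: always return a list, whatever `rows` is."""
    if rows is None:
        # No argument: select every row.
        return list(records)
    if isinstance(rows, int):
        # A single index still comes back wrapped in a list.
        return [records[rows]]
    # A slice selects zero or more rows.
    return list(records[rows])


records = [{"s": 0.1}, {"s": -0.2}, {"s": 0.0}]
first = decode_rows(records, 0)
everything = decode_rows(records)
tail = decode_rows(records, slice(1, 3))
```

Because the result is a list in every case, callers can write `decode_rows(records, 0)[0]` and `for r in decode_rows(records)` without checking the return type first.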

@@ -206,8 +206,8 @@ Let's mutate a copy of our tree sequence.
We will apply a binary, infinite-site model to add neutral mutations:

```{code-cell} python
tscopy = msprime.sim_mutations(ts.ts.tables.tree_sequence(), rate=RECRATE / L, model=msprime.BinaryMutationModel(), discrete_genome=False, random_seed=615243)
print("Our original number of mutations =", ts.ts.num_mutations)
tscopy = msprime.sim_mutations(ts.tables.tree_sequence(), rate=RECRATE / L, model=msprime.BinaryMutationModel(), discrete_genome=False, random_seed=615243)
print("Our original number of mutations =", ts.num_mutations)
print("Our new number of mutations =", tscopy.num_mutations)
```

@@ -217,9 +217,9 @@ for i in tscopy.mutations():
if i.metadata is None:
metadata_is_none += 1
assert metadata_is_none == tscopy.num_mutations - ts.ts.num_mutations
assert metadata_is_none == tscopy.num_mutations - ts.num_mutations
# must be true because we asked for an "infinite-sites" mutation scheme
assert metadata_is_none == tscopy.num_sites - ts.ts.num_mutations
assert metadata_is_none == tscopy.num_sites - ts.num_mutations
```
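The two assertions in this cell rest on simple counting. A toy sketch with plain Python values standing in for the two tree sequences; every value and name below is made up for illustration, and it assumes (as the vignette's check implies) that the added mutations carry no metadata and each land on a new site:

```python
# Stand-in data: (position, metadata) pairs.
# Dicts mimic the original mutations; None mimics mutations added afterwards.
original = [(0.10, {"s": 0.01}), (0.50, {"s": -0.02})]
added = [(0.20, None), (0.70, None), (0.90, None)]
combined = original + added

# Count mutations that have no metadata attached.
metadata_is_none = sum(1 for _, md in combined if md is None)

# Under an infinite-sites scheme every mutation sits at a distinct position,
# so the number of sites equals the number of mutations.
num_sites = len({pos for pos, _ in combined})

num_new_mutations = len(combined) - len(original)
num_new_sites = num_sites - len(original)
```

Here both `num_new_mutations` and `num_new_sites` equal `metadata_is_none`, which is the same pair of equalities the vignette asserts on the real data.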


@@ -228,19 +228,19 @@ assert metadata_is_none == tscopy.num_sites - ts.ts.num_mutations
As with mutations, individual metadata automatically decode to {class}`dict`:

```{code-cell} python
print(ts.ts.individual(0).metadata)
print(type(ts.ts.individual(0).metadata))
print(ts.individual(0).metadata)
print(type(ts.individual(0).metadata))
```

It is often more efficient to decode the data into {class}`fwdpy11.tskit_tools.DiploidMetadata` (which is an {mod}`attrs`-based analog to {class}`fwdpy11.DiploidMetadata`).
As with mutation metadata, {func}`fwdpy11.tskit_tools.WrappedTreeSequence.decode_individual_metadata` returns a list:
As with mutation metadata, {func}`fwdpy11.tskit_tools.decode_individual_metadata` returns a list:

```{code-cell} python
ts.decode_individual_metadata(0)
fwdpy11.tskit_tools.decode_individual_metadata(ts, 0)
```

```{code-cell} python
print(type(ts.decode_individual_metadata(0)[0]))
print(type(fwdpy11.tskit_tools.decode_individual_metadata(ts, 0)[0]))
```
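Decoding a metadata `dict` into a typed object amounts to passing the keys as constructor arguments. The sketch below uses the standard library's `dataclasses` as a stand-in for `attrs`; `ToyMetadata` and its three fields are assumptions chosen for illustration and do not reproduce the real `DiploidMetadata` layout:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyMetadata:
    """Hypothetical stand-in for a decoded individual metadata class."""

    g: float  # assumed field: genetic value
    e: float  # assumed field: random/environmental component
    w: float  # assumed field: fitness


# A metadata dict, shaped like what a table row might decode to.
raw = {"g": 0.0, "e": 0.0, "w": 1.0}

# Unpack the dict into keyword arguments to build the typed object.
decoded = ToyMetadata(**raw)
```

Attribute access such as `decoded.w` then replaces string lookups such as `raw["w"]`, and a misspelled or missing key fails at construction time rather than later.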

The main difference between this Python class and its C++ analog is that the former contains several fields that decode the `flags` column of the individual table.
@@ -250,7 +250,7 @@ See {ref}`here <tskit_tools>` for details.
## Traversing all time points for which individuals exist

The example simulation preserves individuals at many different time points.
Use {func}`fwdpy11.tskit_tools.WrappedTreeSequence.timepoints_with_individuals` to automate iterating over each time point:
Use {func}`fwdpy11.tskit_tools.iterate_timepoints_with_individuals` to automate iterating over each time point:

```{code-cell} python
import pandas as pd
@@ -259,7 +259,7 @@ pd.set_option("display.max_rows", 11)
times = []
num_nodes = []
len_metadata = []
for time, nodes, metadata in ts.timepoints_with_individuals(decode_metadata=True):
for time, nodes, metadata in fwdpy11.tskit_tools.iterate_timepoints_with_individuals(ts, decode_metadata=True):
times.append(time)
num_nodes.append(len(nodes))
len_metadata.append(len(metadata))
5 changes: 5 additions & 0 deletions doc/misc/changelog.md
@@ -11,6 +11,11 @@ Bug fixes
Issue {issue}`836`
PR {pr}`838`

Deprecations

* Deprecate `fwdpy11.tskit_tools.WrappedTreeSequence`
PR {pr}`841`

Back end changes:

* Add more runtime checks to {func}`fwdpy11.DiploidPopulation.add_mutation`
11 changes: 11 additions & 0 deletions fwdpy11/_types/diploid_population.py
@@ -1,3 +1,4 @@
import warnings
from typing import IO, Dict, Iterable, Iterator, List, Optional, Tuple, Union

import demes
@@ -276,7 +277,17 @@ def dump_tables_to_tskit(
Fixed bug that could generate a :class:`tskit.PopulationTable`
with an incorrect number of rows.
.. versionchanged:: 0.17.0
The `wrapped` keyword argument is deprecated.
"""
if wrapped is True:
warnings.warn(
FutureWarning(
"the wrapped kwarg is deprecated and will be removed in a future release"
)
)
return fwdpy11.tskit_tools._dump_tables_to_tskit._dump_tables_to_tskit(
self,
model_params=model_params,
68 changes: 68 additions & 0 deletions fwdpy11/tskit_tools/__init__.py
@@ -24,12 +24,74 @@
.. versionadded:: 0.8.0
"""

import typing
import warnings

import numpy as np
import tskit

from ._flags import * # NOQA
from .metadata import (DiploidMetadata, decode_individual_metadata,
decode_mutation_metadata)
from .trees import WrappedTreeSequence


def get_toplevel_metadata(ts: tskit.TreeSequence, name) -> typing.Optional[object]:
    if name in ts.metadata:
        return ts.metadata[name]
    return None


def iterate_timepoints_with_individuals(
    ts: tskit.TreeSequence, *, decode_metadata=False
):
    """
    Return an iterator over all unique time points with individuals.

    :param ts: A tree sequence
    :type ts: tskit.TreeSequence
    :param decode_metadata: Whether to return decoded individual metadata.
    :type decode_metadata: bool

    For each time point a tuple of (time, nodes, metadata) is yielded.

    If `decode_metadata` is `True`, metadata will be stored in
    a :class:`list` of :class:`fwdpy11.tskit_tools.DiploidMetadata`.
    If `False`, `None` will be yielded.
    """

    # Get rows of the node table where the nodes are in individuals
    nodes_in_individuals = np.where(ts.tables.nodes.individual != tskit.NULL)[0]

    # Get the times
    node_times = ts.tables.nodes.time[nodes_in_individuals]

    unique_node_times = np.unique(node_times)

    for utime in unique_node_times[::-1]:
        # Get the node tables rows in individuals at this time
        x = np.where(node_times == utime)
        node_table_rows = nodes_in_individuals[x]
        assert np.all(ts.tables.nodes.time[node_table_rows] == utime)

        # Get the individuals
        individuals = np.unique(ts.tables.nodes.individual[node_table_rows])
        assert not np.any(individuals == tskit.NULL)

        if decode_metadata is True:
            # now, let's decode the individual metadata for this time slice
            decoded_individual_metadata = decode_individual_metadata(
                ts,
                individuals,
            )
        else:
            decoded_individual_metadata = None
        yield utime, node_table_rows, decoded_individual_metadata
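
The grouping logic in `iterate_timepoints_with_individuals` can be followed on a toy node table. The sketch below is a pure-Python stand-in for the NumPy operations above, not the library's implementation; `iterate_timepoints`, the sample columns, and the `NULL = -1` sentinel (assumed to mirror `tskit.NULL`) are illustrative assumptions:

```python
NULL = -1  # assumption: mirrors the tskit.NULL sentinel

# Toy node table columns: one entry per node.
node_time = [0.0, 0.0, 5.0, 5.0, 9.0, 0.0]
node_individual = [0, 0, 1, 1, NULL, 2]


def iterate_timepoints(node_time, node_individual):
    """Yield (time, node rows, individual ids), largest time value first."""
    # Rows of the node table whose nodes belong to an individual.
    in_individuals = [i for i, ind in enumerate(node_individual) if ind != NULL]

    # Unique times among those nodes, visited in descending order.
    unique_times = sorted({node_time[i] for i in in_individuals}, reverse=True)

    for t in unique_times:
        # Node rows at this time point.
        rows = [i for i in in_individuals if node_time[i] == t]
        # Distinct individuals owning those nodes.
        individuals = sorted({node_individual[i] for i in rows})
        yield t, rows, individuals


result = list(iterate_timepoints(node_time, node_individual))
```

Node 4 (time 9.0) has no individual, so its time point never appears in `result`; the remaining nodes are grouped under times 5.0 and 0.0, in that order.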


def load(filename: str):
    """
    Load a tree sequence from a file.
@@ -42,5 +104,11 @@ def load(filename: str):
    """
    import tskit

    warnings.warn(
        FutureWarning(
            "fwdpy11.tskit_tools.load is deprecated. Please use tskit.load instead."
        )
    )

    ts = tskit.load(filename)
    return WrappedTreeSequence(ts=ts)
8 changes: 8 additions & 0 deletions fwdpy11/tskit_tools/metadata.py
@@ -119,6 +119,10 @@ def decode_individual_metadata(
.. versionchanged:: 0.15.0
Add index/slice access to the table.
.. versionchanged:: 0.17.0
Change input from TableCollection to TreeSequence
"""
rv = []

@@ -197,6 +201,10 @@ def decode_mutation_metadata(
some elements to be `None`.
Add index/slice access to the table.
.. versionchanged:: 0.17.0
Change input from TableCollection to TreeSequence
"""
mutations = []
if rows is None:
64 changes: 21 additions & 43 deletions fwdpy11/tskit_tools/trees.py
@@ -18,6 +18,7 @@
#
import json
import typing
import warnings

import numpy as np
import tskit
@@ -38,14 +39,23 @@ class WrappedTreeSequence(object):
or a call to :func:`fwdpy11.tskit_tools.load`.
.. versionadded:: 0.15.0
.. deprecated:: 0.17.0
"""

def _toplevel_metadata_value(self, name):
if name in self._ts.metadata:
return self._ts.metadata[name]
return None
from fwdpy11.tskit_tools import get_toplevel_metadata

return get_toplevel_metadata(self._ts, name)

def __init__(self, ts: tskit.TreeSequence):
warnings.warn(
FutureWarning(
"fwdpy11.tskit_tools.WrappedTreeSequence is deprecated. "
"Please use tskit.load and standalone functions "
"from fwdpy11.tskit_tools instead"
)
)
found = False
for row in ts.provenances():
record = json.loads(row.record)
@@ -58,49 +68,17 @@ def __init__(self, ts: tskit.TreeSequence):
self._ts = ts

def timepoints_with_individuals(self, *, decode_metadata=False):
"""
Return an iterator over all unique time points with individuals.
For each time point a tuple of (time, nodes, metadata) is yielded.
:param decode_individual_metadata: Whether to return decoded metadata.
:type decode_individual_metadata: bool
from fwdpy11.tskit_tools import iterate_timepoints_with_individuals

If `decode_individual_metadata` is `True`, metadata will be stored in
a :class:`list` of :class:`fwdpy11.tskit_tools.DiploidMetadata`.
If `False`, `None` will be yielded.
"""
# Get rows of the node table where the nodes are in individuals
nodes_in_individuals = np.array(
[i for i, j in enumerate(self._ts.nodes()) if j.individual != tskit.NULL]
warnings.warn(
FutureWarning(
"use fwdpy11.tskit_tools.iterate_timepoints_with_individuals instead"
)
)

# Get the times
node_times = np.array([self._ts.node(n).time for n in nodes_in_individuals])

unique_node_times = np.unique(node_times)

for utime in unique_node_times[::-1]:
# Get the node tables rows in individuals at this time
x = np.where(node_times == utime)[0]
node_table_rows = nodes_in_individuals[x]
assert np.all([self._ts.node(n).time == utime for n in node_table_rows])

# Get the individuals
individuals = np.unique(
[self._ts.node(n).individual for n in node_table_rows]
)
assert not np.any(individuals == tskit.NULL)

if decode_metadata is True:
# now, let's decode the individual metadata for this time slice
decoded_individual_metadata = decode_individual_metadata(
self._ts,
individuals,
)
else:
decoded_individual_metadata = None
yield utime, node_table_rows, decoded_individual_metadata
return iterate_timepoints_with_individuals(
self._ts, decode_metadata=decode_metadata
)
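
The pattern used here, a deprecated method that warns and then delegates to a standalone function, can be sketched in isolation. `Wrapper`, `items`, and `iterate_items` are hypothetical names; the sketch also passes the category and `stacklevel=2` to `warnings.warn` so the warning points at the caller, which is a variation on the commit's `warnings.warn(FutureWarning(...))` form:

```python
import warnings


def iterate_items(data, *, reverse=False):
    """Standalone replacement function (hypothetical)."""
    return iter(sorted(data, reverse=reverse))


class Wrapper:
    """Hypothetical deprecated wrapper around some data."""

    def __init__(self, data):
        self._data = data

    def items(self, *, reverse=False):
        # Warn first, then forward all arguments unchanged.
        warnings.warn(
            "Wrapper.items is deprecated; use iterate_items instead",
            FutureWarning,
            stacklevel=2,
        )
        return iterate_items(self._data, reverse=reverse)


# Record warnings so the behaviour can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    values = list(Wrapper([3, 1, 2]).items(reverse=True))
```

Keeping the logic in one standalone function means the deprecated method and its replacement cannot drift apart while both exist.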

def decode_individual_metadata(
self, rows: typing.Optional[typing.Union[int, slice]] = None
14 changes: 11 additions & 3 deletions tests/test_tskit_metadata.py
@@ -17,6 +17,8 @@
# along with fwdpy11. If not, see <http://www.gnu.org/licenses/>.
#

import warnings

import demes
import fwdpy11
import pytest
@@ -117,7 +119,9 @@ def test_user_defined_data(pop):
assert ts.metadata["data"]["mydata"] == 11

# Test WrappedTreeSequence property
wts = fwdpy11.tskit_tools.WrappedTreeSequence(ts)
with warnings.catch_warnings():
warnings.filterwarnings("ignore", category=FutureWarning)
wts = fwdpy11.tskit_tools.WrappedTreeSequence(ts)
assert wts.data["mydata"] == 11

class MyType(object):
@@ -131,7 +135,9 @@ def __repr__(self):
assert eval(ts.metadata["data"]).x == 11

# Test WrappedTreeSequence property
wts = fwdpy11.tskit_tools.WrappedTreeSequence(ts)
with warnings.catch_warnings():
warnings.filterwarnings("ignore", category=FutureWarning)
wts = fwdpy11.tskit_tools.WrappedTreeSequence(ts)
assert eval(wts.data).x == 11
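
The tests in this file silence the new `FutureWarning` with `warnings.catch_warnings()`. A self-contained sketch of that approach, alongside an alternative that records the warning so a test can assert on it; `deprecated_call` is a hypothetical stand-in for constructing the deprecated class:

```python
import warnings


def deprecated_call():
    """Hypothetical function that emits a FutureWarning."""
    warnings.warn("deprecated_call is deprecated", FutureWarning)
    return 11


# Option 1: suppress the warning, as the tests in this commit do.
with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=FutureWarning)
    value = deprecated_call()

# Option 2: record the warning and check that it was emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    deprecated_call()

categories = [w.category for w in caught]
```

In a `pytest` suite, `pytest.warns(FutureWarning)` offers a similar check to option 2; which option fits depends on whether the test is meant to verify the deprecation itself or only the legacy behaviour.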


@@ -141,7 +147,9 @@ def test_seed(pop):
assert ts.metadata["seed"] == 333

# Test WrappedTreeSequence property
wts = fwdpy11.tskit_tools.WrappedTreeSequence(ts)
with warnings.catch_warnings():
warnings.filterwarnings("ignore", category=FutureWarning)
wts = fwdpy11.tskit_tools.WrappedTreeSequence(ts)
assert wts.seed == 333

with pytest.raises(ValueError):