Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,6 +4,10 @@ | |
|
||
- The initial size of a population which emerges from a split from another population is now printed in a population history summary in the R console. ([#6525bf3](https://github.com/bodkan/slendr/commit/6525bf3)) | ||
|
||
- A couple of fixes to support loading, processing, and plotting of "manually" created tree sequences have been implemented (see [this](https://tskit.dev/tutorials/tables_and_editing.html#constructing-a-tree-sequence)). It is not clear how practically useful this is, but it's important to be able to load even "pure" tree sequences which do not come from simulators such as SLiM and msprime. A set of [unit tests](https://github.com/bodkan/slendr/blob/9611437554bbb171f3df6374651acc3d73c63426/tests/testthat/test-manual-ts.R) has been added, making sure that a minimalist nodes & edges table can be loaded, as well as nodes & edges & individuals, plus tables of populations and sites & mutations. PRs with more extensive unit tests and bug reports of tree sequences which fail to load would be appreciated! The code for handling "manually-created" tree sequences with a missing individuals table, missing populations table, etc. seems especially brittle at the moment ([#2f5fc32](https://github.com/bodkan/slendr/commit/2f5fc32)). | ||
|
||
- The `-1` value used in tskit as a missing value indicator is now replaced with the more R-like `NA` in various tree-sequence tables (whether annotated by _slendr_ or taken as they are from tskit itself) ([#2f5fc32](https://github.com/bodkan/slendr/commit/2f5fc32)). | ||
bodkan
Author
Owner
|
||
|
||
# slendr 0.3.0 | ||
|
||
- SLiM 4.0 is now required for running simulations with the `slim()` engine. If you want to run _slendr_ simulations with SLiM (spatial or non-spatial), you will need to upgrade your SLiM installation. SLiM version 3.7.1 is no longer supported, as the upcoming new _slendr_ spatial features will depend on SLiM 4.x and maintaining two functionally identical yet syntactically different back ends is not feasible (PR [#104](https://github.com/bodkan/slendr/pull/104)). | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
import tskit | ||
import numpy | ||
import tempfile | ||
|
||
tables = tskit.TableCollection(sequence_length=1e5) | ||
|
||
node_table = tables.nodes | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE, individual=0, population=0) # node 0 | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE, individual=0, population=0) # node 1 | ||
node_table.add_row(time=3, population=1) # node 2 | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE, individual=1, population=2) # node 3 | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE, individual=1, population=2) # node 4 | ||
node_table.add_row(time=7, population=2) # node 5 | ||
node_table.add_row(time=10, population=1) # node 6 | ||
node_table | ||
|
||
edge_table = tables.edges | ||
edge_table.set_columns( | ||
left=numpy.array([0, 0, 0, 0, 0, 0]), | ||
right=numpy.array([1e5, 1e5, 1e5, 1e5, 1e5, 1e5]), | ||
parent=numpy.array([2, 2, 5, 5, 6, 6], dtype=numpy.int32), | ||
child=numpy.array([0, 1, 3, 4, 2, 5], dtype=numpy.int32) | ||
) | ||
edge_table | ||
|
||
ind_table = tables.individuals | ||
ind_table.add_row() | ||
ind_table.add_row() | ||
ind_table.add_row() | ||
ind_table.add_row() | ||
|
||
pop_table = tables.populations | ||
pop_table.add_row() | ||
pop_table.add_row() | ||
pop_table.add_row() | ||
|
||
sites_table = tables.sites | ||
sites_table.add_row(position=42, ancestral_state="0") | ||
sites_table.add_row(position=123, ancestral_state="0") | ||
|
||
mutations_table = tables.mutations | ||
mutations_table.add_row(site=0, node=2, derived_state="1") | ||
mutations_table.add_row(site=1, node=5, derived_state="1") | ||
|
||
tseq = tables.tree_sequence() | ||
|
||
filename = tempfile.NamedTemporaryFile().name | ||
|
||
tseq.dump(filename) |
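
For readers who want to see what tree the script above encodes without installing tskit, here is a minimal pure-Python sketch. It copies the node times and the parent/child arrays from the script and rebuilds the topology. The variable names (`node_time`, `children`, `roots`) are illustrative only and are not part of tskit or _slendr_. The one rule it checks, that a parent must be strictly older than its child, is a tskit requirement on edge tables.

```python
# Node times as set in the script (add_row defaults to time=0 for the samples).
node_time = [0, 0, 3, 0, 0, 7, 10]

# (parent, child) pairs copied from the edge table columns.
edges = list(zip([2, 2, 5, 5, 6, 6], [0, 1, 3, 4, 2, 5]))

children = {}
for parent, child in edges:
    # tskit requires every parent node to be strictly older than its child.
    assert node_time[parent] > node_time[child]
    children.setdefault(parent, []).append(child)

# A root is a node that never appears as a child of any edge.
has_parent = {child for _, child in edges}
roots = [n for n in range(len(node_time)) if n not in has_parent]

print(children)
print(roots)
```

All four scripts in this commit share this single tree (two cherries joined at node 6); they differ only in which optional tables are filled in.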
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
import tskit | ||
import numpy | ||
import tempfile | ||
|
||
tables = tskit.TableCollection(sequence_length=1e5) | ||
|
||
node_table = tables.nodes | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE, individual=0, population=0) # node 0 | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE, individual=0, population=0) # node 1 | ||
node_table.add_row(time=3, population=1) # node 2 | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE, individual=1, population=2) # node 3 | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE, individual=1, population=2) # node 4 | ||
node_table.add_row(time=7, population=2) # node 5 | ||
node_table.add_row(time=10, population=1) # node 6 | ||
node_table | ||
|
||
edge_table = tables.edges | ||
edge_table.set_columns( | ||
left=numpy.array([0, 0, 0, 0, 0, 0]), | ||
right=numpy.array([1e5, 1e5, 1e5, 1e5, 1e5, 1e5]), | ||
parent=numpy.array([2, 2, 5, 5, 6, 6], dtype=numpy.int32), | ||
child=numpy.array([0, 1, 3, 4, 2, 5], dtype=numpy.int32) | ||
) | ||
edge_table | ||
|
||
ind_table = tables.individuals | ||
ind_table.add_row() | ||
ind_table.add_row() | ||
ind_table.add_row() | ||
ind_table.add_row() | ||
|
||
pop_table = tables.populations | ||
pop_table.add_row() | ||
pop_table.add_row() | ||
pop_table.add_row() | ||
|
||
tseq = tables.tree_sequence() | ||
|
||
filename = tempfile.NamedTemporaryFile().name | ||
|
||
tseq.dump(filename) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
import tskit | ||
import numpy | ||
import tempfile | ||
|
||
tables = tskit.TableCollection(sequence_length=1e5) | ||
|
||
node_table = tables.nodes | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE, individual=0) # node 0 | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE, individual=0) # node 1 | ||
node_table.add_row(time=3) # node 2 | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE, individual=1) # node 3 | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE, individual=1) # node 4 | ||
node_table.add_row(time=7) # node 5 | ||
node_table.add_row(time=10) # node 6 | ||
node_table | ||
|
||
edge_table = tables.edges | ||
edge_table.set_columns( | ||
left=numpy.array([0, 0, 0, 0, 0, 0]), | ||
right=numpy.array([1e5, 1e5, 1e5, 1e5, 1e5, 1e5]), | ||
parent=numpy.array([2, 2, 5, 5, 6, 6], dtype=numpy.int32), | ||
child=numpy.array([0, 1, 3, 4, 2, 5], dtype=numpy.int32) | ||
) | ||
edge_table | ||
|
||
ind_table = tables.individuals | ||
ind_table.add_row() | ||
ind_table.add_row() | ||
ind_table.add_row() | ||
ind_table.add_row() | ||
|
||
tseq = tables.tree_sequence() | ||
|
||
filename = tempfile.NamedTemporaryFile().name | ||
|
||
tseq.dump(filename) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
import tskit | ||
import numpy | ||
import tempfile | ||
|
||
tables = tskit.TableCollection(sequence_length=1e5) | ||
|
||
node_table = tables.nodes | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE) # node 0 | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE) # node 1 | ||
node_table.add_row(time=3) # node 2 | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE) # node 3 | ||
node_table.add_row(flags=tskit.NODE_IS_SAMPLE) # node 4 | ||
node_table.add_row(time=7) # node 5 | ||
node_table.add_row(time=10) # node 6 | ||
node_table | ||
|
||
edge_table = tables.edges | ||
edge_table.set_columns( | ||
left=numpy.array([0, 0, 0, 0, 0, 0]), | ||
right=numpy.array([1e5, 1e5, 1e5, 1e5, 1e5, 1e5]), | ||
parent=numpy.array([2, 2, 5, 5, 6, 6], dtype=numpy.int32), | ||
child=numpy.array([0, 1, 3, 4, 2, 5], dtype=numpy.int32) | ||
) | ||
edge_table | ||
|
||
tseq = tables.tree_sequence() | ||
|
||
filename = tempfile.NamedTemporaryFile().name | ||
|
||
tseq.dump(filename) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
test_that("minimal tree sequence (nodes+edges) is correctly loaded", { | ||
reticulate::py_run_file("manual_ts_nodes+edges.py") | ||
|
||
path <- reticulate::py$filename | ||
ts <- ts_load(path) | ||
|
||
# ts_tree(ts, 1) %>% ts_draw() | ||
|
||
edges <- ts_table(ts, "edges") | ||
expect_true(all(edges$child == c(0, 1, 3, 4, 2, 5))) | ||
expect_true(all(edges$parent == c(2, 2, 5, 5, 6, 6))) | ||
|
||
expect_true(all(ts_table(ts, "nodes")$node_id == seq(0, 6))) | ||
expect_true(all(ts_table(ts, "nodes")$time_tskit == c(0, 0, 3, 0, 0, 7, 10))) | ||
|
||
expect_true(all(is.na(ts_table(ts, "nodes")$pop_id))) | ||
expect_true(all(is.na(ts_nodes(ts)$pop_id))) | ||
|
||
# check the annotated nodes table | ||
expect_true(all(ts_nodes(ts)$node_id == seq(0, 6))) | ||
expect_true(all(ts_nodes(ts)$time_tskit == c(0, 0, 3, 0, 0, 7, 10))) | ||
|
||
# check the annotated edges table | ||
expect_true(all(ts_edges(ts)$child_node_id == c(0, 1, 2, 3, 4, 5))) | ||
expect_true(all(ts_edges(ts)$parent_node_id == c(2, 2, 6, 5, 5, 6))) | ||
}) | ||
|
||
test_that("minimal tree sequence (nodes+edges+inds) is correctly loaded", { | ||
reticulate::py_run_file("manual_ts_nodes+edges+inds.py") | ||
|
||
path <- reticulate::py$filename | ||
ts <- ts_load(path) | ||
|
||
# ts_tree(ts, 1) %>% ts_draw() | ||
|
||
edges <- ts_table(ts, "edges") | ||
expect_true(all(edges$child == c(0, 1, 3, 4, 2, 5))) | ||
expect_true(all(edges$parent == c(2, 2, 5, 5, 6, 6))) | ||
|
||
expect_true(all(ts_table(ts, "nodes")$node_id == seq(0, 6))) | ||
expect_true(all(ts_table(ts, "nodes")$time_tskit == c(0, 0, 3, 0, 0, 7, 10))) | ||
|
||
expect_true(all(is.na(ts_table(ts, "nodes")$pop_id))) | ||
expect_true(all(is.na(ts_nodes(ts)$pop_id))) | ||
|
||
# check the annotated nodes table | ||
expect_true(all(ts_nodes(ts)$node_id == c(0, 1, 3, 4, 2, 5, 6))) | ||
expect_true(all(ts_nodes(ts)$time_tskit == c(0, 0, 0, 0, 3, 7, 10))) | ||
|
||
# check the annotated edges table | ||
expect_true(all(ts_edges(ts)$child_node_id == c(0, 1, 3, 4, 2, 5))) | ||
expect_true(all(ts_edges(ts)$parent_node_id == c(2, 2, 5, 5, 6, 6))) | ||
}) | ||
|
||
test_that("minimal tree sequence (nodes+edges+inds+pops) is correctly loaded", { | ||
reticulate::py_run_file("manual_ts_nodes+edges+inds+pops.py") | ||
|
||
path <- reticulate::py$filename | ||
ts <- ts_load(path) | ||
|
||
# ts_tree(ts, 1) %>% ts_draw() | ||
|
||
edges <- ts_table(ts, "edges") | ||
expect_true(all(edges$child == c(0, 1, 3, 4, 2, 5))) | ||
expect_true(all(edges$parent == c(2, 2, 5, 5, 6, 6))) | ||
|
||
expect_true(all(ts_table(ts, "nodes")$node_id == seq(0, 6))) | ||
expect_true(all(ts_table(ts, "nodes")$time_tskit == c(0, 0, 3, 0, 0, 7, 10))) | ||
|
||
# check the annotated nodes table | ||
expect_true(all(ts_nodes(ts)$node_id == c(0, 1, 3, 4, 2, 5, 6))) | ||
expect_true(all(ts_nodes(ts)$time_tskit == c(0, 0, 0, 0, 3, 7, 10))) | ||
expect_true(all(ts_nodes(ts)$pop_id == c(0, 0, 2, 2, 1, 2, 1))) | ||
|
||
# check the annotated edges table | ||
expect_true(all(ts_edges(ts)$child_node_id == c(0, 1, 3, 4, 2, 5))) | ||
expect_true(all(ts_edges(ts)$parent_node_id == c(2, 2, 5, 5, 6, 6))) | ||
expect_true(all(ts_edges(ts)$child_pop == c(0, 0, 2, 2, 1, 2))) | ||
expect_true(all(ts_edges(ts)$parent_pop == c(1, 1, 2, 2, 1, 1))) | ||
}) | ||
|
||
test_that("minimal tree sequence (nodes+edges+inds+pops+muts) is correctly loaded", { | ||
reticulate::py_run_file("manual_ts_nodes+edges+inds+pops+muts.py") | ||
|
||
path <- reticulate::py$filename | ||
ts <- ts_load(path) | ||
|
||
# ts_tree(ts, 1) %>% ts_draw() | ||
|
||
edges <- ts_table(ts, "edges") | ||
expect_true(all(edges$child == c(0, 1, 3, 4, 2, 5))) | ||
expect_true(all(edges$parent == c(2, 2, 5, 5, 6, 6))) | ||
|
||
expect_true(all(ts_table(ts, "nodes")$node_id == seq(0, 6))) | ||
expect_true(all(ts_table(ts, "nodes")$time_tskit == c(0, 0, 3, 0, 0, 7, 10))) | ||
|
||
expect_true(all(ts_table(ts, "mutations")$node == c(2, 5))) | ||
|
||
# check the annotated nodes table | ||
expect_true(all(ts_nodes(ts)$node_id == c(0, 1, 3, 4, 2, 5, 6))) | ||
expect_true(all(ts_nodes(ts)$time_tskit == c(0, 0, 0, 0, 3, 7, 10))) | ||
expect_true(all(ts_nodes(ts)$pop_id == c(0, 0, 2, 2, 1, 2, 1))) | ||
|
||
# check the annotated edges table | ||
expect_true(all(ts_edges(ts)$child_node_id == c(0, 1, 3, 4, 2, 5))) | ||
expect_true(all(ts_edges(ts)$parent_node_id == c(2, 2, 5, 5, 6, 6))) | ||
expect_true(all(ts_edges(ts)$child_pop == c(0, 0, 2, 2, 1, 2))) | ||
expect_true(all(ts_edges(ts)$parent_pop == c(1, 1, 2, 2, 1, 1))) | ||
}) |
Wow, hmm, interesting. Might surprise those who are used to tskit; be sure to document this thoroughly. Also: do you then convert from NA back to -1 if the user in some way sends the tree sequence back to tskit-land again? And are you doing this everywhere, throughout all the (visible) tskit data structures? If it's -1 in some places and NA in others, that seems like it could be quite confusing. It kind of makes sense to R-ify things like this, I guess, but I hope you're being super careful with this!
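
To make the round-trip concern concrete, here is a small sketch of what a lossless conversion could look like. It is written in plain Python, with `None` standing in for R's `NA`. The helper names `null_to_missing` and `missing_to_null` are hypothetical and are not _slendr_ or tskit functions. The only tskit fact assumed is that the missing-ID sentinel (`tskit.NULL`) is `-1`.

```python
NULL = -1  # same value as tskit.NULL, the sentinel for a missing ID


def null_to_missing(ids):
    """Replace the -1 sentinel with None (standing in for R's NA)."""
    return [None if i == NULL else i for i in ids]


def missing_to_null(ids):
    """Inverse conversion, for handing ID columns back to tskit."""
    return [NULL if i is None else i for i in ids]


# 'individual' column of the nodes table in the nodes+edges+inds test script:
# sample nodes point at individuals 0 and 1, internal nodes have no individual.
individual = [0, 0, NULL, 1, 1, NULL, NULL]

r_side = null_to_missing(individual)
back_in_tskit = missing_to_null(r_side)

print(r_side)
print(back_in_tskit == individual)
```

As long as both directions are applied consistently to every ID column, the values going back to tskit are identical to the ones that came out, which is the property worth pinning down in a unit test.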