Fix saving / loading of tree sequences after simplification to a subset of sampled individuals #137

bodkan · 2023-05-08T15:00:06Z

An attempt at fixing #136.

codecov · 2023-05-08T17:14:14Z

Codecov Report

Merging #137 (6c3a7e8) into main (01af510) will increase coverage by 0.30%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #137      +/-   ##
==========================================
+ Coverage   84.04%   84.35%   +0.30%     
==========================================
  Files           7        7              
  Lines        3109     3144      +35     
==========================================
+ Hits         2613     2652      +39     
+ Misses        496      492       -4

| Impacted Files | Coverage Δ | |
|---|---|---|
| R/interface.R | `80.36% <100.00%> (+0.06%)` | ⬆️ |
| R/tree-sequences.R | `88.39% <100.00%> (+0.61%)` | ⬆️ |
| R/utils.R | `88.37% <100.00%> (+0.28%)` | ⬆️ |

bodkan · 2023-05-11T11:42:08Z

Merging now. From the changelog:

Fix for ts_load() failing to load slendr-produced tree sequences after they were simplified down to a smaller set of sampled individuals (reported in #136). The issue was caused by a size mismatch between the sampling table (which always keeps the form used during simulation) and the table of individuals stored in the tree sequence after simplification (which may contain fewer individuals than the original sampling table). To fix this, slendr tree sequence objects now track which individuals are regarded as "samples" (i.e. those with symbolic names). This information is maintained through simplification, serialization and loading, and is used by slendr's internal machinery during join operations. (PR #137)
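
For readers who want to see the shape of the problem, here is a toy sketch of the mismatch described in the changelog entry. It is not slendr's code: slendr is an R package, and the table layouts, sample names and the helpers `join_by_position` and `join_by_name` below are all hypothetical, written in plain Python purely for illustration. The sketch assumes that each individual kept after simplification still carries the symbolic name it was sampled under, which is the kind of information the fix tracks.

```python
# Toy illustration only: plain Python stand-ins, not slendr's data structures.

# Sampling schedule as recorded at simulation time (hypothetical names and values).
sampling_table = [
    {"name": "AFR_1", "pop": "AFR", "time": 0},
    {"name": "AFR_2", "pop": "AFR", "time": 0},
    {"name": "EUR_1", "pop": "EUR", "time": 0},
    {"name": "EUR_2", "pop": "EUR", "time": 0},
]

# Individuals left in the tree sequence after simplifying to two samples.
# Each row carries the symbolic name it was sampled under (an assumption here).
individuals_after_simplify = [
    {"ind_id": 0, "name": "AFR_2"},
    {"ind_id": 1, "name": "EUR_1"},
]


def join_by_position(samples, individuals):
    """Pair rows by index; only valid when both tables have the same length."""
    if len(samples) != len(individuals):
        raise ValueError(
            f"size mismatch: {len(samples)} sampled vs {len(individuals)} stored"
        )
    return [{**s, **i} for s, i in zip(samples, individuals)]


def join_by_name(samples, individuals):
    """Pair rows on the tracked sample name, skipping samples that were dropped."""
    by_name = {s["name"]: s for s in samples}
    return [{**by_name[i["name"]], **i} for i in individuals]


# The positional join breaks as soon as simplification removes individuals.
try:
    join_by_position(sampling_table, individuals_after_simplify)
except ValueError as err:
    print("positional join failed:", err)

# Joining on the tracked names still works on the reduced table.
joined = join_by_name(sampling_table, individuals_after_simplify)
print([row["name"] for row in joined])
```

Keying the join on an identifier that survives simplification, rather than relying on the two tables having the same number of rows, is the general idea; how slendr stores and propagates that identifier internally is not shown here.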

bodkan · 2023-05-11T11:45:25Z

I have a feeling there's a more elegant solution to this. It's more and more clear that a Great Internal Refactoring will have to happen eventually. The fact that slendr started as a purely SLiM-focused thing, had tskit support bolted on afterwards, and only later gained support for msprime coalescent models is... starting to show.

In the meantime, there's an acute need for Real Research(tm) to be done, so an emergency fix works just fine.

bodkan added 5 commits May 8, 2023 10:32

Improve robustness of sampling schedules

b61d1be

Keep track of names of sampled individuals at each point

02477cc

Only test for subsetted simplification result for slendr outputs

5349115

Add pure SLiM/msprime simplification tests

7c988d5

Fix issue with single pop sampling at multiple time points

b7188de

bodkan added 7 commits May 9, 2023 18:46

Track information about sampled individuals' pedigree IDs through sim…

376b58f

…plification

Add a comment

0c0bde1

Add condition to unit test

140b67e

Update bundled example model data

e50300f

Replace print.slendr_nodes with summary.slendr_nodes

d4c0984

Update NEWS

7392377

Update website

6c3a7e8

bodkan merged commit f2c61f1 into main May 11, 2023
8 checks passed

bodkan deleted the fix-saving-simplified-ts branch June 1, 2023 08:38
