@benjeffery
Member

@benjeffery benjeffery commented Nov 11, 2020

Description

Adds dump and load to TableCollection.
Fixes #14

PR Checklist:

  • Tests that fully cover new/changed functionality.
  • Documentation including tutorial content if appropriate.
  • Changelogs, if there are API changes.

@AdminBot-tskit
Collaborator

📖 Docs for this PR can be previewed here

@codecov

codecov bot commented Nov 11, 2020

Codecov Report

Merging #986 (9400ee7) into main (24861e9) will decrease coverage by 0.06%.
The diff coverage is 80.58%.

@@            Coverage Diff             @@
##             main     #986      +/-   ##
==========================================
- Coverage   93.71%   93.64%   -0.07%     
==========================================
  Files          26       26              
  Lines       20622    20680      +58     
  Branches      836      835       -1     
==========================================
+ Hits        19326    19366      +40     
- Misses       1259     1277      +18     
  Partials       37       37              
Flag              Coverage Δ
c-tests           92.45% <ø> (ø)
lwt-tests         93.57% <ø> (ø)
python-c-tests    94.82% <80.58%> (-0.15%) ⬇️
python-tests      98.56% <97.87%> (+0.02%) ⬆️

Impacted Files            Coverage Δ
python/_tskitmodule.c     91.29% <66.07%> (-0.30%) ⬇️
python/tskit/tables.py    99.58% <92.30%> (-0.14%) ⬇️
python/tskit/trees.py     97.35% <100.00%> (+0.07%) ⬆️
python/tskit/util.py      100.00% <100.00%> (ø)

@benjeffery benjeffery force-pushed the tc-dump branch 4 times, most recently from 606db0a to 2978a9a Compare November 11, 2020 16:05
@benjeffery benjeffery marked this pull request as ready for review November 11, 2020 16:08
@benjeffery
Member Author

Codecov isn't happy due to lines in _tskitmodule.c which it says are not covered. When I set a breakpoint on those lines it gets hit though! Not sure what is happening there.

Member

@jeromekelleher jeromekelleher left a comment

LGTM, but a few comments above.

tc._ll_tables = ll_tc
return tc

def dump(self, file_or_path, skip_checks=False):
Member

I'm not sure we want the skip_checks argument here as it might have some unwanted effects, and might be expensive. I think we should just write out the state as it is.

If we do have the checks, we should probably implement them at the C level using the check integrity function, rather than just calling self.tree_sequence().

Member Author

Removed in 82edb1f
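
For readers following this thread, here is a minimal, stdlib-only sketch of the design that was settled on: `dump` writes the current state verbatim, and validation is a separate step the caller opts into. `ToyTables`, its JSON encoding, and `check_integrity` are hypothetical stand-ins for illustration; they are not tskit's classes, file format, or C-level integrity check.

```python
import io
import json
import os


class ToyTables:
    """Hypothetical stand-in for a table collection; not tskit's implementation."""

    def __init__(self, sequence_length, rows=None):
        self.sequence_length = sequence_length
        self.rows = list(rows or [])

    def check_integrity(self):
        # Validation is a separate, explicit step that callers opt into.
        if self.sequence_length <= 0:
            raise ValueError("sequence_length must be positive")

    def dump(self, file_or_path):
        # Write the current state verbatim: no validation, no side effects.
        payload = json.dumps(
            {"sequence_length": self.sequence_length, "rows": self.rows}
        ).encode("utf-8")
        if isinstance(file_or_path, (str, os.PathLike)):
            with open(file_or_path, "wb") as f:
                f.write(payload)
        else:
            # Assumption: anything else is an already-open binary file object.
            file_or_path.write(payload)

    @classmethod
    def load(cls, file_or_path):
        if isinstance(file_or_path, (str, os.PathLike)):
            with open(file_or_path, "rb") as f:
                raw = f.read()
        else:
            raw = file_or_path.read()
        data = json.loads(raw.decode("utf-8"))
        return cls(data["sequence_length"], data["rows"])


# An invalid state (negative length) still round-trips, because dump() does not check it.
buffer = io.BytesIO()
ToyTables(-1.0, [[0, 1.5]]).dump(buffer)
buffer.seek(0)
restored = ToyTables.load(buffer)
```

Keeping checks out of `dump` means that saving intermediate, not-yet-valid state is cheap and has no side effects; callers who want validation can run it explicitly before writing.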

@jeromekelleher
Member

Codecov has definitely gone a bit mad on the TableCollection_alloc coverage - that function must be being called. So, we can ignore codecov here I think.

@benjeffery
Member Author

@jeromekelleher Fixed up. It still needs a squash, as I left several commits in so the changes are easy to view.

Member

@jeromekelleher jeromekelleher left a comment

Looks great, thanks @benjeffery! There's one potential cleanup, but merge away whenever.

with open(tmp_path / "temp.trees", "rb") as f:
    ts2 = _tskit.TreeSequence()
    ts2.load(f)
assert ts.get_num_samples() == ts2.get_num_samples()
Member

I did all this stuff long-hand before because we didn't have the table collection equality operator - we could just replace it with the standard ts.tables == ts2.tables now.

Member Author

As this is the low-level ts I had to go with:

            tc = _tskit.TableCollection(ts.get_sequence_length())
            ts.dump_tables(tc)
            tc2 = _tskit.TableCollection(ts2.get_sequence_length())
            ts2.dump_tables(tc2)
            assert tc.equals(tc2)
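
The cleanup suggested in this thread, replacing field-by-field assertions with one whole-object equality check after a dump/load round trip, can be illustrated with a toy example. `ToyCollection` and `round_trip` are hypothetical and use only the standard library; they stand in for the low-level objects above and do not reflect tskit's API or file format.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class ToyCollection:
    # Hypothetical stand-in; the generated __eq__ compares every field.
    sequence_length: float
    node_times: list = field(default_factory=list)
    edges: list = field(default_factory=list)


def round_trip(obj, directory):
    # Write the object to a temporary file, then read it back as a new object.
    path = Path(directory) / "temp.json"
    with open(path, "w") as f:
        json.dump(asdict(obj), f)
    with open(path) as f:
        return ToyCollection(**json.load(f))


original = ToyCollection(10.0, [0.0, 0.0, 1.0], [[0.0, 10.0, 2, 0], [0.0, 10.0, 2, 1]])
with tempfile.TemporaryDirectory() as tmp:
    restored = round_trip(original, tmp)

# Long-hand: one assertion per field, which is easy to leave incomplete.
assert original.sequence_length == restored.sequence_length
assert original.node_times == restored.node_times

# A single equality check covers every field at once.
assert original == restored
```

The single comparison keeps working if a field is added later, whereas the long-hand version silently skips anything it does not list.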

@mergify mergify bot merged commit f250abd into tskit-dev:main Nov 12, 2020
@benjeffery benjeffery deleted the tc-dump branch November 12, 2020 15:03
@petrelharp
Contributor

yay! this will simplify some other code!

Development

Successfully merging this pull request may close these issues.

Provide TableCollection.dump( ) method

4 participants