-
Notifications
You must be signed in to change notification settings - Fork 79
Add nbytes to Table and TableCollection. #871
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
|
@jeromekelleher Are you happy for me to pick this up and push commits to this PR? |
|
Yes please @benjeffery, that would be great, thanks. I think we can define the nbytes as "the total number of bytes required for the values in the dict encoding". I don't think there's any point in including the keys. Therefore, the nbytes for a TreeSequence is the same as its underlying TableCollection. |
5d4aa1c to
40d4ec9
Compare
|
📖 Docs for this PR can be previewed here |
Codecov Report
@@ Coverage Diff @@
## main #871 +/- ##
=======================================
Coverage 93.64% 93.65%
=======================================
Files 26 26
Lines 20680 20696 +16
Branches 835 838 +3
=======================================
+ Hits 19366 19382 +16
Misses 1277 1277
Partials 37 37
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
|
@jeromekelleher Think I'm happy with this now. It has a few test cleanups included that I came across. Still needs a squash. |
benjeffery
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving as it is your PR and mergify needs a non-owner approval.
jeromekelleher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! See comment above about pytest patterns, though
python/CHANGELOG.rst
Outdated
|
|
||
| - Added ``nbytes`` method to tables, ``TableCollection`` and ``TreeSequence`` which | ||
| reports the size in bytes of those objects. | ||
| (:user:`jeromekelleher`, :issue:`54`, :pr:`871`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should add your name here too
python/tests/test_highlevel.py
Outdated
| left, right = st1.get_interval() | ||
| breakpoints.append(right) | ||
| self.assertAlmostEqual(left, length) | ||
| assert left == approx(length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a good look at this, and I think it'd be better if we used the pattern x == pytest.approx(y). We're using pytest. for a bunch of other things, and I don't think there's a good reason for making an exception here. It's more obvious to me what it's actually doing as well, when I see x == pytest.approx(y) in isolation (i.e., if I was coming in fresh to this code without reading the imports)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, looks like something local. Fixed.
1484d9e to
42db20e
Compare
42db20e to
43785e7
Compare
|
Fixed up, rebased and squashed. |
It's pretty useful to have a summary of the number of bytes used by a table collection. This uses a definition based on the total bytes used to encode the data, which seems like as good a definition as any. Trying to capture the actual memory used seems like a fiddly waste of time.
However, this currently doesn't work because of a few limitations:
Closes #54