Do not use kastore memory for columns. #530
This looks great, so much simpler to use. I've installed SLiM and confirmed that @petrelharp's example now works for this branch, and I understand how zero edges was happening before (think I should get a |
|
Oh!! Thanks for the reminder, I think the code I am working on needs a |
|
Awesome, thanks @benjeffery! I'll finish this up first thing tomorrow morning. |
Codecov Report
@@            Coverage Diff             @@
##           master     #530      +/-   ##
==========================================
+ Coverage   86.84%   87.32%   +0.48%
==========================================
  Files          21       21
  Lines       15996    16014      +18
  Branches     3109     3117       +8
==========================================
+ Hits        13891    13984      +93
+ Misses       1055     1000      -55
+ Partials     1050     1030      -20
|
Force-pushed from 06aa379 to f46e920.
|
Some more updates here which should push the test coverage in C right up, as well as put some infrastructure in place for making all the "x, x_offset" columns optional. I still need to add some tests for handling the input of bad offsets, but it'll be ready for final review and merge then. |
|
Great, I think making the "both columns" error explicit is a good choice. Speaking of coverage - is it worth getting the coverage to ignore some of the lines that we can't easily hit like OOMs? I'd like to enforce 100% coverage then. |
If there's an easy way to do this, then great. I'm wary about enforcing 100% test coverage as it leads to weird code and tests sometimes, but having more insight into the catchable errors would be good. It is possible to test all the OOM corner cases (I've done it for kastore) but it's pretty tedious. |
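As a rough illustration of the "both columns" error mentioned above, the check for an "x, x_offset" pair might look something like the sketch below. This assumes the rule is that a ragged column and its offset column must be supplied together or not at all; `ex_check_ragged_pair` and `EX_ERR_BOTH_COLUMNS_REQUIRED` are hypothetical names for illustration, not the tskit API.

```c
#include <stddef.h>

/* Hypothetical error code, for illustration only. */
#define EX_ERR_BOTH_COLUMNS_REQUIRED (-2)

/* Returns 0 if the data and offset columns are both present or both
 * absent, and an explicit error if only one of the pair was supplied. */
int
ex_check_ragged_pair(const char *data, const unsigned long *offset)
{
    if ((data == NULL) != (offset == NULL)) {
        return EX_ERR_BOTH_COLUMNS_REQUIRED;
    }
    return 0;
}
```

Returning a dedicated code here, rather than a generic "bad column" error, is what makes the failure easy to catch and test for.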
Force-pushed from 8582b93 to 39d4164.
Force-pushed from d2daf56 to c242f31.
|
OK, I think this is ready to go. Can you have a look over please @benjeffery? We probably won't be able to hit the test coverage threshold as there were a few incidental changes that have a lot of uncatchable errors, but I think we're covering the file format stuff pretty well now. I hope you don't mind me squashing your "fixup!" commits, but I figured you wouldn't since they were marked as fixups. |
benjeffery
left a comment
Awesome! Just a couple of questions/suggestions.
    if (ret != 0) {
        goto out;
    }
These lines have no effect. Is their aim to prevent an unchecked ret if code is added after them at a later date? The pattern in the rest of the file is not to have these, but I can see an argument for them.
I'm glad you brought this up! I was thinking about this, and I think we should probably have explicit ret != 0 checks after every call, even if it's at the end of the function. There are two good reasons for this:
- As you say, if code gets added later we won't accidentally skip the check (and it's really easy to miss this in review, as the diffs often won't show enough context to spot the problem)
- Also it helps us with figuring out where test coverage is missing. If we're testing error conditions on every other call except the last one, we could miss testing some important error conditions.
So, even though it's obviously stupid, I think it's the right thing to do. It might result in a tiny bit more code (but maybe not: the compiler might spot that it's redundant), but it's not important. Since you're in favour, I'm going to loop back and make the change on all the code I've just updated now.
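A minimal sketch of the pattern being discussed, using hypothetical helpers (`ex_check_positive`, `ex_validate_pair`) rather than anything from tables.c. The check after the final call is redundant today, but it stays in place if more calls are added later and gives coverage tools a distinct branch for that error path.

```c
/* Hypothetical error code and helpers, for illustration only. */
#define EX_ERR_BAD_VALUE (-1)

static int
ex_check_positive(int value)
{
    return value > 0 ? 0 : EX_ERR_BAD_VALUE;
}

int
ex_validate_pair(int a, int b)
{
    int ret = 0;

    ret = ex_check_positive(a);
    if (ret != 0) {
        goto out;
    }
    ret = ex_check_positive(b);
    /* Redundant as the last call in the function, but kept so that a
     * later addition below cannot silently skip the check. */
    if (ret != 0) {
        goto out;
    }
out:
    return ret;
}
```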
c/tskit/tables.c
Outdated
    /* Any read errors will have already happened so we ignore any errors here. */
    kastore_close(&store);
    return ret;
I don't quite get this - in the case where there have been no read errors we're ignoring errors from kastore_close, which can still return errors, even when the mode is KAS_READ.
Hmm, yes. You're right, I'll update.
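One way the fix could look, sketched with a hypothetical stand-in for the store (`ex_store_t` and its functions are not the kastore API): always close, keep the earlier read error if there was one, and otherwise report the error from close.

```c
/* Hypothetical stand-in for a store handle; not the kastore API. */
typedef struct {
    int read_error;  /* value returned by ex_store_read */
    int close_error; /* value returned by ex_store_close */
} ex_store_t;

static int
ex_store_read(ex_store_t *store)
{
    return store->read_error;
}

static int
ex_store_close(ex_store_t *store)
{
    return store->close_error;
}

int
ex_load(ex_store_t *store)
{
    int ret = ex_store_read(store);
    /* Always close, even after a failed read, so resources are freed. */
    int close_ret = ex_store_close(store);

    /* The first error wins; a close error is only reported if the
     * read itself succeeded. */
    if (ret == 0) {
        ret = close_ret;
    }
    return ret;
}
```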
  int TSK_WARN_UNUSED
- tsk_table_collection_init(tsk_table_collection_t *self, tsk_flags_t options)
+ tsk_table_collection_init(tsk_table_collection_t *self, tsk_flags_t TSK_UNUSED(options))
Is the method signature kept the same for backwards compatibility? Or in case we need flags in future?
Both maybe, but more the latter.
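For readers unfamiliar with the idiom, here is a sketch of how an unused-parameter macro of this kind can keep a signature stable while silencing compiler warnings. `EX_UNUSED`, `ex_flags_t` and `ex_collection_init` are hypothetical stand-ins, and the macro definition is an assumption about how such a macro is commonly written, not a copy of tskit's.

```c
#include <stdint.h>

/* Renames the parameter and, on GCC-compatible compilers, marks it
 * unused so that -Wunused-parameter stays quiet. */
#ifdef __GNUC__
#define EX_UNUSED(x) EX_UNUSED_##x __attribute__((__unused__))
#else
#define EX_UNUSED(x) EX_UNUSED_##x
#endif

typedef uint32_t ex_flags_t;

typedef struct {
    int initialised;
} ex_collection_t;

/* The options argument is reserved for future flags; callers pass 0. */
int
ex_collection_init(ex_collection_t *self, ex_flags_t EX_UNUSED(options))
{
    self->initialised = 1;
    return 0;
}
```

Renaming the parameter also means any accidental use of `options` inside the body fails to compile, which flags the moment the macro should be removed.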
Closes tskit-dev#536 Closes tskit-dev#528 Closes tskit-dev#527 Closes tskit-dev#506
Force-pushed from d08b9ff to d482151.
-         ret = TSK_REQUIRED_COL_NOT_FOUND;
+         ret = TSK_ERR_REQUIRED_COL_NOT_FOUND;
          goto out;
      } else {
@benjeffery, I changed the semantics here slightly because we were getting inconsistent behaviour when the first column in a list was optional (so, everything else was expected to have 0 rows). Since we're already setting default values for the incoming columns, it seemed best to just use those values as the indicator and not to use the lengths at all. That's why we're now checking to see if metadata schema was set, rather than passing NULL, -1 through to set_metadata_schema.
I don't like the read_table_cols function though, it's too tricky. I think #537 will make things a lot simpler.
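The idea of using the default values as the "was this column present?" indicator, rather than comparing lengths, can be sketched roughly as follows. The struct and function names are hypothetical and this is not how read_table_cols is actually written; it only assumes that each destination pointer starts as NULL and is set when the column is found in the file.

```c
#include <stddef.h>

/* Hypothetical error code, for illustration only. */
#define EX_ERR_REQUIRED_COL_NOT_FOUND (-1)

typedef struct {
    const char *name;
    const char *data; /* default NULL; set only if the column is in the file */
    int required;
} ex_col_t;

/* A required column that still holds its default was not in the file. */
int
ex_check_required(const ex_col_t *cols, size_t num_cols)
{
    int ret = 0;
    size_t j;

    for (j = 0; j < num_cols; j++) {
        if (cols[j].data == NULL && cols[j].required) {
            ret = EX_ERR_REQUIRED_COL_NOT_FOUND;
            goto out;
        }
    }
out:
    return ret;
}

/* Optional columns are tested the same way, so the caller only sets a
 * metadata schema when one was actually read. */
int
ex_schema_was_set(const ex_col_t *schema_col)
{
    return schema_col->data != NULL;
}
```

Because no lengths are compared, the result does not depend on which column happens to come first in the list.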
Slightly changes the error semantics when loading in malformed files to be more consistent.
Force-pushed from d482151 to e63b9ae.
|
Fantastic! :yay: |
Seemed simplest to tackle this head on, as it's complicating quite a lot of what we're doing. It should close #527 (there's a test in here to verify it), but we really should get some files in from earlier file minor versions to make sure we're doing the right thing.
Closes #528
Closes #527
Closes #506
Not ready for merging yet though, as it needs some more C test coverage and there's something weird going on in one of the tests that needs figuring out.