Skip to content

Conversation

@benjeffery
Copy link
Member

Closes #585

I've got far enough with this that feedback would be useful. The big decision is to add options to tsk_edge_table_t. This seems cleaner and more preferable to using metadata==NULL to tell if the table has metadata.

Currently when an edge table with TSK_NO_METADATA is dumped the metadata columns are absent from the kastore. This is fine as those are optional. When the table is re-loaded, however metadata will be re-enabled. If we stored options in the kastore we could load a table with metadata disabled.

I've also opted to error if you try to set metadata on the table.

There are probably a few more tests needed. I've started work on #708, but in doing that realised it would be nice to have this done first.

@benjeffery benjeffery requested a review from jeromekelleher July 9, 2020 11:26
@codecov
Copy link

codecov bot commented Jul 9, 2020

Codecov Report

Merging #712 into master will increase coverage by 11.47%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           master     #712       +/-   ##
===========================================
+ Coverage   87.46%   98.93%   +11.47%     
===========================================
  Files          24       15        -9     
  Lines       18162     4240    -13922     
  Branches     3609      750     -2859     
===========================================
- Hits        15886     4195    -11691     
+ Misses       1099       17     -1082     
+ Partials     1177       28     -1149     
Flag Coverage Δ
#c_tests 98.93% <ø> (+9.92%) ⬆️
#python_c_tests ?
#python_tests 99.00% <ø> (?)
Impacted Files Coverage Δ
c/tskit/genotypes.c
python/_tskitmodule.c
c/tskit/convert.c
c/tskit/haplotype_matching.c
c/tskit/stats.c

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bfee79d...c8a54d3. Read the comment docs.

Copy link
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM so far, some minor comments.

ret = tsk_table_collection_init(&tables, TSK_NO_EDGE_METADATA);
CU_ASSERT_EQUAL_FATAL(ret, 0);
tables.sequence_length = 10;
tsk_edge_table_add_row(&tables.edges, 0, 0, 1, 1, "metadata", 8);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're missing a ret check here, aren't we? This call should fail, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c/tskit/tables.c Outdated
(void **) &self->metadata_offset, new_size + 1, sizeof(tsk_size_t));
if (ret != 0) {
goto out;
if (!(self->options & TSK_NO_METADATA)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll probably be doing this a lot, so maybe worth a static function, tsk_edge_table_has_metadata(self)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c/tskit/tables.c Outdated

if (!(options & TSK_NO_INIT)) {
ret = tsk_edge_table_init(dest, 0);
ret = tsk_edge_table_init(dest, self->options);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious to me we want to do this. Perhaps the table options should from the options parameter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this to avoid the case of turning metadata back on for a table that was init'd with it off. I've added an error for that case so that now you can disable metadata on a copy. 071d93b

c/tskit/tables.c Outdated
self->metadata + self->metadata_offset[j]);
if (err < 0) {
goto out;
if (self->options & TSK_NO_METADATA) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be better if we use ``edges_table_get_row here and print out the properties of the returned edge object (we can ignore the return value). That way we can just print the values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benjeffery
Copy link
Member Author

@jeromekelleher I've addressed comments and implemented the change so that metadata is re-instated when tables are used as a tree sequence. Unless you can think of tests I've missed this is good to go.

@jeromekelleher
Copy link
Member

Thanks @benjeffery. Sorry, i don't have time to go through it now. Is it OK if I go through an make any final tweaks next week and merge? (Maybe you could squash now?)

@benjeffery
Copy link
Member Author

I've squashed - go ahead with any changes you need to any branch while I'm off.

@jeromekelleher
Copy link
Member

Looks great, thanks again @benjeffery! I made a couple of minor tweaks. Mainly to make absolutely sure that equals in EdgeTable is safe.

@mergify mergify bot merged commit 55391fa into tskit-dev:master Jul 14, 2020
@benjeffery benjeffery deleted the no-edge-metadata branch August 27, 2020 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[C API] Add TSK_NO_EDGE_METADATA option

2 participants