@benjeffery
Member

@benjeffery benjeffery commented Jun 18, 2020

Fixes #686
This is backward compatible with existing dict representations, as schemas (and the top-level metadata and schema) are not required by fromdict and, if empty, are not included by asdict.
Note that this also adds a metadata_schema argument to set_columns.
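
The compatibility rule above can be sketched roughly as follows. This is a minimal plain-Python illustration, not the tskit implementation: `encode_table`, `decode_table` and the dict layout are hypothetical names assumed for the example. Optional keys are left out when empty on encoding and treated as optional on decoding, so an old-style dict passes through unchanged.

```python
def encode_table(columns, metadata_schema=""):
    """Hypothetical encoder: omit the schema key when the schema is empty."""
    d = dict(columns)
    if metadata_schema:
        d["metadata_schema"] = metadata_schema
    return d


def decode_table(d):
    """Hypothetical decoder: a missing schema key means 'no schema'."""
    d = dict(d)  # copy so the caller's dict is not modified
    metadata_schema = d.pop("metadata_schema", "")
    return d, metadata_schema


# An old-style dict (no schema key) round-trips without gaining new keys.
old_style = {"metadata": [], "metadata_offset": [0]}
assert encode_table(old_style) == old_style
assert decode_table(old_style) == (old_style, "")
```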

@codecov

codecov bot commented Jun 18, 2020

Codecov Report

Merging #687 into master will decrease coverage by 0.28%.
The diff coverage is 67.48%.

@@            Coverage Diff             @@
##           master     #687      +/-   ##
==========================================
- Coverage   87.71%   87.42%   -0.29%     
==========================================
  Files          23       23              
  Lines       17810    17963     +153     
  Branches     3526     3575      +49     
==========================================
+ Hits        15622    15705      +83     
- Misses       1073     1091      +18     
- Partials     1115     1167      +52     
Flag              Coverage Δ
#c_tests          88.96% <100.00%> (+0.01%) ⬆️
#python_c_tests   90.61% <67.48%> (-0.65%) ⬇️
#python_tests     99.00% <100.00%> (+<0.01%) ⬆️

Impacted Files          Coverage Δ
python/_tskitmodule.c   82.85% <63.69%> (-1.06%) ⬇️
python/tskit/tables.py  99.67% <100.00%> (+<0.01%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 99ebd93...d771988.

@benjeffery
Member Author

I'm missing some coverage, don't merge for now.

@benjeffery benjeffery force-pushed the round-trip-metadata branch from 3dcdaf5 to 5a161d9 Compare June 19, 2020 12:42
@benjeffery benjeffery mentioned this pull request Jun 19, 2020
17 tasks
@benjeffery benjeffery force-pushed the round-trip-metadata branch 3 times, most recently from f813721 to 6978388 Compare June 22, 2020 13:33
@benjeffery
Member Author

@jeromekelleher This is now fully backward compatible with older asdict and fromdict. Should be good to go.

Member

@jeromekelleher jeromekelleher left a comment

Looks good, thanks @benjeffery. I think there's a memory leak in the dict_encoding. Can you run stress_lowlevel and verify please? Otherwise, I think we need to get a little more test coverage on the behaviour when metadata_schema etc is missing from the dict encoding (not mapped to None, but not in the dictionary at all).
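
To make the "missing, not None" distinction concrete, here is a small sketch of the three inputs such a test would need to cover. It is only an illustration in plain Python: `key_state` is a hypothetical helper, and how tskit should behave in each case is not stated in this thread.

```python
def key_state(d, key):
    """Hypothetical helper: report how `key` appears in a dict encoding."""
    if key not in d:
        return "absent"  # key is not in the dictionary at all
    if d[key] is None:
        return "none"  # key is present but mapped to None
    return "present"


# Note that d.get(key) alone would not tell the first two cases apart.
cases = {
    "absent": {"metadata": b""},
    "none": {"metadata": b"", "metadata_schema": None},
    "present": {"metadata": b"", "metadata_schema": '{"codec": "json"}'},
}
for expected, encoding in cases.items():
    assert key_state(encoding, "metadata_schema") == expected
```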

"""
self.ll_table.append_columns(
dict(metadata=metadata, metadata_offset=metadata_offset)
dict(
Member

we should be able to just leave metadata_schema out of this, right?

Member Author

We can't as this is calling the low-level method.

Member

Should be OK now, right?

Member Author

Yep gone in 682b97d

@benjeffery
Member Author

@jeromekelleher Thanks, think I have addressed them all. Sorry there were quite a few!

Member

@jeromekelleher jeromekelleher left a comment

Some comments above @benjeffery. I think it's probably worth manually pasting in a few small examples of dict-encoded output from msprime and tsinfer, just to make sure that we can actually parse these successfully.

As I was going through this, I realised that the dict encoding should really be versioned. Maybe we should add a key "encoding_version" = (major,minor) to the dictionary, which (if not present) defaults to 1.0. That way we can think a bit more clearly about how this encoding changes over time.

(Obviously need to be careful that this doesn't break old versions of tskit, though. Should be OK, shouldn't it?)

@benjeffery
Member Author

I've added a version in 37f2bd6. I've used 1.1 as 1.0 is the version that had no version. In your comment you mention setting a default to 1.0 - I think this would go in code that is reading back the version, which we're not doing yet as we're not using it.
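
For reference, reading the version back with a default could look something like the sketch below. It is an assumption-laden illustration only: the `(major, minor)` tuple form follows the suggestion earlier in the thread, and `read_encoding_version` and `DEFAULT_ENCODING_VERSION` are hypothetical names, not code from this PR.

```python
# Assumed default: dicts written before versioning existed count as 1.0.
DEFAULT_ENCODING_VERSION = (1, 0)


def read_encoding_version(d):
    """Hypothetical reader: return (major, minor), defaulting to 1.0."""
    version = d.get("encoding_version", DEFAULT_ENCODING_VERSION)
    # Normalise to a tuple in case the value arrived as a list.
    return tuple(version)


assert read_encoding_version({}) == (1, 0)
assert read_encoding_version({"encoding_version": (1, 1)}) == (1, 1)
```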

@benjeffery
Member Author

I've added a hardcoded msprime example - but I can't see where in tsinfer it is making dicts. A search for "asdict" fails in the tsinfer source.

Member

@jeromekelleher jeromekelleher left a comment

Looks good, few small changes but ready for final review then I think?

@benjeffery
Member Author

@jeromekelleher Ok, let me know if good for a squash.

@jeromekelleher
Member

I'll give one more look over once #687 is merged @benjeffery, but I'd be surprised if anything else shows up at this point.

@benjeffery
Member Author

#687 is this PR?

@jeromekelleher
Member

> #687 is this PR?

🤦

Member

@jeromekelleher jeromekelleher left a comment

LGTM, squash'n'merge!

@benjeffery benjeffery force-pushed the round-trip-metadata branch from 1218d25 to d771988 Compare June 29, 2020 23:54
@mergify mergify bot merged commit 71efe44 into tskit-dev:master Jun 30, 2020
@benjeffery benjeffery deleted the round-trip-metadata branch June 30, 2020 00:23

Development

Successfully merging this pull request may close these issues.

Metadata schemas and top-level metadata not included in asdict and fromdict
