Skip to content

Conversation

@benjeffery
Copy link
Member

@benjeffery benjeffery commented Feb 8, 2021

Fixes #1137 by adding checks for individual reference integrity and ordering.

@codecov
Copy link

codecov bot commented Feb 10, 2021

Codecov Report

Merging #1192 (774f74a) into main (00c28c4) will increase coverage by 1.19%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1192      +/-   ##
==========================================
+ Coverage   92.52%   93.72%   +1.19%     
==========================================
  Files          10       26      +16     
  Lines       11643    21507    +9864     
  Branches        0      904     +904     
==========================================
+ Hits        10773    20157    +9384     
- Misses        870     1312     +442     
- Partials        0       38      +38     
Flag Coverage Δ
c-tests 92.50% <100.00%> (+0.02%) ⬆️
lwt-tests 92.97% <ø> (ø)
python-c-tests 94.91% <ø> (?)
python-tests 98.63% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
c/tskit/core.h 100.00% <ø> (ø)
c/tskit/core.c 97.03% <100.00%> (+0.04%) ⬆️
c/tskit/tables.c 90.83% <100.00%> (+0.03%) ⬆️
python/tskit/drawing.py 99.10% <0.00%> (ø)
python/tskit/combinatorics.py 99.20% <0.00%> (ø)
python/tskit/vcf.py 100.00% <0.00%> (ø)
python/tskit/__main__.py 0.00% <0.00%> (ø)
python/tskit/__init__.py 100.00% <0.00%> (ø)
python/tskit/metadata.py 98.60% <0.00%> (ø)
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00c28c4...774f74a. Read the comment docs.

@benjeffery benjeffery force-pushed the check_parents branch 2 times, most recently from 049d301 to dd8ff23 Compare February 10, 2021 16:01
@benjeffery benjeffery marked this pull request as ready for review February 10, 2021 16:05
Copy link
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I don't think we need to check for ordered individuals in simplify. We should relax the requirement and put in a test (in Python) to verify that we can successfully simplify with unordered parents.

c/tskit/tables.c Outdated
k < individuals.parents_offset[j + 1]; k++) {
if (individuals.parents[k] != TSK_NULL
&& individuals.parents[k] >= (tsk_id_t) j) {
ret = TSK_ERR_UNSORTED_INDIVIDUALS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we might check that no individual is its own parent (even if we are not checking individual ordering).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be a useful diagnostic. No good reason to be your own parent (unless people are being sloppy with the tables and want to fix things up at the end).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a test for this, as this code already forbids it.

@petrelharp
Copy link
Contributor

petrelharp commented Feb 11, 2021

I realized we can check for loops reasonably easily:

for ind in ts.individuals():
  for p in ind.parents:
    while p != tskit.NULL:
     if p == ind.id:
         raise ValueError("Inconsistent individual parents (time travel)")

Edit: this is computationally terrible, since it traverses every possible path on the pedigree. I say we skip this check out figure out a smarter way.

Copy link
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still needs some tweaks on testing simplify.

@jeromekelleher
Copy link
Member

I realized we can check for loops reasonably easily:

Doesn't parent-before-child rule out loops?

@petrelharp
Copy link
Contributor

Doesn't parent-before-child rule out loops?

Yes, sorry - never mind this.

@benjeffery benjeffery force-pushed the check_parents branch 2 times, most recently from 31219de to 4a2df5c Compare February 12, 2021 13:36
@benjeffery
Copy link
Member Author

Ok, I've added the test for self-parentage and removed the simplify test that was doing nothing. Should be good to go.

@petrelharp
Copy link
Contributor

I was actually suggesting that we check for self parentage always (not just when checking for ordering). And, turns out that test I said wasn't testing anything actually was - it was testing whether simplify runs even if the individuals are unordered. I've just gone ahead and done those things; feel free to do whatever to this commit.

@benjeffery
Copy link
Member Author

@jeromekelleher I like @petrelharp 's addition here, more precise error messages are always a good thing. I think this PR is ready to go.

@jeromekelleher
Copy link
Member

OK, great!

@mergify mergify bot merged commit d45c6ad into tskit-dev:main Feb 15, 2021
@benjeffery benjeffery deleted the check_parents branch March 19, 2021 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Individual.parents should be added to integrity checks

3 participants