Allow haplotypes and alignments to have left, right, samples params #2397

Merged: 1 commit, Jul 13, 2022

hyanwong
Member

@hyanwong hyanwong commented Jul 10, 2022

Description

Adds "samples", "start", and "stop" parameters to ts.haplotypes() and ts.alignments(). This (partially) fixes #2092. I was worried that calling var.decode(site_id) by simply iterating over the site ids would be slow, but that's how the sites() method does it anyway.

Is this the right approach? I haven't written any new tests until it's deemed sensible.

PR Checklist:

  • Tests that fully cover new/changed functionality.
  • Documentation including tutorial content if appropriate.
  • Changelogs, if there are API changes.

@codecov

codecov bot commented Jul 10, 2022

Codecov Report

Merging #2397 (31b7e63) into main (b74b98c) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #2397   +/-   ##
=======================================
  Coverage   93.33%   93.33%           
=======================================
  Files          28       28           
  Lines       26952    26977   +25     
  Branches     1236     1245    +9     
=======================================
+ Hits        25155    25180   +25     
  Misses       1763     1763           
  Partials       34       34           
Flag Coverage Δ
c-tests 92.25% <ø> (ø)
lwt-tests 89.05% <ø> (ø)
python-c-tests 71.00% <9.75%> (-0.11%) ⬇️
python-tests 98.96% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
python/tskit/trees.py 98.66% <100.00%> (+0.01%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b74b98c...31b7e63.

@benjeffery
Member

(Note that as written your description will close #2092)

Should variants also get start and stop? Not sure why it shouldn't. And yes, using decode is the way to do this. All the heavy lifting is in seeking and decoding, so the Python overhead isn't bad.

@hyanwong
Member Author

> (Note that as written your description will close #2092)

Yep, that was basically deliberate. I will open a new issue with the unaddressed bits there.

> Should variants also get start and stop? Not sure why it shouldn't.

I guess it could. Do you want me to add it to this PR then?

@hyanwong
Member Author

p.s. do we want "start" and "stop" or "left" and "right"?

@benjeffery
Member

> p.s. do we want "start" and "stop" or "left" and "right"?

Everywhere else that we talk about genomic intervals we talk about left, right, so that seems to make the most sense.

@hyanwong
Member Author

> > p.s. do we want "start" and "stop" or "left" and "right"?
>
> Everywhere else that we talk about genomic intervals we talk about left, right, so that seems to make the most sense.

I agree. It's clearly a genomic position if we say left and right.

@hyanwong hyanwong force-pushed the haplotype-start-stop branch 3 times, most recently from 07f6175 to d74df96 July 11, 2022 22:29
@hyanwong hyanwong changed the title from "Allow haplotypes and alignments to have start, stop, samples params" to "Allow haplotypes and alignments to have left, right, samples params" Jul 11, 2022
@benjeffery
Member

@hyanwong Is this ready for review?

@hyanwong
Member Author

> @hyanwong Is this ready for review?

Yes, it is. Thanks, Ben.

Member

@benjeffery benjeffery left a comment

LGTM!
Thanks for picking this one up - great to use all the new Variant machinery.

Review threads (outdated, resolved):
  • python/tests/test_genotypes.py
  • python/tskit/trees.py (3 threads)
@hyanwong hyanwong force-pushed the haplotype-start-stop branch 4 times, most recently from f16ffae to e296327 July 13, 2022 08:50
@hyanwong
Member Author

Changes made. Would be nice to merge before the new point release, but whatever.

@benjeffery benjeffery added the AUTOMERGE-REQUESTED ("Ask Mergify to merge this PR") label Jul 13, 2022
@mergify mergify bot merged commit 31efb75 into tskit-dev:main Jul 13, 2022
@mergify mergify bot removed the AUTOMERGE-REQUESTED ("Ask Mergify to merge this PR") label Jul 13, 2022

    def test_bad_left(self):
        ts = tskit.TableCollection(10).tree_sequence()
        for bad_left in [-1, 10, 100, np.nan, np.inf, -np.inf]:
Member

We prefer pytest.mark.parametrize for this sort of thing going forward
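
A rough sketch of what the parametrized form could look like follows. The body of the original test is not shown in this thread, so `check_left` is a hypothetical stand-in validator (not tskit code) and the exception type is an assumption.

```python
import math

import pytest


def check_left(left, sequence_length):
    """Hypothetical validator, not tskit's: require 0 <= left < sequence_length."""
    # NaN fails every comparison, so it is rejected by the same condition.
    if not (0 <= left < sequence_length):
        raise ValueError(f"left={left} is outside [0, {sequence_length})")


@pytest.mark.parametrize(
    "bad_left", [-1, 10, 100, math.nan, math.inf, -math.inf]
)
def test_bad_left(bad_left):
    # Each value becomes its own test case, reported separately by pytest.
    with pytest.raises(ValueError):
        check_left(bad_left, sequence_length=10)
```

Compared with a `for` loop inside one test, a failure for one value no longer hides the results for the values after it, and the failing parameter appears in the test id.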

Successfully merging this pull request may close these issues.

Feedback about reference sequences: reference_start and restrict alignments options
3 participants