Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regions index #37

Merged
merged 10 commits into from
Jul 29, 2024
Merged

Regions index #37

merged 10 commits into from
Jul 29, 2024

Conversation

tomwhite
Copy link
Contributor

Following discussion in sgkit-dev/vcf-zarr-spec#21

@tomwhite
Copy link
Contributor Author

Tests are failing because we need https://github.com/tomwhite/bio2zarr/tree/variant-end (or similar).

Copy link
Contributor

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fantastic, thanks @tomwhite! Some minor comments.

Should we get the variant_length added to vcf2zarr first (and maybe even compute the index, optionally, there)?


contig = root["variant_contig"]
pos = root["variant_position"]
end = root["variant_position_end"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

variant_length might be an easier name to understand and computing the end is simple from that. But that's a discussion for elsewhere

c_end_idx - c_start_idx + 1, # num records
)
)
c_start_idx = c_end_idx + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks just right! 🤩

vcztools/vcf_writer.py Show resolved Hide resolved
@tomwhite
Copy link
Contributor Author

Should we get the variant_length added to vcf2zarr first (and maybe even compute the index, optionally, there)?

Yes, we definitely want variant length/end in vcf2zarr first. (I see you've opened sgkit-dev/bio2zarr#278 for that - thanks.)

I think we should compute the index in vcf2zarr - it's cheap to do so may not need to be optional. But we could wait a bit on that until we are sure everything's working with the index we've designed first.

@tomwhite tomwhite marked this pull request as ready for review July 29, 2024 07:57
@jeromekelleher
Copy link
Contributor

Thanks @tomwhite, this is a big step forward!

@jeromekelleher jeromekelleher merged commit 8b00a1c into sgkit-dev:main Jul 29, 2024
9 checks passed
@tomwhite tomwhite deleted the regions-index branch July 29, 2024 09:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants