Table.coordinate_system and Table.interval_type are silently ignored by spatial-predicate transpilation — Closes #88#90
Merged
conradbzura merged 2 commits intoApr 27, 2026
Conversation
dde13b6 to
444919e
Compare
…cates INTERSECTS, CONTAINS, and WITHIN ignored the table-storage convention declared via Table(coordinate_system=..., interval_type=...) and emitted SQL hard-coded for 0-based half-open columns. Tables backed by other conventions (e.g. VCF-style 1-based closed) silently received off-by-one results in both range-literal and column-join forms; SpatialSetPredicate inherited the same bug because it routes through the same helper. Introduce two staticmethod helpers, _canonical_start and _canonical_end, that wrap a raw start/end column expression to yield its canonical 0-based half-open value, plus _resolve_table to look up the Table backing a column reference. Wire the helpers into _generate_range_predicate and _generate_column_join so the table side is canonicalized before the existing comparison operators run. The literal side already arrives in canonical form via RangeParser.to_zero_based_half_open, so no per-operator branching is needed. DISTANCE and NEAREST still ignore coordinate_system and conflate interval_type with bedtools-compatibility math; that is tracked as a separate follow-up and out of scope here.
…val combinations Add eleven tests asserting the new canonicalization behavior for INTERSECTS, CONTAINS, WITHIN, and SpatialSetPredicate: - All three predicates against a 1-based-closed table in literal form - Column-join form for INTERSECTS, CONTAINS, and WITHIN with one default-convention table joined against a 1-based-closed table - INTERSECTS literal against the rare 0-based-closed and 1-based-half-open conventions to cover the remaining two cells of the (coordinate_system, interval_type) matrix - INTERSECTS literal against an explicitly-default Table to guard against accidental wrapping when defaults are passed in - CONTAINS point-query branch against a 1-based-closed table - INTERSECTS ANY against a 1-based-closed table to verify the fix flows through SpatialSetPredicate transitively Tests follow the BDD naming pattern test_<method>_should_<outcome>_when_<condition> and the AAA structure with Given/When/Then docstrings.
444919e to
1179636
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Make INTERSECTS, CONTAINS, and WITHIN honor
Table.coordinate_systemandTable.interval_typeso tables that store intervals in non-default conventions (e.g. VCF-style 1-based closed) produce correct SQL. Previously the table side of every spatial comparison was hard-coded for 0-based half-open columns, so any non-defaultTableconfiguration silently produced off-by-one results in both range-literal and column-join forms;SpatialSetPredicate(INTERSECTS ANY/ALL) inherited the same bug because it routes through the same helper.The fix introduces three private helpers in
BaseGIQLGeneratorthat wrap rawstart/endcolumn expressions to yield their canonical 0-based half-open equivalents, then wires those helpers into_generate_range_predicateand_generate_column_join. The literal side already arrives in canonical form viaRangeParser.to_zero_based_half_open, so once both sides are canonicalized the existing comparison operators (<,>,<=,>=) are correct without per-operator branching.Conversion table the helpers implement:
coordinate_systeminterval_typestart_canonicalend_canonical0basedhalf_openstart_colend_col0basedclosedstart_col(end_col + 1)1basedhalf_open(start_col - 1)(end_col - 1)1basedclosed(start_col - 1)end_colDISTANCE and NEAREST still ignore
coordinate_systemand conflateinterval_typewith bedtools-compatibility math; that is tracked separately in #89 and out of scope here.Closes #88
Proposed changes
src/giql/generators/base.pyAdd three helpers near
_get_column_refs:_resolve_table(column_ref, table_name=None)— resolve theTableconfig that backs a column reference, mirroring the alias-resolution logic of_get_column_refsso each side of a column-join can be canonicalized independently._canonical_start(raw_start, table)— staticmethod; wrap a raw start column expression to yield canonical 0-based half-open start. Returns(start_col - 1)for 1-based tables, the raw expression otherwise._canonical_end(raw_end, table)— staticmethod; wrap a raw end column expression to yield canonical 0-based half-open end. Returns(end_col + 1)for 0-based-closed,(end_col - 1)for 1-based-half-open, raw expression for 0-based-half-open and 1-based-closed.Modify two existing methods:
_generate_range_predicate— call_resolve_table(column_ref, self._current_table), pass the result through_canonical_start/_canonical_end, then run the unchanged comparison-operator branches._generate_column_join— same pattern, but resolve aTablefor each side independently to support cross-convention joins.tests/generators/test_base.pyAdd four fixtures and eleven tests asserting the canonicalization behavior for every
(coordinate_system, interval_type)combination across all three predicates and both forms (range-literal and column-join), plus theSpatialSetPredicateflow-through. Tests follow the BDD naming patterntest_<method>_should_<outcome>_when_<condition>and the AAA structure with Given/When/Then docstrings.Test cases
TestBaseGIQLGenerator(start - 1)and end stays rawTestBaseGIQLGenerator(start - 1)and end stays rawTestBaseGIQLGenerator(start - 1)and end stays rawTestBaseGIQLGenerator(start - 1), default side stays rawTestBaseGIQLGeneratorcoordinate_system="0based"andinterval_type="half_open"TableTestBaseGIQLGenerator(end + 1)and start stays rawTestBaseGIQLGeneratorTestBaseGIQLGenerator<=/>=TestBaseGIQLGenerator>=/<=TestBaseGIQLGeneratorchr1:1500(start - 1)TestBaseGIQLGenerator(start - 1)on the table sideSpatialSetPredicateflow-through under 1-based-closed