refactor: PR7 — naming pass + type:ignore audit #192
Merged
Local variable in Haps._get_geno_offset_idx is the (region, sample, ploid) index tuple passed to np.ravel_multi_index. The 'rsp' acronym was opaque to new readers.
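As a rough illustration of what the renamed local holds, the sketch below builds a (region, sample, ploid) index tuple and flattens it with `np.ravel_multi_index`. The dimensions and index values are made up for the example; only the variable name and the NumPy call come from the commit message.

```python
import numpy as np

# Hypothetical dimensions; the real ones come from the dataset.
n_regions, n_samples, ploidy = 4, 3, 2

# One array per axis: position i across the three arrays addresses
# a single (region, sample, ploid) cell.
regions = np.array([0, 1, 3])
samples = np.array([2, 0, 1])
ploids = np.array([1, 0, 1])

region_sample_ploid_idx = (regions, samples, ploids)

# Flatten to C-order offsets: region * (n_samples * ploidy) + sample * ploidy + ploid
flat_idx = np.ravel_multi_index(
    region_sample_ploid_idx, (n_regions, n_samples, ploidy)
)

# np.unravel_index inverts the mapping back to the three per-axis arrays.
round_trip = np.unravel_index(flat_idx, (n_regions, n_samples, ploidy))
```

Spelling out the axis order in the name makes it harder to pass the tuple in the wrong order, which `np.ravel_multi_index` would accept silently whenever the indices happen to be in bounds.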
Kernel parameters used the plural form while field/local names used the singular. Field name 'geno_offset_idx' (the single conceptual index array) wins — kernels in _genotypes.py and _tracks.py now use the same name as ReconstructionRequest.geno_offset_idx and the local at the call sites in _haps.py. Pure rename; semantics unchanged.
Field name now spells out what it is (a permutation array). Local variables in callers also renamed (perm -> permutation) where they refer to the field. Loop variables and 'permuted_*' result names were already distinct and remain as-is. Pure rename; SplicePlan is internal (not in __all__).
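For readers unfamiliar with the term, a permutation array is an index array that reorders another array. The sketch below uses a hypothetical `ToyPlan` stand-in, not the real `SplicePlan`, and assumes the permutation is applied as a gather (fancy indexing); the actual direction used by the library is not stated in this PR.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class ToyPlan:
    """Hypothetical stand-in for an internal plan object holding a permutation."""

    permutation: np.ndarray


plan = ToyPlan(permutation=np.array([2, 0, 1]))
values = np.array([10, 20, 30])

# Gather: output position i takes values[permutation[i]].
permuted_values = values[plan.permutation]

# argsort of a permutation yields its inverse, which restores the original order.
inverse = np.argsort(plan.permutation)
restored = permuted_values[inverse]
```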
…nels Both names stay (reconstruct_haplotype_from_sparse is the per-(query, hap) inner kernel; reconstruct_haplotypes_from_sparse is the batched parallel driver that dispatches to it). Add a one-line docstring note on each so the relationship is explicit without forcing readers to read both bodies.
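The docstring cross-reference pattern described above might look like the following sketch. The functions `scale_row` and `scale_rows` are hypothetical toy examples, not the actual kernels; the point is only that each docstring names its counterpart and states its role.

```python
import numpy as np


def scale_row(row: np.ndarray, factor: float, out: np.ndarray) -> None:
    """Scale a single row into ``out``.

    Note: this is the per-row inner kernel; ``scale_rows`` is the batched
    driver that dispatches to it.
    """
    out[:] = row * factor


def scale_rows(rows: np.ndarray, factor: float) -> np.ndarray:
    """Scale every row of ``rows``.

    Note: this is the batched driver; it calls ``scale_row`` once per row.
    """
    out = np.empty_like(rows, dtype=np.float64)
    for i in range(rows.shape[0]):
        # out[i] is a view, so the kernel writes into ``out`` in place.
        scale_row(rows[i], factor, out[i])
    return out


result = scale_rows(np.array([[1, 2], [3, 4]]), 2.0)
```

With names that differ by a single letter, the one-line note saves a reader from opening both bodies to learn which function calls which.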
Audit pass for PR7 task 2. Stripped all 68 ignores, ran pyrefly, then restored only the ~29 that suppress real warnings/errors. Each remaining ignore now has a narrow rule code and a one-line reason.

Removed (stale, no longer needed):
- _ragged.py: ufunc_comp_dna numba ufunc call
- _dataset/_haps.py: ak.to_packed / ak.to_regular sites; pylance-only note
- _dataset/_impl.py: self._seqs.genotypes.shape index
- _dataset/_indexing.py: ak.flatten().to_numpy() narrowing
- _dataset/_query.py: ak.where + reverse_complement; recon return widen
- _dataset/_rag_variants.py: alleles.content layout walk, reverse-complement field assignment, NDArray casts that pyrefly already narrows
- _dataset/_reference.py: ref.reshape / to_padded / squeeze on Ragged; torch import guards
- _dataset/_tracks.py / _dataset/_write.py: misc ndarray construction sites
- _torch.py / data_registry.py: dead/unreachable branches
- _variants/_sitesonly.py: raise ValueError unreachable annotation

Annotated (kept with rule code + reason):
- HashTable max=int across _indexing/_reference/_splice -> hirola stubs require numpy.Number but int works at runtime
- np.unravel_index / np.ravel_multi_index on Ragged.shape (_haps) -> Ragged.shape is tuple[int|None,...]; numpy overload expects all-int
- np.ones with ak.Array shape (_rag_variants) -> same shape-with-None issue
- ak.str.length attribute lookup -> ak.str submodule absent from top-level awkward stubs
- RaggedIntervals / RaggedAlleles constructor calls (_ragged, _dummy) -> seqpro Ragged stubs widen __getitem__/squeeze/from_offsets returns
- replace(self, ...) and super().__getitem__(idx) returns (_impl) -> typevar narrowing not preserved across base-class return
- to_kind(_kind) (_impl) -> _kind union widened by control-flow merge
- DataFrame[regions] / DataFrame.filter(regions) (_reference) -> polars stubs reject some union members our runtime accepts
- recon = tuple(o.reshape/squeeze ...) (_query) -> heterogeneous dispatch across array kinds
- cast() on offsets after layout walk (_rag_variants) -> documents narrowing pyrefly already infers
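A kept ignore with a rule code and a one-line reason could look roughly like the sketch below. The function `flat_index` is hypothetical, and the rule code `arg-type` is a placeholder: the exact code depends on the type checker and its configuration, and the checker may or may not flag this particular call.

```python
from __future__ import annotations

import numpy as np


def flat_index(
    multi_index: tuple[int, ...], shape: tuple[int | None, ...]
) -> int:
    """Flatten ``multi_index`` against a shape whose static type allows None.

    The annotation mirrors a ragged shape such as ``tuple[int | None, ...]``;
    this sketch assumes the caller only passes all-int shapes at runtime.
    """
    # Placeholder rule code plus a one-line reason, as the audit describes.
    return int(
        np.ravel_multi_index(multi_index, shape)  # type: ignore[arg-type]  # shape is all-int at runtime
    )


offset = flat_index((1, 2), (3, 4))
```

Scoping the ignore to one rule code means an unrelated new error on the same line still surfaces instead of being swallowed by a blanket suppression.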
Summary
Final refactor-campaign PR. Two passes folded into one branch.
Naming pass:
- `rsp_idx` → `region_sample_ploid_idx` (single local in `Haps._get_geno_offset_idx`; was an unexplained 3-tuple of region/sample/ploid)
- `SplicePlan.perm` → `SplicePlan.permutation` (clearer at call sites; loop vars never used `perm`)
- `geno_offset_idxs` → `geno_offset_idx` (kernel param names now match the `ReconstructionRequest` field; the prior mismatch was the real bug)
- `reconstruct_haplotype_from_sparse` (singular kernel) and `reconstruct_haplotypes_from_sparse` (batched driver) kept distinct, with cross-referencing docstrings disambiguating the pair

`# type: ignore` audit:
No public API changes. No on-disk format changes.
Test plan
🤖 Generated with Claude Code