Skip to content

Ancestors parallel#1124

Merged
jeromekelleher merged 15 commits intotskit-dev:dev-1.0from
jeromekelleher:ancestors-parallel
Mar 15, 2026
Merged

Ancestors parallel#1124
jeromekelleher merged 15 commits intotskit-dev:dev-1.0from
jeromekelleher:ancestors-parallel

Conversation

@jeromekelleher
Copy link
Member

Note to self - I'm not convinced futures approach is correct here and could lead to unbounded memory use. Need to figure out how to deal with potentially empty ancestors though and whether they are a real issue.

Extract per-ancestor work into _build_one_ancestor() and submit it to
a thread pool during Pass 2. A SynchronousExecutor (matching bio2zarr's
pattern) is used when num_threads<=0 so threaded and non-threaded paths
share the same code. Futures are consumed in submission order to ensure
deterministic output. The --threads CLI flag is wired through to
infer_ancestors for both infer-ancestors and run commands.
Add INFO logging throughout infer_ancestors for source path, store
dimensions, active filters, sample selection, site stats, position
range, threading mode, and per-interval ancestor counts. Add DEBUG
logging to AncestorWriter for chunk flush timing and finalize timing.
Suppress zarr/numcodecs debug logs in CLI logging setup.
Null out each future in the list after its result is consumed so the
AncestorResult and its large haplotype array can be garbage collected
immediately rather than being kept alive for the entire interval.
@codecov
Copy link

codecov bot commented Mar 12, 2026

Codecov Report

❌ Patch coverage is 92.69663% with 13 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (dev-1.0@f9fb2fc). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff             @@
##             dev-1.0    #1124   +/-   ##
==========================================
  Coverage           ?   84.30%           
==========================================
  Files              ?       16           
  Lines              ?     4193           
  Branches           ?      683           
==========================================
  Hits               ?     3535           
  Misses             ?      530           
  Partials           ?      128           
Flag Coverage Δ
C 81.57% <ø> (?)
c-python 58.31% <ø> (?)
python-tests 89.38% <92.69%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
Python API 89.38% <0.00%> (?)
Python C interface 66.66% <0.00%> (?)
C library 88.68% <0.00%> (?)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Make index a required parameter of add_ancestor and use a pending dict
to buffer out-of-order arrivals, draining contiguously into the write
buffer. Remove the all-missing ancestor guard from _build_one_ancestor
since the C engine guarantees focal sites are always set, and use the
C-returned start/end directly. Add thorough tests for indexed insertion
across chunk boundaries, multiple intervals, and shuffled orders.
Use concurrent.futures.as_completed instead of consuming futures in
submission order. The AncestorWriter's index-aware pending dict ensures
deterministic output regardless of completion order, while as_completed
gives better throughput by processing results as soon as they are ready.
Use a set for futures and discard each after its result is consumed,
so the AncestorResult and its haplotype array can be garbage collected
immediately rather than being retained for the entire interval.
Revert lazy future approach which reintroduced memory leak. Restore
eager SynchronousExecutor with as_completed and per-future discard.
Add CLI warning that ancestor-level progress requires threads >= 1.
Wire the existing C-level genotype_encoding flag through AncestorsConfig
and into the AncestorBuilder constructor. TOML config accepts both integer
(0/1) and string ("eight_bit"/"one_bit") values. Tests verify one-bit
produces the same ancestor set as eight-bit across single/multi-interval,
threaded, many-sample, and small-chunk scenarios.
Report ab.mem_size in the per-interval info log so the builder's direct
allocation can be compared against process RSS to isolate memory growth.
Replace AncestorResult with Ancestor dataclass that stores only the
active haplotype fragment (start:end site range) instead of a full
num_sites-length array. The full haplotype is expanded on demand at
flush time in AncestorWriter. Also simplify AncestorWriter.add_ancestor
to accept a single Ancestor object, and cap in-flight futures to
2*num_threads using wait(FIRST_COMPLETED) following the upstream
threaded_map pattern.
…act helpers

- Remove add_terminal_site() calls and +1 max_sites padding; per-interval
  builders naturally stop at the last site without a sentinel
- Replace conditional tqdm imports with module-level import and
  disable=not progress; remove interval-level bar, add genotype-loading bar
- Extract _open_source, _resolve_haplotype_count, _apply_site_stats, and
  _process_interval from infer_ancestors (~280 -> ~60 lines of orchestration)
- Add time.monotonic() timing for Pass 1, Pass 2, and total elapsed
- Make infer_ancestors accept only Source (not bare zarr.Group); tests use
  _src() helper to wrap in-memory groups
The resource module is Unix-only and causes ImportError on Windows.
Use psutil (now a declared dependency) for portable RSS reporting.
@jeromekelleher jeromekelleher merged commit c519b0f into tskit-dev:dev-1.0 Mar 15, 2026
13 checks passed
@jeromekelleher jeromekelleher deleted the ancestors-parallel branch March 15, 2026 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant