Merged
Delete `build_narrow_context`, `process_query`, and `process_batch` — the old uncalibrated pipeline that is no longer called. The CLI exclusively uses `prescore_batch` + `score_calibrated_batch`. Also remove unused `info` import and clean up the stale module-level doc comment.
Delete ToleranceHierarchy struct and inline its prescore/secondary fields directly onto Scorer as broad_tolerance and secondary_tolerance. The tertiary_tolerance() method is inlined at its single call site. Update all re-exports, consumer function signatures, and construction sites.
Ports SearchResultBuilder → ScoredCandidateBuilder in results.rs using the new ScoringFields field names and [f32; N] arrays for per-ion data. Updates finalize_results in pipeline.rs to use the new builder and return Result<ScoredCandidate, DataProcessingError>. Callers (Task 5) still expect IonSearchResults; those mismatches are intentional intermediate state.
Replace all IonSearchResults references with ScoredCandidate in the accumulator, score_calibrated_extraction, score_calibrated_batch, and FullQueryResult. The pipeline now produces ScoredCandidate throughout; CLI callers in processing.rs will be updated in Task 6.
…erive
Delete search_results.rs entirely and remove all references to IonSearchResults and SearchResultBuilder across the codebase. All consumers now use the new ScoredCandidate/CompetedCandidate/FinalResult types. Drop the parquet_derive dependency since the derive macro is no longer used.
- ScoreTraces → ElutionTraces; fields main_score → apex_profile, ms2_cosine_ref_sim → cosine_trace
- ScoringContext → Extraction; field query_values → chromatograms
- ApexLocation/ApexScore::raising_cycles → rising_cycles
- PeptideMetadata::ref_rt_seconds → query_rt_seconds
- build_candidate_context → build_broad_extraction in pipeline.rs
- build_calibrated_context → build_calibrated_extraction in pipeline.rs
- main_loop → execute_pipeline in processing.rs
- process_speclib → run_pipeline in processing.rs / main.rs
- Move SCRIBE_FLOOR from scribe.rs to apex_features.rs; delete scribe.rs
- Update iter_scores() string literals: "main_score" → "apex_profile", "ms2_cosine_ref_sim" → "cosine_trace"
- Update viewer files (computed_state.rs, plot_renderer.rs) for all renames
…emove secondary_tolerance
Add get_spectral_tolerance() and get_isotope_tolerance() to CalibrationResult, thread them as explicit parameters through execute_secondary_query and its callers, and delete the secondary_tolerance field from Scorer.
- Add `lookback: usize` parameter to `find_optimal_path` (calibrt), removing the hardcoded `let lookback = 30`
- Add `lookback: usize` parameter to `calibrate_with_ranges`; update the `calibrate` convenience wrapper to pass 30
- Pass `config.dp_lookback` from `CalibrationConfig` through `calibrate_from_phase1` in processing.rs
- Pass lookback=10 in the identity fallback in rt_calibration.rs (2-point curve needs no large window)
- Remove unused `lowess_frac` field from `CalibrationConfig`
- Update calibrt integration tests to supply the new lookback argument
…ult to ViewerResult
Update viewer-facing API to use clearer names: FullQueryResult → ViewerResult with fields traces/longitudinal_apex_profile/chromatograms/scored, and the method process_query_full → score_for_viewer on Scorer.
Separate CLI output into three streams for clearer user experience:
- stdout: brief phase milestones + 1% FDR result summary
- log file: full tracing record at configured level (default: {output_dir}/timsseek.log)
- stderr: progress bars (TTY only) + warn/error tracing messages
Replace -v/-q verbosity flags with --log-path and --log-level options.
Progress bars auto-hide when stderr is not a terminal.
… debug-level for expected failures
Replace StorageProvider probe in sniff_cached_index with a direct Path::exists() check for local paths, eliminating spurious ERROR/INFO lines on the normal "not a cached index" code path. Cloud paths keep the probe but log at debug! instead of error!. Also demote cache-miss error! to debug! in try_load_from_cache, simplify the load_index_auto detection log, and demote noisy info! calls in timscentroid storage.rs to trace!/debug!.
…g, max-qvalue
- Upgrade forust-ml 0.4.8 → 0.5.0
- GBM early stopping at 100 rounds (PrecomputedFeatures row-major matrix)
- Load speclib/calib_lib once in main(), pass by &Speclib reference
- Add RunReport (per-invocation) with speclib/index loading timings
- Add --max-qvalue CLI arg (default 0.5, filters Parquet output)
- Fix duplicate NUM_MS2_IONS/NUM_MS1_IONS constants (import from mod.rs)
- Re-export FileReport, RunReport from scoring mod
Change find_optimal_path signature to accept &mut Vec<f64> and &mut Vec<Option<usize>> buffers instead of allocating them internally. Caller in calibrate_with_ranges creates temporary buffers for now; Task 3 will move ownership to CalibrationState for reuse across calls.
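The combination of a `lookback` window and caller-owned DP buffers can be illustrated with a minimal sketch. This is not the calibrt implementation: the `Node` type, the function name `find_optimal_path_sketch`, and the scoring rule (sum of node weights along a path with non-decreasing y) are assumptions made for illustration only.

```rust
/// Hypothetical grid node, assumed to be pre-sorted by its x-coordinate.
#[derive(Debug, Clone, Copy)]
pub struct Node {
    pub y: f64,
    pub weight: f64,
}

/// Returns the indices of the highest-weight path with non-decreasing `y`.
/// Each node only considers the `lookback` nodes immediately before it.
/// `best` and `prev` are caller-owned scratch buffers, so repeated calls
/// reuse their allocations instead of allocating fresh vectors.
pub fn find_optimal_path_sketch(
    nodes: &[Node],
    lookback: usize,
    best: &mut Vec<f64>,
    prev: &mut Vec<Option<usize>>,
) -> Vec<usize> {
    // Reset the buffers without giving up their capacity.
    best.clear();
    prev.clear();
    best.resize(nodes.len(), 0.0);
    prev.resize(nodes.len(), None);

    for i in 0..nodes.len() {
        best[i] = nodes[i].weight;
        let start = i.saturating_sub(lookback);
        for j in start..i {
            // Monotonicity constraint: the path may not go down in y.
            if nodes[j].y <= nodes[i].y {
                let candidate = best[j] + nodes[i].weight;
                if candidate > best[i] {
                    best[i] = candidate;
                    prev[i] = Some(j);
                }
            }
        }
    }

    // Pick the best end node, then walk the predecessor links backwards.
    let mut end = match (0..nodes.len()).max_by(|&a, &b| best[a].total_cmp(&best[b])) {
        Some(i) => i,
        None => return Vec::new(),
    };
    let mut path = vec![end];
    while let Some(p) = prev[end] {
        path.push(p);
        end = p;
    }
    path.reverse();
    path
}
```

A small `lookback` makes each step cheaper but can miss predecessors that lie further back, which is why a short two-point fallback curve can use a smaller window than a full calibration run.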
Introduces CalibrationState, a reusable struct that owns Grid, DP buffers, and path indices to enable incremental calibration without repeated allocation. Makes CalibrationCurve::new pub(crate) so fit() can construct it within the crate.
…ibrantHeap::iter()
…ces + cached profiles
…_apex_location as wrappers
Move the core extraction logic from Scorer::build_broad_extraction into a standalone generic function that works with any KeyLike type (IonAnnot or String), enabling reuse by both the CLI scorer and the viewer.
Introduce the calibration state machine and background scoring thread infrastructure for the viewer's live RT calibration panel:
- Create calibration.rs with ViewerCalibrationState (Idle/Running/Paused/Done), background thread using AtomicU8 control + thread::park for pause, bounded sync_channel for CalibrationMessage snapshots, and CalibrantHeap accumulation with periodic snapshot sends.
- Add Pane::Calibration variant to the dock layout.
- Wrap ElutionGroupData in Arc<> for sharing with the background thread.
- Add calibration.poll() to the update loop with request_repaint.
- Make build_extraction() pub (was pub(crate)) so the viewer can call it.
- Add calibrt dependency to the viewer.
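The control scheme described above (an `AtomicU8` flag, `thread::park` for pausing, and a bounded `sync_channel` for snapshots) can be sketched with the standard library alone. The `Worker` type, the `RUN`/`PAUSE`/`STOP` constants, and the use of a plain counter in place of real scoring are all hypothetical; the sketch also sends with `try_send` and drops the receiver before joining, which is one way to keep the worker from blocking on a full channel.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, TrySendError};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

// Hypothetical control codes shared between the UI and the worker.
const RUN: u8 = 0;
const PAUSE: u8 = 1;
const STOP: u8 = 2;

pub struct Worker {
    control: Arc<AtomicU8>,
    handle: JoinHandle<usize>,
    /// Periodic progress snapshots (here just the number of items scored).
    pub snapshots: Receiver<usize>,
}

pub fn spawn_worker(total: usize, snapshot_every: usize) -> Worker {
    let control = Arc::new(AtomicU8::new(RUN));
    let (tx, rx) = sync_channel::<usize>(4); // bounded: at most 4 pending snapshots
    let ctl = Arc::clone(&control);
    let every = snapshot_every.max(1);
    let handle = thread::spawn(move || {
        let mut scored = 0usize;
        while scored < total {
            match ctl.load(Ordering::Acquire) {
                STOP => break,
                PAUSE => {
                    // park() may wake spuriously, so loop and re-check the flag.
                    thread::park();
                    continue;
                }
                _ => {}
            }
            scored += 1; // stand-in for scoring one item
            if scored % every == 0 {
                match tx.try_send(scored) {
                    // A full channel just means the UI is behind; skip this snapshot.
                    Ok(()) | Err(TrySendError::Full(_)) => {}
                    Err(TrySendError::Disconnected(_)) => break,
                }
            }
        }
        scored
    });
    Worker { control, handle, snapshots: rx }
}

impl Worker {
    pub fn pause(&self) {
        self.control.store(PAUSE, Ordering::Release);
    }

    pub fn resume(&self) {
        self.control.store(RUN, Ordering::Release);
        self.handle.thread().unpark();
    }

    /// Stops the worker and returns how many items it scored.
    pub fn stop(self) -> usize {
        self.control.store(STOP, Ordering::Release);
        self.handle.thread().unpark(); // wake it if it is parked
        drop(self.snapshots); // release the receiver before joining
        self.handle.join().expect("worker thread panicked")
    }
}
```

A UI loop would drain `worker.snapshots.try_iter()` once per frame and call `pause()`, `resume()`, or `stop()` from its button handlers.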
Add render_panel() to ViewerCalibrationState with:
- Context-sensitive control buttons (Start/Pause/Resume/Stop/Reset)
- Progress counters (scored / total, calibrants / capacity)
- egui_plot scatter showing suppressed, retained, and path grid cells
- Fitted calibration curve as a cyan line sampled at 200 points
- WRMSE display and RT tolerance suggestion with Apply button
Also adds CalibrationCurve::points() public accessor to calibrt.
…ge) reference lines
- CalibrationState::measure_ridge_width() expands from path cells into
adjacent cells above a weight threshold fraction
- Returns RidgeMeasurement { x, half_width, total_weight } per column
- Viewer: weighted-average half-width as global tolerance (heavy columns
count more), replaces non-suppressed cell residual approach
- CalibrationResult stores ridge widths and interpolates at query RT
- get_tolerance() now returns position-dependent RT tolerance (wider at edges, tighter in middle) based on the calibration grid ridge width
- CLI's calibrate_from_phase1 switched to CalibrationState API for ridge measurement after curve fitting
- get_tolerance receives library RT (ridge widths indexed by library RT)
- Fallback to uniform tolerance when no ridge data available
- RIDGE_WIDTH_MULTIPLIER (1.0) and MIN_RT_TOLERANCE_MINUTES (0.5) tunable
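A position-dependent tolerance of this kind could look roughly like the sketch below. The constant names and values come from the commit message, but everything else is assumed: the `RidgePoint` type, the function name `tolerance_at`, linear interpolation between neighbouring columns, clamping to the nearest column outside the measured range, and ridge half-widths expressed in minutes.

```rust
// Names and values taken from the commit message; their exact use is assumed.
const RIDGE_WIDTH_MULTIPLIER: f64 = 1.0;
const MIN_RT_TOLERANCE_MINUTES: f64 = 0.5;

/// Hypothetical ridge measurement for one grid column, indexed by library RT.
#[derive(Debug, Clone, Copy)]
pub struct RidgePoint {
    pub library_rt: f64,
    pub half_width: f64, // assumed to be in minutes
}

/// RT tolerance at `library_rt`. `ridge` must be sorted by `library_rt`.
/// Falls back to `uniform_fallback` when no ridge data is available.
pub fn tolerance_at(ridge: &[RidgePoint], library_rt: f64, uniform_fallback: f64) -> f64 {
    if ridge.is_empty() {
        return uniform_fallback;
    }
    // Index of the first column at or beyond the query RT.
    let idx = ridge.partition_point(|p| p.library_rt < library_rt);
    let width = if idx == 0 {
        ridge[0].half_width
    } else if idx == ridge.len() {
        ridge[ridge.len() - 1].half_width
    } else {
        // Linear interpolation between the two neighbouring columns.
        let (lo, hi) = (ridge[idx - 1], ridge[idx]);
        let span = hi.library_rt - lo.library_rt;
        let t = if span > 0.0 { (library_rt - lo.library_rt) / span } else { 0.0 };
        lo.half_width + t * (hi.half_width - lo.half_width)
    };
    (width * RIDGE_WIDTH_MULTIPLIER).max(MIN_RT_TOLERANCE_MINUTES)
}
```

The lower bound keeps very narrow ridges from producing a tolerance too tight to be usable, while the multiplier scales all widths uniformly.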
Move Array2D and ArrayElement from timsquery into a new rust/array2d workspace crate with its own Array2DError type. timsquery re-exports from array2d and bridges errors via From<Array2DError> for DataProcessingError. calibrt gains array2d as a direct dependency.
Makes the semantic meaning explicit: library RT on the x-axis and observed RT on the y-axis. Also renames RidgeMeasurement.x to .library.
…ation viz
Pathfinding: use geometric mean (sqrt weights) for the edge weight formula to reduce bias against sparse regions. After the DP pass, greedily extend the path backward/forward through monotonic non-suppressed cells to cover the full RT range.
Viewer: draw the extrapolated prediction as a dashed red line beyond curve bounds, clamped to the grid y-range to prevent runaway extrapolation.
Print training fold progress and scoring time to stderr so users can tell the rescore phase isn't frozen. Uses Duration debug format for automatic unit selection. Also fix: score() was calling assign_scores() redundantly after fit() already assigned them — removed the double scoring pass.
…tric
Add column_weight to RidgeMeasurement (total weight in full column) alongside ridge_weight (weight inside ridge bounds). Compute in-ridge ratio in RidgeWidthSummary and display as percentage in both CLI calibration summary and viewer tolerance panel.
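The two summary numbers mentioned in these commits, a weight-averaged half-width and an in-ridge ratio, reduce to a few lines of arithmetic. The sketch below reuses the field names from the commit messages, but the struct layouts, the `summarize` function, and the choice of `ridge_weight` as the averaging weight are assumptions rather than the actual calibrt code.

```rust
/// Per-column ridge measurement; field names follow the commit messages,
/// the layout itself is assumed.
#[derive(Debug, Clone, Copy)]
pub struct RidgeMeasurement {
    pub library: f64,
    pub half_width: f64,
    pub ridge_weight: f64,  // weight inside the ridge bounds
    pub column_weight: f64, // total weight in the full column
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RidgeWidthSummary {
    pub weighted_half_width: f64,
    pub in_ridge_ratio: f64,
}

/// Returns `None` when there is no weight to average over.
pub fn summarize(columns: &[RidgeMeasurement]) -> Option<RidgeWidthSummary> {
    let ridge_total: f64 = columns.iter().map(|m| m.ridge_weight).sum();
    let column_total: f64 = columns.iter().map(|m| m.column_weight).sum();
    if ridge_total <= 0.0 || column_total <= 0.0 {
        return None;
    }
    // Heavy columns count more towards the global half-width.
    let weighted_half_width = columns
        .iter()
        .map(|m| m.half_width * m.ridge_weight)
        .sum::<f64>()
        / ridge_total;
    Some(RidgeWidthSummary {
        weighted_half_width,
        // Fraction of all column weight that falls inside the ridge.
        in_ridge_ratio: ridge_total / column_total,
    })
}
```

Multiplying `in_ridge_ratio` by 100 gives the percentage form described for the CLI summary and viewer panel.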
- Viewer deadlock: drop channel receiver before joining background thread in reset() and Drop, use try_send for Done messages so the background thread never blocks on a full channel.
- RT fields: replace ambiguous query_rt_seconds/delta_rt/sq_delta_rt/recalibrated_rt with explicit library_rt, calibrated_rt_seconds, obs_rt_seconds, and calibrated_sq_delta_rt computed from calibrated residuals. ML features updated accordingly.
- Batch error handling: replace .unwrap() on run_pipeline with proper error propagation; abort batch on I/O errors (disk full, permissions) instead of retrying every remaining file.
- Grid::reset() preserves bin center geometry instead of zeroing nodes; add Grid::reconfigure() for changing dimensions.
- CalibrationState::update returns Result: rejects NaN/Inf coordinates and weights at the grid boundary instead of silently accumulating them. Propagated as error in CLI, logged as warning in viewer.
- Replace bare .unwrap() on partial_cmp with descriptive .expect() messages documenting the invariant that NaN scores should not reach the sort phase.
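Rejecting non-finite input at the boundary, rather than letting it accumulate, can be sketched as follows. `Accumulator` and `UpdateError` are hypothetical stand-ins for the real CalibrationState and its error type, and the running sum stands in for the actual grid update.

```rust
/// Hypothetical error type for a rejected update.
#[derive(Debug, PartialEq)]
pub enum UpdateError {
    NonFinite { field: &'static str },
}

/// Hypothetical stand-in for a calibration accumulator.
#[derive(Debug, Default)]
pub struct Accumulator {
    pub total_weight: f64,
    pub n_points: usize,
}

impl Accumulator {
    /// Validates every input before touching any state, so a rejected
    /// point leaves the accumulator unchanged.
    pub fn update(
        &mut self,
        library_rt: f64,
        observed_rt: f64,
        weight: f64,
    ) -> Result<(), UpdateError> {
        for (field, value) in [
            ("library_rt", library_rt),
            ("observed_rt", observed_rt),
            ("weight", weight),
        ] {
            // is_finite() is false for NaN and for +/- infinity.
            if !value.is_finite() {
                return Err(UpdateError::NonFinite { field });
            }
        }
        self.total_weight += weight;
        self.n_points += 1;
        Ok(())
    }
}
```

A CLI caller can propagate the error with `?`, while an interactive caller can log it and continue with the next point.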
- I4: Document count_falling_steps convention (apex counts as 1)
- I5: get_frag_range returns Result instead of panicking on non-DIA files
- I6: n_scored in calibration JSON now reflects Phase 1 library size, not calibrant count (which was redundant with n_calibrants)
- I7: rt_range_seconds in calibration JSON now uses raw file RT range from the cycle mapping, not the calibrant subset
- I8: Fold progress uses atomic eprintln! lines instead of split eprint!/eprintln! to avoid interleaving with progress bars
- I9: CalibrantCandidate Ord uses f32::total_cmp for sound ordering
- I10: Parquet writer returns Result from add/flush/close instead of panicking on write errors; propagated as TimsSeekError::Io
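Item I9 relies on `f32::total_cmp`, which defines a total order over all `f32` values (including NaN) and therefore makes a lawful `Ord` implementation possible. A minimal sketch, with a hypothetical `Candidate` type and `top_k` helper standing in for CalibrantCandidate and its heap:

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;

/// Hypothetical scored candidate.
#[derive(Debug, Clone, Copy)]
pub struct Candidate {
    pub score: f32,
    pub id: u32,
}

impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        // total_cmp never panics and orders NaN consistently,
        // unlike partial_cmp(..).unwrap().
        self.score
            .total_cmp(&other.score)
            .then_with(|| self.id.cmp(&other.id))
    }
}

impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Equality is derived from the same ordering so Eq and Ord agree.
impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Candidate {}

/// Keeps the `k` highest-scoring candidates, returned best first.
pub fn top_k(candidates: impl IntoIterator<Item = Candidate>, k: usize) -> Vec<Candidate> {
    // Min-heap via Reverse: the weakest retained candidate sits at the top.
    let mut heap = BinaryHeap::with_capacity(k.saturating_add(1));
    for c in candidates {
        heap.push(Reverse(c));
        if heap.len() > k {
            heap.pop(); // evict the current weakest
        }
    }
    let mut out: Vec<Candidate> = heap.into_iter().map(|Reverse(c)| c).collect();
    out.sort_by(|a, b| b.cmp(a));
    out
}
```

Because the ordering is total, a stray NaN score cannot corrupt the heap invariant; it simply lands at a fixed position in the order.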
- Extract save_calibration_dialog helper to deduplicate Paused/Done save button logic in the viewer
- Fix redundant sqrt() call in apex_finding compute_pass_1
- Mark Parquet columns as non-nullable (no data is ever null, saves validity bitmap overhead)
prescore() only needs ApexLocation — the metadata (including a cloned digest String per peptide) was built and immediately discarded. Call build_extraction directly instead of build_broad_extraction to avoid the unnecessary allocation in the Phase 1 hot loop.
- Make rayon an optional dependency (default on). Serial mode is now --no-default-features instead of the dead serial_scoring feature.
- Gate parallel code behind #[cfg(feature = "rayon")], serial behind #[cfg(not(feature = "rayon"))].
- Add PrescoreTimings struct with extraction/scoring breakdown and n_passed_filter/n_scored counters. Aggregated via fold/reduce in both parallel and serial paths.
- Rename thread-summed timing fields to *_thread_ms to distinguish from wall-clock phase timings.
- Remove dead serial_scoring feature, dead rayon re-export in cv.rs.
- Remove per-item filter timing (measured clock overhead, not work).
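Feature-gating a parallel and a serial path that both aggregate per-item timings can be sketched as below. The `PrescoreTimings` fields follow the commit message, but `merge`, `score_one`, `prescore_all`, and the toy filter are hypothetical. The parallel branch assumes a Cargo feature named `rayon` that enables the optional `rayon` dependency; without that feature only the serial branch is compiled.

```rust
use std::time::{Duration, Instant};

/// Per-thread timing breakdown; durations are summed across items, so in
/// parallel mode they can exceed wall-clock time.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct PrescoreTimings {
    pub extraction: Duration,
    pub scoring: Duration,
    pub n_passed_filter: usize,
    pub n_scored: usize,
}

impl PrescoreTimings {
    /// Associative merge, usable by both `fold` and rayon's `reduce`.
    pub fn merge(self, other: Self) -> Self {
        Self {
            extraction: self.extraction + other.extraction,
            scoring: self.scoring + other.scoring,
            n_passed_filter: self.n_passed_filter + other.n_passed_filter,
            n_scored: self.n_scored + other.n_scored,
        }
    }
}

/// Hypothetical per-item work: even items pass the filter, non-zero ones score.
fn score_one(item: &u32) -> PrescoreTimings {
    let mut t = PrescoreTimings::default();
    if *item % 2 != 0 {
        return t; // filtered out, nothing timed
    }
    t.n_passed_filter += 1;
    let start = Instant::now();
    let extracted = *item; // stand-in for extraction
    t.extraction += start.elapsed();
    let start = Instant::now();
    let scored = extracted > 0; // stand-in for scoring
    t.scoring += start.elapsed();
    if scored {
        t.n_scored += 1;
    }
    t
}

#[cfg(feature = "rayon")]
pub fn prescore_all(items: &[u32]) -> PrescoreTimings {
    use rayon::prelude::*;
    items
        .par_iter()
        .map(score_one)
        .reduce(PrescoreTimings::default, PrescoreTimings::merge)
}

#[cfg(not(feature = "rayon"))]
pub fn prescore_all(items: &[u32]) -> PrescoreTimings {
    items
        .iter()
        .map(score_one)
        .fold(PrescoreTimings::default(), PrescoreTimings::merge)
}
```

Keeping the merge function shared means the two paths can only differ in how items are iterated, not in how the counters are combined.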
Replace manual Instant::now()/elapsed()/println!() boilerplate with two
primitives: timed! for hot-path Duration accumulation (pipeline.rs) and
TimedStep for progressive CLI output with auto dot-padding and tracing
spans (main.rs, processing.rs, cv.rs). Duration display uses {:?} for
automatic unit selection everywhere.
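The two primitives could be approximated as in the sketch below. This is an assumed shape, not the actual definitions: the macro signature, `TimedStep::begin`/`finish`, and the `format_step` helper are hypothetical, and the tracing spans mentioned above are left out to keep the example free of external crates.

```rust
use std::time::{Duration, Instant};

/// Evaluates `$body`, adds the elapsed time to `$acc`, and yields the
/// body's value. Intended for hot paths that accumulate into a Duration.
macro_rules! timed {
    ($acc:expr, $body:expr) => {{
        let start = ::std::time::Instant::now();
        let out = $body;
        $acc += start.elapsed();
        out
    }};
}

/// Pads the label with dots up to `width` columns (at least three dots),
/// then appends the duration using its Debug format for unit selection.
pub fn format_step(label: &str, elapsed: Duration, width: usize) -> String {
    let dots = ".".repeat(width.saturating_sub(label.len()).max(3));
    format!("{label} {dots} {elapsed:?}")
}

/// Hypothetical progressive CLI step: created when a phase starts,
/// prints one padded line when it finishes.
pub struct TimedStep {
    label: String,
    start: Instant,
}

impl TimedStep {
    pub fn begin(label: &str) -> Self {
        Self { label: label.to_string(), start: Instant::now() }
    }

    pub fn finish(self) -> Duration {
        let elapsed = self.start.elapsed();
        println!("{}", format_step(&self.label, elapsed, 40));
        elapsed
    }
}
```

Typical use would be `let step = TimedStep::begin("Loading index");` before a phase and `step.finish();` after it, with `timed!(timings.extraction, extract(item))` inside the per-item loop.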
... and A LOT more