v0.4.0

@praneethnamburi praneethnamburi released this 12 May 15:46
· 5 commits to main since this release

Headline feature: a Log-integrated EMG/EKG artifact cleaning
pipeline. Ports the multi-stage cleaner from
pn-projects/projects/emg_ica_cleaning.py (preprocess → ICA-based
ECG suppression → ACC-guided motion regression with safety gates)
into the package, with a clean splice-back into lf.signals /
lf.sensors[*].emg.

Added

  • Log.clean_emg_ekg_artifact(*, config, motion, in_place, generate_report, splice_source):
    an end-to-end pipeline that runs on every EMG channel in the Log.
    By default it mutates lf.signals in place, rebuilds the affected
    Sensor.emg bundles, and writes a multi-page PDF report next to
    the source CSV. Pass in_place=False to inspect diagnostics
    without mutating, or generate_report=False to skip the PDF step.
    splice_source chooses which cleaned variant gets spliced back —
    "combined" (default), "ekgonly", or "motiononly".
  • delsys.cleaning module — building blocks for users who want to
    drive the pipeline manually (fit_ica,
    score_components_against_ekg, auto_select_ekg_components,
    reconstruct_without_components, regress_out_ekg_from_emg,
    regress_out_motion_from_emg, harmonize_multirate_inputs,
    run_pipeline).
  • CleaningConfig / CleaningResult dataclasses (re-exported from
    delsys) — the configuration and result containers for
    Log.clean_emg_ekg_artifact. CleaningResult carries
    cleaned_emg_ekgonly (preprocess+ECG only) and
    cleaned_emg_motiononly (preprocess+motion only) variants
    alongside the combined cleaned_emg, plus feature_names and
    fname so the report and review helpers can label channels and
    default the output path.
  • CleaningResult.generate_report(path=None) — writes a single
    multi-page PDF (ECG diagnostics page, then the ranked summary
    table, then one page per EMG channel with raw vs each cleaning
    variant + PSD).
    Defaults to <source_csv_stem>_cleaning_report.pdf next to the
    input CSV when result.fname is stamped.
  • CleaningResult.review(channels=None) — interactive matplotlib
    viewer with three stacked time-domain panels (raw vs ekg-only, raw
    vs motion-only, raw vs cleaned), arrow-key channel navigation, and
    per-overlay toggles (e / m / c / o). The three panels share
    both x and y axes so amplitude comparisons across stages line up
    without zoom-juggling.
  • CleaningResult.review_components(components=None) — stacked
    4-panel viewer over the ICA components: top panel is the IC time
    course, the next three are the input signals it most contributes to
    (ranked by |A[i, c]|). Arrow-key cycling, home / end jumps,
    q to close. Use to decide whether to manually add or drop a
    component from the auto-detected set.
  • CleaningResult.ica and CleaningResult.ica_input_feature_names
    fields — full ICAResult (model, sources, mixing, feature names)
    from the ECG stage plus the per-input-row labels (EMG names with
    "EKG" appended). Both are None when the ECG stage didn't run.
    Powers review_components and is exposed for power-user
    introspection.
  • PDF report layout — page 1 is the new ECG diagnostics page (bar
    plot of per-IC correlation against the EKG reference, threshold
    line, and a text block listing the components removed); page 2 is
    the ranked summary table, now with a numeric channel column, a
    per-channel location label (from lf.emg.signal_names), and a
    motion dB column isolating the motion stage's contribution;
    pages 3..N are the per-channel pages (now with both x- and y-axis
    sharing across the three time-domain panels). The cleaner shifts
    the EMG baseline up front via pysampled.Data.shift_baseline so
    the dB metrics are not biased by a constant DC offset.
  • tutorials/cleaning_emg_ekg_artifact.md — end-to-end walkthrough
    covering load → dry-run → PDF report → interactive review →
    in-place mutation → power-user knobs. Also covers
    review_components, splice_source, and the new tutorial sample.
  • scripts/make_tutorial_sample.py and the bundled
    tutorials/data/taichi_trial5_6s.csv (6 s, every sensor kept),
    plus a matching reference report PDF: sample data the tutorial
    points at, big enough for ICA to converge on a real recording.
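
The |A[i, c]| ranking that review_components uses to choose its three signal panels can be sketched in plain Python. This is a minimal illustration rather than the package's code: top_contributing_inputs is a hypothetical helper, the mixing matrix is assumed to be a list of rows (one per input signal, one column per ICA component), and the numbers are made up.

```python
def top_contributing_inputs(mixing, component, feature_names, k=3):
    """Return the k input labels with the largest |A[i, component]|.

    mixing: list of rows, one per input signal; mixing[i][c] is A[i, c].
    feature_names: one label per row of `mixing` (assumed layout).
    """
    if len(mixing) != len(feature_names):
        raise ValueError("need exactly one feature name per mixing row")
    # Pair each input label with the magnitude of its weight on this component.
    weighted = [(abs(row[component]), name) for row, name in zip(mixing, feature_names)]
    # Largest magnitude first; sorted() is stable, so ties keep input order.
    weighted = sorted(weighted, key=lambda pair: pair[0], reverse=True)
    return [name for _, name in weighted[:k]]


# Toy 4-input, 2-component mixing matrix (made-up numbers).
A = [
    [0.1, -2.0],
    [0.5, 0.3],
    [-0.9, 1.0],
    [0.2, -0.1],
]
names = ["EMG 1", "EMG 2", "EMG 3", "EKG"]
top_for_ic1 = top_contributing_inputs(A, component=1, feature_names=names)
```

Ranking on the absolute value keeps strongly negative weights, which matter as much as positive ones because the sign of an ICA component is arbitrary.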

Internal

  • ECG component selection defaults to lagged-correlation
    auto-detection. Manual override via
    CleaningConfig.ecg_components_to_remove.
  • The default ACC source for motion regression is sensor-paired
    auto-discovery (Trigno Avanti sensors that carry both EMG and
    ACC). Pass custom pairings via
    motion={emg_num: acc_num_or_location}.
  • Pipeline runs offline only in v1. The realtime / overlap-add
    variant from the source is intentionally not ported — restore if
    a real streaming use case appears.
  • run_pipeline runs one extra regress_out_motion_from_emg pass
    on the preprocessed signal (skipping the ECG step) to populate
    cleaned_emg_motiononly. Cheap compared to the ICA fit, which is
    not duplicated.
  • Reporting / review helpers (_rank_channels_by_attenuation,
    _draw_channel_panels, _motion_outcome_for_channel, etc.) live
    in src/delsys/cleaning.py alongside the dataclass. Matplotlib /
    scipy.signal.welch are imported lazily inside the helpers so
    run_pipeline-only callers don't pay the import cost.
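
The idea behind lagged-correlation auto-detection can be sketched without any dependencies: score each ICA component by its largest absolute Pearson correlation with the EKG reference over a small window of lags, then keep the components above a threshold. Everything below is an assumption for illustration. The helper names are hypothetical and are not the functions in delsys.cleaning, and the threshold (0.5) and lag window (5 samples) are arbitrary values, not the package defaults.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def max_lagged_abs_correlation(component, reference, max_lag):
    """Largest |r| between component and reference over lags in [-max_lag, max_lag]."""
    if len(component) != len(reference):
        raise ValueError("component and reference must have the same length")
    n = len(reference)
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            # Compare component[t + lag] against reference[t].
            x, y = component[lag:], reference[: n - lag]
        else:
            # Compare component[t] against reference[t - lag].
            x, y = component[:lag], reference[-lag:]
        best = max(best, abs(pearson(x, y)))
    return best


def pick_reference_like_components(sources, reference, threshold=0.5, max_lag=5):
    """Return (indices of components scoring >= threshold, all scores)."""
    scores = [max_lagged_abs_correlation(s, reference, max_lag) for s in sources]
    selected = [i for i, score in enumerate(scores) if score >= threshold]
    return selected, scores


# Synthetic data: a spike train standing in for the EKG reference, one
# component that is the same train delayed by 2 samples, and one sinusoid.
reference = [1.0 if t % 10 == 5 else 0.0 for t in range(50)]
delayed = [0.0, 0.0] + reference[:-2]
unrelated = [math.sin(0.37 * t) for t in range(50)]
selected, scores = pick_reference_like_components([unrelated, delayed], reference)
```

Searching over lags is what lets the delayed copy score highly even though its zero-lag correlation with the reference is weak.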

Fixed

  • clean_emg_ekg_artifact() no longer crashes on Log objects
    loaded from very old pickles where the per-Signal meta dict is
    empty. _normalize_signal_lengths reads meta.get("modality")
    defensively, and the splice-back updates each affected sensor's
    emg bundle via pysampled.Data._clone instead of rebuilding the
    whole Sensor from lf.signals, so the cleaning lands on lf.emg
    even when per-Signal access paths can't be repaired.
  • The auto-report path is checked for write access before the
    cleaning pipeline runs. A locked PDF (file open in another viewer)
    now raises a clear PermissionError with a "close it and re-run, or
    pass generate_report=False" hint up front, instead of wasting the
    ICA work and leaving a half-applied in-place splice with no fresh
    report to match it.
  • _band_power (Welch integral used by the report's ecg-band dB
    column) is NumPy 2.0-compatible. The previous
    getattr(np, "trapezoid", np.trapz) fallback evaluated the
    default eagerly and tripped the expired-attribute error on NumPy
    2.0; the new form uses hasattr(np, "trapezoid") so the legacy
    np.trapz is only accessed on NumPy < 2.0, where np.trapezoid
    does not exist.
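
The eager-default pitfall behind the _band_power fix can be reproduced without NumPy. In the sketch below, NewStyleModule and LegacyModule are hypothetical stand-ins for a module namespace, not real NumPy: the first offers trapezoid and raises on any other attribute, the second offers only trapz. The integrate_* functions are illustrative and do not reflect the actual _band_power code.

```python
def _trapezoid(y, dx=1.0):
    """Trapezoidal-rule integral of evenly spaced samples."""
    return dx * sum((a + b) / 2.0 for a, b in zip(y, y[1:]))


class NewStyleModule:
    """Stand-in namespace: has `trapezoid`, raises on the expired `trapz`."""

    trapezoid = staticmethod(_trapezoid)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        raise AttributeError(f"`{name}` was removed; use `trapezoid` instead.")


class LegacyModule:
    """Stand-in namespace: only the legacy `trapz` exists."""

    trapz = staticmethod(_trapezoid)


def integrate_eager(mod, y):
    # Pitfall: `mod.trapz` is evaluated as an argument before getattr() runs,
    # so it raises even when `trapezoid` is available.
    fn = getattr(mod, "trapezoid", mod.trapz)
    return fn(y)


def integrate_guarded(mod, y):
    # The legacy name is touched only when the new one is missing.
    fn = mod.trapezoid if hasattr(mod, "trapezoid") else mod.trapz
    return fn(y)


samples = [0.0, 1.0, 2.0]
guarded_new = integrate_guarded(NewStyleModule(), samples)
guarded_legacy = integrate_guarded(LegacyModule(), samples)

try:
    integrate_eager(NewStyleModule(), samples)
    eager_failed = False
except AttributeError:
    eager_failed = True
```

A conditional expression evaluates only the branch it selects, which is why the guarded form never reaches the legacy name when the new one is present.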