Migrate FLASHApp visualizations to OpenMS-Insight components#92
Migrate FLASHApp visualizations to OpenMS-Insight components#92t0mdavid-m wants to merge 18 commits into
Conversation
Foundation for migrating FLASHApp visualizations onto OpenMS-Insight (Phase 0):
- src/parse/long_format.py: pure-polars adapters that explode FLASHApp's
arrays-per-scan caches into the long format OpenMS-Insight components filter
by column value:
* explode_spectrum_long: MonoMass[]/SumIntensity[] -> one row per peak with
explicit index + per-scan mass_id (assigned pre-filter so massIndex maps
to the original array position) -> LinePlot/Table with filters={scanIndex:index}
* explode_combined_spectrum_long: deconv + annotated -> primary + overlay
series for the tagger LinePlot
* explode_signal_peaks_long: nested SignalPeaks/NoisyPeaks -> Scatter3D long
format (index, mass_id, mz, charge, intensity, kind); robust to empty
nested peak columns
* density_series_long: precomputed target/decoy curves -> DensityPlot long
{series,x,y}; handles empty decoy
These are additive — the existing render pipeline is untouched, so old and new
paths coexist during the phased rollout.
- requirements.txt: declare openms-insight (git dependency; editable/path for
local dev) — previously absent.
- tests/test_long_format.py: 11 tests incl. index->value filtering reproducing
iloc semantics, mass_id-before-intensity-filter, row-count preservation,
signal/noise + massIndex isolation, empty-decoy. Full suite: 53 passed.
https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
Capture the component mapping, state-key mapping, the index->value / long-format data-model change, the phased rollout, layout-parity requirements, and the no-feature-loss verification gate. Serves as the source-of-truth contract to audit each workflow against before retiring src/render/*. https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
Validate the migration end-to-end on the 266MB bundled example data (example-data/workspaces/default), feeding the ACTUAL cached parquet frames through the long-format adapters + OpenMS-Insight components: - deconv_spectrum explodes to the exact peak count; LinePlot value-filter reproduces the old iloc[scanIndex] slice. - threedim_SN_plot validates the optional-massIndex Scatter3D semantics on real data (scan-only shows all peaks; massIndex isolates one mass). - density_decoy confirmed empty/Null-typed (empty-decoy path). - FeatureView.explode_traces handles the real quant_dfs (1437 groups -> 722645 trace points). - InternalFragmentMap port matches the CURRENT src.render.sequence algorithm (the bundled internal_fragment_data.pkl.gz is stale from an older algorithm version while the feature was disabled; documented in the test). Tests skip cleanly when example data / openms_insight are absent. Suite: 48 passed, 2 skipped. https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
Add src/render_oi/ — the OpenMS-Insight rendering engine that replaces the monolithic flash_viewer_grid for FLASHDeconv, composing individual components with native Streamlit layout: - build_component(): factory mapping each FLASHDeconvLayoutManager.COMPONENT_NAMES entry to an OI component, loading the existing .pq caches through the long_format adapters. Covers heatmaps (MS1/MS2 deconv/raw), scan/mass Tables, deconv/annotated LinePlots, 3D Scatter3D (optional massIndex), FDR DensityPlot (precomputed), and SequenceView/InternalFragmentMap when a sequence is set. - render_experiment(): renders one [row][col] panel with a PER-EXPERIMENT StateManager (distinct session_key) so side-by-side panels don't share selections. Cross-link via scanIndex/massIndex identifier->column model. - FLASHDeconvViewer.py: render_panel() dispatches to the new engine when FLASHAPP_USE_OPENMS_INSIGHT is set, else the legacy render_grid. Default OFF — opt-in, reversible rollout. Layout manager needs NO changes (it only emits component-name strings). Verified end-to-end on the real example_fd cache: all 10 components build; a simulated scan-273 click filters mass_table->122, deconv_spectrum->122, 3D->2074 (all) / 3 (mass0) — exact parity with the source row counts. The index->value migration (highest-risk change) is proven correct on real data. +11 tests; FLASHApp suite: 59 passed, 2 skipped. https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
Claude Code on the web starts from a fresh container with no Python deps and an unbuilt Vue bundle. This hook reproduces the full bootstrap so tests and the OpenMS-Insight migration work out of the box: - pip install FLASHApp requirements + pytest (user-site) - npm install + build the OpenMS-Insight Vue bundle, mirror dist into the package path the wheel force-includes, then editable-install the local sibling repo - persist PYTHONPATH (repo root for src.*) and PATH (~/.local/bin) Critically, the hook strips the 'openms-insight @ git+...' line from requirements.txt before installing: building OI from the git URL fails (no prebuilt bundle in a fresh clone) AND aborts the whole requirements install, which had been silently dropping scipy/pyopenms. OI is instead installed from the local sibling with its freshly-built bundle. Web-only (CLAUDE_CODE_REMOTE), idempotent, synchronous. Validated: hook exits 0; all core deps + all 8 OI components import via user-site; 11 FLASHApp viewer + 29 OI tests pass through the hook environment (no venv). https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
Add src/render_oi/tnt_viewer.py — the OpenMS-Insight engine for FLASHTnT,
composing protein Table, tag Table, sequence view, augmented combined spectrum,
score-distribution DensityPlot, and heatmaps with a per-experiment StateManager.
Reproduces FLASHTnT's proteoform→scan resolution: the protein table sets
proteinIndex; render_experiment_tnt resolves it via build_proteoform_scan_map to
deconvIndex BEFORE rendering downstream panels, so:
- spectra / mass / sequence filter by {deconvIndex: index}
- tag table filters by {proteinIndex: ProteinIndex}
- sequence view filters by {proteinIndex: proteoform_index} with per-proteoform
coverage coloring (coverage[]/maxCoverage from the cached sequence_data)
The augmented spectrum uses LinePlot's overlay extension (deconv primary +
annotated overlay). Handles both pickled (.pkl.gz) and parquet sequence_data
caches; empty id-FDR densities degrade cleanly via the precomputed DensityPlot.
FLASHTnTViewer.py: render_panel_tnt() dispatches to the new engine under
FLASHAPP_USE_OPENMS_INSIGHT, else legacy render_grid (default OFF). Layout
manager unchanged.
Verified on the real antibody cache: all 7 components build; proteoform→scan
resolves (protein 0→deconv 0); protein-0 click → 15 tags (exact); sequence view
→ 450 residues with coverage; combined spectrum → 564 primary + 65657 overlay
peaks (exact). +10 tests; FLASHApp suite: 69 passed, 2 skipped.
https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
Add src/render_oi/quant_viewer.py — the OpenMS-Insight engine for FLASHQuant's
single feature-group view: a feature-group selector Table (click sets
featureGroup) above a FeatureView filtered by {featureGroup: feature_group}.
Add explode_quant_traces_long() to src/parse/long_format.py: FLASHQuant stores
per-feature-group parallel arrays (Charges/IsotopeIndices/CentroidMzs as scalars
per trace; RTs/MZs/Intensities as comma-joined point strings per trace). The
adapter zips the per-trace lists, splits/explodes the point strings, and emits
FeatureView's long format (one row per trace point: feature_group, charge,
isotope, centroid_mz, rt, mz, intensity).
FLASHQuantViewer.py dispatches to render_experiment_quant under
FLASHAPP_USE_OPENMS_INSIGHT, else legacy render_grid (default OFF).
Verified on the real quant cache: 1437 feature groups → 722,645 trace points;
group 0/5/100 explode to 1384/239/208 points exactly; FeatureView filters to
those exact counts; selector table shows all 1437 groups. +3 tests; FLASHApp
suite: 72 passed, 2 skipped.
https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
Update docs/openms-insight-migration.md: mark all 3 phases built (flag-gated in src/render_oi/), refresh test counts (FLASHApp 72 passed/2 skipped; OI 464), and add the per-workflow no-feature-loss audit that gates retiring src/render/*. For each workflow the audit maps every legacy StateTracker key and update.py filter to its src/render_oi replacement (with file:line), accounts for every component's columns/axes, marks each ✅ data-path-verified (with the real-data test name) or 👁 needs-browser, and gives a click-by-click browser checklist. Also documents the exact post-audit deletion list — keeping scan_resolution.py and sequence_data_store.py (still imported by tnt_viewer) while removing components/render/update/StateTracker/initialize/compression.py and the js-component bundle. https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
…dle) in Docker Viewers: flip FLASHAPP_USE_OPENMS_INSIGHT from opt-in to opt-out across the three viewers (FLASHDeconv/TnT/Quant). Unset now renders via src/render_oi; set the var to 0/false/no/off to fall back to the legacy flash_viewer_grid. Dockerfiles (x86 + arm): add an openms-insight-build node stage that clones OpenMS-Insight at the validated commit (ARG OPENMS_INSIGHT_REF), runs npm build, and mirrors js-component/dist into openms_insight/js-component/dist — the path the hatchling wheel force-includes and rendering/bridge.py loads at runtime. The run-app stage installs that bundle-built checkout. Without this, the package's git tree omits the gitignored bundle and pip would ship a frontend-less install, so the requirements.txt openms-insight git line is stripped before pip install -r (alongside pyopenms) and replaced by this explicit, pinned install. Docs: update the migration doc — engine is now default-on (opt-out documented), add a Docker/packaging section explaining the bundle-build requirement, and note the browser checklist now gates only the deletion of the legacy engine, not the default flip. Verified: FLASHApp suite 72 passed/2 skipped; flag boolean truth-table; the Docker requirements-strip pipeline (drops pyopenms== and openms-insight@git, keeps pyopenms-viz); multi-stage ordering and the dist path vs bridge.py. https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
|
Warning Review limit reached
More reviews will be available in 2 minutes and 25 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (7)
📝 WalkthroughWalkthroughIntegrates OpenMS-Insight renderers behind a feature flag, adds Polars long-format adapters, updates packaging/CI/Docker to build and embed the Vue bundle, wires viewers to the new renderers, and adds unit and real-data tests for Deconv, TnT, and Quant. ChangesOpenMS-Insight Integration for FLASHApp Visualization
Possibly Related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (6)
.claude/hooks/session-start.sh (1)
32-32: 💤 Low valueStrip pattern is narrower than the Dockerfiles'.
Here you match
^openms-insight[[:space:]]*@, while the Dockerfiles use^openms-insight[[:space:]]*[@=<>!~]. Both strip the current@ git+...line, but if the requirement is ever changed to a pinned version (openms-insight==…), this hook would stop stripping it and the install would diverge from Docker. Align the patterns for consistency.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.claude/hooks/session-start.sh at line 32, The grep pattern in the session-start hook is too narrow and only strips lines matching '^openms-insight[[:space:]]*@', so update the grep call that writes to "$REQ_TMP" (the line using grep -ivE) to use the same character class as the Dockerfiles — '^openms-insight[[:space:]]*[@=<>!~]' — so pinned versions like 'openms-insight==...' are also removed and behavior stays consistent with Dockerfile installs.docs/openms-insight-migration.md (1)
137-139: ⚡ Quick winDoc should call out the CI install pitfall, not just Docker/dev.
This section frames the
requirements.txtgit ref as a harmless "declarative pointer" that both install paths strip. But the CI unit-test job does not strip it and currently fails on it (see the requirements.txt comment). Update this note to state that any path running a rawpip install -r requirements.txt(CI included) must strip/replace the line, so the contract here matches reality.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/openms-insight-migration.md` around lines 137 - 139, Update the note about requirements.txt to explicitly call out CI: state that any process which runs a raw pip install -r requirements.txt (including the CI unit-test job) will fail if the git+…@<branch> ref remains, so those paths must strip or replace that line before installing; reference the requirements.txt entry and the CI unit-test job/pip install -r requirements.txt command so readers know which workflows need handling.content/FLASHTnT/FLASHTnTViewer.py (1)
14-16: ⚡ Quick winCentralize the
USE_OPENMS_INSIGHTflag parsing (DRY).This exact env-flag block is duplicated verbatim in
content/FLASHQuant/FLASHQuantViewer.py(Lines 15-17) and shared with FLASHDeconv. Extracting a single helper (e.g. insrc/common) avoids drift if the truthy/falsy set ever changes.♻️ Suggested shared helper
# src/common/common.py import os def use_openms_insight() -> bool: return os.environ.get("FLASHAPP_USE_OPENMS_INSIGHT", "1").strip().lower() not in ( "0", "false", "no", "off", "", )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@content/FLASHTnT/FLASHTnTViewer.py` around lines 14 - 16, Extract the duplicated environment-parsing logic into a single helper (e.g., function use_openms_insight) and replace the module-level constant USE_OPENMS_INSIGHT in FLASHTnTViewer.py (and the corresponding duplicate in FLASHQuantViewer.py / FLASHDeconv) with a call to that helper; create the helper in a shared module (suggested: src/common/common.py) that exposes use_openms_insight(), import that function in FLASHTnTViewer.py and use it where USE_OPENMS_INSIGHT was referenced so the truthy/falsy set is maintained centrally.src/render_oi/tnt_viewer.py (2)
47-49: ⚡ Quick winAvoid the pandas→Polars→pandas round trip.
_load_pandas_plreads the parquet via pandas, converts to Polars, and you immediately call.to_pandas()again. Use the pandas loader (_load_pandas) directly to skip two conversions.♻️ Suggested change
- from src.render.scan_resolution import build_proteoform_scan_map - - prot = _load_pandas_pl(file_manager, dataset_id, "protein_dfs").to_pandas() - scan = _load_pandas_pl(file_manager, dataset_id, "scan_table").to_pandas() + from src.render.scan_resolution import build_proteoform_scan_map + from .deconv_viewer import _load_pandas + + prot = _load_pandas(file_manager, dataset_id, "protein_dfs") + scan = _load_pandas(file_manager, dataset_id, "scan_table") return build_proteoform_scan_map(prot[["index", "Scan"]], scan[["index", "Scan"]])🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/render_oi/tnt_viewer.py` around lines 47 - 49, The code currently loads parquet via _load_pandas_pl then calls .to_pandas(), causing an unnecessary pandas→Polars→pandas round trip; replace those calls with the pandas loader _load_pandas to read directly into pandas for both "protein_dfs" and "scan_table" (use _load_pandas(file_manager, dataset_id, "protein_dfs") and _load_pandas(file_manager, dataset_id, "scan_table")) and pass their ["index","Scan"] slices into build_proteoform_scan_map unchanged.
284-290: ⚡ Quick winResolver loads caches on every rerun even with no selection.
_build_proteoform_scan_mapreadsprotein_dfsandscan_tableon each rerun before checkingprotein_index. When nothing is selected the result is unused. Consider skipping the resolve (or caching it) whenstate_manager.get_selection(PROTEIN)isNone.♻️ Suggested guard
- scan_map = _build_proteoform_scan_map(file_manager, dataset_id) protein_index = state_manager.get_selection(PROTEIN) if protein_index is not None: + scan_map = _build_proteoform_scan_map(file_manager, dataset_id) entry = scan_map.get(int(protein_index)) deconv_index = entry["deconv_index"] if entry else None if state_manager.get_selection(DECONV) != deconv_index: state_manager.set_selection(DECONV, deconv_index)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/render_oi/tnt_viewer.py` around lines 284 - 290, Avoid building the proteoform scan map when no protein is selected: first check state_manager.get_selection(PROTEIN) and only call _build_proteoform_scan_map(file_manager, dataset_id) if that selection is not None; if you need the map elsewhere, cache the result (e.g., memoize by dataset_id) so repeated reruns don't re-read protein_dfs/scan_table. Update the logic around _build_proteoform_scan_map, entry, deconv_index, and state_manager.set_selection(DECONV, deconv_index) so the expensive read is skipped when protein_index is None and the DECONV selection update remains correct when a protein is selected.src/render_oi/deconv_viewer.py (1)
310-314: 💤 Low valueNo-op
renamecan be dropped.
{"mass": "mass", "index": "index"}renames both columns to their current names, so the call does nothing. Either remove it or fix it if a different target name was intended.♻️ Suggested cleanup
peaks_long = ( explode_spectrum_long(per_scan) - .rename({"mass": "mass", "index": "index"}) .with_columns(pl.int_range(pl.len()).over("index").alias("peak_id")) )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/render_oi/deconv_viewer.py` around lines 310 - 314, The call to .rename({"mass": "mass", "index": "index"}) in the peaks_long pipeline is a no-op; remove that .rename(...) call or replace it with the intended mapping if you meant to change column names. Update the expression that creates peaks_long (which uses explode_spectrum_long, .with_columns, pl.int_range(pl.len()).over("index").alias("peak_id")) by dropping the redundant .rename or correcting its mapping to the target column names so the pipeline is meaningful.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@requirements.txt`:
- Around line 143-148: The requirements entry for openms-insight is using a
moving branch ref (claude/peaceful-mayer-YqiXZ) which causes unit-tests CI to
install an unbuilt frontend; update the requirement to pin the same commit SHA
used by the Docker builds (9b4c57a60e63a4319c392afd6000aa205cdcbe36) by
replacing the existing "openms-insight @ git+...@claude/peaceful-mayer-YqiXZ"
line in requirements.txt so CI installs the identical, built revision as the
Docker/session-start paths; ensure the unit-tests workflow continues to run pip
install -r requirements.txt without any strip/replace steps so the pinned SHA is
used consistently.
---
Nitpick comments:
In @.claude/hooks/session-start.sh:
- Line 32: The grep pattern in the session-start hook is too narrow and only
strips lines matching '^openms-insight[[:space:]]*@', so update the grep call
that writes to "$REQ_TMP" (the line using grep -ivE) to use the same character
class as the Dockerfiles — '^openms-insight[[:space:]]*[@=<>!~]' — so pinned
versions like 'openms-insight==...' are also removed and behavior stays
consistent with Dockerfile installs.
In `@content/FLASHTnT/FLASHTnTViewer.py`:
- Around line 14-16: Extract the duplicated environment-parsing logic into a
single helper (e.g., function use_openms_insight) and replace the module-level
constant USE_OPENMS_INSIGHT in FLASHTnTViewer.py (and the corresponding
duplicate in FLASHQuantViewer.py / FLASHDeconv) with a call to that helper;
create the helper in a shared module (suggested: src/common/common.py) that
exposes use_openms_insight(), import that function in FLASHTnTViewer.py and use
it where USE_OPENMS_INSIGHT was referenced so the truthy/falsy set is maintained
centrally.
In `@docs/openms-insight-migration.md`:
- Around line 137-139: Update the note about requirements.txt to explicitly call
out CI: state that any process which runs a raw pip install -r requirements.txt
(including the CI unit-test job) will fail if the git+…@<branch> ref remains, so
those paths must strip or replace that line before installing; reference the
requirements.txt entry and the CI unit-test job/pip install -r requirements.txt
command so readers know which workflows need handling.
In `@src/render_oi/deconv_viewer.py`:
- Around line 310-314: The call to .rename({"mass": "mass", "index": "index"})
in the peaks_long pipeline is a no-op; remove that .rename(...) call or replace
it with the intended mapping if you meant to change column names. Update the
expression that creates peaks_long (which uses explode_spectrum_long,
.with_columns, pl.int_range(pl.len()).over("index").alias("peak_id")) by
dropping the redundant .rename or correcting its mapping to the target column
names so the pipeline is meaningful.
In `@src/render_oi/tnt_viewer.py`:
- Around line 47-49: The code currently loads parquet via _load_pandas_pl then
calls .to_pandas(), causing an unnecessary pandas→Polars→pandas round trip;
replace those calls with the pandas loader _load_pandas to read directly into
pandas for both "protein_dfs" and "scan_table" (use _load_pandas(file_manager,
dataset_id, "protein_dfs") and _load_pandas(file_manager, dataset_id,
"scan_table")) and pass their ["index","Scan"] slices into
build_proteoform_scan_map unchanged.
- Around line 284-290: Avoid building the proteoform scan map when no protein is
selected: first check state_manager.get_selection(PROTEIN) and only call
_build_proteoform_scan_map(file_manager, dataset_id) if that selection is not
None; if you need the map elsewhere, cache the result (e.g., memoize by
dataset_id) so repeated reruns don't re-read protein_dfs/scan_table. Update the
logic around _build_proteoform_scan_map, entry, deconv_index, and
state_manager.set_selection(DECONV, deconv_index) so the expensive read is
skipped when protein_index is None and the DECONV selection update remains
correct when a protein is selected.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d646420f-c45c-4fc2-bc76-362de9efc3a4
📒 Files selected for processing (19)
.claude/hooks/session-start.sh.claude/settings.jsonDockerfileDockerfile.armcontent/FLASHDeconv/FLASHDeconvViewer.pycontent/FLASHQuant/FLASHQuantViewer.pycontent/FLASHTnT/FLASHTnTViewer.pydocs/openms-insight-migration.mdrequirements.txtsrc/parse/long_format.pysrc/render_oi/__init__.pysrc/render_oi/deconv_viewer.pysrc/render_oi/quant_viewer.pysrc/render_oi/tnt_viewer.pytests/test_deconv_viewer_realdata.pytests/test_long_format.pytests/test_long_format_realdata.pytests/test_quant_viewer_realdata.pytests/test_tnt_viewer_realdata.py
Fix FLASHApp#92 pytest: the unit-tests workflow ran a raw 'pip install -r
requirements.txt', which fails on the 'openms-insight @ git+...' line because the
wheel force-includes a gitignored Vue bundle (hatchling FileNotFoundError at
metadata generation). Per the chosen approach, the workflow now sets up node,
clones OpenMS-Insight on the shared branch, builds + mirrors the bundle, installs
the checkout, then installs the rest of requirements with the openms-insight line
stripped. (Pinning the requirements line to a SHA would NOT fix this — the bundle
is gitignored regardless of branch vs SHA.)
Also bump the Docker OPENMS_INSIGHT_REF to the now-CI-green OI commit, and apply
the actionable CodeRabbit review points:
- session-start.sh: widen the strip char-class to match the Dockerfiles
([@=<>!~]) so a future pinned 'openms-insight==' would also be stripped.
- common.use_openms_insight(): centralize the flag parsing; the three viewers now
call it (DRY) and drop their now-unused 'import os'.
- tnt_viewer: load protein/scan tables via _load_pandas directly (drop a
pandas->polars->pandas round trip); only build the proteoform scan map when a
protein is actually selected.
- deconv_viewer: drop a no-op .rename({'mass':'mass','index':'index'}).
- migration doc: correct the packaging section (CI builds the bundle, not the
hook) and spell out that any raw 'pip install -r requirements.txt' must strip
the git line + build the bundle first.
Verified: py_compile of touched modules, unit-tests.yml YAML parse, helper
toggle, suite 72 passed/2 skipped.
https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
…data) A fresh FLASHTnT run stores sequence_data as a parquet dataset. _sequence_table called file_manager.get_results(...) WITHOUT use_pyarrow=True, so FileManager returned a pandas DataFrame; the code then fell through to reconstruct_all(df), which raised, and a bare 'except: return None' swallowed it -> build returned None -> the viewer showed 'Component unavailable: sequence_view'. The bundled example fixture is a stale .pkl.gz dict and the test's fake FileManager returned a Path for .pq, so both masked the bug. Fix: request use_pyarrow=True (FileManager then returns a pyarrow Dataset that reconstruct_all reads), keep the dict/.gz fallbacks for older caches, and stop swallowing failures so a genuine error surfaces as a render error instead of a silent 'unavailable'. Add tests/test_tnt_sequence_loader.py with a FileManager fake that mirrors real loader semantics (pandas by default, pyarrow Dataset on use_pyarrow) — it fails on the pre-fix code and passes on the fix, independent of the bundled data. This restores the proteoform sequence + per-residue coverage shading. Remaining TnT Sequence View parity (b/y fragment-ion flags, matching-fragments table, and the theoretical/observed/delta mass header) is tracked separately: those need peaks_data and/or an OpenMS-Insight SequenceView capability addition. Verified: full suite 73 passed, 2 skipped. https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
Now that OpenMS-Insight's SequenceView accepts a theoretical/observed mass header
and precomputed per-residue fragment masses (OI ec8e061), wire the FLASHTnT viewer
to reach legacy parity:
- _sequence_table now also carries, per proteoform: theoretical_mass and
observed_mass (= computed_mass / ProteoformMass; the -1.0 'unmatched' sentinel
maps to null so the header is omitted), plus precomputed fragment_masses_{a..z}
(list[list[float]] with mod-ambiguity variants) straight from the entry.
- new _peaks_table builds the observed peaks SequenceView matches against: each
proteoform's scan deconv peaks (neutral MonoMass + SumIntensity from
deconv_spectrum), stamped with proteoform_index via the proteoform->scan map.
- build_component_tnt's sequence_view passes theoretical_mass_column /
observed_mass_column / fragment_mass_columns + peaks_data + interactivity so the
Theo/Obs/Delta header, b/y annotation flags, and Matching Fragments table render
from the precomputed (modification-aware) masses rather than a bare-sequence
recompute.
Also bump the Docker OPENMS_INSIGHT_REF to ec8e061 (the enhanced OI).
Verified: full suite 73 passed, 2 skipped; the realdata test builds sequence_view
through the new path against the bundled fixture, and a strengthened loader test
locks the mass/fragment plumbing + the -1.0 sentinel. Browser verification of the
Vue rendering is still required (see the checklist in chat).
https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
…label Three regressions found by the parity audit of the OpenMS-Insight Deconv viewer: - Heatmaps had no click interactivity, so the legacy heatmap->spectra/mass/3D cross-link was dead. Restore it: deconv heatmaps emit scanIndex+massIndex, raw heatmaps emit scanIndex (the caches already carry scan_idx/mass_idx, and OI Heatmap preserves interactivity columns through compression). - 3D S/N plot plotted raw m/z on x; the legacy plots the deconvoluted mass (mz * charge). explode_signal_peaks_long keeps mz/charge separate, so derive a 'mass' column and point Scatter3D's mz_column at it. - Raw heatmaps' y-axis was mislabeled 'Monoisotopic mass'; the legacy labels raw heatmaps 'm/z' (deconv stays 'Monoisotopic mass'). Verified: deconv real-data suite 11 passed; full suite 73 passed/2 skipped; and mass == mz*charge confirmed on the bundled fixture (263k peaks). https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/render_oi/deconv_viewer.py (1)
323-326:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSame per-group row numbering issue for
peak_id.This uses the same
pl.int_range(pl.len())pattern flagged above. Apply the same fix for consistency and correctness.Proposed fix
peaks_long = ( explode_spectrum_long(per_scan) - .with_columns(pl.int_range(pl.len()).over("index").alias("peak_id")) + .with_columns(pl.int_range(pl.count()).over("index").alias("peak_id")) )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/render_oi/deconv_viewer.py` around lines 323 - 326, The group-local peak_id generation using pl.int_range(pl.len()).over("index") in the peaks_long creation is incorrect; replace that expression with a per-group row counter such as pl.cumcount().over("index").alias("peak_id") (applied to the exploded result from explode_spectrum_long) so peak_id is numbered correctly within each "index" group instead of globally.
🧹 Nitpick comments (1)
src/render_oi/deconv_viewer.py (1)
63-69: ⚡ Quick winDead code:
_load_pandasis defined but never used.This function is not called anywhere in the module. The similar
_load_pandas_plfunction (lines 268-275) is used instead for density curve loading.🧹 Remove unused function
-def _load_pandas(file_manager, dataset_id: str, name: str): - """Load a cached parquet result as a pandas DataFrame (via Path).""" - res = file_manager.get_results(dataset_id, [name]) - p = res[name] - import pandas as pd - - return pd.read_parquet(p) if not isinstance(p, pd.DataFrame) else p - -🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/render_oi/deconv_viewer.py` around lines 63 - 69, The function `_load_pandas` is dead code and should be removed; delete the `_load_pandas` function definition from this module and ensure no external references rely on it (the module already uses `_load_pandas_pl` for density curve loading), leaving `_load_pandas_pl` as the canonical loader for parquet results.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/render_oi/deconv_viewer.py`:
- Around line 323-326: The group-local peak_id generation using
pl.int_range(pl.len()).over("index") in the peaks_long creation is incorrect;
replace that expression with a per-group row counter such as
pl.cumcount().over("index").alias("peak_id") (applied to the exploded result
from explode_spectrum_long) so peak_id is numbered correctly within each "index"
group instead of globally.
---
Nitpick comments:
In `@src/render_oi/deconv_viewer.py`:
- Around line 63-69: The function `_load_pandas` is dead code and should be
removed; delete the `_load_pandas` function definition from this module and
ensure no external references rely on it (the module already uses
`_load_pandas_pl` for density curve loading), leaving `_load_pandas_pl` as the
canonical loader for parquet results.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 068ec608-3263-490b-9632-6498ad0e7036
📒 Files selected for processing (1)
src/render_oi/deconv_viewer.py
|
Thanks @coderabbitai — both findings on
Generated by Claude Code |
|
Tip For best results, initiate chat on the files or code changes.
🐇 ✏️ Learnings added
|
Now that OI's LinePlot supports a tagger overlay (OI 2adb7d7), feed it the data:
- tag_table gains interactivity={tagIndex: TagIndex}, so clicking a tag records
the selection.
- render_experiment_tnt resolves the selected tagIndex into a tagData dict
(_tag_data: parses tag_dfs.mzs into fragment masses, carries TagSequence /
StartPos / EndPos, flags N-terminal tags via Nmass==-1) and pushes it into the
panel StateManager under 'tagData' (alongside the existing proteinIndex ->
deconvIndex resolution). set_selection no-ops when unchanged.
- combined_spectrum's LinePlot now builds its primary data via
_deconv_signal_peaks_long, which carries a per-peak signal_peaks column of
[mz, intensity, charge] triplets (from combined_spectrum.SignalPeaks, dropping
the bin index), and is constructed with tag_overlay=True +
signal_peaks_column='signal_peaks'. With a tag selected, OI then highlights the
tag-matched sticks and draws the per-charge buttons + inter-residue amino-acid
arrows over the spectrum.
The sequence-view residue -> AApos cross-link (gold selected-residue highlight +
tag-table residue filtering) is still pending and tracked separately; the overlay
renders fully without it (selectedAA defaults to none).
Also bump the Docker OPENMS_INSIGHT_REF to 2adb7d7.
Verified: full suite 74 passed/2 skipped (new test_tagger_overlay_data locks the
signal_peaks triplets + tagData shape against the bundled fixture). The Vue
rendering itself needs browser verification (see checklist in chat).
https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
…spectrum Close the lower-severity parity gaps from the multi-angle audit (FLASHApp-side; all use existing OpenMS-Insight Table/LinePlot params). FLASHDeconv (deconv_viewer.py): - scan_table / mass_table: legacy column_definitions (titles, descriptive headerTooltips, numeric formatters) recovered from the legacy bundle, built only from the columns actually present. - deconv_spectrum: per-peak charge labels (z=N) via annotation_column, derived locally from combined_spectrum.SignalPeaks (most-intense peak's charge per mass). anno_spectrum left unannotated (its series carries no charge data). - 3D_SN_plot: dynamic title 'Precursor Signals' / 'Mass Signals' by selection. (Confirmed the legacy does NOT default to precursor-only -> all-masses-for-scan is already correct; not a gap.) FLASHTnT (tnt_viewer.py): - protein_table: restore dropped columns (Scan/length/MatchingFragments/ModCount/ TagCount/Score/ProteoformLevelQvalue/...) with titles+tooltips, initial_sort by Score desc, go_to_fields; 'Best per spectrum' toggle (default on, matching legacy) keeping the top-Score row per Scan. - tag_table: restore Nmass/Cmass, column_definitions, Score-desc sort, go-to. - FLASHTnT's -1.0 'unmatched' sentinel mapped to null for mass/q-value columns so cells render blank (OI column_definitions can't carry inline JS formatters). docs: correct the Quant section — neither engine implements isotope hover or conflict-resolution highlighting (conflict_resolution_dfs is parsed but unused); FeatureView gains per-isotope-trace breaks. Verified: full suite 83 passed, 2 skipped (+9 new tests across the two viewers). https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
…isotope traces
FLASHTnT sequence-view residue (AApos) cross-link, closing the last legacy gap:
- SequenceView emits the clicked residue's 0-based index under AApos via the
POSITION_SENTINEL ("<position>"), alongside the existing peak interactivity.
- Tag table restricts server-side to tags spanning the selected residue
(StartPos <= AApos <= EndPos), built in a render closure so AApos is read at
render time with a cache_id that varies per residue.
- _tag_data marks selectedAA as the within-tag offset (AApos - StartPos) so the
augmented-spectrum tagger overlay draws the gold selected-residue highlight.
- render_experiment_tnt clears the tag + residue sub-selections when the protein
changes (a private _tag_protein marker), matching the legacy reset-on-click.
FLASHDeconv: deconv_spectrum gains interactivity={massIndex: mass_id} so a mass
-table click highlights the matching stick (and vice-versa) -- the legacy
mass-table -> spectrum highlight cross-link, reading the shared selection store.
FLASHQuant: FeatureView trace_key_column="isotope" breaks the polyline between
isotope traces within a charge, matching the legacy per-isotope-trace breaks.
Bump Docker OPENMS_INSIGHT_REF to 2767f1d (residue-position interactivity +
FeatureView per-trace breaks). CI / requirements track the OI branch directly.
Adds test_residue_aapos_cross_link covering the offset math, the server-side
residue filter (cache_id variation + row count), and the SequenceView sentinel.
https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
…coped state
Re-audit (fan-out) of FLASHQuant surfaced four divergences from the legacy
FLASHQuantView, all fixed here:
- Feature-group table had no column_definitions, so OpenMS-Insight auto-titled
the CamelCase fields ("Monoisotopicmass", "Startretentiontime(Fwhm)", ...).
Add explicit column_definitions mirroring the legacy
featureGroupTableColumnDefinitions (Index, Monoisotopic Mass, Average Mass,
Start/End Retention Time (FWHM), Feature Group Quantity, Min/Max Charge, Most
Abundant Charge, Isotope Cosine Score) with tooltips + number sorter/precision,
reusing the Deconv _column_definitions helper.
- Drop HighestApexRetentionTime: the legacy table never surfaced it.
- Match legacy titles: table "Feature groups", 3D view "Feature group signals"
(was "Feature Groups" / "Feature Group Visualization").
- Scope the quant StateManager session_key to dataset_id so switching the
selected experiment starts from a clean selection (default-row-0) instead of
inheriting the previous dataset's featureGroup -- reproducing the legacy
render_grid reset on dataset change (src/render/render.py:80-82). Most acute
for Quant, whose single FeatureView is entirely driven by featureGroup.
https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
Deconv (deconv_viewer.py):
- Remove the always-on z=N charge labels from the deconvolved spectrum. The
legacy draws annotation boxes ONLY for the selected mass (computeAnnotationBoxes
returns [] when nothing is highlighted) and gates z=N labels to the m/z axis, so
per-stick labels were a regression. The massIndex highlight cross-link already
reproduces the on-selection emphasis. Drops the now-dead _deconv_spectrum_with_charge
helper (explode_spectrum_long already yields the mass_id-aligned long frame).
- Reset the selected mass to the new scan's first peak on a scan-TABLE click
(legacy updateSelectedScan -> updateSelectedMass(0)) without clobbering a heatmap
click that sets scan+mass together (_reset_mass_on_scan_change, dual-tracked).
- Scope the StateManager session_key to dataset_id so switching the selected
experiment starts clean instead of inheriting the previous dataset's scan/mass
(legacy render_grid reset, src/render/render.py:80-82).
- Restore scan/mass table "go to" navigation (go_to_fields).
- FLASHDeconv sequence view defaults to a/x/y ions (legacy SequenceViewInformation
default; the OI default is b/y).
- Heatmap axis labels Title Case ("Retention Time"/"Monoisotopic Mass") and 3D
title lowercase ("Precursor signals"/"Mass signals"), matching the bundle.
TnT (tnt_viewer.py):
- Scope the StateManager session_key to dataset_id (same dataset-switch reset).
- Protein-table accession/description sort as strings (legacy gave them no numeric
sorter; a number sorter parses text as NaN and fails to order it). TagSequence
keeps its numeric sorter to match the legacy tag table.
- id_fdr_plot x-axis label "QScore" (legacy FDRPlotly), matching the Deconv plot.
- Raw heatmap labelled m/z (was hardcoded to mass for all heatmaps); Title-Case
axis labels, mirroring the Deconv heatmap.
Tests: scan->mass reset (incl. the heatmap-keeps-its-mass case), deconvolved
spectrum carries no always-on annotation + the massIndex cross-link, protein
string-sorter, and 3D title casing.
https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
Summary
This PR completes the phased migration of FLASHApp's visualization pipeline from the monolithic
flash_viewer_gridVue component to reusable OpenMS-Insight components. All three workflows (FLASHDeconv, FLASHTnT, FLASHQuant) now render through the new engine by default, with an opt-out flag for rollback during validation.Key Changes
New Rendering Engines
src/render_oi/deconv_viewer.py— Phase 1: FLASHDeconv visualization (heatmaps, scan/mass tables, spectra, 3D S/N plot, FDR plot, sequence view, internal fragment map). Each experiment gets its ownStateManagerto prevent cross-panel selection leakage.src/render_oi/tnt_viewer.py— Phase 2: FLASHTnT visualization (protein table, tag table, combined spectrum, sequence view, FDR plot, heatmaps). Includes proteoform→scan resolution to mapproteinIndexto deconvolution row indices.src/render_oi/quant_viewer.py— Phase 3: FLASHQuant visualization (feature-group selector table + FeatureView for mass traces/XIC/isotope patterns).Data Model Bridge
src/parse/long_format.py— Pure Polars adapters that explode FLASHApp's arrays-per-scan format into long format (one row per peak) for OpenMS-Insight filtering:explode_spectrum_long()— convertsMonoMass[]/SumIntensity[]to long format with explicitindex(scan) andmass_id(peak position) columnsexplode_combined_spectrum_long()— handles deconvolved + annotated spectrum pairs for overlay plotsexplode_signal_peaks_long()— nested signal/noise peak arrays → Scatter3D formatdensity_series_long()— precomputed target/decoy curves → DensityPlot formatThese preserve the original row counts and enable value-based filtering (
filters={'scanIndex': 'index'}) to reproduce the old row-index filtering exactly.Cross-Linking Model
StateTrackerkeys to OpenMS-Insight identifier→column filters:scanIndex→ scan-tableinteractivity={'scanIndex':'index'}; downstream panels filter byindexmassIndex→ mass-tableinteractivity={'massIndex':'mass_id'}; 3D plot optionally filters bymass_idproteinIndex→ protein-tableinteractivity; tag/sequence/spectrum panels filter byProteinIndexzoom_identifierreplaces the four bespoke range keysIntegration & Rollout
content/FLASHDeconv/FLASHDeconvViewer.py,content/FLASHTnT/FLASHTnTViewer.py,content/FLASHQuant/FLASHQuantViewer.py— Updated to checkFLASHAPP_USE_OPENMS_INSIGHTenv flag (default ON) and route to new engine or legacyrender_gridopenms-insight-buildstage to build the Vue component bundle from the pinned commit; bothDockerfileandDockerfile.armstripopenms-insightfrom requirements before pip install (built separately to ensure js bundle is present)requirements.txt— Addedopenms-insightdependency with git URL pinned to the migration branchTesting & Documentation
tests/test_long_format.py— Unit tests for the data-model adapters (synthetic data, pure Polars)https://claude.ai/code/session_01DzcAbhoSHQbA8EQ6ARFPRj
Summary by CodeRabbit
New Features
Documentation
Tests
Chores