Make Arrow/Parquet a required dependency#8991
Conversation
Arrow/Parquet is now always built — no CMake option needed. This removes the WITH_PARQUET option, all #ifdef/#ifndef WITH_PARQUET preprocessor guards, compile definitions, CMake conditionals, and CI flag overrides across 86 files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughRemoved the Changes
Sequence Diagram(s)sequenceDiagram
participant Dev as CI/Developer
participant CMake
participant Finder as "find_package(Arrow/Parquet)"
participant Compiler
participant Runtime as "TOPP / pyOpenMS"
Dev->>CMake: configure (no WITH_PARQUET)
CMake->>Finder: discover Arrow & Parquet (required)
Finder-->>CMake: provide OPENMS_ARROW_TARGET / OPENMS_PARQUET_TARGET
CMake->>Compiler: add Arrow/Parquet targets, sources, modules
Compiler-->>Runtime: build binaries & Python modules with Arrow/Parquet
Runtime->>Finder: call Arrow/Parquet APIs (I/O, consumers, bindings)
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
src/topp/OpenSwathWorkflow.cpp (2)
1269-1276:⚠️ Potential issue | 🟠 MajorPreserve the parent directory when prefixing multi-run
-out_chromfiles.This rewrite prefixes the whole path, not just the filename.
results/chrom.xicbecomessample_results/chrom.xic, and/tmp/chrom.xicbecomessample_/tmp/chrom.xic, so multi-run chromatogram output can land in the wrong location or fail. Please mirror theFile::path/File::basenamesplit already used forout_mobilogram.Suggested fix
String out_chrom_current = out_chrom; if (!out_chrom.empty() && run_groups.size() > 1) { - // For multi-run, use basename prefix to make unique filenames - String base_name = out_chrom.substr(0, out_chrom.find_last_of('.')); - String extension = out_chrom.substr(out_chrom.find_last_of('.')); - out_chrom_current = file_basename + "_" + base_name + extension; + // Preserve parent directory when creating per-run filenames. + String parent = File::path(out_chrom); + String filename = File::basename(out_chrom); + String stem = filename.substr(0, filename.find_last_of('.')); + String extension = filename.substr(filename.find_last_of('.')); + String fname_with_prefix = file_basename + "_" + stem + extension; + out_chrom_current = (parent == "." ? fname_with_prefix : parent + "/" + fname_with_prefix); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/topp/OpenSwathWorkflow.cpp` around lines 1269 - 1276, The current multi-run prefixing logic incorrectly prefixes the entire out_chrom path; update the code that sets out_chrom_current (using out_chrom, file_basename) to split out_chrom into directory path and basename (mirror the approach used for out_mobilogram/File::path and File::basename), then prefix only the basename with file_basename and rejoin with the original directory so the parent directory is preserved; locate the logic around out_chrom_current/out_chrom and replace the basename/ext manipulation with a path+basename split, prefix basename with file_basename + "_" and append extension, then combine path + prefixed_basename to form out_chrom_current.
980-993:⚠️ Potential issue | 🔴 Critical
-append_oswpqcurrently drops existing archive contents.For
.oswpqoutputs, this path always writes into a fresh temp directory and then deletes/rebuilds the target archive. Nothing hydratesparquet_dirfrom the existing.oswpq, so a second invocation with-append_oswpqoverwrites prior runs instead of appending them. Please seed the temp directory from the existing archive beforewrite(), and only replace the final archive after the new package is assembled successfully.Also applies to: 1346-1367
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/topp/OpenSwathWorkflow.cpp` around lines 980 - 993, The append flow for .oswpq currently always creates an empty parquet_dir and rebuilds the archive, losing prior runs; update the logic around parquet_temp_dir/parquet_dir (the blocks using getFlag_("append_oswpq"), parquet_temp_dir, parquet_dir and calling OpenSwathOSWParquetWriter::write()) to, when append_oswpq is set and the target .oswpq exists, first extract/unpack the existing .oswpq into parquet_dir so existing run data is present before calling parquet_writer.write(), and only replace the final .oswpq after the new write succeeds (create the new archive in a temp file and atomically rename/replace the original). Apply the same change to the other identical block (around the later code referenced at the second instance) so both code paths hydrate the temp dir from the existing archive and only overwrite the archive upon successful assembly.src/pyOpenMS/tests/unittests/test_XIMParquetFile.py (1)
30-38:⚠️ Potential issue | 🟠 MajorStop skipping the required parquet bindings.
After this PR, missing
XIMParquetFileor a"Parquet support"constructor failure is a regression, not a supported configuration. Keeping thesepytest.skip()paths can let a broken pyOpenMS build pass this suite as skipped instead of failed.Suggested fix
def _get_xim(): import pyopenms as poms - # Check if XIMParquetFile class exists - if not hasattr(poms, "XIMParquetFile"): - pytest.skip("pyopenms built without parquet support") - - try: - return poms.XIMParquetFile(_xim_path()) - except RuntimeError as e: - if "Parquet support" in str(e): - pytest.skip("pyopenms built without parquet support") - raise + if not hasattr(poms, "XIMParquetFile"): + pytest.fail("XIMParquetFile must be available when Arrow/Parquet is required") + return poms.XIMParquetFile(_xim_path())🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pyOpenMS/tests/unittests/test_XIMParquetFile.py` around lines 30 - 38, The test currently skips when the XIMParquetFile class is missing or when its constructor raises a RuntimeError containing "Parquet support", which hides regressions; remove the pytest.skip paths and instead assert presence/constructability: replace the hasattr(poms, "XIMParquetFile") check and the try/except around poms.XIMParquetFile(_xim_path()) with an explicit assertion that XIMParquetFile exists and that constructing it does not raise (or let any exception bubble so the test fails), referencing XIMParquetFile and the _xim_path() call to locate the code to change.src/pyOpenMS/tests/unittests/test_XICParquetFile.py (1)
30-38:⚠️ Potential issue | 🟠 MajorStop skipping the required parquet bindings.
This helper still treats missing
XICParquetFileor a"Parquet support"constructor error as skippable. With Arrow/Parquet now mandatory, that will hide a broken pyOpenMS build instead of failing the regression test.Suggested fix
def _get_xic(): import pyopenms as poms - # Check if XICParquetFile class exists - if not hasattr(poms, 'XICParquetFile'): - pytest.skip("pyopenms built without parquet support") - - try: - return poms.XICParquetFile(_xic_path()) - except RuntimeError as e: - if "Parquet support" in str(e): - pytest.skip("pyopenms built without parquet support") - raise + if not hasattr(poms, "XICParquetFile"): + pytest.fail("XICParquetFile must be available when Arrow/Parquet is required") + return poms.XICParquetFile(_xic_path())🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pyOpenMS/tests/unittests/test_XICParquetFile.py` around lines 30 - 38, Remove the logic that treats missing parquet bindings as skippable: do not call pytest.skip when poms lacks XICParquetFile or when constructing poms.XICParquetFile(_xic_path()) raises a RuntimeError mentioning "Parquet support"; instead assert or call pytest.fail so the test fails loudly. Locate the helper that references poms and XICParquetFile (the hasattr check and the try/except around poms.XICParquetFile(_xic_path())) and replace the skip branches with explicit failures (e.g., assert hasattr(poms, 'XICParquetFile') or pytest.fail with a clear message, and re-raise or pytest.fail on constructor RuntimeError) so missing Arrow/Parquet bindings cause test failures rather than skipped tests.
🧹 Nitpick comments (2)
src/pyOpenMS/pyopenms/addons/msexperiment.py (1)
157-167: Updated warning message is appropriate.The removal of the
-DWITH_PARQUET=ONrebuild suggestion is correct since this flag no longer exists. However, now that Arrow/Parquet is a required dependency, this import failure path should be rare (indicating a broken/incomplete installation). Consider whether the warning should reflect that this is an unexpected state rather than a normal fallback scenario.💡 Optional: Clarify that this is an unexpected state
try: from pyopenms._arrow_zerocopy import spectra_to_arrow, chromatograms_to_arrow _use_zerocopy = True except ImportError: _use_zerocopy = False warnings.warn( - "pyopenms._arrow_zerocopy not available — falling back to slow Python " - "Arrow export. This module requires Arrow/Parquet for 4-14x faster export.", + "pyopenms._arrow_zerocopy not available — falling back to slower Python " + "Arrow export. This is unexpected as Arrow/Parquet is a required dependency. " + "The installation may be incomplete.", stacklevel=2, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pyOpenMS/pyopenms/addons/msexperiment.py` around lines 157 - 167, The ImportError fallback treating missing pyopenms._arrow_zerocopy as a normal path should be updated to reflect that Arrow/Parquet are required and the import failure indicates a broken/incomplete installation; in the except ImportError block (symbols: pyopenms._arrow_zerocopy, spectra_to_arrow, chromatograms_to_arrow, _use_zerocopy) set _use_zerocopy = False but change the warnings.warn message to state this is unexpected (e.g., "pyopenms._arrow_zerocopy not found — Arrow/Parquet are required; this indicates a broken or incomplete installation, falling back to slower Python export") and keep stacklevel=2 so diagnostics point to user code.src/tests/class_tests/openms/CMakeLists.txt (1)
81-116: Formatting: inconsistent indentation for library arguments.The unconditional
target_link_librariescalls have no indentation for the library arguments, whereas the guarded blocks below (lines 118-147) properly indent the arguments. Consider aligning for consistency.🎨 Suggested formatting fix
target_link_libraries(Arrow_test -${OPENMS_ARROW_TARGET} -${OPENMS_PARQUET_TARGET} + ${OPENMS_ARROW_TARGET} + ${OPENMS_PARQUET_TARGET} ) target_link_libraries(MSExperimentArrowExport_test -${OPENMS_ARROW_TARGET} -${OPENMS_PARQUET_TARGET} + ${OPENMS_ARROW_TARGET} + ${OPENMS_PARQUET_TARGET} ) target_link_libraries(ConsensusMapArrowExport_test -${OPENMS_ARROW_TARGET} -${OPENMS_PARQUET_TARGET} + ${OPENMS_ARROW_TARGET} + ${OPENMS_PARQUET_TARGET} ) target_link_libraries(QPXFile_test -${OPENMS_ARROW_TARGET} -${OPENMS_PARQUET_TARGET} + ${OPENMS_ARROW_TARGET} + ${OPENMS_PARQUET_TARGET} ) target_link_libraries(ProteinIdentificationArrowIO_test -${OPENMS_ARROW_TARGET} -${OPENMS_PARQUET_TARGET} + ${OPENMS_ARROW_TARGET} + ${OPENMS_PARQUET_TARGET} ) target_link_libraries(FeatureMapArrowIO_test -${OPENMS_ARROW_TARGET} -${OPENMS_PARQUET_TARGET} + ${OPENMS_ARROW_TARGET} + ${OPENMS_PARQUET_TARGET} ) target_link_libraries(ConsensusMapArrowIO_test -${OPENMS_ARROW_TARGET} -${OPENMS_PARQUET_TARGET} + ${OPENMS_ARROW_TARGET} + ${OPENMS_PARQUET_TARGET} ) target_link_libraries(Libzip_test -${OPENMS_ARROW_TARGET} -${OPENMS_PARQUET_TARGET} + ${OPENMS_ARROW_TARGET} + ${OPENMS_PARQUET_TARGET} ) target_link_libraries(ZipRandomAccessFile_test -${OPENMS_ARROW_TARGET} -${OPENMS_PARQUET_TARGET} + ${OPENMS_ARROW_TARGET} + ${OPENMS_PARQUET_TARGET} )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tests/class_tests/openms/CMakeLists.txt` around lines 81 - 116, The target_link_libraries calls for Arrow_test, MSExperimentArrowExport_test, ConsensusMapArrowExport_test, QPXFile_test, ProteinIdentificationArrowIO_test, FeatureMapArrowIO_test, ConsensusMapArrowIO_test, Libzip_test, and ZipRandomAccessFile_test have unindented library arguments; update each of these target_link_libraries invocations so that the library names (${OPENMS_ARROW_TARGET} and ${OPENMS_PARQUET_TARGET}) are indented on the following line (matching the style used in the guarded blocks) to ensure consistent CMake indentation and readability.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/pyOpenMS/CMakeLists.txt`:
- Around line 303-326: Replace the permissive fallback with a mandatory,
versioned Arrow lookup and pick the same static/shared target as core OpenMS: if
OPENMS_ARROW_TARGET is not set, call find_package(Arrow 23 CONFIG REQUIRED) (not
QUIET), then set OPENMS_ARROW_TARGET to Arrow::arrow_static or
Arrow::arrow_shared based on ARROW_USE_STATIC (or by checking TARGET
Arrow::arrow_static / Arrow::arrow_shared) so the chosen target matches core;
only add/build the _arrow_zerocopy module (nanobind_add_module and
target_link_libraries(_arrow_zerocopy ... ${OPENMS_ARROW_TARGET})) when
OPENMS_ARROW_TARGET is defined.
In `@src/pyOpenMS/tests/unittests/test_arrow_zerocopy.py`:
- Around line 13-18: The test currently swallows ImportError for
pyopenms._arrow_zerocopy by skipping the module, which hides build/package
regressions; instead remove the try/except skip and import spectra_to_arrow and
chromatograms_to_arrow directly from pyopenms._arrow_zerocopy (or, if you prefer
an explicit failure, replace the pytest.skip in the except with pytest.fail) so
that an ImportError fails the test and surfaces the broken build. Use the
module/function names pyopenms._arrow_zerocopy, spectra_to_arrow, and
chromatograms_to_arrow to locate and update the import.
In `@src/tests/class_tests/openms/executables.cmake`:
- Around line 294-303: The test target OpenSwathOSWParquetRoundTrip_test is
registered unconditionally but depends on OpenSwath headers; move the symbol
OpenSwathOSWParquetRoundTrip_test out of the unconditional list(APPEND
format_executables_list and place it inside the existing NOT DISABLE_OPENSWATH
guarded block where TransitionParquetFile_test, OpenSwathOSWParquetReader_test,
and OpenSwathOSWParquetWriter_test are registered so the test is only added when
DISABLE_OPENSWATH is off; ensure you remove it from the top-level list and add
it alongside those three test names in the guarded block.
---
Outside diff comments:
In `@src/pyOpenMS/tests/unittests/test_XICParquetFile.py`:
- Around line 30-38: Remove the logic that treats missing parquet bindings as
skippable: do not call pytest.skip when poms lacks XICParquetFile or when
constructing poms.XICParquetFile(_xic_path()) raises a RuntimeError mentioning
"Parquet support"; instead assert or call pytest.fail so the test fails loudly.
Locate the helper that references poms and XICParquetFile (the hasattr check and
the try/except around poms.XICParquetFile(_xic_path())) and replace the skip
branches with explicit failures (e.g., assert hasattr(poms, 'XICParquetFile') or
pytest.fail with a clear message, and re-raise or pytest.fail on constructor
RuntimeError) so missing Arrow/Parquet bindings cause test failures rather than
skipped tests.
In `@src/pyOpenMS/tests/unittests/test_XIMParquetFile.py`:
- Around line 30-38: The test currently skips when the XIMParquetFile class is
missing or when its constructor raises a RuntimeError containing "Parquet
support", which hides regressions; remove the pytest.skip paths and instead
assert presence/constructability: replace the hasattr(poms, "XIMParquetFile")
check and the try/except around poms.XIMParquetFile(_xim_path()) with an
explicit assertion that XIMParquetFile exists and that constructing it does not
raise (or let any exception bubble so the test fails), referencing
XIMParquetFile and the _xim_path() call to locate the code to change.
In `@src/topp/OpenSwathWorkflow.cpp`:
- Around line 1269-1276: The current multi-run prefixing logic incorrectly
prefixes the entire out_chrom path; update the code that sets out_chrom_current
(using out_chrom, file_basename) to split out_chrom into directory path and
basename (mirror the approach used for out_mobilogram/File::path and
File::basename), then prefix only the basename with file_basename and rejoin
with the original directory so the parent directory is preserved; locate the
logic around out_chrom_current/out_chrom and replace the basename/ext
manipulation with a path+basename split, prefix basename with file_basename +
"_" and append extension, then combine path + prefixed_basename to form
out_chrom_current.
- Around line 980-993: The append flow for .oswpq currently always creates an
empty parquet_dir and rebuilds the archive, losing prior runs; update the logic
around parquet_temp_dir/parquet_dir (the blocks using getFlag_("append_oswpq"),
parquet_temp_dir, parquet_dir and calling OpenSwathOSWParquetWriter::write())
to, when append_oswpq is set and the target .oswpq exists, first extract/unpack
the existing .oswpq into parquet_dir so existing run data is present before
calling parquet_writer.write(), and only replace the final .oswpq after the new
write succeeds (create the new archive in a temp file and atomically
rename/replace the original). Apply the same change to the other identical block
(around the later code referenced at the second instance) so both code paths
hydrate the temp dir from the existing archive and only overwrite the archive
upon successful assembly.
---
Nitpick comments:
In `@src/pyOpenMS/pyopenms/addons/msexperiment.py`:
- Around line 157-167: The ImportError fallback treating missing
pyopenms._arrow_zerocopy as a normal path should be updated to reflect that
Arrow/Parquet are required and the import failure indicates a broken/incomplete
installation; in the except ImportError block (symbols:
pyopenms._arrow_zerocopy, spectra_to_arrow, chromatograms_to_arrow,
_use_zerocopy) set _use_zerocopy = False but change the warnings.warn message to
state this is unexpected (e.g., "pyopenms._arrow_zerocopy not found —
Arrow/Parquet are required; this indicates a broken or incomplete installation,
falling back to slower Python export") and keep stacklevel=2 so diagnostics
point to user code.
In `@src/tests/class_tests/openms/CMakeLists.txt`:
- Around line 81-116: The target_link_libraries calls for Arrow_test,
MSExperimentArrowExport_test, ConsensusMapArrowExport_test, QPXFile_test,
ProteinIdentificationArrowIO_test, FeatureMapArrowIO_test,
ConsensusMapArrowIO_test, Libzip_test, and ZipRandomAccessFile_test have
unindented library arguments; update each of these target_link_libraries
invocations so that the library names (${OPENMS_ARROW_TARGET} and
${OPENMS_PARQUET_TARGET}) are indented on the following line (matching the style
used in the guarded blocks) to ensure consistent CMake indentation and
readability.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0e453a69-4aea-4b5f-96cf-e0b91acbea00
📒 Files selected for processing (86)
.github/workflows/openms_ci_matrix_full.yml.github/workflows/pyopenms-wheels-cibuildwheel.ymlAGENTS.mdCMakeLists.txtcmake/OpenMSConfig.cmake.incmake/cmake_findExternalLibs.cmakecmake/package_general.cmakedoc/CMakeLists.txtdoc/doxygen/public/TOPP.doxygensrc/openms/CMakeLists.txtsrc/openms/include/OpenMS/FORMAT/ArrowSchemaRegistry.hsrc/openms/include/OpenMS/FORMAT/ConsensusMapArrowExport.hsrc/openms/include/OpenMS/FORMAT/ConsensusMapArrowIO.hsrc/openms/include/OpenMS/FORMAT/FeatureMapArrowIO.hsrc/openms/include/OpenMS/FORMAT/MSExperimentArrowExport.hsrc/openms/include/OpenMS/FORMAT/ParquetFile.hsrc/openms/include/OpenMS/FORMAT/ProteinGroupArrowExport.hsrc/openms/include/OpenMS/FORMAT/ProteinIdentificationArrowIO.hsrc/openms/include/OpenMS/FORMAT/QPXFile.hsrc/openms/include/OpenMS/FORMAT/ZipRandomAccessFile.hsrc/openms/include/OpenMS/FORMAT/sources.cmakesrc/openms/source/ANALYSIS/OPENSWATH/OpenSwathOSWParquetReader.cppsrc/openms/source/ANALYSIS/OPENSWATH/OpenSwathOSWParquetWriter.cppsrc/openms/source/ANALYSIS/OPENSWATH/TransitionParquetFile.cppsrc/openms/source/APPLICATIONS/OpenSwathBase.cppsrc/openms/source/APPLICATIONS/ToolHandler.cppsrc/openms/source/FORMAT/ArrowSchemaRegistry.cppsrc/openms/source/FORMAT/ConsensusMapArrowExport.cppsrc/openms/source/FORMAT/ConsensusMapArrowIO.cppsrc/openms/source/FORMAT/DATAACCESS/MSChromatogramParquetConsumer.cppsrc/openms/source/FORMAT/DATAACCESS/MobilogramParquetConsumer.cppsrc/openms/source/FORMAT/DATAACCESS/sources.cmakesrc/openms/source/FORMAT/FeatureMapArrowIO.cppsrc/openms/source/FORMAT/MSExperimentArrowExport.cppsrc/openms/source/FORMAT/ParquetFile.cppsrc/openms/source/FORMAT/ProteinGroupArrowExport.cppsrc/openms/source/FORMAT/ProteinIdentificationArrowIO.cppsrc/openms/source/FORMAT/QPXFile.cppsrc/openms/source/FORMAT/XICParquetFile.cppsrc/openms/source/FORMAT/XIMParquetFile.cppsrc/openms/source/FORMAT/ZipRandomAccessFile.cppsrc/openms/source/FORMAT/sources.cmakesrc/pyOpenMS/CLAUDE.mdsrc/pyOpenMS/CMakeLists.txtsrc/pyOpenMS/README.mdsrc/pyOpenMS/bindings/arrow_zerocopy.cppsrc/pyOpenMS/bindings/bind_format.cppsrc/pyOpenMS/pxds/ConsensusMapArrowIO.pxdsrc/pyOpenMS/pxds/FeatureMapArrowIO.pxdsrc/pyOpenMS/pxds/ProteinIdentificationArrowIO.pxdsrc/pyOpenMS/pxds/QPXFile.pxdsrc/pyOpenMS/pyopenms/__init__.pysrc/pyOpenMS/pyopenms/addons/msexperiment.pysrc/pyOpenMS/tests/benchmark_pyopenms.pysrc/pyOpenMS/tests/unittests/test_XICParquetFile.pysrc/pyOpenMS/tests/unittests/test_XIMParquetFile.pysrc/pyOpenMS/tests/unittests/test_arrow_io_classes.pysrc/pyOpenMS/tests/unittests/test_arrow_zerocopy.pysrc/tests/class_tests/openms/CMakeLists.txtsrc/tests/class_tests/openms/executables.cmakesrc/tests/class_tests/openms/source/ArrowSchemaRegistry_test.cppsrc/tests/class_tests/openms/source/ConsensusMapArrowExport_test.cppsrc/tests/class_tests/openms/source/ConsensusMapArrowIO_test.cppsrc/tests/class_tests/openms/source/FeatureMapArrowIO_test.cppsrc/tests/class_tests/openms/source/Libzip_test.cppsrc/tests/class_tests/openms/source/MSChromatogramParquetConsumer_test.cppsrc/tests/class_tests/openms/source/MobilogramParquetConsumer_test.cppsrc/tests/class_tests/openms/source/OpenSwathOSWParquetReader_test.cppsrc/tests/class_tests/openms/source/OpenSwathOSWParquetRoundTrip_test.cppsrc/tests/class_tests/openms/source/ProteinIdentificationArrowIO_test.cppsrc/tests/class_tests/openms/source/TransitionParquetFile_test.cppsrc/tests/class_tests/openms/source/XICParquetFile_test.cppsrc/tests/class_tests/openms/source/XIMParquetFile_test.cppsrc/tests/topp/CMakeLists.txtsrc/topp/IsobaricWorkflow.cppsrc/topp/OpenSwathAssayGenerator.cppsrc/topp/OpenSwathDecoyGenerator.cppsrc/topp/OpenSwathWorkflow.cppsrc/topp/ParquetConverter.cppsrc/topp/ProteinQuantifier.cppsrc/topp/ProteomicsLFQ.cppsrc/topp/QPXConverter.cppsrc/topp/TargetedFileConverter.cppsrc/topp/TextExporter.cppsrc/topp/executables.cmaketools/ci/capture-env.sh
💤 Files with no reviewable changes (58)
- src/topp/ParquetConverter.cpp
- src/openms/source/FORMAT/FeatureMapArrowIO.cpp
- src/openms/include/OpenMS/FORMAT/QPXFile.h
- src/openms/source/FORMAT/ProteinGroupArrowExport.cpp
- src/topp/QPXConverter.cpp
- src/tests/class_tests/openms/source/ConsensusMapArrowIO_test.cpp
- doc/CMakeLists.txt
- src/openms/include/OpenMS/FORMAT/FeatureMapArrowIO.h
- src/openms/source/APPLICATIONS/ToolHandler.cpp
- src/openms/include/OpenMS/FORMAT/MSExperimentArrowExport.h
- cmake/OpenMSConfig.cmake.in
- src/tests/class_tests/openms/source/XICParquetFile_test.cpp
- src/openms/include/OpenMS/FORMAT/ConsensusMapArrowExport.h
- doc/doxygen/public/TOPP.doxygen
- src/tests/class_tests/openms/source/ConsensusMapArrowExport_test.cpp
- src/tests/class_tests/openms/source/TransitionParquetFile_test.cpp
- src/tests/class_tests/openms/source/MobilogramParquetConsumer_test.cpp
- src/openms/include/OpenMS/FORMAT/ProteinIdentificationArrowIO.h
- src/tests/class_tests/openms/source/OpenSwathOSWParquetRoundTrip_test.cpp
- tools/ci/capture-env.sh
- src/openms/source/ANALYSIS/OPENSWATH/TransitionParquetFile.cpp
- src/openms/source/FORMAT/ConsensusMapArrowExport.cpp
- src/openms/source/FORMAT/MSExperimentArrowExport.cpp
- src/openms/source/FORMAT/ZipRandomAccessFile.cpp
- src/openms/source/FORMAT/ConsensusMapArrowIO.cpp
- src/openms/include/OpenMS/FORMAT/ConsensusMapArrowIO.h
- src/tests/class_tests/openms/source/OpenSwathOSWParquetReader_test.cpp
- src/openms/source/FORMAT/ParquetFile.cpp
- src/tests/class_tests/openms/source/ArrowSchemaRegistry_test.cpp
- src/tests/class_tests/openms/source/FeatureMapArrowIO_test.cpp
- src/openms/source/FORMAT/ProteinIdentificationArrowIO.cpp
- src/openms/include/OpenMS/FORMAT/ArrowSchemaRegistry.h
- src/topp/OpenSwathAssayGenerator.cpp
- CMakeLists.txt
- src/tests/class_tests/openms/source/Libzip_test.cpp
- src/openms/source/FORMAT/ArrowSchemaRegistry.cpp
- src/tests/class_tests/openms/source/MSChromatogramParquetConsumer_test.cpp
- src/openms/include/OpenMS/FORMAT/ProteinGroupArrowExport.h
- src/openms/source/FORMAT/QPXFile.cpp
- src/topp/ProteomicsLFQ.cpp
- src/tests/class_tests/openms/source/XIMParquetFile_test.cpp
- src/topp/OpenSwathDecoyGenerator.cpp
- src/openms/source/ANALYSIS/OPENSWATH/OpenSwathOSWParquetWriter.cpp
- src/openms/include/OpenMS/FORMAT/ZipRandomAccessFile.h
- src/openms/source/FORMAT/XIMParquetFile.cpp
- src/topp/ProteinQuantifier.cpp
- src/topp/IsobaricWorkflow.cpp
- src/openms/source/APPLICATIONS/OpenSwathBase.cpp
- src/openms/source/FORMAT/DATAACCESS/MSChromatogramParquetConsumer.cpp
- .github/workflows/openms_ci_matrix_full.yml
- src/topp/TargetedFileConverter.cpp
- src/tests/class_tests/openms/source/ProteinIdentificationArrowIO_test.cpp
- src/openms/include/OpenMS/FORMAT/ParquetFile.h
- src/openms/source/FORMAT/XICParquetFile.cpp
- src/openms/source/FORMAT/DATAACCESS/MobilogramParquetConsumer.cpp
- src/openms/source/ANALYSIS/OPENSWATH/OpenSwathOSWParquetReader.cpp
- src/topp/TextExporter.cpp
- src/pyOpenMS/bindings/bind_format.cpp
Doxygen 1.9.8 fails to register the @page when it is indented inside a /** */ block and followed by raw HTML at column 0 (<CENTER>). Align with the pattern used by all other TOPP tools. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Make standalone pyOpenMS Arrow lookup mandatory and version-constrained (Arrow 23 CONFIG REQUIRED), matching core OpenMS - Respect ARROW_USE_STATIC preference in standalone Arrow target selection - Remove try/except skip in test_arrow_zerocopy.py — ImportError should fail, not skip - Move OpenSwathOSWParquetRoundTrip_test into NOT DISABLE_OPENSWATH guard Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/pyOpenMS/CMakeLists.txt (1)
320-339:⚠️ Potential issue | 🟡 MinorKeep
pyopenms_stubsin sync with the new required module.Line 339 adds
_arrow_zerocopyto the normal build, butpyopenms_stubsstill depends on a separate list that excludes it. Buildingpyopenms_stubsdirectly can therefore run before_arrow_zerocopyexists, leaving the new module out of generated stubs and potentially breaking stub generation once it is imported.🧩 Proposed fix
if(PYOPENMS_GENERATE_STUBS) - set(_STUB_DEPS _pyopenms) - foreach(domain ${PYOPENMS_DOMAINS}) - list(APPEND _STUB_DEPS "_pyopenms_${domain}") - endforeach() - add_custom_target(pyopenms_stubs COMMAND ${Python_EXECUTABLE} -m nanobind.stubgen -m pyopenms @@ - DEPENDS ${_STUB_DEPS} + DEPENDS pyopenms_compile COMMENT "Generating .pyi stub files for pyopenms" ) endif()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/pyOpenMS/CMakeLists.txt` around lines 320 - 339, The pyopenms stubs target isn't updated to depend on the new _arrow_zerocopy module, so building pyopenms_stubs can run before _arrow_zerocopy exists; update the pyopenms_stubs dependency list to include the new target (referencing _arrow_zerocopy and the existing PYOPENMS_DOMAINS/_pyopenms_* targets) or add add_dependencies(pyopenms_stubs _arrow_zerocopy) where pyopenms_compile and other stub dependencies are declared so stub generation waits for the new module.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/pyOpenMS/CMakeLists.txt`:
- Around line 320-339: The pyopenms stubs target isn't updated to depend on the
new _arrow_zerocopy module, so building pyopenms_stubs can run before
_arrow_zerocopy exists; update the pyopenms_stubs dependency list to include the
new target (referencing _arrow_zerocopy and the existing
PYOPENMS_DOMAINS/_pyopenms_* targets) or add add_dependencies(pyopenms_stubs
_arrow_zerocopy) where pyopenms_compile and other stub dependencies are declared
so stub generation waits for the new module.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8c083734-9afb-4a58-ad01-e0197fa923d4
📒 Files selected for processing (3)
src/pyOpenMS/CMakeLists.txtsrc/pyOpenMS/tests/unittests/test_arrow_zerocopy.pysrc/tests/class_tests/openms/executables.cmake
- Fix multi-run out_chrom path prefixing to preserve parent directory (mirror the File::path/File::basename split used for mobilograms) - Extract existing .oswpq archive before appending new runs so prior data is preserved when -append_oswpq is set - Replace pytest.skip with assert in XIC/XIMParquetFile tests so missing bindings fail loudly instead of silently skipping - Update _arrow_zerocopy ImportError warning to indicate broken install - Use pyopenms_compile as stubs dependency (includes _arrow_zerocopy) - Fix CMake target_link_libraries indentation for Arrow/Parquet tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/topp/OpenSwathWorkflow.cpp (1)
1276-1283: Path preservation fix looks correct, but consider applying consistently to other debug outputs.The fix properly preserves the parent directory for multi-run chromatogram output paths.
However, similar multi-run path handling at lines 1147-1149 (
irt_trafo_out) and 1156-1158 (irt_mzml_out) still uses the old pattern that doesn't preserve parent directories:String base_name = irt_trafo_out.substr(0, irt_trafo_out.find_last_of('.')); String extension = irt_trafo_out.substr(irt_trafo_out.find_last_of('.')); irt_trafo_out = file_basename + "_" + base_name + extension;If a user specifies
-Debugging:irt_trafo /output/dir/trafo.trafoXML, this would producerun1_/output/dir/trafo.trafoXMLinstead of/output/dir/run1_trafo.trafoXML.♻️ Suggested fix for consistency (lines 1147-1149)
if (!irt_trafo_out.empty() && run_groups.size() > 1) { - // For multi-run, use basename prefix to make unique filenames - String base_name = irt_trafo_out.substr(0, irt_trafo_out.find_last_of('.')); - String extension = irt_trafo_out.substr(irt_trafo_out.find_last_of('.')); - irt_trafo_out = file_basename + "_" + base_name + extension; + // Preserve parent directory when creating per-run filenames. + String parent = File::path(irt_trafo_out); + String filename = File::basename(irt_trafo_out); + String stem = filename.substr(0, filename.find_last_of('.')); + String extension = filename.substr(filename.find_last_of('.')); + String fname_with_prefix = file_basename + "_" + stem + extension; + irt_trafo_out = (parent == "." ? fname_with_prefix : parent + "/" + fname_with_prefix); }Similar change would apply to
irt_mzml_outat lines 1156-1158.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/topp/OpenSwathWorkflow.cpp` around lines 1276 - 1283, The irt_trafo_out and irt_mzml_out multi-run filename handling still prepends file_basename to the full path (producing run1_/output/dir/trafo.trafoXML) instead of preserving the parent directory; update the logic for irt_trafo_out and irt_mzml_out to mirror the fix used for out_chrom: use File::path(irt_trafo_out)/File::basename(irt_trafo_out) (and similarly for irt_mzml_out) to extract parent, stem and extension, then build fname_with_prefix = file_basename + "_" + stem + extension and set the final path to either fname_with_prefix or parent + "/" + fname_with_prefix based on parent == ".".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/topp/OpenSwathWorkflow.cpp`:
- Around line 1276-1283: The irt_trafo_out and irt_mzml_out multi-run filename
handling still prepends file_basename to the full path (producing
run1_/output/dir/trafo.trafoXML) instead of preserving the parent directory;
update the logic for irt_trafo_out and irt_mzml_out to mirror the fix used for
out_chrom: use File::path(irt_trafo_out)/File::basename(irt_trafo_out) (and
similarly for irt_mzml_out) to extract parent, stem and extension, then build
fname_with_prefix = file_basename + "_" + stem + extension and set the final
path to either fname_with_prefix or parent + "/" + fname_with_prefix based on
parent == ".".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 161beca0-dc7d-470d-99d9-0ff08c514ae4
📒 Files selected for processing (6)
src/pyOpenMS/CMakeLists.txtsrc/pyOpenMS/pyopenms/addons/msexperiment.pysrc/pyOpenMS/tests/unittests/test_XICParquetFile.pysrc/pyOpenMS/tests/unittests/test_XIMParquetFile.pysrc/tests/class_tests/openms/CMakeLists.txtsrc/topp/OpenSwathWorkflow.cpp
✅ Files skipped from review due to trivial changes (1)
- src/pyOpenMS/pyopenms/addons/msexperiment.py
🚧 Files skipped from review as they are similar to previous changes (2)
- src/pyOpenMS/tests/unittests/test_XICParquetFile.py
- src/pyOpenMS/CMakeLists.txt
Add missing entries for: - #8991: Arrow/Parquet made required dependency; WITH_PARQUET CMake option removed - #8993: BREAKING IMFormat enum rename (CONCATENATED→IM_PEAK, MULTIPLE_SPECTRA→IM_SPECTRUM) - #8997: Fix TOPPView performance regression when right-clicking in large mzML - #8999: BrukerTimsFile tiered scan→1/K0 calibration with RationalScan2ImConverter Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add missing entries for: - #8991: Arrow/Parquet made required dependency; WITH_PARQUET CMake option removed - #8993: BREAKING IMFormat enum rename (CONCATENATED→IM_PEAK, MULTIPLE_SPECTRA→IM_SPECTRUM) - #8997: Fix TOPPView performance regression when right-clicking in large mzML - #8999: BrukerTimsFile tiered scan→1/K0 calibration with RationalScan2ImConverter Co-authored-by: GitHub Copilot <copilot@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
WITH_PARQUETCMake option — Arrow/Parquet is now a required dependency (was alreadyONby default)#ifdef WITH_PARQUET/#ifndef WITH_PARQUETpreprocessor guards and their fallback code paths from 86 filestarget_compile_definitions(... WITH_PARQUET=1)from library and test CMake files-DWITH_PARQUET=ONfrom CI workflows (no longer needed)_arrow_zerocopymodule@cond WITH_PARQUETguards and doc placeholder generationTest plan
_arrow_zerocopymodule loads and tests pass🤖 Generated with Claude Code
Summary by CodeRabbit