Skip to content

ENH: Ingest ITKIOScanco into Modules/IO#6266

Open
hjmjohnson wants to merge 137 commits into
InsightSoftwareConsortium:mainfrom
hjmjohnson:ingest-IOScanco
Open

ENH: Ingest ITKIOScanco into Modules/IO#6266
hjmjohnson wants to merge 137 commits into
InsightSoftwareConsortium:mainfrom
hjmjohnson:ingest-IOScanco

Conversation

@hjmjohnson
Copy link
Copy Markdown
Member

Ingest the ITKIOScanco remote module into Modules/IO/IOScanco (Wave 2 of #6160), preserving upstream merge topology. Replaces the .remote.cmake reference and enables Module_IOScanco in the configure-ci pixi task.

What was ingested
Merge-topology preservation

The ingest was performed via --allow-unrelated-histories merge of the filter-repo'd upstream module history. All upstream merges from the original KitwareMedical/ITKIOScanco repository are preserved as two-parent merge commits; git blame walks back through the merge into the original upstream contributors.

git rev-list --count upstream/main..HEAD          = 126
git rev-list --count --merges upstream/main..HEAD = 25
Follow-on commits
  • COMP: Remove IOScanco .remote.cmake (in-tree) — drops the now-redundant remote-module manifest.
  • ENH: Enable Module_IOScanco in configure-ci pixi task — surfaces the in-tree module in the CI configure step.
  • ENH: Normalize C0004255.ISQ to CID content-link (now in ITKTestingData) — switches the test fixture from a .sha512 URL reference to an ExternalData CID link.
Test data (ITKTestingData)

The C0004255.ISQ test fixture has been published to ITKTestingData via PR InsightSoftwareConsortium/ITKTestingData#50 (merged). CID:

bafkreigcun2qy5mcnsyhpwjaspkds5wmanibtc2v5xwnnajgl3v2x62drm

The ingest tip references this CID directly via ExternalData content-link, so the test resolves once the ITKTestingData submodule reference picks up the merged PR.

@github-actions github-actions Bot added type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Enhancement Improvement of existing methods or implementation area:Python wrapping Python bindings for a class type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct area:IO Issues affecting the IO module area:Remotes Issues affecting the Remote module labels May 14, 2026
@hjmjohnson hjmjohnson marked this pull request as ready for review May 14, 2026 11:51
@greptile-apps

This comment was marked as resolved.

Comment thread Modules/IO/IOScanco/src/itkScancoDataManipulation.cxx Outdated
Comment thread Modules/IO/IOScanco/src/itkScancoDataManipulation.cxx Outdated
Comment thread Modules/IO/IOScanco/src/itkScancoDataManipulation.cxx
Comment thread Modules/IO/IOScanco/src/itkScancoImageIO.cxx Outdated
Comment thread Modules/IO/IOScanco/src/itkScancoHeaderIO.cxx Outdated
Comment thread Modules/IO/IOScanco/CMakeLists.txt Outdated
The space-padding loop iterated length times unconditionally, on top of
the bytes already copied from source. PadString therefore wrote up to
2*length bytes into dest, overrunning the fixed-width header fields
that follow it on every ISQ/AIM write.
CheckVersion called strcmp on a fixed-width 16-byte buffer that is not
guaranteed to be null-terminated (CanReadFile reads 512 raw bytes from
the file and passes the pointer in). The comparison could walk past the
buffer's logical end into trailing header bytes, producing undefined
behavior on arbitrary inputs. Use strncmp bounded by VersionDiskWidth,
matching the existing CTDATA check.
…mats

The else branch unconditionally returned fileType = 2 (AIMDATA_V020)
for any header that failed both prior checks, so CanReadFile claimed
every readable file as a Scanco file and the IO factory would
self-nominate for unrelated inputs. Match AIMDATA_V020 explicitly via
strncmp and let fileType remain 0 when no magic string matches.
ReadImageInformation called SetPixelType(SCALAR) unconditionally after
the AIM branch had already called ParseAIMComponentType, which sets
VECTOR for multi-component AIM data types (e.g. 0x00120003, 0x00060003).
The trailing SetPixelType clobbered that decision, leaving downstream
pipelines to treat multi-component AIM volumes as scalar.
WriteHeader's open call passed the local filename parameter, which is
empty when the caller relied on a previously-set m_FileName (the case
the surrounding check explicitly accepts). Use m_FileName, matching
ReadHeader. Also correct "/n" to "\\n" in the error message.
add_definitions() leaks to every sibling target compiled in the
remaining directory scope. Replace with target_compile_definitions on
the ITKIOScanco library defined in src/CMakeLists.txt.
@hjmjohnson
Copy link
Copy Markdown
Member Author

Addressed all 5 P1 + 1 P2 greptile findings in 6 BUG:/COMP: fixup commits appended to preserve merge topology. ITKTestingData PR #50 just merged; ExternalData cache should resolve within ~5 min, so Pixi-Cxx CI should rerun green.

Commits
  • d4a96ec BUG: Fix PadString writing twice the expected byte count
  • 60517f0 BUG: Use bounded comparison for Scanco magic string check
  • 504267d BUG: Make CheckVersion return non-zero only for recognized Scanco formats
  • 877a779 BUG: Preserve VECTOR pixel type for multi-component AIM data
  • 3b267eb BUG: Use m_FileName in WriteHeader and fix newline escape typo
  • ba4ba5b COMP: Scope _CRT_SECURE_NO_WARNINGS to the ITKIOScanco target

pre-commit run --all-files: clean. Merge topology preserved (25 merges, 132 commits ahead of upstream/main).

@hjmjohnson
Copy link
Copy Markdown
Member Author

/azp run

@hjmjohnson
Copy link
Copy Markdown
Member Author

@greptileai review this draft — addressed all 6 findings in commits d4a96ec..ba4ba5b. Please confirm no regressions.

…b 100MB)

The IOScanco ISQ test input blob is 106,958,336 bytes, which exceeds
GitHub's 100 MB hard limit for blobs in repositories. The CID
content-link in ITKTestingData PR InsightSoftwareConsortium#50 produced only a zero-byte
marker file because the actual content could not be pushed.

Restore the original SHA512 content-link. ExternalData resolves this
file via the data.kitware.com SHA512 mirror, which hosts the full
blob at:

  https://data.kitware.com/api/v1/file/hashsum/sha512/<digest>/download

(HTTP 200, Content-Length 106958336 — verified 2026-05-14.)

The orphaned zero-byte CID marker on ITKTestingData gh-pages is
removed in a companion PR.
@hjmjohnson
Copy link
Copy Markdown
Member Author

Reverted Modules/IO/IOScanco/test/Input/C0004255.ISQ.cid back to the original .sha512 content-link in 191e28d. The blob is 106,958,336 bytes (> GitHub's 100 MB per-file limit), so the ITKTestingData CID upload landed only a zero-byte marker. ExternalData resolves the SHA512 form via the data.kitware.com mirror (HTTP 200, Content-Length: 106958336 — verified today).

Companion cleanup: InsightSoftwareConsortium/ITKTestingData#55 removes the orphaned zero-byte CID marker from gh-pages.

The Baseline C0004255.mha.cid (67 MB) is left as-is — under the limit and presumably already resolves.

The C0004255.ISQ fixture is a 107 MB file hosted on data.kitware.com
via SHA512 mirror (it exceeds GitHub's 100 MB single-file limit so
cannot live in ITKTestingData). Tests that require this fixture are
gated behind ITK_BUILD_BIG_DATA_TESTS (default OFF) and labelled with
BigIO;RUNS_LONG, matching the pattern used by itkLargeTIFFImageWriteRead*
tests in Modules/IO/TIFF. Default builds keep the smaller AIM (4 MB)
fixtures, exercising the reader without the multi-hundred-MB download.
Two related changes that together unblock the Populate ExternalData
Cache workflow on PR InsightSoftwareConsortium#6266:

1. Replace the ITK_BUILD_BIG_DATA_TESTS option with the established
   ITK_COMPUTER_MEMORY_SIZE > 5 gate (matches the TIFF
   itkLargeTIFFImageWriteReadTest* pattern in Modules/IO/TIFF).  CI
   dashboards declare ITK_COMPUTER_MEMORY_SIZE=4.5 so the C0004255
   tests are not registered on those agents.

2. Convert Baseline/C0004255.mha.cid -> .mha.sha512.  The
   Populate ExternalData Cache GitHub Action scans **/*.cid across
   the whole source tree (regardless of CMake gating), so the .cid
   form forced the workflow to fetch a 67 MB blob that was never
   uploaded to ITKTestingData.  Switching to .sha512 routes the
   fetch through data.kitware.com's hashsum mirror (which has the
   blob) and removes it from the cache-populate expected set.
The if(NOT ITK_SOURCE_DIR) branch (find_package(ITK) +
include(ITKModuleExternal)) only executes when building the module
standalone outside ITK. In-tree builds always take the
itk_module_impl() else branch. Drop the dead branch following the
cleanup pattern from InsightSoftwareConsortium#6272/InsightSoftwareConsortium#6279.
hjmjohnson added a commit to hjmjohnson/ITK that referenced this pull request May 14, 2026
The Populate ExternalData Cache CI workflow scans every *.cid file and
fails the build if any CID cannot resolve via the IPFS gateway list.
The two new Strain baselines (LineLoadStrainLagrangian.mha,
LineLoadStrainEulerian.mha) were only registered as zero-byte markers
on ITKTestingData gh-pages and do not resolve through the gateways.

Switch both baselines to the .sha512 content-link form so that
ExternalData resolves them via the data.kitware.com hashsum mirror
instead of the IPFS gateway list, matching the pattern used for
IOScanco baselines in PR InsightSoftwareConsortium#6266.
@hjmjohnson
Copy link
Copy Markdown
Member Author

/azp run

…InsightSoftwareConsortium#56)

Convert 1 .sha512 content-link to .cid form now that the blob is
staged on the ITKTestingData gh-pages mirror via PR InsightSoftwareConsortium#56:

  Modules/IO/IOScanco/test/Baseline/C0004255.mha.cid

The companion C0004255.ISQ remains as .sha512 (too large for gh-pages).
@hjmjohnson
Copy link
Copy Markdown
Member Author

/azp run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:IO Issues affecting the IO module area:Python wrapping Python bindings for a class area:Remotes Issues affecting the Remote module type:Data Changes to testing data type:Enhancement Improvement of existing methods or implementation type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants