Skip to content

Conversation

@hongquanli
Copy link
Contributor

@hongquanli hongquanli commented Sep 28, 2025

This pull request introduces support for saving image data as OME-TIFF stacks with full metadata, enhancing the software's ability to handle multidimensional microscopy data in a standardized format. The implementation includes new utilities for OME-TIFF writing, updates to image saving logic, and ensures rich experiment metadata is captured and stored. The most important changes are grouped below.

OME-TIFF Support and Metadata Handling:

  • Added a new OME_TIFF option to the FileSavingOption enum, enabling users to save image data as OME-TIFF stacks with full metadata.
  • Implemented the ome_tiff_writer module, which provides utilities for writing OME-TIFF stacks via tifffile memmaps, handling metadata initialization, updating, and XML augmentation for OME compliance.
  • Enhanced the CaptureInfo dataclass to include additional fields required for OME-TIFF metadata, such as time point, total time points, z-levels, channels, channel names, experiment path, time increment, and physical sizes.

Image Saving Logic and File Locking:

  • Updated the image saving logic in SaveImageJob to support OME-TIFF output, including file locking for concurrent metadata updates and robust error checking for image dimensions and indices. [1] [2] [3]

Experiment Metadata Propagation:

  • Modified the multipoint acquisition workflow to compute and propagate experiment metadata (e.g., pixel size, time increment, physical sizes) and ensure directories are created for OME-TIFF output. These metadata fields are now passed through to CaptureInfo during image acquisition. [1] [2] [3] [4]This pull request introduces support for saving image data in the OME-TIFF format with full metadata, alongside the existing saving options. The implementation uses a memory-mapped stack and a metadata file to efficiently collect and organize multi-dimensional image data (TCZYX axes) before writing the final OME-TIFF file when acquisition is complete. The changes also ensure that all necessary metadata is tracked and provided during acquisition, and include a comprehensive test for the new saving pipeline.

OME-TIFF saving pipeline implementation:

  • Added OME_TIFF as a new option to the FileSavingOption enum in control/_def.py to enable OME-TIFF stack saving.
  • Implemented OME-TIFF saving logic in SaveImageJob within job_processing.py, including file locking, metadata management, and stack writing, with proper cleanup of temporary files. [1] [2] [3]

Metadata and acquisition integration:

  • Extended CaptureInfo to include OME-TIFF relevant fields such as time_point, total_time_points, total_z_levels, total_channels, channel_names, and experiment_path.
  • Updated multi_point_worker.py to populate these new fields during acquisition and ensure experiment paths are set and used consistently. [1] [2] [3] [4]

Testing and validation:

  • Added test_ome_tiff_saving.py with a full roundtrip test that verifies OME-TIFF stack creation, metadata correctness, and cleanup of temporary files after completion.
  • Tested manually by opening the output in FIJI using Bio-Formats Importer

hongquanli and others added 8 commits September 27, 2025 23:10
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
@hongquanli hongquanli force-pushed the ome-tiff branch 2 times, most recently from ba42c17 to 53b581f Compare September 28, 2025 20:07
hongquanli and others added 2 commits September 28, 2025 13:11
Embed time increment, dz, and pixel size in OME metadata while keeping per-plane entries intact.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request introduces OME-TIFF saving support with comprehensive metadata handling, allowing efficient multi-dimensional image stack creation using memory-mapped files during acquisition.

  • Adds OME-TIFF as a new file saving option with full metadata support
  • Implements memory-mapped stack writing with proper synchronization and cleanup
  • Extends capture metadata to include acquisition parameters needed for OME-TIFF format

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
software/control/_def.py Adds OME_TIFF option to FileSavingOption enum
software/control/core/job_processing.py Implements OME-TIFF saving pipeline with file locking and metadata management
software/control/core/ome_tiff_writer.py New utility module for OME-TIFF stack creation and metadata handling
software/control/core/multi_point_worker.py Extends acquisition worker to populate OME-TIFF metadata fields
software/tests/test_ome_tiff_saving.py Comprehensive test for OME-TIFF saving roundtrip validation

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines 184 to 186
expected_stage_um = float(z) * 1000.0
expected_piezo_um = float(z) * 10.0
expected_total_um = expected_stage_um + expected_piezo_um
Copy link

Copilot AI Sep 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Magic numbers 1000.0 and 10.0 should be defined as named constants to clarify their meaning (e.g., MM_TO_UM_FACTOR = 1000.0, Z_PIEZO_STEP_UM = 10.0).

Copilot uses AI. Check for mistakes.
hongquanli and others added 2 commits September 28, 2025 13:26
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@hongquanli hongquanli marked this pull request as ready for review September 28, 2025 20:41
@hongquanli hongquanli merged commit 47f6d35 into master Sep 29, 2025
3 checks passed
import squid.abc
import squid.logging
from control.utils_config import ChannelMode
from . import utils_ome_tiff_writer as ome_tiff_writer
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really use relative imports anywhere else in the codebase, for consistency absolute would be better

fov: int
configuration_idx: int
z_piezo_um: Optional[float] = None
time_point: Optional[int] = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these may make sense as CaptureInfo fields since they are specific to the capture itself. EG: time_point.

For the rest, they're more like acquisition info. The distinction being that it is not valid for the acquisition info fields to change within the context of an acquisition, but it is valid (and expected) for all the capture info values to change within the context of an acquisition.

Given that, and because this many fields is a bit unwieldy (aka: I'd probably group them even if the above paragraph wasn't true), it'd be better to make something like:

@dataclass
class AcquisitionInfo:
   total_time_points: int
   total_z_levels: int
   # etc ...

(This could potentially be our already existent AcquisitionParameters class, but to be honest one layer of separation is probably smart.)

Then either add an acquisition_info: Optional[AcquisitionInfo] on the capture info, or as a job member, or ideally in some context that is impossible to accidentally change across the different job calls in the acquisition.

This is preferred because:

  1. You only need to check that acquisition_info exists (instead of each of the individual fields) to have good input checking for the ome tiff case.
  2. It'll be easier to make sure that the AcquisitionInfo doesn't change through the course of an acquisition (even if we add more later) because you'll just need to compare a single dataclass (instead of all the individual fields).
  3. Since we probably want some mechanism for shared metadata across a series of jobs, this would force us to figure that out. I'm not sure the best way to do this off the top of my head. I don't think adding it to the job runner is right, but maybe (in which case, we'd need to enforce that a runner only runs one type of job with a given set metadata. And it'd pass the metadata into the job)

def _acquire_file_lock(lock_path: str):
lock_file = open(lock_path, "w")
try:
if fcntl is not None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a silent no-lock on non fcntl systems? That's sorta scary!

There's a library called filelock that implements cross platform file locking. It might make sense to use that.

Or in the lease, instead of fnctl = None above when the import fails. Re-raise and give a nice error message saying this is not supported.

lock_file.close()


class SaveImageJob(Job):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't thought about this a ton, but originally the intent was to create a bunch of Job class implementations. EG, here we'd have 1 for the original image saving format, and another for OME TIFF.

Then the acquisition side takes care of deciding what Job class(s) to create based on what config exists. This is nice because:

  1. The job implementations shouldn't need to know about global config. Or at least they should know as little as possible.
  2. This makes it easier to write out multiple save formats without it being confusing. AKA, we could send all captures to SaveImageJob and SaveOMETiffJob via the job classes we can pass in to MultiPointWorker. Then we'd get both for free (and can decide at runtime). Right now, with the global config, we can't do this (if we create 2 SaveImageJob, the fact that config is global will mean both switch if we change the global config at runtime)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd also help keep chunks of code a bit smaller and easier to read (we could still pull out common code either via @staticmethod or an intermediate base class)

metadata = ome_tiff_writer.update_plane_metadata(metadata, info)
index_key = f"{time_point}-{channel_index}-{z_index}"
if index_key not in metadata["written_indices"]:
metadata["written_indices"].append(index_key)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these bare string keys are pretty tricky from a maintenance perspective. I'd either figure out a way to use a dataclass or pydantic model, or at least pull these out as constants (eg: WRITTEN_INDICIES_KEY = "written_indicies" or something). The latter is not ideal, but is better than the bare strings. Bare strings are similar to bare numbers - if we ever need to refactor, it's really tricky to go check every instance of a particular string. However if we have it pulled out as a constant, then any instance of that key will use WRITTEN_INDICIES_KEY instead and it's easy to find all relevant uses.

region_id=region_id,
fov=fov,
configuration_idx=config_idx,
time_point=self.time_point,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments above, but if we break this out as AcquisitionInfo, some get_acquisition_info() helper would be nice.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there any reason to store all of these on as attributes on self? I think we only use them in this method to create the capture info.

If we don't need them elsewhere, it'd be better to just keep them local to this field.

Also it's python best practice to define all attributes and give them default values in __init__. That way, you know that the __init__ class is the place to look to figure out what attributes need to be taken care of when modifying a class. I've been slowly doing this throughout the codebase - it'd be good for everyone to start doing it (and, in general, to not define more class attributes if they aren't strictly needed!).


git_stub.Repo = _Repo # type: ignore[attr-defined]
sys.modules["git"] = git_stub

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have a bunch of hard coded strings in the ome tiff code, it'd be good to write a test to make sure they all get set and used properly. it looks like we test for quite a few of them below, so maybe they're all covered. Is all the metadata covered?

sys.modules["cv2"] = cv2_stub

if "git" not in sys.modules:
git_stub = types.ModuleType("git")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used somewhere? I don't see it below.

def _ensure_dependency_stubs() -> None:
"""Provide minimal substitutes for optional runtime dependencies."""

if "cv2" not in sys.modules:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used? I don't see it anywhere. Also, our tests run in a fully configured environment so we should have all the imports we need (if we don't, it means our setup script is broken!).

import numpy as np
import pytest

PROJECT_ROOT = Path(__file__).resolve().parents[1]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should need this. There's a good chance something is broken if you do.

hongquanli added a commit that referenced this pull request Jan 4, 2026
…istency (#367)

This pull request refactors and improves how OME-TIFF image saving is
handled in the control software, with a focus on separating
acquisition-wide metadata from per-image metadata and making the
OME-TIFF writing process more robust and maintainable. The changes
introduce a new `AcquisitionInfo` dataclass, update job processing and
worker logic to use it, and enhance metadata handling for OME-TIFF
files. Additionally, file locking is now done using the cross-platform
`filelock` library.

The PR (using Google Antigravity) addresses @ianohara's comments on
#339.

**OME-TIFF job and metadata refactor:**

* Introduced the `AcquisitionInfo` dataclass to encapsulate
acquisition-wide metadata (e.g., channel names, experiment path, time
increments, physical sizes), separating it from per-image `CaptureInfo`.
All OME-TIFF related functions and jobs now use this new class for
metadata.
[[1]](diffhunk://#diff-11a0c546b6598556291594354ba0512d79c90e56422c1a1c26566ec552ca342eL27-R36)
[[2]](diffhunk://#diff-f62b9df2df972843b19e864beb57b207852b326b3da285407f4b4f54e535f0a6R100-R111)
* Added a new `SaveOMETiffJob` class, which requires `AcquisitionInfo`
to be injected by the `JobRunner`. The OME-TIFF save logic now uses both
`CaptureInfo` and `AcquisitionInfo` for validation and metadata
initialization.
* Refactored `utils_ome_tiff_writer.py` to require `AcquisitionInfo` for
all OME-TIFF metadata operations, and replaced hardcoded metadata keys
with constants for improved readability and maintainability.
[[1]](diffhunk://#diff-5a974645fb955d4696bb1584c8fd4c8f568b2da068b2684f3738147e49b1ea76L18-R47)
[[2]](diffhunk://#diff-5a974645fb955d4696bb1584c8fd4c8f568b2da068b2684f3738147e49b1ea76L51-R121)
* Updated metadata handling functions to use acquisition-wide values
(e.g., channel names, physical sizes) from `AcquisitionInfo` rather than
from each `CaptureInfo`.
[[1]](diffhunk://#diff-5a974645fb955d4696bb1584c8fd4c8f568b2da068b2684f3738147e49b1ea76L51-R121)
[[2]](diffhunk://#diff-5a974645fb955d4696bb1584c8fd4c8f568b2da068b2684f3738147e49b1ea76L122-R147)
[[3]](diffhunk://#diff-5a974645fb955d4696bb1584c8fd4c8f568b2da068b2684f3738147e49b1ea76L135-R173)

**Job runner and worker logic improvements:**

* The `JobRunner` class now accepts and stores an optional
`AcquisitionInfo`, and automatically injects it into `SaveOMETiffJob`
instances. This ensures all jobs have access to acquisition-wide
metadata.
[[1]](diffhunk://#diff-11a0c546b6598556291594354ba0512d79c90e56422c1a1c26566ec552ca342eL265-R281)
[[2]](diffhunk://#diff-f62b9df2df972843b19e864beb57b207852b326b3da285407f4b4f54e535f0a6L150-R168)
* The multipoint worker (`multi_point_worker.py`) constructs a single
`AcquisitionInfo` at initialization and passes it to the `JobRunner` and
OME-TIFF jobs, simplifying job creation and metadata consistency.
[[1]](diffhunk://#diff-f62b9df2df972843b19e864beb57b207852b326b3da285407f4b4f54e535f0a6R100-R111)
[[2]](diffhunk://#diff-f62b9df2df972843b19e864beb57b207852b326b3da285407f4b4f54e535f0a6L139-R157)

**File locking and platform compatibility:**

* Replaced the previous `fcntl`-based file locking mechanism with the
cross-platform `filelock` library for metadata file access, improving
compatibility and reliability.
[[1]](diffhunk://#diff-11a0c546b6598556291594354ba0512d79c90e56422c1a1c26566ec552ca342eL12-R13)
[[2]](diffhunk://#diff-11a0c546b6598556291594354ba0512d79c90e56422c1a1c26566ec552ca342eL95-L103)

**Code cleanup and removal of duplication:**

* Removed duplicated acquisition metadata fields from `CaptureInfo`,
which are now provided by `AcquisitionInfo`.
[[1]](diffhunk://#diff-11a0c546b6598556291594354ba0512d79c90e56422c1a1c26566ec552ca342eL45-L53)
[[2]](diffhunk://#diff-f62b9df2df972843b19e864beb57b207852b326b3da285407f4b4f54e535f0a6L660-L668)
[[3]](diffhunk://#diff-f62b9df2df972843b19e864beb57b207852b326b3da285407f4b4f54e535f0a6L752-L760)
* Cleaned up conditional logic for job selection in the worker, ensuring
OME-TIFF jobs are used only when appropriate.

These changes make the OME-TIFF saving process more robust, modular, and
maintainable, and improve the overall reliability of metadata handling
and file access.

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants