Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CV7000: handle case where wells are recorded, but all files were removed #4006

Merged
merged 2 commits into from Jun 14, 2023

Conversation

melissalinkert
Copy link
Member

See #4005.

I think this actually does fix the issue in #4005, but needs further testing. Mostly opening this PR now to see if/how many tests fail. At the moment, I'd expect memo file regeneration to be required, but not re-import.

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionally tested using the IDR plate reported in the image.sc thread and importing it into OMERO without additional rendering settings.

Without this PR, black thumbnails follow blocks of missing wells
Screenshot 2023-05-16 at 11 25 15

With this PR, the thumbnails now all render data
Screenshot 2023-05-16 at 11 25 37

The nightly CI builds seem to indicate this caused no regression at least in the representative samples of the format in the curated OME QA repository.

As indicated inline, my primary concern is possibly the performance implications of the additional look-ups

@@ -309,6 +309,10 @@ public int compare(Channel c1, Channel c2) {

for (Plane p : planeData) {
if (p != null) {
if (!isWellAcquired(p.field.row, p.field.column)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the logic is to populate a more faithful uniqueWells (excluding the missing wells) and then initialize and populate a reversePlaneLookup map only for the existing wells which makes total sense.

My primary concern with the proposed fix is its computational cost and the potential performance impact on large plates. isWellAcquired returns the existence of a particular well from a loop on PlaneData itself. Having this call nested within a PlaneData loop means an increase of complexity which will scale with the number of wells/fields in a plate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I wasn't really thinking about performance for the initial fix, was really trying to verify that it worked. 8e1dac2 is one option that should make isWellAcquired inexpensive for repeated checks on the same well.

@sbesson sbesson self-requested a review May 16, 2023 20:16
Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nightly CI tests are still passing. I imported all the CV7000 samples of our curated QA repository (incl. the new representative sample from idr0088) into servers with and without this PR included and compared the import times via omero fs importtime

Without this PR

(.venv3) bash-4.2$ omero fs importtime Fileset:48407
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of  51.87s for 1597 files (0.032s/file)
    setId time of   4.36s
 metadata time of  52.01s
   pixels time of 303.14s for 1848 planes (0.164s/plane)
    rdefs time of  52.30s for 616 rendering settings (0.085s/rdef)
thumbnail time of  76.20s for 616 thumbnails (0.124s/thumbnail)
(.venv3) bash-4.2$ omero fs importtime Fileset:48408
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of  19.42s for 755 files (0.026s/file)
    setId time of   1.38s
 metadata time of  19.54s
   pixels time of 403.96s for 738 planes (0.547s/plane)
    rdefs time of  22.01s for 369 rendering settings (0.060s/rdef)
thumbnail time of  75.56s for 369 thumbnails (0.205s/thumbnail)
(.venv3) bash-4.2$ omero fs importtime Fileset:48409
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of   2.51s for 78 files (0.032s/file)
    setId time of   0.86s
 metadata time of   1.66s
   pixels time of  35.04s for 64 planes (0.547s/plane)
    rdefs time of   0.97s for 16 rendering settings (0.061s/rdef)
thumbnail time of   3.15s for 16 thumbnails (0.197s/thumbnail)
(.venv3) bash-4.2$ omero fs importtime Fileset:48410
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of  70.07s for 2345 files (0.030s/file)
    setId time of   2.79s
 metadata time of   4.33s
   pixels time of 422.97s for 2340 planes (0.181s/plane)
    rdefs time of   3.71s for 24 rendering settings (0.154s/rdef)
thumbnail time of   3.46s for 24 thumbnails (0.144s/thumbnail)
(.venv3) bash-4.2$ omero fs importtime Fileset:48411
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of   2.91s for 103 files (0.028s/file)
    setId time of   0.47s
 metadata time of   1.08s
   pixels time of  30.39s for 88 planes (0.345s/plane)
    rdefs time of   1.02s for 22 rendering settings (0.046s/rdef)
thumbnail time of   4.46s for 22 thumbnails (0.203s/thumbnail)
(.venv3) bash-4.2$ omero fs importtime Fileset:48412
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of  42.16s for 1495 files (0.028s/file)
    setId time of   2.17s
 metadata time of  27.11s
   pixels time of 729.45s for 1480 planes (0.493s/plane)
    rdefs time of  34.48s for 370 rendering settings (0.093s/rdef)
thumbnail time of  93.41s for 370 thumbnails (0.252s/thumbnail)
(.venv3) bash-4.2$ omero fs importtime Fileset:48413
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of   2.05s for 78 files (0.026s/file)
    setId time of   0.42s
 metadata time of   1.63s
   pixels time of  34.09s for 64 planes (0.533s/plane)
    rdefs time of   0.74s for 16 rendering settings (0.046s/rdef)
thumbnail time of   3.21s for 16 thumbnails (0.201s/thumbnail)

With this PR

(.venv3) bash-4.2$  omero fs importtime Fileset:141501
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of  40.33s for 1597 files (0.025s/file)
    setId time of   4.68s
 metadata time of  47.43s
   pixels time of 288.06s for 1584 planes (0.182s/plane)
    rdefs time of  51.51s for 528 rendering settings (0.098s/rdef)
thumbnail time of  79.51s for 528 thumbnails (0.151s/thumbnail)
(.venv3) bash-4.2$  omero fs importtime Fileset:141502
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of  17.20s for 755 files (0.023s/file)
    setId time of   1.49s
 metadata time of  18.29s
   pixels time of 367.24s for 738 planes (0.498s/plane)
    rdefs time of  20.93s for 369 rendering settings (0.057s/rdef)
thumbnail time of  99.72s for 369 thumbnails (0.270s/thumbnail)
(.venv3) bash-4.2$  omero fs importtime Fileset:141503
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of   1.96s for 78 files (0.025s/file)
    setId time of   0.40s
 metadata time of   0.80s
   pixels time of  34.06s for 64 planes (0.532s/plane)
    rdefs time of   0.63s for 16 rendering settings (0.039s/rdef)
thumbnail time of   5.98s for 16 thumbnails (0.374s/thumbnail)
(.venv3) bash-4.2$  omero fs importtime Fileset:141504
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of  54.74s for 2345 files (0.023s/file)
    setId time of   2.36s
 metadata time of   3.30s
   pixels time of 440.90s for 2340 planes (0.188s/plane)
    rdefs time of   4.49s for 24 rendering settings (0.187s/rdef)
thumbnail time of   5.24s for 24 thumbnails (0.218s/thumbnail)
(.venv3) bash-4.2$  omero fs importtime Fileset:141505
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of   2.83s for 103 files (0.027s/file)
    setId time of   0.54s
 metadata time of   0.92s
   pixels time of  46.33s for 88 planes (0.527s/plane)
    rdefs time of   0.94s for 22 rendering settings (0.043s/rdef)
thumbnail time of   8.25s for 22 thumbnails (0.375s/thumbnail)
(.venv3) bash-4.2$  omero fs importtime Fileset:141506
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of  34.33s for 1495 files (0.023s/file)
    setId time of   2.51s
 metadata time of  27.15s
   pixels time of 767.81s for 1480 planes (0.519s/plane)
    rdefs time of  25.66s for 370 rendering settings (0.069s/rdef)
thumbnail time of 115.40s for 370 thumbnails (0.312s/thumbnail)
(.venv3) bash-4.2$  omero fs importtime Fileset:141507
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: private-1
   upload time of   1.94s for 78 files (0.025s/file)
    setId time of   0.36s
 metadata time of   0.68s
   pixels time of  19.77s for 64 planes (0.309s/plane)
    rdefs time of   0.71s for 16 rendering settings (0.044s/rdef)
thumbnail time of   6.37s for 16 thumbnails (0.398s/thumbnail)

Overall, I see no adverse performance implications associated with this PR in the context or standard plates (384 wells, up to 7 FOVs/well). I'll port these changes to the IDR/bioformats fork for evaluation against the IDR samples next

Copy link
Member

@dgault dgault left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in the PR appear to be behaving as expected with no regressions in other datasets. The import metrics also suggest that there is no major change in performance. Looks good to merge.

@dgault dgault merged commit bc6c7eb into ome:develop Jun 14, 2023
17 checks passed
@sbesson
Copy link
Member

sbesson commented Jun 14, 2023

Adding a clarification: as shown implicitly. by the fs importtime output in #4006 (review), this PR makes backwards incompatible core metadata changes for CV7000 datasets where wells have been recorded in the metadata but all files are removed. Prior to this PR, the metadata included a set of Image elements associated with acquired well samples were linked to Plate/Well/WellSample hierarchy in the metadata albeit the plane reading via openBytes was incorrect and a subset of Image elements left orphaned. With this PR included, only the Image element associated with fields of views where the data exists are defined in the metadata and linked to Plate/Well/WellSample hierarchy leaving no more orphaned images.

As per the nature of the changes, this should definitely be included in a minor release of Bio-Formats as per the versioning policy

@dgault
Copy link
Member

dgault commented Jun 15, 2023

Yes this is for the next minor release, 6.14.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants