fix #10703: download pre-FS original metadata in FS #1717

mtbc · 2013-11-06T11:07:24Z

Fixes http://trac.openmicroscopy.org.uk/ome/ticket/10703. To test,

run a local 4.4 server
import files with metadata, e.g., from ome/data_repo/test_images_metadata/
upgrade to 5.0, using sql/psql/OMERO5.0DEV__6/OMERO4.4__0.sql
import other files with metadata
check that client "download original metadata" works okay for both pre-FS and FS images, both global and series metadata
check that OriginalMetadataRequestTest.testMetadataParsing() passes.

(Lines may be re-ordered but ought still be under the correct sections.)

I first tried using org.apache.commons.configuration.HierarchicalINIConfiguration to parse the file but, for our kind of INI-format, it turned out to be more trouble than it was worth.

--no-rebase as 5.0-specific

joshmoore · 2013-11-06T11:29:36Z

components/blitz/src/omero/cmd/fs/OriginalMetadataRequestI.java

+        if (rsp.fileAnnotationId != null) {
+            final IQuery iQuery = helper.getServiceFactory().getQueryService();
+            final FileAnnotation fileAnnotation = iQuery.get(FileAnnotation.class, rsp.fileAnnotationId.getValue());
+            final String filePath = pixelsService.getFilesPath(fileAnnotation.getFile().getId());


Is this not looking under /OMERO/Pixels rather than /OMERO/Files/?

That's getPixelsPath.

Of course! Cheers.

joshmoore · 2013-11-07T11:22:04Z

Using mt1_R3D_D3D.dv before and after upgrade and the following script:

import omero
import omero.all
from omero.cmd import OriginalMetadataRequest
from omero.gateway import BlitzGateway
from omero.rtypes import unwrap
from path import path


c = omero.client("localhost")
c.createSession(...)
g = BlitzGateway(client_obj=c)

rsps=[]
for i in (1,51):
    req = OriginalMetadataRequest()
    req.imageId = i
    handle = c.sf.submit(req)
    cb = g._waitOnCmd(handle)
    rsps.append(cb.getResponse())

a,b = rsps
print a.filesetId, b.filesetId
print a.fileAnnotationId, b.fileAnnotationId
a_keys = set(a.globalMetadata.keys())
b_keys = set(b.globalMetadata.keys())

print a_keys - b_keys
print b_keys - a_keys

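# Compare values only for keys present in both responses;
# indexing b.globalMetadata with a key it lacks raises KeyError.
for key in sorted(a_keys & b_keys):
    print key, unwrap(a.globalMetadata[key]), unwrap(b.globalMetadata[key])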

I get the following differences in keys:

set(['Z axis reduction quotient', 'Title 4', 'Title 7', 'Wavelength 4 (in nm)', 'Title 3', 'Title 2', 'Min, Max, Mean (w', 'Wavelength 1 (in nm)', 'Wavelength 5 (in nm)', 'Wavelength 2 (in nm)', 'Wavelength 3 (in nm)', 'Offset to first plane', 'Number of Sub-resolution sets'])

and

set(['Z position for position #2', 'Y position for position #3', 'X position for position #2', 'Z position for position #4', 'Z position for position #3', 'Title #07', 'Title #08', 'Min, Max, Mean (w=617.0 nm)', 'Y position for position #2', 'Title #02', 'Title #03', 'Title #04', 'Title #05', 'Title #06', 'Y position for position #4', 'X position for position #3', 'X position for position #4', 'Title #10', 'Title #01', 'Title #09', 'Min, Max, Mean (w=470.0 nm)'])

which looks as if the parsing of = within () is not handled.

mtbc · 2013-11-07T11:44:18Z

It isn't. The Apache Commons library I tried didn't seem to quite fit the INI-like format that we use, and your testing reveals that going too simple doesn't either. Is there a spec somewhere of our version of it so I know what to handle how? (I am thinking that '=' within '()' might not be the only problem, for instance perhaps we do something with quoting or escaping or other kinds of bracket or brace too?) Or, avoiding parenthesis balancing, perhaps it is always okay to split only at the last '=', ignoring any earlier -- we could give that a try?
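A minimal sketch of the two splitting strategies under discussion, for illustration only. The key is taken from the differences reported above; the value after the final '=' is made up, and the function names are hypothetical rather than anything in OMERO or Bio-Formats.

```python
def split_first(line):
    """Split at the first '=' (the simple approach)."""
    if "=" not in line:
        return None
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


def split_last(line):
    """Split at the last '=', ignoring any earlier ones."""
    if "=" not in line:
        return None
    key, _, value = line.rpartition("=")
    return key.strip(), value.strip()


# The value part of this line is invented for the example.
line = "Min, Max, Mean (w=617.0 nm)=0.0, 4095.0, 120.5"

first = split_first(line)  # key is cut off inside the parentheses
last = split_last(line)    # key keeps the whole "(w=617.0 nm)" part
```

Splitting at the first '=' reproduces the truncated key 'Min, Max, Mean (w' seen in the test output. Splitting at the last '=' keeps the key whole, but it would cut in the wrong place for any line whose value itself contains '='.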

joshmoore · 2013-11-07T13:55:20Z

/cc @melissalinkert @will-moore @jburel

melissalinkert · 2013-11-07T16:22:15Z

@mtbc, have you tried the loci.common.IniParser class? Not sure if that will work, but it (and the other Ini* classes) are what Bio-Formats uses internally for reading and writing INI files.

mtbc · 2013-11-07T17:28:46Z

It might well, thank you; I shall give it a try.

mtbc · 2013-11-08T11:57:09Z

Hmm, it looks to delegate to ome.scifio.common.IniParser whose parseINI does nothing special with parentheses and splits on the first '='.

jburel · 2013-11-10T21:03:23Z

Agree that download and downgrade (currently in Insight) should be done in B-F, but not in that PR.

mtbc · 2013-11-10T21:51:06Z

I can certainly see adding unit testing for parseOriginalMetadataTxt.

Once this PR is agreed to work, if deemed desirable for Bio-Formats I could try to introduce splitOnEquals into IniParser and then use that class in OriginalMetadataRequestI, perhaps after beta2.
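For readers wondering what a parenthesis-aware split could look like, here is one possible sketch. It is an assumption for illustration, not the splitOnEquals from this PR, whose rules are not spelled out in this thread; the function name and sample values are hypothetical.

```python
def split_outside_parens(line):
    """Split at the first '=' that is not enclosed in parentheses.

    Returns a (key, value) tuple, or None if no such '=' exists.
    """
    depth = 0
    for index, char in enumerate(line):
        if char == "(":
            depth += 1
        elif char == ")":
            # Tolerate a stray ')' rather than going negative.
            depth = max(depth - 1, 0)
        elif char == "=" and depth == 0:
            return line[:index].strip(), line[index + 1:].strip()
    return None


# Sample values are invented; the keys follow those seen in this thread.
with_parens = split_outside_parens("Min, Max, Mean (w=470.0 nm)=1, 2, 3")
with_equals_in_value = split_outside_parens("Recording #1 Notes=a=b")
```

Unlike splitting at the last '=', this keeps any '=' in the value intact, at the cost of assuming that parentheses in keys are balanced.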

joshmoore · 2013-11-11T09:47:42Z

@mtbc : sounds like a plan.

omero.cmd.fs.OriginalMetadataRequestTest.testMetadataParsing which presumably someday gets moved with other Blitz tests to OmeroJava

bpindelski · 2013-11-12T11:31:35Z

Pre-upgrade images have the default metadata file name "original_metadata.txt". Post-upgrade files have random file names...
leica-lif has differences between original metadata files (images imported after the upgrade seem to have less information in the original metadata file downloaded from Insight)
For metamorph/sample_3x3.stk imported after upgrade, I couldn't download the metadata (Metadata could not be retrieved)
For zeiss-lsm/colocsample1b.lsm that was imported post-upgrade, the MIME type for the original metadata text file is wrong (file foo.txt returns foo.txt: data)
For post-upgrade imported zeiss-lsm/sample files.mdb, the right hand side Acquisition panel has some strange entries (Recordings ...):

In general - the metadata files are present, but with minor (major?) differences. It's hard for me to judge the implications, so maybe someone with more metadata-fu could have a look? /cc @pwalczysko

The unit test runs fine. If the issues mentioned by me are to be handled in a different PR, this is OK to merge.

mtbc · 2013-11-12T11:48:42Z

How do the downloaded files differ from the original_metadata.txt files that end up in Files/? (Probably it's easy to find the right ones, but let me know if you'd like me to dig up some SQL.)

I'd guess that the MIME type issue is unrelated to this PR but very likely worth at least ticketing. I wonder if the Leica issue is some Bio-Formats change or something. I'll cc @rleigh-dundee as he may have some familiarity with all this too.

bpindelski · 2013-11-12T12:10:26Z

@mtbc The differences were either related to the # symbol being used in series numbers or missing/added metadata fields (e.g. pre-upgrade imported image had fewer fields in the metadata file than post-upgrade imported image). I didn't do a full image-by-image comparison of all the metadata attached to files in test_images_metadata - that's a full day's job (unless we can automate it, as there are 381 image files - yes, some of them are part of a MIF).

mtbc · 2013-11-12T16:12:11Z

Thank you, I will investigate tomorrow.

mtbc · 2013-11-13T09:25:54Z

So, the files I'll look at are,

omero=> select i.name, of.name, a.file from image i, imageannotationlink ial, annotation a, originalfile of where i.id = ial.parent and ial.child = a.id and a.file = of.id;
                                                 name                                                  | file 
-------------------------------------------------------------------------------------------------------+------
 /Volumes/ome/data_repo/test_images_metadata/leica-lif/01_4C1Z.lif                                     |   19
 /Volumes/ome/data_repo/test_images_metadata/metamorph/sample_3x3.stk                                  |   20
 /Volumes/ome/data_repo/test_images_metadata/zeiss-lsm/colocsample1b.lsm                               |   21
 /Volumes/ome/data_repo/test_images_metadata/zeiss-lsm/sample files.mdb/sample files.mdb [XY-ch-02]    |   23
 /Volumes/ome/data_repo/test_images_metadata/zeiss-lsm/sample files.mdb/sample files.mdb [XY-ch-03]    |   24
 /Volumes/ome/data_repo/test_images_metadata/zeiss-lsm/sample files.mdb/sample files.mdb [XY-ch]       |   25
 /Volumes/ome/data_repo/test_images_metadata/zeiss-lsm/sample files.mdb/sample files.mdb [XYT]         |   26
 /Volumes/ome/data_repo/test_images_metadata/zeiss-lsm/sample files.mdb/sample files.mdb [XYZ-ch-20x]  |   27
 /Volumes/ome/data_repo/test_images_metadata/zeiss-lsm/sample files.mdb/sample files.mdb [XYZ-ch-zoom] |   28
 /Volumes/ome/data_repo/test_images_metadata/zeiss-lsm/sample files.mdb/sample files.mdb [XYZ-ch]      |   29
 /Volumes/ome/data_repo/test_images_metadata/zeiss-lsm/sample files.mdb/sample files.mdb [XYZ-ch0]     |   30
(11 rows)

will-moore · 2013-11-13T09:29:23Z

FYI: I found a bug in download of pre-FS original files while reviewing another PR: #1738. Don't know if that's something related to this?

mtbc · 2013-11-13T10:57:26Z

I am comparing the original_metadata.txt from Files/ with the original metadata files I download from Insight. For most of the above images, I find them the same, except that lines within a section are reordered. So, if @bpindelski is viewing them in ways that make most of them seem very different then that is probably unrelated to this PR and worth a ticket. There are, however, three files that exhibit differences, about which I shall separately comment.
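An order-insensitive comparison of this kind could be automated roughly as follows. This is a sketch under assumptions: it treats any line of the form [Name] as a section header, ignores blank lines, and drops empty sections (an empty [GlobalMetadata] is reported further down as being omitted on download). The helper names and the [Other] section are hypothetical.

```python
def normalise(text):
    """Map each non-empty section name to its sorted list of lines."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    # Sort within each section and drop sections that have no lines.
    return {name: sorted(lines) for name, lines in sections.items() if lines}


def same_metadata(stored_text, downloaded_text):
    """True if both texts hold the same lines under the same sections."""
    return normalise(stored_text) == normalise(downloaded_text)


stored = "[GlobalMetadata]\nb=2\na=1\n"
downloaded = "[GlobalMetadata]\na=1\nb=2\n"
reordered_only = same_metadata(stored, downloaded)
```

Because lines are compared as whole strings, this sketch says nothing about how each line is split into key and value; it only checks that nothing was added, lost, or moved to another section.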

mtbc · 2013-11-13T10:58:07Z

leica-lif/01_4C1Z.lif has an empty [GlobalMetadata] section that quite reasonably is omitted on download.

mtbc · 2013-11-13T11:01:52Z

zeiss-lsm/sample files.mdb/sample files.mdb [XY-ch] has a "line",

Recording #1 Notes=IHC 15.07.08,  Part I Sequenza for comparison of the intensity of CFTR signal between CF and Non CF and between Sequenza  AR 























1/2 AW non CF/CK+,  1/ JTDF  CF  slide 14.11.07 hnb
IHC done by Heather 1/2 cells are non CF labelled with CK and G449 and 1/2 CF labelled only with G449
Ref pict from 18.09.07 stack 1, modified to 12 bit

which got parsed as just the first line and the other lines were ignored.

I think I'd probably argue here that this was the right thing to do, unless we really should be appending following (even blank) lines onto values without any explicit quoting or line continuation hint at all? I don't know if anyone watching this PR has a strong opinion on this or if it's worth an RFE ticket.
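To make the behaviour concrete, here is a small sketch of a line-oriented parser that keeps only lines containing '=' and therefore drops the rest of a multi-line value. It is not the parser from this PR; the function name is hypothetical and the sample text is an abbreviated version of the "line" quoted above.

```python
def parse_line_oriented(text):
    """Parse '[Section]' headers and 'key=value' lines, one line at a time."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
        # Anything else (blank lines, continuation text) is skipped.
    return sections


# Abbreviated sample: a value followed by blank lines and more text.
sample = "\n".join([
    "[GlobalMetadata]",
    "Recording #1 Notes=IHC 15.07.08,  Part I",
    "",
    "",
    "Ref pict from 18.09.07 stack 1, modified to 12 bit",
])
parsed = parse_line_oriented(sample)
```

Appending the later lines to the value instead would need some rule for where a value ends, which the format does not appear to provide.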

mtbc · 2013-11-13T11:05:58Z

zeiss-lsm/sample files.mdb/sample files.mdb [XYZ-ch] has the same issue as above with exactly the same "line".

mtbc · 2013-11-13T11:12:08Z

So, at this point I think I'm still happy with this PR as at least being a substantial improvement unless that awful many-line Recording #1 Notes must be handled differently.

Given that @bpindelski mentions an issue with # I would guess that he's using a code path that uses ome.scifio.common.IniParser which does,

  private String commentDelimiter = "#";

      String line = in.readLine();
      if (line == null) break;
      no++;

      // strip comments
      if (commentDelimiter != null) {
        int comment = line.indexOf(commentDelimiter);
        if (comment >= 0) line = line.substring(0, comment);
      }

which, given the nature of our actual input data, may well be worth a ticket.
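The effect of that comment stripping on our keys can be shown with a Python rendering of the same logic. This is an illustrative translation of the Java excerpt above, not Bio-Formats code; the function name and the sample values are made up, while the keys are ones reported earlier in this thread.

```python
def strip_comment(line, comment_delimiter="#"):
    """Drop everything from the first comment delimiter onwards."""
    if comment_delimiter is not None:
        comment = line.find(comment_delimiter)
        if comment >= 0:
            line = line[:comment]
    return line


# Values after '=' are invented for the example.
title = strip_comment("Title #07=some title")
position = strip_comment("Z position for position #2=1.5")
untouched = strip_comment("Offset to first plane=1024")
```

Any key containing '#' loses its tail together with the '=' and the value, so the stripped line no longer looks like a key-value pair at all.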

mtbc · 2013-11-13T11:21:52Z

@will-moore: I think, probably unrelated, but thank you!

mtbc · 2013-11-13T11:31:04Z

It should be noted that the same images imported in 4.4 and upgraded, and imported in 5.0, may not have identical metadata due to various other code differences, especially in Bio-Formats.

mtbc · 2013-11-14T11:04:41Z

Filed http://trac.openmicroscopy.org.uk/ome/ticket/11684 and http://trac.openmicroscopy.org.uk/ome/ticket/11685 about the INI format issues. @bpindelski please could you file one about how to reproduce the MIME type issue you discovered? This PR at least, I think, is a significant improvement over the current state of affairs (wherein one couldn't download any metadata in OMERO 5 for images imported prior to upgrade from OMERO 4).

bpindelski · 2013-11-14T11:32:51Z

@mtbc Ticket for MIME type difference opened: https://trac.openmicroscopy.org.uk/ome/ticket/11688

I also agree that this PR improves the state of the original metadata aspect. I don't have anything against merging, and massaging out the subtler bugs later on.

joshmoore · 2013-11-14T11:36:58Z

Thanks, guys!

mtbc added 3 commits November 5, 2013 11:30

section titles to match actual 4.4 original_metadata.txt

b50f306

adjust Insight to match server metadata section headings

75be4e3

parse pre-FS original metadata txt files in request

c530c20

joshmoore reviewed Nov 6, 2013

mtbc added 4 commits November 8, 2013 12:17

more complex selection of which = to split at in INI

2604584

set charset for reading INI file text to UTF-8 encoding

fb8245c

minor whitespace fix

4519bf9

set charset for writing INI file text to UTF-8 encoding

330a2bc

add test of original metadata file parsing

8b64b52

omero.cmd.fs.OriginalMetadataRequestTest.testMetadataParsing which presumably someday gets moved with other Blitz tests to OmeroJava

joshmoore added a commit that referenced this pull request Nov 14, 2013

Merge pull request #1717 from mtbc/trac-10703-original-metadata

a6df36d

fix #10703: download pre-FS original metadata in FS

joshmoore merged commit a6df36d into ome:develop Nov 14, 2013

mtbc deleted the trac-10703-original-metadata branch November 14, 2013 11:41
