Micromanager: parse JSON tag in each TIFF #2213
See https://trello.com/c/499UGA5U/72-micromanager-stage-positions-and-json This reads the JSON data from each TIFF and stores the key/value pairs in the original metadata table. Stage position values ("*PositionUm") will also be stored in OME-XML, if they are present. As each TIFF tag contains a single JSON object, simple parsing of the key/value pairs without an external library was the easiest solution. We'll likely want to revisit this once we have a better plan for dealing with JSON data in general though.
See configuration for new dataset: https://github.com/openmicroscopy/data_repo_config/pull/73 |
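For illustration only, the behaviour described above (store every key/value pair from a flat JSON object in the original metadata table, and pick out the "*PositionUm" values separately) could be sketched as follows. This is not the Java code in the PR: the function name `flatten_tag`, the `Plane` key prefix format, and the sample keys are assumptions, and the standard-library `json` module is used here instead of the hand-written parsing the PR chose.

```python
import json


def flatten_tag(json_text, prefix="Plane"):
    """Split one flat JSON object into (metadata table, stage positions).

    Hypothetical helper: the name, the prefix format and the return shape
    are illustrative assumptions, not the Bio-Formats API.
    """
    obj = json.loads(json_text)
    metadata = {}
    positions = {}
    for key, value in obj.items():
        # Every pair goes into the "original metadata" table as a string.
        metadata[f"{prefix} {key}"] = str(value)
        # Keys matching "*PositionUm" are also kept as numbers, if they parse.
        if key.endswith("PositionUm"):
            try:
                positions[key] = float(value)
            except (TypeError, ValueError):
                pass  # leave unparseable positions out rather than guess
    return metadata, positions


# Made-up sample tag content, not taken from a real acquisition.
sample = '{"XPositionUm": "12.5", "YPositionUm": -3, "ZPositionUm": "0.25", "Channel": "DAPI"}'
metadata, positions = flatten_tag(sample)
```

Positions that are absent or not numeric are simply skipped, which matches the "if they are present" condition in the description.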
As per Melissa's request, I attach an example dataset acquired with the demo config of MM 1.4.22. tifutill reports that it has all the required tags:
|
NB: general question which will also become an issue for OMERO -- at what stage do we refactor json parsing out to helpers and/or introduce a dependency? |
Thanks, @julou. We'll fix that dataset in a separate pull request, as it is a proper OME-TIFF dataset and not the original Micromanager format (i.e. no metadata.txt file). Noted here so we don't lose track: https://trello.com/c/0FWF75pF/88-parse-extra-ome-tiff-tags @joshmoore: probably would be good to do that in the not-very-distant future. I just didn't want to make a unilateral decision (given the number of library choices), and assumed we wouldn't have bandwidth to discuss/trial/decide in the current milestone. |
@melissalinkert I'm afraid there might be some misunderstanding here and I really don't understand why you want to make this a separate issue. As explained by M. Tsuchida:
So the dataset I sent is nothing else than a proper sample dataset of micromanager (with both OME metadata and JSON config-specific metadata, although the latter are not duplicated in a metadata.txt file). As far as I understand, parsing metadata should not rely on the .txt file, all the more so as these are not imported by omero (at least I haven't managed to so far). From your answer, I'm afraid that the current reader relies at some point on the metadata.txt file; is this the case or not? |
btw I just looked at your modified code and realised you indeed get the JSON MM Metadata from metadata.txt. Thank you very much for your work on this! |
Works as expected. Please merge.. |
@bramalingam I'm surprised that you simply ignored my last comments. Relying on the metadata.txt files rather than the internal tiff tag makes the patch useless for most users and as far as I can tell for omero import… |
Hi @julou and apologies for the delay in answering your comment. From the OME point of view, Micro-Manager currently saves its acquired data under two types of file formats as described in the dataset structure table:
We certainly understand that this distinction might feel irrelevant for most consumers of Micro-Manager. However, at the Bio-Formats level, each file format is handled by a separate image reader. This duality of formats imposes some constraints in terms of code management, bug fixes, testing and documentation. This is also why we tend to open pull requests for individual readers. The current PR opened by @melissalinkert targets the parsing of the JSON metadata for the MicroManager format only, and this is why it was functionally tested by @bramalingam within this scope. For files saved as .ome.tif, the situation is more complex, as Melissa alluded to above. Although the OME-TIFF files generated by Micro-Manager are fully valid from an OME-XML point of view, the usage of a custom TIFF tag to store extra metadata is not part of the specification. As pointed out by Mark in the related Micro-Manager thread, supporting this custom metadata is effectively equivalent to adding support for a new file format. We will follow up on the threads initiated both on the ome-users and the Micro-Manager mailing lists and start a conversation on how to store this extra metadata in the existing OME-TIFF format without creating a new specification. Best, |
Hi guys, one of the µManager devs here. There seems to have been some miscommunication going on here, so I'd just like to work on figuring out exactly what is going on and what is needed. First off, looking at the diff for this PR, it seems like it's actually parsing JSON from tag 50839. I don't see any reference to the metadata.txt file, but I also don't see how accessing tag 50839 gets you the JSON metadata we generate. So I'm a bit confused about what this change actually does. Second, regarding file formats: µManager's file format does not inherently include the metadata.txt file. As Thomas notes, generation of this file is optional. It is intended primarily for users who want to be able to access image metadata without going through µManager. When µManager itself reads a µManager dataset, the file is ignored in favor of TIFF tag 51123. So I think it's incorrect to characterize the µManager file format as "a metadata.txt with a set of TIFF files". I'm not familiar with BioFormats' implementation or what is and is not easy to accomplish, but if BioFormats is relying on the metadata.txt, then it is imperative that the file loader gracefully handle the absence of a metadata.txt file, presumably by simply not populating the X/Y/Z positions. @sbesson I admit to being a bit confused by your comment. There's already a MicromanagerReader class; what technical hurdle prevents that class from accessing tag 51123, reading the JSON, and extracting any relevant tags from it for insertion into the OME? Of course ideally we (that is, µManager) would be storing this information in the OME ourselves when we write the file; our handling of OME metadata is far from perfect. We want to do a serious revamp of how µManager stores data, for many reasons, but that's a fairly long ways off. On a side note, manually parsing JSON makes me uneasy. |
Hi @ChrisWeisiger -- was just writing up a response to the ome-users & micro-manager-general mailing lists about this. Any chance of discussing briefly on IRC (freenode's #ome) or on https://gitter.im/openmicroscopy/bioformats? |
In the meantime, a few comments:
Sorry for any misrepresentation, a better way of saying this would be "for Bio-Formats ... µManager file format as "a metadata.txt with a set of TIFF files", because this is exactly what the reader scans for at present.
Assuming the metadata is present in the OME-TIFF IFD, then that'll happen. But from the Bio-Formats point-of-view, this will be an OME-TIFF, and not a µManager file.
At the moment, the primary hurdle is that this looks to be a new format from the Bio-Formats perspective and one that could potentially be confused (internally) with OME-TIFF itself. And that's the type of corner case it would be good to work through.
On that point, it might be good to pass some examples back and forth so we're clear on expectations. Slowly sinking into weekend mode. If I/we don't see you on IRC, etc. looking forward to further discussions. ~Josh |
Okay, now that I've looked at the entire file instead of just what was modified in this diff, I see that you are indeed loading the metadata.txt file as Thomas said. I'm on IRC now and available to talk about our plans and how best to handle this kind of thing in future. |
Hi @julou. A quick follow-up before the weekend having chatted on IRC with @ChrisWeisiger: our best suggestion for the moment is to continue generating the separate metadata.txt in µManager. Without it, the MicromanagerReader (modified here) is not used at all, and instead the OMETiffReader is used, meaning the metadata is not accessible to you. @ChrisWeisiger and I discussed possible solutions for down the road, but changes in either project are not going to be immediately useful for you (or for others). As soon as I can, I'll follow up on your ome-users email thread and CC the micro-manager-general mailing list with general thoughts on the way forward. |
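As a rough sketch of the rule described in this comment (a dataset with a file named exactly `metadata.txt` goes to the MicromanagerReader, otherwise an `.ome.tif` dataset goes to the OMETiffReader), something like the following captures the decision. It is only a restatement of the thread in code, not how Bio-Formats actually selects readers; `pick_reader` and its inputs are hypothetical.

```python
def pick_reader(file_names):
    """Return the reader name the thread says would handle these files.

    Hypothetical helper: it only encodes the rule stated in the discussion
    and does not reproduce Bio-Formats' real format detection.
    """
    names = set(file_names)
    # The reader currently scans for a file named exactly "metadata.txt".
    if "metadata.txt" in names:
        return "MicromanagerReader"
    # Without it, Micro-Manager's .ome.tif output is read as plain OME-TIFF.
    if any(n.lower().endswith((".ome.tif", ".ome.tiff")) for n in names):
        return "OMETiffReader"
    return None


with_txt = pick_reader(["metadata.txt", "img_000000000_Default_000.tif"])
without_txt = pick_reader(["test_1_MMStack_Pos0.ome.tif"])
```

The file names in the usage lines are examples only; the second one follows the naming quoted later in the thread.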
Hi @joshmoore. Thanks a lot for looking more deeply into this and taking the time to coordinate with the MM guys (thanks @ChrisWeisiger btw). As far as I understand, when looking at a dataset, BF will only use the MM reader if it's made of a metadata.txt file + a bunch of tifs, otherwise it'll fall back to using the default OME-TIFF reader… correct? About relying on metadata.txt files at the moment, I come back to one earlier comment: the whole point of asking for this parsing from my perspective is to get metadata displayed in OMERO. My user experience so far is that I couldn't get omero to import anything but this… Is there a way to force the java importer to look for specific files (e.g. based on name patterns)? As a side question, is it possible to customise the OME-TIFF reader? If yes, what would be more relevant than detecting MM datasets based on metadata.txt files would be to detect them based on name patterns (either a bunch of tif stacks with |
btw I did some more tests today to try to force omero to import the metadata.txt files but I was not successful so far. @joshmoore your hints on how to achieve this would be more than appreciated. |
Correct, though there looks to be one difficulty with your latest dataset in that there's a prefix before
Understood, but there's some difficulty here in that we're dealing with at least 2 different file formats (from our perspective). With this PR, we'd assume that the dataset containing a
Not really, and by design. We'll try to clarify why in the spec documentation itself, but the burden of detecting custom uses of OME-TIFFs is too high at the moment. Instead, we'd like to investigate making the extension points explicit.
This is conceptually possible, though it's a part of the
Which fileset particularly? For example, |
True! even in my hands and with our omero server!!!
Yes, I now realise that this might be the root of the misunderstanding. It looks like you're currently not taking into account that an MM dataset can be saved as stacks or as separate files (one per frame), depending on the user's preference. Apparently your reader expects a file named metadata.txt, and I assumed up to now that it could be any metadata file ending with this pattern (didn't even notice the difference to tell the truth)… The reason for that is that file management is easier on our side with a few large files than with many small ones, hence we always save datasets as stacks (and it's not really possible to consider changing this). I realise this requires another ticket. Shall I send another email to the ome-users mailing list? Also, if you extend the MMreader so that it takes all metadata files into account, will this automatically propagate to omero? Or is it hardcoded somewhere in omero to import metadata.txt files (but not *_metadata.txt files)? Thanks a lot for your patient support! |
Absolutely, at the moment the reader is looking for a file exactly named
Understood. Generating fewer files of larger but manageable size is completely sensible.
OMERO will automatically import the set of files which have been detected and grouped together by Bio-Formats. This means that once the MicromanagerReader is extended to support
Thanks for your input and for uploading new datasets. As this PR is starting to turn into an epic (but hopefully fruitful) discussion, I will now merge its improvements, which have been reviewed at both the Bio-Formats and OMERO levels. As mentioned above, we will deal with the other issues like the recognition of The goal is to have these changes reviewed and included by the end of this week so that the upcoming releases of Bio-Formats 5.1.8 and OMERO 5.2.2 can digest the four filesets you uploaded in #2213 (comment) and parse their metadata. |
Micromanager: parse JSON tag in each TIFF
--rebased-to #2224 |
Wow! Really excited about that :) For this week, focusing on supporting the stack format of the current stable release (i.e. test_stack) would already be great! |
@julou In principle it should not make a difference if you save images as they are acquired vs. if you save them after accumulating a complete dataset in RAM; however, it would not surprise me if there were minor differences. Both should be valid OME datasets (and of course, valid MM datasets). As far as filenames are concerned, images saved as they are acquired will have the name of the directory as a prefix (e.g. files saved in /tmp/test will be named like "test_1_MMStack_Pos0.ome.tif") while images saved after accumulating the complete set in RAM will have names like "MMStack_Pos0.ome.tif" regardless of where they are saved. However, this is not considered meaningful by µManager. |
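To make the naming conventions mentioned in this thread concrete, here is a small sketch that recognises the two stack file name shapes quoted above and the two metadata file name shapes discussed earlier (`metadata.txt` and `*_metadata.txt`). The regular expression and both helper names are assumptions made for illustration; they are not taken from µManager or Bio-Formats.

```python
import re

# Optional "<prefix>_" followed by "MMStack_<position>.ome.tif(f)".
# Pattern inferred from the two example names in the comment above.
STACK_NAME = re.compile(
    r"^(?:(?P<prefix>.+)_)?MMStack_(?P<position>.+?)\.ome\.tiff?$"
)


def parse_stack_name(name):
    """Return (prefix, position) for a stack file name, or None if no match."""
    match = STACK_NAME.match(name)
    if match is None:
        return None
    return match.group("prefix"), match.group("position")


def is_metadata_file(name):
    """True for 'metadata.txt' and for names ending in '_metadata.txt'."""
    return name == "metadata.txt" or name.endswith("_metadata.txt")


saved_during_acquisition = parse_stack_name("test_1_MMStack_Pos0.ome.tif")
saved_from_ram = parse_stack_name("MMStack_Pos0.ome.tif")
```

Since the prefix is described as not meaningful to µManager, the sketch returns it separately so a caller can ignore it when grouping files.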
See https://trello.com/c/499UGA5U/72-micromanager-stage-positions-and-json
This reads the JSON data from each TIFF and stores the key/value pairs
in the original metadata table. Stage position values ("*PositionUm")
will also be stored in OME-XML, if they are present.
As each TIFF tag contains a single JSON object, simple parsing of the
key/value pairs without an external library was the easiest solution for now.
We'll likely want to revisit this once we have a better plan for dealing
with JSON data in general though.
To test, use the dataset in curated/micromanager/nico/Untitled_4_multi_file (config PR forthcoming).
Without this change, showinf -omexml should not show PositionX, PositionY, or PositionZ values for Plane in the OME-XML.
With this change, all 3 values should be set for every Plane, and the original metadata table should show many entries beginning with Plane.
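The check on the OME-XML can also be scripted. The sketch below assumes the OME-XML has already been separated from the rest of the showinf output and is available as a string; it lists every Plane element that lacks one of the three position attributes. The helper name and the trimmed sample document are illustrative assumptions (the sample is not schema-valid and is not output from the test dataset).

```python
import xml.etree.ElementTree as ET

POSITION_ATTRS = ("PositionX", "PositionY", "PositionZ")


def planes_missing_positions(ome_xml):
    """Return [(plane_index, [missing attribute names])] for incomplete Planes."""
    root = ET.fromstring(ome_xml)
    # Compare local names so the check works for any OME schema namespace.
    planes = [e for e in root.iter() if e.tag.rsplit("}", 1)[-1] == "Plane"]
    missing = []
    for index, plane in enumerate(planes):
        absent = [a for a in POSITION_ATTRS if a not in plane.attrib]
        if absent:
            missing.append((index, absent))
    return missing


# Trimmed, made-up sample: the second Plane lacks PositionY and PositionZ.
sample = (
    '<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2015-01">'
    '<Image ID="Image:0"><Pixels ID="Pixels:0">'
    '<Plane TheC="0" TheT="0" TheZ="0" PositionX="1.0" PositionY="2.0" PositionZ="3.0"/>'
    '<Plane TheC="0" TheT="0" TheZ="1" PositionX="1.0"/>'
    '</Pixels></Image></OME>'
)
report = planes_missing_positions(sample)
```

An empty list means every Plane carries all three positions, which is the expected result with this change applied.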