Add support for PrairieView version 5.2 datasets #1306

Merged
merged 19 commits into ome:develop from devel/prairie-view-5 on Sep 12, 2014


ctrueden
Member

This branch adds support for the new, more compact Prairie XML file, as well as the updated ENV metadata file. It updates the PrairieMetadata structure to be compatible with the new, more hierarchical key/value structure of 5.2, and retrofits parsing of older files to be compatible with this structure. In this way, consuming code does not need to care which version of PrairieView produced the data.

Tested in ImageJ with various datasets using the "Grid/Collection stitching" plugin, and all seems well. Would be great to run a full automated test suite over all available Prairie sample data, including the new 5.2 sample data provided by Prairie/Bruker.

Specifically, each <File> now knows its <Frame>, and each <Frame> now
knows its <Sequence>. This will be useful for handling the more compact XML
structure of PrairieView 5.x, in which parameter values are inherited
from parent elements where feasible.
This returns null when the attribute is not present, rather than the
empty string. It also makes the code slightly shorter.
The key/value pair structure got more complex in PrairieView 5.2. Now
each key can point at either a single string item, or at a table of
values, which can themselves point at sub-tables.

This data structure is intended to facilitate easy management of the
new structure, while still supporting the old structure, too.
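
As a rough illustration of the structure described above (not the actual ValueTable code from this branch), a value that is either a single string or a table of further values could be modeled like this. The class name `ValueSketch` and all of its methods are hypothetical, and the `bitDepth` value is made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: a value is either a single string or a table of nested values. */
class ValueSketch {
  private final String text;                     // non-null only for a single string item
  private final Map<String, ValueSketch> table;  // non-null only for a table

  private ValueSketch(String text, Map<String, ValueSketch> table) {
    this.text = text;
    this.table = table;
  }

  static ValueSketch leaf(String text) { return new ValueSketch(text, null); }
  static ValueSketch table() { return new ValueSketch(null, new LinkedHashMap<>()); }

  boolean isTable() { return table != null; }

  /** Adds or replaces an entry; only valid on tables. Returns this for chaining. */
  ValueSketch put(String key, ValueSketch value) {
    if (!isTable()) throw new IllegalStateException("not a table");
    table.put(key, value);
    return this;
  }

  /** Returns the entry for the key, or null if it is absent or this is a single item. */
  ValueSketch get(String key) { return isTable() ? table.get(key) : null; }

  /** Returns the string of a single item, or null for a table. */
  String text() { return text; }

  @Override public String toString() { return isTable() ? table.toString() : text; }

  public static void main(String[] args) {
    ValueSketch root = table()
      .put("bitDepth", leaf("12")) // illustrative value only
      .put("laserPower", table().put("0", leaf("0")).put("1", leaf("0")));
    System.out.println(root); // {bitDepth=12, laserPower={0=0, 1=0}}
  }
}
```

Returning null for a missing key, rather than throwing, is a design choice of this sketch so that callers can chain lookups and check once at the end.
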
These are useful when dealing with possible null values.
As of PrairieView 5.2, the key/value pair structure is more
hierarchical, with value tables able to contain sub-tables.

Let's parse the old-style keys in the same way, so that regardless of
which version of PrairieView wrote the data, it can be consumed in the
same way after parsing.
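
One way to picture the retrofit, assuming old-style keys have the flat `name_index` form seen in older original metadata output (for example `laserPower_0` or `galvo_XAxis+LagTime`), is to split each key at its first underscore and group the results. This is only a sketch with a hypothetical `OldKeySketch` class; the real parsing code may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: regroup flat "name_index" keys into name -> {index -> value}. */
class OldKeySketch {
  /**
   * Keys without an underscore stay plain strings. Keys with one are split at the
   * first underscore. Assumption: a name is never used both as a plain key and as
   * a table name (that case would fail with a ClassCastException here).
   */
  static Map<String, Object> group(Map<String, String> flat) {
    Map<String, Object> result = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : flat.entrySet()) {
      String key = e.getKey();
      int split = key.indexOf('_');
      if (split < 0) {
        result.put(key, e.getValue());
        continue;
      }
      String name = key.substring(0, split);
      String index = key.substring(split + 1);
      @SuppressWarnings("unchecked")
      Map<String, String> table = (Map<String, String>)
        result.computeIfAbsent(name, k -> new LinkedHashMap<String, String>());
      table.put(index, e.getValue());
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> flat = new LinkedHashMap<>();
    flat.put("linesPerFrame", "512");
    flat.put("laserPower_0", "0");
    flat.put("laserPower_1", "0");
    flat.put("galvo_XAxis+LagTime", "88");
    System.out.println(group(flat));
    // {linesPerFrame=512, laserPower={0=0, 1=0}, galvo={XAxis+LagTime=88}}
  }
}
```
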
We are going to add additional ValueTable fields in other places,
and calling them all "values" gets confusing (and causes IDEs to
emit warnings about variable names shadowing one another).
PrairieView 5.2 stores the key/value structure differently: it uses a
sequence of PVStateValue elements with IndexedValue and SubindexedValue
elements underneath. This commit adds the code necessary to parse such
structures into a ValueTable object.
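
A minimal sketch of that kind of parsing, using only the JDK's DOM API, might look like the following. The sample XML is invented for illustration, the attribute names (`key`, `index`, `subindex`, `value`) are assumptions, and real PrairieView files may nest these elements differently. The class `StateShardSketch` is hypothetical and is not the parser added by this commit.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical sketch: read nested state elements into nested maps. */
class StateShardSketch {
  /** Parses the XML and returns a table keyed by each child's key attribute. */
  static Map<String, Object> parse(String xml) {
    try {
      Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
        .getDocumentElement();
      return readTable(root);
    } catch (Exception e) {
      throw new IllegalArgumentException("could not parse XML", e);
    }
  }

  /** An element with a value attribute is a single item; otherwise it is a table. */
  static Object read(Element e) {
    return e.hasAttribute("value") ? e.getAttribute("value") : readTable(e);
  }

  /** Builds a table from the child elements, skipping whitespace and text nodes. */
  static Map<String, Object> readTable(Element e) {
    Map<String, Object> table = new LinkedHashMap<>();
    NodeList kids = e.getChildNodes();
    for (int i = 0; i < kids.getLength(); i++) {
      Node n = kids.item(i);
      if (n.getNodeType() != Node.ELEMENT_NODE) continue;
      Element kid = (Element) n;
      table.put(keyOf(kid), read(kid));
    }
    return table;
  }

  /** Uses the first of the assumed key attributes; falls back to the tag name. */
  static String keyOf(Element e) {
    for (String name : new String[] {"key", "index", "subindex"}) {
      if (e.hasAttribute(name)) return e.getAttribute(name);
    }
    return e.getTagName();
  }

  public static void main(String[] args) {
    String xml =
      "<PVStateShard>"
      + "<PVStateValue key='bitDepth' value='12'/>"
      + "<PVStateValue key='laserPower'>"
      + "<IndexedValue index='0' value='0'/>"
      + "<IndexedValue index='1' value='0'/>"
      + "</PVStateValue>"
      + "<PVStateValue key='positionCurrent'>"
      + "<IndexedValue index='XAxis'>"
      + "<SubindexedValue subindex='0' value='-677.5'/>"
      + "</IndexedValue>"
      + "</PVStateValue>"
      + "</PVStateShard>";
    System.out.println(parse(xml));
    // {bitDepth=12, laserPower={0=0, 1=0}, positionCurrent={XAxis={0=-677.5}}}
  }
}
```
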
There are three levels where PVStateShard subtrees appear:

1) PVScan (the top level)
2) Sequence
3) Frame

Let's parse all of them into separate ValueTable objects.
If a Frame's ValueTable (i.e., PVStateShard) does not contain
a requested key, search its parent Sequence for the same.

Similarly, if a Sequence's ValueTable is missing the key,
search the top-level PVScan's ValueTable, too.

This behavior is required to support the new "smaller XML file format"
which has been an option since PrairieView 5.0, and the only option
since PrairieView 5.2.
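
The fallback described above amounts to walking a parent chain until some level defines the key. Here is a small sketch under that reading; the `ScopeSketch` class, its methods, and the sample values are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: key lookup that falls back from Frame to Sequence to PVScan. */
class ScopeSketch {
  private final Map<String, String> values = new HashMap<>();
  private final ScopeSketch parent; // null for the top-level PVScan

  ScopeSketch(ScopeSketch parent) { this.parent = parent; }

  /** Stores a value at this level only. Returns this for chaining. */
  ScopeSketch set(String key, String value) {
    values.put(key, value);
    return this;
  }

  /** Checks this level first, then each ancestor; returns null if no level has the key. */
  String lookup(String key) {
    for (ScopeSketch s = this; s != null; s = s.parent) {
      String v = s.values.get(key);
      if (v != null) return v;
    }
    return null;
  }

  public static void main(String[] args) {
    // Sample values are illustrative only.
    ScopeSketch pvScan = new ScopeSketch(null).set("bitDepth", "12").set("opticalZoom", "1.0");
    ScopeSketch sequence = new ScopeSketch(pvScan).set("opticalZoom", "2.0");
    ScopeSketch frame = new ScopeSketch(sequence).set("framePeriod", "1.380352");

    System.out.println(frame.lookup("framePeriod")); // 1.380352 (from the Frame itself)
    System.out.println(frame.lookup("opticalZoom")); // 2.0 (from the Sequence)
    System.out.println(frame.lookup("bitDepth"));    // 12 (from the PVScan)
    System.out.println(frame.lookup("missing"));     // null
  }
}
```

With this arrangement a compact file only needs to write a value at the level where it changes, and the nearest definition wins.
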
When calling value() on a ValueTable, if there is only one entry in the
table, let's just return its value directly. See comment for details.
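
For instance, a table such as `motorStepSize: {ZAxis=0.5}` has a single entry, so asking for its value could simply yield `0.5`. A sketch of that shortcut follows; the `SingleEntrySketch` class is hypothetical, and returning null for tables with zero or several entries is an assumption of this example rather than documented behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the single-entry shortcut. */
class SingleEntrySketch {
  /** Returns the lone value of a one-entry table; null otherwise (assumed behavior). */
  static String value(Map<String, String> table) {
    return table.size() == 1 ? table.values().iterator().next() : null;
  }

  public static void main(String[] args) {
    Map<String, String> motorStepSize = new LinkedHashMap<>();
    motorStepSize.put("ZAxis", "0.5");
    System.out.println(value(motorStepSize)); // 0.5
  }
}
```
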
Rather than having separate hardcoded methods, let's just have one
method called "findMetadataFiles" that looks for all the unspecified
ones. And refer to them as "Prairie metadata files" for generality.

This will be useful when we add support for ENV files shortly.
We should at least _try_ to parse the active channels, even if a CFG
file was not present. This will be useful soon, when that information
could potentially come from another source (the ENV file).
Starting with version 5.2, the CFG file was replaced with an ENV file
that is conceptually similar, but with a revised XML schema.

It does contain many of the same key/value pairs, though, including
the ones in which we are most interested (channels, bitDepth, laserPower
and xYStageYPositionIncreasesBottomToTop in particular), so for now we
parse the metadata into the same config table used by the CFG file.
Otherwise, an obscure exception is raised later in
MetadataTools.populatePixels because sizeC is 0.
@sbesson
Member

sbesson commented Sep 1, 2014

See http://ci.openmicroscopy.org/view/Failing/job/BIOFORMATS-5.1-merge-test_images_good/240

  [testng] [2014-08-31 21:12:58,346] [pool-1-thread-5] Initializing /ome/data_repo/test_images_good/prairie/pollen-001.xml: 
   [testng] [2014-08-31 21:12:58,397] [pool-1-thread-5] 5 files
   [testng] [2014-08-31 21:12:58,556] [pool-1-thread-5]     ChannelNames: FAILED (Series 0 channel 0 (got 'null', expected '1'))

@ctrueden
Member Author

ctrueden commented Sep 2, 2014

Thanks @sbesson. The issue is that the internal Prairie metadata structure changed to be more hierarchical, and that is reflected when populating the original metadata hash:

diff --git a/old.txt b/new.txt
index 13f3cc7..28b82e3 100644
--- a/old.txt
+++ b/new.txt
@@ -1,7 +1,7 @@
 Checking file format [Prairie TIFF]
 Initializing reader
 PrairieReader initializing pollen-001.xml
-Finding CFG file
+Finding metadata files
 Parsing Prairie metadata
 Populating core metadata
 Reading IFDs
@@ -11,7 +11,7 @@ Populating OME metadata
 unknown creation date format: 1945:07:20 12:30:21 
 Populating OME metadata
 Unknown Immersion value 'null' will be stored as "Other"
-Initialization took 0.348s
+Initialization took 0.306s

 Reading core metadata
 Filename = /Volumes/Kuro/data/prairie/misc/Ultima ver 2.5 test data sets/pollen-001/pollen-001.xml
@@ -54,91 +54,44 @@ calMode: 0
 ccFocusPresent: True
 ccLaserPresent: False
 ccStagePresent: True
-channel_0: True
-channel_1: True
-channel_2: True
-channel_3: False
-currentScanAmplitude_XAxis: 1.5
-currentScanAmplitude_YAxis: -1.5
+channel: {0=True, 1=True, 2=True, 3=False}
+currentScanAmplitude: {XAxis=1.5, YAxis=-1.5}
+currentScanCenter: {XAxis=0, YAxis=0}
 currentScanCenterPitch: 1
-currentScanCenter_XAxis: 0
-currentScanCenter_YAxis: 0
 cycleCount: 1
-daq_0: 5
-daq_1: 5
-daq_2: 5
-daq_3: 5
+daq: {0=5, 1=5, 2=5, 3=5}
 date: 12/15/2006 2:18:43 PM
-directory_Base: C:\Documents and Settings\user\My Documents\test1
-directory_BrightnessOverTime: BrightnessOverTime-12152006-1335
-directory_LineScan: LineScan-12152006-1335
-directory_SingleImage: pollen-001
-directory_TSeries: TSeries-12152006-1335
-directory_TriggerSync: TriggerSync-12152006-1335
-directory_WSeries: WSeries-12152006-1335
-directory_ZSeries: ZSeries-12152006-1335
+directory: {TriggerSync=TriggerSync-12152006-1335, LineScan=LineScan-12152006-1335, ZSeries=ZSeries-12152006-1335, BrightnessOverTime=BrightnessOverTime-12152006-1335, TSeries=TSeries-12152006-1335, WSeries=WSeries-12152006-1335, SingleImage=pollen-001, Base=C:\Documents and Settings\user\My Documents\test1}
 displayScalingFactor: 1
 dwellTime: 4.0
 dwellTimeMin: 0.8
-fileIteration_BrightnessOverTime: 8
-fileIteration_LineScan: 34
-fileIteration_SingleImage: 1
-fileIteration_TSeries: 38
-fileIteration_TriggerSync: 1
-fileIteration_ZSeries: 61
+fileIteration: {TriggerSync=1, LineScan=34, ZSeries=61, BrightnessOverTime=8, TSeries=38, SingleImage=1}
 flushFrameBufferPath: C:\Documents and Settings\dwimages\ZSeries-12142006-1438-060
 frameAveraging: 4
 framePeriod: 1.380352
 framerate: 0.724452893175074
-galvo_XAxis+AccelerationMax: 48
-galvo_XAxis+AmplitudeMax: 20
-galvo_XAxis+AmplitudePark: 1.5
-galvo_XAxis+AmplitudeScan: 1.5
-galvo_XAxis+LagTime: 88
-galvo_XAxis+PanDirection: -1
-galvo_XAxis+ScanCenter: 0
-galvo_XAxis+SpeedMax: 8
-galvo_XAxis+VoltsPerDegree: 4
-galvo_YAxis+AccelerationMax: 48
-galvo_YAxis+AmplitudeMax: 20
-galvo_YAxis+AmplitudePark: -1.5
-galvo_YAxis+AmplitudeScan: -1.5
-galvo_YAxis+LagTime: 88
-galvo_YAxis+PanDirection: 1
-galvo_YAxis+ScanCenter: 0
-galvo_YAxis+SpeedMax: 8
-galvo_YAxis+VoltsPerDegree: 4
+galvo: {YAxis+AccelerationMax=48, XAxis+PanDirection=-1, YAxis+AmplitudeScan=-1.5, XAxis+ScanCenter=0, XAxis+AccelerationMax=48, YAxis+VoltsPerDegree=4, XAxis+SpeedMax=8, YAxis+AmplitudeMax=20, YAxis+SpeedMax=8, XAxis+AmplitudeScan=1.5, YAxis+LagTime=88, YAxis+PanDirection=1, YAxis+AmplitudePark=-1.5, YAxis+ScanCenter=0, XAxis+VoltsPerDegree=4, XAxis+LagTime=88, XAxis+AmplitudeMax=20, XAxis+AmplitudePark=1.5}
 hardShutter: False
 inputTriggerPresent: True
-laserPower_0: 0
-laserPower_1: 0
+laserPower: {0=0, 1=0}
 linesPerFrame: 512
-micronsPerPixel_XAxis: 0.632911392405062
-micronsPerPixel_YAxis: 0.632911392405062
-motorStepSize_ZAxis: 0.5
+micronsPerPixel: {XAxis=0.632911392405062, YAxis=0.632911392405062}
+motorStepSize: {ZAxis=0.5}
 objectiveLens: Olympus 60x
 opticalZoom: 1.0
 opticalZoomMin: 1
 photoActivationPresent: True
 pixelsPerLine: 512
 pixelsPerLineMin: 94
-pmtGain_0: 691.964285714286
-pmtGain_1: 580.357142857143
-pmtGain_2: 625
-pmtGain_3: 0
-pmtOffset_0: 0
-pmtOffset_1: 0
-pmtOffset_2: 0
-pmtOffset_3: 0
+pmtGain: {0=691.964285714286, 1=580.357142857143, 2=625, 3=0}
+pmtOffset: {0=0, 1=0, 2=0, 3=0}
 pockelsLag: 0
 polarityBlanking: 1
 polarityShutter: 0
 polarityTrigger: 0
 porchBack: 10
 porchFront: 20
-positionCurrent_XAxis: -677.5
-positionCurrent_YAxis: -2055.625
-positionCurrent_ZAxis: 73.325
+positionCurrent: {ZAxis=73.325, XAxis=-677.5, YAxis=-2055.625}
 rotation: 0
 scanlinePeriod: 0.002696
 scanningMode: 0

My vote is not to worry about this, since we explicitly state that no one should rely on the stability of the original metadata key/value pairs. But if this is deemed a blocker to merge, I'll see what I can do to reconcile it with the old keys. @melissalinkert, what do you think?

@melissalinkert
Member

The test is not failing because of original metadata differences (which we don't check), but because the Name attribute on Channel is not being set for /ome/data_repo/test_images_good/prairie/pollen-001.xml at least.

@ctrueden
Member Author

ctrueden commented Sep 2, 2014

Thanks @melissalinkert. When running showinf pollen-001.xml -nometa -omexml -nopix -novalid with both the old and new PrairieReader code, the OME-XML is identical. When running without the -nometa flag, the relevant diff is:

       <InstrumentRef ID="Instrument:0"/>
       <ObjectiveSettings ID="Objective:0:0"/>
       <Pixels BigEndian="false" DimensionOrder="XYCZT" ID="Pixels:0" Interleaved="false" PhysicalSizeX="0.632911392405062" PhysicalSizeY="0.632911392405062" SignificantBits="16" SizeC="3" SizeT="1" SizeX="512" SizeY="512" SizeZ="1" Type="uint16">
-         <Channel ID="Channel:0:0" Name="" SamplesPerPixel="1">
+         <Channel ID="Channel:0:0" SamplesPerPixel="1">
             <DetectorSettings Gain="691.964285714286" ID="Detector:0:0" Offset="0.0"/>
             <LightPath/>
          </Channel>
-         <Channel ID="Channel:0:1" Name="" SamplesPerPixel="1">
+         <Channel ID="Channel:0:1" SamplesPerPixel="1">
             <DetectorSettings Gain="580.357142857143" ID="Detector:0:1" Offset="0.0"/>
             <LightPath/>
          </Channel>
-         <Channel ID="Channel:0:2" Name="" SamplesPerPixel="1">
+         <Channel ID="Channel:0:2" SamplesPerPixel="1">
             <DetectorSettings Gain="625.0" ID="Detector:0:2" Offset="0.0"/>
             <LightPath/>
          </Channel>

So, the culprit is actually this commit: ctrueden@0338a65

How much does that change in behavior matter? I can add some code to always populate the Name with the empty string, but it is an optional attribute, so do we really need to do that?

@melissalinkert
Member

OK, that's fine then. For whatever reason the tests had the wrong channel name configured, which is now fixed.

My only remaining concern then is that the new spectral datasets do not appear to have the correct dimensions. spectral datasets/4d - zseries over time/TSeries-06052014-1327-004, for instance, has 432 .ome.tif files, which appear to correspond to 9Z x 3T x 16C; SizeC however is set to 1.

@ctrueden
Member Author

ctrueden commented Sep 5, 2014

Ok, I'll pursue another patch set to support the spectral datasets
properly. Shall we merge this (I think it substantially improves on the old
behavior) and then I'll file another PR for the spectral support?

melissalinkert added a commit that referenced this pull request Sep 12, 2014
Add support for PrairieView version 5.2 datasets
@melissalinkert melissalinkert merged commit 0819d22 into ome:develop Sep 12, 2014
@ctrueden ctrueden deleted the devel/prairie-view-5 branch September 12, 2014 18:42
@joshmoore
Member

--rebased-from #1305

@sbesson sbesson added this to the 5.1.0-m1 milestone Oct 14, 2014

4 participants