New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OMEXMLReader: Handle BinData in non-Pixels elements such as Mask #2682
Conversation
The reader made the assumption that BinData was only found inside Pixels. However, it's also contained within Mask, and the reader was stripping out all BinData content when initially reading the OME-XML (it pre-processes it with a custom SAX parser before creating the MetadataStore). This change makes the stripping selective, so it only occurs for BinData elements contained within Pixels.
I assume that stripping one out but not the other actually makes sense? Does this reflect a more general difference in how the "normal" image data and the masks may be stored? |
This is an optimisation performed by the reader. For all the Image data (BinData under Pixels), the reader stores the offset into the OME-XML. This lets it read the pixel data directly without re-parsing the XML, and also allows the MetadataStore to not have BinData bloat within it, saving potentially vast amounts of memory. However, this optimisation works only for pixel data returned via openBytes, which looks up the offset and reads the pixel data directly. It does not work for BinData elsewhere in the model, such as inside Mask, where it strips out the data with no way to get it back. The lack of uniformity in pixel data handling could be considered a model defect in its own right--why can't I store the Mask data in TiffData, for example? But this PR specifically addresses the broken assumption that it's OK to strip out all BinData when in fact it's only safe to do so for Pixels BinData. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor style nitpicks aside, this looks fine to me. showinf -nopix -omexml specification/samples/2016-06/ROI.ome.xml
without this change confirms that the contents of BinData
under ROI:5
are stripped out. The same test with this change results in the length and content being preserved as expected, and there does not seem to be any other difference in behavior.
} | ||
|
||
@Override | ||
public void characters(char[] ch, int start, int length) { | ||
if (currentQName.indexOf("BinData") < 0) { | ||
if (currentQName.indexOf("BinData") < 0 || inPixels == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!inPixels
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if fiddling anyway then maybe put it first given shortcut logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed fixes for these, thanks.
inPixels = true; | ||
} | ||
|
||
if (qName.indexOf("BinData") != -1 && inPixels == true) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just inPixels
instead of inPixels == true
?
The reader made the assumption that
BinData
was only found insidePixels
. However, it's also contained withinMask
, and the reader was stripping out allBinData
content when initially reading the OME-XML (it pre-processes it with a custom SAX parser before creating theMetadataStore
). This change makes the stripping selective, so it only occurs forBinData
elements contained within Pixels.This is related to ome/ome-model#14 but is not coupled to it (it's useful for testing #14 with the sample in ome/ome-model#13), but can be merged independently. If you test these without this PR, you'll see that the ROI sample gets the
BinData
stripped out of theMask
ROI, and with this PR, it's correctly retained.Testing: Check builds remain green. Check that reading OME-XML with
Mask
BinData
works withshowinf
. Thespecification/samples/2016-06/ROI.ome.xml
sample from ome/ome-model#13 is useful for this purpose.