TIFF incorrectly identified as DeltaVision #2656

Closed
gatagat opened this issue Nov 6, 2016 · 11 comments

gatagat commented Nov 6, 2016

The bfconvert tool fails to read a TIFF image because it wrongly identifies it as a DeltaVision image and then fails while reading it. The exact stack trace is below.

If the format cannot be detected reliably, a -format switch could be added to bfconvert to force a specific format.

Two test files (one loading correctly, one not) are uploaded under 17411 in the QA system.

Thanks.

Bio-Formats: 5.2.4

$ ~/bin/bftools-5.2.4/bfconvert -stitch _t20270.tif out.ome.tif
_t20270.tif
TiffDelegateReader initializing XXX/_t20270.tif
Reading IFDs
Populating metadata
Checking comment style
Populating OME metadata
TiffDelegateReader initializing XXX/_t20270.tif
Reading IFDs
Populating metadata
Checking comment style
Populating OME metadata
TiffDelegateReader initializing XXX/_t20270.tif
TiffDelegateReader initializing XXX/_t20270.tif
TiffDelegateReader initializing XXX/_t20270.tif
TiffDelegateReader initializing XXX_t20270.tif
TiffDelegateReader initializing XXX/_t20270.tif
TiffDelegateReader initializing XXX/_t20270.tif
[Tagged Image File Format] -> out.ome.tif [OME-TIFF]
TiffDelegateReader initializing XXX/_t20270.tif
TiffDelegateReader initializing XXX/_t20270.tif
TiffDelegateReader initializing XXX/_t20270.tif
TiffDelegateReader initializing XXX/_t20270.tif
Reading IFDs
Populating metadata
Checking comment style
Populating OME metadata
TiffDelegateReader initializing XXX/_t20270.tif
DeltavisionReader initializing XXX/_t20271.tif
Reading header
Populating core metadata
Reading extended header
Exception in thread "main" java.lang.NegativeArraySizeException
	at loci.formats.in.DeltavisionReader.initPixels(DeltavisionReader.java:404)
	at loci.formats.in.DeltavisionReader.initFile(DeltavisionReader.java:284)
	at loci.formats.FormatReader.setId(FormatReader.java:1401)
	at loci.formats.ImageReader.setId(ImageReader.java:835)
	at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:651)
	at loci.formats.DimensionSwapper.setId(DimensionSwapper.java:293)
	at loci.formats.FileStitcher.initReader(FileStitcher.java:1219)
	at loci.formats.FileStitcher.getReader(FileStitcher.java:188)
	at loci.formats.FileStitcher.computeIndices(FileStitcher.java:1196)
	at loci.formats.FileStitcher.openBytes(FileStitcher.java:481)
	at loci.formats.FileStitcher.openBytes(FileStitcher.java:471)
	at loci.formats.tools.ImageConverter.convertPlane(ImageConverter.java:634)
	at loci.formats.tools.ImageConverter.testConvert(ImageConverter.java:559)
	at loci.formats.tools.ImageConverter.main(ImageConverter.java:880)
@rleigh-codelibre
Contributor

This is reproducible with 5.3 development releases as well. Basically, the reader considers a file to be a valid DeltaVision file if bytes 96-97 (0x60-0x61) contain the pattern a0c0 or c0a0.

% hexdump -C 17411/_t20270.tif | grep ^00000060
00000060  13 36 1a 41 da 5e 31 41  0d 99 41 41 33 b1 19 41  |.6.A.^1A..AA3..A|
% hexdump -C 17411/_t20271.tif | grep ^00000060
00000060  a0 c0 1b 41 27 c9 1c 41  54 7a 2e 41 6b 54 14 41  |...A'..ATz.AkT.A|

You can see that the latter file contains the a0c0 pattern at this position, which is the reason for the misbehaviour.
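
The magic-byte test described above can be sketched in a few lines of Python. This is only an illustration of the byte comparison, not the code in DeltavisionReader; the function and constant names are made up for the example, and the two rows are copied from the hexdump output.

```python
DV_MAGIC_OFFSET = 0x60  # byte 96, as stated in the comment above
DV_MAGIC_PATTERNS = (b"\xa0\xc0", b"\xc0\xa0")


def has_dv_magic(header: bytes) -> bool:
    """Return True if bytes 96-97 hold one of the two DeltaVision patterns."""
    magic = header[DV_MAGIC_OFFSET:DV_MAGIC_OFFSET + 2]
    return magic in DV_MAGIC_PATTERNS


# Row 0x60 of each file, copied from the hexdump output above.
row_t20270 = bytes.fromhex("13361a41da5e31410d99414133b11941")
row_t20271 = bytes.fromhex("a0c01b4127c91c41547a2e416b541441")

# Pad with 96 zero bytes so each row sits at its real offset in the file.
print(has_dv_magic(bytes(96) + row_t20270))  # False
print(has_dv_magic(bytes(96) + row_t20271))  # True
```

Since any two bytes of pixel or tag data can land at that offset, a check like this on its own will occasionally match files that are not DeltaVision.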

@melissalinkert Is there any additional data in the .dv file format we could use to be sure it's really DeltaVision? If not, could we add a negative check that rejects files starting with a TIFF signature, to eliminate this possibility?
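
A negative check of this kind could look roughly like the following Python sketch. It relies only on the standard TIFF and BigTIFF signatures in the first four bytes; the function name is hypothetical and this is not taken from Bio-Formats.

```python
# First four bytes of a TIFF file: byte-order mark ("II" or "MM")
# followed by the version number 42 (classic TIFF) or 43 (BigTIFF).
TIFF_SIGNATURES = (
    b"II*\x00",  # little-endian classic TIFF
    b"MM\x00*",  # big-endian classic TIFF
    b"II+\x00",  # little-endian BigTIFF
    b"MM\x00+",  # big-endian BigTIFF
)


def looks_like_tiff(header: bytes) -> bool:
    """Return True if the buffer starts with a TIFF or BigTIFF signature."""
    return header[:4] in TIFF_SIGNATURES


print(looks_like_tiff(b"II*\x00" + bytes(100)))  # True
print(looks_like_tiff(bytes(104)))               # False
```

A DeltaVision check could return early when `looks_like_tiff` is true, at the cost of hard-coding knowledge of another format into the reader.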

@melissalinkert
Member

isThisType in DeltavisionReader also checks that the X, Y, and image count values are positive - magic bytes alone are not sufficient. In this case, the three values are 2771273, 732328, and 1079403675 respectively, so adding an extra check that the three values multiplied together are smaller than the file length is probably sufficient.
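
The extra check proposed here can be sketched in Python as follows. The layout is an assumption: the sketch treats X, Y, and image count as three consecutive 32-bit integers at bytes 0-11, which is consistent with the quoted X value, since 2771273 is exactly the TIFF signature `II*\0` read as a little-endian integer. The function name and the 1 MB file length are hypothetical.

```python
import struct


def plausible_dv_dimensions(header: bytes, file_length: int,
                            little_endian: bool = True) -> bool:
    """Check that X, Y and image count are positive and fit in the file."""
    if len(header) < 12:
        return False
    # Assumed layout: three signed 32-bit integers at offsets 0, 4 and 8.
    fmt = "<3i" if little_endian else ">3i"
    x, y, count = struct.unpack_from(fmt, header, 0)
    if x <= 0 or y <= 0 or count <= 0:
        return False
    # Python integers do not overflow, so the product is safe to compare.
    return x * y * count < file_length


# The three values quoted in the comment above, packed back into bytes.
tiff_header = struct.pack("<3i", 2771273, 732328, 1079403675)
print(tiff_header[:4])                                  # b'II*\x00'
print(plausible_dv_dimensions(tiff_header, 1_000_000))  # False

# A made-up header for a 512 x 512 x 3 image in a 2 MB file.
small_header = struct.pack("<3i", 512, 512, 3)
print(plausible_dv_dimensions(small_header, 2_000_000))  # True
```

In a language with fixed-width integers the product would need a wider type or an overflow check, because the three values from this file multiply to far more than a 64-bit signed integer can hold.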

@rleigh-codelibre
Contributor

See #2658 which adds these checks.

@sbesson
Member

sbesson commented Jun 26, 2017

Sorry for the delay @gatagat. The fix for this issue has now been merged into the mainline and should be available in the upcoming Bio-Formats release.

sbesson closed this as completed Jun 26, 2017
sbesson added this to the 5.5.3 milestone Jun 26, 2017
@chalkie666

A deltavision .DV file always has an objective lens ID in the header which is a 5 digit number. There are a bunch of other specific metadata fields in the header too. What kind of file format spec doc have you got? Send it to me and I will check with product managers at GE if that's the best doc we have.
Or do you mean a TIFF file exported from a DeltaVision?

@rleigh-codelibre
Contributor

@chalkie666 In this specific case, we're referring to determining unambiguously if a file is a .DV file purely from a raw byte stream (i.e. no .dv extension as a hint). In the above example, a TIFF was being detected (wrongly) as a deltavision file. Knowing more about the header structure of the DV format would certainly enable us to write a more robust check.

We have a HTML file overview of the format from 2004 and a couple of doc files from 2013 (IM_header3.doc and IM_header3_ext_hdr.doc), but these don't mention objective IDs or anything like that. We would certainly very much appreciate an up to date copy of the specification so we can better support deltavision file reading in Bio-Formats.

Kind regards,
Roger

@chalkie666

chalkie666 commented Jun 29, 2017 via email

@chalkie666

Hi folks.
@rleigh-codelibre
This is what I was told by product management:

The documents that are referenced below are the most current that we have. The information they’re requesting is actually included in those documents. Any further information about the .DV file format would have to be discussed under CDA (which I’m happy to do).

So those 2 .doc files contain info about how to read the header. That info should make it unambiguous if a file is or is not a deltavision file.

@rleigh-codelibre
Contributor

@chalkie666 Thank you for following up. We'll see what improvements we can make using these specification documents.

Kind regards,
Roger

@carandraug
Contributor

@chalkie666 commented:

A deltavision .DV file always has an objective lens ID in the header which is a 5 digit number. [...]

@rleigh-codelibre commented:

[...] We have a HTML file overview of the format from 2004 and a couple of doc files from 2013 (IM_header3.doc and IM_header3_ext_hdr.doc), but these don't mention objective IDs or anything like that.

The objective ID is a 2-byte signed integer at zero-based byte offsets 162-163.

For what it's worth, the objective ID is named LensNum in the IM_header4.doc file (I'd imagine it has the same name in the IM_header3.doc file, since I have seen that name in many other files, some of them almost 20 years old). I can provide a table that maps objective IDs to manufacturers, NA, and magnification.
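
Reading that field could look like the Python sketch below. The offset and the 2-byte signed type come from the comment above; the byte order is passed in because the thread does not state it for this field, and the function name, buffer size, and the value 10612 are made up for the demonstration.

```python
import struct

LENS_NUM_OFFSET = 162  # zero-based offset of LensNum, per the comment above


def read_lens_num(header: bytes, little_endian: bool = True) -> int:
    """Read the 2-byte signed objective ID (LensNum) from a header buffer."""
    fmt = "<h" if little_endian else ">h"
    (lens_num,) = struct.unpack_from(fmt, header, LENS_NUM_OFFSET)
    return lens_num


# Build a zero-filled buffer and store a hypothetical 5-digit ID in it.
header = bytearray(1024)
struct.pack_into("<h", header, LENS_NUM_OFFSET, 10612)
print(read_lens_num(bytes(header)))  # 10612
```

A signed 16-bit field tops out at 32767, so a 5-digit ID read this way always falls between 10000 and 32767.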

@dgault
Member

dgault commented Aug 10, 2018

Thanks @carandraug, looking at the reader we are parsing the LensNum as lensID and seem to have a good number of mappings for the IDs. If there is anything that is missing or incorrect with the values in https://github.com/openmicroscopy/bioformats/blob/develop/components/formats-gpl/src/loci/formats/in/DeltavisionReader.java#L1418 then please do let us know.
