Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prioritization among loaders? #144

Closed
tlnagy opened this issue Jul 28, 2017 · 16 comments
Closed

Prioritization among loaders? #144

tlnagy opened this issue Jul 28, 2017 · 16 comments

Comments

@tlnagy
Copy link
Contributor

tlnagy commented Jul 28, 2017

I'm writing a loader for uMicromanager OME-TIFF files. These have the standard TIFF magic bytes and TIFF ending and are picked up by the standard TIFF loader. I tried inserting my custom detection function either above or below the TIFF add_format call and I couldn't get it to work unless I commented out that line.

Is there any way to assign priority for loaders?

@tlnagy
Copy link
Contributor Author

tlnagy commented Jul 30, 2017

@timholy any thoughts?

@timholy
Copy link
Member

timholy commented Jul 30, 2017

I haven't worked on this package in ages, so I don't remember. You could check the implementation of add_format and see how it works. If it's a Dict, then the current answer is "no", but in that case we should change the implementation somehow (either to an ordered Dict, or assign a priority, or something).

@tlnagy
Copy link
Contributor Author

tlnagy commented Aug 2, 2017

So my OMETIFF loader is up and running over at https://github.com/tlnagy/OMETIFF.jl, but currently has its own loadtiff function instead of depending on the FileIO infrastructure.

It looks like this might be a little complicated. My reading of add_format here https://github.com/JuliaIO/FileIO.jl/blob/master/src/query.jl#L78-L89 and query code here https://github.com/JuliaIO/FileIO.jl/blob/master/src/query.jl#L402-L426 is that

  1. Magic bytes are sorted in front of magic functions
  2. Magic bytes are unique to a specific format

This means that in my case, where I have identical magic bytes to already registered format, my loader would have lower priority than the default TIFF one.

I think main way of solving this is to get rid of the uniqueness requirement and first sort on magic bytes and then on file name (*tif vs *.ome.tif). Thus make uniqueness be a combination of magic bytes + filename, instead of just magic bytes. Since OME-TIFF is a simple extension of TIFF that's backwards compatible, there's no good way of distinguishing it quickly from normal TIFFs.

@timholy
Copy link
Member

timholy commented Aug 3, 2017

OK, now I have a couple of seconds to rub together and can actually look at code.

Why can't you just list your OME-TIFF package above QuartzImageIO and ImageMagick in these lines? Just throw a LoaderError from your loader if it turns out not to be OME-TIFF and it will try the next one on the list. See

FileIO.jl/src/loadsave.jl

Lines 85 to 104 in bf7756c

function load{F}(q::Formatted{F}, args...; options...)
if unknown(q)
isfile(filename(q)) || open(filename(q)) # force systemerror
throw(UnknownFormat(q))
end
libraries = applicable_loaders(q)
failures = Any[]
for library in libraries
try
Library = checked_import(library)
if !has_method_from(methods(Library.load), Library)
throw(LoaderError(string(library), "load not defined"))
end
return eval(Main, :($(Library.load)($q, $args...; $options...)))
catch e
push!(failures, (e, q))
end
end
handle_exceptions(failures, "loading \"$(filename(q))\"")
end
.

@timholy
Copy link
Member

timholy commented Aug 3, 2017

If the docs are inadequate on this point, a PR to fix them would be much appreciated!

tlnagy added a commit to tlnagy/FileIO.jl that referenced this issue Aug 3, 2017
@tlnagy
Copy link
Contributor Author

tlnagy commented Aug 3, 2017

I didn't realize that it would just try those loaders in order, #147 is my PR, but it needs to wait until a new version of OMETIFF.jl is merged into METADATA.jl

@tlnagy
Copy link
Contributor Author

tlnagy commented Aug 3, 2017

I just scan the name of the stream quickly for ome.tif or ome.tiff and throw a LoaderError if it isn't found.

@tkelman
Copy link
Contributor

tkelman commented Aug 3, 2017

Will FileIO try/suggest to install OMETIFF before trying ImageMagick if it can't find either?

@tlnagy
Copy link
Contributor Author

tlnagy commented Aug 3, 2017

That's what I would expect, but imagemagick will most likely already be installed if the user opens almost any other image filetype first. This would only happen if the first image file the user opens happens to be TIFF. Not sure about a workaround.

@tkelman
Copy link
Contributor

tkelman commented Aug 3, 2017

If it's not automatic and just a suggestion, can the suggestion message be customized? Maybe you could add "most users should skip this and try the next loader" to the message?

@tlnagy
Copy link
Contributor Author

tlnagy commented Aug 3, 2017

How about this in registry.jl? This seems to work quite well:

# Handle OME-TIFFs, which are identical to normal TIFFs with the primary difference being the filename and embedded XML metadata
tiff_magic = (UInt8[0x4d,0x4d,0x00,0x2a], UInt8[0x4d,0x4d,0x00,0x2b], UInt8[0x49,0x49,0x2a,0x00],UInt8[0x49,0x49,0x2b,0x00])
function detecttiff(io)
    seekstart(io)
    magic = read(io, UInt8, 4)
    # do any of the first 4 bytes match any of the 4 possible combinations of tiff magics
    return any(map(x->all(magic .== x), tiff_magic))
end
# OME-TIFF
detect_ometiff(io) = detecttiff(io) && (endswith(io.name, ".ome.tif>") || endswith(io.name, ".ome.tiff>"))
add_format(format"OMETIFF", detect_ometiff, [".tif", ".tiff"], [:OMETIFF])
# normal TIFF
detect_noometiff(io) = detecttiff(io) && !(endswith(io.name, ".ome.tif>") || endswith(io.name, ".ome.tiff>"))
add_format(format"TIFF", detect_noometiff, [".tiff", ".tif"], [:QuartzImageIO, OSX], [:ImageMagick])

This should only install OMETIFF.jl if the user is actually opening an ome-tiff file. This basically just adds a filename check on top of the normal magic bytes check. It will slow down the speed at which FileIO can determine what the file type is though since it won't use the fast magic byte matching but a function instead. It doesn't seem to make much of a difference actually see below

@tlnagy
Copy link
Contributor Author

tlnagy commented Aug 3, 2017

The only test that is failing is the skipmagic one here: https://github.com/JuliaIO/FileIO.jl/blob/master/test/query.jl#L296 since skipmagic called on a function always returns 0, instead 4 as was expected when TIFF used registered magic bytes.

@tlnagy
Copy link
Contributor Author

tlnagy commented Aug 3, 2017

Making this change doesn't have much of an effect on the speed of loading a normal TIFF

Master:

julia> @benchmark load("/Users/tamasnagy/Downloads/129779/Figure_S5.tif")
WARNING: Compat.ASCIIString is deprecated, use String instead.
  likely near no file:208
WARNING: Compat.UTF8String is deprecated, use String instead.
  likely near no file:208
BenchmarkTools.Trial:
  memory estimate:  14.68 MiB
  allocs estimate:  276
  --------------
  minimum time:     39.933 ms (0.00% GC)
  median time:      44.914 ms (4.94% GC)
  mean time:        46.036 ms (5.88% GC)
  maximum time:     133.672 ms (66.14% GC)
  --------------
  samples:          109
  evals/sample:     1

This change:

julia> @benchmark load("/Users/tamasnagy/Downloads/129779/Figure_S5.tif")
WARNING: Compat.ASCIIString is deprecated, use String instead.
  likely near no file:208
WARNING: Compat.UTF8String is deprecated, use String instead.
  likely near no file:208
BenchmarkTools.Trial:
  memory estimate:  14.71 MiB
  allocs estimate:  343
  --------------
  minimum time:     40.817 ms (0.00% GC)
  median time:      45.740 ms (4.90% GC)
  mean time:        47.232 ms (6.21% GC)
  maximum time:     129.985 ms (65.44% GC)
  --------------
  samples:          106
  evals/sample:     1

This is on 0.6

@tlnagy
Copy link
Contributor Author

tlnagy commented Aug 3, 2017

See my updates over at #147

@timholy
Copy link
Member

timholy commented Mar 3, 2021

Should this be closed? Or still relevant?

@tlnagy
Copy link
Contributor Author

tlnagy commented Mar 3, 2021

I don't think this is relevant any more since we dispatch the correct TIFF package based on the name of the file + the TIFF magic bytes starting in #147

@tlnagy tlnagy closed this as completed Mar 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants