Improve TIFF performance #7

Open
hinerm opened this issue Jun 5, 2013 · 14 comments

@hinerm
Member

hinerm commented Jun 5, 2013

The current TIFFFormat is a direct port of the loci.formats TIFFFormat. Part of the goal for SCIFIO is to improve its performance.

As part of this, the MinimalTIFFFormat should be restructured into base components only, and should no longer be a format by itself.

@ghost ghost assigned hinerm Jun 5, 2013
@dscho
Contributor

dscho commented Aug 18, 2013

Now that scijava-common has a Timing class, I suggest this strategy when trying to figure out what code needs the most improvement:

final Timing timing = Timing.start(true);
    ...
Timing.tick(timing);
    ...
Timing.tick(timing);
    ...
Timing.stop(timing);

This will accumulate nanosecond timings (as precise as the platform allows, anyway) with only minimal performance impact. The stop method will sort the individual steps by time and show you the file name and line number of each (very convenient in Eclipse's console). The reason for the true is that you can easily switch off the timing in the method by prefixing the true with an exclamation mark, both to unclutter the console and to keep the benchmarking itself from affecting performance.
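The start/tick/stop pattern described above can be illustrated without the real class. The sketch below is a hypothetical `StepTimer`, not the scijava-common `Timing` implementation; every name in it is an assumption chosen to mirror the flow in the comment (an enabled flag, ticks tagged with the caller's file and line, and a report sorted by elapsed time).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a tick-based timer; NOT the scijava-common Timing class. */
final class StepTimer {

    /** One measured step: where it ended and how long it took. */
    static final class Step {
        final String location;
        final long nanos;

        Step(final String location, final long nanos) {
            this.location = location;
            this.nanos = nanos;
        }
    }

    private final boolean enabled;
    private final List<Step> steps = new ArrayList<>();
    private long last;

    private StepTimer(final boolean enabled) {
        this.enabled = enabled;
        this.last = System.nanoTime();
    }

    /** Pass {@code !true} to switch timing off without touching the tick calls. */
    static StepTimer start(final boolean enabled) {
        return new StepTimer(enabled);
    }

    /** Records the time elapsed since the previous tick. */
    void tick() {
        record();
    }

    /** Records the final step and returns all steps, slowest first. */
    List<Step> stop() {
        record();
        final List<Step> sorted = new ArrayList<>(steps);
        sorted.sort(Comparator.comparingLong((Step s) -> s.nanos).reversed());
        return sorted;
    }

    private void record() {
        if (!enabled) return; // disabled timers do no work at all
        final long now = System.nanoTime();
        // Frame 0 is record(), frame 1 is tick()/stop(), frame 2 is the code being timed.
        final StackTraceElement caller = new Throwable().getStackTrace()[2];
        steps.add(new Step(caller.getFileName() + ":" + caller.getLineNumber(), now - last));
        last = System.nanoTime(); // keep our own bookkeeping out of the next step
    }
}
```

Calling `StepTimer.start(!true)` leaves every `tick()` in place but turns it into a no-op, which is the toggle the comment describes.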

@hinerm
Member Author

hinerm commented Aug 21, 2013

Possible places to improve performance:

  • add a TIFF service for determining the TIFF type
  • create a sub-plugin "TiffFormat"
  • read the IFD once and pass it around to the different checkers

@ctrueden
Member

@hinerm: I agree, we need a TIFFService which can cache IFDs. See also issue #39.
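As a rough illustration of "read the IFD once, pass it around", here is a minimal sketch of a cache that parses the first IFD of a source at most once and hands the same parsed object to every checker. `Ifd`, `IfdChecker`, and `IfdCache` are hypothetical names invented for this example; they are not SCIFIO types, and the parser is injected as a function so that no real TIFF parsing API is assumed.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical parsed IFD: TIFF tag number mapped to its value. */
final class Ifd {
    final Map<Integer, Object> tags = new HashMap<>();
}

/** Hypothetical checker that inspects an already-parsed IFD instead of re-reading the file. */
interface IfdChecker {
    boolean matches(Ifd ifd);
}

/** Hypothetical cache: parses the first IFD of each source at most once. */
final class IfdCache {
    private final Map<String, Ifd> cache = new ConcurrentHashMap<>();
    private final Function<String, Ifd> parser;

    /** @param parser assumed to open the source and read its first IFD */
    IfdCache(final Function<String, Ifd> parser) {
        this.parser = parser;
    }

    /** Returns the cached IFD, invoking the parser only on the first request. */
    Ifd firstIfd(final String source) {
        return cache.computeIfAbsent(source, parser);
    }

    /** Returns the index of the first checker that accepts the source, or -1 if none does. */
    int firstMatch(final String source, final List<IfdChecker> checkers) {
        final Ifd ifd = firstIfd(source); // one parse, shared by all checkers
        for (int i = 0; i < checkers.size(); i++) {
            if (checkers.get(i).matches(ifd)) return i;
        }
        return -1;
    }
}
```

A real service would also need to invalidate entries when the underlying file changes and to bound the cache size; both are left out of this sketch.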

@hinerm hinerm modified the milestones: 0.10.0, 0.9.0 Feb 11, 2014
@hinerm hinerm modified the milestones: 0.11.0, 0.10.0 Mar 10, 2014
@hinerm hinerm modified the milestones: 0.12.0, 0.11.0 Apr 21, 2014
@hinerm hinerm modified the milestones: 0.13.0, 0.12.0, 0.15.0 May 30, 2014
@hinerm hinerm modified the milestones: 0.16.0, 0.15.0 Jun 10, 2014
@dietzc
Contributor

dietzc commented Dec 10, 2014

@gab1one is taking a look at the performance of TIFF reader / writer and is posting his results here for further discussion.

@hinerm
Member Author

hinerm commented Jan 25, 2015

This gist can be loaded into a reader, e.g. Eclipse Java Monitor, to view profiling results that clearly indicate TIFF is terrible (e.g. spending 1/3 of its time determining the format).

@hinerm
Member Author

hinerm commented Jan 25, 2015

Top offenders that profiling identified:

  • TiffFormat.Parser.initMetadata > Location.list 41%
  • AbstractChecker.isFormat.RandomAccessInputStream.<init> 13%
  • AbstractFormat.createChecker 8%
  • MicroManagerFormat.isFormat 8%

Lesser offenders but kind of suspicious:

  • MasterFilterHelper.<init> 2%
  • JPEGFormat.isFormat 2%
  • DICOMFormat.isFormat 2%

@hinerm
Member Author

hinerm commented Jan 26, 2015

OK, so I made a mistake in that I was only profiling io.scif.* packages. After adding net.imagej.* and org.scijava.*, it now seems like an overwhelming amount of time is spent on context injection:

17KB TIFF

Typed parsing: [image "typedparse"]

CreateChecker: [image "createchecker"]

IsFormat: [image "isformat"]

Reading (less than 3%): [image "read"]

This doesn't tell us with certainty that the API is being used correctly (e.g. that Context.inject is called only when it is actually needed).

But my next steps are to profile with a naive caching mechanism, e.g. mapping classes to their field annotations.
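A naive cache of this kind might look like the sketch below, which maps each class to its annotated fields so that the reflective scan runs once per class rather than once per injection. The `@Injected` annotation, `AnnotatedFieldCache`, and the two demo classes are hypothetical and exist only for this example; nothing here reflects how scijava-common's Context actually performs injection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical marker annotation standing in for whatever the injector looks for. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Injected {}

/** Hypothetical cache from a class to the fields that need injection. */
final class AnnotatedFieldCache {
    private final Map<Class<?>, List<Field>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger scans = new AtomicInteger(); // demonstration only

    /** Returns the annotated fields of the type, scanning it reflectively only once. */
    List<Field> injectableFields(final Class<?> type) {
        return cache.computeIfAbsent(type, this::scan);
    }

    /** Number of reflective scans performed so far. */
    int scanCount() {
        return scans.get();
    }

    private List<Field> scan(final Class<?> type) {
        scans.incrementAndGet();
        final List<Field> found = new ArrayList<>();
        // Walk up the hierarchy so inherited annotated fields are included.
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (final Field f : c.getDeclaredFields()) {
                if (f.isAnnotationPresent(Injected.class)) found.add(f);
            }
        }
        return Collections.unmodifiableList(found);
    }
}

/** Hypothetical classes used only to exercise the cache. */
class DemoBase {
    @Injected Object context;
    Object plain;
}

class DemoChild extends DemoBase {
    @Injected Object service;
}
```

Whether this helps in practice depends on how much of the injection cost is the field scan itself, which is exactly what the planned profiling run would show.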

If that proves to be successful, then my thoughts for future steps would be incremental Context initialization (recursive pre-parsing and caching of fields and anything else) that runs on a dedicated thread after a Context is created. This would also require a general caching service, e.g. an LRU cache.
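For reference, a bounded LRU cache needs very little code on the JDK alone. The sketch below uses `LinkedHashMap` in access order with `removeEldestEntry`; the class name `LruCache` is an assumption for this example and does not refer to an existing SciJava service.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache sketch: evicts the least recently accessed entry once full. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCache(final int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, not insertion order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(final Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true drops the eldest entry.
        return size() > maxEntries;
    }
}
```

This class is not thread-safe. If it were filled from a dedicated initialization thread and read from others, it would need to be wrapped, for example with `Collections.synchronizedMap`.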

@hinerm
Member Author

hinerm commented Jan 26, 2015

When testing with a larger image, of course, the Context initialization impact is negligible:

200MB OME-TIFF: [image "largeimage"]

@hinerm
Member Author

hinerm commented Jan 26, 2015

So there are several scenarios to consider with regard to performance here:

@hinerm
Member Author

hinerm commented Jan 27, 2015

@dietzc ea52f8b is awesome for repeated reads of small files

@hinerm
Member Author

hinerm commented Jan 27, 2015

@hinerm hinerm removed this from the 0.16.0 milestone Mar 10, 2015
@hinerm hinerm added this to the m4 milestone Mar 11, 2015
@hinerm hinerm self-assigned this Mar 11, 2015
@hinerm
Member Author

hinerm commented Jun 1, 2015

Note that #203 is also relevant here, as having a higher-priority ImageJ (and/or SCIFIO) TIFF Format could also improve performance in many cases.

@hinerm
Member Author

hinerm commented Oct 15, 2021

See also #342

@hinerm
Member Author

hinerm commented Oct 15, 2021

See also #310


4 participants