Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up array IO (2.5x - 4x) with nio views as suggested by Kevin Eastridge #414

Merged
merged 15 commits into from May 22, 2023

Conversation

attipaci
Copy link
Collaborator

@attipaci attipaci commented May 21, 2023

@keastrid has noticed that array IO can be significantly improved by using java.nio views (see Discussion #413), so this PR implements that, and provides some minor fixes/improvements to the FITS IO implementation. The buffer views are used for the retrieval of multiple elements only, while single elements are retrieved more efficiently using, e.g. ByteBuffer.getFloat() directly.

@attipaci attipaci added enhancement A new feature and/or an improved capability breaking-potential Alters functionality in ways that may break prior application. labels May 21, 2023
@attipaci attipaci added this to the 1.18.0 milestone May 21, 2023
@attipaci attipaci self-assigned this May 21, 2023
@codecov
Copy link

codecov bot commented May 21, 2023

Codecov Report

Merging #414 (e3ad8f2) into master (509c010) will increase coverage by 0.05%.
The diff coverage is 99.04%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #414      +/-   ##
============================================
+ Coverage     96.99%   97.04%   +0.05%     
- Complexity     4359     4376      +17     
============================================
  Files           203      203              
  Lines         12690    12807     +117     
  Branches       1852     1875      +23     
============================================
+ Hits          12309    12429     +120     
+ Misses          155      151       -4     
- Partials        226      227       +1     
Flag Coverage Δ
unittests 97.04% <99.04%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/main/java/nom/tam/fits/FitsFactory.java 96.00% <ø> (ø)
...ts/compression/algorithm/quant/RandomSequence.java 100.00% <ø> (ø)
src/main/java/nom/tam/util/ArrayDataInput.java 90.47% <0.00%> (-9.53%) ⬇️
src/main/java/nom/tam/fits/BinaryTable.java 92.28% <90.00%> (+0.12%) ⬆️
src/main/java/nom/tam/fits/Fits.java 97.67% <100.00%> (+0.05%) ⬆️
src/main/java/nom/tam/fits/ImageData.java 97.82% <100.00%> (ø)
src/main/java/nom/tam/fits/RandomGroupsData.java 100.00% <100.00%> (ø)
...rc/main/java/nom/tam/image/StandardImageTiler.java 95.95% <100.00%> (ø)
src/main/java/nom/tam/util/ArrayDataFile.java 100.00% <100.00%> (ø)
src/main/java/nom/tam/util/ArrayInputStream.java 100.00% <100.00%> (ø)
... and 12 more

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 480cf4e...e3ad8f2. Read the comment docs.

@attipaci attipaci changed the title Speed up array io with nio views (~2x) as suggested by Kevin Eastridge Speed up array IO with nio views (~2x) as suggested by Kevin Eastridge May 21, 2023
@attipaci attipaci changed the title Speed up array IO with nio views (~2x) as suggested by Kevin Eastridge Speed up array IO (up to 2x) with nio views as suggested by Kevin Eastridge May 21, 2023
@keastrid
Copy link
Contributor

This works very well, thank you! It is a very noticeable improvement in opening times.

I see you added the same for fits writing, which is a great addition.

@attipaci
Copy link
Collaborator Author

More work was done (above) to:

  • Make the code more elegant (less repeated)
  • Retain image read performance for arrays where trailing dimension may be small.
  • Extend high-throughput reads to Random Groups and Binary Tables also
  • Fill missing documentation

Will need to fill new test coverage holes before merging...

@attipaci
Copy link
Collaborator Author

I did some benchmarking of the new IO speed vs previous releases. In general, reading and writing images is 2.5x -- 4x faster than previous version of the library (e.g. 1.15.2 to 1.17.1) on my laptop with a Gen 3 NVMe SSD. And, reading from FitsInputStream (aka BufferedDataInputStream in earlier versions) is a whopping 20-25 times faster than 1.15.2...

@attipaci attipaci changed the title Speed up array IO (up to 2x) with nio views as suggested by Kevin Eastridge Speed up array IO (2.5x - 4x) with nio views as suggested by Kevin Eastridge May 22, 2023
@attipaci attipaci merged commit 59fb779 into nom-tam-fits:master May 22, 2023
4 checks passed
@attipaci attipaci deleted the io-speedup branch May 22, 2023 18:24
attipaci added a commit that referenced this pull request May 22, 2023
@keastrid
Copy link
Contributor

@attipaci I think the recent commits introduced a regression to table reading.

Trying to get the hdu array results in the following exception:

May 22, 2023 9:07:47 PM nom.tam.fits.Header forceEOF
WARNING: Junk detected at -1.
nom.tam.fits.FitsException: Not a proper FITS header:  at -1
	at nom.tam.fits.Header.checkFirstCard(Header.java:2124)
	at nom.tam.fits.Header.read(Header.java:1579)
	at nom.tam.fits.Header.readHeader(Header.java:170)
	at nom.tam.fits.Fits.readHDU(Fits.java:1005)
	at nom.tam.fits.Fits.readToEnd(Fits.java:1045)
	at nom.tam.fits.Fits.read(Fits.java:956)

It is also only finding 2 of the 4 hdus. The file is available here

@attipaci
Copy link
Collaborator Author

I was fearing something like this would happen, even though the built-in regression tests passed. Based on the error, it's the binary table (2nd HDU) that reads either too short or beyond its boundary, so it's trying to detect the next HDUs header at the wrong offset -- hence without success. I'll see what I can make of it and create a fix... And, I'll create a new issue for tracking the regression...

at88mph pushed a commit to at88mph/nom-tam-fits that referenced this pull request Jun 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-potential Alters functionality in ways that may break prior application. enhancement A new feature and/or an improved capability
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants