Fixes and features needed for SELENE Kaguya TC data import #63

Merged: 25 commits, Oct 14, 2021

@pkgw pkgw commented Oct 13, 2021

These are a bunch of improvements needed to handle the Kaguya dataset.

  • Add a simple U8 image mode
  • Add support for reading JPEG2000 files in chunked format (using glymur)
  • Add support for chunked plate-carree TOAST sampling
  • Add toasty transform u8-to-rgb
  • Add support for transforming into a different tile pyramid tree (e.g., NPY and JPG in different hierarchies)
  • Remove unneeded "dispatcher" threads from multiprocessing operations
  • Fix race conditions in large-scale multiprocessing jobs
  • Don't create directories everywhere when cleaning up lockfiles
  • Use a lockfile approach that works in multi-host (HPC) contexts
  • Add proper support for the "planetary" TOAST coordinate system, which is rotated 180° in longitude
  • Add some helpful TOAST Tile APIs
  • Add support for filtering out subtrees when generating a TOAST hierarchy
  • Add support for a filtered TOAST sampling operation
  • Massively clean up the "transform" infrastructure for, e.g., float-to-RGB conversions
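
To illustrate the two lockfile items above, here is a minimal sketch of one common approach: create the lockfile atomically with `O_CREAT | O_EXCL` and never create parent directories as a side effect. This is an assumption for illustration, not necessarily the scheme toasty uses; `acquire_lock` and `release_lock` are hypothetical names, and whether exclusive creation is reliable across hosts depends on the shared filesystem.

```python
import os
import time


def acquire_lock(path, poll_interval=0.1, timeout=10.0):
    """Block until the lockfile at `path` is created by us (hypothetical helper).

    Exclusive creation either succeeds or fails atomically, so two
    processes cannot both believe they hold the lock. The parent
    directory must already exist; nothing is created on the side.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            # Someone else holds the lock: wait and retry until the deadline.
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire lock {path!r}")
            time.sleep(poll_interval)
            continue
        # Record the owner to help debugging stale locks.
        os.write(fd, f"{os.getpid()}\n".encode())
        os.close(fd)
        return


def release_lock(path):
    """Remove the lockfile; tolerate it being gone already."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
```

Cleanup only ever unlinks the lockfile itself, so it cannot leave stray directories behind.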

pkgw added 20 commits October 5, 2021 15:34
With a slight reordering of the built-in level 1 tiles so that we can recurse simply.
This will allow us to be a lot more efficient when doing chunked TOAST samplings.
This is for the Kaguya lunar dataset. We're a bit sloppy here by using 0 as the "mask"
value, but it's a reasonable approach. This patch also includes a few fixups so that the
F64 mode is handled the same way as F32 in the places where it had been left out of the logic.
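
A rough sketch of what treating 0 as the mask value can look like when expanding U8 data to a displayable image. This assumes NumPy and is not toasty's actual transform code; `u8_to_rgba` is a hypothetical helper, and replicating the gray value into all three color channels is an assumption.

```python
import numpy as np


def u8_to_rgba(data):
    """Expand single-channel U8 data to RGBA, treating 0 as "no data".

    Hypothetical helper: the gray value is copied into R, G and B, and
    pixels equal to 0 become fully transparent.
    """
    data = np.asarray(data, dtype=np.uint8)
    valid = data != 0  # True where the pixel carries real data

    rgba = np.empty(data.shape + (4,), dtype=np.uint8)
    rgba[..., 0] = data
    rgba[..., 1] = data
    rgba[..., 2] = data
    rgba[..., 3] = np.where(valid, 255, 0)  # alpha channel from the mask
    return rgba
```

The sloppiness is that a genuinely black pixel with value 0 cannot be told apart from a missing one.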
Fortunately, all we need to do is spin things by 180 degrees in longitude.
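
As a sketch of that rotation: the function name and the choice of wrapping into [0, 2π) are assumptions for illustration, not toasty's API. A 180° spin is the same in either direction modulo a full turn, so the sign does not matter.

```python
import math


def to_planetary_longitude(lon_rad):
    """Rotate a longitude by 180 degrees and wrap it into [0, 2*pi).

    Hypothetical helper; input and output are in radians.
    """
    return (lon_rad + math.pi) % (2.0 * math.pi)
```

Applying the rotation twice returns the original longitude, up to wrapping.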
This can be used with a chunked image to do a TOAST tiling from something that's too large
to fit in memory.
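
The general idea of chunked processing can be sketched as below. This is not the toasty or glymur API; `iter_chunks` is a hypothetical helper that only computes chunk bounds, so that a caller reads and samples one chunk at a time instead of loading the whole image.

```python
def iter_chunks(width, height, chunk_size):
    """Yield (x0, y0, w, h) rectangles that tile a width-by-height image.

    Hypothetical helper: chunks on the right and bottom edges are
    clipped so that every pixel is covered exactly once.
    """
    for y0 in range(0, height, chunk_size):
        h = min(chunk_size, height - y0)
        for x0 in range(0, width, chunk_size):
            w = min(chunk_size, width - x0)
            yield x0, y0, w, h
```

Peak memory then scales with the chunk size rather than with the full image.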
I need to propagate this fix to other multiprocessing tasks too.
For some reason I can't get a reference to
`toasty.image.SUPPORTED_FORMATS` to work as a Sphinx `:data:...`
reference?
... which actually introduces some dead code here. Oh well.
This turned out to be a bit of a silly idea, since the main thread can
just do the dispatch work. Also make sure that we handle the done_event
without race conditions.
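
A minimal sketch of this pattern, with the main thread feeding the queue itself and a done event telling workers when to stop. Threads stand in for worker processes to keep the example self-contained, and all names here are hypothetical rather than taken from toasty. The important detail is that each worker samples the event before polling the queue; checking it only after an empty poll could let a worker exit while a late item is still queued.

```python
import queue
import threading


def worker(tasks, results, done_event):
    while True:
        # Sample the flag BEFORE polling. If it was already set and the
        # queue is then empty, no further items can ever arrive.
        finished = done_event.is_set()
        try:
            item = tasks.get(timeout=0.05)
        except queue.Empty:
            if finished:
                return
            continue
        results.put(item * item)  # stand-in for the real per-tile work


def run_jobs(items, n_workers=4):
    tasks = queue.Queue(maxsize=8)  # bounded, so dispatch applies backpressure
    results = queue.Queue()
    done_event = threading.Event()

    workers = [
        threading.Thread(target=worker, args=(tasks, results, done_event))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()

    # The main thread does the dispatching; no separate dispatcher needed.
    for item in items:
        tasks.put(item)
    done_event.set()  # set only after every item has been queued

    for w in workers:
        w.join()

    out = []
    while not results.empty():
        out.append(results.get())
    return sorted(out)
```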

codecov bot commented Oct 13, 2021

Codecov Report

Merging #63 (cd16179) into master (bf50952) will increase coverage by 1.65%.
The diff coverage is 64.12%.

❗ Current head cd16179 differs from pull request most recent head 381e53f. Consider uploading reports for the commit 381e53f to get more accurate results

@@            Coverage Diff             @@
##           master      #63      +/-   ##
==========================================
+ Coverage   73.11%   74.77%   +1.65%     
==========================================
  Files          21       22       +1     
  Lines        2738     3037     +299     
==========================================
+ Hits         2002     2271     +269     
- Misses        736      766      +30     
| Impacted Files | Coverage Δ |
|---|---|
| toasty/multi_wcs.py | 65.16% <8.33%> (+65.16%) ⬆️ |
| toasty/image.py | 75.81% <39.13%> (+1.35%) ⬆️ |
| toasty/cli.py | 83.03% <52.17%> (-2.36%) ⬇️ |
| toasty/transform.py | 41.90% <53.12%> (-12.51%) ⬇️ |
| toasty/samplers.py | 69.84% <54.43%> (-10.16%) ⬇️ |
| toasty/toast.py | 80.98% <65.85%> (-17.89%) ⬇️ |
| toasty/collection.py | 51.45% <66.66%> (-0.03%) ⬇️ |
| toasty/jpeg2000.py | 83.33% <83.33%> (ø) |
| toasty/multi_tan.py | 78.57% <91.66%> (+0.04%) ⬆️ |
| toasty/merge.py | 94.40% <94.44%> (ø) |

... and 9 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3bab4f7...381e53f.

We weren't correctly updating Images when they were originally read in
by PIL, because we'd write out the old PIL object and not the updated
array. I thought that the fact that we would get a read-only Numpy array
would prevent such issues, but apparently not? We may need a more
generic API to ensure that such bugs don't reoccur, but there are some
specific cases where it will always be correct to clear the PIL data.
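
The class of bug described here can be sketched with a toy wrapper. Everything below is hypothetical and uses plain bytes as a stand-in for the PIL object; it is not the toasty `Image` class. The wrapper keeps both the original object and a decoded pixel buffer, so any operation that edits the buffer has to drop the original, or a later write would emit stale data.

```python
class CachedImage:
    """Toy image wrapper holding an original object plus a pixel buffer."""

    def __init__(self, source):
        self._source = source  # stand-in for the PIL object as read from disk
        self._pixels = None    # decoded, mutable buffer, created lazily

    def pixels(self):
        if self._pixels is None:
            self._pixels = list(self._source)
        return self._pixels

    def fill(self, value):
        px = self.pixels()
        for i in range(len(px)):
            px[i] = value
        # The original no longer matches the buffer, so clear it.
        self._source = None

    def serialize(self):
        # Prefer the original only while it is known to be current.
        if self._source is not None:
            return bytes(self._source)
        return bytes(self._pixels)
```

Without the `self._source = None` line, `serialize()` would keep returning the pixels as first read, which is the symptom described above.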
My initial code was hardcoded for the U8 image format, but we should be
better than that. Unfortunately I haven't tested whether it still works
for U8 images.

This is a signpost that the TOAST sampling code should be migrated to
use an Image buffer in general, but I won't do that here.
@pkgw pkgw merged commit db23ba9 into WorldWideTelescope:master Oct 14, 2021
@pkgw pkgw deleted the kaguya branch October 14, 2021 13:32