Pythonic FITS reader #1273
Labels: enhancement, external contribution welcome
I'm copying over an issue from rapidsai/cudf#2821 by @profjsb. Hopefully this is in-scope for DALI.
This is a request for a GPU FITS reader. Such a reader would be a welcome and critical component as the community starts to transition data pipelines from CPU- to GPU-centric workflows.
The common image exchange format in astronomy is FITS (Flexible Image Transport System), and there are well-supported CPU-centric packages for reading (and writing) FITS, such as PyFITS (https://pythonhosted.org/pyfits/) and astropy.io.fits (https://docs.astropy.org/en/stable/io/fits/). As part of many data pipelines, it is common to read FITS files from disk, combine and manipulate the images/spectra (as operations on numpy arrays, e.g.), and then write the results back to disk. The reduction pipeline pypeit (https://github.com/pypeit/PypeIt/tree/master/pypeit) is a good example package to see the end-to-end manipulation of FITS files for science.

With the relatively recent introduction of neural network-based steps for astronomical image processing (e.g., we have a package called deepCR, https://github.com/profjsb/deepCR, https://arxiv.org/abs/1907.09500), the current best practice when using GPUs is to read FITS data from disk, push the data to a GPU Tensor in pytorch, apply machine learning models, and then convert the Tensor back to a CPU-based numpy array. This round trip adds overhead. We'd like to be able to read FITS files directly to a GPU Tensor in pytorch (and the like). Of course, writing FITS files directly from GPU Tensors would be a natural next step.

If a FITS reader is developed that can easily lead to the construction of a tensor variable on the GPU, it will open up our community to developing entirely GPU-based image processing pipelines. Many of our manipulations of images are very amenable to the massive parallelism afforded by GPUs. As someone leading an astronomy-meets-machine-learning group at UC Berkeley, I'm personally excited about this as we start to make use of GPU-based clusters, such as the new "Perlmutter" system at NERSC (https://www.nersc.gov/systems/perlmutter/).
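For concreteness, the CPU round trip described above can be sketched as follows. This is a minimal illustration assuming astropy and pytorch are installed; the file names and the trivial "model" (a scalar multiply standing in for something like deepCR) are hypothetical, and the code falls back to CPU when no GPU is present:

```python
# Sketch of today's disk -> numpy -> GPU -> numpy -> disk round trip.
# Assumes astropy and pytorch are available; file names are illustrative.
import os
import tempfile

import numpy as np
import torch
from astropy.io import fits

tmpdir = tempfile.mkdtemp()
in_path = os.path.join(tmpdir, "image.fits")
out_path = os.path.join(tmpdir, "out.fits")

# Write a small synthetic image so the example is self-contained.
data = np.arange(16, dtype=np.float32).reshape(4, 4)
fits.PrimaryHDU(data).writeto(in_path, overwrite=True)

# 1. Read the FITS file from disk into a CPU-side numpy array.
with fits.open(in_path) as hdul:
    image = hdul[0].data

# 2. Push the data to a torch Tensor, on the GPU if one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
tensor = torch.from_numpy(image.astype(np.float32)).to(device)

# 3. Apply a "model" (placeholder operation standing in for real inference).
result = tensor * 2.0

# 4. Round-trip: copy back to a CPU numpy array and write FITS to disk.
out = result.cpu().numpy()
fits.PrimaryHDU(out).writeto(out_path, overwrite=True)
```

A GPU-native FITS reader would let step 1 land directly in GPU memory, eliminating the host-side copy in step 2 (and ideally the one in step 4 for writing).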
cc @profjsb @datametrician @jakirkham