
Add experimental FITS reader for CPU backend #4591

Merged: 62 commits into NVIDIA:main, Mar 24, 2023

Conversation

aderylo
Contributor

@aderylo commented on Jan 20, 2023

Category: New feature

Description: Implement a FITS reader for the CPU backend.

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

Comment on lines 27 to 32
gds_data_root = '/scratch/'
if not os.path.isdir(gds_data_root):
    gds_data_root = os.getcwd() + "/scratch/"
    if not os.path.isdir(gds_data_root):
        os.mkdir(gds_data_root)
        assert os.path.isdir(gds_data_root)
Contributor

You don't need that for FITS. It is used for the GPU Direct Storage solution, which works only on a real file system. Our tests run inside a Docker environment whose tmp file system doesn't work with GDS, so we mount an external file system and use it for those tests.

Contributor Author

Sure, addressed by removing this fragment.

@klecki assigned klecki and unassigned mzient on Jan 20, 2023
Contributor

@klecki left a comment

Comments related to the operator API and some code-trimming suggestions.

dali/operators/reader/fits_reader_op.cc: 4 resolved review threads (outdated)
}
}

DALI_REGISTER_OPERATOR(readers__Fits, FitsReaderCPU, CPU);
Contributor

I think we should start by placing this in the experimental module, at least until we get some more functionality:

Suggested change
DALI_REGISTER_OPERATOR(readers__Fits, FitsReaderCPU, CPU);
DALI_REGISTER_OPERATOR(experimental__readers__Fits, FitsReaderCPU, CPU);

This will mean that you can access the operator as experimental.readers.fits, so you would need to adjust the tests.

Contributor Author

Addressed, followed through with the suggestion.

dali/operators/reader/fits_reader_op.cc: 2 resolved review threads (1 outdated)
fits_movabs_hdu(current_file, hdu_index_, NULL, &status);

if (hdu_index_ < hdunum) {
  hdu_index_++;
Contributor

The suggested approach would mean that we potentially read one sample consisting of multiple HDUs.

Comment on lines 68 to 90
try {
  fits::ParseHeader(header, current_file);
} catch (const std::runtime_error& e) {
  DALI_FAIL(e.what() + ". File: " + filename);
}
Contributor

It should be enough to call it directly and let the exception propagate. I think only the DALI_ENFORCE in this function may throw.

Suggested change
try {
  fits::ParseHeader(header, current_file);
} catch (const std::runtime_error& e) {
  DALI_FAIL(e.what() + ". File: " + filename);
}
fits::ParseHeader(header, current_file);


fits_get_img_param(src, MAX_DIMS, &img_type, &n_dims, dims, &status);
parsed_header.type_info = &TypeFromFitsImageType(img_type);
parsed_header.compressed = (fits_is_compressed_image(src, &status) == 1);
Contributor

Should we error out on the compressed files until we merge the option to decompress them?
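A minimal sketch of such a guard, assuming the parsed_header field filled in just above (not a final implementation):

// Hypothetical guard: reject compressed HDUs until decompression support is merged.
DALI_ENFORCE(!parsed_header.compressed,
             "Compressed FITS images are not supported yet; "
             "please decompress the file before reading.");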

@aderylo force-pushed the fits-reader branch 2 times, most recently from 0804f9a to 954de00 on January 26, 2023 09:08
Comment on lines 48 to 55
int batch_size = GetCurrBatchSize();
const auto& file_0 = GetSample(0);
DALIDataType output_type = file_0.get_type();
int ndim = file_0.get_shape().sample_dim();
TensorListShape<> sh(batch_size, ndim);

// TODO: implement checking that all images have same dimensions
// also all general calculations for all images such as roi
Contributor

In the initial version it might be OK to assume a uniform batch, but maybe there shouldn't be such a constraint in general, and we should support images of different sizes.

Also, you need to peek the shape of the sample and set it for all elements of sh so the executor can correctly resize the buffers. Right now it will probably have some random contents (we just resized the internal vector to batch_size * ndim elements), so this is probably a bug.
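A minimal sketch of what that could look like, assuming the variables from the snippet above and the uniform-batch assumption:

// Broadcast the first sample's shape to every element of sh,
// so the executor sees valid extents instead of uninitialized values.
TensorListShape<> sh(batch_size, ndim);
for (int i = 0; i < batch_size; i++) {
  sh.set_tensor_shape(i, file_0.get_shape());
}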

Contributor

@klecki left a comment

A bunch of comments, but I think we are close to finishing this.

dali/util/fits.cc: 2 resolved review threads (outdated)
}

if (status)
  fits_report_error(stderr, status);
Contributor

I think we should probably obtain the error message and report it with DALI_FAIL
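For example, a small helper along these lines (a sketch; fits_get_errstatus and FLEN_STATUS come from cfitsio, DALI_FAIL and make_string from DALI):

void HandleFitsError(int status) {
  if (status) {
    char err_text[FLEN_STATUS];            // cfitsio's short error description
    fits_get_errstatus(status, err_text);  // translate the status code into text
    DALI_FAIL(make_string("cfitsio error (", status, "): ", err_text));
  }
}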


void FitsLoader::ReadSample(FitsFileWrapper& target) {
  auto filename = files_[current_index_++];
  fitsfile* current_file = nullptr;
Contributor

It would probably be safer for resource management if we used UniqueHandle to manage the opened FITS file: include/dali/core/unique_handle.h

See for example how to implement it here: include/dali/core/cuda_event.h

we could implement it like:

class DLL_PUBLIC FitsHandle: public UniqueHandle<fitsfile*, FitsHandle> {
 public:
  DALI_INHERIT_UNIQUE_HANDLE(fitsfile*, FitsHandle)
  constexpr FitsHandle() = default;

  /** @brief Opens the FITS file with fits_open_file */
  static FitsHandle OpenFile(char *filename, int mode);

  /** @brief Calls fits_close_file on the file handle */
  static void DestroyHandle(fitsfile*);
};

It will automatically call close on destruction so we can safely throw exceptions around.
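A possible implementation of the two static members could look roughly like this (a sketch, assuming cfitsio's fits_open_file/fits_close_file signatures):

FitsHandle FitsHandle::OpenFile(char *filename, int mode) {
  fitsfile *ff = nullptr;
  int status = 0;
  fits_open_file(&ff, filename, mode, &status);
  DALI_ENFORCE(status == 0, make_string("Failed to open FITS file: ", filename));
  return FitsHandle(ff);  // ownership passes to the handle
}

void FitsHandle::DestroyHandle(fitsfile *handle) {
  int status = 0;
  fits_close_file(handle, &status);  // best effort; errors on close are ignored here
}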

dali/operators/reader/loader/fits_loader.cc: resolved review thread
using Operator<Backend>::spec_;

// rewrite this part given new approach with multiple outputs
// do i
Contributor

There was a question here :)

dali/operators/reader/fits_reader_op.cc: resolved review thread (outdated)
Comment on lines 58 to 50
all_numpy_types = set([
    np.bool_, np.byte, np.ubyte, np.short, np.ushort, np.intc, np.uintc, np.int_, np.uint,
    np.longlong, np.ulonglong, np.half, np.float16, np.single, np.double, np.longdouble,
    np.csingle, np.cdouble, np.clongdouble, np.int8, np.int16, np.int32, np.int64, np.uint8,
    np.uint16, np.uint32, np.uint64, np.intp, np.uintp, np.float32, np.float64, np.float_,
    np.complex64, np.complex128, np.complex_
])
Contributor

Let's limit ourselves to some subset of the image types supported by FITS:
u8, i16, i32, i64, u32, u64.

Comment on lines 42 to 55

def FitsReaderPipeline(path, batch_size, device="cpu", file_list=None, files=None,
                       file_filter="*.fits", num_threads=1, device_id=0,
                       hdu_indices=None, dtype=None):
    pipe = Pipeline(batch_size=batch_size, num_threads=num_threads, device_id=device_id)
    data = fn.readers.fits(device=device,
                           file_list=file_list,
                           files=files,
                           file_root=path,
                           file_filter=file_filter,
                           shard_id=0,
                           num_shards=1)
    pipe.set_outputs(data)
    return pipe
Contributor

Two nitpicks: I recommend using the @pipeline_def decorator, and you need to add the experimental module:

Suggested change
def FitsReaderPipeline(path, batch_size, device="cpu", file_list=None, files=None,
                       file_filter="*.fits", num_threads=1, device_id=0,
                       hdu_indices=None, dtype=None):
    pipe = Pipeline(batch_size=batch_size, num_threads=num_threads, device_id=device_id)
    data = fn.readers.fits(device=device,
                           file_list=file_list,
                           files=files,
                           file_root=path,
                           file_filter=file_filter,
                           shard_id=0,
                           num_shards=1)
    pipe.set_outputs(data)
    return pipe
@pipeline_def
def FitsReaderPipeline(path, device="cpu", file_list=None, files=None,
                       file_filter="*.fits", hdu_indices=None, dtype=None):
    data = fn.experimental.readers.fits(device=device,
                                        file_list=file_list,
                                        files=files,
                                        file_root=path,
                                        file_filter=file_filter,
                                        shard_id=0,
                                        num_shards=1)
    return data

This function would accept the Pipeline constructor arguments as kwargs, so you can still pass batch_size, num_threads, and device_id.

Comment on lines 38 to 40
def delete_fits_file(filename):
    if os.path.isfile(filename):
        os.remove(filename)
Contributor

This is unused; you could probably look into setting up unittest setUp and tearDown functions for this purpose.

dali/util/fits.cc: resolved review thread (outdated)
.AddOptionalArg("hdu_indices",
R"(HDU indexes to read. If not provided, first HDU after primary
will be yielded (i.e. will default to hdu_indices=[2]).)",
std::vector<int>{2})
Contributor

Try the GetRepeatedArgument method.
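For reference, a sketch of how that could be read in the operator (assuming the spec_ member shown elsewhere in this PR):

// Read the repeated "hdu_indices" argument as a vector instead of a single value.
std::vector<int> hdu_indices = spec_.GetRepeatedArgument<int>("hdu_indices");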

@klecki
Contributor

klecki commented Mar 17, 2023

!build

@dali-automaton
Collaborator

CI MESSAGE: [7619795]: BUILD STARTED

@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.



Signed-off-by: aderylo <a.m.derylo@gmail.com>
Comment on lines 79 to 83
DALI_ENFORCE(status == 0, make_string("Failed to read a file: ", filename));


// close the file handle
fits_close_file(current_file, &status);
Member

+1 to Krzysztof's comment about the UniqueHandle. This looks like a handle leak on an unsuccessful fits_read_img.

Contributor Author

Done!

fits_read_img(current_file, header.datatype_code, 1, nelem, &nulval,
              static_cast<uint8_t*>(target.data[output_idx].raw_mutable_data()), &anynul,
              &status);
DALI_ENFORCE(status == 0, make_string("Failed to read a file: ", filename));
Member

For consideration:

I wonder if it wouldn't be more user-friendly to forward a message from cfitsio itself here:

make_string("Failed to read a file: ", filename, ". ", GetFitsErrorMessage(status))

where GetFitsErrorMessage is like HandleFitsError but simply returns the string. (You could then use it inside HandleFitsError too.)
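A sketch of what that helper could look like (hypothetical name, reusing cfitsio's fits_get_errstatus), together with the enforce it would feed:

std::string GetFitsErrorMessage(int status) {
  char err_text[FLEN_STATUS];            // cfitsio's short error description
  fits_get_errstatus(status, err_text);
  return std::string(err_text);
}

// ...then in ReadSample:
DALI_ENFORCE(status == 0,
             make_string("Failed to read a file: ", filename, ". ", GetFitsErrorMessage(status)));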

Comment on lines 97 to 99
fits_get_img_dim(src, &n_dims, &status); /* get NAXIS value */
std::vector<int64_t> dims(n_dims, 0); /* create vector for storing img dims*/
fits_get_img_size(src, n_dims, &dims[0], &status); /* get NAXISn values */
Member

I wonder how paranoid we should be here, but I guess it wouldn't hurt to validate that ndim and the shape values are non-negative, if they can be arbitrary values in a malicious file.
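For instance, defensive checks along these lines (a sketch against the variables in the snippet above, assuming the MAX_DIMS constant used earlier in this file):

// Reject malformed or malicious headers before allocating anything.
DALI_ENFORCE(n_dims >= 0 && n_dims <= MAX_DIMS,
             make_string("Unexpected number of dimensions in FITS header: ", n_dims));
for (int64_t d : dims) {
  DALI_ENFORCE(d >= 0, make_string("Negative extent in FITS header: ", d));
}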

dali/util/fits.cc: resolved review thread (outdated)
auto& sample = GetSample(sample_idx);
for (int output_idx = 0; output_idx < num_outputs; output_idx++) {
  output_desc[output_idx].shape.set_tensor_shape(sample_idx, sample.data[output_idx].shape());
  output_desc[output_idx].type = sample.data[output_idx].type();
Member

Do you (or one of the base classes?) validate the dtypes against what we actually get in the header?
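For instance, a check along these lines in the shape-inference loop above (a sketch; the dtype_ member holding the declared dtype argument is hypothetical):

// Hypothetical check: compare the dtype declared via the operator argument
// with the dtype actually parsed from the FITS header for this output.
if (dtype_ != DALI_NO_TYPE) {
  DALI_ENFORCE(sample.data[output_idx].type() == dtype_,
               make_string("Declared dtype ", dtype_, " does not match the dtype ",
                           sample.data[output_idx].type(), " found in the FITS header."));
}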

const auto& sample_0 = GetSample(0);

for (int output_idx = 0; output_idx < num_outputs; output_idx++) {
  int ndim = sample_0.data[output_idx].shape().sample_dim();
Member

What happens if the ndims in the actual headers differ between files?

Contributor Author

@aderylo Mar 22, 2023

No idea. In any case, I've added a check for consistency of dimensions and types between files on the corresponding outputs.

Member

@stiepan Mar 22, 2023

I wonder how the reader acts if

  1. some dtypes in the header don't match the declared dtypes
  2. ndims differ between files
  3. there are no files to read (due to an empty dir, an empty list file, or a restrictive glob)

I believe the third point should be handled by the base class, so that one is nice to have, but for the first two I think we should have tests that assert we fail in an orderly manner (take a look at assert_raises).

Contributor Author

2 & 3 should raise errors, 1 is not handled yet.

Member

Nice! Nonetheless, it would be good to have simple test cases that validate that this is really the case, for now and for future changes.

Signed-off-by: aderylo <a.m.derylo@gmail.com>
…cfitsio api

Signed-off-by: aderylo <a.m.derylo@gmail.com>
…om provided arguments

Signed-off-by: aderylo <a.m.derylo@gmail.com>
@klecki
Contributor

klecki commented Mar 23, 2023

!build

@dali-automaton
Collaborator

CI MESSAGE: [7686640]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [7686640]: BUILD FAILED

Signed-off-by: aderylo <a.m.derylo@gmail.com>
@klecki
Contributor

klecki commented Mar 23, 2023

!build

@dali-automaton
Collaborator

CI MESSAGE: [7688979]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [7688979]: BUILD PASSED

Comment on lines +44 to +45
4. Number of outputs per sample corresponds to the length of ``hdu_indices`` argument. By default,
first HDU with data is read from each file, so the number of outputs defaults to 1.
Member

Nitpick, can be done as a follow-up

Suggested change
4. Number of outputs per sample corresponds to the length of ``hdu_indices`` argument. By default,
first HDU with data is read from each file, so the number of outputs defaults to 1.
The number of outputs per sample corresponds to the length of ``hdu_indices`` argument. By default,
first HDU with data is read from each file, so the number of outputs defaults to 1.

parsed_header.hdu_type = hdu_type;
parsed_header.datatype_code = ImgTypeToDatatypeCode(img_type);
parsed_header.type_info = &TypeFromFitsDatatypeCode(parsed_header.datatype_code);
parsed_header.compressed = (fits_is_compressed_image(src, &status) == 1);
Member

We don't validate this particular status ever, do we?
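If we want to, it could be folded into the same error handling as the other calls, for example (a sketch, assuming a HandleFitsError helper like the one discussed above):

parsed_header.compressed = (fits_is_compressed_image(src, &status) == 1);
HandleFitsError(status);  // surface a cfitsio error instead of silently ignoring it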

Comment on lines +65 to +80
sample.data[output_idx].shape().sample_dim() == ndims[output_idx],
make_string("Inconsistent data: All samples in the batch must have the same number of "
"dimensions for outputs with the same idx. "
"Got \"",
sample_0.filename, "\" with ", ndims[output_idx], " dimensions and \"",
sample.filename, "\" with ", sample.data[output_idx].shape().sample_dim(),
" dimensions on ouput with idx = ", output_idx));

DALI_ENFORCE(
sample.data[output_idx].type() == output_dtypes[output_idx],
make_string("Inconsistent data: All samples in the batch must have the same data type "
"for outputs with the same idx. "
"Got \"",
sample_0.filename, "\" with data type ", output_dtypes[output_idx],
" and \"", sample.filename, "\" with data type ",
sample.data[output_idx].type()));
Member

👍

@stiepan
Member

stiepan commented Mar 24, 2023

It looks good, it seems that it works, great job!

As a follow-up, please keep in mind:

  1. A few test cases on incorrect inputs would be helpful.
  2. Enforce the dtypes if we accept them.
  3. There is one status of a fits call that is never validated (it seems to be a special case where the status is not returned as an output; if the call cannot fail according to the cfitsio docs, please ignore this remark).

@klecki changed the title from "Reader and loader for fits files on CPU backend" to "Add experimental FITS reader for CPU backend" on Mar 24, 2023
@klecki merged commit a4f7d59 into NVIDIA:main on Mar 24, 2023