Add experimental FITS reader for CPU backend #4591
Conversation
dali/test/python/reader/test_fits.py
Outdated
gds_data_root = '/scratch/'
if not os.path.isdir(gds_data_root):
    gds_data_root = os.getcwd() + "/scratch/"
    if not os.path.isdir(gds_data_root):
        os.mkdir(gds_data_root)
assert os.path.isdir(gds_data_root)
You don't need that for FITS. It is used for the GPUDirect Storage solution, which works only on a real file system: our tests run inside a Docker environment whose tmp file system doesn't work with GDS, so we need to mount an external file system and use it for the tests.
Sure, addressed by removing this fragment.
The comments relate to the operator API, plus some code-trimming suggestions.
  }
}

DALI_REGISTER_OPERATOR(readers__Fits, FitsReaderCPU, CPU);
I think we should start by placing this in the experimental module, at least until we get some more functionality:
Suggested change:

DALI_REGISTER_OPERATOR(experimental__readers__Fits, FitsReaderCPU, CPU);
This will mean that you can access the operator as experimental.readers.fits, so you would need to adjust the tests.
Addressed, followed through with the suggestion.
fits_movabs_hdu(current_file, hdu_index_, NULL, &status);

if (hdu_index_ < hdunum) {
  hdu_index_++;
The suggested approach would mean that we potentially read one sample consisting of multiple HDUs.
try {
  fits::ParseHeader(header, current_file);
} catch (const std::runtime_error& e) {
  DALI_FAIL(e.what() + ". File: " + filename);
}
It should be enough to run it directly and let the exception be handled further. I think only the DALI_ENFORCE in this function may throw.
Suggested change:

fits::ParseHeader(header, current_file);
fits_get_img_param(src, MAX_DIMS, &img_type, &n_dims, dims, &status);
parsed_header.type_info = &TypeFromFitsImageType(img_type);
parsed_header.compressed = (fits_is_compressed_image(src, &status) == 1);
Should we error out on the compressed files until we merge the option to decompress them?
int batch_size = GetCurrBatchSize();
const auto& file_0 = GetSample(0);
DALIDataType output_type = file_0.get_type();
int ndim = file_0.get_shape().sample_dim();
TensorListShape<> sh(batch_size, ndim);

// TODO: implement checking that all images have same dimensions
// also all general calculations for all images such as roi
In the initial version it might be OK to assume a uniform batch, but maybe there shouldn't be such a constraint in general, and we should support images of different sizes.
Also, you need to peek the shape of the sample and set it for all elements of sh so the executor can correctly resize the buffers. Right now it will probably have some random contents (we just resized the internal vector to batch_size * ndim elements), so this is probably a bug.
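The fix described above — peek one sample's shape and assign it to every batch element — can be sketched with a plain Python list standing in for DALI's TensorListShape (the container and the helper name fill_uniform_shapes are stand-ins for illustration, not the real API):

```python
def fill_uniform_shapes(batch_size, sample_shape):
    # Stand-in for TensorListShape<> sh(batch_size, ndim): the freshly
    # resized container holds no meaningful shapes yet.
    sh = [None] * batch_size
    # Peek the shape of sample 0 and set it for every element, so the
    # executor can size each output buffer correctly.
    for i in range(batch_size):
        sh[i] = list(sample_shape)
    return sh

shapes = fill_uniform_shapes(4, (128, 128))
```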
Bunch of comments, but I think we are close to finishing with this.
dali/util/fits.cc
Outdated
}

if (status)
  fits_report_error(stderr, status);
I think we should probably obtain the error message and report it with DALI_FAIL
void FitsLoader::ReadSample(FitsFileWrapper& target) {
  auto filename = files_[current_index_++];
  fitsfile* current_file = nullptr;
It would probably be safer for resource management if we used UniqueHandle for managing the opened FITS file: include/dali/core/unique_handle.h
See for example how to implement it here: include/dali/core/cuda_event.h
We could implement it like:
class DLL_PUBLIC FitsHandle: public UniqueHandle<fitsfile*, FitsHandle> {
public:
DALI_INHERIT_UNIQUE_HANDLE(fitsfile*, FitsHandle)
constexpr FitsHandle() = default;
/** @brief Opens the FITS file with fits_open_file*/
static FitsHandle OpenFile(char *filename, int mode);
/** @brief Calls fits_close_file on the file handle */
static void DestroyHandle(fitsfile*);
};
It will automatically call close on destruction so we can safely throw exceptions around.
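The close-on-destruction guarantee the UniqueHandle sketch above provides can be illustrated in Python with a context manager over a dummy handle (FakeFitsFile, open_fits, and the simulated failure are all made up for the illustration; the real code stays in C++):

```python
import contextlib

class FakeFitsFile:
    """Dummy stand-in for a fitsfile* handle."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

@contextlib.contextmanager
def open_fits(path):
    handle = FakeFitsFile()   # stands in for fits_open_file
    try:
        yield handle
    finally:
        handle.close()        # stands in for fits_close_file

# Even if reading raises mid-way, the handle is closed, not leaked:
leaked = None
try:
    with open_fits("data.fits") as f:
        leaked = f
        raise RuntimeError("simulated fits_read_img failure")
except RuntimeError:
    pass
assert leaked.closed
```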
using Operator<Backend>::spec_;

// rewrite this part given new approach with multiple outputs
// do i
There was a question here :)
dali/test/python/reader/test_fits.py
Outdated
all_numpy_types = set([
    np.bool_, np.byte, np.ubyte, np.short, np.ushort, np.intc, np.uintc, np.int_, np.uint,
    np.longlong, np.ulonglong, np.half, np.float16, np.single, np.double, np.longdouble, np.csingle,
    np.cdouble, np.clongdouble, np.int8, np.int16, np.int32, np.int64, np.uint8, np.uint16,
    np.uint32, np.uint64, np.intp, np.uintp, np.float32, np.float64, np.float_, np.complex64,
    np.complex128, np.complex_
])
Let's limit ourselves to some subset of the image types supported by FITS:
u8, i16, i32, i64, u32, u64.
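The suggested subset could be written out as an explicit parametrization list in the test file. As background (a sketch with dtype names as plain strings; the real test would use np.uint8 etc.): the FITS standard encodes integer image types via BITPIX, where the signed widths map directly, BITPIX=8 is unsigned, and unsigned 16/32/64-bit data is stored as the signed type plus a BZERO offset.

```python
# FITS BITPIX -> element type, per the FITS standard.
bitpix_to_dtype = {
    8: "uint8",
    16: "int16",
    32: "int32",
    64: "int64",
}

# Subset suggested in the review for the initial tests.
supported_types = ["uint8", "int16", "int32", "int64", "uint32", "uint64"]
```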
dali/test/python/reader/test_fits.py
Outdated
def FitsReaderPipeline(path, batch_size, device="cpu", file_list=None, files=None,
                       file_filter="*.fits", num_threads=1, device_id=0,
                       hdu_indices=None, dtype=None):
    pipe = Pipeline(batch_size=batch_size, num_threads=num_threads, device_id=device_id)
    data = fn.readers.fits(device=device,
                           file_list=file_list,
                           files=files,
                           file_root=path,
                           file_filter=file_filter,
                           shard_id=0,
                           num_shards=1)
    pipe.set_outputs(data)
    return pipe
Two nitpicks: I recommend using the @pipeline_def decorator, and you need to add the experimental module:
Suggested change:

@pipeline_def
def FitsReaderPipeline(path, device="cpu", file_list=None, files=None,
                       file_filter="*.fits", hdu_indices=None, dtype=None):
    data = fn.experimental.readers.fits(device=device,
                                        file_list=file_list,
                                        files=files,
                                        file_root=path,
                                        file_filter=file_filter,
                                        shard_id=0,
                                        num_shards=1)
    return data
This function would accept arguments for the Pipeline constructor as kwargs, so you can still pass the batch_size, num_threads and device_id.
dali/test/python/reader/test_fits.py
Outdated
def delete_fits_file(filename):
    if os.path.isfile(filename):
        os.remove(filename)
Unused; you could probably look into setting up unittest setUp and tearDown functions for this purpose.
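A minimal sketch of that suggestion, using unittest's setUp/tearDown with a temporary directory so no manual delete_fits_file calls are needed (the class name, test body, and file name are placeholders, not the real test suite):

```python
import os
import shutil
import tempfile
import unittest

class TestFitsReader(unittest.TestCase):
    def setUp(self):
        # Fresh scratch directory for every test.
        self.data_dir = tempfile.mkdtemp()

    def tearDown(self):
        # Runs even when a test fails, so scratch files are never leaked.
        shutil.rmtree(self.data_dir)

    def test_creates_file(self):
        # Placeholder body; a real test would write actual FITS data here.
        path = os.path.join(self.data_dir, "sample.fits")
        with open(path, "wb") as f:
            f.write(b"\0" * 16)
        self.assertTrue(os.path.isfile(path))

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFitsReader)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```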
.AddOptionalArg("hdu_indices",
                R"(HDU indexes to read. If not provided, first HDU after primary
will be yielded (i.e. will default to hdu_indices=[2]).)",
                std::vector<int>{2})
Try the GetRepeatedArgument method.
!build
CI MESSAGE: [7619795]: BUILD STARTED
Signed-off-by: aderylo <a.m.derylo@gmail.com>
DALI_ENFORCE(status == 0, make_string("Failed to read a file: ", filename));

// close the file handle
fits_close_file(current_file, &status);
+1 to Krzysztof's comment about the UniqueHandle. This looks like a handle leak on an unsuccessful fits_read_img.
Done!
fits_read_img(current_file, header.datatype_code, 1, nelem, &nulval,
              static_cast<uint8_t*>(target.data[output_idx].raw_mutable_data()), &anynul,
              &status);
DALI_ENFORCE(status == 0, make_string("Failed to read a file: ", filename));
For consideration: I wonder if it isn't more user-friendly to forward a message from fits itself here, e.g. make_string("Failed to read a file: ", filename, ". ", GetFitsErrorMessage(status)), where GetFitsErrorMessage is like HandleFitsError but simply returns the string. (You could then use it inside HandleFitsError too.)
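The shape of that refactor — one helper that returns the message string, reused by both the failure path and the reporting path — sketched in Python (the status table and helper names are stand-ins; the real implementation would obtain the text from cfitsio's fits_get_errstatus):

```python
# Hypothetical status -> message table; cfitsio's fits_get_errstatus would
# supply these strings in the real implementation.
_FITS_ERRORS = {
    104: "could not open the named file",
    108: "error reading from FITS file",
}

def get_fits_error_message(status):
    # Like HandleFitsError, but returns the string instead of reporting it.
    return _FITS_ERRORS.get(status, "unknown cfitsio status {}".format(status))

def read_failure_message(filename, status):
    # Mirrors make_string("Failed to read a file: ", filename, ". ", ...).
    return "Failed to read a file: {}. {}".format(
        filename, get_fits_error_message(status))
```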
dali/util/fits.cc
Outdated
fits_get_img_dim(src, &n_dims, &status);            /* get NAXIS value */
std::vector<int64_t> dims(n_dims, 0);               /* create vector for storing img dims */
fits_get_img_size(src, n_dims, &dims[0], &status);  /* get NAXISn values */
I wonder how paranoid we should be here, but I guess it wouldn't hurt to validate the ndim and shape for being non-negative if that can be an arbitrary value in a malicious file.
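The validation the comment asks about amounts to two or three checks after the header is read; a sketch (validate_image_geometry is a made-up name, and in the C++ code this would be DALI_ENFORCE calls after fits_get_img_dim/fits_get_img_size):

```python
def validate_image_geometry(n_dims, dims):
    # Reject geometry that could only come from a corrupt or malicious header.
    if n_dims < 0:
        raise ValueError("NAXIS must be non-negative, got {}".format(n_dims))
    if len(dims) != n_dims:
        raise ValueError("expected {} NAXISn values, got {}".format(n_dims, len(dims)))
    if any(d < 0 for d in dims):
        raise ValueError("NAXISn values must be non-negative, got {}".format(dims))
    return True
```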
auto& sample = GetSample(sample_idx);
for (int output_idx = 0; output_idx < num_outputs; output_idx++) {
  output_desc[output_idx].shape.set_tensor_shape(sample_idx, sample.data[output_idx].shape());
  output_desc[output_idx].type = sample.data[output_idx].type();
Do you (or one of the base classes?) validate the dtypes against what we actually get in the header?
const auto& sample_0 = GetSample(0);

for (int output_idx = 0; output_idx < num_outputs; output_idx++) {
  int ndim = sample_0.data[output_idx].shape().sample_dim();
What happens if the ndims in the actual headers differ between files?
No idea. In any case, I've added a check for consistency of dimensions and types between files on the corresponding outputs.
I wonder how the reader acts if:
1. some dtypes in the header don't match the declared dtypes,
2. ndims differ between files,
3. there are no files to read (due to an empty dir, an empty list file, or a restrictive glob).
I believe the third point should be handled by the base class, so it is nice to have, but for the first two I think we should have tests that assert we fail in an orderly manner (take a look at assert_raises).
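The kind of negative test being asked for, sketched with a stdlib stand-in for DALI's assert_raises helper (build_pipeline is a dummy that raises the way the reader is expected to on mismatched ndims; all names here are illustrative only):

```python
import re

def build_pipeline(ndims_per_file):
    # Dummy stand-in for building and running the FITS reader pipeline:
    # raise when the number of dimensions differs between files.
    if len(set(ndims_per_file)) > 1:
        raise RuntimeError("Inconsistent data: All samples in the batch must "
                           "have the same number of dimensions")
    return "ok"

def assert_raises_with_message(exc_type, pattern, fn, *args):
    # Minimal analog of assert_raises with a message pattern.
    try:
        fn(*args)
    except exc_type as e:
        assert re.search(pattern, str(e)), "unexpected message: {}".format(e)
        return
    raise AssertionError("expected {} to be raised".format(exc_type.__name__))

assert_raises_with_message(RuntimeError, "same number of dimensions",
                           build_pipeline, [2, 3])
```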
2 & 3 should raise errors, 1 is not handled yet.
Nice! Nonetheless, it would be good to have simple test cases that validate it is really the case. For now and future changes.
Signed-off-by: aderylo <a.m.derylo@gmail.com>
…cfitsio api Signed-off-by: aderylo <a.m.derylo@gmail.com>
…om provided arguments Signed-off-by: aderylo <a.m.derylo@gmail.com>
!build
CI MESSAGE: [7686640]: BUILD STARTED
CI MESSAGE: [7686640]: BUILD FAILED
Signed-off-by: aderylo <a.m.derylo@gmail.com>
!build
CI MESSAGE: [7688979]: BUILD STARTED
CI MESSAGE: [7688979]: BUILD PASSED
4. Number of outputs per sample corresponds to the length of ``hdu_indices`` argument. By default,
   first HDU with data is read from each file, so the number of outputs defaults to 1.
Nitpick, can be done as a follow-up
Suggested change:

The number of outputs per sample corresponds to the length of ``hdu_indices`` argument. By default,
first HDU with data is read from each file, so the number of outputs defaults to 1.
parsed_header.hdu_type = hdu_type;
parsed_header.datatype_code = ImgTypeToDatatypeCode(img_type);
parsed_header.type_info = &TypeFromFitsDatatypeCode(parsed_header.datatype_code);
parsed_header.compressed = (fits_is_compressed_image(src, &status) == 1);
We don't validate this particular status ever, do we?
    sample.data[output_idx].shape().sample_dim() == ndims[output_idx],
    make_string("Inconsistent data: All samples in the batch must have the same number of "
                "dimensions for outputs with the same idx. "
                "Got \"",
                sample_0.filename, "\" with ", ndims[output_idx], " dimensions and \"",
                sample.filename, "\" with ", sample.data[output_idx].shape().sample_dim(),
                " dimensions on output with idx = ", output_idx));

DALI_ENFORCE(
    sample.data[output_idx].type() == output_dtypes[output_idx],
    make_string("Inconsistent data: All samples in the batch must have the same data type "
                "for outputs with the same idx. "
                "Got \"",
                sample_0.filename, "\" with data type ", output_dtypes[output_idx],
                " and \"", sample.filename, "\" with data type ",
                sample.data[output_idx].type()));
👍
It looks good, it seems that it works, great job! As a follow-up, please keep in mind:
1. a few test cases on incorrect inputs would be helpful,
2. enforcing the dtypes if we accept them,
3. there is one status of a fits call that is never validated (it seems like a special case when the status is not returned as an output; if the call cannot fail according to the fits docs then please ignore that remark).
Category: New feature
Description: Implement fits reader on cpu backend
Additional information:
Affected modules and functionalities:
Key points relevant for the review:
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A