How to get the available dataset names from the reader #263

adybbroe · 2018-04-25T07:57:59Z

Problem description

We have an MSG SEVIRI native-format data file that contains only a subset of the SEVIRI bands.
However, when we request the available dataset names we get all of them. This is because the Scene object does not use the information already retrieved by the reader. Reading the file header tells us which datasets are in the file, but this information is not passed on. We tried adding an available_dataset_ids method to the reader, but it is not called when running the code below. We also tried passing the reader name to the call, but that made no difference.

How should this be done?

In [2]: nat.available_dataset_names()
Out[2]: 
['HRV',
 'IR_016',
 'IR_039',
 'IR_087',
 'IR_097',
 'IR_108',
 'IR_120',
 'IR_134',
 'VIS006',
 'VIS008',
 'WV_062',
 'WV_073']

Expected Output

Versions of Python, package at hand and relevant dependencies

This was using SatPy from the latest msg-hrit-native-merge branch, which we are preparing for a PR.
The behaviour above is seen with both Python 3.4.6 and 2.7.5.


xuesongle · 2018-04-26T04:58:34Z

Each dataset name represents a band (channel) on your local drive (you need to have the files for the different bands saved in your local directory). Which band is loaded into the Scene object depends on those parameters. The long list tells you that files for all of the bands are present in your base directory but have not been loaded.
For example,
nat.load(["IR_016"])
nat.save_datasets(writer='simple_image')

will save the data in the "IR_016" band to a PNG image file.

djhoese · 2018-04-26T12:23:19Z

@adybbroe As discussed on Slack, this "issue" is expected because the base reader currently uses a dataset's file_type to determine whether it is available. This can be problematic in cases like this one, where the proper file is loaded but the dataset is not actually present in the file. There are two possible solutions:

1. If possible, split the single file_type in the reader's YAML file into multiple file types. File types are usually combined in a YAML file because it is simpler to have one file pattern for all of the possible files, even though the files are actually per-band. If the filenames differ between datasets, you can split the single file type into multiple file types. This can make the YAML file more complex but will produce the expected results.
2. Look at the existing clavrx and geocat readers, which dynamically discover the available datasets by looking at the file contents. In the future this type of functionality will be part of the base readers, but the design has not been finalized yet.
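
To make the difference between the current behaviour and content-based discovery concrete, here is a minimal sketch in plain Python. It does not use SatPy and is not how the clavrx or geocat readers are implemented. The names CONFIGURED_DATASETS, available_by_file_type and available_by_content, the file type label "native_msg", and the example set of bands in the header are all hypothetical and chosen only for illustration.

```python
# Hypothetical stand-in for the datasets configured in a reader's YAML file:
# every dataset name maps to the single file type that can provide it.
CONFIGURED_DATASETS = {
    name: "native_msg"
    for name in [
        "HRV", "IR_016", "IR_039", "IR_087", "IR_097", "IR_108",
        "IR_120", "IR_134", "VIS006", "VIS008", "WV_062", "WV_073",
    ]
}


def available_by_file_type(configured, loaded_file_types):
    """Report a dataset as available if its file type has been loaded.

    This mirrors the behaviour described above: the file contents are
    never consulted, so every configured dataset is listed.
    """
    return sorted(
        name for name, file_type in configured.items()
        if file_type in loaded_file_types
    )


def available_by_content(configured, loaded_file_types, bands_in_header):
    """Additionally require that the band is listed in the file header."""
    candidates = available_by_file_type(configured, loaded_file_types)
    return [name for name in candidates if name in bands_in_header]


# Assumed example: a file whose header lists only three of the bands.
bands_in_header = {"HRV", "IR_108", "VIS006"}

all_names = available_by_file_type(CONFIGURED_DATASETS, {"native_msg"})
present = available_by_content(CONFIGURED_DATASETS, {"native_msg"}, bands_in_header)

print(len(all_names))  # all 12 configured names are reported
print(present)         # only the bands found in the header
```

With the file-type check alone, all twelve configured names are returned, which matches the output shown in the problem description. The second function filters that list against what the header reports, which is the idea behind the second solution.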

djhoese · 2018-12-01T18:50:09Z

As mentioned in the other issue, this is covered by #434. I'll close this in favor of that discussion.

adybbroe assigned mraspaud and djhoese Apr 25, 2018

sjoro mentioned this issue Nov 13, 2018

Allow readers to filter the available datasets configured in YAML #434

Closed

djhoese closed this as completed Dec 1, 2018
