Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SEP] Reader naming conventions #527

Closed
djhoese opened this issue Nov 29, 2018 · 4 comments · Fixed by #546
Closed

[SEP] Reader naming conventions #527

djhoese opened this issue Nov 29, 2018 · 4 comments · Fixed by #546
Assignees
Labels
backwards-incompatibility Causes backwards incompatibility or introduces a deprecation component:readers enhancement code enhancements, features, improvements help wanted PCW Pytroll Contributors' Week refactor
Milestone

Comments

@djhoese
Copy link
Member

djhoese commented Nov 29, 2018

This SatPy Enhancement Proposal [SEP] is to finalize reader naming conventions for all readers provided by SatPy. This issue is set for v0.11 which I think will be the "break all the readers and rename them" release.

I propose the following reader naming convention and rules:

Rules

  1. Readers should be named following the scheme:
[<file format>_]<sensor/algorithm>[_<processing/calibration level>[_<detailed suffix>]]
  1. Reader names should be all lowercase regardless of acronyms contained with fields separated by underscores where possible. Hyphens as inter-field separators will be considered on a case by case basis.
  2. Abbreviations should follow community standards:
  • NetCDF -> nc,
  • HDF 4 ->hdf
  • HDF 5 -> h5
  • Level 1B -> l1b
  • Level 2 -> l2
  • Sensor Data Record -> sdr
  • Environmental Data Record -> edr
  1. All naming scheme fields are optional except sensor/algorithm, but at least one of the optional fields should be provided.

  2. Future product expectations and community usage (colloquialisms) should also be considered in what naming fields are excluded.

    For example, the ABI L1B products stored in NetCDF4 files are the official NOAA product and as such shouldn't be provided in any other "official" format in the future by that name. This means that the reader can be called "abi_l1b" even though "nc_abi_l1b" would be a more detailed name. Other formats that provide similar data for ABI are known by and referred to by other names such as the "CMIP" format. In this case the reader could be called "cmip_abi_l1b" or "cmip_abi_l2".

  3. If a reader's format becomes obsolete, due to changes by the operational processing or satellite mission's organizers, and the new reader's name conflicts with the existing reader, the old reader's name should be prefixed with legacy_. The main point being that reader's of new formats are not new_ readers, but rather just "the reader" for that format.

Personal Opinion and Complications

If this naming scheme was only for computer/internal use then I'd say things like nc_abi_l1b would be preferred. Since these names need to be easily identifiable I think abi_l1b is more logical as a user friendly. I'm sure allowing the flexibility discussed in point 5 above will cause issues in the future, but I'm hoping it won't.

Additionally, we are going to have a lot of headaches where files are describes by the "scheme" they adopt and not by the format that they are. For example "SCMI" files are NetCDF4 files that follow CF standards (sort of). The name "SCMI" stands for Sectorized Cloud Moisture Imagery which isn't even what it is used for anymore. So does the reader that reads ABI SCMI data call itself the scmi_abi reader or the nc_abi reader or some combination of "nc", "scmi", "l1b", "abi", blah blah blah.

Examples

  1. abi_l1b
  2. hrit_ahi
  3. hsd_ahi
  4. viirs_sdr
  5. viirs_edr_flood
    The VIIRS EDR Flood Product
  6. geocat
    A reader for the output from the GeoCAT software. Could be one or more sensors and one or more processing levels.
  7. nc_goes
    Pre-GOES-16 satellite format
@djhoese djhoese added enhancement code enhancements, features, improvements help wanted component:readers refactor backwards-incompatibility Causes backwards incompatibility or introduces a deprecation PCW Pytroll Contributors' Week labels Nov 29, 2018
@djhoese djhoese added this to the v0.11 milestone Nov 29, 2018
@djhoese
Copy link
Member Author

djhoese commented Nov 30, 2018

As mentioned on slack, what if sensor/algorithm name went first?

  • viirs_sdr
  • abi_l1b
  • ahi_hsd

Then @mraspaud and @adybbroe said that file format would be compulsory/required which I'm not huge fan of because of how ugly some readers might get. Regardless, I think file format should go last if sensor/algorithm is first:

  • viirs_sdr_h5
  • abi_l1b_nc
  • ahi_hsd (not sure level is needed here)
  • geocat (handles multiple formats)

@djhoese
Copy link
Member Author

djhoese commented Dec 3, 2018

After a little more discussion on slack I think this is our new proposal:

<sensor/algorithm/processing-software>[_<processing/calibration level>[_<detailed suffix>]][_<file format>]

If the first field is a sensor then it must be followed by one of the other sections to avoid confusion and ambiguity. This applies to "sensor" specifically because there are likely many formats and processing levels for a single sensor but algorithms and processing-software packages typically have one type of output (clavr-x, geocat, nucaps, etc).

Bottom line: The reader name should be easy to identify by new users who may want to use the reader while also not conflicting with other readers (existing or future). Be reasonable.

Examples

  • viirs_sdr
  • abi_l1b
  • viirs_edr_flood
  • seviri_native
  • ahi_hsd
  • ahi_hrit
  • geocat
  • clavrx

Complications

The nc_goes reader is actually for a NetCDF format for the goes_imager sensor. We don't have many sensors with underscores in them. Should this sensor be changed to "imager" with the platform_name of "goes"? The final/new reader name would then be imager_nc which seems confusing. Do we use hyphens somewhere? @pytroll/core and @sfinkens thoughts?

Next, does viirs_compact include data outside than the SDR level? Should this be viirs_sdr_compact?

Next, safe_sar_c, similar to goes_imager the sensor name has a separator in it (hyphen in this case). Should this be sar_c_safe or something else?

Deprecation/Transition

A reasonable request by @pnuu is a transition from old reader names over the next couple releases. I say we add a reader name mapping dictionary in readers/__init__.py that does the mapping and the functions in that module that access reader names will provide a DeprecationWarning message when one of the old names is used. After maybe one or two releases we change this to an exception with explicit names mentioned, ex. "X -> Y". So after 3 or so releases (or 1.0, whichever comes first) then we remove this handling and reader name mapping.

@djhoese
Copy link
Member Author

djhoese commented Dec 3, 2018

Slack discussion on the complicated ones:

  • goes-imager_nc
  • sar-c_safe
  • mtsat2-imager_hrit
  • jami_hrit (even though it is on mtsat1)

Note: x-y doesn't mean that x is always the platform and y the sensor. It could be any separate used in the sensor name, processing software, or other information needed to distinguish the sensor from other sensors.

sfinkens added a commit to sfinkens/satpy that referenced this issue Dec 3, 2018
sfinkens added a commit to sfinkens/satpy that referenced this issue Dec 7, 2018
@djhoese
Copy link
Member Author

djhoese commented Dec 10, 2018

This morning I thought about this some more and was wondering if the generic_image reader should be renamed too? It is technically an experimental reader and is a special case at the least. However, it technically follows the above naming scheme is we consider "generic" the sensor name and "image" the abstract name for the format (tif, png, jpg, etc).

I hope to get working on this some time this week if other things go as planned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backwards-incompatibility Causes backwards incompatibility or introduces a deprecation component:readers enhancement code enhancements, features, improvements help wanted PCW Pytroll Contributors' Week refactor
Development

Successfully merging a pull request may close this issue.

4 participants