New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Datasets: consistent capitalization of band names #778
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect all caps here like I would use in an Enum (e.g. cv2.INTER_LINEAR
).
This isn't exactly an enum, it's a class attribute. Even then, we often override many of these class attributes at the instance level ( |
I'm saying it behaves like an Enum, so I would expect it to be all caps. Changing these to all caps doesn't mean every class attribute has to be capitalized. If that inconsistency is too much, then we could make it an enum, or just not capitalize it I guess. |
For now, I would just go with not capitalizing as in this PR. Since this seems to only occur in NonGeoDatasets, it shouldn't affect #687. Besides, the capitalization convention is for Example: class AllBands(Enum):
RED: 1
GREEN: 2
BLUE: 3
NIR: 4 There might be some value in representing bands as List of str (current)class Landsat8(Landsat):
"""Landsat 8 Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS)."""
filename_glob = "LC08_*_{}.*"
all_bands = [
"SR_B1",
"SR_B2",
"SR_B3",
"SR_B4",
"SR_B5",
"SR_B6",
"SR_B7",
"SR_B8",
"SR_B9",
"SR_B10",
"SR_B11",
]
rgb_bands = ["SR_B4", "SR_B3", "SR_B2"] vs Using Enumclass Landsat8(Landsat):
"""Landsat 8 Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS)."""
filename_glob = "LC08_*_{}.*"
class Landsat8Bands(Enum):
COASTAL_AEROSOL = 1
BLUE = 2
GREEN = 3
RED = 4
NIR = 5
SWIR1 = 6
SWIR2 = 7
PAN = 8
CIRRUS = 9
TIRS1 = 10
TIRS2 = 11
rgb_bands = [Landsat8Bands.RED, Landsat8Bands.GREEN, Landsat8Bands.BLUE] |
I thought @calebrob6 meant something like: class Landsat8Bands(Enum):
RGB = ["SR_B4", "SR_B3", "SR_B2"]
ALL = ["SR_B1", ..., "SR_B11"] The problem with that is it doesn't allow you to enumerate arbitary subsets/orders/combinations. |
Yes, that is what I meant. I don't care too much about capitalization vs. not capitalization -- generally, I want the choices here to be obvious to users. All caps band choices seems like a decent way to separate out the "interesting" class attributes vs. the instance attributes. Enums are probably the right way, but for now let's just go all lowercase. |
I'd rather not change this back and forth multiple times. There's no rush to merge this, but let's come to a consensus on the best way to do this before changing it. |
Kk, I would suggest:
|
Already the case I hope.
This is much harder. B01 in one dataset does not match a different dataset. Some datasets call it B01 and others call it B1. I'm not sure if it makes sense or is even possible to match band names across datasets.
I feel like this is the wrong use case for enums. Typically, enums are used when you want to limit the input to a specific subset of options, and you want to document those options with names instead of numbers. If the methods accept a list or an enum, that feels weird to me.
Really like this idea, but I'll need to think about how to do this properly. I don't love our blog where we use |
Hmm, I feel like this shouldn't be difficult... The requirements here are to expose the interesting subsets of band combinations to the user in a way that is easily discoverable right? My problem with including these as just another lower-case class attribute is that they aren't really easily discoverable.
I mean across datasets that use the same satellite / platform. E.g. we can easily remap "B1" to "B01" to standardize across datasets that use Sentinel 2 if we want.
Out of curiosity, what do style guides say about this? E.g. it doesn't seem wrong at all to me. |
Unfortunately, PEP-8 was written before Python supported Enums. |
Enums for band numbers would be super useful for our index transforms (e.g., NDVI). |
I think this defeats the purpose of an enum. Enums are supposed to be a list of all possible accepted values. If they are only some possible values, then they're basically variables. |
Maybe we can think about this in a different way -- currently, how should a user know which class variables in Maybe it is as simple as consistently providing |
I think consistently providing I'm thinking we should probably merge this PR as is and consider moving to enums in the future. Let me resolve the merge conflicts... |
1242711
to
bc2f475
Compare
Sounds good to me! @adamjstewart, meta question -- do you have a long term set of issues like this that you keep track of? |
Issues like what? I'm only pushing for this to be merged so I have one less open PR. Other than that, I just check https://github.com/microsoft/torchgeo/issues and https://github.com/microsoft/torchgeo/pulls once in a while. https://github.com/microsoft/torchgeo/milestones is used to track releases, and #931 (comment) is tracking things I'm focusing on for the next release. |
I mean high-level API things to think about for future development (e.g. "how should users know what options they have for band sets?") |
Ah, not really. I just define the API in the base class and try to stick to it. It should be documented in the base class. |
* Datasets: consistent capitalization of band names * Fix tests too
There seems to be a 50/50 mix of
RGB_BANDS
/ALL_BANDS
andrgb_bands
/all_bands
in our datasets. TheGeoDataset
base class uses lowercase, so this PR changes all other datasets to match. From what I can tell, PEP-8 doesn't seem to distinguish between class attributes and instance attributes, so I'm not sure if these are considered variables or global variables.@ashnair1 not sure if this affects #687