-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Read colormap and colorinterp from GeoTIFF as an attribute #414
Conversation
Cherry-picked from pydata/xarray#3136 Co-Authored-By: Alando Ballantyne <alando@radiant.earth>
@@ -285,6 +286,7 @@ def test_open_rasterio_mask_chunk_clip(): | |||
) | |||
assert attrs == { | |||
"add_offset": 0.0, | |||
"colorinterp": (rasterio.enums.ColorInterp.gray,), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need to be converted to either a string representation or the integer representation. The reason for this is to enable calling to_netcdf
without serialization issues.
Here is an example converting to the integer enumeration value:
>>> import rasterio
>>> rasterio.enums.ColorInterp.gray
<ColorInterp.gray: 1>
>>> rasterio.enums.ColorInterp.gray.value
1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative would be to read/write the photometric
creation option of the profile into/from the attributes. This is always in the string format and shouldn't have any serialization issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the colorinterp
enum can be serialized since it has an integer representation. It's the colormap
dict which I'm having trouble with. The colormap
dict looks something like this:
{0: (248, 224, 168, 255),
1: (240, 224, 152, 255),
2: (104, 104, 96, 255),
...
255: (248, 248, 248, 255)}
Do you think we should keep this colormap
as an xarray attr, or should it be loaded into the data_var as 4 bands (i.e. RGBA)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it is a dict, you could use json.dumps
& json.loads
to convert to/from a string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, the json serialization has been implemented in 54f9560. Had to do some extra magic because the dict keys have to be int
, but JSON only allows str
keys. I did try an alternative method using yaml in 5b07d7f as suggested by https://stackoverflow.com/questions/17099556/why-do-int-keys-of-a-python-dict-turn-into-strings-when-using-json-dumps#comment24737635_17099556, but maybe will stick with JSON.
FYI, this is how the xarray.Dataset looks like:
<xarray.DataArray (band: 1, y: 180, x: 360)>
array([[[251, 251, ..., 251, 251],
[251, 252, ..., 251, 252],
...,
[ 68, 16, ..., 68, 217],
[149, 149, ..., 149, 16]]], dtype=uint8)
Coordinates:
* band (band) int64 1
* x (x) float64 -179.5 -178.5 -177.5 -176.5 ... 177.5 178.5 179.5
* y (y) float64 89.5 88.5 87.5 86.5 ... -86.5 -87.5 -88.5 -89.5
spatial_ref int64 0
Attributes:
scale_factor: 1.0
add_offset: 0.0
colormap: {"0": [248, 224, 168, 255], "1": [240, 224, 152, 255], "2"...
colorinterp: (<ColorInterp.palette: 2>,)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we should keep this
colormap
as an xarray attr, or should it be loaded into the data_var as 4 bands (i.e. RGBA)?
Just to follow up on the data_var loading, I tried converting the earth_day_01d_p.tif
GeoTiff to NetCDF directly using gdal_translate -of NetCDF -expand rgba earth_day_01d_p.tif earth_day_01d_p.nc
.
This is the output of gdalinfo earth_day_01d_p.nc
Driver: netCDF/Network Common Data Format
Files: earth_day_01d_p.nc
Size is 512, 512
Metadata:
NC_GLOBAL#Conventions=CF-1.5
NC_GLOBAL#GDAL=GDAL 3.3.2, released 2021/09/01
NC_GLOBAL#GDAL_AREA_OR_POINT=Area
NC_GLOBAL#history=Sat Nov 06 23:41:47 2021: GDAL CreateCopy( earth_day_01d_p.nc, ... )
Subdatasets:
SUBDATASET_1_NAME=NETCDF:"earth_day_01d_p.nc":Band1
SUBDATASET_1_DESC=[180x360] Band1 (8-bit integer)
SUBDATASET_2_NAME=NETCDF:"earth_day_01d_p.nc":Band2
SUBDATASET_2_DESC=[180x360] Band2 (8-bit integer)
SUBDATASET_3_NAME=NETCDF:"earth_day_01d_p.nc":Band3
SUBDATASET_3_DESC=[180x360] Band3 (8-bit integer)
SUBDATASET_4_NAME=NETCDF:"earth_day_01d_p.nc":Band4
SUBDATASET_4_DESC=[180x360] Band4 (8-bit integer)
Corner Coordinates:
Upper Left ( 0.0, 0.0)
Lower Left ( 0.0, 512.0)
Upper Right ( 512.0, 0.0)
Lower Right ( 512.0, 512.0)
Center ( 256.0, 256.0)
So the bands have been expanded into NetCDF subdatasets. I then loaded earth_day_01d_p.nc
using rioxarray.open_rasterio, and it produces the following:
<xarray.Dataset>
Dimensions: (y: 180, x: 360, band: 1)
Coordinates:
* y (y) float64 89.5 88.5 87.5 86.5 85.5 ... -86.5 -87.5 -88.5 -89.5
* x (x) float64 -179.5 -178.5 -177.5 -176.5 ... 176.5 177.5 178.5 179.5
* band (band) int64 1
crs int64 0
Data variables:
Band1 (band, y, x) uint8 ...
Band2 (band, y, x) uint8 ...
Band3 (band, y, x) uint8 ...
Band4 (band, y, x) uint8 ...
Attributes:
Conventions: CF-1.5
GDAL: GDAL 3.3.2, released 2021/09/01
GDAL_AREA_OR_POINT: Area
history: Sat Nov 06 23:41:47 2021: GDAL CreateCopy( earth_day...
Notice the Band1-Band4 data variables. So I'm thinking there should be a way to mimic this structure. Need to find a way to use the colormap dict currently in the attr, and get these Bands 1/2/3/4 information out as data variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mind doing ncdump -h earth_day_01d_p.nc
and pasting the output here?
rioxarray/_io.py
Outdated
if hasattr(riods, "colormap"): | ||
# A dict containing the colormap for a band | ||
try: | ||
attrs["colormap"] = riods.colormap(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only gets the colormap for the first band. I imagine it would be good to handle this for all of the bands. docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, probably need to loop through all the bands, but need to think about https://github.com/corteva/rioxarray/pull/414/files#r722041137 first 🙂
#429 should help. |
Codecov Report
@@ Coverage Diff @@
## master #414 +/- ##
==========================================
+ Coverage 96.07% 96.11% +0.04%
==========================================
Files 13 13
Lines 1530 1546 +16
==========================================
+ Hits 1470 1486 +16
Misses 60 60
Continue to review full report at Codecov.
|
# write colorinterp and colormap | ||
# see https://rasterio.readthedocs.io/en/latest/topics/color.html | ||
try: | ||
raster_handle.colorinterp = tags["colorinterp"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mind using separate try/except blocks for colorinterp and colormap?
Closing because this PR has become stale. Feel free to re-open or submit a new PR if you would like to contribute this feature. |
Preliminary support to read in and write color interpretation information from GeoTIFF files. Partially based on @alando46's code at pydata/xarray#3136. Rasterio reference at https://rasterio.readthedocs.io/en/latest/topics/color.html
docs/history.rst
for all changes anddocs/rioxarray.rst
for new API