Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read colormap and colorinterp from GeoTIFF as an attribute #414

Closed
wants to merge 10 commits into from

Conversation

weiji14
Copy link

@weiji14 weiji14 commented Oct 2, 2021

Preliminary support to read in and write color interpretation information from GeoTIFF files. Partially based on @alando46's code at pydata/xarray#3136. Rasterio reference at https://rasterio.readthedocs.io/en/latest/topics/color.html

@weiji14 weiji14 changed the title Read colortable and colorinterp from GeoTIFF as an attribute Read colormap and colorinterp from GeoTIFF as an attribute Oct 2, 2021
@@ -285,6 +286,7 @@ def test_open_rasterio_mask_chunk_clip():
)
assert attrs == {
"add_offset": 0.0,
"colorinterp": (rasterio.enums.ColorInterp.gray,),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need to be converted to either a string representation or the integer representation. The reason for this is to enable calling to_netcdf without serialization issues.

Here is an example converting to the integer enumeration value:

>>> import rasterio
>>> rasterio.enums.ColorInterp.gray
<ColorInterp.gray: 1>
>>> rasterio.enums.ColorInterp.gray.value
1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be to read/write the photometric creation option of the profile into/from the attributes. This is always in the string format and shouldn't have any serialization issues.

Copy link
Author

@weiji14 weiji14 Oct 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the colorinterp enum can be serialized since it has an integer representation. It's the colormap dict which I'm having trouble with. The colormap dict looks something like this:

{0: (248, 224, 168, 255),
 1: (240, 224, 152, 255),
 2: (104, 104, 96, 255),
 ...
 255: (248, 248, 248, 255)}

Do you think we should keep this colormap as an xarray attr, or should it be loaded into the data_var as 4 bands (i.e. RGBA)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is a dict, you could use json.dumps & json.loads to convert to/from a string.

Copy link
Author

@weiji14 weiji14 Nov 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the json serialization has been implemented in 54f9560. Had to do some extra magic because the dict keys have to be int, but JSON only allows str keys. I did try an alternative method using yaml in 5b07d7f as suggested by https://stackoverflow.com/questions/17099556/why-do-int-keys-of-a-python-dict-turn-into-strings-when-using-json-dumps#comment24737635_17099556, but maybe will stick with JSON.

FYI, this is how the xarray.Dataset looks like:

<xarray.DataArray (band: 1, y: 180, x: 360)>
array([[[251, 251, ..., 251, 251],
        [251, 252, ..., 251, 252],
        ...,
        [ 68,  16, ...,  68, 217],
        [149, 149, ..., 149,  16]]], dtype=uint8)
Coordinates:
  * band         (band) int64 1
  * x            (x) float64 -179.5 -178.5 -177.5 -176.5 ... 177.5 178.5 179.5
  * y            (y) float64 89.5 88.5 87.5 86.5 ... -86.5 -87.5 -88.5 -89.5
    spatial_ref  int64 0
Attributes:
    scale_factor:  1.0
    add_offset:    0.0
    colormap:      {"0": [248, 224, 168, 255], "1": [240, 224, 152, 255], "2"...
    colorinterp:   (<ColorInterp.palette: 2>,)

Copy link
Author

@weiji14 weiji14 Nov 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should keep this colormap as an xarray attr, or should it be loaded into the data_var as 4 bands (i.e. RGBA)?

Just to follow up on the data_var loading, I tried converting the earth_day_01d_p.tif GeoTiff to NetCDF directly using gdal_translate -of NetCDF -expand rgba earth_day_01d_p.tif earth_day_01d_p.nc.

This is the output of gdalinfo earth_day_01d_p.nc

Driver: netCDF/Network Common Data Format
Files: earth_day_01d_p.nc
Size is 512, 512
Metadata:
  NC_GLOBAL#Conventions=CF-1.5
  NC_GLOBAL#GDAL=GDAL 3.3.2, released 2021/09/01
  NC_GLOBAL#GDAL_AREA_OR_POINT=Area
  NC_GLOBAL#history=Sat Nov 06 23:41:47 2021: GDAL CreateCopy( earth_day_01d_p.nc, ... )
Subdatasets:
  SUBDATASET_1_NAME=NETCDF:"earth_day_01d_p.nc":Band1
  SUBDATASET_1_DESC=[180x360] Band1 (8-bit integer)
  SUBDATASET_2_NAME=NETCDF:"earth_day_01d_p.nc":Band2
  SUBDATASET_2_DESC=[180x360] Band2 (8-bit integer)
  SUBDATASET_3_NAME=NETCDF:"earth_day_01d_p.nc":Band3
  SUBDATASET_3_DESC=[180x360] Band3 (8-bit integer)
  SUBDATASET_4_NAME=NETCDF:"earth_day_01d_p.nc":Band4
  SUBDATASET_4_DESC=[180x360] Band4 (8-bit integer)
Corner Coordinates:
Upper Left  (    0.0,    0.0)
Lower Left  (    0.0,  512.0)
Upper Right (  512.0,    0.0)
Lower Right (  512.0,  512.0)
Center      (  256.0,  256.0)

So the bands have been expanded into NetCDF subdatasets. I then loaded earth_day_01d_p.nc using rioxarray.open_rasterio, and it produces the following:

<xarray.Dataset>
Dimensions:  (y: 180, x: 360, band: 1)
Coordinates:
  * y        (y) float64 89.5 88.5 87.5 86.5 85.5 ... -86.5 -87.5 -88.5 -89.5
  * x        (x) float64 -179.5 -178.5 -177.5 -176.5 ... 176.5 177.5 178.5 179.5
  * band     (band) int64 1
    crs      int64 0
Data variables:
    Band1    (band, y, x) uint8 ...
    Band2    (band, y, x) uint8 ...
    Band3    (band, y, x) uint8 ...
    Band4    (band, y, x) uint8 ...
Attributes:
    Conventions:         CF-1.5
    GDAL:                GDAL 3.3.2, released 2021/09/01
    GDAL_AREA_OR_POINT:  Area
    history:             Sat Nov 06 23:41:47 2021: GDAL CreateCopy( earth_day...

Notice the Band1-Band4 data variables. So I'm thinking there should be a way to mimic this structure. Need to find a way to use the colormap dict currently in the attr, and get these Bands 1/2/3/4 information out as data variables.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind doing ncdump -h earth_day_01d_p.nc and pasting the output here?

rioxarray/_io.py Outdated
if hasattr(riods, "colormap"):
# A dict containing the colormap for a band
try:
attrs["colormap"] = riods.colormap(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only gets the colormap for the first band. I imagine it would be good to handle this for all of the bands. docs

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, probably need to loop through all the bands, but need to think about https://github.com/corteva/rioxarray/pull/414/files#r722041137 first 🙂

@snowman2
Copy link
Member

snowman2 commented Nov 8, 2021

#429 should help.

@codecov
Copy link

codecov bot commented Nov 8, 2021

Codecov Report

Merging #414 (edc9972) into master (1755f90) will increase coverage by 0.04%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #414      +/-   ##
==========================================
+ Coverage   96.07%   96.11%   +0.04%     
==========================================
  Files          13       13              
  Lines        1530     1546      +16     
==========================================
+ Hits         1470     1486      +16     
  Misses         60       60              
Impacted Files Coverage Δ
rioxarray/_io.py 95.21% <100.00%> (+0.09%) ⬆️
rioxarray/raster_writer.py 91.50% <100.00%> (+0.69%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1755f90...edc9972. Read the comment docs.

# write colorinterp and colormap
# see https://rasterio.readthedocs.io/en/latest/topics/color.html
try:
raster_handle.colorinterp = tags["colorinterp"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind using separate try/except blocks for colorinterp and colormap?

@snowman2
Copy link
Member

Closing because this PR has become stale. Feel free to re-open or submit a new PR if you would like to contribute this feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants