Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cogeo create and erroneous values for certain 16bit datasets #112

Closed
pierotofy opened this issue Dec 19, 2019 · 9 comments
Closed

cogeo create and erroneous values for certain 16bit datasets #112

pierotofy opened this issue Dec 19, 2019 · 9 comments

Comments

@pierotofy
Copy link
Contributor

pierotofy commented Dec 19, 2019

I'm opening this to document my findings, I'm not sure it's even a problem with rio-cogeo per-se, it seems to go down all the way to rasterio (1.1.0) and perhaps GDAL (2.4.3).

Input (5 bands + alpha 16bit GeoTIFF): odm_orthophoto.zip

image

Band 1 Block=512x512 Type=UInt16, ColorInterp=Red
Band 2 Block=512x512 Type=UInt16, ColorInterp=Green
Band 3 Block=512x512 Type=UInt16, ColorInterp=Blue
Band 4 Block=512x512 Type=UInt16, ColorInterp=Gray
Band 5 Block=512x512 Type=UInt16, ColorInterp=Gray
Band 6 Block=512x512 Type=UInt16, ColorInterp=Alpha

Command:

rio cogeo create odm_orthophoto.tif out.tif

Output:

image

Band 1 Block=512x512 Type=UInt16, ColorInterp=Gray  Overviews: 1000x845, 500x423
Band 2 Block=512x512 Type=UInt16, ColorInterp=Undefined  Overviews: 1000x845, 500x423
Band 3 Block=512x512 Type=UInt16, ColorInterp=Undefined  Overviews: 1000x845, 500x423
Band 4 Block=512x512 Type=UInt16, ColorInterp=Undefined  Overviews: 1000x845, 500x423
Band 5 Block=512x512 Type=UInt16, ColorInterp=Undefined  Overviews: 1000x845, 500x423
Band 6 Block=512x512 Type=UInt16, ColorInterp=Undefined  Overviews: 1000x845, 500x423

Raster values seem to have been downscaled (?) or overflowed (?) in the process:

image

Alpha band is out of whack too, but I won't worry about that for now.

I've narrowed this down to https://github.com/cogeotiff/rio-cogeo/blob/master/rio_cogeo/cogeo.py#L210. Once the WarpedVRT object is created, read operations on the virtual raster return erroneous values (they seem scaled down, eg. 24933 --> 97, which seems linear from 65535 to 255, but other values are still in the 65535 range, so it makes little sense). The data types are correct, although we've lost ColorInterp information (perhaps another problem).

This might be a problem with rasterio, or even GDAL? If I find a solution I will open a PR.

@pierotofy pierotofy changed the title cogeo create and downscaled values for certain 16bit datasets cogeo create and erroneous values for certain 16bit datasets Dec 19, 2019
@vincentsarago
Copy link
Member

thanks @pierotofy this is an interesting finding and yes might be a rasterio one, I'll have a look!

@vincentsarago
Copy link
Member

import rasterio
from rasterio.vrt import WarpedVRT

from rio_cogeo.utils import (
    has_alpha_band,
    has_mask_band,
)


with rasterio.open("odm_orthophoto.tif") as src_dst:
    b1 = src_dst.read(indexes=1)
    print(src_dst.colorinterp)
    print(src_dst.meta)
    print()
    alpha = has_alpha_band(src_dst)
    mask = has_mask_band(src_dst)
    nodata = src_dst.nodata

    with WarpedVRT(src_dst, add_alpha=False) as vrt_dst:
        print(vrt_dst.colorinterp)
        print(vrt_dst.meta)
        b1vrt = vrt_dst.read(indexes=1)
        

print()
print(b1.mean())
print(b1vrt.mean())

(<ColorInterp.red: 3>, <ColorInterp.green: 4>, <ColorInterp.blue: 5>, <ColorInterp.gray: 1>, <ColorInterp.gray: 1>, <ColorInterp.alpha: 6>)
{'driver': 'GTiff', 'dtype': 'uint16', 'nodata': None, 'width': 1999, 'height': 1690, 'count': 6, 'crs': CRS.from_epsg(32634), 'transform': Affine(0.059693290743631604, 0.0, 530112.9277083931,
       0.0, -0.059676333155328706, 5647899.369585016)}

(<ColorInterp.red: 3>, <ColorInterp.green: 4>, <ColorInterp.blue: 5>, <ColorInterp.gray: 1>, <ColorInterp.gray: 1>, <ColorInterp.alpha: 6>)
{'driver': 'VRT', 'dtype': 'uint16', 'nodata': None, 'width': 1999, 'height': 1690, 'count': 6, 'crs': CRS.from_epsg(32634), 'transform': Affine(0.059686223049270834, 0.0, 530112.9277083931,
       0.0, -0.059686223049270834, 5647899.369585016)}

14045.552881766327
4016.9519549715687

I've seen this kind of issues before (related to MASK in GDAL, let me find the github issue)

for the colorinterp it's a rio-cogeo bug (🤦‍♂) because it's not passed to https://github.com/cogeotiff/rio-cogeo/blob/master/rio_cogeo/cogeo.py#L238

cc @pierotofy

@vincentsarago
Copy link
Member

@pierotofy rasterio/rasterio#1454 was the issue I had in mind

@vincentsarago
Copy link
Member

this is definitely in rasterio.vrt.WarpedVRT + Alpha side

import rasterio
from rasterio.vrt import WarpedVRT

with rasterio.open("odm_orthophoto.tif") as src_dst:
    b1 = src_dst.read(indexes=1)
    print(src_dst.colorinterp)
    print(src_dst.meta)
    print()
 
    with WarpedVRT(src_dst, add_alpha=False) as vrt_dst:
        print(vrt_dst.colorinterp)
        print(vrt_dst.meta)
        b1vrt = vrt_dst.read(indexes=1)
        

print()
print(b1[1000:1110, 1000:1110])
print(b1vrt[1000:1110, 1000:1110])


[[14924 17096 19449 ... 18205 18394 18723]
 [15937 17416 18572 ... 21579 21171 20516]
 [16586 17149 17860 ... 23722 24701 23123]
 ...
 [18012 19048 18469 ... 14932 16023 13568]
 [16784 18762 20572 ... 12887 15288 15703]
 [16011 16894 19262 ... 13533 15151 16618]]

[[58 67 76 ... 71 72 73]
 [62 68 72 ... 84 82 80]
 [65 67 69 ... 92 96 90]
 ...
 [70 74 72 ... 58 62 53]
 [65 73 80 ... 50 59 61]
 [62 66 75 ... 53 59 65]]

When forcing nodata, the values are then 👌

import rasterio
from rasterio.vrt import WarpedVRT

with rasterio.open("odm_orthophoto.tif") as src_dst:
    b1 = src_dst.read(indexes=1)
    print(src_dst.colorinterp)
    print(src_dst.meta)
    print()
 
    with WarpedVRT(src_dst, add_alpha=False, dtype="int16", nodata=-9999) as vrt_dst:
        print(vrt_dst.colorinterp)
        print(vrt_dst.meta)
        b1vrt = vrt_dst.read(indexes=1)
        

print()
print(b1[1000:1110, 1000:1110])
print(b1vrt[1000:1110, 1000:1110])


[[14924 17096 19449 ... 18205 18394 18723]
 [15937 17416 18572 ... 21579 21171 20516]
 [16586 17149 17860 ... 23722 24701 23123]
 ...
 [18012 19048 18469 ... 14932 16023 13568]
 [16784 18762 20572 ... 12887 15288 15703]
 [16011 16894 19262 ... 13533 15151 16618]]

[[14924 17096 19449 ... 18205 18394 18723]
 [15937 17416 18572 ... 21579 21171 20516]
 [16586 17149 17860 ... 23722 24701 23123]
 ...
 [18012 19048 18469 ... 14932 16023 13568]
 [16784 18762 20572 ... 12887 15288 15703]
 [16011 16894 19262 ... 13533 15151 16618]]

@vincentsarago
Copy link
Member

The problem is always the same, when dealing with non byte 3 band+alpha dataset, GDAL seems to don't like it. This might need a ping to GDAL dev directly.

Past discussion:
OSGeo/gdal#742
https://trac.osgeo.org/gdal/wiki/rfc15_nodatabitmask

@pierotofy For now I think your solution would be to not use alpha band and switch to nodata value if possible

@pierotofy
Copy link
Contributor Author

pierotofy commented Dec 20, 2019

Thanks so much for the detailed findings! Setting a nodata value indeed provides a workaround for my use case.

Should I try to create a PR for rio create? Something of the lines:

if dtype != byte and num bands >= 3 and alpha and nodata is None:
    set nodata = 0
    warning("Attention")

@vincentsarago
Copy link
Member

you should use nodata instead of alpha when creating the file. Setting nodata value to 0 might have unexpected results when user provide an alpha band so I guess we should try to pursue this into GDAL/Rasterio to have a proper fix there

@vincentsarago
Copy link
Member

@pierotofy I'me going to close this issue here because it's not a rio-cogeo one but a rasterio/gdal one.

@pierotofy
Copy link
Contributor Author

Sounds good, thanks for looking into it. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants