statistics not excluding nodata correctly v4.1.9 - Need PER BAND Nodata Mask #579
Comments
Can you share the file, @AndrewAnnex?
@vincentsarago I can, but it's a bit larger than the 50 MB file limit. I also suspect there may be other issues with the data, which makes me want to close this issue for now. It's hyperspectral data, so there may be some very large values that should be considered nodata but are not technically set to the nodata value. I made progress on a partial workaround by using the expression syntax to exclude values, but it didn't work consistently because of those large values. In any case, I can try to share the file next week, but I think it's fair to close this issue.
@vincentsarago here's a URL to a file exhibiting the issue on band 16: http://murray-lab.caltech.edu/temp/annex/hrl0001fc92_07_sr182j_mtr3.tif (~53 MB). Using titiler I get the following stats on that band:
and I get reasonable values from gdalinfo:
Interestingly, I am just noticing how the valid percents differ here.
I think I know what's going on! This is a mask problem. In rio-tiler we get the and for some reason we don't get the expected result. 👇 difference between dataset_mask and the nodata mask I always preferred the
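To illustrate the distinction raised in the comment above, here is a minimal numpy-only sketch. It assumes, as rasterio documents for datasets with a nodata value, that the dataset-wide mask is the OR of the per-band masks, so a pixel counts as valid when any band is valid there. The array and variable names are hypothetical, and this is not rio-tiler's actual code.

```python
import numpy as np

NODATA = 65535.0  # nodata value reported in this issue

# Hypothetical 2-band raster with shape (bands, rows, cols).
data = np.array(
    [
        [[0.1, 0.2, NODATA, NODATA]],
        [[0.3, NODATA, 0.5, NODATA]],
    ],
    dtype=np.float32,
)

# Per-band validity: True where that band holds real data.
band_valid = data != NODATA

# Assumed dataset-wide validity: a pixel is valid if ANY band is valid.
dataset_valid = band_valid.any(axis=0)

band2 = data[1]
# Masking band 2 with the dataset-wide mask leaves one nodata pixel unmasked.
with_dataset_mask = np.ma.array(band2, mask=~dataset_valid)
# Masking band 2 with its own mask hides every nodata pixel in that band.
with_band_mask = np.ma.array(band2, mask=~band_valid[1])

print("dataset mask:", with_dataset_mask.count(), "valid, max", with_dataset_mask.max())
print("band mask:   ", with_band_mask.count(), "valid, max", with_band_mask.max())
```

Under that assumption, a pixel that is nodata in one band but valid in another stays unmasked, which would let 65535 leak into that band's maximum, standard deviation, and histogram.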
Ah, okay, that makes sense now. You'd be more aware of the ramifications of changing the default behavior than me, but maybe using the nodata mask could be optional?
@AndrewAnnex, sadly no because rio-tiler deals with other
I am using the statistics endpoint in titiler on a set of multiband float32 GeoTIFFs with nodata values of 65535, but the statistics include the nodata value as valid pixels, throwing off the histograms, percentiles, max value, majority, and standard deviation. Valid pixel counts and masked pixel counts seem correct, so something weird is going on either with my data or with rio-tiler.
Running gdalinfo -stats or -hist on the same files computes correct metadata per band as expected (within the range -1 to 1), so it looks like GDAL is doing the right thing in terms of excluding nodata. My guess is that nodata values are being handled differently between GDAL and the numpy implementation here.
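As a rough illustration of the symptom described above, the following sketch compares statistics computed over every pixel with statistics computed over a numpy masked array that excludes the nodata value. The sample values are made up, and this is only a sketch of the general technique, not what titiler or GDAL run internally.

```python
import numpy as np

NODATA = 65535.0  # nodata value reported in this issue

# Hypothetical float32 band: valid values in [-1, 1] plus two nodata pixels.
band = np.array(
    [[0.10, 0.25, NODATA], [-0.40, NODATA, 0.80]],
    dtype=np.float32,
)

# Naive statistics treat nodata as an ordinary value.
naive_max = float(band.max())

# Masking the nodata value first keeps it out of every statistic.
masked = np.ma.masked_equal(band, NODATA)
valid_max = float(masked.max())
valid_mean = float(masked.mean())
valid_percent = 100.0 * masked.count() / masked.size

print(f"naive max:     {naive_max}")
print(f"valid max:     {valid_max:.2f}")
print(f"valid mean:    {valid_mean:.4f}")
print(f"valid percent: {valid_percent:.2f}")
```

If the maximum reported for a band equals the nodata value while the valid pixel count looks right, that suggests the mask used for counting is not the one applied when the statistics are computed.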