Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

convert REMA data to cloud optimized geotiff #1

Open
rabernat opened this issue Mar 19, 2019 · 13 comments
Open

convert REMA data to cloud optimized geotiff #1

rabernat opened this issue Mar 19, 2019 · 13 comments

Comments

@rabernat
Copy link
Member

rabernat commented Mar 19, 2019

Cloud optimized geotiff is an image format optimized for high-throughput reading via the web: http://www.cogeo.org/

The REMA data is not in COG format. A first step towards making it work efficiently in pangeo is to convert it.

I played around with this a bit, starting by downloading this sample data

wget http://data.pgc.umn.edu/elev/dem/setsm/REMA/mosaic/v1.1/8m/09_41/09_41_8m.tar.gz
tar -xvzf 09_41_8m.tar.gz

To create a cogs and stuff, you need to install the rasterio cog extension. One you have it, you can examine the file metadata.

$ rio info 09_41_8m_dem.tif
{"bounds": [1000000.0, -2200000.0, 1100000.0, -2100000.0], "colorinterp": ["grey"], "compress": "lzw", "count": 1, "crs": "EPSG:3031", "descriptions": [null], "driver": "GTiff", "dtype": "float32", "height": 12500, "indexes": [1], "interleave": "band", "lnglat": [153.9704078084865, -68.23346907615766], "mask_flags": [["nodata"]], "nodata": -9999.0, "res": [8.0, 8.0], "shape": [12500, 12500], "tiled": false, "transform": [8.0, 0.0, 1000000.0, 0.0, -8.0, -2100000.0, 0.0, 0.0, 1.0], "units": [null], "width": 12500} 

This is not a cog. We can make one like this

$ rio cogeo 09_41_8m_dem.tif 09_41_8m_dem_COG_RAW.tif --cog-profile raw
$ rio info 09_41_8m_dem_COG_RAW.tif
{"blockxsize": 512, "blockysize": 512, "bounds": [1000000.0, -2200000.0, 1100000.0, -2100000.0], "colorinterp": ["grey"], "count": 1, "crs": "EPSG:3031", "descriptions": [null], "driver": "GTiff", "dtype": "float32", "height": 12500, "indexes": [1], "interleave": "band","lnglat": [153.9704078084865, -68.23346907615766], "mask_flags": [["per_dataset"]], "nodata": null, "res": [8.0, 8.0], "shape": [12500,12500], "tiled": true, "transform": [8.0, 0.0, 1000000.0, 0.0, -8.0, -2100000.0, 0.0, 0.0, 1.0], "units": [null], "width": 12500}

This is tiled (see blocksize) but with no compression. It's better to use compression:

$ rio cogeo 09_41_8m_dem.tif 09_41_8m_dem_COG_LZW.tif --cog-profile lzw
$ rio info 09_41_8m_dem_COG_LZW.tif
{"blockxsize": 512, "blockysize": 512, "bounds": [1000000.0, -2200000.0, 1100000.0, -2100000.0], "colorinterp": ["grey"], "compress": "lzw", "count": 1, "crs": "EPSG:3031", "descriptions": [null], "driver": "GTiff", "dtype": "float32", "height": 12500, "indexes": [1], "interleave": "band", "lnglat": [153.9704078084865, -68.23346907615766], "mask_flags": [["per_dataset"]], "nodata": null, "res": [8.0, 8.0], "shape": [12500, 12500], "tiled": true, "transform": [8.0, 0.0, 1000000.0, 0.0, -8.0, -2100000.0, 0.0, 0.0, 1.0], "units": [null], "width": 12500}

Let's compare the sizes of the files

$ ls -lh 09_41_8m_dem*.tif
-rw-rw-r-- 1 jovyan users 290M Feb 13 03:20 09_41_8m_dem_COG_LZW.tif
-rw-rw-r-- 1 jovyan users 865M Feb 21 21:38 09_41_8m_dem_COG_RAW.tif
-rw-rw---- 1 jovyan users 228M Oct 26 15:09 09_41_8m_dem.tif

We see that the LSW compressed COG has the same size as the original file, but it is MUCH more useful!

@jjspergel
Copy link
Collaborator

Is the code missing the command "create"?

@rabernat
Copy link
Member Author

Hi julian? You create a cog as follows:

$ rio cogeo 09_41_8m_dem.tif 09_41_8m_dem_COG_LZW.tif --cog-profile lzw

@jjspergel
Copy link
Collaborator

Hi Ryan, Is it possible that this could be a Mac/PC issue? My command prompt doesn't recognize what you've written as is as a valid command because the filename is not considered a command. I'm able to create a cog with:
rio cogeo create 38_49_8m_dem.tif 38_49_8m_dem_COG_LZW.tif --cog-profile lzw
Julian

@rabernat
Copy link
Member Author

Did you install the cog extension?

https://github.com/cogeotiff/rio-cogeo

@jjspergel
Copy link
Collaborator

jjspergel commented Mar 20, 2019 via email

@rabernat
Copy link
Member Author

Ok I think I see what is going on. The syntax has actually changed since I first installed and ran these commands:

https://github.com/cogeotiff/rio-cogeo/blob/master/CHANGES.txt

The following changes were introduced in the version released on March 15

1.0b0 (2019-03-15)
------------------
- add more logging and `--quiet` option (#46)
- add `--overview-blocksize` to set overview's internal tile size (#60)

Bug fixes:

- copy tags and description from input to output (#19)
- copy input mask band to output mask

Breacking Changes:

- rio cogeo now has subcommands: 'create' and 'validate' (#6).
- internal mask creation is now optional (--add-mask).
- internal nodata or alpha channel can be forwarded to the output dataset.
- removed default overview blocksize to be equal to the raw data blocksize (#60)

@jjspergel
Copy link
Collaborator

jjspergel commented Mar 20, 2019 via email

@jkingslake
Copy link
Collaborator

Hi,
I got the geo-scipy environment installed thanks to your online course materials Ryan.
I am now running the code that Julian sent me:
forfiles /s /m *_dem.tif /c "cmd /c rio cogeo create @file @fname_COG_LZW.tif --cog-profile lzw
and it seems to be working very nicely and producing COGs for each DEM tif. I should also remember to do this for all the DAY tiffs (the ones showing which day the each pixel corresponds to). (i know i could have done this in one step, but I didnt think about it and now its running!)
Jonny

@jkingslake
Copy link
Collaborator

One question: I am getting the following warning. Does this matter do you think?
WARNING:rasterio._env:CPLE_NotSupported in driver GTiff does not support creation option BIGTIFF

The COGs open in arcmap OK and everything looks good.

@rabernat
Copy link
Member Author

To double check that the conversion is working, try running:

rio info <filename>

and examine the output

@jkingslake
Copy link
Collaborator

jkingslake commented Mar 29, 2019

Looks good to me:
J:\REMA_Mosaic_8m_v1.1\8m\09_38>rio info 09_38_8m_dem_COG_LZW.tif
{"blockxsize": 512, "blockysize": 512, "bounds": [700000.0, -2200000.0, 800000.0, -2100000.0], "colorinterp": ["gray"], "compress": "lzw", "count": 1, "crs": "EPSG:3031", "descriptions": [null], "driver": "GTiff", "dtype": "float32", "height": 12500, "indexes": [1], "interleave": "band", "lnglat": [160.76932762433867, -69.26294588007902], "mask_flags": [["nodata"]], "nodata": -9999.0, "res": [8.0, 8.0], "shape": [12500, 12500], "tiled": true, "transform": [8.0, 0.0, 700000.0, 0.0, -8.0, -2100000.0, 0.0, 0.0, 1.0], "units": [null], "width": 12500}

Strange thing is that when I get info on many of the original files they also apparently have a block size, e.g.:
(geo_scipy) J:\12_49_8m>rio info 12_49_8m_dem.tif
{"blockxsize": 256, "blockysize": 256, "bounds": [1800000.0, -1900000.0, 1900000.0, -1800000.0], "colorinterp": ["gray"], "compress": "lzw", "count": 1, "crs": "EPSG:3031", "descriptions": [null], "driver": "GTiff", "dtype": "float32", "height": 12500, "indexes": [1], "interleave": "band", "lnglat": [134.99999999999997, -66.2526713074506], "mask_flags": [["nodata"]], "nodata": -9999.0, "res": [8.0, 8.0], "shape": [12500, 12500], "tiled": true, "transform": [8.0, 0.0, 1800000.0, 0.0, -8.0, -1800000.0, 0.0, 0.0, 1.0], "units": [null], "width": 12500}

I don't see this in the one you originally looked at though:
(geo_scipy) J:\REMA_Mosaic_8m_v1.1\8m\09_41>rio info 09_41_8m_dem.tif
{"bounds": [1000000.0, -2200000.0, 1100000.0, -2100000.0], "colorinterp": ["gray"], "compress": "lzw", "count": 1, "crs": "EPSG:3031", "descriptions": [null], "driver": "GTiff", "dtype": "float32", "height": 12500, "indexes": [1], "interleave": "band", "lnglat": [153.9704078084865, -68.23346907615766], "mask_flags": [["nodata"]], "nodata": -9999.0, "res": [8.0, 8.0], "shape": [12500, 12500], "tiled": false, "transform": [8.0, 0.0, 1000000.0, 0.0, -8.0, -2100000.0, 0.0, 0.0, 1.0], "units": [null], "width": 12500}

@rabernat
Copy link
Member Author

ok that's an important catch! Some of these may already be COGs!

Reading the cogeo docs (https://github.com/cogeotiff/rio-cogeo) it looks like there is also a validate command

rio cogeo validate <filename>

That is probably better than rio info.

@jkingslake
Copy link
Collaborator

OK, yes validate is the better one to use.
Some files don't return a blocksize when using "rio info"--> not COGs
Some files DO return a blocksize when using "rio info", but isnt recognized as a COG by 'validate"
Some files DO return a blocksize when using "rio info", ARE recognized as a COG by 'validate", but don't have internal overviews, which is recommended apparently.

So overall, going through them all and running this conversion seems to be the safest things to do.

Thanks for your help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants