Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let multi-layer geotiff files be cut via gdal_translate #5819

Merged
merged 11 commits into from
Oct 5, 2021
Merged

Conversation

PaulWessel
Copy link
Member

While GMT can handle the cutting of images (including those straddling periodic boundaries) it cannot (yet) natively cut subsections out of multilayer data geotiff files. This PR allows such usage via grdcut by passing the baton to gdal_translate for getting subsections of such data files. The assumption is that such files are not global periodic files so this sophistication may not be needed in GMT. I updated the grdcut docs to list this capability as well.

I made a few other changes for the future:

  1. If image type gets assigned to be float or double I allocate the right thing (gmt_api.c)
  2. If image is more than one band then I do not compute the min/max at all (gmt_gdalread.c)
  3. For float layers the code hardwires for a single layer. While that is unchanged, I am adding a layer offset so that if this every change then the layers are not overwriting the same spot in the array (gmt_gdalread.c)
  4. Correctly remember the data type if float or double and pass the correct float or int pointer to gdalread (gmt_grdio.c).

No changes to gmt_gdalwrite.c at this time since no multi-layer float images will be sent to it.

While GMT can handle the cutting of images (including those straddling periodic boundaries) it cannot (yet) natively cut subsections out of multilayer data geotiff files. This PR allows such usage via grdcut by passing the baton to gdal_translate for getting subsections of such data files.
@PaulWessel PaulWessel added the enhancement Improving an existing feature label Sep 29, 2021
@PaulWessel PaulWessel self-assigned this Sep 29, 2021
@joa-quim
Copy link
Member

You need to use some create options otherwise the GeoTIFF files are uncompressed and huge. For floats maybe save in netCDF. See this function for the defaults I use in Julia

https://github.com/GenericMappingTools/GMT.jl/blob/master/src/gdal_tools.jl#L214

@PaulWessel
Copy link
Member Author

OK, since this is geotiff's only I added compression.

@joa-quim
Copy link
Member

joa-quim commented Oct 1, 2021

OK, since this is geotiff's only I added compression.

Right but notice that noting stops calling the output file lixo.nc and it become a geotiff disguised of netcdf.

@PaulWessel
Copy link
Member Author

That is true. But not different from surface data -Goutput.txt or -Gshit.jpg, no? You get what you deserve.

Yet, I can certainly add a check that the output file has to have .tif or tiff as well in this case. Note that if someone used +b to specify band 8 only of that geotiff then grdcut will obtain a grid instead of write a grid.

@PaulWessel
Copy link
Member Author

The latest commits adds more checks and documentation:

  1. The documentation clarifies what happens on output when you give a multiband geotiff
  2. I parse any band requests and build the proper gdal_translate command if bands are specified
  3. I check that multiband geotiff output has a filename that ends in .tif
  4. I allow a single band to be written as a geotiff file instead of a grid if the output file ends in .tif

Example commands that work for me to illustrate the usage:

Grab layers 6,7,9 from a geotiff and write a subset to another geotiff:
gmt grdcut sentinel_mosaic.tif+b5,6,8 -R-162.6/-162/61.25/61.5 -Gcut.tif

Select a single band from a geotiff and write a subset as a grid
gmt grdcut sentinel_mosaic.tif+b5 -R-162.6/-162/61.25/61.5 -Gcut.grd

Select a single band from a geotiff and write a subset as another single-band geotiff
gmt grdcut sentinel_mosaic.tif+b5 -R-162.6/-162/61.25/61.5 -Gcut.tif

@joa-quim
Copy link
Member

joa-quim commented Oct 2, 2021

Just to clarify one point. Modern GDALs pick the file format based on file extensions, so if extracting a float from one of those bands it would not be unreasonable to select a .nc format, but it would request more work to not let it go with the poor defaults of no-create-options. I.e. .nc with no compression.

@PaulWessel
Copy link
Member Author

Just to clarify one point. Modern GDALs pick the file format based on file extensions, so if extracting a float from one of those bands it would not be unreasonable to select a .nc format, but it would request more work to not let it go with the poor defaults of no-create-options. I.e. .nc with no compression.

Right. But if you select a single float band and do not give a .tif extension, then we read in the layer as a grid and now the subset is written via gmt_nc.c and compression/tiling as per GMT settings.

Are you concerned that someone might select 5 bands and give an output file as out.nc and if so we should pass a different driver etc. There is no netCDF inside a geotiff, right? I think writing multibands requires .tiff, no?

And update the docs to say so, and make the wesn_new check more robust for longitudes.
@joa-quim
Copy link
Member

joa-quim commented Oct 2, 2021

Are you concerned that someone might select 5 bands and give an output file as out.nc and if so we should pass a different driver etc. There is no netCDF inside a geotiff, right? I think writing multibands requires .tiff, no?

Not concerned but having that possibility would be more generic. Multibands can be in netcdf too. For example this reads a multiband UInt16 in GeoTIFF, computes the radiance at the Top of Atmosphere (floats) and save it as a floats multiband in netcdf.

julia> dn2radiance("c:/v/LC08_L1TP_20210525_02_cube.tiff", save="cube_rad.nc");
gdalinfo cube_rad.nc
Driver: netCDF/Network Common Data Format
Files: cube_rad.nc
Size is 512, 512
Metadata:
  NC_GLOBAL#Conventions=CF-1.5
  NC_GLOBAL#GDAL=GDAL 3.4.0dev, released 2021/99/99
  NC_GLOBAL#history=Sat Oct 02 01:56:11 2021: GDAL CreateCopy( cube_rad.nc, ... )
Subdatasets:
  SUBDATASET_1_NAME=NETCDF:"cube_rad.nc":Band1
  SUBDATASET_1_DESC=[1567x1519] Band1 (32-bit floating-point)
  SUBDATASET_2_NAME=NETCDF:"cube_rad.nc":Band2
  SUBDATASET_2_DESC=[1567x1519] Band2 (32-bit floating-point)
  SUBDATASET_3_NAME=NETCDF:"cube_rad.nc":Band3
...

@PaulWessel
Copy link
Member Author

OK, so this is all via gdal_translate with the different (netCDF) driver?

@joa-quim
Copy link
Member

joa-quim commented Oct 2, 2021

Not the gdal_translate.exe but the librarified version through Julia FFI (like it calls GMT or grdgdal calls GDAL functions) and the driver is automatically picked by the .nc extension. Not wanting to dive into grdgdal for this but maybe the system call to gdal_translate can accomplish the same if we let the driver be picked by the file name extension.

@PaulWessel
Copy link
Member Author

I tested it and it works will. So I can do this: If extension is .nc or .grd (any others) then we switch to the netCDF driver and it will write that multi-layer netcdf.

@joa-quim
Copy link
Member

joa-quim commented Oct 2, 2021

For the netcdf you need to set the compression level and better to set also the nodata.

"-a_nodata NaN -co FORMAT=NC4 -co ZLEVEL=4" 

or use IO_NC4_DEFLATION_LEVEL

@PaulWessel
Copy link
Member Author

For the netcdf you need to set the compression level and better to set also the nodata.

"-a_nodata NaN -co FORMAT=NC4 -co ZLEVEL=4" 

or use IO_NC4_DEFLATION_LEVEL

Thanks, done.

@PaulWessel
Copy link
Member Author

Let me know if there are other concerns with this branch, @joa-quim and @meghanrjones. It seems pretty stable to me now.

Copy link
Member

@maxrjones maxrjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, looks good to me. It would be good to get Joaquim's approval too on this one.

@joa-quim
Copy link
Member

joa-quim commented Oct 4, 2021

Can't test it right now. In a middle of a long Julia notebook and testing requires kill everything in order to recompile GMT.

@joa-quim
Copy link
Member

joa-quim commented Oct 5, 2021

It works but we have a problem with +b. It starts counting at 1 (like everybody should).

grdcut C:\v\LC08_L1TP_20210525_02_cube.tiff+b0 -R499668/513758/4295974/4306903 -Gcucu.tiff
...
ERROR 6: Unrecognizable band number (0).      <=== a GDAL error
...
grdcut [ERROR]: Error calling gdal_translate -projwin 499275 4306935 513435 4295955 -of GTiff -co COMPRESS=DEFLATE C:\v\LC08_L1TP_20210525_02_cube.tiff cucu.tiff -b 0

Remember to add 1 to band before passing to gdal_translate...
@PaulWessel
Copy link
Member Author

Yikes, OK fixed.

@PaulWessel
Copy link
Member Author

This is the only place we call gdal_translate with band info so forgot. Elsewhere we know the drill, @joa-quim.

Copy link
Member

@joa-quim joa-quim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the +b is fixed it should be ready

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
add-changelog Add PR to the changelog enhancement Improving an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants