COG #93

Open
cmheazel opened this issue Aug 5, 2020 · 24 comments
Labels
2020-08 Sprint Extension Will be addressed by a future extension Users Guide This issue will be resolved in part of whole through an entry in the Users Guide

Comments

@cmheazel
Contributor

cmheazel commented Aug 5, 2020

Can we support Cloud Optimized GeoTIFF (COG) through a Coverage API?

COG depends on HTTP Range Requests (RFC 7233). This is not currently required for OGC APIs.
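For illustration, a minimal sketch of the client side of an HTTP range request: building the `Range` request header and reading back the `Content-Range` header of a partial response. The function names are hypothetical and the 16 KiB size used in the test is only an example value, not something COG or the OGC APIs prescribe.

```python
import re

def range_header(offset: int, length: int) -> dict:
    """Build the request header asking for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends, hence the "- 1".
    return {"Range": f"bytes={offset}-{offset + length - 1}"}

# Matches values such as "bytes 0-16383/1048576" or "bytes 0-16383/*".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")

def parse_content_range(value: str):
    """Return (first, last, total) from a Content-Range value.

    `total` is None when the server reports the full length as "*".
    """
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range value: {value!r}")
    first, last, total = match.groups()
    return int(first), int(last), None if total == "*" else int(total)
```

A server that honours the header answers with `206 Partial Content` and a `Content-Range` header; a server that ignores it answers `200` with the whole file, so a client can detect the difference at run time.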

@bradh
Contributor

bradh commented Aug 6, 2020

As I understand it, COG isn't required to be accessed by Range Requests. It could be accessed as a whole file, or as tiles (or probably some Coverage thing that I don't understand yet). Behind that API facade the query to the source file could take advantage of the COG conventions, but that might be an optimisation rather than a requirement.

There might be some value in being able to say that the file is directly (efficiently) available using HTTP, and whether that supports Range Requests. The actual delivery of the file could be outside of the Coverage API, with the Coverage API just providing the URL and access info as additional metadata.

There could also be an extension for OGC Coverage API that supports Range Requests so the client can treat the resulting file as if it was a COG. That could work even if GeoTIFF with COG conventions isn't the file representation backing it - e.g. it's really a Processing output, or I'm actually faking the COG using tiled NITF, or something like that.

@pebau
Contributor

pebau commented Aug 7, 2020

Fully agreed to all you say, just one point:

Accessing rangeset/ is a flawed idea in the first place (which I have explained over and over again). It cannot generate a GeoTIFF because the coordinates sit in the domain set which is not part of the range set, trivially.

Today, you always (!) obtain a GeoTIFF by requesting a coverage and encoding it in TIFF. See the spec: https://portal.opengeospatial.org/files/?artifact_id=54813 .

@bradh
Contributor

bradh commented Aug 7, 2020

I am not sure I see the problem. How does rangeset come in to this? Sorry I haven't seen the previous discussion. Possibly I just need a more specific reference in that doc.

@pebau
Contributor

pebau commented Aug 7, 2020

Ah, my bad, sorry - indeed, there is quite some history behind this. Essentially, it's about not asking for /coverage but for /coverage/rangeset . But I see now that you were referring to something different.

So the "fully agreed" part remains. And an idea: COG might get reflected in the CIS GeoTIFF encoding, maybe by adding a conformance class. Could be pretty low-hanging fruit, but then I am not familiar with all COG details, so just my 2 cents.

@joanma747

COG is just a GeoTIFF that is structured to be fast using tiling and overviews.
https://www.cogeo.org/in-depth.html

A Cloud Optimized GeoTIFF (COG) is a regular GeoTIFF file, aimed at being hosted on a HTTP file server, with an internal organization that enables more efficient workflows on the cloud. It does this by leveraging the ability of clients issuing HTTP GET range requests to ask for just the parts of a file they need.

The whole thing is about accessing parts of the file using a header that is curiously called "range" (which is more or less the same concept as in CIS). This is very "TIFF" format focussed.

I believe you can design a set of WCS requests (e.g. using subsetting etc.) that, of course, can be mapped to the Coverage API, and that just get you the parts of the file you need. If the internal storage is a COG, then it will be faster. It will be like accessing tiles from a WMS (the so-called WMS-C, in the past). We will need to expose some information about the internal structure of the COG. It will be something similar to exposing a TileMatrixSet.

We need more experimentation on this to see how we could do it in practice but, a priori, I cannot see why it should not work.
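A minimal sketch of the mapping described above, under stated assumptions: a hypothetical `Pyramid` description of the internal tiling (top-left origin, square tiles, each overview level doubling the pixel size, level 0 being full resolution) is turned into a `subset` query string for one tile. The class, its fields, and the `Lat`/`Lon` axis names are illustrative only; the real axis names depend on the coverage's CRS, and nothing here is taken from a published specification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pyramid:
    """Hypothetical description of a COG-like tile pyramid."""
    west: float            # x coordinate of the top-left corner
    north: float           # y coordinate of the top-left corner
    base_res: float        # CRS units per pixel at level 0 (full resolution)
    tile_size: int = 256   # tile width and height in pixels

    def resolution(self, level: int) -> float:
        # Each overview level doubles the pixel size of the previous one.
        return self.base_res * (2 ** level)

    def tile_bounds(self, level: int, col: int, row: int):
        """Return (x_min, y_min, x_max, y_max) of one tile in CRS units."""
        span = self.tile_size * self.resolution(level)
        x_min = self.west + col * span
        y_max = self.north - row * span
        return x_min, y_max - span, x_min + span, y_max

def subset_query(pyramid: Pyramid, level: int, col: int, row: int) -> str:
    """Build a subset parameter that requests exactly one tile."""
    x_min, y_min, x_max, y_max = pyramid.tile_bounds(level, col, row)
    return f"subset=Lat({y_min:g}:{y_max:g}),Lon({x_min:g}:{x_max:g})"
```

A client that knows the pyramid can then limit itself to tile-aligned requests, which a COG-backed server could answer by reading a single internal tile.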

@bradh
Contributor

bradh commented Aug 11, 2020

I think it's worth considering how and why people expect to use COG. Just because we could doesn't mean we should.

Perhaps it would help to state the problem(s) that this technology solution might address?

@jerstlouis
Member

jerstlouis commented Aug 11, 2020

@bradh I believe this solves the efficient overview problem, i.e. being able to get the data at the needed resolution using multi-resolution pyramids of data.

Related to https://github.com/opengeospatial/OGC-API-Sprint-August-2020/issues/7.

But like @joanma747, I agree that COG is mostly a back-end implementation approach, not something the API itself defines.

@rouault

rouault commented Aug 11, 2020

A server that returns GeoTIFF as an output media type could expose it with a mimetype (possibly derived from the geotiff one with an extra subtype) that indicates that it is a COG, if it is.
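As a sketch of how a client might act on such a media type, the snippet below parses a `Content-Type` value and checks for a COG marker. It assumes the value `image/tiff; application=geotiff; profile=cloud-optimized`, which is the convention used in the STAC community; that string is an assumption here, not something agreed in this thread, and the function names are hypothetical.

```python
def parse_media_type(value: str):
    """Split a Content-Type value into ("type/subtype", {parameter: value})."""
    parts = [part.strip() for part in value.split(";")]
    essence = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, _, val = part.partition("=")
            # Parameter names are case-insensitive; quotes around values are optional.
            params[key.strip().lower()] = val.strip().strip('"').lower()
    return essence, params

def is_cog(content_type: str) -> bool:
    """True if the media type carries the assumed cloud-optimized profile."""
    essence, params = parse_media_type(content_type)
    return essence == "image/tiff" and params.get("profile") == "cloud-optimized"
```

Because the marker is a parameter on the existing TIFF type, clients that do not know about it can still treat the response as an ordinary GeoTIFF.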

@Schpidi
Member

Schpidi commented Aug 11, 2020

I very much like the idea of requesting "tiles" of a coverage reusing the concepts defined in OAPI-Tiles as discussed yesterday, e.g. via paths like /collections/mycov/coverage/tiles/.... This would allow service providers to tune their service for performance for example by internally using a COG aligned with the /tiles.

@jerstlouis
Member

jerstlouis commented Aug 11, 2020

@Schpidi currently we have those at {coverageId}/tiles, but {coverageId}/coverage/tiles would work just as well.
For features {collectionId}/items/tiles was not seen as proper.
/coverage/tiles would allow having both features tiles at {collectionId}/tiles and coverage tiles at {collectionId}/coverage/tiles

I also wonder whether COG would support variable width TileMatrixSets?

@rouault

rouault commented Aug 11, 2020

I also wonder whether COG would support variable width TileMatrixSets?

no, I'm not aware of any mainstream raster format that does.

@jerstlouis
Member

@rouault thanks for the clarification.

It is possible to implement with this draft specification for TileMatrixSet support in GeoPackage:

https://gitlab.com/imagemattersllc/ogc-vtp2/-/blob/master/extensions/14-tile-matrix-set.adoc

Not yet standardized though :\

@bradh
Contributor

bradh commented Aug 11, 2020

I'm playing devil's advocate here: why do you want overviews? Why not just ask the API for the scale you want?

@jerstlouis
Member

@bradh well that's exactly what overviews are, right?
Using COG on the server (or some other multi-resolution tile pyramid) allows the server to respond efficiently to scaled requests.
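A minimal sketch of how a server could answer a scaled request from a pyramid, assuming each overview doubles the pixel size of the previous level and level 0 is the full-resolution image. The function name and parameters are hypothetical; real implementations may apply different selection rules.

```python
def pick_overview(base_res: float, requested_res: float, n_levels: int) -> int:
    """Pick the coarsest level that is still at least as fine as requested.

    base_res      -- units per pixel at level 0 (full resolution)
    requested_res -- units per pixel the client asked for
    n_levels      -- number of levels available, including level 0
    """
    if n_levels < 1:
        raise ValueError("at least one level is required")
    level = 0
    # Step to a coarser level only while it does not exceed the requested pixel size.
    while level + 1 < n_levels and base_res * 2 ** (level + 1) <= requested_res:
        level += 1
    return level
```

The server then reads only that level and resamples by a factor smaller than two, instead of decoding the full-resolution data.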

@jratike80

COG on the server gives little extra value. Using tiled GeoTIFFs with a good set of overviews is a well-known standard recipe for image servers (here is a 2013 version: https://www.slideshare.net/geosolutions/gs-steroids-sgiannecfoss4g20130103). If the image is on the server disk, it is not so important how the GeoTIFF is organized internally; access to the IFDs and to the image data is fast and cheap anyway. Of course COG works fine from local disks as well, but it is especially tuned for HTTP range requests. If, however, the OACov server reads its source data from the cloud over HTTP, having the data as COG is a perfect choice. Maybe you could clarify a bit what you want to achieve with the COG support?

@bradh
Contributor

bradh commented Aug 12, 2020

I'm trying to distinguish between a server that uses geotiffs that happen to have convenient tiling and overviews (i.e. implementation detail), and the COG CONOPS as a part of the OGC API standards.

@Schpidi
Member

Schpidi commented Oct 7, 2020

Coverages SWG call: Consensus was to not explicitly include nor preclude anything about COGs. An example could be included in the User Guide.

@Schpidi Schpidi transferred this issue from opengeospatial/OGC-API-Sprint-August-2020 Oct 7, 2020
@Schpidi Schpidi added 2020-08 Sprint Users Guide This issue will be resolved in part of whole through an entry in the Users Guide labels Oct 7, 2020
@jerstlouis
Member

jerstlouis commented Dec 1, 2021

Should we perhaps have a conformance class for COGs (and e.g. at least recommend overviews)?
We may also need to define conformance classes specific to particular encodings, as done in e.g. Features (GeoJSON, GML) & Tiles (PNG, JPG, TIFF, netCDF).

At least CISJSON, CIS GML, CIS RDF, GeoTIFF, GeoTIFF-COG, netCDF, Zarr, CoverageJSON...?

An OGC API - Coverages implementation that supports COG could be deployed on a static server like an S3 bucket, and COG HTTP ranges with overviews could serve a similar use case to Subset & Scaling.

@pebau
Contributor

pebau commented Dec 1, 2021

are you aware of the encodings available with CIS?

@jerstlouis
Member

@pebau yes, added GML and RDF :)

@joanma747

COG and HTTP range is a solution where setting up a WCS or an OGC API coverages is not possible (e.g. a data center that is scared about new code) but it is simple because libraries to read COG exist. Creating a reader for COG by yourself is not an easy task at all.

WCS or OGC API coverages is much easier to use for a client that does not want to or cannot use libraries (if a simple "raster" format is used/negotiated; avoid, e.g., JPEG 2000 and go for a "raw" format).

So there is a place for each solution.

It is possible to use COG as a backend of a WCS / OGC API coverages and build a server implementation that takes advantage of the COG structure to increase performance.

It is also possible to have a WCS / OGC API coverages that advertises the tile structure, so that clients could adapt and request only subsets that correspond to tiles and overviews, simplifying the work on the server side and allowing more concurrent requests to be handled.

@pebau, it is in this last case that a format extension for WCS or OGC API coverages will be useful: it will specify how to expose the tile and overview structure (the 2D-TMS data structure would be my choice) and recommend that the client limit requests to single tiles only.

@jerstlouis
Member

jerstlouis commented Dec 1, 2021

@joanma747

COG and HTTP range is a solution where setting up a WCS or an OGC API coverages is not possible (e.g. a data center that is scared about new code)

A static server may also be much more economical than the ability to execute arbitrary code.

It will specify how to expose the tile and overviews structure (the 2d-TMS data structure will be my selection) and recommend the client to limit requests to single tiles only.

Even without exposing the tiles and overviews, if the COG follows a 2x2 pyramidal pattern like a 2D TMS, any type of request focusing on a particular area and resolution of interest should still get most of those same benefits, because a subsetting request would always fetch less than one whole tile extra, and the returned resolution would always be less than twice what is required.

If clients have to start worrying about the tiling structure, then sometimes perhaps they might as well implement Coverage Tiles.
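The "less than a whole tile extra" bound can be checked with a little arithmetic. The sketch below, with a hypothetical function name and an assumed 256-pixel tile size, finds the tiles that cover an arbitrary pixel window along one axis and reports how many extra pixels a tile-aligned read would fetch on each side.

```python
def tile_cover(start_px: int, stop_px: int, tile_size: int = 256):
    """Tiles covering the half-open pixel interval [start_px, stop_px) on one axis.

    Returns (first_tile, last_tile, extra_before, extra_after), where the two
    extras are the pixels read outside the window on each side.
    """
    if stop_px <= start_px:
        raise ValueError("the window must contain at least one pixel")
    first_tile = start_px // tile_size
    last_tile = (stop_px - 1) // tile_size
    extra_before = start_px - first_tile * tile_size
    extra_after = (last_tile + 1) * tile_size - stop_px
    return first_tile, last_tile, extra_before, extra_after
```

Both extras are always smaller than `tile_size`, so the over-fetch per side and per axis stays below one tile regardless of where the requested window falls.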

@cmheazel cmheazel added the Close Issue appears to be resolved and can be closed label Mar 30, 2022
@jerstlouis jerstlouis added the Extension Will be addressed by a future extension label Mar 30, 2022
@jerstlouis
Member

SWG 2022-03-30: We might define a conformance class to advertise that /coverage conforms to COG and supports HTTP range - tagging as extension.

@jerstlouis jerstlouis removed the Close Issue appears to be resolved and can be closed label Mar 30, 2022
@jerstlouis
Member

Should this be included as separate conformance classes in OGC API - Coverages - Part 1: Core, e.g. in #166 (e.g., COG, COPC, Zarr as separate conformance classes), or as a single conformance class for Cloud Optimized and HTTP range output (for all those formats)?

We discussed that, in addition to referencing the cloud-native format, we may want to add requirements that are implied by declaring conformance to this:

  • Support for HTTP range
  • Including overviews (to efficiently support multiple resolutions)
