RasterSource Catalog #162

Closed
echeipesh opened this issue Oct 10, 2019 · 0 comments · Fixed by #207

We wish to be able to query a spatiotemporal "catalog" using the RasterSource interface. This could be a GeoTrellis layer with a temporal dimension, or it could be a STAC catalog in which multiple COGs (Cloud Optimized GeoTIFFs) are available with differing timestamps.

In either case, we are currently limited because the RasterSource interface presents a single raster, both in its metadata and in its read methods.

It is not desirable to further complicate the RasterSource interface by adding temporal query capability. Further, it is likely that different time slices may be spread over multiple files. Switching the source of IO and potentially reading metadata should not be hidden from the user of the RasterSource interface, as those are potentially expensive and error-prone operations that must be explicit and carefully considered by the user.

Instead, we should preserve RasterSource as a "spatial slice" of an otherwise potentially multi-dimensional dataset. Therefore, "something" should be able to produce RasterSource instances given a spatiotemporal query. Let's call this something RasterCatalog, because it maps nicely onto a STAC catalog.

At a minimum:

trait RasterCatalog {
  def find(query: Query): Seq[RasterSource]
}

Since RasterSource as an interface is supposed to be lazy, we can safely return a large number of them. Additionally, if the query source (like a STAC catalog) had recorded the raster metadata, it could be included in a specialized wrapper, further delaying the initial header read until the first RasterSource.read() call is made.
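
To illustrate the wrapper idea, here is a minimal sketch. `RasterMetadata`, the two-method `RasterSource` trait, and `CatalogedRasterSource` are all hypothetical, heavily simplified stand-ins invented for this example; the real RasterSource interface is much larger. The point is only that metadata recorded by the catalog can be served without touching the file, and the underlying source is opened on the first `read()`.

```scala
// Hypothetical stand-in for raster metadata recorded by a catalog entry.
final case class RasterMetadata(cols: Int, rows: Int, bandCount: Int)

// Hypothetical, drastically reduced version of the RasterSource interface.
trait RasterSource {
  def metadata: RasterMetadata
  def read(): Array[Double]
}

// Serves metadata from the catalog record and defers opening the real
// source (and therefore the header read) until read() is first called.
final class CatalogedRasterSource(
  recorded: RasterMetadata,
  open: () => RasterSource
) extends RasterSource {
  // Evaluated at most once, on first access.
  private lazy val underlying: RasterSource = open()

  def metadata: RasterMetadata = recorded
  def read(): Array[Double] = underlying.read()
}
```

With this shape, a catalog can hand back many `CatalogedRasterSource` instances and none of them performs IO until a caller actually reads pixels.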

The Query type should encapsulate the query as an ADT, so that it can be inspected, serialized, and optimized before actual execution.
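
As a rough sketch of what such an ADT could look like: the constructors below (`All`, `Intersects`, `Between`, `And`), the `BBox` type, and the `simplify` function are assumptions made up for illustration, not a proposed final design. In GeoTrellis the bounding box would more likely be `geotrellis.vector.Extent`. Because the query is plain data, it can be pattern matched and rewritten before anything is executed.

```scala
import java.time.Instant

// Hypothetical bounding box; a stand-in for a real extent type.
final case class BBox(xmin: Double, ymin: Double, xmax: Double, ymax: Double)

// The query is plain data, so it can be inspected and rewritten.
sealed trait Query
object Query {
  case object All extends Query
  final case class Intersects(bbox: BBox) extends Query
  final case class Between(from: Instant, to: Instant) extends Query
  final case class And(left: Query, right: Query) extends Query

  // One example of optimizing before execution: drop redundant All terms.
  def simplify(q: Query): Query = q match {
    case And(l, r) =>
      (simplify(l), simplify(r)) match {
        case (All, x) => x
        case (x, All) => x
        case (x, y)   => And(x, y)
      }
    case other => other
  }
}
```

Serialization is not shown here; with a sealed hierarchy of case classes it could be derived by whichever codec library the project already uses.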

Critically, this interface should avoid making any assumptions about the structure of the data, because it can commonly take many forms.

For instance:

  • GeoTrellis spatiotemporal layer/pyramid
  • STAC catalog
  • Folder of COGs with timestamp encoded in filename
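
To make the last case concrete, here is a minimal sketch of a catalog backed by a list of file URIs. Everything in it is an assumption for illustration: the `_YYYYMMDD.tif` naming convention, the `FilenameCatalog` class, the date-range-only `Query`, and the `RasterSource` case class, which is a stand-in that only carries a URI. Listing the folder is left out; the URIs are passed in.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Try

// Hypothetical stand-in: only remembers where the raster lives.
final case class RasterSource(uri: String)

// Hypothetical, temporal-only query: an inclusive date range.
final case class Query(from: LocalDate, to: LocalDate)

trait RasterCatalog {
  def find(query: Query): Seq[RasterSource]
}

// Assumes names like "scene_20191010.tif" (date as the last "_"-separated token).
final class FilenameCatalog(uris: Seq[String]) extends RasterCatalog {
  private val Stamp = """.*_(\d{8})\.tiff?$""".r
  private val fmt = DateTimeFormatter.BASIC_ISO_DATE

  // None when the name does not match or the digits are not a valid date.
  private def dateOf(uri: String): Option[LocalDate] = uri match {
    case Stamp(d) => Try(LocalDate.parse(d, fmt)).toOption
    case _        => None
  }

  def find(query: Query): Seq[RasterSource] =
    uris
      .filter(u => dateOf(u).exists(d => !d.isBefore(query.from) && !d.isAfter(query.to)))
      .map(u => RasterSource(u))
}
```

No file is opened while answering the query; only the names are inspected, which fits the requirement that the returned sources stay lazy.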