mosaic table in PG database #22

Closed
vincentsarago opened this issue Dec 14, 2021 · 11 comments

Comments

@vincentsarago
Member

Enable retrieving mosaic by name instead of mosaicid

| name | mosaicid | minzoom | maxzoom | bounds | default assets | available assets |
| --- | --- | --- | --- | --- | --- | --- |
| "vincent.mosaic" | "ada12e921qduashdas" | 0 | 24 | [-180, -90, 180, 90] | ["cog"] | ["cog", "thumbnail", "raw"] |
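
A minimal sketch of how a row like the one above could be modeled and looked up by name rather than by mosaicid. The `MosaicRecord` class and the `index_by_name` / `find_by_name` helpers are hypothetical names used for illustration only; they are not part of titiler-pgstac or eoAPI, and a real implementation would presumably query the database instead of an in-memory dict.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MosaicRecord:
    """One row of the proposed mosaic table (hypothetical model)."""

    name: str
    mosaicid: str
    minzoom: int
    maxzoom: int
    bounds: List[float]
    default_assets: List[str]
    available_assets: List[str]


def index_by_name(records: List[MosaicRecord]) -> Dict[str, MosaicRecord]:
    """Build a name -> record mapping so a mosaic can be resolved by name."""
    return {record.name: record for record in records}


def find_by_name(
    index: Dict[str, MosaicRecord], name: str
) -> Optional[MosaicRecord]:
    """Return the record registered under `name`, or None if unknown."""
    return index.get(name)


# Example row taken from the table above.
example = MosaicRecord(
    name="vincent.mosaic",
    mosaicid="ada12e921qduashdas",
    minzoom=0,
    maxzoom=24,
    bounds=[-180, -90, 180, 90],
    default_assets=["cog"],
    available_assets=["cog", "thumbnail", "raw"],
)
mosaics = index_by_name([example])
```

With this in place, `find_by_name(mosaics, "vincent.mosaic")` resolves the name to the record holding the mosaicid.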
@sharkinsspatial
Member

@vincentsarago Can we also consider having the root mosaic endpoint return a list of the mosaics and their name and mosaicid? This might assist in discoverability for some of the VEDA / Dashboard evolution work.

@vincentsarago
Member Author

@sharkinsspatial

> Can we also consider having the root mosaic endpoint return a list of the mosaics and their name and mosaicid?

Sure, but only if we go ahead with a new mosaic table in the pgstac database. It might be fine for eoAPI, but I'm a bit worried. The goal of the eoAPI submodules is to connect to any pgstac db. If we introduce a mosaic table, this might rule out some possibilities.

Maybe stac-utils/titiler-pgstac#30 is a better option. We could require a specific metadata field to be present (e.g. type: Mosaic) and use it as a filter value.

cc @bitner

@sharkinsspatial
Member

@bitner As we're expanding the use of titiler-pgstac at NASA, we have a few use cases where client applications will need to request information about all available mosaics in order to dynamically configure a list of available tile endpoints and their characteristics. With stac-utils/titiler-pgstac#38 and several follow-on PRs, @vincentsarago is serializing the majority of the information we need. Is it feasible from a performance perspective to include a root mosaics endpoint that would fetch, deserialize, and return all mosaic hashes, as the current individual info endpoint does: https://github.com/stac-utils/titiler-pgstac/blob/0f2b5b4ba50bb3458237ab21cf9a154d7b811851/titiler/pgstac/factory.py#L359-L367? cc @anayeaye @abarciauskas-bgse

@vincentsarago
Member Author

vincentsarago commented Mar 1, 2022

I've made an additional PR in stac-utils/titiler-pgstac#45

@sharkinsspatial let me know what you think!

Note, if we don't move forward with it in titiler-pgstac I'll totally add this in eoAPI anyway.

@bitner
Contributor

bitner commented Mar 1, 2022

@sharkinsspatial EEEEK, I realllllly don't think you want to do that!

That endpoint lists every single search that has ever been made against the pgstac instance! If someone changes a date range, it's another record, etc.

For reference - Planetary Computer has over 4 million different records in the searches table!

I think it could be useful for something like seeing what people are searching on to debug things, but with no control over the searches that are getting recorded, I don't see any possible world where it could be useful or scale to any reasonable amount as a "mosaic catalog". I'm not talking about performance here (it could perform just fine); it's more that I can't see how you would make any sense of it.

These mosaics are by design dynamic - a listing of "every dynamic thing that people can come up with" just doesn't seem right. It may be that I'm just missing something here, but I really don't see how this could be useful??? At least for the Planetary Computer, we are already seeing things in the logs where someone is setting up cron jobs that change the date range every so often and use that to grab new data -- someone could do this against a stac instance say every minute with each and every query being different, so being another record.

@vincentsarago
Member Author

@bitner I totally get your point, but mosaics are a little less dynamic and will often be hard-coded searches (e.g. for a static dataset like NAIP).

In stac-utils/titiler-pgstac#45, what I'm proposing is that we list only searches that carry a specific metadata value, metadata.type = "mosaic", which should narrow things down.
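
As a rough illustration of the filtering idea, here is a minimal sketch that keeps only search records tagged as mosaics. The record shape (a dict with `hash` and an optional `metadata` dict) and the `list_mosaics` helper are assumptions made for this example; they do not reproduce the code in stac-utils/titiler-pgstac#45, where the filter would more likely be applied in SQL.

```python
from typing import Any, Dict, Iterable, List


def list_mosaics(searches: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the search records whose metadata marks them as mosaics.

    Records with missing or null metadata are treated as ordinary
    (non-mosaic) searches and dropped.
    """
    return [
        search
        for search in searches
        if (search.get("metadata") or {}).get("type") == "mosaic"
    ]


# Hypothetical records: only the first one is tagged as a mosaic.
searches = [
    {"hash": "a", "metadata": {"type": "mosaic", "name": "naip"}},
    {"hash": "b", "metadata": {}},
    {"hash": "c", "metadata": None},
    {"hash": "d"},
]
mosaic_hashes = [search["hash"] for search in list_mosaics(searches)]
```

The point of the tag is that ad hoc searches (for example, a cron job shifting a date range) never carry it, so they stay out of the listing.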

@vincentsarago
Member Author

vincentsarago commented Mar 2, 2022

Or maybe we could use STAC directly 🤷, which means that we could create a mosaic extension and store the mosaic info in a mosaic collection.

{
  "type": "Feature",
  "stac_version": "1.0.0",
  "stac_extensions": [
    "https://stac-extensions.github.io/mosaic/v1.0.0/schema.json"
  ],
  "id": "my search id",
  "bbox": [
    13.86148243891681,
    36.95257399124932,
    15.111074610520053,
    37.94752813015372
  ],
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [
          13.876381589019879,
          36.95257399124932
        ],
        [
          13.86148243891681,
          37.942072015005024
        ],
        [
          15.111074610520053,
          37.94752813015372
        ],
        [
          15.109620666835209,
          36.95783951241028
        ],
        [
          13.876381589019879,
          36.95257399124932
        ]
      ]
    ]
  },
  "properties": {
    "datetime": "2021-02-21T10:00:17Z",  // or null
    "name": "my mosaic",  // OPTIONAL: name of the mosaic
    "stac_assets": ["image", "cog"]  // OPTIONAL: list of available assets in each STAC record
  },
  "collection": "mosaics",
  "assets": {
    "true_color": {
      "title": "True color Mosaic",
      "href": "https://endpoint/{searchid}/{z}/{x}/{y}.jpeg",
      "options": {
        "assets": ["B4", "B3", "B2"],
        "color_formula": "Gamma RGB 3.5 Saturation 1.7 Sigmoidal RGB 15 0.35"
      }
    },
    "ndvi": {
      "title": "NDVI Mosaic",
      "href": "https://endpoint/{searchid}/{z}/{x}/{y}.jpeg",
      "options": {
        "expression": "(B4-B3)/(B4+B3)",
        "rescale": "-1,1",
        "colormap_name": "viridis"
      }
    }
  },
  "links": []
}

Note: if we prefer moving forward with a pure STAC solution, it means that when a user registers a search they will also have to register a STAC item in the mosaic collection, OR we let the titiler-pgstac /register endpoint do it 🤷‍♂️
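
To make the asset layout above more concrete, here is a small sketch of how a client might turn one asset (an `href` template plus `options`) into a concrete tile URL. The `tile_url` helper is hypothetical, and encoding list-valued options as repeated query parameters is an assumption about how the tile endpoint would accept them, not something this proposal specifies.

```python
from typing import Any, Dict, List, Tuple
from urllib.parse import urlencode


def tile_url(asset: Dict[str, Any], searchid: str, z: int, x: int, y: int) -> str:
    """Fill the href template and append the asset options as query parameters."""
    base = asset["href"].format(searchid=searchid, z=z, x=x, y=y)

    params: List[Tuple[str, Any]] = []
    for key, value in asset.get("options", {}).items():
        if isinstance(value, (list, tuple)):
            # Assumption: list options become repeated parameters,
            # e.g. assets=B4&assets=B3&assets=B2.
            params.extend((key, item) for item in value)
        else:
            params.append((key, value))

    return f"{base}?{urlencode(params)}" if params else base


# The "true_color" asset from the item above.
true_color = {
    "title": "True color Mosaic",
    "href": "https://endpoint/{searchid}/{z}/{x}/{y}.jpeg",
    "options": {
        "assets": ["B4", "B3", "B2"],
        "color_formula": "Gamma RGB 3.5 Saturation 1.7 Sigmoidal RGB 15 0.35",
    },
}
url = tile_url(true_color, searchid="abc", z=1, x=2, y=3)
```

Keeping the rendering options next to the template means a client needs only the item to build working tile requests.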

@bitner
Contributor

bitner commented Mar 2, 2022

@vincentsarago I see the point now on mosaics only being records with "mosaic" metadata. If nothing else, we would need to put an index on the searches table to make sure that the mosaics could be easily separated. That being said, I like your idea of a mosaic collection: it would further allow us to use all the search mechanisms "for free" on any metadata that is stored in a mosaic item.

@bitner
Contributor

bitner commented Mar 2, 2022

If we went the mosaic collection route, rather than having a /list endpoint it would just be /mosaics/items, and it would have search/filters as well as paging already in place.

@vincentsarago
Member Author

Re the STAC way: I'm just a bit worried about creating a STAC extension specific to titiler/titiler-pgstac. It seems to me that

> put an index on the searches table to make sure that the mosaics could be easily separated

might just be easier 🙉

@sharkinsspatial
Member

I do like the idea of modeling mosaic endpoints as STAC items (though as @vincentsarago noted, I don't like losing the consistency of all mosaic-related requests occurring on the mosaic path, but that seems a small issue). If we do consider this approach, here are a few thoughts/questions.

  1. There is significant conceptual overlap between this and the existing extension proposals: tiled assets, virtual assets, and composite. Personally, I think we can avoid alignment with tiled assets, as it would be overly verbose to advertise all of a mosaic's supported TileMatrixSets, and the dynamic nature of the mosaic's item composition makes maintaining the Tile Matrix Limits difficult. It might be worth considering aligning with the processing:expression field for community consistency.

  2. Should the mosaic asset href expose a URL template (which is not a valid href) or the link to the tilejson? How much of this information should be packaged in the asset and how much should be packaged in the tilejson? I'd lean towards packaging most of the descriptive information at the asset level and keeping the tilejson standardized and minimal.

It would be helpful to know what model client applications currently use for mosaic endpoint discovery. I took a quick look at https://github.com/microsoft/PlanetaryComputerDataCatalog, but it might be good to know how the PC explorer is referencing the mosaics and how the application might like to leverage a discovery endpoint.
