Data availability web page? #65

Open
wmjolly opened this issue Jan 24, 2024 · 5 comments
Assignees
Labels
enhancement New feature or request

Comments

@wmjolly
Collaborator

wmjolly commented Jan 24, 2024

Could we make an API or webpage that queries the GetCapabilities of key layers on GeoServer and provides a summary of the spatial and temporal extents of each dataset?

It would be great for monitoring and troubleshooting, and also helpful in the future when we share the resource more widely.

Example
Dataset: GFS
Spatial Extent: -180 to 180 Lon, 90 to -90 Lat
Temporal Extent: 15 Dec 2023 to 10 Feb 2024
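
As a rough sketch of how such a summary could be pulled out of a WMS 1.3.0 GetCapabilities response, the snippet below parses a capabilities document with the Python standard library. The sample document, the layer name `gfs:Temperature`, and the helper names are assumptions made for illustration; a real response would be fetched from the server and is much larger. The sketch also ignores bounding boxes and dimensions inherited from parent layers.

```python
import xml.etree.ElementTree as ET

# WMS 1.3.0 capabilities documents use this XML namespace.
WMS_NS = {"wms": "http://www.opengis.net/wms"}

# Hypothetical, trimmed-down capabilities document used only for illustration.
SAMPLE_CAPABILITIES = """<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">
  <Capability>
    <Layer>
      <Title>Root</Title>
      <Layer>
        <Name>gfs:Temperature</Name>
        <EX_GeographicBoundingBox>
          <westBoundLongitude>-180.0</westBoundLongitude>
          <eastBoundLongitude>180.0</eastBoundLongitude>
          <southBoundLatitude>-90.0</southBoundLatitude>
          <northBoundLatitude>90.0</northBoundLatitude>
        </EX_GeographicBoundingBox>
        <Dimension name="time" units="ISO8601">2023-12-15T00:00:00.000Z/2024-02-10T00:00:00.000Z/PT3H</Dimension>
      </Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>"""


def time_extent(dimension_text):
    """Return (start, end) from a WMS time dimension value, or None if empty.

    Handles both comma-separated instants and start/end/period intervals.
    """
    parts = [p.strip() for p in dimension_text.split(",") if p.strip()]
    if not parts:
        return None
    start = parts[0].split("/")[0]
    last = parts[-1].split("/")
    end = last[1] if len(last) > 1 else last[0]
    return start, end


def _bound(bbox, tag):
    """Read one numeric edge of an EX_GeographicBoundingBox element."""
    return float(bbox.find("wms:" + tag, WMS_NS).text)


def summarize_layers(capabilities_xml):
    """List name, lon/lat extent, and time extent for every named layer."""
    root = ET.fromstring(capabilities_xml)
    summaries = []
    for layer in root.iter("{http://www.opengis.net/wms}Layer"):
        name = layer.find("wms:Name", WMS_NS)
        if name is None:
            continue  # grouping layers have no Name and cannot be requested
        summary = {"name": name.text, "lon": None, "lat": None, "time": None}
        bbox = layer.find("wms:EX_GeographicBoundingBox", WMS_NS)
        if bbox is not None:
            summary["lon"] = (_bound(bbox, "westBoundLongitude"),
                              _bound(bbox, "eastBoundLongitude"))
            summary["lat"] = (_bound(bbox, "southBoundLatitude"),
                              _bound(bbox, "northBoundLatitude"))
        for dim in layer.findall("wms:Dimension", WMS_NS):
            if dim.get("name", "").lower() == "time":
                summary["time"] = time_extent(dim.text or "")
        summaries.append(summary)
    return summaries


for item in summarize_layers(SAMPLE_CAPABILITIES):
    print(item)
```

The same function could be pointed at the text of a live GetCapabilities response, once per workspace or for the whole server.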

@wmjolly wmjolly added the enhancement New feature or request label Jan 24, 2024
@alexander-petkov
Owner

alexander-petkov commented Apr 10, 2024

Explore querying datasets via setting attributes:
[image]

https://docs.geoserver.org/main/en/user/data/webadmin/workspaces.html

The other way I can think of is by querying the database.

EDIT: Also explore gathering info and identifying problems via #67

@alexander-petkov
Owner

alexander-petkov commented Apr 10, 2024

What should a status page reveal about a dataset? Some ideas:

  1. Last updated
  2. Number of granules / temporal extent.
  3. Identify gaps in data. That raises the question: what is a gap? There are 1-, 3-, and 6-hourly datasets, and the definition of a gap will differ between them.
  4. Spatial coverage
  5. Retention period
  6. Number of granules:
    6.1 Expected granules
    6.2 Actual count
  7. Update frequency
  8. Last update
  9. Detected problems (yes/no or green/red icon)
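
One possible way to make "gap" concrete is to define it relative to each dataset's own time step, so the same check works for 1-, 3-, and 6-hourly data. The sketch below assumes the granule timestamps are already available as `datetime` values; the function names `find_gaps` and `expected_granules` are hypothetical.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, step):
    """Return (first_missing, last_missing) pairs for each run of missing steps.

    A gap is any pair of consecutive granules further apart than `step`.
    """
    ordered = sorted(set(timestamps))
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > step:
            gaps.append((previous + step, current - step))
    return gaps


def expected_granules(start, end, step):
    """Number of granules a complete series would have between start and end."""
    return (end - start) // step + 1


# Hypothetical 3-hourly series with the 09:00 and 12:00 granules missing.
step = timedelta(hours=3)
start = datetime(2024, 1, 1)
times = [start + i * step for i in range(8) if i not in (3, 4)]

print("expected:", expected_granules(min(times), max(times), step))
print("actual:  ", len(times))
print("gaps:    ", find_gaps(times, step))
```

Comparing the expected and actual counts gives the yes/no "detected problems" flag, and the gap list says where the holes are.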

| Abbreviation | Name | Description | Workspace | Number of granules | Spatial Coverage |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |

CREATE TABLE dataset_index (abbreviation varchar, name varchar, num_granules integer, spatial_coverage varchar);
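
To see how such an index table could feed the status table, here is a minimal sketch using an in-memory SQLite database as a stand-in for whatever database the project actually uses. The sample row (including the granule count) and the helper names are invented for illustration, and only the four columns from the table definition are rendered.

```python
import sqlite3


def build_index(rows):
    """Create the dataset_index table in memory and load the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table dataset_index("
        "abbreviation varchar, name varchar, "
        "num_granules integer, spatial_coverage varchar)"
    )
    conn.executemany("insert into dataset_index values (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def render_markdown(conn):
    """Render the index as a Markdown table, one row per dataset."""
    header = ["Abbreviation", "Name", "Number of granules", "Spatial Coverage"]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join(["---"] * len(header)) + " |",
    ]
    query = (
        "select abbreviation, name, num_granules, spatial_coverage "
        "from dataset_index order by abbreviation"
    )
    for row in conn.execute(query):
        lines.append("| " + " | ".join(str(value) for value in row) + " |")
    return "\n".join(lines)


# Hypothetical row; the granule count is made up.
conn = build_index([("GFS", "Global Forecast System", 464, "-180,-90,180,90")])
print(render_markdown(conn))
```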

@alexander-petkov
Owner

So here is an idea:

  1. For each data archive (workspace), have a designated "special" metadata layer, which gets updated via REST upon data retrieval.
  2. It could be a JSON document that gets updated on each retrieval, for example metadata.json.
  3. Geometry is not important, but it could be the bounding envelope, for example.
  4. It could hold number of granules (or features) removed/added for each layer, and consequently detected problems.
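
A minimal sketch of what maintaining such a metadata.json document might look like is shown below. The schema (`layers`, `granules`, `last_added`, `last_removed`, `problems`, `last_updated`) and the function name `record_update` are assumptions, not an agreed format, and the step that pushes the document to the server is deliberately left out.

```python
import json
from datetime import datetime, timezone


def record_update(metadata, layer, added, removed, problems=()):
    """Record one retrieval run for a layer in the metadata document."""
    layers = metadata.setdefault("layers", {})
    entry = layers.setdefault(layer, {"granules": 0, "problems": []})
    # Keep a running granule count and remember what the last run changed.
    entry["granules"] = max(entry["granules"] + added - removed, 0)
    entry["last_added"] = added
    entry["last_removed"] = removed
    entry["problems"] = list(problems)
    metadata["last_updated"] = datetime.now(timezone.utc).isoformat()
    return metadata


# Hypothetical layer name and counts.
metadata = {}
record_update(metadata, "Temperature", added=8, removed=0)
record_update(metadata, "Temperature", added=4, removed=2,
              problems=["time gap detected"])

print(json.dumps(metadata, indent=2))
```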

@alexander-petkov
Owner

Checking problems with a dataset:

  1. Number of granules/rasters
  2. Number of entries in the database
  3. Time gaps in data.
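
The first two checks could be combined by comparing the raster files on disk with the locations recorded in the database. The sketch below assumes both lists have already been collected and that file names are unique within a dataset; the function name and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath


def compare_granules(files_on_disk, indexed_locations):
    """Compare raster files with index entries, matching on file name."""
    on_disk = {PurePosixPath(path).name for path in files_on_disk}
    indexed = {PurePosixPath(path).name for path in indexed_locations}
    return {
        "file_count": len(on_disk),
        "index_count": len(indexed),
        "not_indexed": sorted(on_disk - indexed),    # files missing from the database
        "missing_files": sorted(indexed - on_disk),  # database rows without a file
    }


# Hypothetical paths for illustration.
files = ["/data/gfs/t_000.tif", "/data/gfs/t_003.tif", "/data/gfs/t_006.tif"]
index = ["t_000.tif", "t_003.tif", "t_009.tif"]

report = compare_granules(files, index)
print(report)
```

Matching counts alone can hide an offsetting pair of errors, so the two difference lists are the more useful signal.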

Projects
WFAS Project board
  
Awaiting triage
Development

No branches or pull requests

2 participants