Provide cube summary stats for given dimension #105

Open
forman opened this issue Jun 21, 2019 · 0 comments
Labels
  • enhancement: New feature or request
  • important: This is very important for the project
  • xcube gen: This is related to data cube generation, CLI "xcube gen"
  • xcube serve: This is related to server component, CLI "xcube serve"


forman commented Jun 21, 2019

Is your feature request related to a problem? Please describe.

  • When extracting time-series and point data from cubes, performance can be drastically improved if it is known in advance whether a given time slice contains any valid data or only NaNs.
  • The xcube-viewer requires a timeline that displays the number of valid cells for a given time slice.
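
The first point can be illustrated with a minimal sketch in plain Python. It assumes a cube held as nested lists indexed `[time][y][x]` and a precomputed list of valid-cell counts per time slice. The function name `extract_point_series` and this data layout are hypothetical and not part of xcube.

```python
import math

def extract_point_series(cube, valid_counts, y, x):
    """Return the value at (y, x) for every time step.

    Slices whose precomputed valid-cell count is zero are known to hold
    only NaNs, so they are never read.
    """
    series = []
    for t, count in enumerate(valid_counts):
        if count == 0:
            series.append(math.nan)  # slice is all NaN: skip reading it
        else:
            series.append(cube[t][y][x])
    return series

nan = math.nan
cube = [
    [[1.0, nan], [2.0, 3.0]],  # 3 valid cells
    [[nan, nan], [nan, nan]],  # no valid cells
    [[nan, 4.0], [nan, nan]],  # 1 valid cell
]
valid_counts = [3, 0, 1]  # assumed to be stored alongside the cube
series = extract_point_series(cube, valid_counts, y=0, x=1)
```

In a real cube the skipped read would be a chunk load from disk or object storage, which is where the saving would come from.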

Describe the solution you'd like

Generalisation: For each data variable in a cube and for each dimension, provide a summary statistics variable whose values are computed by aggregating cell values over the remaining dimensions. The aggregations may be min, max, mean, counts, ratio, etc. The name pattern could be f'{var_name}_{dim_name}_{agg}', for example "CHL_time_counts". Note that we can also provide flag statistics using the pattern f'{var_name}_{flag_name}_{dim_name}_{agg}', for example "CLASSIF_CLOUD_time_ratio".
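
A minimal sketch of this generalisation for the "time" dimension, again in plain Python with a cube as nested lists indexed `[time][y][x]`. The helper names `stat_var_name` and `aggregate_per_time_slice` are hypothetical, and the choice to ignore NaN cells in every aggregation is an assumption.

```python
import math

def stat_var_name(var_name, dim_name, agg, flag_name=None):
    """Build a statistics variable name following the proposed pattern."""
    parts = [var_name]
    if flag_name is not None:
        parts.append(flag_name)
    parts += [dim_name, agg]
    return "_".join(parts)

# Aggregations over the valid (non-NaN) cells of one slice (assumption).
AGGREGATORS = {
    "counts": len,
    "min": lambda values: min(values) if values else math.nan,
    "max": lambda values: max(values) if values else math.nan,
    "mean": lambda values: sum(values) / len(values) if values else math.nan,
}

def aggregate_per_time_slice(cube, agg):
    """Aggregate over the remaining dimensions (y, x) for each time step."""
    aggregate = AGGREGATORS[agg]
    result = []
    for time_slice in cube:
        valid = [v for row in time_slice for v in row if not math.isnan(v)]
        result.append(aggregate(valid))
    return result

nan = math.nan
cube = [
    [[1.0, nan], [2.0, 3.0]],
    [[nan, nan], [nan, nan]],
    [[nan, 4.0], [nan, nan]],
]
stats = {
    stat_var_name("CHL", "time", agg): aggregate_per_time_slice(cube, agg)
    for agg in AGGREGATORS
}
```

Each entry in `stats` is a one-dimensional variable along "time" that could be stored next to the data variable it summarises.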

  1. xcube addstats - a CLI tool to add summary statistics to an existing cube. Use config parameters to control the generation of the stat vars.
  2. xcube gen should generate these statistics when appending new slices to existing cubes. Add extra config parameters that control the generation of the stat vars.
  3. An API that is used by both CLI tools.
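
One way the shared API from item 3 might be shaped is sketched below, assuming statistics along "time" only. A single per-slice function serves both use cases: computing statistics for a whole existing cube (item 1) and extending them when one slice is appended (item 2). All function names here are hypothetical; none of them exist in xcube.

```python
import math

def slice_stats(time_slice):
    """Compute the statistics of a single 2D time slice, ignoring NaNs."""
    valid = [v for row in time_slice for v in row if not math.isnan(v)]
    return {
        "counts": len(valid),
        "min": min(valid) if valid else math.nan,
        "max": max(valid) if valid else math.nan,
    }

def append_slice_stats(stats, time_slice):
    """Extend per-time statistics by one slice (the appending case)."""
    for agg, value in slice_stats(time_slice).items():
        stats.setdefault(agg, []).append(value)
    return stats

def compute_cube_stats(cube):
    """Compute statistics for every slice of an existing cube."""
    stats = {}
    for time_slice in cube:
        append_slice_stats(stats, time_slice)
    return stats

nan = math.nan
existing_cube = [
    [[1.0, nan], [2.0, 3.0]],
    [[nan, nan], [nan, nan]],
]
stats = compute_cube_stats(existing_cube)
# Appending a new slice only requires looking at that slice.
append_slice_stats(stats, [[nan, 4.0], [nan, nan]])
```

Because statistics along "time" depend only on the slice itself, appending never requires re-reading existing slices. Statistics along other dimensions would need to be updated rather than extended.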

Describe alternatives you've considered

  • Don't store the variables in the cube itself but in a separate dataset that is "attached" to the cube dataset. xcube serve can be configured to use such attached datasets, similar to the way it recognises a multi-resolution, levelled data cube. This approach is still a valid alternative to be considered for implementation.

  • Compute these stat vars on the fly and cache the results (xcube server), but this may take far too much time.
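
The on-the-fly alternative could look roughly like the following sketch, which uses `functools.lru_cache` from the Python standard library. The in-memory `CUBE` dictionary and the function `valid_count` are hypothetical stand-ins for however the server would actually access cube data.

```python
import math
from functools import lru_cache

nan = math.nan
# Hypothetical in-memory cube: variable name -> [time][y][x] values.
CUBE = {
    "CHL": [
        [[1.0, nan], [2.0, 3.0]],
        [[nan, nan], [nan, nan]],
    ],
}

@lru_cache(maxsize=None)
def valid_count(var_name, time_index):
    """Count the non-NaN cells of one time slice; results are cached."""
    time_slice = CUBE[var_name][time_index]
    return sum(1 for row in time_slice for v in row if not math.isnan(v))

first = valid_count("CHL", 0)   # computed by scanning the slice
second = valid_count("CHL", 0)  # served from the cache
```

The first request for each slice still pays the full scan, which is the cost this alternative is concerned about, and cached values would have to be invalidated whenever the cube changes.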

Additional context

See also #18.

@forman forman added enhancement New feature or request important This is very important for the project xcube gen This is related to data cube generation, CLI "xcube gen" xcube serve This is related to server component, CLI "xcube serve" labels Jun 21, 2019