Enhance Series-Analysis to read its own output and incrementally update output statistics over time #1371

Open
19 tasks
JohnHalleyGotway opened this issue Jun 12, 2020 · 3 comments
Assignees
Labels
alert: NEED ACCOUNT KEY Need to assign an account key to this issue priority: blocker Blocker requestor: UK Met Office United Kingdom Met Office required: FOR OFFICIAL RELEASE Required to be completed in the official release for the assigned milestone type: new feature Make it do something new
Milestone

Comments

@JohnHalleyGotway
Collaborator

Describe the New Feature

This is a feature that was requested by the UK Met Office via met-help:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=95578

They would like to be able to create gridded statistics over a longer time period than they can hold their model output and analyses on disk. To enable this, we'd need to enhance Series-Analysis to read its own output to aggregate stats over a longer time period.

Additional definition for this task is required. Two implementation options are listed below. We could consider doing one or the other, or actually do both.

(1) Write MPR FCST and OBS matched pairs to the output and aggregate them over multiple time periods.

This option includes adding support for the MPR line type to the "output_stats" dictionary. Rather than writing a summary field, this would be a 3-dimensional variable (lat, lon, series). So we're essentially writing the matched pairs, which would create a much larger output file. However, aggregating multiple files containing FCST and OBS fields would be very easy: just concatenate the series values and compute the user-requested statistics from them.
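
A minimal sketch of the concatenate-then-compute idea, in Python rather than the tool's own code. The dictionary layout, the function names `concat_series` and `point_stats`, and the choice of ME and RMSE as the requested statistics are all assumptions made for illustration; none of them reflect the actual Series-Analysis output format.

```python
import math


def concat_series(runs):
    """Concatenate matched pairs from several runs along the series axis.

    Each run is a hypothetical dict mapping (lat_idx, lon_idx) to a list of
    (fcst, obs) pairs for that grid point.
    """
    merged = {}
    for run in runs:
        for point, pairs in run.items():
            merged.setdefault(point, []).extend(pairs)
    return merged


def point_stats(pairs):
    """Compute example statistics from a non-empty list of matched pairs."""
    n = len(pairs)
    errors = [f - o for f, o in pairs]
    return {
        "TOTAL": n,
        "ME": sum(errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
    }


# Two runs covering different times over the same 1 x 2 grid.
run_a = {(0, 0): [(1.0, 0.0), (2.0, 2.0)], (0, 1): [(3.0, 1.0)]}
run_b = {(0, 0): [(4.0, 2.0)], (0, 1): [(1.0, 1.0)]}

merged = concat_series([run_a, run_b])
stats = {point: point_stats(pairs) for point, pairs in merged.items()}
```

The cost is visible in `merged`: every pair from every run has to be kept, which is why this option produces much larger output files.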

(2) Write contingency table counts and partial sums to be aggregated later.

This option would not increase the output file size as much. Enhance Series-Analysis to read fields of partial sum components and contingency table counts, aggregate them across files in the correct way, and recompute the aggregated statistics from them.
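
To illustrate what "the correct way" could look like for partial sums, here is a rough Python sketch for a single grid point. It assumes the sums are stored as means alongside a pair count (as in MET's SL1L2 line type), so merging requires count-weighted averaging rather than plain addition. The `PartialSums` class and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PartialSums:
    """Partial sums at one grid point, stored as means plus a pair count."""
    total: int
    fbar: float   # mean forecast
    obar: float   # mean observation
    fobar: float  # mean of forecast * observation
    ffbar: float  # mean of forecast squared
    oobar: float  # mean of observation squared

    @classmethod
    def from_pairs(cls, pairs):
        """Derive the sums from a non-empty list of (fcst, obs) pairs."""
        n = len(pairs)
        return cls(
            total=n,
            fbar=sum(f for f, _ in pairs) / n,
            obar=sum(o for _, o in pairs) / n,
            fobar=sum(f * o for f, o in pairs) / n,
            ffbar=sum(f * f for f, _ in pairs) / n,
            oobar=sum(o * o for _, o in pairs) / n,
        )

    def merge(self, other):
        """Combine two sets of sums using count-weighted means."""
        n = self.total + other.total

        def weighted(a, b):
            return (a * self.total + b * other.total) / n

        return PartialSums(
            total=n,
            fbar=weighted(self.fbar, other.fbar),
            obar=weighted(self.obar, other.obar),
            fobar=weighted(self.fobar, other.fobar),
            ffbar=weighted(self.ffbar, other.ffbar),
            oobar=weighted(self.oobar, other.oobar),
        )

    def mean_error(self):
        return self.fbar - self.obar

    def mse(self):
        return self.ffbar - 2.0 * self.fobar + self.oobar


earlier = PartialSums.from_pairs([(1.0, 0.0), (2.0, 2.0)])
latest = PartialSums.from_pairs([(4.0, 2.0)])
combined = earlier.merge(latest)
```

Merging `earlier` and `latest` gives the same mean error and MSE as computing them from all three pairs at once, while storing only six numbers per grid point.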

Some details...

  • Recommend doing this directly in Series-Analysis rather than creating a new tool to do the work.
  • It should be pretty easy for Series-Analysis to figure out whether it is reading its own output files. But we should make the options clear. Instead of "-fcst" and "-obs" or "-both", consider a new command line option to tell the tool that you're passing in its own output.
  • It may be tedious for the user to make sure they've enabled all the right output variables to be aggregated later. Consider a shorthand way for them to do so.

Acceptance Testing

List input data types and sources.
Describe tests required for new functionality.

Time Estimate

Estimate the amount of work required here.
Issues should represent approximately 1 to 3 days of work.

Sub-Issues

Consider breaking the new feature down into sub-issues.

  • Add a checkbox for each sub-issue here.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Define the source of funding and account keys here or state NONE.

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Review projects and select relevant Repository and Organization ones
  • Select milestone

Define Related Issue(s)

Consider the impact to the other METplus components.

New Feature Checklist

See the METplus Workflow for details.

  • Complete the issue definition above.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
@JohnHalleyGotway JohnHalleyGotway added component: application code type: new feature Make it do something new requestor: UK Met Office United Kingdom Met Office alert: NEED MORE DEFINITION Not yet actionable, additional definition required alert: NEED ACCOUNT KEY Need to assign an account key to this issue labels Jun 12, 2020
@JohnHalleyGotway JohnHalleyGotway added this to the MET 10.0 milestone Jun 12, 2020
@JohnHalleyGotway JohnHalleyGotway changed the title Enhance Series-Analysis to read it own output and aggregate gridded statistics across multiple runs. Enhance Series-Analysis to read its own output and aggregate gridded statistics across multiple runs. Jun 12, 2020
@JohnHalleyGotway
Collaborator Author

John Wagner, via met-help, indicated that this feature would also be useful for NOAA/MDL in their use of Series-Analysis:
https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=95583

@JohnHalleyGotway JohnHalleyGotway added the alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle label Sep 10, 2020
@JohnHalleyGotway JohnHalleyGotway removed the alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle label Nov 5, 2020
@JohnHalleyGotway JohnHalleyGotway added this to To do in MET-10.0.0-beta3 (1/27/21) via automation Nov 5, 2020
@JohnHalleyGotway JohnHalleyGotway added this to To do in MET-10.0.0-beta4 (3/2/21) via automation Jan 25, 2021
@JohnHalleyGotway JohnHalleyGotway added this to To do in MET-10.0.0-rc1 (5/10/21) via automation Feb 15, 2021
@JohnHalleyGotway JohnHalleyGotway added the alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle label Feb 16, 2021
@JohnHalleyGotway JohnHalleyGotway added priority: high High Priority and removed priority: high labels May 9, 2022
@TaraJensen TaraJensen removed the alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle label Sep 21, 2023
@JohnHalleyGotway JohnHalleyGotway added the required: FOR OFFICIAL RELEASE Required to be completed in the official release for the assigned milestone label May 15, 2024
@JohnHalleyGotway
Collaborator Author

This is a Met Office deliverable due November 2024.

@JohnHalleyGotway JohnHalleyGotway self-assigned this Jul 18, 2024
@JohnHalleyGotway JohnHalleyGotway changed the title Enhance Series-Analysis to read its own output and aggregate gridded statistics across multiple runs. Enhance Series-Analysis to read its own output and incrementally update output statistics through time Jul 24, 2024
@JohnHalleyGotway JohnHalleyGotway changed the title Enhance Series-Analysis to read its own output and incrementally update output statistics through time Enhance Series-Analysis to read its own output and incrementally update output statistics over time Jul 24, 2024
@JohnHalleyGotway JohnHalleyGotway added priority: blocker Blocker priority: low Low Priority and removed alert: NEED MORE DEFINITION Not yet actionable, additional definition required priority: high High Priority priority: low Low Priority labels Jul 24, 2024
@JohnHalleyGotway
Collaborator Author

This issue was discussed during the Met Office NGVER meeting on July 24, 2024.

The functionality needed here is similar to how the Gen-Vx-Mask tool works. When gen_vx_mask is given its own output as input, it initializes values using the previously defined mask.

For Series-Analysis, the logic needed is described below:

  • Add a mechanism to supply previously generated Series-Analysis output as input for the run... likely with a new -input command line option.
  • With this usage, the Series-Analysis input data will be rather limited, perhaps, only containing data for a single time.
  • Process the input fcst/obs data as normal, storing them in PairDataPoint classes... and then using them to derive partial sums and contingency table counts. However, prior to computing final statistics, check the -input file for relevant inputs.
  • Use those inputs to aggregate (i.e. += operator) the partial sums and contingency table counts prior to computing final statistics.
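
The steps above can be sketched for contingency table counts at a single grid point. This is an illustrative Python outline only, not the Series-Analysis implementation: the `CTC` class, `counts_from_pairs`, `update_point`, the `>=` threshold test, and the use of `None` to mean "no prior value found in the -input file" are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class CTC:
    """2x2 contingency table counts at one grid point."""
    fy_oy: int = 0  # forecast yes, observed yes
    fy_on: int = 0  # forecast yes, observed no
    fn_oy: int = 0  # forecast no, observed yes
    fn_on: int = 0  # forecast no, observed no

    def __iadd__(self, other):
        # Counts are plain tallies, so aggregation is simple addition.
        self.fy_oy += other.fy_oy
        self.fy_on += other.fy_on
        self.fn_oy += other.fn_oy
        self.fn_on += other.fn_on
        return self

    def csi(self):
        """Critical success index, or NaN when it is undefined."""
        denom = self.fy_oy + self.fy_on + self.fn_oy
        return self.fy_oy / denom if denom else float("nan")


def counts_from_pairs(pairs, thresh):
    """Tally counts from the current run's (fcst, obs) pairs."""
    ctc = CTC()
    for f, o in pairs:
        f_yes, o_yes = f >= thresh, o >= thresh
        if f_yes and o_yes:
            ctc.fy_oy += 1
        elif f_yes:
            ctc.fy_on += 1
        elif o_yes:
            ctc.fn_oy += 1
        else:
            ctc.fn_on += 1
    return ctc


def update_point(new_pairs, prior, thresh):
    """Process new pairs as normal, then fold in prior counts if present."""
    ctc = counts_from_pairs(new_pairs, thresh)
    if prior is not None:
        ctc += prior
    return ctc


prior = CTC(fy_oy=3, fy_on=1, fn_oy=2, fn_on=10)  # read from earlier output
today = [(5.0, 6.0), (5.0, 0.0), (0.0, 0.0)]      # pairs for a single time
updated = update_point(today, prior, thresh=1.0)
final_csi = updated.csi()  # statistics are computed only after aggregation
```

Statistics such as CSI are derived from `updated` rather than carried forward themselves, so each run only needs the previous counts and the new data for its own time.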

Projects
Status: 🔖 Ready
Development

No branches or pull requests

2 participants