Add QDM output Zarr priming, region writing, metadata, multi-year processing #129

Merged
brews merged 13 commits into ClimateImpactLab:main from brews:new_qdm on Nov 10, 2021

@brews brews commented Nov 9, 2021

This PR rewrites the QDM command prototypes, breaking backward compatibility.

Changes include:

  • New command for priming a Zarr Store so it can be written to regionally by independent processes. For example:
dodola prime-qdm-output-zarrstore \
  --simulation "gs://path/to/file/we/want/to/biascorrect.zarr" \
  --variable "tasmax" \
  --years "2015,2100" \
  --out "gs://path/to/where/we/want/biascorrected/to/go.zarr" \
  --zarr-region-dims "lat"

We should then be set to write output to that Zarr Store regionally, with regions delimited along the "lat" dimension.

  • Write to regions of a primed Zarr Store aggressively with dodola apply-qdm --out-zarr-region. For example, --out-zarr-region "lat=0,2" will write QDM-adjusted output to slice(0, 2) along the "lat" dimension. Note that this writes aggressively (with mode="a"), so take care not to corrupt existing data. Use this with the --selslice and --iselslice arguments to filter input data.

  • dodola apply-qdm now expects to be applied to multiple years rather than a single year. The years are set through a --years argument, for example --years=2015,2100. This should help jobs run faster and more efficiently when each one processes a spatial subset over the entire input time series.

  • Root and variable attributes are now copied from input data to output QDM-adjusted datasets. QDM output should now also match the dtype of the input dataset variable.

  • QDM's sim_q values are now output as a variable instead of a coordinate, and are included in the output by default.

  • apply-qdm and prime-qdm-output-zarrstore both accept one or more --new-attrs options. This is a fast way of adding simple metadata to output. Each of these takes a key=value pair that is merged into the output Dataset's root attrs before being written.

  • Logging is slightly more chatty on INFO, especially with respect to data slicing.
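
Several of the options above take small delimited strings ("2015,2100", "lat=0,2", key=value). As a rough illustration of how such values might be interpreted, here is a minimal Python sketch. The helper names parse_years, parse_region, and merge_new_attrs are hypothetical and are not taken from dodola's code. The sketch also assumes that a new attribute overrides an existing root attribute with the same key, which the description above does not state.

```python
from typing import Dict, Iterable, Tuple


def parse_years(spec: str) -> Tuple[int, int]:
    """Split a --years value such as '2015,2100' into (first, last)."""
    first, last = (int(part) for part in spec.split(","))
    if first > last:
        raise ValueError(f"first year {first} is after last year {last}")
    return first, last


def parse_region(spec: str) -> Dict[str, slice]:
    """Turn a region value such as 'lat=0,2' into {'lat': slice(0, 2)}."""
    dim, _, bounds = spec.partition("=")
    start, stop = (int(part) for part in bounds.split(","))
    return {dim.strip(): slice(start, stop)}


def merge_new_attrs(root_attrs: Dict[str, str], pairs: Iterable[str]) -> Dict[str, str]:
    """Merge repeated key=value options into a copy of the root attrs.

    Assumption: a new key replaces an existing key of the same name.
    """
    merged = dict(root_attrs)
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        merged[key] = value  # only the first '=' splits, so values may contain '='
    return merged
```

A mapping of dimension name to slice, like the one parse_region returns, is the form that the region argument of xarray's Dataset.to_zarr accepts. Whether dodola builds its region this way is not confirmed by this PR description.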

@brews brews added the enhancement label Nov 9, 2021
@brews brews self-assigned this Nov 9, 2021
@brews brews changed the title Add QDM output Zarr priming, slicing, region writing, multi-year processing Add QDM output Zarr priming, region writing, multi-year processing Nov 9, 2021
@brews brews marked this pull request as ready for review November 10, 2021 04:42
@brews brews changed the title Add QDM output Zarr priming, region writing, multi-year processing Add QDM output Zarr priming, region writing, metadata, multi-year processing Nov 10, 2021
@brews brews merged commit 50a6714 into ClimateImpactLab:main Nov 10, 2021
@brews brews deleted the new_qdm branch November 10, 2021 04:48