Commit

Modernize regrid_time and allow setting a common calendar for decadal, yearly, and monthly data (#2311)

Co-authored-by: Valeriu Predoi <valeriu.predoi@gmail.com>
schlunma and valeriupredoi committed Apr 26, 2024
1 parent afde692 commit e147a51
Showing 6 changed files with 485 additions and 441 deletions.
63 changes: 54 additions & 9 deletions doc/recipe/preprocessor.rst
@@ -1248,8 +1248,7 @@ The ``_time.py`` module contains the following preprocessor functions:
 * resample_time_: Resample data
 * resample_hours_: Convert between N-hourly frequencies by resampling
 * anomalies_: Compute (standardized) anomalies
-* regrid_time_: Aligns the time axis of each dataset to have common time
-  points and calendars.
+* regrid_time_: Aligns the time coordinate of each dataset, against a standardized time axis.
 * timeseries_filter_: Allows application of a filter to the time-series data.
 * local_solar_time_: Convert cube with UTC time to local solar time.

@@ -1642,13 +1641,59 @@ See also :func:`esmvalcore.preprocessor.anomalies`.
 ``regrid_time``
 ---------------

-This function aligns the time points of each component dataset so that the Iris
-cubes from different datasets can be subtracted. The operation makes the
-datasets time points common; it also resets the time
-bounds and auxiliary coordinates to reflect the artificially shifted time
-points. Current implementation for monthly and daily data; the ``frequency`` is
-set automatically from the variable CMOR table unless a custom ``frequency`` is
-set manually by the user in recipe.
+This function aligns the time points and bounds of an input dataset according
+to the following rules:
+
+* Decadal data: 1 January 00:00:00 for the given year.
+  Example: 1 January 2005 00:00:00 for given year 2005 (decade 2000-2010).
+* Yearly data: 1 July 00:00:00 for each year.
+  Example: 1 July 1993 00:00:00 for the year 1993.
+* Monthly data: 15th day 00:00:00 for each month.
+  Example: 15 October 1993 00:00:00 for the month October 1993.
+* Daily data: 12:00:00 for each day.
+  Example: 14 March 1996 12:00:00 for the day 14 March 1996.
+* `n`-hourly data where `n` is a divisor of 24: center of each time interval.
+  Example: 03:00:00 for interval 00:00:00-06:00:00 (6-hourly data), 16:30:00
+  for interval 15:00:00-18:00:00 (3-hourly data), or 09:30:00 for interval
+  09:00:00-10:00:00 (hourly data).
+
+The frequency of the input data is automatically determined from the CMOR table
+of the corresponding variable, but can be overwritten in the recipe if
+necessary.
+This function does not alter the data in any way.
+
+.. note::
+
+   By default, this preprocessor will not change the calendar of the input time
+   coordinate.
+   For decadal, yearly, and monthly data, it is possible to change the calendar
+   using the optional `calendar` argument.
+   Be aware that changing the calendar might introduce (small) errors to your
+   data, especially for extensive quantities (those that depend on the period
+   length).
+
+Parameters:
+    * `frequency`: Data frequency.
+      If not given, use the one from the CMOR tables of the corresponding
+      variable.
+    * `calendar`: If given, transform the calendar to the one specified
+      (examples: `standard`, `365_day`, etc.).
+      This only works for decadal, yearly and monthly data, and will raise an
+      error for other frequencies.
+      If not set, the calendar will not be changed.
+    * `units` (default: `days since 1850-01-01 00:00:00`): Reference time units
+      used if the calendar of the data is changed.
+      Ignored if `calendar` is not set.
+
+Examples:
+
+Change the input calendar to `standard` and use custom units:
+
+.. code-block:: yaml
+
+    regrid_time:
+      calendar: standard
+      units: days since 2000-01-01

 See also :func:`esmvalcore.preprocessor.regrid_time`.
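
Note on the alignment rules documented in this hunk: the sketch below restates them with plain `datetime` objects so the worked examples in the text can be checked by hand. It is an illustration only, not the code in `esmvalcore`; the function name `align_time_point` is hypothetical, and the real preprocessor operates on Iris time coordinates rather than on single dates.

```python
from datetime import datetime, timedelta


def align_time_point(date: datetime, freq: str) -> datetime:
    """Return the standardized time point for ``date`` (illustrative only)."""
    if 'dec' in freq:
        # Decadal: 1 January 00:00:00 of the given year.
        return datetime(date.year, 1, 1)
    if 'yr' in freq:
        # Yearly: 1 July 00:00:00.
        return datetime(date.year, 7, 1)
    if 'mon' in freq or freq == 'mo':
        # Monthly: 15th day 00:00:00.
        return datetime(date.year, date.month, 15)
    if 'day' in freq:
        # Daily: 12:00:00.
        return datetime(date.year, date.month, date.day, 12)
    if 'hr' in freq:
        # n-hourly: center of the n-hour interval that contains ``date``.
        n_hours = int(freq.partition('hr')[0] or 1)
        if 24 % n_hours:
            raise NotImplementedError(
                f"`n` must be a divisor of 24, got '{freq}'"
            )
        start_hour = (date.hour // n_hours) * n_hours
        start = datetime(date.year, date.month, date.day, start_hour)
        return start + timedelta(hours=n_hours / 2)
    raise NotImplementedError(f"Unsupported frequency '{freq}'")


# The 3-hourly example from the documentation text.
print(align_time_point(datetime(1996, 3, 14, 15, 40), '3hr'))
```

The data values are never touched in this sketch, which matches the statement in the documentation that only the time points and bounds are aligned.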

8 changes: 3 additions & 5 deletions esmvalcore/_recipe/recipe.py
@@ -147,13 +147,11 @@ def _update_target_grid(dataset, datasets, settings):
_spec_to_latlonvals(**target_grid)


-def _update_regrid_time(dataset, settings):
+def _update_regrid_time(dataset: Dataset, settings: dict) -> None:
     """Input data frequency automatically for regrid_time preprocessor."""
-    regrid_time = settings.get('regrid_time')
-    if regrid_time is None:
+    if 'regrid_time' not in settings:
         return
-    frequency = settings.get('regrid_time', {}).get('frequency')
-    if not frequency:
+    if 'frequency' not in settings['regrid_time']:
         settings['regrid_time']['frequency'] = dataset.facets['frequency']
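
Note on the hunk above: the new version fills in `frequency` only when the key is absent from the `regrid_time` settings. A minimal sketch of that behaviour follows; it assumes a stand-in dataset object built with `SimpleNamespace` (the real `Dataset` class is not imported here), and the function name without the leading underscore is hypothetical.

```python
from types import SimpleNamespace


def update_regrid_time(dataset, settings: dict) -> None:
    """Mirror of the new logic shown in the diff (illustrative copy)."""
    if 'regrid_time' not in settings:
        return
    if 'frequency' not in settings['regrid_time']:
        settings['regrid_time']['frequency'] = dataset.facets['frequency']


# Hypothetical stand-in for a dataset whose CMOR frequency is monthly.
dataset = SimpleNamespace(facets={'frequency': 'mon'})

no_step = {}                                          # step not requested
defaulted = {'regrid_time': {'calendar': 'standard'}}  # frequency missing
explicit = {'regrid_time': {'frequency': 'day'}}       # set by the user

for settings in (no_step, defaulted, explicit):
    update_regrid_time(dataset, settings)

print(defaulted)
print(explicit)
```

One consequence that follows from the code as shown: the old check `if not frequency` also replaced a `frequency` key holding a falsy value, whereas the new membership test leaves any explicitly present key untouched.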


77 changes: 38 additions & 39 deletions esmvalcore/cmor/_fixes/shared.py
@@ -1,7 +1,7 @@
 """Shared functions for fixes."""
 import logging
 import os
-from datetime import datetime
+from datetime import datetime, timedelta
 from functools import lru_cache

 import dask.array as da
@@ -453,12 +453,12 @@ def get_next_month(month: int, year: int) -> tuple[int, int]:
 def get_time_bounds(time: Coord, freq: str) -> np.ndarray:
     """Get bounds for time coordinate.

-    For monthly data, use the first day of the current month and the first day
-    of the next month. For yearly or decadal data, use 1 January of the current
-    year and 1 January of the next year or 10 years from the current year. For
-    other frequencies (daily, 6-hourly, 3-hourly, hourly), half of the
-    frequency is subtracted/added from the current point in time to get the
-    bounds.
+    For decadal data, use 1 January 5 years before/after the current year. For
+    yearly data, use 1 January of the current year and 1 January of the next
+    year. For monthly data, use the first day of the current month and the
+    first day of the next month. For other frequencies (daily or `n`-hourly,
+    where `n` is a divisor of 24), half of the frequency is subtracted/added
+    from the current point in time to get the bounds.

     Parameters
     ----------
@@ -480,39 +480,38 @@ def get_time_bounds(time: Coord, freq: str) -> np.ndarray:
     """
     bounds = []
     dates = time.units.num2date(time.points)
-    for step, date in enumerate(dates):
-        month = date.month
-        year = date.year
-        if freq in ['mon', 'mo']:
-            next_month, next_year = get_next_month(month, year)
-            min_bound = date2num(datetime(year, month, 1, 0, 0),
-                                 time.units, time.dtype)
-            max_bound = date2num(datetime(next_year, next_month, 1, 0, 0),
-                                 time.units, time.dtype)
-        elif freq == 'yr':
-            min_bound = date2num(datetime(year, 1, 1, 0, 0),
-                                 time.units, time.dtype)
-            max_bound = date2num(datetime(year + 1, 1, 1, 0, 0),
-                                 time.units, time.dtype)
-        elif freq == 'dec':
-            min_bound = date2num(datetime(year, 1, 1, 0, 0),
-                                 time.units, time.dtype)
-            max_bound = date2num(datetime(year + 10, 1, 1, 0, 0),
-                                 time.units, time.dtype)
-        else:
-            delta = {
-                'day': 12.0 / 24,
-                '6hr': 3.0 / 24,
-                '3hr': 1.5 / 24,
-                '1hr': 0.5 / 24,
-            }
-            if freq not in delta:
+
+    for date in dates:
+        if 'dec' in freq:
+            min_bound = datetime(date.year - 5, 1, 1, 0, 0)
+            max_bound = datetime(date.year + 5, 1, 1, 0, 0)
+        elif 'yr' in freq:
+            min_bound = datetime(date.year, 1, 1, 0, 0)
+            max_bound = datetime(date.year + 1, 1, 1, 0, 0)
+        elif 'mon' in freq or freq == 'mo':
+            next_month, next_year = get_next_month(date.month, date.year)
+            min_bound = datetime(date.year, date.month, 1, 0, 0)
+            max_bound = datetime(next_year, next_month, 1, 0, 0)
+        elif 'day' in freq:
+            min_bound = date - timedelta(hours=12.0)
+            max_bound = date + timedelta(hours=12.0)
+        elif 'hr' in freq:
+            (n_hours_str, _, _) = freq.partition('hr')
+            if not n_hours_str:
+                n_hours = 1
+            else:
+                n_hours = int(n_hours_str)
+            if 24 % n_hours:
                 raise NotImplementedError(
-                    f"Cannot guess time bounds for frequency '{freq}'"
+                    f"For `n`-hourly data, `n` must be a divisor of 24, got "
+                    f"'{freq}'"
                 )
-            point = time.points[step]
-            min_bound = point - delta[freq]
-            max_bound = point + delta[freq]
+            min_bound = date - timedelta(hours=n_hours / 2.0)
+            max_bound = date + timedelta(hours=n_hours / 2.0)
+        else:
+            raise NotImplementedError(
+                f"Cannot guess time bounds for frequency '{freq}'"
+            )
         bounds.append([min_bound, max_bound])

-    return np.array(bounds)
+    return date2num(np.array(bounds), time.units, time.dtype)
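
Note on the new bounds logic: the branch structure can be exercised without Iris or cf_units by working on plain `datetime` objects. The sketch below is a simplified restatement under that assumption; `sketch_time_bounds` and `next_month_of` are hypothetical names, the latter standing in for `get_next_month`, and the final `date2num` conversion to numeric time values is deliberately left out.

```python
from datetime import datetime, timedelta


def next_month_of(month: int, year: int) -> tuple:
    """Stand-in for get_next_month: (month, year) of the following month."""
    if month == 12:
        return 1, year + 1
    return month + 1, year


def sketch_time_bounds(dates, freq: str) -> list:
    """Return (lower, upper) datetime bounds per date (illustrative only)."""
    bounds = []
    for date in dates:
        if 'dec' in freq:
            # Five years on either side of the given year.
            lower = datetime(date.year - 5, 1, 1)
            upper = datetime(date.year + 5, 1, 1)
        elif 'yr' in freq:
            lower = datetime(date.year, 1, 1)
            upper = datetime(date.year + 1, 1, 1)
        elif 'mon' in freq or freq == 'mo':
            month, year = next_month_of(date.month, date.year)
            lower = datetime(date.year, date.month, 1)
            upper = datetime(year, month, 1)
        elif 'day' in freq:
            lower = date - timedelta(hours=12)
            upper = date + timedelta(hours=12)
        elif 'hr' in freq:
            # A bare 'hr' is treated as hourly, as in the diff.
            n_hours_str = freq.partition('hr')[0]
            n_hours = int(n_hours_str) if n_hours_str else 1
            if 24 % n_hours:
                raise NotImplementedError(
                    f"`n` must be a divisor of 24, got '{freq}'"
                )
            lower = date - timedelta(hours=n_hours / 2.0)
            upper = date + timedelta(hours=n_hours / 2.0)
        else:
            raise NotImplementedError(
                f"Cannot guess time bounds for frequency '{freq}'"
            )
        bounds.append((lower, upper))
    return bounds


# Monthly bounds across a year boundary.
print(sketch_time_bounds([datetime(1993, 12, 15)], 'mon'))
```

Compared with the removed lookup table, parsing `n` from the frequency string accepts any divisor of 24 (for example `12hr`) instead of only the four hard-coded frequencies.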
