Add support for date-based restart frequency #363

Merged
merged 5 commits into from
Oct 12, 2023
33 changes: 23 additions & 10 deletions docs/source/config.rst
@@ -189,16 +189,29 @@ configuration.
ncpus: 0

``restart_freq`` (*Default:* ``5``)
-   Specifies the rate of saved restart files. For the default rate of 5, we
-   keep the restart files for every fifth run (``restart004``, ``restart009``,
-   ``restart014``, etc.).
-
-   Intermediate restarts are not deleted until a permanently archived restart
-   has been produced. For example, if we have just completed run ``11``, then
-   we keep ``restart004``, ``restart009``, ``restart010``, and ``restart011``.
-   Restarts 10 through 13 are not deleted until ``restart014`` has been saved.
-
-   ``restart_freq: 1`` saves all restart files.
+   Specifies how often restart files are saved. The rate can be either an
+   integer or date-based. For the default rate of 5, we
+   keep the restart files for every fifth run (``restart000``, ``restart005``,
+   ``restart010``, etc.). To save all restart files, set ``restart_freq: 1``.
+
+   If ``restart_history`` is not configured, intermediate restarts are not
+   deleted until a permanently archived restart has been produced.
+   For example, if we have just completed run ``11``, then
+   we keep ``restart000``, ``restart005``, ``restart010``, and ``restart011``.
+   Restarts 11 through 14 are not deleted until ``restart015`` has been saved.
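
The integer retention rule described above can be sketched in a few lines. This is a minimal illustration, assuming ``restart_history`` is not configured; ``restarts_to_keep`` is a hypothetical helper written for this note, not a function from payu.

```python
def restarts_to_keep(last_run, restart_freq=5):
    """Return the restart indices retained after run `last_run` completes.

    Hypothetical helper: it mirrors the documented rule, not payu's code.
    """
    # Permanently archived restarts: every restart_freq-th index from 0.
    permanent = [i for i in range(last_run + 1) if i % restart_freq == 0]
    # Intermediate restarts newer than the latest permanent one are held
    # back until the next permanent restart has been written.
    latest_permanent = permanent[-1]
    intermediate = list(range(latest_permanent + 1, last_run + 1))
    return permanent + intermediate


print(restarts_to_keep(11))  # [0, 5, 10, 11]
print(restarts_to_keep(15))  # [0, 5, 10, 15]
```

Once run 15 completes, the intermediate restart 11 (and 12 through 14) no longer appears in the retained list, which matches the example in the documentation text.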

+   To use a date-based restart frequency, specify a number with a time unit.
+   The supported time units are ``YS`` (year-start), ``MS`` (month-start),
+   ``W`` (week), ``D`` (day), ``H`` (hour), ``T`` (minute) and ``S`` (second).
+   For example, ``restart_freq: 10YS`` would save the earliest restart of the
+   year that is 10 years after the last permanently archived restart's datetime.
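
To make the ``10YS`` example concrete, here is a rough sketch of the selection it describes. It rests on several assumptions: the standard-library ``datetime`` stands in for the ``cftime`` datetimes the PR uses, both function names are hypothetical, and treating a restart dated exactly at the threshold as eligible is a guess rather than confirmed payu behaviour.

```python
import datetime


def year_start_threshold(last_archived, n_years):
    """Start of the year that is n_years after last_archived's year."""
    return datetime.datetime(last_archived.year + n_years, 1, 1)


def first_restart_to_archive(last_archived, candidates, n_years):
    """Earliest candidate datetime at or after the threshold, else None.

    Assumption: a restart dated exactly at the threshold is eligible.
    """
    threshold = year_start_threshold(last_archived, n_years)
    eligible = [dt for dt in sorted(candidates) if dt >= threshold]
    return eligible[0] if eligible else None


# Last permanently archived restart, then one restart per model year.
last = datetime.datetime(1900, 7, 1)
candidates = [datetime.datetime(y, 7, 1) for y in range(1901, 1915)]
print(first_restart_to_archive(last, candidates, 10))  # 1910-07-01 00:00:00
```

The threshold here is 1 January 1910, so the restart dated 1 July 1910 is the first one that would be archived permanently.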

+   Please note that currently, only ACCESS-OM2, MOM5 and MOM6 models support
+   date-based restart frequency, as it depends on the payu model driver being
+   able to parse restart files for a datetime.
+
+``restart_history``
+   Specifies how many of the most recent restart files to retain regardless of ``restart_freq``.

*The following model-based tags are typically not configured*

6 changes: 6 additions & 0 deletions docs/source/usage.rst
@@ -232,6 +232,12 @@ To run from an existing model run, also called a warm start, set the
``restart`` option to point to the folder containing the restart files
from a previous matching experiment.

+If the restart pruning configuration has changed, payu may warn that
+many restarts will be pruned as a result. If this is intended, use the
+``-F/--force-prune-restarts`` flag on the next run:
+
+   payu run --force-prune-restarts


Cleaning up
===========
121 changes: 115 additions & 6 deletions payu/calendar.py
@@ -1,5 +1,8 @@
-from dateutil.relativedelta import relativedelta
 import datetime
+import re
+
+from dateutil.relativedelta import relativedelta
+import cftime

NOLEAP, GREGORIAN = range(2)

@@ -17,8 +20,7 @@ def int_to_date(date):


 def date_to_int(date):
-
-    return (date.year * 10**4 + date.month * 10**2 + date.day)
+    return date.year * 10**4 + date.month * 10**2 + date.day


def runtime_from_date(start_date, years, months, days, seconds, caltype):
@@ -28,8 +30,9 @@ def runtime_from_date(start_date, years, months, days, seconds, caltype):
     Ignores Feb 29 for caltype == NOLEAP.
     """

-    end_date = start_date + relativedelta(years=years, months=months,
-                                          days=days)
+    end_date = start_date + relativedelta(
+        years=years, months=months, days=days
+    )
     runtime = end_date - start_date

     if caltype == NOLEAP:
@@ -67,7 +70,6 @@ def get_leapdays(init_date, final_date):
     leap_days = 0

     while curr_date != final_date:
-
         if curr_date.month == 2 and curr_date.day == 29:
             leap_days += 1

@@ -86,3 +88,110 @@ def calculate_leapdays(init_date, final_date):
     # TODO: Internal date correction (e.g. init_date is 1-March or later)

     return datetime.timedelta(days=leap_days)


+def add_year_start_offset_to_datetime(initial_dt, n):
+    """Return a cftime datetime at the start of the year, that is n years
+    from the initial datetime"""
+    return cftime.datetime(
+        year=initial_dt.year + n,
+        month=1,
+        day=1,
+        hour=0,
+        minute=0,
+        second=0,
+        calendar=initial_dt.calendar,
+    )
+
+
+def add_month_start_offset_to_datetime(initial_dt, n):
+    """Return a cftime datetime of the start of the month, that is n months
+    from the initial datetime"""
+    years_to_add = (initial_dt.month + n - 1) // 12
+    months_to_add = n - years_to_add * 12
+
+    return cftime.datetime(
+        year=initial_dt.year + years_to_add,
+        month=initial_dt.month + months_to_add,
+        day=1,
+        hour=0,
+        minute=0,
+        second=0,
+        calendar=initial_dt.calendar,
+    )
+
+
+def add_timedelta_fn(timedelta):
+    """Returns a function that takes initial datetime and multiplier n,
+    and returns a datetime that is n * offset from the initial datetime"""
+    return lambda initial_dt, n: initial_dt + n * timedelta
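
The year roll-over arithmetic in `add_month_start_offset_to_datetime` is easy to misread, so here is a small check of it on plain integers. `month_start_after` is a hypothetical stand-in written for this note; it repeats the two arithmetic lines from the function above without constructing a `cftime.datetime`.

```python
def month_start_after(year, month, n):
    """Return (year, month) of the month start n months after (year, month).

    Hypothetical helper mirroring the arithmetic above on plain integers.
    """
    # Whole years gained once the month count passes December.
    years_to_add = (month + n - 1) // 12
    # Remaining month shift; negative when the offset wraps into a new year.
    months_to_add = n - years_to_add * 12
    return year + years_to_add, month + months_to_add


print(month_start_after(2000, 12, 1))  # (2001, 1)
print(month_start_after(2000, 1, 11))  # (2000, 12)
print(month_start_after(2000, 6, 30))  # (2002, 12)
```

The first case shows why `months_to_add` can be negative: December plus one month becomes month `12 + (1 - 12) = 1` of the following year.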


+class DatetimeOffset:
+    """A utility class for adding various time offsets to cftime datetimes.
+
+    Parameters:
+        unit (str): The unit of the time offset. Supported units are:
+            - "YS" for years (start of the year)
+            - "MS" for months (start of the month)
+            - "W" for weeks
+            - "D" for days
+            - "H" for hours
+            - "T" for minutes
+            - "S" for seconds
+        magnitude (int): The magnitude of the time offset.
+
+    Methods:
+        - `add_to_datetime(initial_dt: cftime.datetime) -> cftime.datetime`:
+          Adds the specified time offset to the given cftime datetime and
+          returns the resulting datetime.
+
+    Attributes:
+        - unit (str): The unit of the time offset.
+        - magnitude (int): The magnitude of the time offset.
+    """
+
+    def __init__(self, unit, magnitude):
+        supported_datetime_offsets = {
+            "YS": add_year_start_offset_to_datetime,
+            "MS": add_month_start_offset_to_datetime,
+            "W": add_timedelta_fn(datetime.timedelta(weeks=1)),
+            "D": add_timedelta_fn(datetime.timedelta(days=1)),
+            "H": add_timedelta_fn(datetime.timedelta(hours=1)),
+            "T": add_timedelta_fn(datetime.timedelta(minutes=1)),
+            "S": add_timedelta_fn(datetime.timedelta(seconds=1)),
+        }
+        if unit not in supported_datetime_offsets:
+            raise ValueError(
+                f"Unsupported datetime offset: {unit}. "
+                "Supported offsets: YS, MS, W, D, H, T, S"
+            )
+        self.unit = unit
+        self.magnitude = magnitude
+        self._add_offset_to_datetime = supported_datetime_offsets[unit]
+
+    def add_to_datetime(self, initial_dt):
+        """Takes an initial cftime datetime,
+        and returns a datetime with the offset added"""
+
+        if not (isinstance(initial_dt, cftime.datetime)):
+            raise TypeError(
+                f"Invalid initial datetime type: {type(initial_dt)}. "
+                "Expected type: cftime.datetime"
+            )
+
+        return self._add_offset_to_datetime(
+            initial_dt=initial_dt, n=self.magnitude
+        )

+def parse_date_offset(offset):
+    """Parse a date offset string and return a DatetimeOffset"""
+    match = re.search("[0-9]+", offset)
+    if match is None:
+        raise ValueError(
+            f"No numerical value given for offset: {offset}"
+        )
+    n = match.group()
+    unit = offset.lstrip(n)
+    return DatetimeOffset(unit=unit, magnitude=int(n))
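
`parse_date_offset` above takes the first run of digits anywhere in the string and then strips leading digit characters, so a malformed value such as `YS10` is only rejected later, by the unit check in `DatetimeOffset`. For comparison, here is a stricter variant that anchors the whole string with `re.fullmatch`. It is a sketch and not the PR's implementation: `split_offset` and `SUPPORTED_UNITS` are hypothetical names, and it returns a plain tuple so that it runs without `cftime`.

```python
import re

# Units listed in the DatetimeOffset docstring above.
SUPPORTED_UNITS = ("YS", "MS", "W", "D", "H", "T", "S")


def split_offset(offset):
    """Split a string such as "10YS" into (magnitude, unit).

    Hypothetical stricter parser: digits must come first, followed only
    by a supported unit, with nothing before or after.
    """
    match = re.fullmatch(r"([0-9]+)([A-Za-z]+)", offset)
    if match is None:
        raise ValueError(f"Expected <number><unit>, got: {offset}")
    magnitude, unit = int(match.group(1)), match.group(2)
    if unit not in SUPPORTED_UNITS:
        raise ValueError(f"Unsupported datetime offset: {unit}")
    return magnitude, unit


print(split_offset("10YS"))  # (10, 'YS')
print(split_offset("6MS"))   # (6, 'MS')
```

Anchoring the pattern means the error for a malformed value points at its shape rather than at an odd-looking unit name.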
5 changes: 4 additions & 1 deletion payu/cli.py
@@ -89,7 +89,7 @@ def get_model_type(model_type, config):


 def set_env_vars(init_run=None, n_runs=None, lab_path=None, dir_path=None,
-                 reproduce=False, force=False):
+                 reproduce=False, force=False, force_prune_restarts=False):
     """Construct the environment variables used by payu for resubmissions."""
     payu_env_vars = {}

@@ -134,6 +134,9 @@ def set_env_vars(init_run=None, n_runs=None, lab_path=None, dir_path=None,
     if force:
         payu_env_vars['PAYU_FORCE'] = force

+    if force_prune_restarts:
+        payu_env_vars['PAYU_FORCE_PRUNE_RESTARTS'] = force_prune_restarts
+
     # Pass through important module related environment variables
     module_env_vars = ['MODULESHOME', 'MODULES_CMD', 'MODULEPATH', 'MODULEV']
     for var in module_env_vars: