# Dateutil

The third-party library `dateutil` is recommended in the Python standard library documentation.  It works with the same `datetime.datetime` objects used by the standard library, but adds many conveniences.  We saw in the previous lesson that it provides access to the IANA time zone database, which gives more accurate time zones.

In [1]:
from datetime import date, time, datetime, timedelta
start = datetime.now()

## Parser

The standard library can read ISO 8601 format, and can read other formats using explicit format codes that are easy to get wrong.  With `dateutil` we can heuristically parse most formats used to represent dates and datetimes.
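
For comparison, here is a minimal sketch of the standard-library approach mentioned above, using `datetime.fromisoformat()` and `datetime.strptime()`.  The sample strings and the variable names (`iso`, `stamp`, `wrong`) are chosen only for illustration.  The last call shows one way format codes can go wrong: with `%H` in place of `%I`, the `%p` marker is accepted but does not change the hour.

```python
from datetime import datetime

# ISO 8601 strings need no format codes
iso = datetime.fromisoformat("2020-01-31T12:30:45")
print(iso)          # 2020-01-31 12:30:45

# Any other layout needs format codes that match the string exactly
stamp = datetime.strptime("January 1, 2020 1:30:45 pm",
                          "%B %d, %Y %I:%M:%S %p")
print(stamp)        # 2020-01-01 13:30:45

# Using %H (24-hour) instead of %I (12-hour) raises no error,
# but the "pm" is silently ignored
wrong = datetime.strptime("January 1, 2020 1:30:45 pm",
                          "%B %d, %Y %H:%M:%S %p")
print(wrong)        # 2020-01-01 01:30:45
```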

In [2]:
from dateutil.parser import parse
from dateutil.tz import gettz
# Each value is an offset in seconds or a tzinfo object
tzinfos = {"MSK": +10800, "CST": gettz("America/Chicago")}

Various formats are guessed successfully.

In [3]:
parse('2020-01-31T12:30:45')

datetime.datetime(2020, 1, 31, 12, 30, 45)

In [4]:
parse('2020-01-31T12:30:45 MSK', tzinfos=tzinfos)

datetime.datetime(2020, 1, 31, 12, 30, 45, tzinfo=tzoffset('MSK', 10800))

In [5]:
parse('2020-01-31T12:30:45 CST', tzinfos=tzinfos)

datetime.datetime(2020, 1, 31, 12, 30, 45, tzinfo=tzfile('/usr/share/zoneinfo/America/Chicago'))

In [6]:
parse('January 1, 2020 1:30:45 pm +0500')

datetime.datetime(2020, 1, 1, 13, 30, 45, tzinfo=tzoffset(None, 18000))

Different locales write dates day-first or month-first.  For an ambiguous string, `parse()` assumes month-first unless `dayfirst=True` is given.

In [7]:
parse('01/02/2020')

datetime.datetime(2020, 1, 2, 0, 0)

In [8]:
parse('01/02/2020', dayfirst=True)

datetime.datetime(2020, 2, 1, 0, 0)
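
One way to guard against this ambiguity is to parse a string both ways and compare the results.  The helper below, `is_ambiguous()`, is a hypothetical name introduced here for illustration; it is not part of `dateutil`.  It is only meant for slash-separated dates like the ones above.

```python
from dateutil.parser import parse

def is_ambiguous(text):
    """Return True if day-first and month-first readings disagree."""
    return parse(text, dayfirst=False) != parse(text, dayfirst=True)

print(is_ambiguous("01/02/2020"))  # True: Jan 2 or Feb 1
print(is_ambiguous("31/01/2020"))  # False: 31 can only be a day
```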

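The day of the week is mostly ignored.  A recognized weekday name is accepted even when it does not match the date, but an unrecognized word or an impossible date raises an error.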

In [9]:
parse('Wednesday Aug 12; 2020; 05:29:12 PM')

datetime.datetime(2020, 8, 12, 17, 29, 12)

In [10]:
parse('Monday Aug 12; 2020; 05:29:12 PM')

datetime.datetime(2020, 8, 12, 17, 29, 12)

In [11]:
try:
    parse('Humpday Aug 08; 2020; 05:29:12 PM')
except Exception as err:
    print(err.__class__.__name__, "|", err)

ParserError | Unknown string format: Humpday Aug 08; 2020; 05:29:12 PM


In [12]:
try:
    parse('Thursday Sep 31; 2020; 05:29:12 PM')
except Exception as err:
    print(err.__class__.__name__, "|", err)

ParserError | day is out of range for month: Thursday Sep 31; 2020; 05:29:12 PM


## Timedeltas

In the standard library, `timedelta` objects are useful for measuring durations and adding them to `datetime` objects.  They have some limitations, notably that they only handle fixed-length units, the largest being weeks (stored internally as days).

In commerce and ordinary life, we often think of durations in terms of months and years, even though months vary in length and years may or may not be leap years.

For example, let us take two dates (here as `datetime`s, but simple `date`s would work for this purpose).  One we colloquially call "end of January," the other "end of February."  We might want to *move forward a month* from each.

In [13]:
from dateutil.relativedelta import relativedelta, MO
jan_end = datetime(2020, 1, 31)
feb_end = datetime(2020, 2, 29)

In [14]:
jan_end + timedelta(days=30)

datetime.datetime(2020, 3, 1, 0, 0)

In [15]:
feb_end + timedelta(days=30)

datetime.datetime(2020, 3, 30, 0, 0)

Those are unsatisfying answers that are numerically "correct" but not what we probably mean.  `relativedelta` from dateutil is more flexible.

In [16]:
jan_end + relativedelta(months=1)

datetime.datetime(2020, 2, 29, 0, 0)

In [17]:
feb_end + relativedelta(months=1)

datetime.datetime(2020, 3, 29, 0, 0)
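
The month arithmetic keeps the day of the month where it can and clamps it to the last valid day otherwise.  A rough standard-library sketch of that idea follows.  The helper `add_months()` is hypothetical and handles whole months only; it is not how `dateutil` implements `relativedelta`.

```python
import calendar
from datetime import datetime

def add_months(dt, months):
    """Shift dt by whole months, clamping the day to the month's length."""
    # Count months from year 0 so that year rollover is simple division
    index = dt.year * 12 + (dt.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))

print(add_months(datetime(2020, 1, 31), 1))   # 2020-02-29 00:00:00
print(add_months(datetime(2020, 2, 29), 1))   # 2020-03-29 00:00:00
print(add_months(datetime(2020, 1, 31), 13))  # 2021-02-28 00:00:00
```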

We might combine different "human scale" increments.

In [18]:
# A year and a month later
jan_end + relativedelta(years=1, months=1)

datetime.datetime(2021, 2, 28, 0, 0)

In [19]:
# The Monday on or before the date a year and a month later
jan_end + relativedelta(years=1, months=1, weekday=MO(-1))

datetime.datetime(2021, 2, 22, 0, 0)
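
Note that `weekday=MO(-1)` means the Monday *on or before* the computed date, so a date that is already a Monday is left unchanged.  The sketch below illustrates this with fixed dates.  Stepping back one day first is one way to get a strictly earlier Monday, since the `days` offset is applied before the weekday adjustment.

```python
from datetime import datetime
from dateutil.relativedelta import relativedelta, MO

sunday = datetime(2021, 2, 28)
monday = datetime(2021, 2, 22)

print(sunday + relativedelta(weekday=MO(-1)))           # 2021-02-22 00:00:00
print(monday + relativedelta(weekday=MO(-1)))           # 2021-02-22 00:00:00
print(monday + relativedelta(days=-1, weekday=MO(-1)))  # 2021-02-15 00:00:00
```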

## Recurring Events

The `dateutil` library allows you to create collections of recurring or related dates and datetimes.  By combining arguments to `rrule()`, `rruleset()`, and `rrulestr()`, we can generate iterators over these collections of related times.

In [20]:
from dateutil.rrule import (rrule, rruleset, rrulestr,
                            MINUTELY, DAILY, YEARLY, TU, TH)

Every two-and-a-half hours from now until the same time tomorrow.

In [21]:
alarms = rrule(MINUTELY, interval=150,
               dtstart=start,
               until=start + relativedelta(days=1))
for alarm in alarms:
    print(alarm)

2020-08-21 17:38:44
2020-08-21 20:08:44
2020-08-21 22:38:44
2020-08-22 01:08:44
2020-08-22 03:38:44
2020-08-22 06:08:44
2020-08-22 08:38:44
2020-08-22 11:08:44
2020-08-22 13:38:44
2020-08-22 16:08:44
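
Because `start` comes from `datetime.now()`, the output above changes on every run.  As a reproducible sketch, the same rule can be anchored at a fixed datetime and checked against a plain `timedelta` loop.  The names `fixed_start`, `fixed_alarms`, and `manual` are introduced here only for illustration.

```python
from datetime import datetime, timedelta
from dateutil.rrule import rrule, MINUTELY

fixed_start = datetime(2020, 8, 21, 17, 38, 44)  # stand-in for datetime.now()
until = fixed_start + timedelta(days=1)

fixed_alarms = list(rrule(MINUTELY, interval=150,
                          dtstart=fixed_start, until=until))

# The same sequence built by repeated timedelta addition
manual = []
current = fixed_start
while current <= until:
    manual.append(current)
    current += timedelta(minutes=150)

print(len(fixed_alarms), fixed_alarms == manual)  # 10 True
```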


United States presidential elections follow a slightly odd rule.  They occur every four years on the first Tuesday that follows a Monday in November, which always falls between November 2 and November 8.  When are the next five of them from right now?

In [22]:
elections = rrule(YEARLY, interval=4, count=5, 
                  bymonth=11,
                  byweekday=TU, 
                  bymonthday=(2, 3, 4, 5, 6, 7, 8),
                  dtstart=start)

for election in elections:
    print(election.date())

2020-11-03
2024-11-05
2028-11-07
2032-11-02
2036-11-04
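
As a cross-check, the same rule can be written by hand with only the standard library.  The function `election_day()` is a hypothetical helper for this sketch: it finds the first Monday in November and adds one day.

```python
from datetime import date, timedelta

def election_day(year):
    """Tuesday after the first Monday in November of the given year."""
    nov1 = date(year, 11, 1)
    # date.weekday() numbers Monday as 0
    first_monday = nov1 + timedelta(days=(0 - nov1.weekday()) % 7)
    return first_monday + timedelta(days=1)

for year in range(2020, 2037, 4):
    print(election_day(year))
```

This prints the same five dates as the `rrule` version above.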


Sometimes we *almost* want a recurrence rule, but we need to include something extra or exclude something that would otherwise occur.  Here is this time of day on each of the next seven days, excluding Tuesday and Thursday.

In [23]:
events = rruleset()
events.rrule(rrule(DAILY, count=7, dtstart=start))
events.exrule(rrule(DAILY, byweekday=(TU, TH), dtstart=start))

for event in events:
    print(event)

2020-08-21 17:38:44
2020-08-22 17:38:44
2020-08-23 17:38:44
2020-08-24 17:38:44
2020-08-26 17:38:44

In [27]:
type(events)

dateutil.rrule.rruleset


Perhaps we would like to add additional datetimes to a collection.  For example, on Tuesday/Thursday we want an event, but at exactly noon, rather than based on current time.

In [24]:
noon = start.replace(hour=12, minute=0, second=0, microsecond=0)
events.rrule(rrule(DAILY, 
                   dtstart=noon, 
                   until=noon+relativedelta(days=7), 
                   byweekday=(TU, TH)))

for event in events:
    print(event)

2020-08-21 17:38:44
2020-08-22 17:38:44
2020-08-23 17:38:44
2020-08-24 17:38:44
2020-08-25 12:00:00
2020-08-26 17:38:44
2020-08-27 12:00:00
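
An `rruleset` also accepts individual datetimes through its `rdate()` and `exdate()` methods, which is convenient for one-off additions or cancellations.  The following is a small sketch with fixed dates; the name `schedule` and the specific times are made up for the example.

```python
from datetime import datetime
from dateutil.rrule import rrule, rruleset, DAILY

first = datetime(2020, 8, 21, 9, 0)

schedule = rruleset()
schedule.rrule(rrule(DAILY, count=4, dtstart=first))  # Aug 21-24 at 09:00
schedule.exdate(datetime(2020, 8, 22, 9, 0))          # cancel one occurrence
schedule.rdate(datetime(2020, 8, 30, 15, 30))         # add a one-off event

for when in schedule:
    print(when)
# 2020-08-21 09:00:00
# 2020-08-23 09:00:00
# 2020-08-24 09:00:00
# 2020-08-30 15:30:00
```

An excluded datetime must match an occurrence exactly, including the time of day, to remove it.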


We can also read fragments of Internet Calendaring and Scheduling Core Object Specification (iCalendar) string descriptions of recurring events.  Only the timestamp portions are recognized; other metadata must be stripped separately.

In [25]:
meetings = rrulestr(f"""
  DTSTART:{date.today()}T14:00:00
  RRULE:FREQ=DAILY;INTERVAL=10;COUNT=5
  RRULE:FREQ=DAILY;INTERVAL=5;COUNT=3""")
for meeting in meetings:
    print(meeting)

2020-08-21 14:00:00
2020-08-26 14:00:00
2020-08-31 14:00:00
2020-09-10 14:00:00
2020-09-20 14:00:00
2020-09-30 14:00:00
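
The kind of object `rrulestr()` returns depends on the input.  As far as I understand its behavior, a string with a single `RRULE` line produces an `rrule`, while several `RRULE` lines produce an `rruleset` whose overlapping occurrences are reported once.  The sketch below uses a fixed `DTSTART` in the compact iCalendar form so the result is reproducible.

```python
from dateutil.rrule import rrule, rruleset, rrulestr

single = rrulestr(
    "DTSTART:20200821T140000\n"
    "RRULE:FREQ=DAILY;INTERVAL=10;COUNT=5"
)
multi = rrulestr(
    "DTSTART:20200821T140000\n"
    "RRULE:FREQ=DAILY;INTERVAL=10;COUNT=5\n"
    "RRULE:FREQ=DAILY;INTERVAL=5;COUNT=3"
)

print(type(single).__name__, len(list(single)))  # rrule 5
print(type(multi).__name__, len(list(multi)))    # rruleset 6
```

The two rules share August 21 and August 31, which is why the combined set has six entries rather than eight.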
