![Erudio logo](img/erudio-logo-small.png)

# Time Zones

Timezones are a critical aspect of programming that often poses challenges for developers. Understanding them is not just about dealing with different regions of the globe; it is a key skill for creating robust and accurate applications with time-related functionality.

## The Nature of Time

Getting datetimes **entirely** right in the face of timezones is surprisingly difficult, and the more effort you put into it, the more difficulty you discover.  In this lesson, we will make a first approximation, and a second approximation, toward getting the general problem of timezones right.  Subsequent approximations are outside the scope of this course.

In [1]:
from datetime import timezone, datetime, timedelta
from dateutil import tz
start = datetime.now()

## First Pass

The Python standard library module `datetime` has rudimentary handling of timezones and provides a framework for the third-party tool we discuss in this lesson.  In this lesson, we will only look at `datetime` objects (`time` objects can also carry a `tzinfo`, but `date` and `timedelta` objects have no concept of timezone).

Suppose we have several geographic locations in our organization, and they record events that occur at their location.  The record they make—this might be a server log file, or it might be a human action, for example—shows the local time when something occurred.

Being able to ask questions like whether one event occurred before another, or what the duration between two events was, is often relevant.  Think about bank or trading transactions, for example, or about latency in computer servers being monitored.

We can define time zones where these local offices are operating and recording.

In [2]:
katmandu = timezone(timedelta(hours=5, minutes=45), name="Nepal")
havana = timezone(timedelta(hours=-5), name="Cuba")
newyork = timezone(timedelta(hours=-5), name="US-East")
nome = timezone(timedelta(hours=-9), name="US-Artic")

for server in [katmandu, havana, newyork, nome]:
    print(f"{str(server):>10} | {server.utcoffset(start)}")

     Nepal | 5:45:00
      Cuba | -1 day, 19:00:00
   US-East | -1 day, 19:00:00
  US-Artic | -1 day, 15:00:00


Let us record several events that we might want to process, about activities of Santa Claus and his elves.  Notice that now when we create datetimes, we add the optional `tzinfo` field with a timezone object in each.

In [3]:
elves = {"Give Hartaj train set":
             datetime(2020, 12, 24, 13, 30, 45, tzinfo=katmandu),
         "Build a train set":
             datetime(2020, 12, 23, 22, 35, 45, tzinfo=nome),
         "Bagels for Spring Party":
             datetime(2020, 3, 7, 14, 0, 0, tzinfo=newyork),
         "Rum for Spring Party":
             datetime(2020, 3, 8, 0, 30, 0, tzinfo=havana),
         "Lox for Spring Party":
             datetime(2020, 3, 8, 0, 30, 0, tzinfo=newyork),
         "Back for Pineapple":
             datetime(2020, 3, 8, 14, 0, 0, tzinfo=havana),
        }

Santa's workshop would like to examine how efficient his magic elves are in their "build on demand" schedule for toys, but the event times in various time zones make it less than immediately apparent.

In [4]:
build_on_demand = (elves['Give Hartaj train set'] 
                     - elves['Build a train set'])
print(build_on_demand)

0:10:00


Not bad, managing to build the train then deliver it to Katmandu in 10 minutes.  Magic powers definitely aid in the Taylorist supply chain.
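
Subtracting two aware datetimes that carry different `tzinfo` objects works because Python first adjusts each to UTC.  A minimal sketch that makes the conversion explicit, reusing the fixed offsets from the cells above (the names `give` and `build` are only for illustration):

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, matching the ones defined earlier in this lesson
katmandu = timezone(timedelta(hours=5, minutes=45))
nome = timezone(timedelta(hours=-9))

give = datetime(2020, 12, 24, 13, 30, 45, tzinfo=katmandu)
build = datetime(2020, 12, 23, 22, 35, 45, tzinfo=nome)

# Express both events on the same UTC clock
give_utc = give.astimezone(timezone.utc)
build_utc = build.astimezone(timezone.utc)
print(build_utc.isoformat())
print(give_utc.isoformat())

# Same duration as subtracting the original objects
print(give_utc - build_utc)
```

Seen on the UTC clock, both events fall on the morning of December 24, ten minutes apart.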

We might also want to compare events in different time zones for equality or inequality.

In [5]:
print(elves['Rum for Spring Party'].tzinfo, "/",
      elves['Lox for Spring Party'].tzinfo)
elves['Rum for Spring Party'] == elves['Lox for Spring Party']

Cuba / US-East


True

In [6]:
elves["Bagels for Spring Party"] < elves["Rum for Spring Party"]

True
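
These comparisons only work when both operands are timezone-aware.  As a small sketch (the names `aware` and `naive` are illustrative), mixing an aware datetime with a naive one raises `TypeError` for ordering, while equality simply evaluates to `False`:

```python
from datetime import datetime, timedelta, timezone

newyork = timezone(timedelta(hours=-5), name="US-East")
aware = datetime(2020, 3, 7, 14, 0, 0, tzinfo=newyork)
naive = datetime(2020, 3, 7, 14, 0, 0)  # no tzinfo attached

# Equality between a naive and an aware datetime is always False
print(aware == naive)

# Ordering comparisons refuse to guess and raise TypeError
try:
    aware < naive
    ordering_error = None
except TypeError as err:
    ordering_error = err
    print("TypeError |", err)
```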

We can ask of any datetime what its offset from Coordinated Universal Time (UTC) is.

In [7]:
bagels = elves['Bagels for Spring Party']
print(bagels.tzname(), datetime.utcoffset(bagels))

US-East -1 day, 19:00:00


In [8]:
build = elves['Build a train set']
print(build.tzname(), datetime.utcoffset(build))

US-Artic -1 day, 15:00:00


## Second Pass

The simplified time zones defined using standard Python handle a fixed offset from UTC.  Actual time zones are much more complicated though.  Here we will use the third-party module `dateutil` to access the *IANA time zone database*—also called *tzdata*, the *zoneinfo database*, or *Olson database*—to get more sophisticated `timezone` objects.

At the time I am writing, Python 3.9 is in release candidate 1.  When 3.9 arrives, it will contain a new standard library module called `zoneinfo`, which will provide access to the IANA time zone database.  However, `dateutil` will remain a useful module, since it contains numerous other capabilities; it is discussed more in the next lesson.
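
For readers on Python 3.9 or later, a minimal sketch of the standard library route follows.  It assumes the IANA database is available to `zoneinfo`, either from the operating system or from the `tzdata` package on PyPI; the sample dates are arbitrary.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # standard library since Python 3.9

newyork = ZoneInfo("America/New_York")

winter = datetime(2020, 1, 15, 12, 0, tzinfo=newyork)
summer = datetime(2020, 7, 15, 12, 0, tzinfo=newyork)

# Name and offset depend on the date being asked about
for dt in (winter, summer):
    print(dt.tzname(), dt.utcoffset())
```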

### History

Some arcane difficulties arise with historical changes to calendrics.  If you are only concerned with working with "modern" datetimes from the last few decades, much of this has been standardized and officially documented.  In the last lesson, I passingly mentioned the transition from the Julian to the Gregorian calendar in England and its colonies, in 1752.  Most of Europe transitioned a century or two earlier; but Russia, for example, only did so in 1918, and therefore has an analogous gap: February 1-13, 1918 did not exist there.

Or in another similar footnote, between 1155 and 1752 CE, England celebrated its new year on March 25 (Lady Day/Feast of the Annunciation), rather than on January 1.  Which is to say, for example, one day after Walter Raleigh was given a charter to colonize Virginia, in 1584, the date was March 26, 1585.  Automating calculations of historical durations is tricky.

In a perfect world, some sophisticated database or marker of the meaning of a date and time at a particular date, time, and place, might work out all these special rules for us. I do not know of any software that exists that does that.

In [9]:
soviet = None  # Placeholder, not actually doing this right
olddays = datetime(1918, 1, 31, tzinfo=soviet)
newdays = datetime(1918, 2, 14, tzinfo=soviet)

In [10]:
fmt = "%B %d, %Y"  # %Y is the calendar year; %G is the ISO 8601 week-based year
print(" Old:", datetime.strftime(olddays, fmt))
print(" New:", datetime.strftime(newdays, fmt))
print("Diff:", newdays - olddays)
print("Wish: SHOULD BE 1 day")

 Old: January 31, 1918
 New: February 14, 1918
Diff: 14 days, 0:00:00
Wish: SHOULD BE 1 day


### Since 1970

Let us assume we are only worried about datetimes after 1970.  The IANA timezone database covers time zones, and their changes back to the start of Unix time.  Even here there are complications.  Often using the database will do the right thing; but exceptions remain.

Different nations, or other jurisdictions (such as different US states), follow different rules about Daylight Saving Time (DST).  Whether a time change of an hour (or occasionally two hours) occurred between two datetimes depends, in part, on whether DST went into or came out of effect, in either or both timezone-sensitive objects.  

Notably, the southern and northern hemispheres have opposite seasonality, so locations in each typically jump in opposite directions around similar dates.  Of course, many different specific times for a DST change exist legislatively in different jurisdictions, if that system is used at all in a jurisdiction.  

Let us define several time zones using the IANA timezone database.  Notice that when we ask for name or offset, it is specific to the datetime asked about.  For example, New York is either EST or EDT, depending on the time of year.

In [11]:
katmandu = tz.gettz("Asia/Kathmandu")
havana = tz.gettz("America/Havana")
newyork = tz.gettz("America/New_York")
nome = tz.gettz("America/Nome")

for server in [katmandu, havana, newyork, nome]:
    print(f"{str(server):>47} | "
          f"{server.tzname(start):>5} | "
          f"{server.utcoffset(start)}")

                       tzfile('Asia/Kathmandu') | +0545 | 5:45:00
                                 tzfile('Cuba') |   CST | -1 day, 19:00:00
                           tzfile('US/Eastern') |   EST | -1 day, 19:00:00
                         tzfile('America/Nome') |  AKST | -1 day, 15:00:00


With our more nuanced timezones available, let us reconfigure our events to use these improved time zones.  The setup is identical, but the timezone objects are no longer the simple offsets from UTC.

In [12]:
elves2 = {"Give Hartaj train set":
              datetime(2020, 12, 24, 13, 30, 45, tzinfo=katmandu),
          "Build a train set":
              datetime(2020, 12, 23, 22, 35, 45, tzinfo=nome),
          "Bagels for Spring Party":
              datetime(2020, 3, 7, 14, 0, 0, tzinfo=newyork),
          "Rum for Spring Party":
              datetime(2020, 3, 8, 0, 30, 0, tzinfo=havana),
          "Lox for Spring Party":
              datetime(2020, 3, 8, 0, 30, 0, tzinfo=newyork),
          "Back for Pineapple":
              datetime(2020, 3, 8, 14, 0, 0, tzinfo=havana),
         }

We can ask the same questions we did before.  For example, this does not change the "build on demand" duration for magic elves.

In [13]:
build_on_demand = (elves2['Give Hartaj train set'] 
                     - elves2['Build a train set'])
print(build_on_demand)

0:10:00


Other questions give different answers.  Remember that with the fixed-offset time zones, the elves got rum in Havana and lox in New York at the very same moment.  Something funny happens when we use the database.

In [14]:
print("Havana", elves2['Back for Pineapple'])
print("Havana", elves2['Rum for Spring Party'])
# First purchase 12:30 am local time
print(elves2['Back for Pineapple'] - elves2['Rum for Spring Party'])

Havana 2020-03-08 14:00:00-04:00
Havana 2020-03-08 00:30:00-04:00
13:30:00


In [15]:
print("Havana  ", elves2['Back for Pineapple'])
print("New York", elves2['Lox for Spring Party'])
# First purchase 12:30 am local time
print(elves2['Back for Pineapple'] - elves2['Lox for Spring Party'])

Havana   2020-03-08 14:00:00-04:00
New York 2020-03-08 00:30:00-05:00
12:30:00


That was strange! Surely 12:30 am to 14:00 is 13.5 hours.  The problem is that on March 8, daylight savings time jumps ahead by an hour.

The further problem is that it does so at midnight in Havana, but at 2 am in New York.  Moreover, when the clock jumps ahead, some period of time simply does not exist (much as some dates did not exist historically, but in a place-dependent way).

In [16]:
for event in ['Bagels for Spring Party', 'Rum for Spring Party',
              'Lox for Spring Party', 'Back for Pineapple']:
    dt = elves2[event]
    print(f"{event:>25} | datetime exists? {tz.datetime_exists(dt)}")

  Bagels for Spring Party | datetime exists? True
     Rum for Spring Party | datetime exists? False
     Lox for Spring Party | datetime exists? True
       Back for Pineapple | datetime exists? True


There is the problem.  The time the elf recorded for picking up rum in Havana never existed at all as a local time.  Someone is cooking the books!
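
One common way to detect such a skipped local time without `dateutil` is a round trip through UTC.  The sketch below assumes Python 3.9+ with `zoneinfo` and an available IANA database; `wall_time_exists` is a hypothetical helper name, and New York's 2 am jump is used as the example.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def wall_time_exists(dt):
    """Return True if the aware datetime's wall-clock reading occurs in its zone."""
    round_trip = dt.astimezone(timezone.utc).astimezone(dt.tzinfo)
    # Compare wall-clock fields only; a skipped time comes back shifted
    return round_trip.replace(tzinfo=None) == dt.replace(tzinfo=None)

newyork = ZoneInfo("America/New_York")
skipped = datetime(2020, 3, 8, 2, 30, tzinfo=newyork)  # inside the spring gap
regular = datetime(2020, 3, 8, 3, 30, tzinfo=newyork)  # after the jump

print(wall_time_exists(skipped))
print(wall_time_exists(regular))
```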

Let us think about another duration.  How long is it from March 7 at 2 pm in New York, until March 8 at 2 pm in Havana (at least in 2020)?

In [17]:
print(elves2['Back for Pineapple'] 
        - elves2['Bagels for Spring Party'])

23:00:00


These datetimes are, naturally, 23 hours apart.  Both locations had their jump forward for daylight saving at some point between those two datetimes.  Of course, sometimes DST moves in the other direction.  In that case, it is not that some datetimes are skipped, but rather that some occur twice.  We can ask about that ambiguity.

In [18]:
back1 = datetime(2020, 11, 1, 1, 30, 0, tzinfo=newyork)
back2 = datetime(2020, 11, 1, 1, 31, 0, tzinfo=newyork)
later = datetime(2020, 11, 1, 3, 0, 0, tzinfo=newyork)
for dt in [back1, back2, later]:
    print(dt, "| ambiguous?", tz.datetime_ambiguous(dt))

2020-11-01 01:30:00-04:00 | ambiguous? True
2020-11-01 01:31:00-04:00 | ambiguous? True
2020-11-01 03:00:00-05:00 | ambiguous? False


We could *try* to make comparisons of these ambiguous datetimes, but the results do not give us great confidence.  Those two datetimes *might be* a minute apart, but we really do not know.

In [19]:
back2 - back1

datetime.timedelta(seconds=60)

We can explicitly disambiguate by specifying the argument `fold`, which says whether we want the first or second occurrence of an ambiguous datetime.  Unfortunately, this only gives us a marker of the time repetition; it does not do the appropriate arithmetic for us.  By default, a datetime assumes it is unfolded (earlier).

In [20]:
back3 = datetime(2020, 11, 1, 1, 30, 0, tzinfo=newyork, fold=False)
back4 = datetime(2020, 11, 1, 1, 31, 0, tzinfo=newyork, fold=True)
print(back4 - back3)

back5 = datetime(2020, 11, 1, 1, 30, 0, tzinfo=newyork, fold=True)
back6 = datetime(2020, 11, 1, 1, 31, 0, tzinfo=newyork, fold=False)
print(back6 - back5)

back7 = datetime(2020, 11, 1, 1, 30, 0, tzinfo=newyork, fold=True)
back8 = datetime(2020, 11, 1, 1, 31, 0, tzinfo=newyork, fold=True)
print(back8 - back7)

0:01:00
0:01:00
0:01:00


Let us do better, slightly by hand.

In [21]:
def fold_compare(dt1, dt2):
    "How long from dt1 until dt2? (possibly negative)"
    delta = timedelta(hours=dt2.fold) - timedelta(hours=dt1.fold) 
    return (dt2 - dt1) + delta

print("Fold none:", fold_compare(back1, back2))
print(" Fold 2nd:", fold_compare(back3, back4))
print(" Fold 1st:", fold_compare(back5, back6))
print("Fold both:", fold_compare(back7, back8))

Fold none: 0:01:00
 Fold 2nd: 1:01:00
 Fold 1st: -1 day, 23:01:00
Fold both: 0:01:00
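
An alternative to adjusting by hand is to convert both values to UTC before subtracting.  Arithmetic between two datetimes that share one `tzinfo` object works on wall-clock values, whereas conversion to UTC takes `fold` into account.  A sketch assuming Python 3.9+ `zoneinfo` with an available IANA database (`elapsed` is a hypothetical helper name):

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

newyork = ZoneInfo("America/New_York")

def elapsed(dt1, dt2):
    """Elapsed time from dt1 until dt2, measured on the UTC clock."""
    return dt2.astimezone(timezone.utc) - dt1.astimezone(timezone.utc)

first_0130 = datetime(2020, 11, 1, 1, 30, tzinfo=newyork, fold=0)   # EDT reading
second_0131 = datetime(2020, 11, 1, 1, 31, tzinfo=newyork, fold=1)  # EST reading

print(second_0131 - first_0130)          # wall-clock arithmetic
print(elapsed(first_0130, second_0131))  # arithmetic on the UTC clock
```

Under these assumptions the UTC-based result agrees with the "Fold 2nd" case computed by `fold_compare()`.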


## Down the rabbit hole...

Often, using the IANA timezone database, either via the `dateutil` module, the future standard library `zoneinfo`, or using another module like `pytz` that references the same database, will do the right thing.  *Often*. **Not always**.

One problem is that leap seconds are simply ignored by `datetime`.  The thing to understand is that a "second" in UTC (and in all the timezones that are defined by offsets from UTC) is **not a measure of time**.  It is a measure of the angular rotation of the Earth.  Specifically, it is approximately 15.041 seconds of angular longitude (relative to the sun).  This odd number comes from the fact that Earth needs to rotate slightly more than 360° to return to facing the sun because of simultaneous prograde revolution around the sun.

Unfortunately, the speed of the rotation of the Earth is not constant; rather it varies slightly because of plate tectonics, magma flow, and the tidal locking of the moon's orbit.  Hence, the International Earth Rotation and Reference Systems Service (IERS) adds (or possibly subtracts, though that has not occurred since 1970) a second to UTC from time to time.  This happens slightly less often than once a year, with a few months' warning.

Here is an example of a leap second that `datetime`, and in fact also the IANA timezone database, simply ignores.

In [22]:
before = datetime(2008, 12, 31, 18, 59, 59, tzinfo=tz.gettz('EST'))
after = datetime(2008, 12, 31, 19, 0, 0, tzinfo=tz.gettz('EST'))
print("SHOULD BE 2 seconds")
after - before

SHOULD BE 2 seconds


datetime.timedelta(seconds=1)

Not only does Python's `datetime` not know how to handle those leap seconds, your operating system and other applications do not know how to handle them either.  "Unix time" is also not a measure of time, but rather builds in the assumption that a day consists of 86,400 seconds, whether or not a particular day actually has one more or one less second according to UTC.  This is wrong about once a year.  Windows, macOS, iOS, and Android behave the same as Linux and other Unix-like systems in this regard.

What actually happens is that your computer checks a Network Time Protocol (NTP) server fairly often, when it is online, and after leap second events, the NTP server tells your computer that it is a second off.  These checks happen other times as well, of course.  Your computer quite likely adjusts its internal clock by a few microseconds every day, even in the absence of leap seconds.

### Does it Matter?

If you are trying to remember your family's birthdays or set a wake-up alarm, these leap seconds do not matter.  However, if you are trying to coordinate computer servers, they quite likely do.  For example, if your database has transactions with commit and rollback capabilities (like nearly all RDBMSs do), knowing the actual order of commands that might be less than a second apart is crucial.

### Descending Further

Even some of the things that the timezone database *should* handle, it seems not to; at least not as exposed in Python.  For example, Samoa and Tokelau decided in 2011 to switch to the other side of the International Date Line. As a consequence:

In [23]:
# Capital city Apia canonical TZ description (Samoa is alias)
before = datetime(2011, 12, 29, 23, 59, 0, tzinfo=tz.gettz('Pacific/Apia'))
after = datetime(2011, 12, 31, 0, 0, 0, tzinfo=tz.gettz('Pacific/Apia'))

print("SHOULD BE 1 minute")
print(after - before)

SHOULD BE 1 minute
1 day, 0:01:00


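The surprise here comes from Python's arithmetic rather than from the database.  When two datetimes share the same `tzinfo` object, subtraction compares their wall-clock values and ignores UTC offsets, so the skipped calendar day is counted as elapsed time.  Converting both values to UTC before subtracting accounts for the 24-hour jump forward.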
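
A sketch of the same subtraction carried out on the UTC clock rather than on local wall-clock values.  It assumes Python 3.9+ `zoneinfo` with an IANA database that records the 2011 change; under that assumption the difference comes out as one minute.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

apia = ZoneInfo("Pacific/Apia")
before = datetime(2011, 12, 29, 23, 59, 0, tzinfo=apia)
after = datetime(2011, 12, 31, 0, 0, 0, tzinfo=apia)

# Same tzinfo object on both sides: wall-clock subtraction
print(after - before)

# Convert to UTC first so the skipped calendar day is accounted for
print(after.astimezone(timezone.utc) - before.astimezone(timezone.utc))
```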

The last section of this lesson is mostly warnings about issues that you will, hopefully, not face.  Few of them have anything to do with Python per se, but are general world time issues.

If you are fortunate enough to control the creation of timestamps by the various systems, it is always easiest to canonicalize everything to UTC before recording any data at all.  That option does not always exist, however.
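
A minimal sketch of that practice with the standard library alone: capture the instant in UTC, store it as an ISO 8601 string, and convert to a local zone only for display.  The fixed offset used for display is an illustrative assumption.

```python
from datetime import datetime, timedelta, timezone

# Record the event on the UTC clock, not the local one
recorded = datetime.now(timezone.utc)
stored = recorded.isoformat()  # text suitable for a log line or database column

# Later: parse the stored text and convert only when presenting to a user
restored = datetime.fromisoformat(stored)
display_zone = timezone(timedelta(hours=5, minutes=45), name="Nepal")
print(stored)
print(restored.astimezone(display_zone))
```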

Having all systems use UTC avoids a number of problems, but does not really address the leap second issue.  Special systems deal with this in various ways.  Google, for example, who obviously have a huge number of servers under their control, have implemented a special "leap smear" in their NTP.

Rather than add a leap second all at once, the Google servers (and others) keep 60 second minutes throughout, but make each second over the course of a period of time (e.g. day) a few microseconds longer than it would be otherwise.  This makes time monotonic and continuous, but different from the rest of the world during that day.
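
The size of that stretch is easy to estimate.  Assuming, purely for illustration, a linear smear of one leap second across a 24-hour window (actual deployments may choose a different window or shape):

```python
# One extra second spread evenly over a 24-hour smear window
window_seconds = 24 * 60 * 60  # 86,400 nominal seconds
leap = 1.0                     # seconds to absorb

stretch_per_second = leap / window_seconds
print(f"{stretch_per_second * 1e6:.2f} microseconds per second")

# Halfway through the window, the smeared clock has absorbed half the leap
elapsed = window_seconds / 2
print(f"{elapsed * stretch_per_second:.3f} seconds absorbed at the midpoint")
```

Under this assumption each second is lengthened by roughly 11.6 microseconds.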

# Exercise 

In this exercise, you will examine an example log file from events in a variety of locations.  Each event is marked with an ISO 8601 timestamp, followed by the time zone it occurred in.  Each event is mnemonically described by a different first name.

```
2020-01-01T23:00:01 Pacific/Tahiti Alice
2020-01-01T12:00:02 Europe/Ulyanovsk Bob
2020-01-01T15:00:03 Asia/Makassar Carlos
2020-01-01T06:00:04 Chile/Continental David
2020-01-01T22:00:05 Australia/Brisbane Eve
```

You need to create a list of these names, sorted by the chronological order in which they were logged, in the variable `names`.  The log file where all of these lines occur is `events.log`, and each line is similar to the first few shown.

# Setup

In [24]:
from datetime import datetime, timedelta
from dateutil import tz

# Wrong order, but you want a list like this...
names = ['Ivan', 'Bob', 'Rupert', 'Ted', 'Carlos', 'Olivia', 
         'Niaj', 'Faythe', 'David', 'Sybil', 'Judy', 'Heidi', 
         'Eve', 'Grace', 'Alice', 'Mallory', 'Peggy']

# Solution

In [25]:
with open('events.log') as log:
    events = []
    for record in log:
        dt, zone, name = record.split()
        dt = datetime.fromisoformat(dt)
        dt = dt.replace(tzinfo=tz.gettz(zone))
        events.append((dt.astimezone(tz.tzutc()), name))
        
names = [e[1] for e in sorted(events)]

# Test Cases

In [28]:
def test_order():
    from hashlib import sha1
    assert len(names) == 17, "Should be 17 names"
    assert len(set(names)) == 17, "Must all be distinct names"
    # Hash the list to avoid giving away answer
    hash = sha1('|'.join(names).encode()).hexdigest()
    assert hash == 'b9655d5a8efab9e2aecd41706d3491e296ed92d4',\
        "Got: "+hash
    
test_order()

# Dateutil

The third-party library `dateutil` is recommended in the Python standard library documentation.  It works with the same underlying `datetime.datetime` objects used by the standard library, but adds many extra conveniences.  We saw in the previous lesson that it provides access to the IANA timezone database, for more accurate time zones.

In [29]:
from datetime import date, time, datetime, timedelta
start = datetime.now()

## Parser

The standard library can read ISO 8601 format, and can read other formats using explicit format codes that are easy to get wrong.  With `dateutil` we can heuristically parse most formats used to represent dates and datetimes.
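
For comparison, here is a sketch of the standard library route.  The sample strings and format codes are illustrative; the last call shows how a mismatched format fails.

```python
from datetime import datetime

# ISO 8601 needs no format string
iso = datetime.fromisoformat("2020-01-31T12:30:45")

# Other layouts need explicit format codes
day_first = datetime.strptime("31/01/2020 12:30", "%d/%m/%Y %H:%M")
print(iso, "|", day_first)

# Swapping %d and %m is an easy mistake; here it is caught only
# because 31 is not a valid month
try:
    datetime.strptime("31/01/2020 12:30", "%m/%d/%Y %H:%M")
    format_error = None
except ValueError as err:
    format_error = err
    print("ValueError |", err)
```

With a string such as `01/02/2020` the swapped codes would not raise at all; they would silently produce a different date.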

In [30]:
from dateutil.parser import parse
from dateutil.tz import gettz
# Offset in seconds or a tzinfo object
tzinfos = {"MSK": +10800, "CST": gettz("America/Chicago")}

Various formats are guessed successfully.

In [31]:
parse('2020-01-31T12:30:45')

datetime.datetime(2020, 1, 31, 12, 30, 45)

In [32]:
parse('2020-01-31T12:30:45 MSK', tzinfos=tzinfos)

datetime.datetime(2020, 1, 31, 12, 30, 45, tzinfo=tzoffset('MSK', 10800))

In [33]:
parse('2020-01-31T12:30:45 CST', tzinfos=tzinfos)

datetime.datetime(2020, 1, 31, 12, 30, 45, tzinfo=tzfile('US/Central'))

In [34]:
parse('January 1, 2020 1:30:45 pm +0500')

datetime.datetime(2020, 1, 1, 13, 30, 45, tzinfo=tzoffset(None, 18000))

Different locations choose day-first or month-first date format.

In [35]:
parse('01/02/2020')

datetime.datetime(2020, 1, 2, 0, 0)

In [36]:
parse('01/02/2020', dayfirst=True)

datetime.datetime(2020, 2, 1, 0, 0)

Day of week is still mostly ignored.

In [37]:
parse('Wednesday Aug 12; 2020; 05:29:12 PM')

datetime.datetime(2020, 8, 12, 17, 29, 12)

In [38]:
parse('Monday Aug 12; 2020; 05:29:12 PM')

datetime.datetime(2020, 8, 12, 17, 29, 12)

In [39]:
try:
    parse('Humpday Aug 08; 2020; 05:29:12 PM')
except Exception as err:
    print(err.__class__.__name__, "|", err)

ParserError | Unknown string format: Humpday Aug 08; 2020; 05:29:12 PM


The call above fails with a `ParserError` because `parse()` does not recognize "Humpday" as a weekday name or as any other date token, so it cannot interpret the string `"Humpday Aug 08; 2020; 05:29:12 PM"`.

The day of the month can also be invalid.

In [41]:
try:
    parse('Thursday Sep 31; 2020; 05:29:12 PM')
except Exception as err:
    print(err.__class__.__name__, "|", err)

ParserError | day is out of range for month: Thursday Sep 31; 2020; 05:29:12 PM


This call also fails with a `ParserError`, because September has only 30 days.

## Timedeltas

In the standard library, an object called `timedelta` is useful in measuring or adding durations to `datetime` objects.  That object has some limitations, notably in that it only handles regular units, the largest of those being days.  

In commerce and ordinary life, we often think of durations in terms of months, weeks, and years, even though months are of varying lengths, as are years that may or may not be leap years.

For example, let us take two dates (here as `datetime`s, but simple `date`s would work for this purpose).  One we colloquially call "end of January," the other "end of February."  We might want to *move forward a month*.

In [43]:
from dateutil.relativedelta import *
jan_end = datetime(2020, 1, 31)
feb_end = datetime(2020, 2, 29)

In [44]:
jan_end + timedelta(days=30)

datetime.datetime(2020, 3, 1, 0, 0)

In [45]:
feb_end + timedelta(days=30)

datetime.datetime(2020, 3, 30, 0, 0)

Those are unsatisfying answers that are numerically "correct" but not what we probably mean.  `relativedelta` from dateutil is more flexible.

In [46]:
jan_end + relativedelta(months=1)

datetime.datetime(2020, 2, 29, 0, 0)

In [47]:
feb_end + relativedelta(months=1)

datetime.datetime(2020, 3, 29, 0, 0)
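
To see roughly what "adding a month" involves, here is a hand-rolled sketch using only the standard library.  The helper name `add_months` is hypothetical, and this only approximates the clamping behaviour shown above; it is not how `relativedelta` is implemented.

```python
import calendar
from datetime import datetime

def add_months(dt, months):
    """Shift dt by whole months, clamping the day to the target month's length."""
    total = dt.year * 12 + (dt.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]  # number of days in that month
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))

print(add_months(datetime(2020, 1, 31), 1))   # clamped to the end of February
print(add_months(datetime(2020, 2, 29), 1))   # day 29 exists in March, no clamping
print(add_months(datetime(2020, 1, 31), 13))  # a year and a month later
```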

We might combine different "human scale" increments.

In [48]:
# A year and a month later
jan_end + relativedelta(years=1, months=1)

datetime.datetime(2021, 2, 28, 0, 0)

In [49]:
# The Monday before a year and a month later
jan_end + relativedelta(years=1, months=1, weekday=MO(-1))

datetime.datetime(2021, 2, 22, 0, 0)

In [50]:
relativedelta(years=5, months=-5, weekday=TH)

relativedelta(years=+5, months=-5, weekday=TH)

## Recurring Events

The `dateutil` library allows you to create collections of recurring or related dates and datetimes.  By combining arguments to `rrule()`, `rruleset()`, and `rrulestr()`, we can generate iterators over these collections of related times.

In [51]:
from dateutil.rrule import *

Every two-and-a-half hours from now until the same time tomorrow.

In [54]:
alarms = rrule(MINUTELY, interval=150,
               dtstart=start,
               until=start + relativedelta(days=1))
for alarm in alarms:
    print(alarm)

2024-02-08 20:55:42
2024-02-08 23:25:42
2024-02-09 01:55:42
2024-02-09 04:25:42
2024-02-09 06:55:42
2024-02-09 09:25:42
2024-02-09 11:55:42
2024-02-09 14:25:42
2024-02-09 16:55:42
2024-02-09 19:25:42


United States presidential elections follow a slightly odd rule. They occur every four years on the first Tuesday that follows a Monday in November. When are the next 5 of them from right now?


In [55]:
elections = rrule(YEARLY, interval=4, count=5, 
                  bymonth=11,
                  byweekday=TU, 
                  bymonthday=(2, 3, 4, 5, 6, 7, 8),
                  dtstart=start)

for election in elections:
    print(election.date())

2024-11-05
2028-11-07
2032-11-02
2036-11-04
2040-11-06
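
The rule can be cross-checked without `dateutil`.  A sketch using only the standard library; `election_day` is a hypothetical helper, and the years are chosen to match the output above:

```python
from datetime import date, timedelta

def election_day(year):
    """Tuesday following the first Monday of November in the given year."""
    nov_first = date(year, 11, 1)
    # date.weekday(): Monday is 0, so this is the number of days until the first Monday
    days_to_monday = (0 - nov_first.weekday()) % 7
    first_monday = nov_first + timedelta(days=days_to_monday)
    return first_monday + timedelta(days=1)

for year in range(2024, 2041, 4):
    print(election_day(year))
```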


Sometimes we *almost* want a recurrence rule, but we need to include something extra or exclude something that would otherwise occur.  Here is this time of day, every day of the next week, excluding Tuesday and Thursday. 

In [57]:
events = rruleset()
events.rrule(rrule(DAILY, count=7, dtstart=start))
events.exrule(rrule(DAILY, byweekday=(TU, TH), dtstart=start))

for event in events:
    print(event)

2024-02-09 20:55:42
2024-02-10 20:55:42
2024-02-11 20:55:42
2024-02-12 20:55:42
2024-02-14 20:55:42


Perhaps we would like to add additional datetimes to a collection. For example, on Tuesday/Thursday we want an event, but at exactly noon, rather than based on current time.

In [58]:
noon = start.replace(hour=12, minute=0, second=0, microsecond=0)
events.rrule(rrule(DAILY, 
                   dtstart=noon, 
                   until=noon+relativedelta(days=7), 
                   byweekday=(TU, TH)))

for event in events:
    print(event)

2024-02-08 12:00:00
2024-02-09 20:55:42
2024-02-10 20:55:42
2024-02-11 20:55:42
2024-02-12 20:55:42
2024-02-13 12:00:00
2024-02-14 20:55:42
2024-02-15 12:00:00


We can also read fragments of Internet Calendaring and Scheduling Core Object Specification (iCalendar) string descriptions of recurring events.  Only the timestamp portions are recognized; the other metadata must be stripped separately.

In [60]:
meetings = rrulestr(f"""
  DTSTART:{date.today()}T14:00:00
  RRULE:FREQ=DAILY;INTERVAL=10;COUNT=5
  RRULE:FREQ=DAILY;INTERVAL=5;COUNT=3""")
for meeting in meetings:
    print(meeting)

2024-02-08 14:00:00
2024-02-13 14:00:00
2024-02-18 14:00:00
2024-02-28 14:00:00
2024-03-09 14:00:00
2024-03-19 14:00:00


# Exercise

In this exercise you will create a collection of dates using the capabilities to express durations and recurring events that are provided by `dateutil`.  In your practice, you may want to wrap rule objects in the `list()` constructor to debug your results.  For example:

```python
>>> rrule(DAILY, count=3)
<dateutil.rrule.rrule at 0x7fc85865eb80>

>>> list(rrule(DAILY, count=3))
[datetime.datetime(2020, 8, 21, 19, 3, 18),
 datetime.datetime(2020, 8, 22, 19, 3, 18),
 datetime.datetime(2020, 8, 23, 19, 3, 18)]
```

The latter is more useful to eyeball.  

You need to define a set of date objects (not datetime) that begin on January 31, 1980, and end on October 31, 2009, where each date is spaced 17 months apart, and each date is the last day of the month it occurs in.  There will be 22 dates in your result, in the variable `end_of_months`.

Note that we have not obscured the desired answer in the tests.  You should solve this exercise using functions/classes in `datetime` and `dateutil` rather than simply copy the right answer.

# Setup

In [61]:
from datetime import date, datetime
from dateutil.relativedelta import *
from dateutil.rrule import *

rset = rruleset()
rset.rrule(rrule(DAILY, count=3))

# Right kind of object, but wrong dates
end_of_months = set(dt.date() for dt in rset)

# Solution

In [62]:
rset = rruleset()
rset.rrule(
    rrule(MONTHLY, interval=17, 
          dtstart=datetime(1980, 2, 1),
          until=datetime(2009, 12, 1)))

end_of_months = {dt.date() - relativedelta(days=1) for dt in rset}

# Test Cases

In [63]:
def test_type():
    assert isinstance(end_of_months, set)
    
test_type()

In [64]:
def test_number():
    assert len(end_of_months) == 22
    
test_number()

In [65]:
def test_set():
    correct = {
        date(1980, 1, 31), date(1981, 6, 30), date(1982, 11, 30),
        date(1984, 4, 30), date(1985, 9, 30), date(1987, 2, 28),
        date(1988, 7, 31), date(1989, 12, 31),date(1991, 5, 31),
        date(1992, 10, 31),date(1994, 3, 31), date(1995, 8, 31),
        date(1997, 1, 31), date(1998, 6, 30), date(1999, 11, 30),
        date(2001, 4, 30), date(2002, 9, 30), date(2004, 2, 29),
        date(2005, 7, 31), date(2006, 12, 31),date(2008, 5, 31),
        date(2009, 10, 31)}
    assert end_of_months == correct
    
test_set()

In [66]:
def test_set_obscured():
    from hashlib import sha1
    em = sorted(dt.isoformat() for dt in end_of_months)
    hash = sha1('|'.join(em).encode()).hexdigest()
    assert hash == '1711158808fe535c3c80c43181704fb1e7c6a351'
    
test_set_obscured()

# Third-Party Tools

* Several libraries in the Python ecosystem provide alternate APIs for datetime handling.
* Tools that are similar but often complementary:
  * *Arrow*
  * *Maya*
  * *Delorean*
  * *Pendulum*
* Built on a custom class rather than `datetime`.
* Most provide parsing heuristics along the lines of the `dateutil` `parse()` function.
* Some of them produce informal strings for user messages, e.g.:


In [2]:
!pip install arrow

Collecting arrow
  Using cached arrow-1.3.0-py3-none-any.whl (66 kB)
Installing collected packages: arrow
Successfully installed arrow-1.3.0





In [3]:
import arrow
present = arrow.utcnow()
future = present.shift(minutes=150)
future.humanize(present)

'in 2 hours'

More on the functionalities and utilities of the library can be found [here](https://arrow.readthedocs.io/en/latest/).

-------------
Materials licensed under [CC BY-NC-ND 4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/) by the authors