<div class="pagebreak"></div>

# Dates and Times
Dates and times appear in a vast number of applications. At a minimum, log messages carry datetime values to show when an event occurred.

Representing and manipulating datetimes looks like a problem with straightforward solutions, but beneath the surface it has proven surprisingly difficult to get right.

One problem stems from how we represent dates in our daily lives:
- October 5, 2005
- Oct. 5, 2005
- 5 Oct 2005
- 5/10/2005
- 10/5/2005
- 20051005

So many formats! Which comes first, the day or the month? Do we start counting at 0 or at 1? What about time zones? Does Arizona still not follow daylight saving time? Can't confuse coyotes. When did the cows in Indiana start following daylight saving time? Then there are leap years, which occur every 4 years, except at the century mark, unless the year is divisible by 400. Then there are leap seconds. When are those applied? When was the last one applied? Most of western civilization follows the Gregorian calendar, but what about dates prior to its introduction in 1582? What do other civilizations use?
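
Two of these questions can be made concrete with the standard library. The sketch below parses the same string under both the day-first and month-first conventions, then checks the leap-year rule with `calendar.isleap`. The variable names are illustrative only.

```python
import calendar
import datetime as dt

ambiguous = "5/10/2005"

# The same text yields two different dates depending on the assumed convention.
day_first = dt.datetime.strptime(ambiguous, "%d/%m/%Y").date()
month_first = dt.datetime.strptime(ambiguous, "%m/%d/%Y").date()
print(day_first)    # 2005-10-05
print(month_first)  # 2005-05-10

# Leap years: every 4 years, except century years not divisible by 400.
for year in (1900, 2000, 2023, 2024):
    print(year, calendar.isleap(year))
```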

Most computer systems now track datetimes as [Unix Epoch](https://en.wikipedia.org/wiki/Unix_time) time, which is the number of seconds since January 1st, 1970 at 00:00:00 UTC. Some implementations use finer-grained units such as milliseconds or nanoseconds. Due to the word size of most computer systems when this approach was adopted, the value was tracked with a signed 32-bit integer, which limits the representable range to December 13, 1901 through January 19, 2038. Using 64-bit numbers in modern systems means these limits are no longer a practical concern.
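
The 32-bit limits can be reproduced with a little arithmetic. This is a minimal sketch that adds the smallest and largest signed 32-bit second counts to the epoch; the constant names `EPOCH`, `INT32_MIN`, and `INT32_MAX` are chosen here for illustration.

```python
import datetime as dt

EPOCH = dt.datetime(1970, 1, 1, tzinfo=dt.timezone.utc)

# Largest and smallest second counts a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)

latest = EPOCH + dt.timedelta(seconds=INT32_MAX)
earliest = EPOCH + dt.timedelta(seconds=INT32_MIN)
print(earliest.isoformat())  # 1901-12-13T20:45:52+00:00
print(latest.isoformat())    # 2038-01-19T03:14:07+00:00

# The current epoch time in seconds, taken from an aware datetime.
now = dt.datetime.now(dt.timezone.utc)
print(now.timestamp())
```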

Web page to see the current Epoch time and perform conversions: [https://currentmillis.com/](https://currentmillis.com/)

For string representations of dates and times, you should strive to follow [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601).  The format expresses dates and times from the largest unit to the smallest unit. Python's datetime module uses the following format for ISO (though several alternatives exist): 2005-10-05T22:40:32.123456  (YYYY-MM-DDTHH:mm:SS.ssssss) where
- YYYY: 4 digit year
- MM: 2 digit month from 01 to 12
- DD: 2 digit day from 01 to 31
- T: separator between the date and time portions.  Within the date, components are separated by `-`; within the time, by `:`
- HH: 2 digit hours from 00 to 23
- mm: 2 digit minutes from 00 to 59
- SS: 2 digit seconds from 00 to 59
- ssssss: 6 digits of fractional seconds (3 digits for milliseconds followed by 3 digits for microseconds).
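
As a quick check of the format described above, the following sketch round-trips a datetime through `isoformat()` and `fromisoformat()`. The `timespec` and `sep` arguments are optional parameters of `isoformat()`; the sample value mirrors the one in the text.

```python
import datetime as dt

moment = dt.datetime(2005, 10, 5, 22, 40, 32, 123456)
text = moment.isoformat()
print(text)  # 2005-10-05T22:40:32.123456

# Parsing the string back yields an equal datetime.
round_trip = dt.datetime.fromisoformat(text)
print(round_trip == moment)  # True

# timespec trims the fractional part; sep replaces the "T" separator.
print(moment.isoformat(timespec="milliseconds"))      # 2005-10-05T22:40:32.123
print(moment.isoformat(sep=" ", timespec="seconds"))  # 2005-10-05 22:40:32
```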

## The `datetime` Module
Python's `datetime` module provides functionality to support dates and times.  The module defines four primary classes:
- `date`: represents just dates
- `time`: represents time, but no dates.
- `datetime`: represents the date and time together
- `timedelta`: used for the difference/interval between dates and/or times

In [None]:
import datetime as dt

In [None]:
# run a command to see what's available within the datetime/dt module


In the output, you should have seen at least the four main classes listed. You can also read the built-in documentation for these classes from within Python, for example with `help(dt.date)`.

### `date`

Several different methods exist to create a date object:

In [None]:
from datetime import date

# From the current date
today = date.today()
print(today)

# From a specific year, month and date
someday = date(2005,10,5)
print(someday)

# From Unix epoch time value
anotherday = date.fromtimestamp(1656549504)
print(anotherday)

# From an isoformat string
aday = date.fromisoformat("2021-12-02")
print(aday)

# From the proleptic Gregorian ordinal, where January 1st of year 1 has ordinal 1
oday = date.fromordinal(738340)
print(oday)

# Show the minimum and maximum possible values for date
print("Minimum date:",date.min)
print("Maximum date:",date.max)

The `date` object supports the comparison operators: ==, !=, <, <=, >=, >

You can also subtract two date objects to get a `timedelta` object representing the difference between the dates.

In [None]:
x = today-oday
print(type(x))
print(x)
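
The reverse operation also works: adding a `timedelta` to a `date` yields a new `date`. A minimal sketch, with sample values chosen for illustration:

```python
import datetime as dt

start = dt.date(2005, 10, 5)
later = start + dt.timedelta(days=90)
print(later)                 # 2006-01-03
print((later - start).days)  # 90
print(start < later)         # True
```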

### `time`

The `time` object represents time.  As with the `date` object, we have several ways to construct a `time` object.

In [None]:
from datetime import time

# from the current datetime, extract the time portion 
#(can do this with any datetime object)
t = dt.datetime.now().time()
print(type(t))
print("Current time:",t)

# from a specific time
spec_time = time(23,4,30)
print(spec_time)

# from a specific time, using microseconds
spec_time = time(11,5,24,123456)
print(spec_time)

# display the minimum and maximum values
print("Minimum time:", time.min)
print("Maximum time:", time.max)

# Access the time attributes
print("hour:", t.hour)
print("minute:", t.minute)
print("second:", t.second)
print("microsecond:", t.microsecond)

The `time` object supports the comparison operators: ==, !=, <, <=, >=, >

The `time` object does not support the subtraction operator.
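
One common workaround, sketched below, is to combine each `time` with the same date and subtract the resulting `datetime` objects. The anchor date is arbitrary and is an assumption of this example: both times are taken to fall on the same calendar day.

```python
import datetime as dt

start = dt.time(11, 5, 24)
end = dt.time(23, 4, 30)

# Direct subtraction is not defined for time objects.
try:
    end - start
except TypeError as exc:
    print("TypeError:", exc)

# Assumption: both times fall on the same (arbitrary) calendar day.
anchor = dt.date(2000, 1, 1)
elapsed = dt.datetime.combine(anchor, end) - dt.datetime.combine(anchor, start)
print(elapsed)  # 11:59:06
```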

The `time` object does support timezones, which we will cover as part of `datetime`.

### `datetime`
The `datetime` class inherits from the `date` class, adding the capability to also represent time.

In [None]:
# main constructor: year, month, day, hour, minute, second, microsecond
a = dt.datetime(2014, 11, 28, 17, 10, 30, 342380)
print("year =", a.year)
print("month =", a.month)
print("hour =", a.hour)
print("minute =", a.minute)
print("timestamp =", a.timestamp())
print()

# create datetime from the current date and time
now = dt.datetime.now()
# note: utcnow() returns a naive datetime and is deprecated as of Python 3.12
utcnow = dt.datetime.utcnow()
print(now.isoformat())
print(utcnow.isoformat())
print()

# Construct a datetime from an ISO formatted string
iso = dt.datetime.fromisoformat("2022-06-30T20:19:56+05:00")
print(iso.isoformat())

Notice in the above code that datetime objects do not have an associated timezone unless you explicitly specify one.  While `datetime.now()` returns the current time in the local timezone, the datetime object is not aware of that timezone by default.  Python calls these two types of datetime objects aware and naive.  Aware datetime objects include timezone information.  Naive objects do not.  An aware object unambiguously defines a specific moment in time and can be compared with, and computed relative to, other aware datetime objects.  A naive datetime object has neither of those two properties; its exact meaning is left to the program.  The lesson: always use timezones when you work with datetime objects.
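
The distinction can be checked in code. In this sketch, `is_aware` is a hypothetical helper (not part of the `datetime` module) that applies the test described in the Python documentation: an object is aware when its `tzinfo` is set and returns a UTC offset.

```python
import datetime as dt

naive = dt.datetime(2022, 6, 30, 20, 19, 56)
aware = dt.datetime(2022, 6, 30, 20, 19, 56, tzinfo=dt.timezone.utc)

def is_aware(d):
    """Hypothetical helper: True when d carries a usable UTC offset."""
    return d.tzinfo is not None and d.tzinfo.utcoffset(d) is not None

print(is_aware(naive))  # False
print(is_aware(aware))  # True

# Ordering comparisons between the two kinds raise TypeError.
try:
    naive < aware
except TypeError as exc:
    print("TypeError:", exc)
```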

The following code returns the local timezone as a `datetime.timezone` object.

In [None]:
LOCAL_TIMEZONE = dt.datetime.now().astimezone().tzinfo
print("Time Zone:",LOCAL_TIMEZONE)
print("Type:",type(LOCAL_TIMEZONE))

The recommended approach to creating a datetime object for the current time in UTC:

In [None]:
utc = dt.datetime.now(dt.timezone.utc)
print(utc)

The next block defines a function that converts a UTC datetime to a specific timezone.

In [None]:
def utc_to_timezone(utc_dt,time_zone=None):
    """Converts a datetime assumed to be in the UTC timezone to a specific time zone.
       Any tzinfo already attached to utc_dt is overwritten with UTC.
       If the timezone is not defined, then the local timezone is assumed."""
    return utc_dt.replace(tzinfo=dt.timezone.utc).astimezone(tz=time_zone)

In [None]:
# get the current UTC time and convert it to the local timezone.
print(utc_to_timezone(dt.datetime.utcnow()))

In the following code, we'll use the `zoneinfo` module, which was added to the standard library in Python 3.9.

With this module, we can instantiate timezone objects based upon a [time zone name](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones). The names available to Python can be found through `zoneinfo.available_timezones()`.

In [None]:
import zoneinfo

# produces a set of timezone names
zones = zoneinfo.available_timezones()

# convert the current datetime to Hong Kong's timezone
hk_tz = zoneinfo.ZoneInfo('Asia/Hong_Kong')
print(utc_to_timezone(dt.datetime.utcnow(),hk_tz))
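
A `ZoneInfo` object can also be attached directly when constructing a datetime, and `astimezone()` then converts between zones. The sketch below assumes the IANA time zone database is available on the system (on platforms without one, the third-party `tzdata` package supplies it); the sample dates are illustrative.

```python
import datetime as dt
from zoneinfo import ZoneInfo

new_york = ZoneInfo("America/New_York")
hong_kong = ZoneInfo("Asia/Hong_Kong")

# Noon in New York on a summer day (daylight saving time in effect).
meeting = dt.datetime(2022, 7, 1, 12, 0, tzinfo=new_york)
print(meeting.isoformat())                        # 2022-07-01T12:00:00-04:00
print(meeting.astimezone(hong_kong).isoformat())  # 2022-07-02T00:00:00+08:00

# The same wall-clock time in winter has a different UTC offset.
winter = dt.datetime(2022, 1, 1, 12, 0, tzinfo=new_york)
print(winter.utcoffset())  # -1 day, 19:00:00
```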

## Date and Time Arithmetic
TODO: need more

In [None]:
t4 = dt.datetime(year = 2018, month = 7, day = 12, hour = 7, minute = 9, second = 33)
t5 = dt.datetime(year = 2019, month = 6, day = 10, hour = 5, minute = 55, second = 13)
t6 = t4 - t5
print("t6 =", t6)

t4 = dt.datetime(year = 2018, month = 12, day = 31, hour = 23, minute = 0, second = 33)
t5 = dt.datetime(year = 2019, month = 1, day = 31, hour = 1, minute = 0, second = 13)
t6 = t4 - t5
print("t6 =", t6)
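
Subtracting an earlier datetime from a later one gives a positive `timedelta`; reversing the operands gives a negative one, which Python normalizes so that only the `days` field is negative. The following sketch reuses the second pair of values from above and shows a few operations that `timedelta` supports.

```python
import datetime as dt

start = dt.datetime(2018, 12, 31, 23, 0, 33)
end = dt.datetime(2019, 1, 31, 1, 0, 13)

delta = end - start
print(delta)                      # 30 days, 1:59:40
print(delta.days, delta.seconds)  # 30 7180
print(delta.total_seconds())      # 2599180.0

# A negative timedelta is normalized so that only `days` is negative.
print(start - end)                # -31 days, 22:00:20

# timedelta objects can be scaled and added back to a datetime.
print(start + delta / 2)                       # 2019-01-16 00:00:23
print(start + dt.timedelta(weeks=1, hours=3))  # 2019-01-08 02:00:33
```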

## Formatting and Parsing
As with many other programming languages, Python allows you to define custom formats to display dates and times as well as to parse strings in those custom formats.

Use the `strftime` method to convert a datetime into a formatted string.  The format string contains a series of format codes that correspond to the different parts of a date, as well as different ways to display each part.

In [None]:
utcnow = dt.datetime.utcnow().replace(tzinfo=dt.timezone.utc)
print(utcnow.strftime("%m/%d/%Y, %H:%M:%S %Z"))

<table class="docutils align-default">
<colgroup>
<col style="width: 15%">
<col style="width: 43%">
<col style="width: 32%">
<col style="width: 9%">
</colgroup>
<thead>
<tr class="row-odd"><th class="head"><p>Directive</p></th>
<th class="head"><p>Meaning</p></th>
<th class="head"><p>Example</p></th>
<th class="head"><p>Notes</p></th>
</tr>
</thead>
<tbody>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">%y</span></code></p></td>
<td><p>Year without century as a
zero-padded decimal number.</p></td>
<td><p>00, 01, …, 99</p></td>
<td><p>(9)</p></td>
</tr>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">%Y</span></code></p></td>
<td><p>Year with century as a decimal
number.</p></td>
<td><p>0001, 0002, …, 2013,
2014, …, 9998, 9999</p></td>
<td><p>(2)</p></td>
</tr>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">%b</span></code></p></td>
<td><p>Month as locale’s abbreviated
name.</p></td>
<td><div class="line-block">
<div class="line">Jan, Feb, …, Dec
(en_US);</div>
<div class="line">Jan, Feb, …, Dez
(de_DE)</div>
</div>
</td>
<td><p>(1)</p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">%B</span></code></p></td>
<td><p>Month as locale’s full name.</p></td>
<td><div class="line-block">
<div class="line">January, February,
…, December (en_US);</div>
<div class="line">Januar, Februar, …,
Dezember (de_DE)</div>
</div>
</td>
<td><p>(1)</p></td>
</tr>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">%m</span></code></p></td>
<td><p>Month as a zero-padded
decimal number.</p></td>
<td><p>01, 02, …, 12</p></td>
<td><p>(9)</p></td>
</tr>

<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">%d</span></code></p></td>
<td><p>Day of the month as a
zero-padded decimal number.</p></td>
<td><p>01, 02, …, 31</p></td>
<td><p>(9)</p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">%H</span></code></p></td>
<td><p>Hour (24-hour clock) as a
zero-padded decimal number.</p></td>
<td><p>00, 01, …, 23</p></td>
<td><p>(9)</p></td>
</tr>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">%I</span></code></p></td>
<td><p>Hour (12-hour clock) as a
zero-padded decimal number.</p></td>
<td><p>01, 02, …, 12</p></td>
<td><p>(9)</p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">%p</span></code></p></td>
<td><p>Locale’s equivalent of either
AM or PM.</p></td>
<td><div class="line-block">
<div class="line">AM, PM (en_US);</div>
<div class="line">am, pm (de_DE)</div>
</div>
</td>
<td><p>(1),
(3)</p></td>
</tr>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">%M</span></code></p></td>
<td><p>Minute as a zero-padded
decimal number.</p></td>
<td><p>00, 01, …, 59</p></td>
<td><p>(9)</p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">%S</span></code></p></td>
<td><p>Second as a zero-padded
decimal number.</p></td>
<td><p>00, 01, …, 59</p></td>
<td><p>(4),
(9)</p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">%z</span></code></p></td>
<td><p>UTC offset in the form
<code class="docutils literal notranslate"><span class="pre">±HHMM[SS[.ffffff]]</span></code> (empty
string if the object is
naive).</p></td>
<td><p>(empty), +0000,
-0400, +1030,
+063415,
-030712.345216</p></td>
<td><p>(6)</p></td>
</tr>
<tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">%Z</span></code></p></td>
<td><p>Time zone name (empty string
if the object is naive).</p></td>
<td><p>(empty), UTC, GMT</p></td>
<td><p>(6)</p></td>
</tr>
<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">%%</span></code></p></td>
<td><p>A literal <code class="docutils literal notranslate"><span class="pre">'%'</span></code> character.</p></td>
<td><p>%</p></td>
<td></td>
</tr>
</tbody>
</table>
Source: <a href='https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes'>https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes</a>
<br> The above table only shows the more commonly used formats.  These codes were adopted from the 1989 C Standard.
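
A few of the directives from the table, combined in one short sketch. The sample datetime is illustrative; directives such as `%B` and `%p` depend on the active locale, so no output is shown for that line.

```python
import datetime as dt

moment = dt.datetime(2005, 10, 5, 22, 40, 32, tzinfo=dt.timezone.utc)

print(moment.strftime("%Y-%m-%d %H:%M:%S %z"))  # 2005-10-05 22:40:32 +0000
print(moment.strftime("%d/%m/%y %I:%M"))        # 05/10/05 10:40
print(moment.strftime("100%% done at %H:%M"))   # 100% done at 22:40

# Locale-dependent directives; the output varies with the active locale.
print(moment.strftime("%B %d, %Y %I:%M %p"))

# The same directives work inside f-strings.
print(f"{moment:%Y%m%d}")  # 20051005
```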

Similarly, we can use `strptime` to parse datetime objects from a string.

In [None]:
date_string = '07/01/2022, 12:20:04 UTC'
mydate = dt.datetime.strptime(date_string,"%m/%d/%Y, %H:%M:%S %Z")
print(mydate, mydate.tzinfo)    ## Notice that mydate is not an aware datetime
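
To get an aware datetime out of `strptime`, one option is to parse a numeric offset with `%z` instead of a zone name with `%Z`. Another is to attach the timezone after parsing, when the zone is known from context. Both are sketched below with illustrative input strings.

```python
import datetime as dt

# %z parses a numeric UTC offset and yields an aware datetime.
with_offset = dt.datetime.strptime("07/01/2022, 12:20:04 +0000",
                                   "%m/%d/%Y, %H:%M:%S %z")
print(with_offset.isoformat())  # 2022-07-01T12:20:04+00:00

# When the zone is known out of band, attach it after parsing.
parsed = dt.datetime.strptime("07/01/2022, 12:20:04", "%m/%d/%Y, %H:%M:%S")
attached = parsed.replace(tzinfo=dt.timezone.utc)
print(attached == with_offset)  # True
```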

## Pendulum
Several other developers and groups have written alternative libraries for datetime functionality within Python.

[Pendulum](https://pendulum.eustace.io/) is a drop-in replacement for the datetime library. Besides having a cleaner API, the Pendulum library creates timezone-aware objects by default.

In [None]:
import pendulum
now = pendulum.now()
print(now)
print(now.timezone)
print(pendulum.now('UTC'))

Using the method resolution order function `mro()`, we can see the inheritance hierarchy and notice that pendulum's classes extend Python's `datetime` objects.

In [None]:
print(now.__class__.mro())
print()
print(now.timezone.__class__.mro())

In [None]:
# Converting to iso8601 format
print(now.to_iso8601_string())

# Date arithmetic
print(now.add(days=2))

# Defining a duration
dur = pendulum.duration(days=10, hours=5)
print(dur)
print("dur.weeks:",dur.weeks)
print("dur.days:",dur.days)
print("dur.hours:",dur.hours)
print(dur.in_hours())
print(dur.in_words(locale="zh"))

# Using the duration in datetime arithmetic
print(now - dur)

# A period is the difference between 2 datetime instances.
# it maintains a reference to those datetimes
period = now - now.subtract(days=3)
print(period)
print(period.in_seconds())
print(period.in_days())
print(now - period - period)

Pendulum also supports an alternative token format for converting datetimes to and from strings.  Many other languages and libraries use this style of format.

[Further details and available tokens](https://github.com/sdispater/pendulum/blob/master/docs/docs/string_formatting.md)

In [None]:
now = pendulum.now()
print(now)
now.format('YYYY-MM-DD HH:mm:ssZ')

In [None]:
parsed = pendulum.from_format('2022-07-01 08:44:39-04:00','YYYY-MM-DD HH:mm:ssZ')
print(parsed)

[Arrow](https://arrow.readthedocs.io/en/latest/) is another popular alternative to Python's datetime module.

## Best Practices
Generally speaking, you should process datetimes with the UTC timezone and then display datetimes to users in their local timezone (and show that timezone). For web applications, the timezone conversion should occur within the client's browser.

Storing datetimes can be a bit more complicated. For datetimes that have already occurred, UTC with the timezone offset is the most appropriate.  For datetimes in the future, the situation can be ambiguous: the problem arises when timezone rules change after a datetime has been saved (see [Saving datetimes for the future](https://web.archive.org/web/20220623083001/http://www.creativedeletion.com/2015/03/19/persisting_future_datetimes.html)). You'll want to store the timezone name in addition to the datetime in UTC/offset.  By having the timezone name, you can check whether the offset has subsequently changed. The event location can stand in for the timezone name, provided you can map that location back to a timezone.
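
One way to sketch this idea with the standard library is shown below. `StoredEvent`, `save_event`, and `offset_changed` are hypothetical names introduced for illustration, not part of any library. The record keeps the datetime with the fixed offset known at save time plus the zone name, and the check recomputes the offset under the rules currently installed.

```python
import datetime as dt
from dataclasses import dataclass
from zoneinfo import ZoneInfo

@dataclass(frozen=True)
class StoredEvent:
    """Hypothetical record for a future event."""
    when: dt.datetime  # aware, with the fixed UTC offset known at save time
    zone_name: str     # IANA name, e.g. "America/New_York"

def save_event(local_wall_time, zone_name):
    """Freeze the offset that the zone's current rules give for this wall time."""
    offset = local_wall_time.replace(tzinfo=ZoneInfo(zone_name)).utcoffset()
    return StoredEvent(local_wall_time.replace(tzinfo=dt.timezone(offset)), zone_name)

def offset_changed(event):
    """True if today's rules give a different offset for the same wall time."""
    wall_time = event.when.replace(tzinfo=None)
    current = wall_time.replace(tzinfo=ZoneInfo(event.zone_name)).utcoffset()
    return current != event.when.utcoffset()

event = save_event(dt.datetime(2030, 7, 1, 9, 0), "America/New_York")
print(event.when.isoformat(), event.zone_name)
print(offset_changed(event))  # False until the installed timezone rules change
```

If `offset_changed` returns `True`, the application has to decide whether the wall-clock time or the original instant should win, which depends on what the event means to its users.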

For data science applications, most datetimes will have occurred in the past, so storing those datetimes with the relevant timezone is appropriate.  That relevant timezone is where the recorded event occurred (or where the given user was). However, you will also want to create additional features for visualization or machine learning purposes, such as:
- day of the week
- weekend flag
- holiday flag
- local date and time
- local hour
- time before an event
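
Several of these features can be derived with the standard library alone. In the sketch below, `datetime_features` is a hypothetical helper and the dictionary keys are illustrative. A holiday flag is omitted because it requires a holiday calendar for the relevant region.

```python
import datetime as dt
from zoneinfo import ZoneInfo

def datetime_features(utc_moment, zone_name):
    """Hypothetical helper: derive local-time features from an aware UTC datetime."""
    local = utc_moment.astimezone(ZoneInfo(zone_name))
    return {
        "local_datetime": local.isoformat(),
        "local_hour": local.hour,
        "day_of_week": local.weekday(),      # Monday is 0, Sunday is 6
        "is_weekend": local.weekday() >= 5,
    }

# Saturday morning in UTC is still Friday night in New York.
recorded = dt.datetime(2022, 7, 2, 3, 30, tzinfo=dt.timezone.utc)
features = datetime_features(recorded, "America/New_York")
print(features)
```

The example record shows why the local timezone matters: computed in UTC, the same event would be flagged as a weekend event.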

And last, but certainly not least, use a library to process dates and times.

## Studying the Python Source Code
- immutable, uses `__new__` not `__init__`
- validates parameters 
  - must be integers which the call to `_index()` enforces
  - in the appropriate ranges. days in month is checked
- offset the days_in_months array to make indexing faster
- hash computation is lazily initialized
- docstrings
- property methods
- chain constructors w/ the class factory methods
- sanity check via `assert` in `_days_in_month`. The month check should have already occurred in the prior code.
- type checking in `__sub__`

Source: https://github.com/python/cpython/blob/main/Lib/datetime.py

Note: the following code is not complete.

In [None]:
import time as _time
from operator import index as _index

MINYEAR = 1
MAXYEAR = 9999
class date:
    """Concrete date type.
    Constructors:
    __new__()
    fromtimestamp()
    today()
    fromordinal()
    Operators:
    __repr__, __str__
    __eq__, __le__, __lt__, __ge__, __gt__, __hash__
    __add__, __radd__, __sub__ (add/radd only with timedelta arg)
    Methods:
    timetuple(), toordinal(), weekday(),  isoweekday(), isocalendar(), isoformat()
    ctime(), strftime()
    Properties (readonly):
    year, month, day
    """
    __slots__ = '_year', '_month', '_day', '_hashcode'

    def __new__(cls, year, month=None, day=None):
        """Constructor.
        Arguments:
        year, month, day (required, base 1)
        """
        ### removed "pickle code" ...
        year, month, day = _check_date_fields(year, month, day)
        self = object.__new__(cls)
        self._year = year
        self._month = month
        self._day = day
        self._hashcode = -1
        return self

    # Additional constructors
    @classmethod
    def fromtimestamp(cls, t):
        "Construct a date from a POSIX timestamp (like time.time())."
        y, m, d, hh, mm, ss, weekday, jday, dst = _time.localtime(t)
        return cls(y, m, d)

    @classmethod
    def today(cls):
        "Construct a date from time.time()."
        t = _time.time()
        return cls.fromtimestamp(t)
    
    
    def isoformat(self):
        """Return the date formatted according to ISO.  This is 'YYYY-MM-DD'."""
        return "%04d-%02d-%02d" % (self._year, self._month, self._day)
        
    # Read-only field accessors
    @property
    def year(self):
        """year (1-9999)"""
        return self._year   
        
    def __sub__(self, other):
        """Subtract two dates, or a date and a timedelta."""
        if isinstance(other, timedelta):
            return self + timedelta(-other.days)
        if isinstance(other, date):
            days1 = self.toordinal()
            days2 = other.toordinal()
            return timedelta(days1 - days2)
        return NotImplemented

    def __hash__(self):
        "Hash."
        if self._hashcode == -1:
            self._hashcode = hash(self._getstate())
        return self._hashcode
        
    def _getstate(self):
        yhi, ylo = divmod(self._year, 256)
        return bytes([yhi, ylo, self._month, self._day])
        
def _check_date_fields(year, month, day):
    year = _index(year)
    month = _index(month)
    day = _index(day)
    if not MINYEAR <= year <= MAXYEAR:
        raise ValueError('year must be in %d..%d' % (MINYEAR, MAXYEAR), year)
    if not 1 <= month <= 12:
        raise ValueError('month must be in 1..12', month)
    dim = _days_in_month(year, month)
    if not 1 <= day <= dim:
        raise ValueError('day must be in 1..%d' % dim, day)
    return year, month, day
        
def _is_leap(year):
    "year -> 1 if leap year, else 0."
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# -1 is a placeholder for indexing purposes.
_DAYS_IN_MONTH = [-1, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def _days_in_month(year, month):
    "year, month -> number of days in that month in that year."
    assert 1 <= month <= 12, month
    if month == 2 and _is_leap(year):
        return 29
    return _DAYS_IN_MONTH[month]

## Exercises
TODO