# Undate: uncertain, incomplete, and partially-known dates

A Python library for working with dates in humanistic and cultural data.


*(TODO: more context / background here)*

## Basic functionality

Like Python's builtin `datetime.date` object, an `Undate` can be initialized by specifying numeric values for year, month, and day.

We can print them using the default serialization (ISO 8601, i.e. YYYY-MM-DD), and we can compare them.

In [None]:
import datetime

from IPython.display import display, Markdown
from undate.undate import Undate

# these are equivalent
dt_november7 = datetime.date(2000, 11, 7)
november7 = Undate(2000, 11, 7)

We can print them out. By default, both of these dates are displayed in ISO 8601 format (YYYY-MM-DD).

In [29]:
print(dt_november7)

2000-11-07


In [30]:
print(november7)

2000-11-07


We can also compare them. Is this the same date?

In [33]:
bool(november7 == dt_november7) 

True

Unlike Python's `datetime.date`, an `Undate` can be initialized without providing all values for year, month, and day.

We can create `Undate` instances for the month of November in 2000, for the year 2000, or even for November 7th in some unknown year.

`Undate` also has an optional `label` field, since it's sometimes useful to attach a label to a date.

In [47]:
# November 2000
november = Undate(2000, 11, label="November 2000")
# Year 2000
year2k = Undate(2000, label="Y2K")
# November 7 in an unknown year
november7_some_year = Undate(month=11, day=7, label="Some November 7")
# let's reinitialize our first date with a label too
november7 = Undate(2000, 11, 7, label="November 7, 2000")

# sometimes names are important
easter1916 = Undate(1916, 4, 23, label="Easter 1916")

Each of these `Undate` objects can be displayed in a standard format. Each one also reports the precision of the date (year, month, or day) and its duration.

In [46]:
for example_date in [november, year2k, november7_some_year, november7, easter1916]:
    print(f"\n{example_date.label}: {example_date}")
    print(f"Date precision: {example_date.precision}")
    print(f"Duration in days: {example_date.duration().days}")


November 2000: 2000-11
Date precision: MONTH
Duration in days: 30

Y2K: 2000
Date precision: YEAR
Duration in days: 366

Some November 7: --11-07
Date precision: DAY
Duration in days: 1

November 7, 2000: 2000-11-07
Date precision: DAY
Duration in days: 1

Easter 1916: 1916-04-23
Date precision: DAY
Duration in days: 1
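
The durations above match what you get by counting days inclusively between the earliest and latest day a date could cover. The following is a minimal sketch using only the standard library; `duration_days` is a hypothetical helper written for illustration, not part of `undate`, and it only handles dates whose year is known.

```python
import calendar
import datetime

ONE_DAY = datetime.timedelta(days=1)


def duration_days(year, month=None, day=None):
    """Inclusive day count for a known year, year and month, or full date."""
    if month is None:
        # year precision: January 1 through December 31
        earliest = datetime.date(year, 1, 1)
        latest = datetime.date(year, 12, 31)
    elif day is None:
        # month precision: first through last day of that month
        last_day = calendar.monthrange(year, month)[1]
        earliest = datetime.date(year, month, 1)
        latest = datetime.date(year, month, last_day)
    else:
        # day precision: a single day
        earliest = latest = datetime.date(year, month, day)
    return (latest - earliest + ONE_DAY).days


print(duration_days(2000, 11))     # 30
print(duration_days(2000))         # 366 (2000 is a leap year)
print(duration_days(2000, 11, 7))  # 1
```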


We can also do some simple calculations, such as checking whether one date falls within another.

In [52]:
november in year2k

True

In [53]:
november7 in year2k

True

In [56]:
november7 in november

True

In [55]:
easter1916 in year2k

False

In [54]:
november7_some_year in year2k

False
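
One way to read these results is as interval containment: a date falls within another when its earliest and latest possible days both lie inside the other date's range. The sketch below illustrates that idea with plain `datetime.date` pairs. The `contains` function and the tuple representation are assumptions made for this example, not `undate`'s actual implementation.

```python
import datetime


def contains(outer, inner):
    """True when the inner (earliest, latest) range lies inside the outer one."""
    outer_start, outer_end = outer
    inner_start, inner_end = inner
    return outer_start <= inner_start and inner_end <= outer_end


# each date is modeled as an (earliest day, latest day) pair
year2k = (datetime.date(2000, 1, 1), datetime.date(2000, 12, 31))
november = (datetime.date(2000, 11, 1), datetime.date(2000, 11, 30))
november7 = (datetime.date(2000, 11, 7), datetime.date(2000, 11, 7))
easter1916 = (datetime.date(1916, 4, 23), datetime.date(1916, 4, 23))

print(contains(year2k, november))     # True
print(contains(year2k, november7))    # True
print(contains(november, november7))  # True
print(contains(year2k, easter1916))   # False
```

The unknown-year case (`november7_some_year`) is left out of this sketch, because representing it requires a decision about how an unknown year is bounded.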

## Partially unknown values

We can also initialize an `Undate` object with string values when a date is only partially known. We use the character `X` to indicate an unknown digit, following the notation used in the [Extended Date Time Format (EDTF)](https://www.loc.gov/standards/datetime/).

In [65]:
someyear_1900s = Undate("19XX", label="1900s")
late2022 = Undate(2022, "1X", label="late 2022")

# FIXME: duration isn't right for year! and assumes max for month
# can we get UnInt duration for both of these?

for example_date in [someyear_1900s, late2022]:
    print(f"\n{example_date.label}: {example_date}")
    print(f"Date precision: {example_date.precision}")
    # print(f"Duration in days: {example_date.duration().days}")   # inaccurate! fix or omit?


1900s: 19XX
Date precision: YEAR

late 2022: 2022-1X
Date precision: MONTH


When an `Undate` instance is initialized, internally the class calculates earliest and latest possible values for that date in the Gregorian calendar.

This means that some comparisons are possible even without precise information.

For instance, is a year sometime during the 1900s before a month in late 2022?

In [66]:
someyear_1900s < late2022

True
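
To make the earliest/latest idea concrete, here is a rough sketch of how outer bounds could be derived from a string with unknown digits: substitute `0` for each `X` to get a lower bound and `9` to get an upper bound, then clamp to the valid range. The `bounds` helper is hypothetical and is only meant to show why the comparison above can be answered; `undate` may compute its bounds differently.

```python
import datetime


def bounds(value, lowest, highest):
    """Outer bounds for a numeric string with unknown 'X' digits, clamped to a valid range."""
    low = max(int(value.replace("X", "0")), lowest)
    high = min(int(value.replace("X", "9")), highest)
    return low, high


year_min, year_max = bounds("19XX", 1, 9999)  # (1900, 1999)
month_min, month_max = bounds("1X", 1, 12)    # (10, 12)

# latest possible day for "19XX" and earliest possible day for "2022-1X"
latest_1900s = datetime.date(year_max, 12, 31)
earliest_late2022 = datetime.date(2022, month_min, 1)

# every possible day in the 1900s precedes every possible day in late 2022
print(latest_1900s < earliest_late2022)  # True
```

These bounds are deliberately coarse: they give the outermost possible values, not the exact set of values the pattern allows.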

But two uncertain dates are not considered equal, even when they are initialized with the same values:

In [67]:
late2022 == Undate(2022, "1X")

False