# Working with Datetimes
- Date/time data is also called **temporal** data. Temporal means "of or relating to time".

In [1]:
import polars as pl

## Import a Dataset with Datetimes
- ISO 8601 is a standard for representing dates and times as strings. Polars supports most ISO 8601 formats.
- Polars will import datetime columns as strings by default.
- The `clock_in` column uses `YYYY-MM-DD HH:MM:SS` format.
- The `clock_out` column uses `YYYY-MM-DDTHH:MM:SS` format (`T` is a separator for date and time).

In [2]:
pl.read_csv("clock_in_times.csv")

employee_id,clock_in,clock_out
str,str,str
"""E001""","""2025-07-01 08:55:00""","""2025-07-01T17:05:00"""
"""E002""","""2025-07-01 09:10:00""","""2025-07-01T17:45:00"""
"""E003""","""2025-07-01 08:50:00""","""2025-07-01T16:30:00"""
"""E004""","""2025-07-01 10:00:00""","""2025-07-01T18:00:00"""
"""E005""","""2025-07-01 07:45:00""","""2025-07-01T15:15:00"""


- Pass `True` to the `try_parse_dates` parameter of the `read_csv` function to attempt to parse datetime values.
- Polars will fall back to strings if it cannot convert a column's values to datetimes.
- The `[μs]` (mu) symbol means "microsecond precision". There are 1,000,000 microseconds in a second.

In [3]:
pl.read_csv("clock_in_times.csv", try_parse_dates=True)

employee_id,clock_in,clock_out
str,datetime[μs],datetime[μs]
"""E001""",2025-07-01 08:55:00,2025-07-01 17:05:00
"""E002""",2025-07-01 09:10:00,2025-07-01 17:45:00
"""E003""",2025-07-01 08:50:00,2025-07-01 16:30:00
"""E004""",2025-07-01 10:00:00,2025-07-01 18:00:00
"""E005""",2025-07-01 07:45:00,2025-07-01 15:15:00


- As an alternative, use the `schema_overrides` parameter to cast specific columns to different data types.
- The `pl.Datetime` type represents a datetime.

In [4]:
pl.read_csv(
    "clock_in_times.csv",
    schema_overrides={"clock_in": pl.Datetime, "clock_out": pl.Datetime},
)

employee_id,clock_in,clock_out
str,datetime[μs],datetime[μs]
"""E001""",2025-07-01 08:55:00,2025-07-01 17:05:00
"""E002""",2025-07-01 09:10:00,2025-07-01 17:45:00
"""E003""",2025-07-01 08:50:00,2025-07-01 16:30:00
"""E004""",2025-07-01 10:00:00,2025-07-01 18:00:00
"""E005""",2025-07-01 07:45:00,2025-07-01 15:15:00


- If the `DataFrame` already exists, use the `str.to_datetime` method to convert a string column into a datetime column.

In [5]:
clock_in_times = pl.read_csv("clock_in_times.csv")
clock_in_times

employee_id,clock_in,clock_out
str,str,str
"""E001""","""2025-07-01 08:55:00""","""2025-07-01T17:05:00"""
"""E002""","""2025-07-01 09:10:00""","""2025-07-01T17:45:00"""
"""E003""","""2025-07-01 08:50:00""","""2025-07-01T16:30:00"""
"""E004""","""2025-07-01 10:00:00""","""2025-07-01T18:00:00"""
"""E005""","""2025-07-01 07:45:00""","""2025-07-01T15:15:00"""


In [6]:
clock_in_times.with_columns(
    pl.col("clock_in").str.to_datetime(), pl.col("clock_out").str.to_datetime()
)

employee_id,clock_in,clock_out
str,datetime[μs],datetime[μs]
"""E001""",2025-07-01 08:55:00,2025-07-01 17:05:00
"""E002""",2025-07-01 09:10:00,2025-07-01 17:45:00
"""E003""",2025-07-01 08:50:00,2025-07-01 16:30:00
"""E004""",2025-07-01 10:00:00,2025-07-01 18:00:00
"""E005""",2025-07-01 07:45:00,2025-07-01 15:15:00


### Further Reading
- https://docs.pola.rs/api/python/stable/reference/api/polars.read_csv.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.str.to_datetime.html

## Parse Datetimes with the strptime Method
- The `try_parse_dates` parameter may fail to convert a string column into datetime values.
- Polars will fail to convert the `us_format` and `custom_format` columns below.
- If the conversion fails, import the column as strings, then use the `str.strptime` method to convert.
- Pass the desired Polars data type to the `dtype` parameter.

In [7]:
weird_datetimes = pl.read_csv("weird_datetimes.csv", try_parse_dates=True)
weird_datetimes.head(2)

iso_format,us_format,custom_format
datetime[μs],str,str
2025-07-06 14:30:00,"""07/06/2025 02:30 PM""","""06-Jul-2025 14:30"""
2025-07-07 09:15:00,"""07/07/2025 09:15 AM""","""07-Jul-2025 09:15"""


- `strptime` stands for "string - parse time" (i.e., parse/read a datetime from a string).
- The method accepts a format string which uses symbols to designate the components of the datetime format.
- For example, the `%m` symbol designates a month, the `%d` symbol designates a day, and the `%Y` symbol designates a 4-digit year.
- Include spaces in the format string. It must perfectly match the format of the datetime string.
- The `strptime` method relies on the Rust `chrono` crate behind the scenes.
- The format string syntax must follow the `chrono` crate standard which may deviate from Python's standard.

In [8]:
weird_datetimes.with_columns(
    pl.col("us_format").str.strptime(dtype=pl.Datetime, format="%m/%d/%Y %I:%M %p"),
    pl.col("custom_format").str.strptime(dtype=pl.Datetime, format="%d-%b-%Y %H:%M"),
)

iso_format,us_format,custom_format
datetime[μs],datetime[μs],datetime[μs]
2025-07-06 14:30:00,2025-07-06 14:30:00,2025-07-06 14:30:00
2025-07-07 09:15:00,2025-07-07 09:15:00,2025-07-07 09:15:00
2025-07-08 17:45:00,2025-07-08 17:45:00,2025-07-08 17:45:00
2025-07-09 23:59:59,2025-07-09 23:59:00,2025-07-09 23:59:00
2025-07-10 00:00:00,2025-07-10 00:00:00,2025-07-10 00:00:00
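
For comparison, the same format strings can be tried with Python's own `datetime.strptime`. This is only a plain-Python analogue, not the `chrono` implementation Polars uses; for the specifiers in this cell the two appear to agree, but that should not be assumed for every specifier.

```python
from datetime import datetime

us_format = "07/06/2025 02:30 PM"
custom_format = "06-Jul-2025 14:30"

# %I is the 12-hour clock hour and %p is AM/PM; %b is the abbreviated month name.
parsed_us = datetime.strptime(us_format, "%m/%d/%Y %I:%M %p")
parsed_custom = datetime.strptime(custom_format, "%d-%b-%Y %H:%M")

# Dropping the space before %p means the format no longer matches the string.
try:
    datetime.strptime(us_format, "%m/%d/%Y %I:%M%p")
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(parsed_us, parsed_custom, mismatch_raised)
```

Both strings describe the same moment, so the two parsed values compare equal.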


### Further Reading
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.str.strptime.html

## Parse Dates and Times

In [9]:
weird_datetimes = pl.read_csv("weird_datetimes.csv", try_parse_dates=True)
weird_datetimes.head(2)

iso_format,us_format,custom_format
datetime[μs],str,str
2025-07-06 14:30:00,"""07/06/2025 02:30 PM""","""06-Jul-2025 14:30"""
2025-07-07 09:15:00,"""07/07/2025 09:15 AM""","""07-Jul-2025 09:15"""


- Pass a type of `pl.Date` to extract just date information (no associated time).
- The `format` parameter must still receive the full format string so that it can parse the string.

In [10]:
weird_datetimes.with_columns(
    pl.col("us_format").str.strptime(dtype=pl.Date, format="%m/%d/%Y %I:%M %p"),
    pl.col("custom_format").str.strptime(dtype=pl.Date, format="%d-%b-%Y %H:%M"),
)

iso_format,us_format,custom_format
datetime[μs],date,date
2025-07-06 14:30:00,2025-07-06,2025-07-06
2025-07-07 09:15:00,2025-07-07,2025-07-07
2025-07-08 17:45:00,2025-07-08,2025-07-08
2025-07-09 23:59:59,2025-07-09,2025-07-09
2025-07-10 00:00:00,2025-07-10,2025-07-10


- Use a `dtype` of `pl.Time` to extract a `time` column (no date information).

In [11]:
weird_datetimes.with_columns(
    pl.col("us_format").str.strptime(dtype=pl.Time, format="%m/%d/%Y %I:%M %p"),
    pl.col("custom_format").str.strptime(dtype=pl.Time, format="%d-%b-%Y %H:%M"),
)

iso_format,us_format,custom_format
datetime[μs],time,time
2025-07-06 14:30:00,14:30:00,14:30:00
2025-07-07 09:15:00,09:15:00,09:15:00
2025-07-08 17:45:00,17:45:00,17:45:00
2025-07-09 23:59:59,23:59:00,23:59:00
2025-07-10 00:00:00,00:00:00,00:00:00


- As an alternative to `strptime` and `dtype`, use the `str.to_datetime`, `str.to_date` and `str.to_time` methods.
- The methods accept the format of the time string.

In [12]:
weird_datetimes.select(
    pl.col("us_format"),
    pl.col("us_format").str.to_datetime("%m/%d/%Y %I:%M %p").alias("datetime"),
    pl.col("us_format").str.to_date("%m/%d/%Y %I:%M %p").alias("date"),
    pl.col("us_format").str.to_time("%m/%d/%Y %I:%M %p").alias("time"),
)

us_format,datetime,date,time
str,datetime[μs],date,time
"""07/06/2025 02:30 PM""",2025-07-06 14:30:00,2025-07-06,14:30:00
"""07/07/2025 09:15 AM""",2025-07-07 09:15:00,2025-07-07,09:15:00
"""07/08/2025 05:45 PM""",2025-07-08 17:45:00,2025-07-08,17:45:00
"""07/09/2025 11:59 PM""",2025-07-09 23:59:00,2025-07-09,23:59:00
"""07/10/2025 12:00 AM""",2025-07-10 00:00:00,2025-07-10,00:00:00


### Further Reading
- https://docs.pola.rs/user-guide/transformations/time-series/parsing/#casting-strings-to-dates
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.str.strptime.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.str.to_datetime.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.str.to_date.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.str.to_time.html

## Converting from Date to String
- The `Datetime` type handles datetime values (both a calendar day and a specific time).
- The `Date` type represents a calendar date without an associated time.
- The `Time` type represents a time without an associated calendar date.
- The `dt` attribute/namespace holds all methods for building temporal expressions.

In [13]:
clock_in_times = pl.read_csv("clock_in_times.csv", try_parse_dates=True)
clock_in_times

employee_id,clock_in,clock_out
str,datetime[μs],datetime[μs]
"""E001""",2025-07-01 08:55:00,2025-07-01 17:05:00
"""E002""",2025-07-01 09:10:00,2025-07-01 17:45:00
"""E003""",2025-07-01 08:50:00,2025-07-01 16:30:00
"""E004""",2025-07-01 10:00:00,2025-07-01 18:00:00
"""E005""",2025-07-01 07:45:00,2025-07-01 15:15:00


- The `dt.date` method extracts the date from a datetime column.
- The `dt.time` method extracts the time from a datetime column.

In [14]:
clock_in_times.with_columns(
    pl.col("clock_in").dt.date().alias("clock_in_date"),
    pl.col("clock_in").dt.time().alias("clock_in_time"),
)

employee_id,clock_in,clock_out,clock_in_date,clock_in_time
str,datetime[μs],datetime[μs],date,time
"""E001""",2025-07-01 08:55:00,2025-07-01 17:05:00,2025-07-01,08:55:00
"""E002""",2025-07-01 09:10:00,2025-07-01 17:45:00,2025-07-01,09:10:00
"""E003""",2025-07-01 08:50:00,2025-07-01 16:30:00,2025-07-01,08:50:00
"""E004""",2025-07-01 10:00:00,2025-07-01 18:00:00,2025-07-01,10:00:00
"""E005""",2025-07-01 07:45:00,2025-07-01 15:15:00,2025-07-01,07:45:00


- Datetime columns enable temporal operations that would be impossible with strings.
- For example, datetimes support adding or subtracting durations as columns.
- Use the `dt.to_string` method to convert datetime columns to strings.
- By default, Polars will convert the datetime to an ISO 8601 standard string.

In [15]:
clock_in_times.select(
    pl.col("clock_in"),
    pl.col("clock_in").dt.to_string().alias("default"),
    pl.col("clock_in").dt.to_string("%B %d, %Y").alias("formatted"),
    pl.col("clock_in").dt.date().dt.to_string().alias("date_default"),
    pl.col("clock_in").dt.date().dt.to_string("%B %d, %Y").alias("date_formatted"),
)

clock_in,default,formatted,date_default,date_formatted
datetime[μs],str,str,str,str
2025-07-01 08:55:00,"""2025-07-01 08:55:00.000000""","""July 01, 2025""","""2025-07-01""","""July 01, 2025"""
2025-07-01 09:10:00,"""2025-07-01 09:10:00.000000""","""July 01, 2025""","""2025-07-01""","""July 01, 2025"""
2025-07-01 08:50:00,"""2025-07-01 08:50:00.000000""","""July 01, 2025""","""2025-07-01""","""July 01, 2025"""
2025-07-01 10:00:00,"""2025-07-01 10:00:00.000000""","""July 01, 2025""","""2025-07-01""","""July 01, 2025"""
2025-07-01 07:45:00,"""2025-07-01 07:45:00.000000""","""July 01, 2025""","""2025-07-01""","""July 01, 2025"""


### Further Reading
- https://docs.pola.rs/user-guide/expressions/casting/#parsing-formatting-temporal-data-types
- https://docs.pola.rs/user-guide/transformations/time-series/parsing/#extracting-date-features-from-a-date-column
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.date.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.time.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.to_string.html

## Extracting Datetime Components
- The `dt` namespace holds various methods for extracting date components (day, month, year, etc.).

In [16]:
history = pl.read_csv("history.csv", try_parse_dates=True)
history.head(3)

event,date
str,date
"""Declaration of Independence""",1776-07-04
"""Constitution Signed""",1787-09-17
"""Louisiana Purchase""",1803-04-30


- The `dt.millennium` method returns the millennium. A millennium is a period of 1000 years.
- The `dt.century` method returns the century. A century is a period of 100 years.
- The `dt.year` method returns the year.
- The `dt.month` method returns the month.
- The `dt.day` method returns the day.
- The `dt.quarter` method returns the quarter of the year.

In [17]:
history.with_columns(
    pl.col("date").dt.millennium().alias("millennium"),
    pl.col("date").dt.century().alias("century"),
    pl.col("date").dt.year().alias("year"),
    pl.col("date").dt.month().alias("month"),
    pl.col("date").dt.day().alias("day"),
    pl.col("date").dt.quarter().alias("quarter"),
)

event,date,millennium,century,year,month,day,quarter
str,date,i32,i32,i32,i8,i8,i8
"""Declaration of Independence""",1776-07-04,2,18,1776,7,4,3
"""Constitution Signed""",1787-09-17,2,18,1787,9,17,3
"""Louisiana Purchase""",1803-04-30,2,19,1803,4,30,2
"""Civil War Begins""",1861-04-12,2,19,1861,4,12,2
"""Emancipation Proclamation""",1863-01-01,2,19,1863,1,1,1
"""Pearl Harbor Attack""",1941-12-07,2,20,1941,12,7,4
"""Moon Landing""",1969-07-20,2,20,1969,7,20,3
"""Release of Polars Course""",2025-09-30,3,21,2025,9,30,3


- The `dt.weekday` method returns the day of the week as a number. Monday is 1 and Sunday is 7.
- The `dt.days_in_month` method returns the number of days in the date's month.
- The `dt.ordinal_day` method returns the day of the year.
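
As a plain-Python cross-check of these three components, the standard library can compute the same numbers for a single date. This sketch uses `isoweekday`, `calendar.monthrange`, and `tm_yday` as stand-ins; they are not what Polars calls internally.

```python
import calendar
from datetime import date

moon_landing = date(1969, 7, 20)

# ISO weekday numbering: Monday is 1 and Sunday is 7.
weekday = moon_landing.isoweekday()

# monthrange returns (weekday of the 1st, number of days in the month).
days_in_month = calendar.monthrange(moon_landing.year, moon_landing.month)[1]

# tm_yday is the day of the year, counting from 1.
ordinal_day = moon_landing.timetuple().tm_yday

print(weekday, days_in_month, ordinal_day)
```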

In [18]:
history.with_columns(
    pl.col("date").dt.weekday().alias("weekday"),
    pl.col("date").dt.days_in_month().alias("days_in_month"),
    pl.col("date").dt.ordinal_day().alias("ordinal_day"),
)

event,date,weekday,days_in_month,ordinal_day
str,date,i8,i8,i16
"""Declaration of Independence""",1776-07-04,4,31,186
"""Constitution Signed""",1787-09-17,1,30,260
"""Louisiana Purchase""",1803-04-30,6,30,120
"""Civil War Begins""",1861-04-12,5,30,102
"""Emancipation Proclamation""",1863-01-01,4,31,1
"""Pearl Harbor Attack""",1941-12-07,7,31,341
"""Moon Landing""",1969-07-20,7,31,201
"""Release of Polars Course""",2025-09-30,2,30,273


- The `dt.is_business_day` method returns `True` if the day is a work day (Monday through Friday).
- The `dt.is_leap_year` method returns `True` if the year is a leap year.

In [19]:
history.with_columns(
    pl.col("date").dt.is_business_day().alias("is_business_day"),
    pl.col("date").dt.is_leap_year().alias("is_leap_year"),
)

event,date,is_business_day,is_leap_year
str,date,bool,bool
"""Declaration of Independence""",1776-07-04,True,True
"""Constitution Signed""",1787-09-17,True,False
"""Louisiana Purchase""",1803-04-30,False,False
"""Civil War Begins""",1861-04-12,True,False
"""Emancipation Proclamation""",1863-01-01,True,False
"""Pearl Harbor Attack""",1941-12-07,False,False
"""Moon Landing""",1969-07-20,False,False
"""Release of Polars Course""",2025-09-30,True,False


### Further Reading
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.millennium.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.century.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.year.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.month.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.day.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.quarter.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.weekday.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.days_in_month.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.ordinal_day.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.is_business_day.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.is_leap_year.html

## Filtering by Date, Time, and Datetime
- Polars supports dates, times, and datetimes in expressions.
- Use the `pl.date`, `pl.time`, and `pl.datetime` functions to model the temporal value to compare each row with.
- These are distinct from the `pl.Date`, `pl.Time`, and `pl.Datetime` _types_.
- The `date`, `datetime`, and `time` objects from Python's `datetime` module also work.

In [20]:
history = pl.read_csv("history.csv", try_parse_dates=True)
history

event,date
str,date
"""Declaration of Independence""",1776-07-04
"""Constitution Signed""",1787-09-17
"""Louisiana Purchase""",1803-04-30
"""Civil War Begins""",1861-04-12
"""Emancipation Proclamation""",1863-01-01
"""Pearl Harbor Attack""",1941-12-07
"""Moon Landing""",1969-07-20
"""Release of Polars Course""",2025-09-30


In [21]:
history.filter(pl.col("date").dt.is_business_day())

event,date
str,date
"""Declaration of Independence""",1776-07-04
"""Constitution Signed""",1787-09-17
"""Civil War Begins""",1861-04-12
"""Emancipation Proclamation""",1863-01-01
"""Release of Polars Course""",2025-09-30


In [22]:
history.filter(pl.col("date") == pl.date(1803, 4, 30))
history.filter(pl.col("date").eq(pl.date(1803, 4, 30)))

event,date
str,date
"""Louisiana Purchase""",1803-04-30


In [23]:
import datetime as dt

In [24]:
history.filter(pl.col("date") > dt.date(1800, 1, 1))
history.filter(pl.col("date").gt(dt.date(1800, 1, 1)))

event,date
str,date
"""Louisiana Purchase""",1803-04-30
"""Civil War Begins""",1861-04-12
"""Emancipation Proclamation""",1863-01-01
"""Pearl Harbor Attack""",1941-12-07
"""Moon Landing""",1969-07-20
"""Release of Polars Course""",2025-09-30


- The `is_between` method is ideal for extracting dates that fall within a time range.
- Both endpoints are inclusive by default.

In [25]:
history.filter(pl.col("date").is_between(pl.date(1800, 1, 1), pl.date(1899, 12, 31)))
history.filter(pl.col("date").dt.century().eq(19))

event,date
str,date
"""Louisiana Purchase""",1803-04-30
"""Civil War Begins""",1861-04-12
"""Emancipation Proclamation""",1863-01-01


- The same methods apply to datetimes and times.

In [26]:
clock_in_times = (
    pl.read_csv("clock_in_times.csv", try_parse_dates=True)
    .select("clock_in")
    .sort("clock_in")
)
clock_in_times

clock_in
datetime[μs]
2025-07-01 07:45:00
2025-07-01 08:50:00
2025-07-01 08:55:00
2025-07-01 09:10:00
2025-07-01 10:00:00


In [27]:
clock_in_times.filter(pl.col("clock_in") >= pl.datetime(2025, 7, 1, 8, 53, 0))
clock_in_times.filter(pl.col("clock_in").ge(pl.datetime(2025, 7, 1, 8, 53, 0)))

clock_in
datetime[μs]
2025-07-01 08:55:00
2025-07-01 09:10:00
2025-07-01 10:00:00


In [28]:
clock_in_times.select(pl.col("clock_in").cast(pl.Time)).filter(
    pl.col("clock_in").le(pl.time(9, 0, 0))
)

clock_in
time
07:45:00
08:50:00
08:55:00


### Further Reading
- https://docs.pola.rs/user-guide/transformations/time-series/filter/#filtering-by-single-dates
- https://docs.pola.rs/user-guide/transformations/time-series/filter/#filtering-by-a-date-range
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.is_business_day.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.datetime
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.date

## Adding and Subtracting Time I
- A duration/timedelta models an interval of time (e.g., 5 hours, 40 minutes).

In [29]:
history = pl.read_csv("history.csv", try_parse_dates=True)
history.head(2)

event,date
str,date
"""Declaration of Independence""",1776-07-04
"""Constitution Signed""",1787-09-17


- Use the `+` sign to add time and the `-` sign to subtract time.
- Python's `dt.timedelta` object and the Polars `pl.duration` function both represent a duration.
- The `pl.duration` function accepts `days`, `weeks`, `hours` parameters, and more.

In [30]:
history.with_columns((pl.col("date") + pl.duration(days=3)).alias("new_date"))
history.with_columns(pl.col("date").add(pl.duration(days=3)).alias("new_date"))
history.with_columns(pl.col("date").add(pl.duration(weeks=8, days=3)).alias("new_date"))

event,date,new_date
str,date,date
"""Declaration of Independence""",1776-07-04,1776-09-01
"""Constitution Signed""",1787-09-17,1787-11-15
"""Louisiana Purchase""",1803-04-30,1803-06-28
"""Civil War Begins""",1861-04-12,1861-06-10
"""Emancipation Proclamation""",1863-01-01,1863-03-01
"""Pearl Harbor Attack""",1941-12-07,1942-02-04
"""Moon Landing""",1969-07-20,1969-09-17
"""Release of Polars Course""",2025-09-30,2025-11-28


In [31]:
history.with_columns((pl.col("date") - pl.duration(days=3)).alias("new_date"))
history.with_columns(pl.col("date").sub(pl.duration(days=3)).alias("new_date"))

event,date,new_date
str,date,date
"""Declaration of Independence""",1776-07-04,1776-07-01
"""Constitution Signed""",1787-09-17,1787-09-14
"""Louisiana Purchase""",1803-04-30,1803-04-27
"""Civil War Begins""",1861-04-12,1861-04-09
"""Emancipation Proclamation""",1863-01-01,1862-12-29
"""Pearl Harbor Attack""",1941-12-07,1941-12-04
"""Moon Landing""",1969-07-20,1969-07-17
"""Release of Polars Course""",2025-09-30,2025-09-27


- Polars will maintain the data type of the original column.
- It will thus ignore hours, minutes, and seconds for a date column.

In [32]:
history.with_columns(
    pl.col("date").add(pl.duration(weeks=8, days=3, hours=5)).alias("new_date")
)

event,date,new_date
str,date,date
"""Declaration of Independence""",1776-07-04,1776-09-01
"""Constitution Signed""",1787-09-17,1787-11-15
"""Louisiana Purchase""",1803-04-30,1803-06-28
"""Civil War Begins""",1861-04-12,1861-06-10
"""Emancipation Proclamation""",1863-01-01,1863-03-01
"""Pearl Harbor Attack""",1941-12-07,1942-02-04
"""Moon Landing""",1969-07-20,1969-09-17
"""Release of Polars Course""",2025-09-30,2025-11-28


- Convert the column to datetimes _first_, then perform the addition.
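
Python's own `date` type behaves in an analogous way, which may help build intuition. This is a standard-library sketch, not Polars code: adding a `timedelta` to a `date` drops the sub-day part, while promoting the date to a `datetime` first keeps it.

```python
from datetime import date, datetime, time, timedelta

start = date(2025, 9, 30)
offset = timedelta(weeks=8, days=3, hours=5)

# Adding to a date keeps the result a date, so the 5 hours are dropped.
as_date = start + offset

# Promote the date to a datetime (midnight) first to keep the hours.
as_datetime = datetime.combine(start, time()) + offset

print(as_date, as_datetime)
```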

In [33]:
history.with_columns(
    pl.col("date")
    .cast(pl.Datetime)
    .add(pl.duration(weeks=8, days=3, hours=5))
    .alias("new_date")
)

event,date,new_date
str,date,datetime[μs]
"""Declaration of Independence""",1776-07-04,1776-09-01 05:00:00
"""Constitution Signed""",1787-09-17,1787-11-15 05:00:00
"""Louisiana Purchase""",1803-04-30,1803-06-28 05:00:00
"""Civil War Begins""",1861-04-12,1861-06-10 05:00:00
"""Emancipation Proclamation""",1863-01-01,1863-03-01 05:00:00
"""Pearl Harbor Attack""",1941-12-07,1942-02-04 05:00:00
"""Moon Landing""",1969-07-20,1969-09-17 05:00:00
"""Release of Polars Course""",2025-09-30,2025-11-28 05:00:00


### Further Reading
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.duration
- https://docs.pola.rs/api/python/stable/reference/api/polars.datatypes.Datetime.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.add.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.sub.html

## Adding and Subtracting Time II
- The `dt.offset_by` method is recommended for adding time to or subtracting time from datetimes.
- Unlike the duration type, it supports years/months and accounts for nuances like leap years.
- The `%D` specifier is a shortcut for `%m/%d/%y` (month/day/year).

In [34]:
deliveries = pl.read_csv("deliveries.csv").select(
    pl.col("order_date").str.to_datetime(format="%D")
)
deliveries.head(2)

order_date
datetime[μs]
1998-05-24 00:00:00
1992-04-22 00:00:00


- The `dt.offset_by` method accepts its own format string.
- `h` stands for hour, `m` stands for minute, and so on. A leading `-` subtracts the offset instead of adding it.

In [35]:
deliveries.with_columns(
    pl.col("order_date").dt.offset_by("-12h30m").alias("time_12_and_a_half_hours_earlier")
)

order_date,time_12_and_a_half_hours_earlier
datetime[μs],datetime[μs]
1998-05-24 00:00:00,1998-05-23 11:30:00
1992-04-22 00:00:00,1992-04-21 11:30:00
1991-02-10 00:00:00,1991-02-09 11:30:00
1992-07-21 00:00:00,1992-07-20 11:30:00
1993-09-02 00:00:00,1993-09-01 11:30:00
…,…
1991-06-24 00:00:00,1991-06-23 11:30:00
1991-09-09 00:00:00,1991-09-08 11:30:00
1990-11-16 00:00:00,1990-11-15 11:30:00
1993-06-03 00:00:00,1993-06-02 11:30:00


- The advantage of `dt.offset_by` is the increased awareness of time.
- For example, say we want to find the same calendar day next month.
- We can't add a consistent duration because months have different numbers of days.
- The `mo` symbol stands for "calendar month".

In [36]:
deliveries.with_columns(
    pl.col("order_date").dt.offset_by("1mo").alias("same_calendar_day_next_month")
)

order_date,same_calendar_day_next_month
datetime[μs],datetime[μs]
1998-05-24 00:00:00,1998-06-24 00:00:00
1992-04-22 00:00:00,1992-05-22 00:00:00
1991-02-10 00:00:00,1991-03-10 00:00:00
1992-07-21 00:00:00,1992-08-21 00:00:00
1993-09-02 00:00:00,1993-10-02 00:00:00
…,…
1991-06-24 00:00:00,1991-07-24 00:00:00
1991-09-09 00:00:00,1991-10-09 00:00:00
1990-11-16 00:00:00,1990-12-16 00:00:00
1993-06-03 00:00:00,1993-07-03 00:00:00


### Further Reading
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.offset_by.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.str.to_datetime.html

## The duration Type
- The Polars duration type represents a span of time.
- Sorting durations works as expected. An ascending order means "shortest duration" to "longest duration".

In [37]:
deliveries = pl.read_csv("deliveries.csv").with_columns(
    pl.col("order_date", "delivery_date").str.to_date(format="%D")
)
deliveries.head(2)

ID,order_date,delivery_date
i64,date,date
1,1998-05-24,1999-02-05
2,1992-04-22,1998-03-06


In [38]:
deliveries_with_durations = deliveries.with_columns(
    (pl.col("delivery_date") - pl.col("order_date")).alias("time_to_deliver")
).sort("time_to_deliver")

- Methods for the `duration` type are also found within the `dt` namespace.
- The `dt.total_` family of methods represents the duration with a different time unit.
- For example, `dt.total_hours` represents the duration in hours (as an `i64`).

In [39]:
deliveries_with_durations.with_columns(
    pl.col("time_to_deliver").dt.total_days().alias("total_days"),
    pl.col("time_to_deliver").dt.total_hours().alias("total_hours"),
    pl.col("time_to_deliver").dt.total_minutes().alias("total_minutes"),
)

ID,order_date,delivery_date,time_to_deliver,total_days,total_hours,total_minutes
i64,date,date,duration[μs],i64,i64,i64
898,1990-05-24,1990-06-01,8d,8,192,11520
19,1998-05-10,1998-05-19,9d,9,216,12960
612,1994-08-11,1994-08-20,9d,9,216,12960
994,1993-06-03,1993-06-13,10d,10,240,14400
310,1997-09-20,1997-10-06,16d,16,384,23040
…,…,…,…,…,…,…
331,1990-09-18,1999-12-19,3379d,3379,81096,4865760
130,1990-04-02,1999-08-16,3423d,3423,82152,4929120
904,1990-02-13,1999-11-15,3562d,3562,85488,5129280
314,1990-03-07,1999-12-25,3580d,3580,85920,5155200


### Further Reading
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.total_days.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.total_hours.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.total_minutes.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.str.to_date.html
- https://docs.pola.rs/api/python/stable/reference/api/polars.datatypes.Duration.html

## Timezones I
- Coordinated Universal Time (UTC) establishes a reference point/standard for the current time.
- A "naive" date/datetime is one that has no awareness of a timezone. All datetimes we've seen so far have been naive.
- The `read_json` method does NOT support a `try_parse_dates` parameter.
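
The naive-versus-aware distinction also exists in Python's standard `datetime` module, which gives a quick way to see it outside Polars. In this sketch, treating the value as UTC is an assumption made purely for illustration.

```python
from datetime import datetime, timedelta, timezone

# A naive datetime carries no timezone information at all.
naive = datetime(2025, 6, 22, 21, 0)

# Attaching UTC (an assumption for this example) makes it timezone-aware.
aware = naive.replace(tzinfo=timezone.utc)

print(naive.tzinfo)
print(aware.isoformat())
```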

In [40]:
pl.read_json("flights.json")

flight_id,airline,aircraft,status,departure_city,departure_time,arrival_city,arrival_time
str,str,str,str,str,str,str,str
"""AA593""","""American Airlines""","""Boeing 787""","""Delayed""","""Paris""","""2025-06-22 21:00:00""","""Seattle""","""2025-06-23 01:00:00"""
"""AA885""","""Lufthansa""","""Airbus A320""","""On Time""","""Denver""","""2025-06-18 19:00:00""","""New York""","""2025-06-18 23:00:00"""
"""UA303""","""United""","""Boeing 737""","""Delayed""","""Miami""","""2025-06-16 23:00:00""","""Denver""","""2025-06-17 04:00:00"""
"""AA602""","""Delta""","""Boeing 787""","""Delayed""","""Chicago""","""2025-06-20 15:00:00""","""Seattle""","""2025-06-20 19:00:00"""
"""DL801""","""JetBlue""","""Boeing 787""","""Delayed""","""Seattle""","""2025-06-20 06:00:00""","""Miami""","""2025-06-20 10:00:00"""
…,…,…,…,…,…,…,…
"""AA426""","""American Airlines""","""Boeing 737""","""Cancelled""","""New York""","""2025-06-22 06:00:00""","""Los Angeles""","""2025-06-22 12:00:00"""
"""DL544""","""Delta""","""Embraer 190""","""On Time""","""Seattle""","""2025-06-20 06:00:00""","""London""","""2025-06-20 15:00:00"""
"""AA943""","""United""","""Embraer 190""","""On Time""","""New York""","""2025-06-19 18:00:00""","""Miami""","""2025-06-19 22:00:00"""
"""DL157""","""Delta""","""Airbus A320""","""Delayed""","""Paris""","""2025-06-19 04:00:00""","""New York""","""2025-06-19 08:00:00"""


- Let's pass `schema_overrides` a dictionary that maps the string columns to the datetime type.
- Let's also sort by the `departure_time` while we're here.

In [41]:
pl.read_json(
    "flights.json",
    schema_overrides={"departure_time": pl.Datetime, "arrival_time": pl.Datetime},
).sort("departure_time")

flight_id,airline,aircraft,status,departure_city,departure_time,arrival_city,arrival_time
str,str,str,str,str,datetime[μs],str,datetime[μs]
"""DL961""","""Delta""","""Boeing 787""","""On Time""","""Chicago""",2025-06-15 08:00:00,"""Paris""",2025-06-15 12:00:00
"""DL677""","""Lufthansa""","""Airbus A320""","""Cancelled""","""New York""",2025-06-15 13:00:00,"""Los Angeles""",2025-06-15 19:00:00
"""UA204""","""United""","""Embraer 190""","""On Time""","""New York""",2025-06-15 14:00:00,"""Los Angeles""",2025-06-15 20:00:00
"""DL898""","""Lufthansa""","""Boeing 787""","""On Time""","""Miami""",2025-06-15 15:00:00,"""London""",2025-06-15 19:00:00
"""DL734""","""American Airlines""","""Boeing 787""","""Delayed""","""Tokyo""",2025-06-15 17:00:00,"""Miami""",2025-06-15 21:00:00
…,…,…,…,…,…,…,…
"""AA169""","""JetBlue""","""Boeing 737""","""Delayed""","""Miami""",2025-06-24 04:00:00,"""Tokyo""",2025-06-24 08:00:00
"""DL638""","""American Airlines""","""Embraer 190""","""Cancelled""","""Seattle""",2025-06-24 04:00:00,"""New York""",2025-06-24 08:00:00
"""AA791""","""JetBlue""","""Boeing 737""","""Cancelled""","""Tokyo""",2025-06-25 00:00:00,"""Los Angeles""",2025-06-25 11:00:00
"""UA603""","""Delta""","""Boeing 737""","""Delayed""","""New York""",2025-06-25 01:00:00,"""Denver""",2025-06-25 05:00:00


- These datetimes are naive. What timezone are they supposed to represent?
- The `dt.replace_time_zone` method establishes a timezone for a datetime column.
- If we know these datetime values are storing UTC times, we can pass a string of `"UTC"`.
- A column's values must be in only one timezone. A column does not support multiple timezones.
- Notice the column type changes from `datetime[μs]` to a timezone-aware type: `datetime[μs, UTC]` for `"UTC"`, or `datetime[μs, America/New_York]` in the cell below.

In [42]:
pl.read_json(
    "flights.json",
    schema_overrides={"departure_time": pl.Datetime, "arrival_time": pl.Datetime},
).sort("departure_time").with_columns(
    pl.col("departure_time")
    .dt.replace_time_zone("America/New_York")
    .dt.offset_by("4h"),
    pl.col("arrival_time").dt.replace_time_zone("America/New_York").dt.offset_by("4h"),
)

flight_id,airline,aircraft,status,departure_city,departure_time,arrival_city,arrival_time
str,str,str,str,str,"datetime[μs, America/New_York]",str,"datetime[μs, America/New_York]"
"""DL961""","""Delta""","""Boeing 787""","""On Time""","""Chicago""",2025-06-15 12:00:00 EDT,"""Paris""",2025-06-15 16:00:00 EDT
"""DL677""","""Lufthansa""","""Airbus A320""","""Cancelled""","""New York""",2025-06-15 17:00:00 EDT,"""Los Angeles""",2025-06-15 23:00:00 EDT
"""UA204""","""United""","""Embraer 190""","""On Time""","""New York""",2025-06-15 18:00:00 EDT,"""Los Angeles""",2025-06-16 00:00:00 EDT
"""DL898""","""Lufthansa""","""Boeing 787""","""On Time""","""Miami""",2025-06-15 19:00:00 EDT,"""London""",2025-06-15 23:00:00 EDT
"""DL734""","""American Airlines""","""Boeing 787""","""Delayed""","""Tokyo""",2025-06-15 21:00:00 EDT,"""Miami""",2025-06-16 01:00:00 EDT
…,…,…,…,…,…,…,…
"""AA169""","""JetBlue""","""Boeing 737""","""Delayed""","""Miami""",2025-06-24 08:00:00 EDT,"""Tokyo""",2025-06-24 12:00:00 EDT
"""DL638""","""American Airlines""","""Embraer 190""","""Cancelled""","""Seattle""",2025-06-24 08:00:00 EDT,"""New York""",2025-06-24 12:00:00 EDT
"""AA791""","""JetBlue""","""Boeing 737""","""Cancelled""","""Tokyo""",2025-06-25 04:00:00 EDT,"""Los Angeles""",2025-06-25 15:00:00 EDT
"""UA603""","""Delta""","""Boeing 737""","""Delayed""","""New York""",2025-06-25 05:00:00 EDT,"""Denver""",2025-06-25 09:00:00 EDT


### Further Reading
- https://docs.pola.rs/user-guide/transformations/time-series/parsing/#mixed-offsets
- https://docs.pola.rs/user-guide/transformations/time-series/timezones/
- https://docs.pola.rs/api/python/stable/reference/api/polars.read_json.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.replace_time_zone.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.offset_by.html

## Timezones II: Conversions
- Python's standard library includes a `zoneinfo` module. It was introduced in Python 3.9.
- The `zoneinfo` module's `available_timezones` function returns a set of all timezones.

In [43]:
import zoneinfo

len(zoneinfo.available_timezones())

timezones = zoneinfo.available_timezones()

- The `timezones` set includes both "UTC" and city-specific timezones.

In [44]:
"Europe/Paris" in timezones

True

- We can pass the `dt.replace_time_zone` method any one of these timezones.
- Let's say our dataset's datetimes are stored based on Paris time.
- The data type of the column changes. Each row's formatted value also includes a timezone abbreviation (`CEST` for Paris in summer).

In [45]:
flights = (
    pl.read_json(
        "flights.json",
        schema_overrides={"departure_time": pl.Datetime, "arrival_time": pl.Datetime},
    )
    .sort("departure_time")
    .with_columns(
        pl.col("departure_time").dt.replace_time_zone("Europe/Paris"),
        pl.col("arrival_time").dt.replace_time_zone("Europe/Paris"),
    )
)

- The `dt.convert_time_zone` method converts one timezone to another.
- For example, Los Angeles is 3 hours behind New York (and 9 hours behind Paris, as the output below shows).

In [46]:
flights.select(
    pl.col("departure_time"),
    pl.col("departure_time")
    .dt.convert_time_zone("America/Los_Angeles")
    .alias("departure_time_on_west_coast"),
    pl.col("departure_time")
    .dt.convert_time_zone("Europe/London")
    .alias("departure_time_in_london"),
)

departure_time,departure_time_on_west_coast,departure_time_in_london
"datetime[μs, Europe/Paris]","datetime[μs, America/Los_Angeles]","datetime[μs, Europe/London]"
2025-06-15 08:00:00 CEST,2025-06-14 23:00:00 PDT,2025-06-15 07:00:00 BST
2025-06-15 13:00:00 CEST,2025-06-15 04:00:00 PDT,2025-06-15 12:00:00 BST
2025-06-15 14:00:00 CEST,2025-06-15 05:00:00 PDT,2025-06-15 13:00:00 BST
2025-06-15 15:00:00 CEST,2025-06-15 06:00:00 PDT,2025-06-15 14:00:00 BST
2025-06-15 17:00:00 CEST,2025-06-15 08:00:00 PDT,2025-06-15 16:00:00 BST
…,…,…
2025-06-24 04:00:00 CEST,2025-06-23 19:00:00 PDT,2025-06-24 03:00:00 BST
2025-06-24 04:00:00 CEST,2025-06-23 19:00:00 PDT,2025-06-24 03:00:00 BST
2025-06-25 00:00:00 CEST,2025-06-24 15:00:00 PDT,2025-06-24 23:00:00 BST
2025-06-25 01:00:00 CEST,2025-06-24 16:00:00 PDT,2025-06-25 00:00:00 BST


### Further Reading
- https://docs.python.org/3/library/zoneinfo.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.replace_time_zone.html
- https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.convert_time_zone.html