# Time series preliminaries
By the end of this lecture you will be able to:
- use Python `datetime` objects
- create a date range in Polars

Specifying dates and times as strings is error-prone because a given string can map to different dates depending on the format used. For this reason the Polars developers have decided not to allow strings to stand in for dates and times in the library.
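
As a small illustration of this ambiguity, the sketch below parses one string with two different format strings using `datetime.strptime` (covered later in this lecture). The example string is made up for illustration.

```python
from datetime import datetime

ambiguous = "01/02/2023"

# Day-first reading: 1st February 2023
day_first = datetime.strptime(ambiguous, "%d/%m/%Y")
# Month-first reading: 2nd January 2023
month_first = datetime.strptime(ambiguous, "%m/%d/%Y")

print(day_first.date(), month_first.date())
```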

Instead, dates and times in Polars are specified using objects from Python's built-in `datetime` module. We import the classes we need from this module

In [None]:
from datetime import datetime,date,time,timedelta

import polars as pl

## Datetimes
A datetime is a combination of a date and a time that can be specified down to microseconds.

### Creating a datetime
We create a `datetime` object by specifying at least the year, month and day and optionally the hour, minute, second and microsecond. Here we create a datetime for 2023/02/01 12:00:03.000001

In [None]:
dt = datetime(2023,2,1,12,0,3,1)
dt

### Accessing Date and Time Components
A datetime object allows you to access individual components of the date and time:



In [None]:
year = dt.year
month = dt.month
day = dt.day
hour = dt.hour
minute = dt.minute
second = dt.second

print(f"Year: {year}, Month: {month}, Day: {day}, Hour: {hour}, Minute: {minute}, Second: {second}")

We can get the current datetime with the `now` method of the `datetime` class

In [None]:
datetime.now()

### Timestamps
A datetime can also be represented as a timestamp: the number of seconds since the start of the Unix epoch on 1st January 1970 (UTC). We can get this representation for a `datetime` object with the `timestamp` method

In [None]:
dt.timestamp()

To create a datetime object from a timestamp, you can use the fromtimestamp method of the datetime class. This method converts a timestamp to a datetime object representing the corresponding date and time

In [None]:
timestamp = 1672531200

datetime.fromtimestamp(timestamp)
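
Note that `fromtimestamp` interprets the timestamp in the local time zone of the machine running the code, so the value shown can differ between machines. As a sketch, passing `tz=timezone.utc` (from the standard `datetime` module) gives the same result everywhere:

```python
from datetime import datetime, timezone

unix_seconds = 1672531200

# Interpret the timestamp in UTC rather than the machine's local time zone
utc_dt = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
print(utc_dt)  # 2023-01-01 00:00:00+00:00

# A timezone-aware datetime converts back to the same timestamp
print(utc_dt.timestamp())  # 1672531200.0
```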

### Formatting datetime objects
The strftime method allows you to format a datetime object into a string:

In [None]:
dt.strftime("%Y-%m-%d %H:%M:%S")

You can use various format codes to customize the output. Some commonly used format codes include:

- %Y: Year with century as a decimal number.
- %m: Month as a zero-padded decimal number.
- %d: Day of the month as a zero-padded decimal number.
- %H: Hour (24-hour clock) as a zero-padded decimal number.
- %M: Minute as a zero-padded decimal number.
- %S: Second as a zero-padded decimal number.
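
The format codes can be arranged in any order and mixed with literal characters. A short sketch with two illustrative formats:

```python
from datetime import datetime

example_dt = datetime(2023, 2, 1, 12, 0, 3)

# Day-first date with slashes, and a time without seconds
slash_format = example_dt.strftime("%d/%m/%Y %H:%M")
print(slash_format)  # 01/02/2023 12:00

# Literal characters such as "T" are passed through unchanged
compact_format = example_dt.strftime("%Y%m%dT%H%M%S")
print(compact_format)  # 20230201T120003
```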

### Parsing strings into datetime objects
The strptime method allows you to create a datetime object from a string representing a date and time:

In [None]:
date_string = "2023-01-01 12:00:00"
datetime.strptime(date_string, "%Y-%m-%d %H:%M:%S")
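
The format string must match the input exactly. As a sketch of what happens otherwise, parsing a slash-separated string (made up for illustration) with the dash-separated format raises a `ValueError`:

```python
from datetime import datetime

mismatched_string = "01/01/2023 12:00:00"

try:
    datetime.strptime(mismatched_string, "%Y-%m-%d %H:%M:%S")
    parsed_ok = True
except ValueError as err:
    # strptime refuses to guess when the string does not fit the format
    parsed_ok = False
    print(f"Could not parse: {err}")
```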

## Dates
The datetime.date class provides methods for creating, manipulating, and formatting date objects.

### Creating date objects
To create a date object, you can use the date class constructor. Here is an example:

In [None]:
date(2023,2,1)

### Getting the Current Date
You can get the current date using the today() method:

In [None]:
date.today()

### Accessing Date Components
A date object allows you to access individual components of the date:

In [None]:
date_obj = date(2023,2,1)

year = date_obj.year
month = date_obj.month
day = date_obj.day

print(f"Year: {year}, Month: {month}, Day: {day}")

### Formatting date objects
The strftime method allows you to format a date object into a string:

In [None]:
date_obj.strftime("%Y-%m-%d")

As with datetime objects, you can use various format codes to customize the output. Some commonly used format codes include:

- %Y: Year with century as a decimal number.
- %m: Month as a zero-padded decimal number.
- %d: Day of the month as a zero-padded decimal number.

### Parsing Strings into date objects
The `strptime` method of the `datetime` class can be used to parse a string representing a date, and the `date` method then extracts the date part:

In [None]:
date_string = "2023-01-01"
datetime.strptime(date_string, "%Y-%m-%d").date()
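
For strings that are already in ISO 8601 form (`YYYY-MM-DD`), the standard library also offers `date.fromisoformat`, which needs no format string. A brief sketch comparing the two approaches:

```python
from datetime import date, datetime

iso_string = "2023-01-01"

via_strptime = datetime.strptime(iso_string, "%Y-%m-%d").date()
via_isoformat = date.fromisoformat(iso_string)

print(via_strptime == via_isoformat)  # True
```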

### Creating date objects from timestamps
You can create date objects from timestamps using the fromtimestamp method, which is useful when working with data stored as Unix timestamps:

In [None]:
timestamp = 1672531200  # 2023-01-01 00:00:00 UTC; the resulting date depends on the local time zone
date.fromtimestamp(timestamp)

## Times
Next, we'll focus on the `datetime.time` class, which handles a time of day (hours, minutes, seconds and microseconds) without any reference to a date. It provides methods for creating, manipulating and formatting time objects, and is useful when you need to work with times independently of dates.

### Creating time objects
To create a time object, you can use the time class constructor. Here is an example where we create a `time` object by specifying the hour and optionally the minute, second and microsecond

In [None]:
time(14, 30, 0)

### Accessing Time Components
A time object allows you to access individual components of the time:

In [None]:
time_obj = time(14, 30, 0)

hour = time_obj.hour
minute = time_obj.minute
second = time_obj.second
microsecond = time_obj.microsecond

print(f"Hour: {hour}, Minute: {minute}, Second: {second}, Microsecond: {microsecond}")


### Formatting time objects
The strftime method allows you to format a time object into a string:

In [None]:
time_obj.strftime("%H:%M:%S")

As with datetime and date objects, you can use various format codes to customize the output. Some commonly used format codes include:

- %H: Hour (24-hour clock) as a zero-padded decimal number.
- %M: Minute as a zero-padded decimal number.
- %S: Second as a zero-padded decimal number.
- %f: Microsecond as a decimal number, zero-padded on the left.
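
As a short sketch of the `%f` code, the example below formats a time with a microsecond component (the value is chosen for illustration):

```python
from datetime import time

precise_time = time(14, 30, 0, 250)

# %f always produces six digits, padded with leading zeros
formatted = precise_time.strftime("%H:%M:%S.%f")
print(formatted)  # 14:30:00.000250
```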

### Parsing strings into time objects
The `strptime` method of the `datetime` class can be used to parse a string representing a time, and the `time` method then extracts the time part:

In [None]:
time_string = "14:30:00"
datetime.strptime(time_string, "%H:%M:%S").time()

## Duration / time difference
Next, we'll focus on the `datetime.timedelta` class, which represents a duration: the difference between two dates, times or datetimes. It provides methods for creating and manipulating durations, and is essential for doing arithmetic with dates and times or comparing them.

### Creating timedelta objects
To create a timedelta object, you can use the timedelta class constructor. The constructor accepts several keyword arguments, including days, seconds, microseconds, milliseconds, minutes, hours, and weeks

In [None]:
timedelta(days=1,seconds=15)
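
Whatever keyword arguments are passed, a `timedelta` stores only days, seconds and microseconds, and normalises the inputs into those three fields. A short sketch with illustrative values:

```python
from datetime import timedelta

td = timedelta(weeks=1, hours=36, milliseconds=1500)

# 1 week + 36 hours = 8 days and 12 hours
# 1500 milliseconds = 1 second and 500000 microseconds
print(td.days)          # 8
print(td.seconds)       # 43201
print(td.microseconds)  # 500000
```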

You can create a `timedelta` object by taking the difference between datetimes

In [None]:
datetime(2023,1,1) - datetime(2022,1,1)

or dates

In [None]:
date(2023,1,1) - date(2022,1,1)

We can compare `timedelta` objects with standard comparison operators

In [None]:
(date(2023,1,1) - date(2020,1,1)) > (date(2023,1,1) - date(2022,1,1))
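
A `timedelta` can also be added to or subtracted from a datetime, and multiplied by a number. A minimal sketch with illustrative values:

```python
from datetime import datetime, timedelta

start = datetime(2023, 1, 1)
step = timedelta(days=1, hours=3)

later = start + step    # 2023-01-02 03:00:00
earlier = start - step  # 2022-12-30 21:00:00
doubled = step * 2      # 2 days, 6:00:00

print(later, earlier, doubled)
```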

### Getting the Total Duration
You can get the total duration represented by a timedelta object using the total_seconds method

In [None]:
# Use a new name so we do not overwrite the datetime stored in `dt` earlier
td = timedelta(days=1,seconds=15)
td.total_seconds()

Be aware that there is also a `seconds` attribute - but this only holds the seconds part of the duration.  It does not include the full duration in terms of days or microseconds.
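
The difference between the two can be sketched as follows:

```python
from datetime import timedelta

duration = timedelta(days=1, seconds=15)

print(duration.days)             # 1
print(duration.seconds)          # 15 (the seconds part only)
print(duration.total_seconds())  # 86415.0 (the full duration in seconds)
```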

Note that the largest unit a `timedelta` stores is days (weeks are converted to days). This means `timedelta` does not have to deal with tricky units like months. For example, if we add one month to 1st February we would expect to get 1st March. But if we add one month to 28th February, do we expect to get 28th March or 31st March? Polars has ways to deal with this ambiguity that we will see later.
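
As a sketch of why months are awkward, the standard library itself refuses to guess: moving 31st January into February with `date.replace` raises a `ValueError` because 31st February does not exist.

```python
from datetime import date

end_of_january = date(2023, 1, 31)

try:
    end_of_january.replace(month=2)
    shifted_ok = True
except ValueError as err:
    # There is no 31st February, so Python raises rather than picking a day
    shifted_ok = False
    print(f"Cannot move to February: {err}")
```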

Polars also has its own string intervals:
- "ns" (nanosecond)
- "us" (microsecond)
- "ms" (millisecond)
- "s" (second)
- "m" (minute)
- "h" (hour)
- "d" (day)
- "w" (week)
- "mo" (month)
- "y" (year)

So one week would be "1w".

These can also be concatenated so 1 day 3 hours is "1d3h"

We learn more about these intervals later in the time series section.

## Creating a datetime range
There are a number of ways to create a datetime range in Polars. We introduce the simplest way here.

We first specify our start, end and interval with `datetime` module objects

In [None]:
start_datetime = datetime(2023,1,1)
end_datetime = datetime(2023,1,1,4)
hourly_interval = timedelta(hours=1)

We create a datetime range `Series` using `pl.datetime_range`. Note that we have to specify `eager=True` for this to be evaluated - we explore why this is in a later lecture on date ranges

In [None]:
pl.datetime_range(
    start=start_datetime,
    end=end_datetime,
    interval=hourly_interval,
    eager=True
)

The output is a Polars `Series`. The dtype in this case is `pl.Datetime`. We learn more about Polars datetime dtypes in the next lecture.

There are other options we can pass to `pl.datetime_range` including:
- how the date range is closed (on both sides by default) and
- whether to specify a time zone

In [None]:
pl.datetime_range(
    start=start_datetime,
    end=end_datetime,
    interval=hourly_interval,
    eager=True,
    closed="left",
)

We can also pass `date` objects rather than `datetime` objects as the start and end, here with an interval that is a whole number of days

In [None]:
start_date = date(2023,1,1)
end_date = date(2023,1,23)
weekly_interval = timedelta(weeks=1)

In [None]:
pl.datetime_range(
    start=start_date,
    end=end_date,
    interval=weekly_interval,
    eager=True,
)

## Exercises
In the exercises you will learn to:
- use `datetime` objects
- create a date range in Polars

### Exercise 1
Create `date` objects for the 1st and 2nd January 2020 along with a 3 hour time interval using a `timedelta`

Create a `DataFrame` with a date range column called `date` using these parameters

In [None]:
df = pl.DataFrame(
    {
        <blank>
    }
)
df

Create the `DataFrame` again using Polars string intervals at 2 hour 30 minute intervals

## Solutions

### Solution to exercise 1

Create `date` objects for the 1st and 2nd January 2020 along with a 3 hour time interval

In [None]:
start_date = date(2020,1,1)
end_date = date(2020,1,2)
interval = timedelta(hours=3)

Create a `DataFrame` with a date range column called `date` using these parameters

In [None]:
df = pl.DataFrame(
    {
        "date":pl.datetime_range(
            start=start_date,
            end=end_date,
            interval=interval,
            eager=True
        )
    }
)
df

Note the `eager=True` argument, which is not the default for `pl.datetime_range`!

Create the `DataFrame` again using Polars string intervals at 2 hour 30 minute intervals

In [None]:
df = pl.DataFrame(
    {
        "date":pl.datetime_range(
            start=start_date,
            end=end_date,
            interval="2h30m",
            eager=True
        )
    }
)
df