# Lecture 15

## datetime
- [basics](#basics)
- the [time module](#time)
- time zone [awareness](#tzinfo)
- datetime in [Pandas](#to_datetime)
---

### Basics <a class='anchor' id="basics"></a>

Python has no built-in date type, but the `datetime` [module](https://docs.python.org/3.8/library/datetime.html) in the standard library provides date and time functionality. It defines `date`, `time`, and `datetime` objects, which provide the tools needed to handle dates, times, and timestamps.

In [1]:
import datetime

In [2]:
D = datetime.date(2022, 9, 1)
T = datetime.time(12,0,0) # noon
DT= datetime.datetime(2022, 9, 1, 14, 32, 23)

In [3]:
print('D:', D)
print('T:', T)
print('DT:', DT)

D: 2022-09-01
T: 12:00:00
DT: 2022-09-01 14:32:23


In [4]:
print('D:', type(D))
print('T:', type(T))
print('DT:', type(DT))

D: <class 'datetime.date'>
T: <class 'datetime.time'>
DT: <class 'datetime.datetime'>


To keep things simple, you can always use the `datetime()` constructor and pass only the date parts (year, month, day). Hour, minute, and second default to zero.

In [5]:
DT = datetime.datetime(2022, 9, 1)
print('DT:', DT)

DT: 2022-09-01 00:00:00
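To illustrate how the three object types relate, here is a minimal sketch that splits a `datetime` into its `date` and `time` parts and recombines them with `datetime.combine()`. The variable name `DT2` is only illustrative.

```python
import datetime

DT = datetime.datetime(2022, 9, 1, 14, 32, 23)

# Split a datetime into its date and time components
D = DT.date()
T = DT.time()
print(D)  # 2022-09-01
print(T)  # 14:32:23

# Recombine the two parts into a single datetime
DT2 = datetime.datetime.combine(D, T)
print(DT2 == DT)  # True
```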


Datetime objects have useful date attributes.

In [6]:
DN = datetime.datetime.now()
print(DN.year)
print(DN.month)
print(DN.day)
print(DN.hour)
print(DN.minute)
print(DN.second)

2022
8
2
15
49
12


The weekday code starts at 0, with Monday being 0.

In [7]:
print(DN.weekday()) # Note that while year, month, etc. are attributes, weekday() is a method. That's why it needs the parentheses.

1
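As a short sketch of the numbering convention, the example below uses a fixed date (August 2, 2022, a Tuesday) so the result does not depend on when it is run. It also shows `isoweekday()`, which counts from 1 instead of 0, and the `%A` format code, whose output depends on the active locale.

```python
import datetime

D = datetime.datetime(2022, 8, 2)  # a Tuesday

print(D.weekday())       # 1  (Monday is 0)
print(D.isoweekday())    # 2  (Monday is 1)
print(D.strftime('%A'))  # full weekday name in the current locale
```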


Math operations are also meaningful with datetime objects.

In [8]:
D1 = datetime.datetime(2010, 12, 10, 14, 32, 23)
DN = datetime.datetime.now()
dt = DN - D1
print(dt)

4253 days, 1:16:49.866214


In [9]:
type(dt)

datetime.timedelta

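These operations create a `timedelta` object, which represents a duration rather than a point in time. It stores only `days`, `seconds`, and `microseconds` as attributes; it has no `year`, `month`, or `hour`.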

In [10]:
print(dt.days)

4253
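The sketch below looks a bit closer at what a `timedelta` exposes. It uses two fixed timestamps instead of `now()` so the numbers are reproducible; the names `hours`, `minutes`, and `secs` are illustrative helpers, not attributes of the object.

```python
import datetime

dt = datetime.datetime(2022, 8, 2, 15, 49, 12) - datetime.datetime(2010, 12, 10, 14, 32, 23)

print(dt.days)             # whole days: 4253
print(dt.seconds)          # leftover seconds within the last day: 4609
print(dt.total_seconds())  # the entire duration expressed in seconds

# Hours and minutes are not stored; derive them from the leftover seconds
hours, remainder = divmod(dt.seconds, 3600)
minutes, secs = divmod(remainder, 60)
print(hours, minutes, secs)  # 1 16 49
```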


Adding a `timedelta` to a datetime object, or subtracting one from it, creates a new datetime object shifted by that duration.

In [11]:
D0 = datetime.datetime(2022, 9, 1, 14, 32)
D1 = D0 + datetime.timedelta(days = 3)
print(D1)

2022-09-04 14:32:00


In [12]:
D2 = D0 - datetime.timedelta(hours = 1) # You can go back in time either by subtracting a timedelta or by adding a negative one.
D3 = D0 + datetime.timedelta(weeks = -2)
print(D2)
print(D3)

2022-09-01 13:32:00
2022-08-18 14:32:00
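One common use of this arithmetic is building a sequence of evenly spaced timestamps. The following is a minimal sketch; the names `step` and `weekly` are illustrative.

```python
import datetime

D0 = datetime.datetime(2022, 9, 1, 14, 32)
step = datetime.timedelta(days=7)

# Five weekly timestamps starting at D0
weekly = [D0 + i * step for i in range(5)]
for d in weekly:
    print(d)
```

Multiplying a `timedelta` by an integer scales the duration, so `i * step` is `i` weeks.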


`strptime()` (**str**ing **p**arse **time**) parses a string into a datetime object according to a format string.

In [13]:
d1 = '2022-04-30'
d2 = 'April 30, 2022'

In [14]:
print('d1:', datetime.datetime.strptime( d1, '%Y-%m-%d'))
print('d2:', datetime.datetime.strptime( d2, '%B %d, %Y'))

d1: 2022-04-30 00:00:00
d2: 2022-04-30 00:00:00


The opposite is `strftime()` (**str**ing **f**ormat **time**), which is a _datetime object method_ and needs only the format string as input.

In [15]:
DT = datetime.datetime(2022, 4, 30, 14, 32, 0)
print(DT)
print(DT.strftime('%Y-%m-%d'))
print(DT.strftime('%B %d, %Y'))

2022-04-30 14:32:00
2022-04-30
April 30, 2022


The conversion rules between strings and datetime objects, together with the full list of format codes, can be found in the [datetime documentation](https://docs.python.org/3.8/library/datetime.html#strftime-strptime-behavior).
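As a small sketch of how the two functions mirror each other, the same format string can be used to parse a string and to write it back out. The format and sample string below are made up for illustration. When the string does not match the format, `strptime()` raises a `ValueError`.

```python
import datetime

fmt = '%d/%m/%Y %H:%M'
text = '30/04/2022 14:32'

# Parse the string, then format it back with the same format
DT = datetime.datetime.strptime(text, fmt)
print(DT)                        # 2022-04-30 14:32:00
print(DT.strftime(fmt) == text)  # True

# A string that does not match the format cannot be parsed
try:
    datetime.datetime.strptime('April 30, 2022', fmt)
except ValueError as err:
    print('Could not parse:', err)
```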

### The `time` Module <a class='anchor' id='time'></a>

In [16]:
import time

In [17]:
print(time.time())

1659448153.0722294


`time.time()` returns the *current time* in seconds since the [*UNIX epoch*](https://en.wikipedia.org/wiki/Unix_time), which is January 1, 1970, 00:00:00 UTC. This is ***almost*** the number of seconds elapsed since midnight of that reference day, because leap seconds are not counted. The `time` module provides two particularly useful functions: `time()` and `sleep()`, the latter of which pauses execution for the given number of seconds.

In [18]:
n = 0

while n < 5:
    print(time.time())
    time.sleep(2)
    n += 1

1659448153.0913315
1659448155.1061146
1659448157.110048
1659448159.1126032
1659448161.1127906
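Two typical uses of these values are sketched below: measuring how long a piece of code takes, and converting an epoch timestamp back into a datetime. The `sleep(0.2)` call is only a stand-in for real work, and the timestamp is the one printed earlier in this section.

```python
import datetime
import time

# Measure elapsed wall-clock time
start = time.time()
time.sleep(0.2)  # stand-in for the code being timed
elapsed = time.time() - start
print(f'Elapsed: {elapsed:.2f} s')

# Convert an epoch timestamp to a timezone-aware datetime in UTC
ts = 1659448153.0722294
DT = datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)
print(DT)
```

Passing `tz=` explicitly avoids depending on the timezone of the machine that runs the code.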


### Aware And Naive Timestamps <a class='anchor' id='tzinfo'></a>

In [19]:
D = datetime.datetime(2021,9,28,1,0,15)
D

datetime.datetime(2021, 9, 28, 1, 0, 15)

Date and time objects may be categorized as “**aware**” or “**naive**” depending on whether or not they include timezone information. Our D variable is *naive*.

In [20]:
print(D.tzinfo)

None


In [21]:
# pytz is a third-party timezone package, not part of the standard library
import pytz

In [22]:
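# localize() attaches the timezone without shifting the clock time; astimezone() on a naive datetime would assume the system's local timezone first
D_V = pytz.timezone("Europe/Vienna").localize(D)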

In [23]:
print(D_V.tzinfo)

Europe/Vienna


In [24]:
print(D_V)
print(D_V.hour)

2021-09-28 01:00:15+02:00
1


In [25]:
# What would it be in Denver, CO?
D_D = D_V.astimezone(pytz.timezone("America/Denver"))
print(D_D)
print(D_D.hour)
print(D_D.tzinfo)

2021-09-27 17:00:15-06:00
17
America/Denver


While at first glance these two objects look different, they represent the same instant, so the difference between them is zero.

In [27]:
D_D - D_V

datetime.timedelta(0)

While this may look like a minor issue, timezone awareness (or the lack of it) can create a mess in your code if you don't check your datetime data. Interactions with databases can be especially tricky, because they may return timestamps in a different timezone than the one you entered. Also, some modules, like [fbProphet](https://facebook.github.io/prophet/), are picky regarding timezone awareness.
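One concrete way this goes wrong is mixing naive and aware objects. The sketch below uses the standard library's `datetime.timezone.utc` so that it runs without `pytz`; `is_aware()` is a hypothetical helper written for this example, not a library function.

```python
import datetime

naive = datetime.datetime(2021, 9, 28, 1, 0, 15)
aware = datetime.datetime(2021, 9, 28, 1, 0, 15, tzinfo=datetime.timezone.utc)

def is_aware(d):
    """Return True if d carries usable timezone information."""
    return d.tzinfo is not None and d.tzinfo.utcoffset(d) is not None

print(is_aware(naive))  # False
print(is_aware(aware))  # True

# Subtracting a naive datetime from an aware one is not allowed
try:
    aware - naive
except TypeError as err:
    print('TypeError:', err)
```

A check like this before doing arithmetic or writing to a database can catch mixed data early.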

Finally, a list of the available timezone names can be found [here](https://gist.github.com/heyalexej/8bf688fd67d7199be4a1682b3eec7568) or by inspecting the
```
pytz.all_timezones
```
attribute.

### Datetime In Pandas <a class = 'anchor' id = 'to_datetime'></a>

Pandas has the `to_datetime()` function to handle date and time information effectively.

In [28]:
import pandas as pd

In [29]:
df = pd.DataFrame(
    {
        'year': [2022, 2022, 2021],
        'month': [12, 11, 9],
        'day': [15, 14, 9],
        'hour': [23, 20, 6],
        'data': ['a', 'b', 'b']
    }
)

In [30]:
df

| | year | month | day | hour | data |
|---|---|---|---|---|---|
| 0 | 2022 | 12 | 15 | 23 | a |
| 1 | 2022 | 11 | 14 | 20 | b |
| 2 | 2021 | 9 | 9 | 6 | b |


In [31]:
df['date'] = pd.to_datetime(df[['year', 'month', 'day']])

In [32]:
df

| | year | month | day | hour | data | date |
|---|---|---|---|---|---|---|
| 0 | 2022 | 12 | 15 | 23 | a | 2022-12-15 |
| 1 | 2022 | 11 | 14 | 20 | b | 2022-11-14 |
| 2 | 2021 | 9 | 9 | 6 | b | 2021-09-09 |


Each value in the new column is a *Pandas timestamp*.

In [33]:
type(df.date.iloc[0])

pandas._libs.tslibs.timestamps.Timestamp

In [34]:
df['date_hour'] = pd.to_datetime(df[['year', 'month', 'day', 'hour']])

In [35]:
df

| | year | month | day | hour | data | date | date_hour |
|---|---|---|---|---|---|---|---|
| 0 | 2022 | 12 | 15 | 23 | a | 2022-12-15 | 2022-12-15 23:00:00 |
| 1 | 2022 | 11 | 14 | 20 | b | 2022-11-14 | 2022-11-14 20:00:00 |
| 2 | 2021 | 9 | 9 | 6 | b | 2021-09-09 | 2021-09-09 06:00:00 |


`to_datetime()` also parses strings into datetime values.

In [36]:
df = pd.DataFrame(
    {
        'date': ['April 30, 2022', 'May 1, 2022', 'May 2, 2022'],
        'data': ['a', 'b', 'b']
    }
)

In [37]:
df

| | date | data |
|---|---|---|
| 0 | April 30, 2022 | a |
| 1 | May 1, 2022 | b |
| 2 | May 2, 2022 | b |


In [38]:
df['date'] = pd.to_datetime(df.date) # The 'format=' option tells the function how to parse the strings, but many common formats are parsed correctly without it.

In [39]:
df

| | date | data |
|---|---|---|
| 0 | 2022-04-30 | a |
| 1 | 2022-05-01 | b |
| 2 | 2022-05-02 | b |
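When the strings are ambiguous or messy, spelling out the format can help. The sketch below uses made-up day-first strings and assumes a reasonably recent pandas version; `errors='coerce'` turns entries that cannot be parsed into `NaT` instead of raising an error.

```python
import pandas as pd

raw = pd.Series(['30.04.2022', '01.05.2022', 'not a date'])

# An explicit format removes the day/month ambiguity;
# errors='coerce' replaces unparseable entries with NaT
parsed = pd.to_datetime(raw, format='%d.%m.%Y', errors='coerce')
print(parsed)
```

Without `format=`, a string such as `01.05.2022` could be read as either January 5 or May 1, depending on the parser's defaults.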
