# Date and Time

## Date

We originally discussed the four main data types: `bool`, `int`, `float` and `string`. So which one of these would be best for handling dates? At first glance, a string seems to work best.

In [14]:
birthday = '01-02-1993'

This is fine if we want to view the date, but not great if we want to do any calculations with it. How do we add a day to it? Or get the next month? Or find how many days are between two dates?

To do this we need to be able to treat it numerically. One approach is to turn each component into an integer. First, we can split the date up by its separator (in this case the dash).

In [65]:
birthday.split('-')

['01', '02', '1993']

This returns a list of the components, each still a string. We can now unpack the list to assign a variable to each.

In [15]:
day, month, year = birthday.split('-')

In [18]:
"I was born in " + year

'I was born in 1993'

But how would we go about calculating age in days?


You can see how this can get complicated really quickly. Thankfully, Python has a built-in way of handling all this: a new type of object, somewhere between a string and a number, that can do all this computation for us.

In [62]:
from datetime import date

Provide the date in the form year, month, day:

In [51]:
new_birthday = date(1993, 2, 1)

In [52]:
new_birthday

datetime.date(1993, 2, 1)

This doesn't look immediately different; if anything, it is a little harder to read. To display it in a readable form, use the `print()` function.

In [66]:
print(new_birthday)

1993-02-01
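One practical benefit of a `date` object over a plain string is that it checks the calendar for us. The sketch below is a minimal illustration, using 30 February as an example of a date that cannot exist:

```python
from datetime import date

# A real calendar date is accepted
valid = date(1993, 2, 1)
print(valid)  # 1993-02-01

# An impossible date (30 February) raises ValueError
try:
    date(1993, 2, 30)
    rejected = False
except ValueError as error:
    rejected = True
    print("Invalid date:", error)
```

A string such as `'30-02-1993'` would have been stored without complaint.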


### String to Date

Combining this together, to go from a string to a `date` object, we could do something like this:

In [77]:
day, month, year = birthday.split('-')

# Don't forget to turn the variables into integers
day, month, year = int(day), int(month), int(year)

In [79]:
date(year, month, day)

datetime.date(1993, 2, 1)
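As an alternative sketch, the standard library can also parse the string in one step with `datetime.strptime()`. This assumes the string is always in day-month-year order and separated by dashes. The format codes `%d`, `%m` and `%Y` stand for day, month and four-digit year.

```python
from datetime import datetime

birthday = '01-02-1993'

# Parse day-month-year in one call, then keep only the date part
parsed = datetime.strptime(birthday, "%d-%m-%Y").date()
print(parsed)  # 1993-02-01
```

The split-and-convert approach makes each step visible, while `strptime()` also rejects strings that do not match the expected layout.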

### Date to String

What about going the other way? How could we get a string from a date object? Well, we know we can use `print()` for a start.

In [83]:
print(new_birthday)

1993-02-01


This is the date format that the International Organization for Standardization (ISO) settled on, in the ISO 8601 standard. Going Year-Month-Day is ideal because when you sort the strings alphabetically they also end up in chronological order.
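The sorting behaviour is easy to check with plain strings. The dates below are made up for illustration and do not come from any dataset:

```python
# The same three hypothetical dates written in two layouts
iso_strings = ["2021-03-15", "1993-02-01", "2020-12-31"]
day_first_strings = ["15-03-2021", "01-02-1993", "31-12-2020"]

# Year-month-day strings sort into chronological order
sorted_iso = sorted(iso_strings)
print(sorted_iso)  # ['1993-02-01', '2020-12-31', '2021-03-15']

# Day-first strings sort by day number, which is not chronological
sorted_day_first = sorted(day_first_strings)
print(sorted_day_first)  # ['01-02-1993', '15-03-2021', '31-12-2020']
```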

But what if we want the string in a different format? Here we use the "string format time" method, `strftime()`, and provide the structure for the string. We use `%d` for the day, `%m` for the month and `%Y` for the year.

Now you can choose your preferred format:

In [93]:
new_birthday.strftime("%d/%m/%Y")

'01/02/1993'

Or pull out individual elements:

In [92]:
new_birthday.strftime("%Y")

'1993'

Or even construct sentences:

In [89]:
new_birthday.strftime("The day is %d and month is %m")

'The day is 01 and month is 02'
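A few more formatting options, sketched under the assumption that the default (English) locale is in use, since weekday and month names follow the locale:

```python
from datetime import date

new_birthday = date(1993, 2, 1)

# %A is the full weekday name and %B the full month name
long_form = new_birthday.strftime("%A %d %B %Y")
print(long_form)

# isoformat() returns the same YYYY-MM-DD string that print() shows
iso_form = new_birthday.isoformat()
print(iso_form)  # 1993-02-01

# f-strings accept the same format codes after a colon
slash_form = f"{new_birthday:%d/%m/%Y}"
print(slash_form)  # 01/02/1993
```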

### Date Attributes

Now that we have a date object, we can pull out each element through its attributes.

In [53]:
new_birthday.day

1

In [54]:
new_birthday.month

2

In [55]:
new_birthday.year

1993

We could also find out what day of the week it was:

In [56]:
new_birthday.weekday()

0

`weekday()` counts from 0 (Monday) to 6 (Sunday), so 0 means it was a Monday.
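If a name is more useful than a number, one possible approach is to look the number up in `calendar.day_name` from the standard library. This is a small sketch that assumes the default English locale:

```python
import calendar
from datetime import date

new_birthday = date(1993, 2, 1)

# weekday() counts from Monday = 0; isoweekday() counts from Monday = 1
print(new_birthday.weekday())     # 0
print(new_birthday.isoweekday())  # 1

# calendar.day_name maps the weekday() number to a name
day_name = calendar.day_name[new_birthday.weekday()]
print(day_name)
```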

We can also use this to pull out today's date.

In [64]:
date.today()

datetime.date(2021, 9, 1)

### Date Difference

So now, getting the difference between two dates is as easy as subtracting one from the other.

In [39]:
date.today() - new_birthday

datetime.timedelta(days=10439)

Great! We can now get the difference between two dates. The result is a new type of object, a `timedelta`. We can look inside this object to get the difference in other forms, such as a plain number of days.

In [40]:
diff = date.today() - new_birthday

In [45]:
diff.days

10439
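The same `timedelta` type also answers the earlier question of how to add a day to a date. A minimal sketch, using a fixed end date instead of `date.today()` so the result does not change from day to day:

```python
from datetime import date, timedelta

new_birthday = date(1993, 2, 1)

# Adding a timedelta to a date gives another date
next_day = new_birthday + timedelta(days=1)
print(next_day)  # 1993-02-02

# Month boundaries are handled for us (February 1993 had 28 days)
four_weeks_later = new_birthday + timedelta(weeks=4)
print(four_weeks_later)  # 1993-03-01

# Subtracting two fixed dates gives a timedelta
diff = date(2021, 9, 1) - new_birthday
print(diff.days)  # 10439
```

Note that `timedelta` has no `months` or `years` argument, because those units vary in length.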

## Time

In [157]:
from datetime import time

The same thing can be done to create a time, using the format `time(hour, minute, second)`. So ten seconds after midnight would be:

In [159]:
time(0,0,10)

datetime.time(0, 0, 10)

And 6:30pm would be the following (hours use the 24-hour clock, so 6pm is hour 18):

In [160]:
time(18, 30)

datetime.time(18, 30)

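`time` objects cannot be subtracted from one another directly. To get the difference between two times, we can instead create `timedelta` objects, which take keyword arguments such as `hours` and `minutes`.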

In [8]:
from datetime import timedelta

Calculating the difference between 9am and 5:30pm:

In [176]:
start = timedelta(hours=9)
end = timedelta(hours=17, minutes=30)

In [177]:
end - start

datetime.timedelta(seconds=30600)
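If the values already exist as `time` objects, one possible workaround is to attach them to the same date with `datetime.combine()` and subtract the results. The date used here is arbitrary and chosen only for illustration:

```python
from datetime import date, datetime, time

start_time = time(9, 0)
end_time = time(17, 30)

# Subtracting two time objects directly raises a TypeError
try:
    end_time - start_time
except TypeError as error:
    print("Cannot subtract times:", error)

# Attaching both times to the same (arbitrary) date makes them subtractable
any_day = date(2021, 1, 1)
worked = datetime.combine(any_day, end_time) - datetime.combine(any_day, start_time)
print(worked)  # 8:30:00
```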

## DateTime

The date object we created can be extended to include time information as well. For this, we will import the `datetime` class.

In [115]:
from datetime import datetime

In [123]:
start = datetime(2017, 10, 1, 15, 26, 26)
end = datetime(2021, 10, 1, 10, 20, 26)

In [124]:
end - start

datetime.timedelta(days=1460, seconds=68040)

The difference is now presented as a combination of days and seconds.
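Days and seconds are not always the most readable units. The sketch below shows one way to turn the same difference into a single number of seconds, or into days, hours and minutes, using `total_seconds()` and `divmod()`:

```python
from datetime import datetime

start = datetime(2017, 10, 1, 15, 26, 26)
end = datetime(2021, 10, 1, 10, 20, 26)
elapsed = end - start

# total_seconds() folds the days and seconds into a single number
total = elapsed.total_seconds()
print(total)  # 126212040.0

# divmod splits the leftover seconds into hours, minutes and seconds
hours, remainder = divmod(elapsed.seconds, 3600)
minutes, seconds = divmod(remainder, 60)
print(f"{elapsed.days} days, {hours} hours, {minutes} minutes, {seconds} seconds")
# 1460 days, 18 hours, 54 minutes, 0 seconds
```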

### DateTime from String

Parsing a datetime from a string is slightly more difficult than parsing a date. Whereas a date was easy to separate (by dashes or slashes), a datetime mixes several separators (slashes, a space and colons), e.g. `01/02/1990 17:59:00`.

In [126]:
start = "01/02/1990 17:59:00"

To achieve this we are going to use the `datetime.strptime()` method ("string parse time") to turn it into a datetime object. We just need to provide it with the format of our datetime. We can use the same format codes as before, with the addition of `%H` for hour, `%M` for minute, and `%S` for second (note the capital `M` for minute and the lowercase `m` for month).

In [133]:
datetime.strptime(start, "%d/%m/%Y %H:%M:%S")

datetime.datetime(1990, 2, 1, 17, 59)
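The format string has to match the input exactly. As a small sketch, the example below round-trips the same string and then shows what happens with a mismatched input (the second string is made up for illustration):

```python
from datetime import datetime

fmt = "%d/%m/%Y %H:%M:%S"

parsed = datetime.strptime("01/02/1990 17:59:00", fmt)

# strftime with the same format reproduces the original string
round_trip = parsed.strftime(fmt)
print(round_trip)  # 01/02/1990 17:59:00

# A string that does not match the format raises ValueError
try:
    datetime.strptime("1990-02-01 17:59", fmt)
    mismatch_rejected = False
except ValueError as error:
    mismatch_rejected = True
    print("Could not parse:", error)
```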

## Timezones

One last parameter we haven't looked at when creating our datetimes is `tzinfo`, which records the timezone the datetime was recorded in. It takes a `timezone` object, which is built from a `timedelta` (like the ones we created earlier) giving the offset from UTC.

In [185]:
from datetime import timezone

Japan Standard Time is nine hours ahead of UTC, so its offset is `timedelta(hours=9)`.

In [192]:
jpn = timezone(timedelta(hours=9))

In [193]:
dt = datetime(2017, 10, 1, 15, 26, 26, tzinfo=jpn)

In [194]:
dt

datetime.datetime(2017, 10, 1, 15, 26, 26, tzinfo=datetime.timezone(datetime.timedelta(seconds=32400)))
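Once a datetime carries a timezone, it can be converted to another one with `astimezone()`. The sketch below assumes a fixed offset of UTC+9 (Japan Standard Time) and converts to UTC; the variable names are illustrative only:

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+9 offset, as used by Japan Standard Time
jst = timezone(timedelta(hours=9))
local = datetime(2017, 10, 1, 15, 26, 26, tzinfo=jst)

# astimezone() converts to another timezone without changing the instant
as_utc = local.astimezone(timezone.utc)
print(as_utc)  # 2017-10-01 06:26:26+00:00

# Aware datetimes compare by instant, so these two are equal
print(local == as_utc)  # True
```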

## In DataFrames

In [3]:
import pandas as pd

In [5]:
climate = pd.read_csv('data/climate_change.csv')

In [9]:
climate.head()

|   | date | co2 | relative_temp |
|---|---|---|---|
| 0 | 1958-03-06 | 315.71 | 0.1 |
| 1 | 1958-04-06 | 317.45 | 0.01 |
| 2 | 1958-05-06 | 317.5 | 0.08 |
| 3 | 1958-06-06 | NaN | -0.05 |
| 4 | 1958-07-06 | 315.86 | 0.06 |


In [8]:
climate.dtypes

date              object
co2              float64
relative_temp    float64
dtype: object

We can see here that the date is stored as a plain string (object). We can use the Pandas function `to_datetime` to convert it to a datetime object.

In [147]:
climate['date'] = pd.to_datetime(climate['date'])

In [148]:
climate.dtypes

date             datetime64[ns]
co2                     float64
relative_temp           float64
dtype: object

pandas can usually work out other formats by itself, such as the month/day/year dates in this weather data:

In [155]:
weather = pd.read_csv('data/weather.csv')
weather.head()

|   | date | max_temperature_f | mean_temperature_f | min_temperature_f | max_dew_point_f | mean_dew_point_f | min_dew_point_f | max_humidity | mean_humidity | min_humidity | ... | mean_visibility_miles | min_visibility_miles | max_wind_Speed_mph | mean_wind_speed_mph | max_gust_speed_mph | precipitation_inches | cloud_cover | events | wind_dir_degrees | zip_code |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8/29/2013 | 74.0 | 68.0 | 61.0 | 61.0 | 58.0 | 56.0 | 93.0 | 75.0 | 57.0 | ... | 10.0 | 10.0 | 23.0 | 11.0 | 28.0 | 0 | 4.0 | NaN | 286.0 | 94107 |
| 1 | 8/30/2013 | 78.0 | 69.0 | 60.0 | 61.0 | 58.0 | 56.0 | 90.0 | 70.0 | 50.0 | ... | 10.0 | 7.0 | 29.0 | 13.0 | 35.0 | 0 | 2.0 | NaN | 291.0 | 94107 |
| 2 | 8/31/2013 | 71.0 | 64.0 | 57.0 | 57.0 | 56.0 | 54.0 | 93.0 | 75.0 | 57.0 | ... | 10.0 | 10.0 | 26.0 | 15.0 | 31.0 | 0 | 4.0 | NaN | 284.0 | 94107 |
| 3 | 9/1/2013 | 74.0 | 66.0 | 58.0 | 60.0 | 56.0 | 53.0 | 87.0 | 68.0 | 49.0 | ... | 10.0 | 10.0 | 25.0 | 13.0 | 29.0 | 0 | 4.0 | NaN | 284.0 | 94107 |
| 4 | 9/2/2013 | 75.0 | 69.0 | 62.0 | 61.0 | 60.0 | 58.0 | 93.0 | 77.0 | 61.0 | ... | 10.0 | 6.0 | 23.0 | 12.0 | 30.0 | 0 | 6.0 | NaN | 277.0 | 94107 |


In [154]:
pd.to_datetime(weather['date'])

0      2013-08-29
1      2013-08-30
2      2013-08-31
3      2013-09-01
4      2013-09-02
          ...    
3660   2015-08-27
3661   2015-08-28
3662   2015-08-29
3663   2015-08-30
3664   2015-08-31
Name: date, Length: 3665, dtype: datetime64[ns]
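Automatic detection cannot tell `9/1/2013` (1 September) from 9 January without help, so it can be safer to state the format explicitly. The sketch below uses a few values copied from the weather preview in place of the full CSV file, and also shows the `.dt` accessor, which exposes the same attributes as a single date:

```python
import pandas as pd

# A few values copied from the weather preview, in month/day/year order
raw = pd.Series(['8/29/2013', '8/30/2013', '9/1/2013'])

# Stating the format avoids guessing whether the day or the month comes first
dates = pd.to_datetime(raw, format='%m/%d/%Y')

# The .dt accessor works on the whole column at once
print(dates.dt.year.tolist())   # [2013, 2013, 2013]
print(dates.dt.month.tolist())  # [8, 8, 9]
print(dates.dt.day.tolist())    # [29, 30, 1]
```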

### Adjusting DateTimes

We have an example DataFrame here of timestamps from two different countries, each recorded in local time. Here we will adjust both of them to be in UTC.

In [21]:
feed = {'Timestamp':['2021/01/01 09:30:00', '2021/01/01 11:30:00', '2021/01/01 15:30:00', '2021/01/01 15:30:00'],
       'Country': ['Ireland', 'Japan', 'Ireland', 'Japan']}

In [22]:
feed = pd.DataFrame(feed)

In [23]:
feed.head()

|   | Timestamp | Country |
|---|---|---|
| 0 | 2021/01/01 09:30:00 | Ireland |
| 1 | 2021/01/01 11:30:00 | Japan |
| 2 | 2021/01/01 15:30:00 | Ireland |
| 3 | 2021/01/01 15:30:00 | Japan |


In [24]:
feed['Timestamp'] = pd.to_datetime(feed['Timestamp'])

If we want to add or subtract hours from a column, we can add or subtract a `timedelta` object. Japan Standard Time is 9 hours ahead of UTC, so we take 9 hours off the Japan timestamps. Ireland is on GMT in January, which matches UTC, so those timestamps are copied across unchanged (in summer, Ireland is one hour ahead of UTC and we would subtract one hour).

In [29]:
feed.loc[feed['Country'] == 'Japan', 'UTC'] = feed['Timestamp'] - timedelta(hours=9)
feed.loc[feed['Country'] == 'Ireland', 'UTC'] = feed['Timestamp']

In [30]:
feed

|   | Timestamp | Country | UTC |
|---|---|---|---|
| 0 | 2021-01-01 09:30:00 | Ireland | 2021-01-01 09:30:00 |
| 1 | 2021-01-01 11:30:00 | Japan | 2021-01-01 02:30:00 |
| 2 | 2021-01-01 15:30:00 | Ireland | 2021-01-01 15:30:00 |
| 3 | 2021-01-01 15:30:00 | Japan | 2021-01-01 06:30:00 |
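Hand-written offsets are easy to get wrong and ignore daylight saving time. As an alternative sketch, pandas can look up the offset from a named timezone with `tz_localize()` and `tz_convert()`. The country-to-timezone mapping and the `to_utc` helper below are assumptions made for this example, and the DataFrame is a shortened copy of `feed`:

```python
import pandas as pd

feed = pd.DataFrame({
    'Timestamp': pd.to_datetime(['2021/01/01 09:30:00', '2021/01/01 11:30:00']),
    'Country': ['Ireland', 'Japan'],
})

# Assumed mapping from country to IANA timezone name
zones = {'Ireland': 'Europe/Dublin', 'Japan': 'Asia/Tokyo'}

def to_utc(row):
    # Attach the local timezone, then express the same instant in UTC
    local = row['Timestamp'].tz_localize(zones[row['Country']])
    return local.tz_convert('UTC')

feed['UTC'] = feed.apply(to_utc, axis=1)
print(feed)
```

In January this gives 09:30 UTC for the Ireland row and 02:30 UTC for the Japan row. A row-by-row `apply` is simple to read but slow on large tables, where converting each country's rows as a group would likely be faster.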
