In [1]:
# Import libraries
from datetime import datetime
from datetime import timedelta
from datetime import timezone

from dateutil import tz
from dateutil.zoneinfo import get_zonefile_instance

from pprint import pprint

import pandas as pd

In [2]:
# Read data
bikes_data = pd.read_csv('datasets/capital-onebike.csv') #list of dates
print(bikes_data.shape)
bikes_data.columns = bikes_data.columns.str.lower().str.replace(' ','_')
bikes_data.head()

(290, 8)


Unnamed: 0,start_date,end_date,start_station_number,start_station,end_station_number,end_station,bike_number,member_type
0,2017-10-01 15:23:25,2017-10-01 15:26:26,31038,Glebe Rd & 11th St N,31036,George Mason Dr & Wilson Blvd,W20529,Member
1,2017-10-01 15:42:57,2017-10-01 17:49:59,31036,George Mason Dr & Wilson Blvd,31036,George Mason Dr & Wilson Blvd,W20529,Casual
2,2017-10-02 06:37:10,2017-10-02 06:42:53,31036,George Mason Dr & Wilson Blvd,31037,Ballston Metro / N Stuart & 9th St N,W20529,Member
3,2017-10-02 08:56:45,2017-10-02 09:18:03,31037,Ballston Metro / N Stuart & 9th St N,31295,Potomac & M St NW,W20529,Member
4,2017-10-02 18:23:48,2017-10-02 18:45:05,31295,Potomac & M St NW,31230,Metro Center / 12th & G St NW,W20529,Member


In [3]:
bikes_data[bikes_data.start_date=='2017-11-05 01:56:50']

Unnamed: 0,start_date,end_date,start_station_number,start_station,end_station_number,end_station,bike_number,member_type
129,2017-11-05 01:56:50,2017-11-05 01:01:04,31615,6th & H St NE,31627,3rd & M St NE,W20529,Member


In [4]:
# Define global variables
onebike_datetimes = [{'end'  : pd.to_datetime(row['end_date']).to_pydatetime(),
                      'start': pd.to_datetime(row['start_date']).to_pydatetime()} for i, row in bikes_data.iterrows()]
onebike_datetimes[:2]

[{'end': datetime.datetime(2017, 10, 1, 15, 26, 26),
  'start': datetime.datetime(2017, 10, 1, 15, 23, 25)},
 {'end': datetime.datetime(2017, 10, 1, 17, 49, 59),
  'start': datetime.datetime(2017, 10, 1, 15, 42, 57)}]

# 3. Time Zones and Daylight Saving

In this chapter, you'll learn to confidently tackle the time-related topic that causes people the most trouble: time zones and daylight saving. Continuing with our bike data, you'll learn how to compare clocks around the world, how to gracefully handle "spring forward" and "fall back," and how to get up-to-date timezone data from the dateutil library.

# <font color=darkred>3.1 UTC offsets</font>

1. UTC offsets
>Sometimes, you really need to know exactly when something happened. Up until now, the datetime objects you have worked with are what is called "naive", and can't be compared across different parts of the world. They don't know anything about their time zone.

2. Time zones
>Why does this matter? Before time zones, each town or city set its clock so that noon was the moment when the sun was directly overhead.

3. Time zones
>Another city 100 miles away would also set its clocks to read noon when the sun was overhead there.

4. Time zones
>But this meant that these two cities had clocks that were different, by maybe 15 or 20 minutes. When people moved by foot or horseback, this wasn't a problem.

5. Time zones
>Then railroads, and later telegraphs, came into existence. Now you could move or communicate with someone 100 or even 1000 miles away fast enough that time had to be consistent.

6. Time zones
>Governments solved this problem by declaring that all clocks within a wide area would agree on the hour, even if some were ahead of or behind their solar time. The United States, for example, has 4 major time zones, plus one for Alaska and another for Hawaii. Our bike data was collected in Washington, D.C., which observes Eastern time.

7. UTC
>But since we're not using the sun anymore, how do we know how to set the clock? Because the United Kingdom was the first to standardize its time, everyone in the world sets their clocks relative to the original historical UK standard. This standard time is called UTC. Because all clocks are set relative to UTC, we can compare time around the world. Generally, clocks west of the UK are set earlier than UTC, and clocks east of the UK are set later than UTC. For example, the eastern United States is typically UTC minus 5 hours, while India is typically UTC plus 5 hours 30 minutes.

8. UTC
>Let's see this in code. As before, you import datetime and timedelta. Now you also import timezone. This will let you specify what timezone the clock was in when our data was recorded.

9. UTC
>We create a timezone object, which accepts a timedelta that explains how to translate your datetime into UTC. Since the clock that recorded our bicycle data set was five hours behind UTC, we create ET to be at UTC-5. We can then pass ET as the tzinfo when we create the datetime for the start of the last ride in our data set. Now if you print it, your datetime includes the UTC offset.

10. UTC
>Making a datetime "aware" of its timezone means you can ask Python new questions. For example, suppose you want to know what the date and time would have been if the clock had been set to India Standard Time instead. First, create a new timezone object set to UTC plus 5 hours 30 minutes. Now use the astimezone() method to ask Python to create a new datetime object corresponding to the same moment, but adjusted to a different time zone. In this case, because clocks in India would have been set 10.5 hours ahead of clocks on the eastern US, the last ride would have taken place on December 31, at 1 hour, 39 minutes, and 3 seconds past midnight local time. Same moment, different clock.

11. Adjusting timezone vs changing tzinfo
>Finally, there is an important difference between adjusting timezones and changing the tzinfo directly. You can set the tzinfo directly, using the replace() method. Here we've set the tzinfo to be timezone.utc, a convenient object with zero UTC offset. The clock stays the same, but the UTC offset has shifted. Or, just like before, you can call the astimezone() method. Now if we adjust into UTC with astimezone(timezone.utc), we change both the UTC offset and the clock itself.

12. UTC Offsets
>Now that you have learned about UTC offsets, which allow us to compare times around the world, it's time to practice using them!
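
To make the naive/aware distinction concrete, here is a minimal sketch that uses only the standard library. The helper name `is_aware` is made up for this illustration; the check itself follows the definition of "aware" given in the `datetime` documentation.

```python
from datetime import datetime, timedelta, timezone

def is_aware(dt):
    """Return True if dt carries a usable UTC offset (hypothetical helper)."""
    return dt.tzinfo is not None and dt.utcoffset() is not None

naive = datetime(2017, 12, 30, 15, 9, 3)
aware = datetime(2017, 12, 30, 15, 9, 3, tzinfo=timezone(timedelta(hours=-5)))

print(is_aware(naive))   # False
print(is_aware(aware))   # True

# Subtracting (or ordering) a naive and an aware datetime is an error
try:
    aware - naive
    mixed_ok = True
except TypeError as err:
    mixed_ok = False
    print('Cannot mix naive and aware:', err)
```

Python refuses to guess which time zone the naive value belongs to, which is why the subtraction fails instead of returning a misleading result.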

In [5]:
# US Eastern Standard time zone
ET = timezone(timedelta(hours=-5))

# Timezone-aware datetime
dt = datetime(2017, 12, 30, 15, 9, 3, tzinfo = ET)
print(dt)
dt

2017-12-30 15:09:03-05:00


datetime.datetime(2017, 12, 30, 15, 9, 3, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=68400)))

In [6]:
# India Standard time zone
IST = timezone(timedelta(hours=5, minutes=30))

# Convert to IST
print(dt.astimezone(IST))
dt.astimezone(IST)

2017-12-31 01:39:03+05:30


datetime.datetime(2017, 12, 31, 1, 39, 3, tzinfo=datetime.timezone(datetime.timedelta(seconds=19800)))

In [7]:
# Adjusting timezone vs changing tzinfo
print('Timezone -5h (Original data)   :', dt)
print('Just timezone replaced         :', dt.replace(tzinfo=timezone.utc))

# Change original to match UTC
print('Same moment, different timezone:', dt.astimezone(timezone.utc))

Timezone -5h (Original data)   : 2017-12-30 15:09:03-05:00
Just timezone replaced         : 2017-12-30 15:09:03+00:00
Same moment, different timezone: 2017-12-30 20:09:03+00:00
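
One way to check which of the two operations preserves the moment in time is to compare the results with the original. The sketch below reuses the values from the cell above; the names `relabeled` and `converted` are only illustrative.

```python
from datetime import datetime, timedelta, timezone

ET = timezone(timedelta(hours=-5))
dt = datetime(2017, 12, 30, 15, 9, 3, tzinfo=ET)

relabeled = dt.replace(tzinfo=timezone.utc)   # same clock reading, new offset
converted = dt.astimezone(timezone.utc)       # same moment, new clock reading

# Aware datetimes compare by the moment they represent
print(converted == dt)    # True
print(relabeled == dt)    # False
print(dt - relabeled)     # 5:00:00
```

In other words, `replace()` silently moves the event by five hours, while `astimezone()` only changes how the same event is displayed.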


# <font color=darkred>3.2 Creating timezone aware datetimes</font> 

In this exercise, you will practice setting timezones manually.

**Instructions**

- Import timezone.
- Set the tzinfo to UTC, without using timedelta.


- Set pst to be a timezone set for UTC-8.
- Set dt's timezone to be pst.


- Set tz to be a timezone set for UTC+11.
- Set dt's timezone to be tz.

**Results**

<font color=darkgreen>Great! Did you know that France has the most time zones of any country, with 12 (13 if you count its Antarctic claim)? Russia follows with 11. The French mainland only has one timezone, but because France has so many overseas dependencies they really add up!</font>

In [8]:
# October 1, 2017 at 15:26:26, UTC
dt = datetime(2017, 10, 1, 15, 26, 26, tzinfo=timezone.utc)

# Print results
print(dt.isoformat())

#######################################################################
# Create a timezone for Pacific Standard Time, or UTC-8
pst = timezone(timedelta(hours=-8))

# October 1, 2017 at 15:26:26, UTC-8
dt = datetime(2017, 10, 1, 15, 26, 26, tzinfo=pst)

# Print results
print(dt.isoformat())

#######################################################################
# Create a timezone for Australian Eastern Daylight Time, or UTC+11
aedt = timezone(timedelta(hours=11))

# October 1, 2017 at 15:26:26, UTC+11
dt = datetime(2017, 10, 1, 15, 26, 26, tzinfo=aedt)

# Print results
print(dt.isoformat())

2017-10-01T15:26:26+00:00
2017-10-01T15:26:26-08:00
2017-10-01T15:26:26+11:00
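
Because `isoformat()` writes the UTC offset into the string, the result can be read back without losing the time zone. A small sketch, assuming Python 3.7 or later for `datetime.fromisoformat()`; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

pst = timezone(timedelta(hours=-8))
dt = datetime(2017, 10, 1, 15, 26, 26, tzinfo=pst)

text = dt.isoformat()
parsed = datetime.fromisoformat(text)   # available since Python 3.7

print(text)                  # 2017-10-01T15:26:26-08:00
print(parsed == dt)          # True
print(parsed.utcoffset())    # -1 day, 16:00:00
```

The odd-looking `-1 day, 16:00:00` is simply how `timedelta` displays minus eight hours.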


# <font color=darkred>3.3 Setting timezones</font> 

Now that you have the hang of setting timezones one at a time, let's look at setting them for the first ten trips that W20529 took.

timezone and timedelta have already been imported. Make the change using .replace()

**Instructions**
- Create edt, a timezone object whose UTC offset is -4 hours.
- Within the for loop:
    - Set the tzinfo for trip['start'].
    - Set the tzinfo for trip['end'].

**Results**

<font color=darkgreen>Awesome! Did you know that despite being over 2,500 miles (4,200 km) wide (about as wide as the continental United States or the European Union), China has only one official timezone? There's a second, unofficial timezone, too. It is used by much of the Uyghur population in Xinjiang, in the far west of China.</font>

In [9]:
# Create a timezone object corresponding to UTC-4
edt = timezone(timedelta(hours=-4))

# Before timezone transformation
pprint(onebike_datetimes[:2])
print('\n')

# Loop over trips, updating the start and end datetimes to be in UTC-4
onebike_timezone = [{'start': trip['start'].replace(tzinfo=edt), 
                     'end': trip['end'].replace(tzinfo=edt)} for trip in onebike_datetimes]

# After timezone transformation
pprint(onebike_timezone[:2])

[{'end': datetime.datetime(2017, 10, 1, 15, 26, 26),
  'start': datetime.datetime(2017, 10, 1, 15, 23, 25)},
 {'end': datetime.datetime(2017, 10, 1, 17, 49, 59),
  'start': datetime.datetime(2017, 10, 1, 15, 42, 57)}]


[{'end': datetime.datetime(2017, 10, 1, 15, 26, 26, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000))),
  'start': datetime.datetime(2017, 10, 1, 15, 23, 25, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000)))},
 {'end': datetime.datetime(2017, 10, 1, 17, 49, 59, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000))),
  'start': datetime.datetime(2017, 10, 1, 15, 42, 57, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000)))}]


# <font color=darkred>3.4 What time did the bike leave in UTC?</font> 

Having set the timezone for the first ten rides that W20529 took, let's see what time the bike left in UTC. We've already loaded the results of the previous exercise into memory.

**Instructions**
- Within the for loop, move dt to be in UTC. Use timezone.utc as a convenient shortcut for UTC.

**Results**

<font color=darkgreen>Excellent! Did you know that there is no official time zone at the North or South pole? Since all the lines of longitude meet each other, it's up to each traveler (or research station) to decide what time they want to use.</font>

In [10]:
# Loop over the timezone-aware trips from the previous exercise
for trip in onebike_timezone[:10]:
    # Pull out the start
    dt = trip['start']
    # Move dt to be in UTC
    dt = dt.astimezone(timezone.utc)
  
    # Print the start time in UTC
    print('Original:', trip['start'], '| UTC:', dt.isoformat())

Original: 2017-10-01 15:23:25-04:00 | UTC: 2017-10-01T19:23:25+00:00
Original: 2017-10-01 15:42:57-04:00 | UTC: 2017-10-01T19:42:57+00:00
Original: 2017-10-02 06:37:10-04:00 | UTC: 2017-10-02T10:37:10+00:00
Original: 2017-10-02 08:56:45-04:00 | UTC: 2017-10-02T12:56:45+00:00
Original: 2017-10-02 18:23:48-04:00 | UTC: 2017-10-02T22:23:48+00:00
Original: 2017-10-02 18:48:08-04:00 | UTC: 2017-10-02T22:48:08+00:00
Original: 2017-10-02 19:18:10-04:00 | UTC: 2017-10-02T23:18:10+00:00
Original: 2017-10-02 19:37:32-04:00 | UTC: 2017-10-02T23:37:32+00:00
Original: 2017-10-03 08:24:16-04:00 | UTC: 2017-10-03T12:24:16+00:00
Original: 2017-10-03 18:17:07-04:00 | UTC: 2017-10-03T22:17:07+00:00
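
A caveat about `astimezone()`: when it is called on a naive datetime, Python (3.6 and later) assumes the value is in the system's local time zone, so the result depends on the machine running the code. The sketch below contrasts the two cases with illustrative variable names.

```python
from datetime import datetime, timedelta, timezone

edt = timezone(timedelta(hours=-4))
naive = datetime(2017, 10, 1, 15, 23, 25)
aware = naive.replace(tzinfo=edt)

# Aware input: the result is the same on every machine
print(aware.astimezone(timezone.utc).isoformat())   # 2017-10-01T19:23:25+00:00

# Naive input: Python treats it as system local time, so the output of
# this line varies with the computer's time zone setting
print(naive.astimezone(timezone.utc).isoformat())
```

Attaching the intended time zone before converting keeps the result reproducible.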


# <font color=darkred>3.5 Time zone database</font>

1. Time zone database
>Now that you understand how UTC offsets work, it's time to talk about how you use timezones in practice.

2. Time zone database
>This is a picture of all of the different time zones in the world, as of 2017. They cut across countries, and within countries, and sometimes one is even totally surrounded by another one. How could you possibly know all of these when you need to align your data to UTC? Do you need to look up the offset for each one in some big spreadsheet somewhere? Can't a computer help with this?

3. Time zone database
>Thankfully, yes. There is a database called tz, updated 3-4 times a year as timezone rules change. This database is used by computer programs across many programming languages. Because timezone information changes so quickly, it doesn't make sense to bundle it directly into Python. Instead, you will use a package called dateutil.

4. Time zone database
>Let's start by making a timezone object that corresponds to the eastern United States, where our bicycle data comes from. Within tz, time zones are defined first by the continent they are on, and then by the nearest major city. For example, the time zone used on the eastern seaboard of the United States is 'America/New_York'. We fetch this timezone by calling tz.gettz(), and passing 'America/New_York' as the string.

5. Time zone database
>Here are a few more examples: 'America/Mexico_City'. 'Europe/London'. 'Africa/Accra'.

6. Time zone database
>Let's look at our last ride again. Instead of specifying the UTC offset yourself, you pass the timezone you got from tz. Look at the result, and you can see that it's got the right UTC offset.

7. Time zone database
>Even more excitingly, this same object will adjust the UTC offset depending on the date and time. If we call datetime() with the time of our first ride, and pass in the same timezone info, we see that it gives us a different UTC offset. We will discuss daylight saving time in the next lesson, but suffice it to say that in some places the clocks change twice a year. Instead of having to look up when these changes happen, we let the timezone database keep track of them for us. tz includes rules for UTC offsets going all the way back to the late 1960s, and sometimes earlier. If you have data stretching over a long period of time, and you really care about getting the exact hours and minutes correct, you can use tz to put all of your dates and timestamps onto a common scale.

8. Time zone database
>Now that you have a basic understanding of using the tz module from dateutil, it's time to practice some examples!

In [11]:
# Eastern time
et = tz.gettz('America/New_York')

# Last ride
last = datetime(2017, 12, 30, 15, 9, 3, tzinfo=et)
print(last)

# First ride
first = datetime(2017, 10, 1, 15, 23, 25, tzinfo=et)
print(first)

2017-12-30 15:09:03-05:00
2017-10-01 15:23:25-04:00


In [12]:
# El Salvador
sv = tz.gettz('America/El_Salvador')

# Last ride
last = datetime(2017, 12, 30, 15, 9, 3, tzinfo=sv)
print(last)

2017-12-30 15:09:03-06:00
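
One pitfall worth knowing: `tz.gettz()` returns `None` instead of raising an error when it cannot find a zone, and `datetime()` accepts `tzinfo=None` without complaint, so a misspelled name quietly produces a naive datetime. The sketch below shows the effect and one possible guard; `get_zone` is a hypothetical wrapper, not part of dateutil, and `'Mars/Olympus_Mons'` is a deliberately invalid name.

```python
from datetime import datetime
from dateutil import tz

# A zone name that does not exist in the tz database
missing = tz.gettz('Mars/Olympus_Mons')
print(missing)             # None

# tzinfo=None is accepted and yields a naive datetime
dt = datetime(2017, 12, 30, 15, 9, 3, tzinfo=missing)
print(dt.tzinfo is None)   # True

def get_zone(name):
    """Hypothetical wrapper that fails loudly on unknown zone names."""
    zone = tz.gettz(name)
    if zone is None:
        raise ValueError(f'Unknown time zone: {name!r}')
    return zone

et = get_zone('America/New_York')
print(datetime(2017, 12, 30, 15, 9, 3, tzinfo=et).isoformat())  # 2017-12-30T15:09:03-05:00
```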


In [13]:
# Get all available timezones
print(list(get_zonefile_instance().zones))

['Zulu', 'W-SU', 'Turkey', 'Singapore', 'ROK', 'ROC', 'Portugal', 'Poland', 'PRC', 'Navajo', 'NZ-CHAT', 'NZ', 'Mexico/BajaNorte', 'Mexico/BajaSur', 'Mexico/General', 'Libya', 'Kwajalein', 'Japan', 'Jamaica', 'Israel', 'Iran', 'Iceland', 'Hongkong', 'Greenwich', 'GB-Eire', 'Eire', 'Egypt', 'Cuba', 'Chile/Continental', 'Chile/EasterIsland', 'Canada/Atlantic', 'Canada/Central', 'Canada/Eastern', 'Canada/Mountain', 'Canada/Newfoundland', 'Canada/Pacific', 'Canada/Saskatchewan', 'Canada/Yukon', 'Brazil/Acre', 'Brazil/DeNoronha', 'Brazil/East', 'Brazil/West', 'US/Alaska', 'US/Aleutian', 'US/Arizona', 'US/Central', 'US/East-Indiana', 'US/Eastern', 'US/Hawaii', 'US/Indiana-Starke', 'US/Michigan', 'US/Pacific', 'US/Samoa', 'Arctic/Longyearbyen', 'Factory', 'Etc/GMT+1', 'Etc/GMT+10', 'Etc/GMT+11', 'Etc/GMT+12', 'Etc/GMT+2', 'Etc/GMT+3', 'Etc/GMT+4', 'Etc/GMT+5', 'Etc/GMT+6', 'Etc/GMT+7', 'Etc/GMT+8', 'Etc/GMT+9', 'Etc/GMT-1', 'Etc/GMT-10', 'Etc/GMT-11', 'Etc/GMT-12', 'Etc/GMT-13', 'Etc/GMT-14', 

# <font color=darkred>3.6 Putting the bike trips into the right time zone</font> 

Instead of setting the timezones for W20529 by hand, let's assign them to their IANA timezone: 'America/New_York'. Since we know their political jurisdiction, we don't need to look up their UTC offset. Python will do that for us.

**Instructions**
- Import tz from dateutil.
- Assign et to be the timezone 'America/New_York'.
- Within the for loop, set start and end to have et as their timezone (use .replace()).

**Results**

<font color=darkgreen>Great! Time zone rules actually change quite frequently. IANA time zone data gets updated every 3-4 months, as different jurisdictions make changes to their laws about time or as more historical information about timezones are uncovered. tz is smart enough to use the date in your datetime to determine which rules to use historically.</font>

In [14]:
# Create a timezone object for Eastern Time
et = tz.gettz('America/New_York')

# Loop over trips, updating the datetimes to be in Eastern Time
[{'start': trip['start'].replace(tzinfo=et), 
  'end'  : trip['end'  ].replace(tzinfo=et)} for trip in onebike_datetimes[:10]]

[{'start': datetime.datetime(2017, 10, 1, 15, 23, 25, tzinfo=tzfile('US/Eastern')),
  'end': datetime.datetime(2017, 10, 1, 15, 26, 26, tzinfo=tzfile('US/Eastern'))},
 {'start': datetime.datetime(2017, 10, 1, 15, 42, 57, tzinfo=tzfile('US/Eastern')),
  'end': datetime.datetime(2017, 10, 1, 17, 49, 59, tzinfo=tzfile('US/Eastern'))},
 {'start': datetime.datetime(2017, 10, 2, 6, 37, 10, tzinfo=tzfile('US/Eastern')),
  'end': datetime.datetime(2017, 10, 2, 6, 42, 53, tzinfo=tzfile('US/Eastern'))},
 {'start': datetime.datetime(2017, 10, 2, 8, 56, 45, tzinfo=tzfile('US/Eastern')),
  'end': datetime.datetime(2017, 10, 2, 9, 18, 3, tzinfo=tzfile('US/Eastern'))},
 {'start': datetime.datetime(2017, 10, 2, 18, 23, 48, tzinfo=tzfile('US/Eastern')),
  'end': datetime.datetime(2017, 10, 2, 18, 45, 5, tzinfo=tzfile('US/Eastern'))},
 {'start': datetime.datetime(2017, 10, 2, 18, 48, 8, tzinfo=tzfile('US/Eastern')),
  'end': datetime.datetime(2017, 10, 2, 19, 10, 54, tzinfo=tzfile('US/Eastern'))},
 {'st

# <font color=darkred>3.7 What time did the bike leave? (Global edition)</font> 

When you need to move a datetime from one timezone into another, use .astimezone() and tz. Often you will be moving things into UTC, but for fun let's try moving things from 'America/New_York' into a few different time zones.

**Instructions**
- Set uk to be the timezone for the UK: 'Europe/London'.
- Change local to be in the uk timezone and assign it to notlocal.
- Set ist to be the timezone for India: 'Asia/Kolkata'.
- Change local to be in the ist timezone and assign it to notlocal.
- Set sm to be the timezone for Samoa: 'Pacific/Apia'.
- Change local to be in the sm timezone and assign it to notlocal.

**Results**

<font color=darkgreen>Did you notice the time offset for this one? It's at UTC+14! Samoa used to be UTC-10, but in 2011 it changed to the other side of the International Date Line to better match New Zealand, its closest trading partner. However, they wanted to keep the clocks the same, so the UTC offset shifted from -10 to +14, since 24-10 is 14. Timezones... not simple!</font>

In [15]:
# Create the timezone object
uk = tz.gettz('Europe/London')

# Pull out the start of the first trip
local = onebike_timezone[0]['start']

# What time was it in the UK?
notlocal = local.astimezone(uk)

# Print them out and see the difference
print('America/New_York:', local.isoformat())
print('Europe/London   :', notlocal.isoformat())

America/New_York: 2017-10-01T15:23:25-04:00
Europe/London   : 2017-10-01T20:23:25+01:00


In [16]:
# Create the timezone object
ist = tz.gettz('Asia/Kolkata')

# Pull out the start of the first trip
local = onebike_timezone[0]['start']

# What time was it in India?
notlocal = local.astimezone(ist)

# Print them out and see the difference
print('America/New_York:', local.isoformat())
print('Asia/Kolkata    :', notlocal.isoformat())

America/New_York: 2017-10-01T15:23:25-04:00
Asia/Kolkata    : 2017-10-02T00:53:25+05:30


In [17]:
# Create the timezone object
sm = tz.gettz('Pacific/Apia')

# Pull out the start of the first trip
local = onebike_timezone[0]['start']

# What time was it in Samoa?
notlocal = local.astimezone(sm)

# Print them out and see the difference
print('America/New_York:', local.isoformat())
print('Pacific/Apia    :', notlocal.isoformat())

America/New_York: 2017-10-01T15:23:25-04:00
Pacific/Apia    : 2017-10-02T09:23:25+14:00
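
The 2011 change mentioned above is recorded in the tz database, so it can be inspected directly. This is a sketch that assumes the switch took effect at the end of December 29, 2011 (which is how the tz data records it); the variable names are illustrative.

```python
from datetime import datetime, timedelta
from dateutil import tz

sm = tz.gettz('Pacific/Apia')

# Noon on the last day before the switch and on the first day after it
before = datetime(2011, 12, 29, 12, 0, tzinfo=sm)
after = datetime(2011, 12, 31, 12, 0, tzinfo=sm)

print(before.isoformat())   # 2011-12-29T12:00:00-10:00
print(after.isoformat())    # 2011-12-31T12:00:00+14:00

# The calendar skipped December 30, so only one real day separates them
elapsed = after.astimezone(tz.UTC) - before.astimezone(tz.UTC)
print(elapsed)              # 1 day, 0:00:00
```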


# <font color=darkred>3.8 Starting daylight saving time</font>

1. Starting Daylight Saving Time
>Some places change their clocks twice a year to create longer summer evenings. This practice is called daylight saving time, but it would be better called daylight shifting time. In some countries it is called "summer time". Dealing with daylight saving time can be one of the most fiendish challenges in dealing with dates and times. To keep things simple, let's start with the situation where the clocks move forward in the spring. In the next lesson, we'll discuss handling the opposite case, when the clocks move back in the fall.

2. Start of Daylight Saving Time
>Let's look at an example. On March 12, 2017, in Washington, D.C., the clock jumped straight from 1:59 am to 3 am. The clock "springs forward". It never officially struck 2 am anywhere on the East Coast of the United States that day.

3. Start of Daylight Saving Time
>Just like before, to make our clock in Washington, D.C. comparable to clocks in other places, we need to represent it with a UTC offset. Only now the UTC offset is going to change. On this date, at 1 AM in Washington, D.C., we were in Eastern Standard Time. It was 6 AM UTC, a five-hour difference. At 3 AM in Washington, D.C., we were in Eastern Daylight Time. It was 7 AM UTC, a four-hour difference.

4. Start of Daylight Saving Time
>Let's see the same thing in code. To be as clear as possible, let's create the UTC offsets by hand for now instead of using dateutil.tz. We start by creating a datetime object, spring_ahead_159am, for March 12th, at 1:59:59, without any timezone information. We print the results out with isoformat() to check that we have the time right, and we make another object for spring_ahead_3am. We subtract the two datetime objects and ask how much time has elapsed by calling total_seconds(). As expected, they're an hour and one second apart.

5. Start of Daylight Saving Time
>As before, to fix problems with comparing datetimes we start by creating timezone objects. We define Eastern Standard Time, or EST, using the timezone constructor. We set the offset to -5 hours. Similarly, we define Eastern Daylight Time, or EDT, with an offset of -4 hours.

6. Start of Daylight Saving Time
>We assign our first timestamp, at 1:59 am to be in EST. When we call isoformat(), we see it has the correct offset. We assign our second timestamp, at 3:00 am, to be in EDT, and again check the output with isoformat(). When we subtract the two datetime objects, we see correctly that one second has elapsed. Putting things in terms of UTC once again allowed us to make proper comparisons.

7. Start of Daylight Saving Time
>But how do we know when the cutoff is without looking it up ourselves? dateutil to the rescue again. Just like before when it saved us from having to define timezones by hand, dateutil saves us from having to know daylight saving rules. We create a timezone object by calling tz.gettz() and pass our timezone description string. Recall that since Washington, D.C. is in the America/New_York time zone, that's what we use. Once again we create a datetime corresponding to 1:59 am on the day that the east coast of the US springs forward. This time though, we set the tzinfo to eastern time. Similarly, we create a datetime set to 3 am on March 12th, and when we set tzinfo to be eastern time, dateutil figures out for us that it should be in EDT.

8. Daylight Saving
>In this lesson, we covered "spring ahead". Let's try some examples of working with datetimes that handle a switch into daylight saving time.

In [18]:
# Start of Daylight Saving Time - Without timezone
spring_ahead_159am = datetime(2017, 3, 12, 1, 59, 59)
print(spring_ahead_159am.isoformat())

spring_ahead_3am = datetime(2017, 3, 12, 3, 0, 0)
print(spring_ahead_3am.isoformat())

display((spring_ahead_3am - spring_ahead_159am).total_seconds())
(spring_ahead_3am - spring_ahead_159am).seconds

2017-03-12T01:59:59
2017-03-12T03:00:00


3601.0

3601

In [19]:
# Start of Daylight Saving Time - Manual timezone
EST = timezone(timedelta(hours=-5))
EDT = timezone(timedelta(hours=-4))

spring_ahead_159am = spring_ahead_159am.replace(tzinfo = EST)
print(spring_ahead_159am.isoformat())

spring_ahead_3am = spring_ahead_3am.replace(tzinfo = EDT)
print(spring_ahead_3am.isoformat())

display((spring_ahead_3am - spring_ahead_159am).total_seconds())
(spring_ahead_3am - spring_ahead_159am).seconds

2017-03-12T01:59:59-05:00
2017-03-12T03:00:00-04:00


1.0

1

In [20]:
# Start of Daylight Saving Time - tz 
# Create eastern timezone
eastern = tz.gettz('America/New_York')

spring_ahead_159am = datetime(2017, 3, 12, 1, 59, 59, tzinfo = eastern)
print(spring_ahead_159am.isoformat())

spring_ahead_3am = datetime(2017, 3, 12, 3, 0, 0, tzinfo = eastern)
print(spring_ahead_3am.isoformat())

display((spring_ahead_3am - spring_ahead_159am).total_seconds())
(spring_ahead_3am - spring_ahead_159am).seconds

2017-03-12T01:59:59-05:00
2017-03-12T03:00:00-04:00


3601.0

3601
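
Since the clock never showed 2 am that day, a datetime such as 2:30 am on March 12, 2017 does not correspond to any real moment in Eastern time. dateutil can detect this and shift the value forward. The sketch below assumes dateutil 2.7.0 or later, where `tz.resolve_imaginary()` is available; `skipped` is an illustrative name.

```python
from datetime import datetime
from dateutil import tz

eastern = tz.gettz('America/New_York')

# 2:30 am on March 12, 2017 falls inside the skipped hour
skipped = datetime(2017, 3, 12, 2, 30, tzinfo=eastern)

print(tz.datetime_exists(skipped))                 # False
print(tz.resolve_imaginary(skipped).isoformat())   # 2017-03-12T03:30:00-04:00
```

`resolve_imaginary()` moves a nonexistent wall time forward by the size of the gap and returns existing datetimes unchanged.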

# <font color=darkred>3.9 How many hours elapsed around daylight saving?</font> 

Since our bike data takes place in the fall, you'll have to do something else to learn about the start of daylight saving time.

Let's look at March 12, 2017, in the Eastern United States, when Daylight Saving kicked in at 2 AM.

If you create a datetime for midnight that night, and add 6 hours to it, how much time will have elapsed?

**Instructions**
- You already have a datetime called start, set for March 12, 2017 at midnight, set to the timezone 'America/New_York'.
- Add six hours to start and assign it to end. Look at the UTC offset for the two results.
- You added 6 hours, and got 6 AM, despite the fact that the clocks springing forward means only 5 hours would have actually elapsed!
- Calculate the time between start and end. How much time does Python think has elapsed?
- Move your datetime objects into UTC and calculate the elapsed time again.
- Once you're in UTC, what result do you get?

**Results**

<font color=darkgreen>When we compare times in local time zones, everything gets converted into clock time. Remember if you want to get absolute time differences, always move to UTC!</font>

In [21]:
# Start on March 12, 2017, midnight, then add 6 hours
start = datetime(2017, 3, 12, tzinfo = tz.gettz('America/New_York'))
end = start + timedelta(hours=6)
print(start.isoformat() + " to " + end.isoformat())

# How many hours have elapsed?
print((end - start).total_seconds()/(60*60))

# What if we move to UTC?
print((end.astimezone(timezone.utc) - start.astimezone(timezone.utc)).total_seconds()/(60*60))

2017-03-12T00:00:00-05:00 to 2017-03-12T06:00:00-04:00
6.0
5.0
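
The same idea applies to addition: adding a timedelta to an aware datetime moves the wall clock, not the elapsed time. If six real hours are what you want, one option is to add in UTC and convert back afterwards. A minimal sketch with illustrative names:

```python
from datetime import datetime, timedelta, timezone
from dateutil import tz

eastern = tz.gettz('America/New_York')
start = datetime(2017, 3, 12, tzinfo=eastern)

# Wall-clock addition: the clock reading moves forward six hours
wall_end = start + timedelta(hours=6)

# Absolute addition: do the arithmetic in UTC, then convert back
abs_end = (start.astimezone(timezone.utc) + timedelta(hours=6)).astimezone(eastern)

print(wall_end.isoformat())   # 2017-03-12T06:00:00-04:00
print(abs_end.isoformat())    # 2017-03-12T07:00:00-04:00
```

Six real hours after midnight, the clocks in Washington, D.C. read 7 AM, because the hour between 2 AM and 3 AM never happened.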


In [22]:
# Testing one more time
awkward_start = datetime(2017, 3, 12, 1, 59, 59, tzinfo = tz.gettz('America/New_York'))
print('Original:', awkward_start, '| UTC:', awkward_start.astimezone(timezone.utc))

awkward_end = datetime(2017, 3, 12, 3, 0, 0, tzinfo = tz.gettz('America/New_York'))
print('Original:', awkward_end, '| UTC:', awkward_end.astimezone(timezone.utc))

# How many seconds have elapsed?
print((awkward_end - awkward_start).total_seconds())

# What if we move to UTC?
print((awkward_end.astimezone(timezone.utc) - awkward_start.astimezone(timezone.utc)).total_seconds())

Original: 2017-03-12 01:59:59-05:00 | UTC: 2017-03-12 06:59:59+00:00
Original: 2017-03-12 03:00:00-04:00 | UTC: 2017-03-12 07:00:00+00:00
3601.0
1.0


# <font color=darkred>3.10 March 29, throughout a decade</font> 

Daylight Saving rules are complicated: they're different in different places, they change over time, and they usually start on a Sunday (and so they move around the calendar).

For example, in the United Kingdom, as of the time this lesson was written, Daylight Saving begins on the last Sunday in March. Let's look at the UTC offset for March 29, at midnight, for the years 2000 to 2010.

**Instructions**
- Using tz, set the timezone for dt to be 'Europe/London'.
- Within the for loop:
    - Use the .replace() method to change the year for dt to be y.
    - Call .isoformat() on the result to observe the results.

**Results**

<font color=darkgreen>Nice! As you can see, the rules for Daylight Saving are not trivial. When in doubt, always use tz instead of hand-rolling timezones, so it will catch the Daylight Saving rules (and rule changes!) for you.</font>

In [23]:
# Create starting date
dt = datetime(2000, 3, 29, tzinfo = tz.gettz('Europe/London'))

# Loop over the dates, replacing the year, and print the ISO timestamp
for y in range(2000, 2011):
    print(dt.replace(year=y).isoformat())

2000-03-29T00:00:00+01:00
2001-03-29T00:00:00+01:00
2002-03-29T00:00:00+00:00
2003-03-29T00:00:00+00:00
2004-03-29T00:00:00+01:00
2005-03-29T00:00:00+01:00
2006-03-29T00:00:00+01:00
2007-03-29T00:00:00+01:00
2008-03-29T00:00:00+00:00
2009-03-29T00:00:00+00:00
2010-03-29T00:00:00+01:00
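
To see why the offset alternates, it helps to compute the last Sunday in March for each year. The sketch below uses only the standard library; `last_sunday_of_march` is a helper written for this illustration, and it encodes the "last Sunday in March" rule by hand rather than reading it from the tz database.

```python
from datetime import date, timedelta

def last_sunday_of_march(year):
    """Return the date of the last Sunday in March (illustrative helper)."""
    d = date(year, 3, 31)
    # weekday(): Monday is 0 and Sunday is 6, so step back to the nearest Sunday
    return d - timedelta(days=(d.weekday() + 1) % 7)

for year in range(2000, 2011):
    print(year, last_sunday_of_march(year).isoformat())
```

In 2002, 2003, and 2008 the last Sunday falls after March 29, so midnight on March 29 is still at +00:00. In 2009 it falls on March 29 itself, but the clocks only change at 01:00 UTC, so midnight that day is also still at +00:00.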


# <font color=darkred>3.11 Ending daylight saving time</font>

1. Ending Daylight Saving Time
>In the previous lesson, we discussed how to handle when the clock "springs ahead" and we enter daylight saving. In the fall, when the clocks are reset back to standard time, an interesting wrinkle occurs. In this lesson, we'll finish our discussion of daylight saving time by showing what happens when we "fall back", and also talk about how to unambiguously handle events which bridge a daylight saving jump.

2. Ending Daylight Saving Time
>Let's look back at our example in Washington, D.C., on the day that daylight saving time ended. On November 5th, 2017, at 2 AM the clocks jumped back an hour. That means there were two 1 AMs! We've represented this by "folding" over our timeline to show the repeat.

3. Ending Daylight Saving Time
>As before, in order to make sense of this situation, we need to map everything back to UTC. The first 1 AM maps to 5 AM UTC. This is the minus 4 hour UTC offset for Eastern Daylight Time we discussed in the previous lesson. At 1:59:59 local time, we're at 5:59:59 UTC. The next moment, our local clock jumps back, but since time has not actually gone backward, the clock continues to tick in UTC. We switch to a UTC offset of minus 5 hours, and the second 1 AM corresponds to 6 AM UTC.

4. Ending Daylight Saving Time
>First, let's make a tzinfo object corresponding to our bike data's timezone. We make a datetime for November 5th and 1 am. Let's check and see if this time is ambiguous, meaning we need to tell it apart somehow. We call tz.datetime_ambiguous(), and see that, yes, this is a time which could occur at two different UTC moments in this timezone. Now we create a second datetime, with the same date and time. This time, we call tz.enfold(), which takes the argument of the datetime we want to mark. enfold says, this datetime belongs to the *second* time the wall clock struck 1 AM this day, and not the first.

5. Ending Daylight Saving Time
>The thing is, enfold by itself doesn't change any of the behavior of a datetime. You can see here that Python doesn't take it into account when doing datetime arithmetic. Fold is just a placeholder, and it's up to further parts of the program to pay attention to fold and do something with it. What are we going to do?! We need to convert to UTC, which is unambiguous. When we really want to make sure that everything is accounted for, putting everything into UTC is the way to do it. Now when we ask Python to take the difference, we see that it correctly tells us these two timestamps are an hour apart. In general, whenever we really want to be sure of the duration between events that might cross a daylight saving boundary, we need to do our math in UTC.

6. Ending Daylight Saving Time
>We've covered how to handle springing forward and falling back, both with hand-coded UTC offsets and with dateutil. Python often tries to be helpful by glossing over daylight saving time differences, and oftentimes that's what you want. However, when you do care about them, use dateutil to set the timezone information correctly and then switch into UTC for the most accurate comparisons between events.

In [24]:
# Ending Daylight Saving Time - First 1 a.m.
eastern = tz.gettz('US/Eastern')

first_1am = datetime(2017, 11, 5, 1, 0, 0, tzinfo = eastern)
print(first_1am)

print(tz.datetime_ambiguous(first_1am))

# Ending Daylight Saving Time - Second 1 a.m.
second_1am = datetime(2017, 11, 5, 1, 0, 0, tzinfo = eastern)
second_1am = tz.enfold(second_1am)
print(second_1am)

print('Difference in local zone:', (second_1am - first_1am).total_seconds(), 'sec.')

first_1am = first_1am.astimezone(tz.UTC)
second_1am = second_1am.astimezone(tz.UTC)
print('Difference in UTC zone  :', (second_1am - first_1am).total_seconds(), 'sec.')

2017-11-05 01:00:00-04:00
True
2017-11-05 01:00:00-05:00
Difference in local zone: 0.0 sec.
Difference in UTC zone  : 3600.0 sec.


# <font color=darkred>3.12 Finding ambiguous datetimes</font> 

At the end of lesson 2, we saw something anomalous in our bike trip duration data. Let's see if we can identify what the problem might be.

The data is loaded as onebike_datetimes, and tz has already been imported from dateutil.

**Instructions**
- Loop over the trips in onebike_datetimes:
    - Print any rides whose start is ambiguous.
    - Print any rides whose end is ambiguous.

**Results**

<font color=darkgreen>Good work! Avoid ambiguous datetimes in practice by storing datetimes in UTC.</font>

In [25]:
ny = tz.gettz('America/New_York')
# Setting the right time zone to data
onebike_timezone = [{'start': trip['start'].replace(tzinfo=ny), 
                     'end'  : trip['end'  ].replace(tzinfo=ny)} for trip in onebike_datetimes]

# Loop over trips
for trip in onebike_timezone:
    # Rides with ambiguous start
    if tz.datetime_ambiguous(trip['start']):
        print("Ambiguous start at " + str(trip['start']))
    # Rides with ambiguous end
    if tz.datetime_ambiguous(trip['end']):
        print("Ambiguous end at " + str(trip['end']))

Ambiguous start at 2017-11-05 01:56:50-04:00
Ambiguous end at 2017-11-05 01:01:04-04:00


In [26]:
# Loop again to get more information
for trip in onebike_timezone:
    ambiguous_start = tz.datetime_ambiguous(trip['start'])
    ambiguous_end   = tz.datetime_ambiguous(trip['end'])

    # Rides with an ambiguous start or end
    if ambiguous_start or ambiguous_end:
        print("Start at " + str(trip['start']) + (' (Ambiguous)' if ambiguous_start else ''))
        print("End   at " + str(trip['end'])   + (' (Ambiguous)' if ambiguous_end   else ''))
        print('Elapsed time {} sec.'.format((trip['end'] - trip['start']).total_seconds()))

Start at 2017-11-05 01:56:50-04:00 (Ambiguous)
End   at 2017-11-05 01:01:04-04:00 (Ambiguous)
Elapsed time -3346.0 sec.


# <font color=darkred>3.13 Cleaning daylight saving data with fold</font> 

As we've just discovered, there is a ride in our data set which is being messed up by a daylight saving time shift. Let's clean up the data set so we actually have a correct minimum ride length. We know the end of a ride must come after its start, and we can use that fact to repair the duration distorted by the shift out of daylight saving time.

Since Python ignores the fold set by tz.enfold() when doing arithmetic between datetimes in the same timezone, we must put our datetime objects into UTC, where ambiguities have been resolved.

onebike_datetimes is already loaded and in the right timezone. tz and timezone have been imported. Use tz.UTC for the timezone.

**Instructions**
- Complete the if statement to be true only when a ride's start comes after its end.
- When start is after end, call tz.enfold() on the end so you know it refers to the one after the daylight saving time change.
- After the if statement, convert the start and end to UTC so you can make a proper comparison.

**Results**

<font color=darkgreen>Good work! Now you know how to handle some pretty gnarly edge cases in datetime data. To give a sense for how tricky these things are: we actually still don't know how long the rides are which only started or ended in our ambiguous hour but not both. If you're collecting data, store it in UTC or with a fixed UTC offset!</font>
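
To illustrate the remaining uncertainty, consider a hypothetical ride (not from the data set) that starts just before the ambiguous hour and ends inside it. The sketch below uses hand-coded fixed offsets, with illustrative names `EDT` and `EST`, to show the two durations that are both consistent with the recorded wall-clock times.

```python
from datetime import datetime, timedelta, timezone

# Hand-coded fixed offsets (illustrative names, defined only for this sketch)
EDT = timezone(timedelta(hours=-4), 'EDT')
EST = timezone(timedelta(hours=-5), 'EST')

# Hypothetical ride: starts at 00:50 (unambiguous, still EDT)
# and ends at wall-clock 01:10, which occurred twice that night
start = datetime(2017, 11, 5, 0, 50, 0, tzinfo=EDT)
end_if_first = datetime(2017, 11, 5, 1, 10, 0, tzinfo=EDT)   # first 1:10 AM
end_if_second = datetime(2017, 11, 5, 1, 10, 0, tzinfo=EST)  # second 1:10 AM

shortest = (end_if_first - start).total_seconds()
longest = (end_if_second - start).total_seconds()
print(shortest, longest)  # 1200.0 4800.0
```

Both ends come after the start, so the "start is later than end" check cannot tell them apart. Had the ride been recorded with its UTC offset, only one of the two candidates would remain.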

In [27]:
trip_durations = []
for trip in onebike_timezone:
    # When the start is later than the end, set the fold to be 1
    if trip['start'] > trip['end']:
        trip['end'] = tz.enfold(trip['end'])
        
    # Convert to UTC
    start = trip['start'].astimezone(tz.UTC)
    end = trip['end'].astimezone(tz.UTC)

    # Compute the trip length in seconds
    trip_length_seconds = (end-start).total_seconds()
    trip_durations.append(trip_length_seconds)

# Take the shortest trip duration
print("Shortest trip: " + str(min(trip_durations)))

Shortest trip: 116.0


In [28]:
# Shorter code
trip_durations = [((tz.enfold(trip['end']) if trip['start'] > trip['end'] else trip['end']).astimezone(tz.UTC) -
                    trip['start'].astimezone(tz.UTC)).total_seconds()
                  for trip in onebike_timezone]

# Take the shortest trip duration
print("Shortest trip: " + str(min(trip_durations)))

Shortest trip: 116.0


# Additional material

- DataCamp course: https://learn.datacamp.com/courses/working-with-dates-and-times-in-python