# Putting the bike trips into the right time zone
Instead of setting the timezones for W20529 by hand, let's assign them to their IANA timezone: 'America/New_York'. Since we know their political jurisdiction, we don't need to look up their UTC offset. Python will do that for us.


* Import tz from dateutil.
* Assign et to be the timezone 'America/New_York'.
* Within the for loop, set start and end to have et as their timezone (use .replace()).


In [1]:
from datetime import datetime

# Sample OneBike data with naive datetimes
onebike_datetimes = [
    {
        'start': datetime(2017, 10, 1, 15, 23),
        'end': datetime(2017, 10, 1, 15, 45)
    },
    {
        'start': datetime(2017, 10, 1, 15, 56),
        'end': datetime(2017, 10, 1, 16, 10)
    },
    {
        'start': datetime(2017, 10, 2, 6, 59),
        'end': datetime(2017, 10, 2, 7, 10)
    },
    {
        'start': datetime(2017, 10, 2, 7, 12),
        'end': datetime(2017, 10, 2, 7, 45)
    },
    {
        'start': datetime(2017, 10, 3, 14, 2),
        'end': datetime(2017, 10, 3, 14, 55)
    }
]


In [2]:
# Import tz
from dateutil import tz

# Create a timezone object for Eastern Time
et = tz.gettz('America/New_York')

# Loop over trips, updating the datetimes to be in Eastern Time
for trip in onebike_datetimes[:10]:
  # Update trip['start'] and trip['end']
  trip['start'] = trip['start'].replace(tzinfo=et)
  trip['end'] = trip['end'].replace(tzinfo=et)

Great! Time zone rules actually change quite frequently. IANA time zone data gets updated every 3-4 months, as different jurisdictions change their laws about time or as more historical information about timezones is uncovered. tz is smart enough to use the date in your datetime to determine which rules to use historically.
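
As a rough illustration of these date-dependent rules, the sketch below compares the UTC offset for the same calendar day in 2006 and 2007. Starting in 2007, the United States moved the start of Daylight Saving from the first Sunday in April to the second Sunday in March, so March 20 falls on different sides of the change. The dates are chosen for illustration and are not part of the bike data.

```python
from datetime import datetime, timedelta
from dateutil import tz

et = tz.gettz('America/New_York')

# Same wall-clock time, one year apart (illustrative dates, not from the dataset)
before_change = datetime(2006, 3, 20, 12, 0, tzinfo=et)
after_change = datetime(2007, 3, 20, 12, 0, tzinfo=et)

# tz looks up the rules that were in force on each date
print(before_change.isoformat())  # 2006-03-20T12:00:00-05:00
print(after_change.isoformat())   # 2007-03-20T12:00:00-04:00
```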

# What time did the bike leave? (Global edition)
When you need to move a datetime from one timezone into another, use .astimezone() and tz. Often you will be moving things into UTC, but for fun let's try moving things from 'America/New_York' into a few different time zones.
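
The difference between .replace(tzinfo=...) and .astimezone() is easy to mix up, so here is a minimal sketch of both applied to the same datetime. The variable names converted and relabeled are made up for this illustration.

```python
from datetime import datetime
from dateutil import tz

et = tz.gettz('America/New_York')
uk = tz.gettz('Europe/London')

local = datetime(2017, 10, 1, 15, 23, tzinfo=et)

# .astimezone() keeps the instant and changes the clock reading
converted = local.astimezone(uk)
# .replace() keeps the clock reading and changes the instant
relabeled = local.replace(tzinfo=uk)

print(converted.isoformat())  # 2017-10-01T20:23:00+01:00
print(relabeled.isoformat())  # 2017-10-01T15:23:00+01:00
print(converted == local)     # True
print(relabeled == local)     # False
```

Use .replace() only to attach a timezone to a naive datetime whose zone you already know; use .astimezone() for everything else.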



* Set uk to be the timezone for the UK: 'Europe/London'.
* Change local to be in the uk timezone and assign it to notlocal.
* Set ist to be the timezone for India: 'Asia/Kolkata'.
* Change local to be in the ist timezone and assign it to notlocal.
* Set sm to be the timezone for Samoa: 'Pacific/Apia'
* Change local to be in the sm timezone and assign it to notlocal.


In [3]:
# Create the timezone object
uk = tz.gettz('Europe/London')

# Pull out the start of the first trip
local = onebike_datetimes[0]['start']

# What time was it in the UK?
notlocal = local.astimezone(uk)

# Print them out and see the difference
print(local.isoformat())
print(notlocal.isoformat())

2017-10-01T15:23:00-04:00
2017-10-01T20:23:00+01:00


In [4]:
# Create the timezone object
ist = tz.gettz('Asia/Kolkata')

# Pull out the start of the first trip
local = onebike_datetimes[0]['start']

# What time was it in India?
notlocal = local.astimezone(ist)

# Print them out and see the difference
print(local.isoformat())
print(notlocal.isoformat())

2017-10-01T15:23:00-04:00
2017-10-02T00:53:00+05:30


In [5]:
# Create the timezone object
sm = tz.gettz('Pacific/Apia')

# Pull out the start of the first trip
local = onebike_datetimes[0]['start']

# What time was it in Samoa?
notlocal = local.astimezone(sm)

# Print them out and see the difference
print(local.isoformat())
print(notlocal.isoformat())

2017-10-01T15:23:00-04:00
2017-10-02T09:23:00+14:00


Did you notice the time offset for this one? It's at UTC+14! Samoa used to be UTC-10, but in 2011 it changed to the other side of the International Date Line to better match New Zealand, its closest trading partner. However, they wanted to keep the clocks the same, so the UTC offset shifted from -10 to +14, since 24-10 is 14. Timezones... not simple!
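
To see the jump itself, the sketch below looks at noon on the two days either side of the switch. Samoa skipped December 30, 2011 entirely; the specific times are chosen here for illustration.

```python
from datetime import datetime, timedelta
from dateutil import tz

sm = tz.gettz('Pacific/Apia')

# Noon on the days either side of the skipped date (December 30, 2011)
before = datetime(2011, 12, 29, 12, 0, tzinfo=sm)
after = datetime(2011, 12, 31, 12, 0, tzinfo=sm)

print(before.isoformat())  # 2011-12-29T12:00:00-10:00
print(after.isoformat())   # 2011-12-31T12:00:00+14:00

# The calendar shows two days between them, but only one day really elapsed
elapsed = after.astimezone(tz.UTC) - before.astimezone(tz.UTC)
print(elapsed)  # 1 day, 0:00:00
```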

# How many hours elapsed around daylight saving?
Since our bike data takes place in the fall, you'll have to look elsewhere to learn about the start of Daylight Saving Time.

Let's look at March 12, 2017, in the Eastern United States, when Daylight Saving kicked in at 2 AM.

If you create a datetime for midnight that night, and add 6 hours to it, how much time will have elapsed?


* You already have a datetime called start, set for March 12, 2017 at midnight, set to the timezone 'America/New_York'.

* Add six hours to start and assign it to end. Look at the UTC offset for the two results.

In [6]:
# Import datetime, timedelta, tz, timezone
from datetime import datetime, timedelta, timezone
from dateutil import tz

# Start on March 12, 2017, midnight, then add 6 hours
start = datetime(2017, 3, 12, tzinfo = tz.gettz('America/New_York'))
end = start + timedelta(hours = 6)
print(start.isoformat() + " to " + end.isoformat())

2017-03-12T00:00:00-05:00 to 2017-03-12T06:00:00-04:00


You added 6 hours, and got 6 AM, despite the fact that the clocks springing forward means only 5 hours would have actually elapsed!

Calculate the time between start and end. How much time does Python think has elapsed?

In [7]:
# Import datetime, timedelta, tz, timezone
from datetime import datetime, timedelta, timezone
from dateutil import tz

# Start on March 12, 2017, midnight, then add 6 hours
start = datetime(2017, 3, 12, tzinfo = tz.gettz('America/New_York'))
end = start + timedelta(hours=6)
print(start.isoformat() + " to " + end.isoformat())

# How many hours have elapsed?
print((end - start).total_seconds()/(60*60))

2017-03-12T00:00:00-05:00 to 2017-03-12T06:00:00-04:00
6.0


Move your datetime objects into UTC and calculate the elapsed time again.

Once you're in UTC, what result do you get?

In [8]:
# Import datetime, timedelta, tz, timezone
from datetime import datetime, timedelta, timezone
from dateutil import tz

# Define Eastern Time zone
et = tz.gettz('America/New_York')

# Start on March 12, 2017, midnight in ET, then add 6 hours
start = datetime(2017, 3, 12, tzinfo=et)
end = start + timedelta(hours=6)
print(start.isoformat() + " to " + end.isoformat())

# How many hours have elapsed in local ET time?
print((end - start).total_seconds() / (60 * 60))

# Move both to UTC using astimezone (this correctly adjusts the clock time)
start_utc = start.astimezone(timezone.utc)
end_utc = end.astimezone(timezone.utc)

# How many hours elapsed in UTC?
print((end_utc - start_utc).total_seconds() / (60 * 60))


2017-03-12T00:00:00-05:00 to 2017-03-12T06:00:00-04:00
6.0
5.0
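
The same idea works in the other direction. If you want end to be six real hours after start, one option is to do the addition in UTC and convert back afterwards. This is a sketch of that approach, not the only way to do it.

```python
from datetime import datetime, timedelta, timezone
from dateutil import tz

et = tz.gettz('America/New_York')
start = datetime(2017, 3, 12, tzinfo=et)

# Do the arithmetic in UTC, then convert back to Eastern Time
end = (start.astimezone(timezone.utc) + timedelta(hours=6)).astimezone(et)

print(end.isoformat())  # 2017-03-12T07:00:00-04:00
```

Six elapsed hours after midnight lands on 7 AM, because the clocks skipped from 2 AM to 3 AM along the way.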


In [9]:
abc = (x for x in range(3))

print(list(abc))
print(list(abc))
print(list(abc))
print(type(abc))


[0, 1, 2]
[]
[]
<class 'generator'>


In [10]:
abc = [x for x in range(3)]

print(list(abc))
print(list(abc))
print(list(abc))
print(type(abc))


[0, 1, 2]
[0, 1, 2]
[0, 1, 2]
<class 'list'>


# March 29, throughout a decade
Daylight Saving rules are complicated: they're different in different places, they change over time, and they usually start on a Sunday (and so they move around the calendar).

For example, in the United Kingdom, as of the time this lesson was written, Daylight Saving begins on the last Sunday in March. Let's look at the UTC offset for March 29, at midnight, for the years 2000 to 2010.


* Using tz, set the timezone for dt to be 'Europe/London'.
* Within the for loop:
    * Use the .replace() method to change the year for dt to be y.
    * Call .isoformat() on the result to observe the results.

In [11]:
# Import datetime and tz
from datetime import datetime
from dateutil import tz

# Create starting date
dt = datetime(2000, 3, 29, tzinfo = tz.gettz('Europe/London'))

# Loop over the dates, replacing the year, and print the ISO timestamp
for y in range(2000, 2011):
  print(dt.replace(year=y).isoformat())

2000-03-29T00:00:00+01:00
2001-03-29T00:00:00+01:00
2002-03-29T00:00:00+00:00
2003-03-29T00:00:00+00:00
2004-03-29T00:00:00+01:00
2005-03-29T00:00:00+01:00
2006-03-29T00:00:00+01:00
2007-03-29T00:00:00+01:00
2008-03-29T00:00:00+00:00
2009-03-29T00:00:00+00:00
2010-03-29T00:00:00+01:00
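
The pattern in the offsets follows from where the last Sunday in March falls each year. The sketch below computes that date with the standard library only; last_sunday_of_march is a helper name invented for this example. It relies on the UK switching at 1 AM UTC, so midnight on the switch day itself is still on GMT.

```python
import calendar
from datetime import date

def last_sunday_of_march(year):
    """Return the date of the last Sunday in March of the given year."""
    # monthcalendar() returns weeks as lists of day numbers (Monday first), 0 for padding
    weeks = calendar.monthcalendar(year, 3)
    sundays = [week[calendar.SUNDAY] for week in weeks if week[calendar.SUNDAY] != 0]
    return date(year, 3, sundays[-1])

for y in range(2000, 2011):
    switch = last_sunday_of_march(y)
    # Midnight on March 29 is on summer time only if the switch happened on an earlier day
    status = "BST (+01:00)" if switch.day < 29 else "GMT (+00:00)"
    print(y, switch.isoformat(), status)
```

In 2002, 2003, 2008, and 2009 the last Sunday falls on or after March 29, which matches the years printed with +00:00 above.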


# Finding ambiguous datetimes
At the end of lesson 2, we saw something anomalous in our bike trip duration data. Let's see if we can identify what the problem might be.

The data is loaded as onebike_datetimes, and tz has already been imported from dateutil.


* Loop over the trips in onebike_datetimes:
    * Print any rides whose start is ambiguous.
    * Print any rides whose end is ambiguous.

In [13]:
# Loop over trips
for trip in onebike_datetimes:
  # Rides with ambiguous start
  if tz.datetime_ambiguous(trip['start']):
    print("Ambiguous start at " + str(trip['start']))
  # Rides with ambiguous end
  if tz.datetime_ambiguous(trip['end']):
    print("Ambiguous end at " + str(trip['end']))
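
For a sense of what an ambiguous datetime looks like, here is a small sketch using a time that occurred twice in Eastern Time: 1:30 AM on November 5, 2017, when the clocks fell back. The time is a made-up example rather than a value from the bike data.

```python
from datetime import datetime, timedelta
from dateutil import tz

et = tz.gettz('America/New_York')

# 1:30 AM on November 5, 2017 happened twice in Eastern Time
first = datetime(2017, 11, 5, 1, 30, tzinfo=et)
second = tz.enfold(first)  # same wall-clock time, second occurrence (fold=1)

print(tz.datetime_ambiguous(first))  # True
print(first.isoformat())             # 2017-11-05T01:30:00-04:00
print(second.isoformat())            # 2017-11-05T01:30:00-05:00
```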

# Cleaning daylight saving data with fold
As we've just discovered, there is a ride in our data set which is being messed up by a Daylight Saving shift. Let's clean up the data set so we actually have a correct minimum ride length. We know the end of the ride happened after the beginning, and we can use that fact to fix the duration distorted by the shift out of Daylight Saving.

Since Python ignores the fold set by tz.enfold() when doing arithmetic between datetimes in the same timezone, we must put our datetime objects into UTC, where ambiguities have been resolved.

onebike_datetimes is already loaded and in the right timezone. tz and timezone have been imported. Use tz.UTC for the timezone.


* Complete the if statement to be true only when a ride's start comes after its end.
* When start is after end, call tz.enfold() on the end so you know it refers to the one after the Daylight Saving time change.
* After the if statement, convert the start and end to UTC so you can make a proper comparison.

In [14]:
trip_durations = []
for trip in onebike_datetimes:
  # When the start is later than the end, set the fold to be 1
  if trip['start'] > trip['end']:
    trip['end'] = tz.enfold(trip['end'])
  # Convert to UTC
  start = trip['start'].astimezone(tz.UTC)
  end = trip['end'].astimezone(tz.UTC)

  # Subtract the difference
  trip_length_seconds = (end-start).total_seconds()
  trip_durations.append(trip_length_seconds)

# Take the shortest trip duration
print("Shortest trip: " + str(min(trip_durations)))

Shortest trip: 660.0
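
To see why the conversion to UTC matters, the sketch below uses a hypothetical ride that starts just before the clocks fall back on November 5, 2017 and ends just after. The start and end times are invented for this example.

```python
from datetime import datetime
from dateutil import tz

et = tz.gettz('America/New_York')

# Hypothetical ride spanning the end of Daylight Saving
start = datetime(2017, 11, 5, 1, 56, tzinfo=et)
end = tz.enfold(datetime(2017, 11, 5, 1, 1, tzinfo=et))

# Same-timezone subtraction compares wall clocks and ignores fold
local_seconds = (end - start).total_seconds()
print(local_seconds)  # -3300.0

# In UTC the two instants are unambiguous
utc_seconds = (end.astimezone(tz.UTC) - start.astimezone(tz.UTC)).total_seconds()
print(utc_seconds)  # 300.0
```

The ride really lasted five minutes, even though the local clock readings suggest it ended 55 minutes before it began.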
