# Creating datetimes by hand
Often you create datetime objects based on outside data. Sometimes, though, you want to create a datetime object from scratch.

You're going to create a few different datetime objects from scratch to get the hang of that process. These come from the bikeshare data set that you'll use throughout the rest of the chapter.

In [1]:
# Import datetime
from datetime import datetime

# Create a datetime object
dt = datetime(2017, 10, 1, 15, 26, 26)

# Print the results in ISO 8601 format
print(dt.isoformat())

2017-10-01T15:26:26
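
The constructor also accepts keyword arguments and checks that the values form a real date. The following is a minimal sketch; the names dt_kw, dt_midnight and error_message are illustrative and not part of the exercise.

```python
from datetime import datetime

# Keyword arguments make the meaning of each field explicit
dt_kw = datetime(year=2017, month=10, day=1, hour=15, minute=26, second=26)

# Time fields that are left out default to zero (midnight)
dt_midnight = datetime(2017, 10, 1)

# Impossible dates raise ValueError instead of rolling over
try:
    datetime(2017, 2, 30)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(dt_kw.isoformat())
print(dt_midnight.isoformat())
print(error_message)
```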


In [2]:
# Import datetime
from datetime import datetime

# Create a datetime object
dt = datetime(2017, 12, 31, 15, 19, 13)

# Print the results in ISO 8601 format
print(dt.isoformat())

2017-12-31T15:19:13


In [3]:
# Import datetime
from datetime import datetime

# Create a datetime object
dt = datetime(2017, 12, 31, 15, 19, 13)

# Replace the year with 1917
dt_old = dt.replace(year=1917)

# Print the result; print() uses str(), which puts a space (not "T") between date and time
print(dt_old)

1917-12-31 15:19:13
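
Because datetime objects are immutable, .replace() returns a new object and leaves the original untouched. A short sketch, assuming the same dt as above (dt_changed is an illustrative name), that changes several fields in one call:

```python
from datetime import datetime

dt = datetime(2017, 12, 31, 15, 19, 13)

# .replace() builds a new datetime; any subset of fields can be overridden
dt_changed = dt.replace(year=1917, hour=0, minute=0, second=0)

print(dt)          # 2017-12-31 15:19:13 (unchanged)
print(dt_changed)  # 1917-12-31 00:00:00
```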


# Counting events before and after noon
In this chapter, you will be working with a list of all bike trips for one Capital Bikeshare bike, W20529, from October 1, 2017 to December 31, 2017. This list has been loaded as onebike_datetimes.

Each element of the list is a dictionary with two entries: start is a datetime object corresponding to the start of a trip (when a bike is removed from the dock) and end is a datetime object corresponding to the end of a trip (when a bike is put back into a dock).

You can use this data set to understand better how this bike was used. Did more trips start before noon or after noon?

In [4]:
import pandas as pd

captial_onebike = pd.read_csv('../datasets/capital-onebike.csv')

fmt = "%Y-%m-%d %H:%M:%S"

onebike_datetime_strings = list(zip(captial_onebike['Start date'], captial_onebike['End date']))

onebike_datetimes = []

# Loop over all trips
for (start, end) in onebike_datetime_strings:
    trip = {'start': datetime.strptime(start, fmt),
            'end': datetime.strptime(end, fmt)}

    # Append the trip
    onebike_datetimes.append(trip)
    
# Create dictionary to hold results
trip_counts = {'AM': 0, 'PM': 0}
  
# Loop over all trips
for trip in onebike_datetimes:
    # Check to see if the trip starts before noon
    if trip['start'].hour < 12:
        # Increment the counter for before noon
        trip_counts['AM'] += 1
    else:
        # Increment the counter for after noon
        trip_counts['PM'] += 1
print(trip_counts)

{'AM': 94, 'PM': 196}
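
The same tally can be written more compactly with collections.Counter. This is only a sketch of an alternative, not the exercise's intended solution; sample_trips is a small stand-in for onebike_datetimes built from three trips in the data set.

```python
from collections import Counter
from datetime import datetime

# Three trips standing in for the full onebike_datetimes list
sample_trips = [
    {'start': datetime(2017, 10, 1, 15, 23, 25), 'end': datetime(2017, 10, 1, 15, 26, 26)},
    {'start': datetime(2017, 10, 2, 6, 37, 10), 'end': datetime(2017, 10, 2, 6, 42, 53)},
    {'start': datetime(2017, 10, 2, 8, 56, 45), 'end': datetime(2017, 10, 2, 9, 18, 3)},
]

# Label each trip by its start hour, then count the labels
sample_counts = Counter(
    'AM' if trip['start'].hour < 12 else 'PM' for trip in sample_trips
)

print(dict(sample_counts))
```

As in the loop above, a trip that starts at exactly 12:00:00 is counted as PM, because its hour is 12.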


# Turning strings into datetimes
When you download data from the Internet, dates and times usually come to you as strings. Often the first step is to turn those strings into datetime objects.

In this exercise, you will practice this transformation.

| Directive | Meaning |
|-----------|---------|
|%Y|4 digit year (0001-9999)|
|%m|2 digit month (01-12)|
|%d|2 digit day (01-31)|
|%H|2 digit hour (00-23)|
|%M|2 digit minute (00-59)|
|%S|2 digit second (00-59)|


In [5]:
# Import the datetime class
from datetime import datetime

# Starting string, in YYYY-MM-DD HH:MM:SS format
s = '2017-02-03 00:00:01'

# Write a format string to parse s
fmt = '%Y-%m-%d %H:%M:%S'

# Create a datetime object d
d = datetime.strptime(s, fmt)

# Print d
print(d)

2017-02-03 00:00:01


In [6]:
# Import the datetime class
from datetime import datetime

# Starting string, in YYYY-MM-DD format
s = '2030-10-15'

# Write a format string to parse s
fmt = '%Y-%m-%d'

# Create a datetime object d
d = datetime.strptime(s, fmt)

# Print d
print(d)

2030-10-15 00:00:00


In [7]:
# Import the datetime class
from datetime import datetime

# Starting string, in MM/DD/YYYY HH:MM:SS format
s = '12/15/1986 08:00:00'

# Write a format string to parse s
fmt = '%m/%d/%Y %H:%M:%S'

# Create a datetime object d
d = datetime.strptime(s, fmt)

# Print d
print(d)

1986-12-15 08:00:00
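
When the format string does not match the input, strptime() raises a ValueError rather than guessing. The sketch below wraps the call in a hypothetical helper, try_parse, which is not part of the datetime module, to show both outcomes.

```python
from datetime import datetime

s = '12/15/1986 08:00:00'

def try_parse(text, fmt):
    """Return the parsed datetime, or None if text does not match fmt."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None

print(try_parse(s, '%m/%d/%Y %H:%M:%S'))  # format matches the string
print(try_parse(s, '%Y-%m-%d %H:%M:%S'))  # format does not match, so None
```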


# Parsing pairs of strings as datetimes
Up until now, you've been working with a pre-processed list of datetimes for W20529's trips. For this exercise, you're going to go one step back in the data cleaning pipeline and work with the strings that the data started as.

Explore onebike_datetime_strings in the IPython shell to determine the correct format. datetime has already been loaded for you.

| Directive | Meaning |
|-----------|---------|
|%Y|4 digit year (0001-9999)|
|%m|2 digit month (01-12)|
|%d|2 digit day (01-31)|
|%H|2 digit hour (00-23)|
|%M|2 digit minute (00-59)|
|%S|2 digit second (00-59)|

In [8]:
import pandas as pd

captial_onebike = pd.read_csv('../datasets/capital-onebike.csv')

captial_onebike

Unnamed: 0,Start date,End date,Start station number,Start station,End station number,End station,Bike number,Member type
0,2017-10-01 15:23:25,2017-10-01 15:26:26,31038,Glebe Rd & 11th St N,31036,George Mason Dr & Wilson Blvd,W20529,Member
1,2017-10-01 15:42:57,2017-10-01 17:49:59,31036,George Mason Dr & Wilson Blvd,31036,George Mason Dr & Wilson Blvd,W20529,Casual
2,2017-10-02 06:37:10,2017-10-02 06:42:53,31036,George Mason Dr & Wilson Blvd,31037,Ballston Metro / N Stuart & 9th St N,W20529,Member
3,2017-10-02 08:56:45,2017-10-02 09:18:03,31037,Ballston Metro / N Stuart & 9th St N,31295,Potomac & M St NW,W20529,Member
4,2017-10-02 18:23:48,2017-10-02 18:45:05,31295,Potomac & M St NW,31230,Metro Center / 12th & G St NW,W20529,Member
...,...,...,...,...,...,...,...,...
285,2017-12-29 14:32:55,2017-12-29 14:43:46,31242,18th St & Pennsylvania Ave NW,31265,5th St & Massachusetts Ave NW,W20529,Member
286,2017-12-29 15:08:26,2017-12-29 15:18:51,31265,5th St & Massachusetts Ave NW,31613,Eastern Market Metro / Pennsylvania Ave & 7th ...,W20529,Casual
287,2017-12-29 20:33:34,2017-12-29 20:38:13,31613,Eastern Market Metro / Pennsylvania Ave & 7th ...,31618,4th & East Capitol St NE,W20529,Member
288,2017-12-30 13:51:03,2017-12-30 13:54:33,31618,4th & East Capitol St NE,31610,Eastern Market / 7th & North Carolina Ave SE,W20529,Member


In [9]:
# Write down the format string
fmt = "%Y-%m-%d %H:%M:%S"

# Initialize a list for holding the pairs of datetime objects
onebike_datetimes = []

onebike_datetime_strings = list(zip(captial_onebike['Start date'], captial_onebike['End date']))

# Loop over all trips
for (start, end) in onebike_datetime_strings:
    trip = {'start': datetime.strptime(start, fmt),
            'end': datetime.strptime(end, fmt)}

    # Append the trip
    onebike_datetimes.append(trip)
    
onebike_datetimes

[{'start': datetime.datetime(2017, 10, 1, 15, 23, 25),
  'end': datetime.datetime(2017, 10, 1, 15, 26, 26)},
 {'start': datetime.datetime(2017, 10, 1, 15, 42, 57),
  'end': datetime.datetime(2017, 10, 1, 17, 49, 59)},
 {'start': datetime.datetime(2017, 10, 2, 6, 37, 10),
  'end': datetime.datetime(2017, 10, 2, 6, 42, 53)},
 {'start': datetime.datetime(2017, 10, 2, 8, 56, 45),
  'end': datetime.datetime(2017, 10, 2, 9, 18, 3)},
 {'start': datetime.datetime(2017, 10, 2, 18, 23, 48),
  'end': datetime.datetime(2017, 10, 2, 18, 45, 5)},
 {'start': datetime.datetime(2017, 10, 2, 18, 48, 8),
  'end': datetime.datetime(2017, 10, 2, 19, 10, 54)},
 {'start': datetime.datetime(2017, 10, 2, 19, 18, 10),
  'end': datetime.datetime(2017, 10, 2, 19, 31, 45)},
 {'start': datetime.datetime(2017, 10, 2, 19, 37, 32),
  'end': datetime.datetime(2017, 10, 2, 19, 46, 37)},
 {'start': datetime.datetime(2017, 10, 3, 8, 24, 16),
  'end': datetime.datetime(2017, 10, 3, 8, 32, 27)},
 ...
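
The parsing loop can also be expressed as a list comprehension. This is a sketch of an equivalent form; sample_strings and sample_datetimes are illustrative names, and the two string pairs are copied from the first rows of the table above.

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S"

# Two (start, end) string pairs taken from the first rows of the data
sample_strings = [
    ('2017-10-01 15:23:25', '2017-10-01 15:26:26'),
    ('2017-10-01 15:42:57', '2017-10-01 17:49:59'),
]

# Build one dictionary per trip, parsing both strings with the same format
sample_datetimes = [
    {'start': datetime.strptime(start, fmt),
     'end': datetime.strptime(end, fmt)}
    for start, end in sample_strings
]

print(sample_datetimes[0]['start'])  # 2017-10-01 15:23:25
```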

# Recreating ISO format with strftime()
In the last chapter, you used strftime() to create strings from date objects. Now that you know about datetime objects, let's practice doing something similar.

Re-create the .isoformat() method, using .strftime(), and print the first trip start in our data set.

In [10]:
# Import datetime
from datetime import datetime

# Pull out the start of the first trip
first_start = onebike_datetimes[0]['start']

# Format to feed to strftime()
fmt = "%Y-%m-%dT%H:%M:%S"

# Print out date with .isoformat(), then with .strftime() to compare
print(first_start.isoformat())
print(first_start.strftime(fmt))

2017-10-01T15:23:25
2017-10-01T15:23:25
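
The two outputs agree here because the trip start has no fractional seconds. As a sketch of where they diverge, the hypothetical value with_micro below has a nonzero microsecond field, which .isoformat() includes by default.

```python
from datetime import datetime

fmt = "%Y-%m-%dT%H:%M:%S"

# Hypothetical datetime with half a second of microseconds
with_micro = datetime(2017, 10, 1, 15, 23, 25, 500000)

print(with_micro.isoformat())                    # 2017-10-01T15:23:25.500000
print(with_micro.strftime(fmt))                  # 2017-10-01T15:23:25
print(with_micro.strftime(fmt + ".%f"))          # 2017-10-01T15:23:25.500000
print(with_micro.isoformat(timespec='seconds'))  # 2017-10-01T15:23:25
```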


# Unix timestamps
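Datetimes are sometimes stored as Unix timestamps: the number of seconds since midnight UTC on January 1, 1970. This is especially common with computer infrastructure, like the log files that websites keep when they get visitors. Note that datetime.fromtimestamp() returns the time in the local time zone of the machine running the code, so the hours printed below may differ on your computer.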

In [11]:
# Import datetime
from datetime import datetime

# Starting timestamps
timestamps = [1514665153, 1514664543]

# Datetime objects
dts = []

# Convert each timestamp to a datetime in the local time zone
for ts in timestamps:
    dts.append(datetime.fromtimestamp(ts))
  
# Print results
print(dts)

[datetime.datetime(2017, 12, 31, 3, 19, 13), datetime.datetime(2017, 12, 31, 3, 9, 3)]
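
To get a result that does not depend on the machine's time zone, fromtimestamp() accepts a tz argument. The sketch below converts the same two timestamps to timezone-aware UTC datetimes; dts_utc is an illustrative name.

```python
from datetime import datetime, timezone

timestamps = [1514665153, 1514664543]

# tz=timezone.utc yields aware datetimes in UTC on every machine
dts_utc = [datetime.fromtimestamp(ts, tz=timezone.utc) for ts in timestamps]

for dt in dts_utc:
    print(dt.isoformat())
# 2017-12-30T20:19:13+00:00
# 2017-12-30T20:09:03+00:00
```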


# Turning pairs of datetimes into durations
When working with timestamps, we often want to know how much time has elapsed between events. Thankfully, we can use datetime arithmetic to ask Python to do the heavy lifting for us so we don't need to worry about day, month, or year boundaries. Let's calculate the number of seconds that the bike was out of the dock for each trip.

Continuing our work from a previous coding exercise, the bike trip data has been loaded as the list onebike_datetimes. Each element of the list is a dictionary holding two datetime objects under the keys start and end, corresponding to the start and end of a trip, respectively.

In [12]:
# Initialize a list for all the trip durations
onebike_durations = []

for trip in onebike_datetimes:
    # Create a timedelta object corresponding to the length of the trip
    trip_duration = trip['end'] - trip['start']
  
    # Get the total elapsed seconds in trip_duration
    trip_length_seconds = trip_duration.total_seconds()
  
    # Append the results to our list
    onebike_durations.append(trip_length_seconds)
    
onebike_durations

[181.0,
 7622.0,
 343.0,
 1278.0,
 1277.0,
 1366.0,
 815.0,
 545.0,
 491.0,
 639.0,
 1678.0,
 406.0,
 709.0,
 514.0,
 492.0,
 1668.0,
 2242.0,
 2752.0,
 735.0,
 330.0,
 518.0,
 1433.0,
 204.0,
 304.0,
 977.0,
 1399.0,
 1244.0,
 658.0,
 800.0,
 1911.0,
 2471.0,
 1344.0,
 435.0,
 271.0,
 920.0,
 851.0,
 209.0,
 453.0,
 841.0,
 142.0,
 1023.0,
 1466.0,
 1636.0,
 3039.0,
 1571.0,
 1410.0,
 386.0,
 1527.0,
 622.0,
 1450.0,
 1422.0,
 991.0,
 1484.0,
 1450.0,
 929.0,
 533.0,
 525.0,
 283.0,
 133.0,
 1106.0,
 952.0,
 553.0,
 659.0,
 297.0,
 357.0,
 989.0,
 979.0,
 760.0,
 1110.0,
 675.0,
 1207.0,
 1593.0,
 768.0,
 1446.0,
 485.0,
 200.0,
 399.0,
 242.0,
 170.0,
 450.0,
 1078.0,
 1042.0,
 573.0,
 748.0,
 735.0,
 336.0,
 76913.0,
 171.0,
 568.0,
 358.0,
 917.0,
 671.0,
 1791.0,
 318.0,
 888.0,
 1284.0,
 11338.0,
 1686.0,
 5579.0,
 8290.0,
 1850.0,
 1810.0,
 870.0,
 436.0,
 429.0,
 494.0,
 1439.0,
 380.0,
 629.0,
 962.0,
 387.0,
 952.0,
 190.0,
 739.0,
 1120.0,
 369.0,
 2275.0,
 873.0,
 1670.0,
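
Subtracting two datetimes yields a timedelta, and .total_seconds() is the safe way to get its full length. The sketch below uses a hypothetical trip of just over one day to show that the .seconds attribute holds only the part left over after whole days.

```python
from datetime import datetime

# Hypothetical trip lasting one day and fifteen minutes
start = datetime(2017, 10, 1, 23, 50, 0)
end = datetime(2017, 10, 3, 0, 5, 0)

duration = end - start

print(duration)                  # 1 day, 0:15:00
print(duration.days)             # 1
print(duration.seconds)          # 900 (excludes the whole day)
print(duration.total_seconds())  # 87300.0
```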
 

# Average trip time
W20529 took 290 trips in the data set loaded here, matching the 94 AM and 196 PM starts counted earlier. How long were the trips on average? We can use the built-in Python functions sum() and len() to make this calculation.

Based on your last coding exercise, the data has been loaded as onebike_durations. Each entry is a number of seconds that the bike was out of the dock.

In [13]:
# What was the total duration of all trips?
total_elapsed_time = sum(onebike_durations)

# What was the total number of trips?
number_of_trips = len(onebike_durations)
  
# Divide the total duration by the number of trips
print(total_elapsed_time / number_of_trips)

1178.9310344827586
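
An average in raw seconds is hard to read at a glance. As a sketch, the standard library's statistics.mean() computes the same kind of average, and wrapping the result in a timedelta prints it as hours, minutes and seconds. Here sample_durations holds only the first five durations from the output above, so its average differs from the full result.

```python
from datetime import timedelta
from statistics import mean

# First five trip durations from the list above, in seconds
sample_durations = [181.0, 7622.0, 343.0, 1278.0, 1277.0]

# mean() is equivalent to sum(...) / len(...)
average_seconds = mean(sample_durations)
average_duration = timedelta(seconds=average_seconds)

print(average_seconds)
print(average_duration)
```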


# The long and the short of why time is hard
Out of the 290 trips taken by W20529 in this data, how long was the longest? How short was the shortest? Does anything look fishy?

As before, data has been loaded as onebike_durations.

In [None]:
# Calculate shortest and longest trips
shortest_trip = min(onebike_durations)
longest_trip = max(onebike_durations)

# Print out the results
print(f"The shortest trip was {shortest_trip} seconds")
print(f"The longest trip was {longest_trip} seconds")
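
One possible source of odd-looking durations is that the datetimes parsed in this chapter carry no time zone information. The sketch below uses a hypothetical trip, not one taken from the data, that spans the end of US daylight saving time on November 5, 2017. The fixed offsets EDT and EST are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets assumed for US Eastern time
EDT = timezone(timedelta(hours=-4))  # daylight time, before the clocks go back
EST = timezone(timedelta(hours=-5))  # standard time, after the clocks go back

# Hypothetical wall-clock readings on the night the clocks go back
naive_start = datetime(2017, 11, 5, 1, 50, 0)
naive_end = datetime(2017, 11, 5, 1, 10, 0)

# Naive arithmetic compares wall-clock readings only
naive_seconds = (naive_end - naive_start).total_seconds()
print(naive_seconds)  # -2400.0

# Attaching the offsets recovers the real elapsed time
aware_start = naive_start.replace(tzinfo=EDT)
aware_end = naive_end.replace(tzinfo=EST)
aware_seconds = (aware_end - aware_start).total_seconds()
print(aware_seconds)  # 1200.0
```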