# the `datetime` module

*time is weird*. 

we frequently work with dates and timestamps, and it often gets tricky. countless issues can crop up due to differing time formats, calendars, and time zones. the everyday definitions of the time measures we use hide obscure complexity, so time intervals and date arithmetic are not trivial to compute from first principles. consider for instance:

- years differ in length, and a year is not even a whole number of days long.
- a calendar year is not a whole number of weeks.
- the months and the (financial) quarters in a year are not all equally long, and they are generally not a whole number of weeks long.
- fiscal years differ from calendar years (and are location specific!)
- we might plausibly need to count events during a time period in number of working days (excluding weekends and holidays from the count). 
- given a datestamp, how can you find out which week number it falls in? to calculate this from first principles we'd need to know
    + what weekday is the first day of the week? (convention varies by location and industry, typically either sunday or monday)
    + was it in a leap year? february's extra day shifts the weekday rhythm.
    + in what timezone was the timestamp recorded, and in which timezone are we counting the weeks? (imagine we are aggregating by week as observed in UTC and the timestamp above was recorded in china standard time, many hours ahead. then the local timestamp might fall in a different week than the one UTC assigns it to.)
- what date is 01/02/03? or 08/09/06? (this is a maddeningly common date format.)
- DO NOT get me started on daylight savings time... 
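
to make the 01/02/03 ambiguity concrete, here is a minimal sketch that parses the same string under three plausible conventions using `datetime.datetime.strptime()`. the convention labels and the variable names (`ambiguous`, `conventions`, `readings`) are made up for this illustration, and the list of conventions is not exhaustive:

```python
import datetime

ambiguous = "01/02/03"

# three plausible readings of the same eight characters
conventions = {
    "day/month/year": "%d/%m/%y",
    "month/day/year": "%m/%d/%y",
    "year/month/day": "%y/%m/%d",
}

readings = {}
for label, fmt in conventions.items():
    # each format descriptor assigns the three numbers to different units
    readings[label] = datetime.datetime.strptime(ambiguous, fmt).date()
    print(label, '->', readings[label])
```

all three calls succeed without complaint, which is exactly the problem: nothing in the string itself says which reading was intended.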

there are two aspects at play here. the first is date and time formatting, and the second is date and time arithmetic. the latter is impossible without the former.

luckily python has modules that help us deal with these formats and calculations. we just need to learn to use them. here we will demonstrate the use of the `datetime` module.

### a suggestion:
i suggest that whenever and wherever you need to display a date, no matter what the context is, use the [iso format designed to minimize misunderstanding and confusion](https://www.iso.org/iso-8601-date-and-time-format.html): *YYYY-MM-DD*. 

this is the internationally recognized standard date format (and therefore the best format, bar none) whose adoption will reduce confusion and errors. use it. design date fields in forms this way. date your notes this way. set your computer's localisation format to this. expect dates in this format by default. python's `datetime` objects print in this format by default.
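
as a small sketch of how well the standard library gets along with this format: `isoformat()` writes it, `fromisoformat()` (python 3.7 and later) reads it back, and iso date strings sort chronologically even as plain text. the example dates are arbitrary:

```python
import datetime

a_date = datetime.date(2001, 3, 16)

# str() and isoformat() both give YYYY-MM-DD
iso_str = a_date.isoformat()
print(str(a_date), iso_str)

# round trip: parse the iso string back into a date object
parsed = datetime.date.fromisoformat(iso_str)
print(parsed == a_date)

# a side benefit: iso date strings sort chronologically as plain text
date_strs = ["2018-10-15", "2001-03-16", "2018-01-21"]
sorted_strs = sorted(date_strs)
print(sorted_strs)
```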

#### extra credit:
shame those who won't use this format, for their only defense is bigotry which favours their own familiarity over universal clarity and which hides a callous indifference to the manifold unforeseeable errors caused by date confusion.

# this week's exercise:
read in the new york rodent inspection data from last week. parse the date format of the two columns containing datetime information ('INSPECTION_DATE' and 'APPROVED_DATE'). hint: new york is in the united states. then, for each record (row), populate four new columns: 
- one containing the weekday name when the inspection took place (monday, tuesday, wednesday,...). 
- one containing the name of the month the inspection took place in.
- one containing the [iso-week](https://en.wikipedia.org/wiki/ISO_week_date) in which the inspection took place. 
- one column containing how long it took for the inspection to be approved (in appropriate time units).

## heads up:
next week, we will consider (and possibly answer!) questions such as: 
- which weekday has the longest average wait time for approval in the winter? (let's define the seasons as: dec-feb is winter, mar-may is spring, jun-aug is summer, sep-nov is fall.)
- which weekday has the longest average wait time for approval in the summer?
- which season has the greatest number of inspections? 
- which season has the greatest number of distinct dates ...
    + a) in the data set
    + b) in the calendar? 
- which borough has the greatest difference in the number of inspections in the spring vs in the fall?
- count the number of inspections per [iso-week](https://en.wikipedia.org/wiki/ISO_week_date). find the week with the greatest number of inspections. for that week, and that week only, count the inspections by day-of-week.

In [1]:
import datetime

In [2]:
a_datetime = datetime.datetime(2001, 3, 16)
print('a date time:', a_datetime)
print('or just a date:', a_datetime.date())

a date time: 2001-03-16 00:00:00
or just a date: 2001-03-16


In [3]:
# let us start with taking the current moment as the first datetime object
a_datetime = datetime.datetime.now()
print(type(a_datetime))
print(a_datetime)

<class 'datetime.datetime'>
2018-10-15 10:32:25.568533


In [4]:
a_datetime = datetime.datetime.now()
# we can access the individual elements of a datetime object:
print('the current year',           a_datetime.year)
print('the current month',          a_datetime.month)
print('the current day',            a_datetime.day)
print('the current hour',           a_datetime.hour)
print('the current minute',         a_datetime.minute)
print('the current second',         a_datetime.second)
print('the current microsecond',    a_datetime.microsecond)
print('the current weekday number', a_datetime.weekday())

the current year 2018
the current month 10
the current day 15
the current hour 10
the current minute 32
the current second 25
the current microsecond 574843
the current weekday number 0


In [5]:
# there is also a slightly simpler date object: 
today_date = datetime.date.today()
print('the date is:',         today_date)
print('the year is:',         today_date.year)
print('the month is:',        today_date.month)
print('the day-of-month is:', today_date.day)
print('the day-of-week is:',  today_date.weekday())
print('the iso-weekday is:',  today_date.isoweekday())
(isoyear, isoweek, isoweekday) = today_date.isocalendar()
print('the iso year:', isoyear, ', isoweek:', isoweek, ', and isoweekday:', isoweekday)
try:
    print('the hour is:',     today_date.hour)
except AttributeError:
    print('a "date" object has no "hour" attribute (nor "minute", "second")')
print('the components are:', today_date.timetuple())

the date is: 2018-10-15
the year is: 2018
the month is: 10
the day-of-month is: 15
the day-of-week is: 0
the iso-weekday is: 1
the iso year: 2018 , isoweek: 42 , and isoweekday: 1
a "date" object has no "hour" attribute (nor "minute", "second")
the components are: time.struct_time(tm_year=2018, tm_mon=10, tm_mday=15, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=0, tm_yday=288, tm_isdst=-1)


# datetime format descriptors
in order to convert between datetime objects and plain text string representations of them. the `strftime()` (string - format - time) function reformats time strings. it operates on a datetime object and takes as an argument a "format descriptor", which is itself a string that specifies the time unit. 

a format descriptor string includes any (or all) of these: `'%a', '%A', '%w', '%d', '%b', '%B', '%y', '%Y', '%y', '%H', '%I', '%M', '%S', '%f', '%z', '%Z', '%j', '%U', '%W', '%c', '%x', '%X'`. here is how each of these works ([source](https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior)): 

In [6]:
print('the day is',              a_datetime.strftime("%d"), '(day of month 01-31)')
print('the month is',            a_datetime.strftime("%b"), '(month name, short version)')
print('the month is',            a_datetime.strftime("%B"), '(month name, full version)')
print('the month number is',     a_datetime.strftime("%m"), '(month as a number 01-12)')
print('the year is',             a_datetime.strftime("%y"), '(year, without century)')
print('the year is',             a_datetime.strftime("%Y"), '(year, full version)')

the day is 15 (day of month 01-31)
the month is Oct (month name, short version)
the month is October (month name, full version)
the month number is 10 (month as a number 01-12)
the year is 18 (year, without century)
the year is 2018 (year, full version)


In [7]:
print('today is a',              a_datetime.strftime("%a"), '(weekday, short version)')
print('today is a',              a_datetime.strftime("%A"), '(weekday, full version)')
print('the weekday is',          a_datetime.strftime("%w"), '(weekday as a number 0-6, 0 is Sunday)')
print('the week number is',      a_datetime.strftime("%W"), '(week number of year, monday as the first day of week, 00-53)')
print('the week number is also', a_datetime.strftime("%U"), '(week number of year, sunday as the first day of week, 00-53)')

today is a Mon (weekday, short version)
today is a Monday (weekday, full version)
the weekday is 1 (weekday as a number 0-6, 0 is Sunday)
the week number is 42 (week number of year, monday as the first day of week, 00-53)
the week number is also 41 (week number of year, sunday as the first day of week, 00-53)
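
note that neither `%W` nor `%U` is the iso week number. the sketch below compares all three on a date chosen (by me, for illustration) because it sits at a year boundary, where the conventions disagree the most:

```python
import datetime

new_years_day = datetime.date(2021, 1, 1)  # a friday

# days before the first monday of the year count as week 00
week_monday_first = new_years_day.strftime("%W")
# days before the first sunday of the year count as week 00
week_sunday_first = new_years_day.strftime("%U")
# the iso calendar assigns this day to the last week of the previous iso year
iso_year, iso_week, iso_weekday = new_years_day.isocalendar()

print('%W gives', week_monday_first)
print('%U gives', week_sunday_first)
print('isocalendar gives year', iso_year, 'week', iso_week, 'weekday', iso_weekday)
```

the iso year can differ from the calendar year, so an iso week number should always travel together with its iso year.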


In [8]:
print('the 24-hour is',         a_datetime.strftime("%H"), '(Hour 00-23)')
print('the 12-hour is',         a_datetime.strftime("%I"), '(Hour 01-12)')
print('the morning/evening is', a_datetime.strftime("%p"), '(AM/PM)')
print('the minute is',          a_datetime.strftime("%M"), '(Minute 00-59)')
print('the second is',          a_datetime.strftime("%S"), '(Second 00-59)')
print('the microsecond is',     a_datetime.strftime("%f"), '(Microsecond 000000-999999)')
print('the timezone offset is', a_datetime.strftime("%z"), '(UTC offset)')
print('the timezone name is',   a_datetime.strftime("%Z"), '(Timezone)')
print('the day number is',      a_datetime.strftime("%j"), '(Day number of year 001-366)')

the 24-hour is 10 (Hour 00-23)
the 12-hour is 10 (Hour 01-12)
the morning/evening is AM (AM/PM)
the minute is 32 (Minute 00-59)
the second is 25 (Second 00-59)
the microsecond is 574843 (Microsecond 000000-999999)
the timezone offset is  (UTC offset)
the timezone name is  (Timezone)
the day number is 288 (Day number of year 001-366)
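
the `%z` and `%Z` fields come out empty above because `datetime.datetime.now()` returns a "naive" object that carries no timezone information. here is a minimal sketch of a timezone-aware datetime, which also shows the week-boundary problem from the introduction. the fixed UTC+8 offset labelled "CST" is an assumption made for illustration (a real application would more likely use a timezone database):

```python
import datetime

# hypothetical fixed-offset zone for illustration: china standard time, UTC+8
cst = datetime.timezone(datetime.timedelta(hours=8), "CST")

# monday at 02:00 local time
local_dt = datetime.datetime(2018, 10, 15, 2, 0, tzinfo=cst)
# the same instant, expressed in UTC
utc_dt = local_dt.astimezone(datetime.timezone.utc)

print('offset:', local_dt.strftime("%z"), 'zone name:', local_dt.strftime("%Z"))
print('local:', local_dt, 'iso week', local_dt.isocalendar()[1])
print('utc:  ', utc_dt, 'iso week', utc_dt.isocalendar()[1])
```

the two objects describe the same moment, yet they land in different iso weeks, so any "per week" count has to decide on a timezone first.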


In [9]:
print('today is', a_datetime.strftime("%c"), '(Local version of date and time)')
print('today is', a_datetime.strftime("%x"), '(Local version of date)')
print('today is', a_datetime.strftime("%X"), '(Local version of time)')

today is Mon Oct 15 10:32:25 2018 (Local version of date and time)
today is 10/15/18 (Local version of date)
today is 10:32:25 (Local version of time)


In [10]:
# another very common way to use strftime(): embed the descriptors in a longer string
print(a_datetime.strftime("today is a %a (Weekday, short version)"))
print(a_datetime.strftime("today is a %A (Weekday, full version)"))

today is a Mon (Weekday, short version)
today is a Monday (Weekday, full version)
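
the same format descriptors also work as a format spec inside f-strings and `str.format()`, which can be handy when a date is just one part of a longer message. a small sketch with a fixed example datetime (chosen to match the outputs above):

```python
import datetime

a_moment = datetime.datetime(2018, 10, 15, 10, 32, 25)

# datetime objects accept strftime descriptors after the colon of a format spec
via_fstring = f"{a_moment:%Y-%m-%d %H:%M}"
via_format = "{:%Y-%m-%d %H:%M}".format(a_moment)
via_strftime = a_moment.strftime("%Y-%m-%d %H:%M")

print(via_fstring)
print(via_fstring == via_format == via_strftime)
```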


# parsing examples
we often need to read a timestamp from a string in one of many formats, e.g.:

In [11]:
timestamp_str_1 = "Jan 21"
timestamp_str_2 = "23rd of January"
timestamp_str_3 = "2018-03-16"
timestamp_str_4 = "2018-01-21T13:27:10"
timestamp_str_5 = "2018-01-21T15:49:23.3855"
timestamp_str_6 = "2016/02/21 15:49:23"
timestamp_str_7 = "At 15:49 on Thursday, 21 of January '18"
timestamp_str_8 = "Sat 09/08/07 @15:49"
timestamp_str_9 = "Fri 09/08/07 @10:01"

how do we convert these strings to datetime objects that we can compute with? there is a function for that, called `strptime()` (string - parse - time). it takes two arguments: a string to convert and a format descriptor. it uses the same format descriptors as its sibling function `strftime()`.

In [12]:
datetime_object_1  = datetime.datetime.strptime(timestamp_str_1, '%b %d')
print(timestamp_str_1, ' -> ', datetime_object_1)
datetime_object_2  = datetime.datetime.strptime(timestamp_str_2, '%drd of %B') # nb! what happens to "21st of January"!
print(timestamp_str_2, ' -> ', datetime_object_2)
datetime_object_3  = datetime.datetime.strptime(timestamp_str_3, '%Y-%m-%d')
print(timestamp_str_3, ' -> ', datetime_object_3)
datetime_object_4  = datetime.datetime.strptime(timestamp_str_4, '%Y-%m-%dT%H:%M:%S')
print(timestamp_str_4, ' -> ', datetime_object_4)
datetime_object_5  = datetime.datetime.strptime(timestamp_str_5, '%Y-%m-%dT%H:%M:%S.%f')
print(timestamp_str_5, ' -> ', datetime_object_5)
datetime_object_6  = datetime.datetime.strptime(timestamp_str_6, '%Y/%m/%d %H:%M:%S')
print(timestamp_str_6, ' -> ', datetime_object_6)
datetime_object_7  = datetime.datetime.strptime(timestamp_str_7, "At %H:%M on %A, %d of %B '%y")
print(timestamp_str_7, ' -> ', datetime_object_7)
datetime_object_8  = datetime.datetime.strptime(timestamp_str_8, "%a %m/%d/%y @%H:%M")
print(timestamp_str_8, ' -> ', datetime_object_8) # does this date exist?
datetime_object_9  = datetime.datetime.strptime(timestamp_str_9, "%a %m/%d/%y @%H:%M")
print(timestamp_str_9, ' -> ', datetime_object_9) # does this date exist?

Jan 21  ->  1900-01-21 00:00:00
23rd of January  ->  1900-01-23 00:00:00
2018-03-16  ->  2018-03-16 00:00:00
2018-01-21T13:27:10  ->  2018-01-21 13:27:10
2018-01-21T15:49:23.3855  ->  2018-01-21 15:49:23.385500
2016/02/21 15:49:23  ->  2016-02-21 15:49:23
At 15:49 on Thursday, 21 of January '18  ->  2018-01-21 15:49:00
Sat 09/08/07 @15:49  ->  2007-09-08 15:49:00
Fri 09/08/07 @10:01  ->  2007-09-08 10:01:00
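
two behaviours in the output above deserve a closer look. `strptime()` reads a weekday name but does not check it against the date, and any literal text in the format must match exactly. the sketch below assumes an english locale (so that "Fri" and "January" are recognised); the string with the year appended is my own variation on the example above:

```python
import datetime

# the weekday name is parsed but not checked against the date
claimed = "Fri 09/08/07 @10:01"
parsed = datetime.datetime.strptime(claimed, "%a %m/%d/%y @%H:%M")
actual_weekday = parsed.strftime("%a")
print(claimed, '->', parsed, 'which is actually a', actual_weekday)

# a literal in the format ("rd") that does not match the string ("st") raises ValueError
try:
    datetime.datetime.strptime("21st of January 2018", "%drd of %B %Y")
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print('could not parse:', err)
```

so a successful parse does not prove the input was self-consistent, and ordinal suffixes (st, nd, rd, th) need to be handled before parsing.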


# datetime arithmetic
given two timestamps, we will often be interested in the interval between them, measured in some unit of timekeeping. python will understand us if we just use the `+` and `-` operators on datetime objects. the key function we will use is `datetime.timedelta()`.

e.g. how many days are there till christmas? how old are you, if we counted our age in weeks? what date will it be when your 90-day warranty expires? for how many minutes did you sleep last night?

In [13]:
numweeks = 12; numdays = 5
# how to use timedelta (all arguments are optional, pass them by keyword): 
# timedelta(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)
# returns a time interval.  
timeinterval_obj = datetime.timedelta(weeks=numweeks, days=numdays)
now_datetime = datetime.datetime.now()
# we can add interval to a timestamp
later_datetime = now_datetime + timeinterval_obj
print('now it is', now_datetime.date(), 
      'but in', numweeks,'weeks and', numdays, 'days it will be', 
      later_datetime.date())

now it is 2018-10-15 but in 12 weeks and 5 days it will be 2019-01-12


In [14]:
# we can also subtract an interval from a timestamp:
earlier_datetime = now_datetime - timeinterval_obj
print('now it is', now_datetime.date(), 
      'but', numweeks,'weeks and', numdays, 'days ago it was', 
      earlier_datetime)

now it is 2018-10-15 but 12 weeks and 5 days ago it was 2018-07-18 10:32:25.643301


In [15]:
# if you subtract datetimes, you get a timedelta object for the interval between the dates
christmas_start = datetime.datetime.strptime('2018-12-25T08:01', '%Y-%m-%dT%H:%M')
time_diff = christmas_start - now_datetime 
print('christmas is only', time_diff, 'away')

christmas is only 70 days, 21:28:34.356699 away


In [16]:
# we can break down intervals into different units:
print('christmas is only', time_diff.total_seconds(), 'seconds away')
print('christmas is only', time_diff.days, 'days and', time_diff.seconds + time_diff.microseconds/1e6, 'seconds away')

christmas is only 6125314.356699 seconds away
christmas is only 70 days and 77314.356699 seconds away


note that there is no `time_diff.hours` or `time_diff.minutes` or `time_diff.milliseconds`. we can retrieve these with a simple calculation from the number of seconds. 

In [17]:
print(
    'christmas is only', 
    time_diff.days, 'days and',
    int(time_diff.seconds/3600), 'hours and', 
    int((time_diff.seconds % 3600)/60), 'minutes and',
    int((time_diff.seconds % 60)) + time_diff.microseconds/1e6, 'seconds away')

christmas is only 70 days and 21 hours and 28 minutes and 34.356699 seconds away
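
the same breakdown can be written a little more compactly with `divmod()`, and a whole interval can be expressed in a single unit by dividing it by another `timedelta`. a minimal sketch, with the interval from the christmas example typed in by hand so that the block runs on its own:

```python
import datetime

# the interval from the christmas example above, written out by hand
interval = datetime.timedelta(days=70, hours=21, minutes=28, seconds=34, microseconds=356699)

# divmod splits the leftover seconds into hours, minutes and seconds
hours, remainder = divmod(interval.seconds, 3600)
minutes, seconds = divmod(remainder, 60)
print(interval.days, 'days', hours, 'hours', minutes, 'minutes', seconds, 'seconds')

# dividing by another timedelta expresses the whole interval in that unit
whole_hours = interval // datetime.timedelta(hours=1)
fractional_weeks = interval / datetime.timedelta(weeks=1)
print(whole_hours, 'whole hours, or about', round(fractional_weeks, 2), 'weeks')
```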


In [18]:
# if we just want to pretty print the interval, there is an easier way!
print('the time difference is', str(time_diff))

the time difference is 70 days, 21:28:34.356699


In [19]:
# it looks prettier to only count the days (note the rounding up!)
date_diff = christmas_start.date() - now_datetime.date() # .date() drops the clock time, so both sides count from midnight
print('christmas is only', date_diff.days, 'days away')

christmas is only 71 days away


In [20]:
# again, with pandas life gets a little easier... 
import pandas as pd
# reminder: timestamp_str_3 = "2018-03-16"
enddate = pd.to_datetime(timestamp_str_3) + pd.DateOffset(days=89)
print(enddate)
# but! warning! pandas doesn't handle every timestamp format!

2018-06-13 00:00:00


# conclusion
now you should have all you need to work on the exercise. i expect you will find this one a bit quicker to solve than last week's problem which may have required a lot of experimenting with commands.

in case you do get stuck on the exercise, here are some pointers on how to get started:

# working with dates from a data file
now let us read in a data file with a date field. we saw last week what the easiest (=best) way to do that is:

In [21]:
import pandas as pd
#filename_csv = 'NY_rodent_inspections_sample_small.csv' # for initial testing
filename_csv = 'NY_rodent_inspections_sample_medium.csv' # for further consideration
rodent_df = pd.read_csv(filename_csv) # yeah. that's it. neat, huh?
# inspect the data frame
rodent_df.head()

Unnamed: 0,INSPECTION_TYPE,JOB_TICKET_OR_WORK_ORDER_ID,JOB_ID,JOB_PROGRESS,BBL,BORO_CODE,BLOCK,LOT,HOUSE_NUMBER,STREET_NAME,ZIP_CODE,X_COORD,Y_COORD,LATITUDE,LONGITUDE,BOROUGH,INSPECTION_DATE,RESULT,APPROVED_DATE,LOCATION
0,BAIT,1,PO12965,3,1011470035,1,1147,35,104,WEST 76 STREET,10023,990505,223527,40.780204,-73.977414,Manhattan,10/14/2009 12:00:27 PM,Bait applied,10/14/2009 03:01:46 PM,"(40.7802039792471, -73.9774144709456)"
1,BAIT,2,PO12966,3,1011470034,1,1147,34,102,WEST 76 STREET,10023,990516,223521,40.780188,-73.977375,Manhattan,10/14/2009 12:51:21 PM,Bait applied,10/14/2009 03:02:30 PM,"(40.7801875030438, -73.977374757787)"
2,BAIT,30,PO16966,3,2043370027,2,4337,27,620,THWAITES PLACE,10467,1020110,252216,40.858877,-73.870364,Bronx,11/09/2009 12:59:55 PM,Bait applied,11/10/2009 02:54:52 PM,"(40.8588765781972, -73.8703636422023)"
3,BAIT,31,PO13665,3,2037670077,2,3767,77,1227,WHITEPLAINS ROAD,10472,1022441,242180,40.831321,-73.861994,Bronx,11/09/2009 11:10:16 AM,Bait applied,11/10/2009 02:56:42 PM,"(40.8313209626148, -73.861994089899)"
4,BAIT,38,PO11291,3,1011690057,1,1169,57,2199,BROADWAY,10024,989641,224567,40.783059,-73.980533,Manhattan,11/10/2009 08:40:42 AM,Bait applied,11/17/2009 11:39:11 AM,"(40.7830590725833, -73.9805333640688)"


In [22]:
# the columns of interest are 'INSPECTION_DATE' and 'APPROVED_DATE'
# inspection by eye gives us the datetime format descriptor
format_descriptor = '%m/%d/%Y %H:%M:%S %p'
for date_str in rodent_df['INSPECTION_DATE']:
    inspection_datetime = datetime.datetime.strptime(date_str, format_descriptor)
    print(date_str, '-->', inspection_datetime) # compare before and after

10/14/2009 12:00:27 PM --> 2009-10-14 12:00:27
10/14/2009 12:51:21 PM --> 2009-10-14 12:51:21
11/09/2009 12:59:55 PM --> 2009-11-09 12:59:55
11/09/2009 11:10:16 AM --> 2009-11-09 11:10:16
11/10/2009 08:40:42 AM --> 2009-11-10 08:40:42
11/10/2009 09:53:06 AM --> 2009-11-10 09:53:06
11/10/2009 09:00:51 AM --> 2009-11-10 09:00:51
11/16/2009 12:59:45 PM --> 2009-11-16 12:59:45
11/16/2009 12:06:24 PM --> 2009-11-16 12:06:24
11/16/2009 11:50:25 AM --> 2009-11-16 11:50:25
11/16/2009 09:55:20 PM --> 2009-11-16 09:55:20
11/16/2009 10:20:14 PM --> 2009-11-16 10:20:14
11/16/2009 09:25:46 PM --> 2009-11-16 09:25:46
11/17/2009 12:00:29 PM --> 2009-11-17 12:00:29
11/17/2009 09:55:24 AM --> 2009-11-17 09:55:24
11/17/2009 10:30:04 AM --> 2009-11-17 10:30:04
11/17/2009 11:02:47 AM --> 2009-11-17 11:02:47
11/17/2009 09:20:23 AM --> 2009-11-17 09:20:23
11/18/2009 09:15:05 AM --> 2009-11-18 09:15:05
11/18/2009 09:55:02 AM --> 2009-11-18 09:55:02
11/20/2009 01:13:12 PM --> 2009-11-20 01:13:12
11/20/2009 09

# a warning!
to illustrate how tricky datetime parsing and conversions can be, i made a mistake in the above and decided to leave it in. 

there is a silly error in the above parsing code. can you find it?

In [23]:
# the columns of interest are 'INSPECTION_DATE' and 'APPROVED_DATE'
# inspection by eye gives us the datetime format descriptor
format_descriptor = '%m/%d/%Y %I:%M:%S %p'
columns = ['INSPECTION_DATE', 'APPROVED_DATE']

for column in columns:
    print('transcoding the', column, 'column')
    for date_str in rodent_df[column]:
        inspection_datetime = pd.to_datetime(date_str)
        #inspection_datetime = datetime.datetime.strptime(date_str, format_descriptor)
        print(date_str, '-->', inspection_datetime) # compare before and after

# rodent_df['inspection_datetime'] = datetime.datetime.strptime(date_str, format_descriptor)

transcoding the INSPECTION_DATE column
10/14/2009 12:00:27 PM --> 2009-10-14 12:00:27
10/14/2009 12:51:21 PM --> 2009-10-14 12:51:21
11/09/2009 12:59:55 PM --> 2009-11-09 12:59:55
11/09/2009 11:10:16 AM --> 2009-11-09 11:10:16
11/10/2009 08:40:42 AM --> 2009-11-10 08:40:42
11/10/2009 09:53:06 AM --> 2009-11-10 09:53:06
11/10/2009 09:00:51 AM --> 2009-11-10 09:00:51
11/16/2009 12:59:45 PM --> 2009-11-16 12:59:45
11/16/2009 12:06:24 PM --> 2009-11-16 12:06:24
11/16/2009 11:50:25 AM --> 2009-11-16 11:50:25
11/16/2009 09:55:20 PM --> 2009-11-16 21:55:20
11/16/2009 10:20:14 PM --> 2009-11-16 22:20:14
11/16/2009 09:25:46 PM --> 2009-11-16 21:25:46
11/17/2009 12:00:29 PM --> 2009-11-17 12:00:29
11/17/2009 09:55:24 AM --> 2009-11-17 09:55:24
11/17/2009 10:30:04 AM --> 2009-11-17 10:30:04
11/17/2009 11:02:47 AM --> 2009-11-17 11:02:47
11/17/2009 09:20:23 AM --> 2009-11-17 09:20:23
11/18/2009 09:15:05 AM --> 2009-11-18 09:15:05
11/18/2009 09:55:02 AM --> 2009-11-18 09:55:02
11/20/2009 01:13:12 P

01/07/2010 11:31:56 AM --> 2010-01-07 11:31:56
01/07/2010 11:00:28 AM --> 2010-01-07 11:00:28
01/07/2010 10:31:00 AM --> 2010-01-07 10:31:00
01/07/2010 10:12:01 AM --> 2010-01-07 10:12:01
01/07/2010 09:41:02 AM --> 2010-01-07 09:41:02
01/07/2010 09:00:53 AM --> 2010-01-07 09:00:53
01/07/2010 09:26:11 AM --> 2010-01-07 09:26:11
01/07/2010 09:25:08 AM --> 2010-01-07 09:25:08
01/07/2010 11:10:37 AM --> 2010-01-07 11:10:37
01/07/2010 11:40:45 AM --> 2010-01-07 11:40:45
01/07/2010 12:55:26 PM --> 2010-01-07 12:55:26
01/07/2010 12:15:51 PM --> 2010-01-07 12:15:51
01/07/2010 09:50:10 AM --> 2010-01-07 09:50:10
01/07/2010 10:30:26 AM --> 2010-01-07 10:30:26
01/07/2010 12:10:07 PM --> 2010-01-07 12:10:07
01/07/2010 01:09:26 AM --> 2010-01-07 01:09:26
01/07/2010 02:19:21 AM --> 2010-01-07 02:19:21
01/07/2010 12:00:48 PM --> 2010-01-07 12:00:48
01/07/2010 11:10:24 AM --> 2010-01-07 11:10:24
01/07/2010 09:10:40 AM --> 2010-01-07 09:10:40
01/07/2010 03:08:48 PM --> 2010-01-07 15:08:48
01/07/2010 11

01/07/2010 07:21:15 AM --> 2010-01-07 07:21:15
01/07/2010 07:20:16 AM --> 2010-01-07 07:20:16
01/07/2010 07:18:10 AM --> 2010-01-07 07:18:10
01/07/2010 07:18:19 AM --> 2010-01-07 07:18:19
01/07/2010 03:44:22 PM --> 2010-01-07 15:44:22
01/07/2010 03:42:02 PM --> 2010-01-07 15:42:02
01/07/2010 03:42:12 PM --> 2010-01-07 15:42:12
01/07/2010 03:42:22 PM --> 2010-01-07 15:42:22
01/07/2010 03:42:30 PM --> 2010-01-07 15:42:30
01/07/2010 03:42:40 PM --> 2010-01-07 15:42:40
01/07/2010 03:42:53 PM --> 2010-01-07 15:42:53
01/07/2010 03:43:00 PM --> 2010-01-07 15:43:00
01/07/2010 03:44:50 PM --> 2010-01-07 15:44:50
01/07/2010 03:43:12 PM --> 2010-01-07 15:43:12
01/07/2010 03:43:20 PM --> 2010-01-07 15:43:20
01/07/2010 03:45:39 PM --> 2010-01-07 15:45:39
01/07/2010 03:45:49 PM --> 2010-01-07 15:45:49
01/07/2010 09:00:38 AM --> 2010-01-07 09:00:38
01/07/2010 08:59:06 AM --> 2010-01-07 08:59:06
01/07/2010 09:01:55 AM --> 2010-01-07 09:01:55
01/07/2010 09:02:55 AM --> 2010-01-07 09:02:55
01/07/2010 09
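
the difference between the two runs comes down to a single letter in the format descriptor. here is a minimal sketch that isolates it, using one of the evening timestamps from the data:

```python
import datetime

evening_str = '11/16/2009 09:55:20 PM'

# %H reads the hour on a 24-hour clock, so the PM marker has no effect on the result
wrong = datetime.datetime.strptime(evening_str, '%m/%d/%Y %H:%M:%S %p')
# %I reads the hour on a 12-hour clock and combines it with %p
right = datetime.datetime.strptime(evening_str, '%m/%d/%Y %I:%M:%S %p')

print('with %H:', wrong)
print('with %I:', right)
print('silent error of', right - wrong)
```

no exception is raised in either case, which is what makes this kind of mistake easy to miss: morning timestamps parse identically under both descriptors.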

In [24]:
def format_iso_week(datetime_obj):
    # isocalendar() returns (iso year, iso week number, iso weekday)
    iso_year, iso_week, _ = datetime_obj.isocalendar()
    # zero-pad the week number so that e.g. week 5 becomes '2018-W05', as iso 8601 writes it
    return '{}-W{:02d}'.format(iso_year, iso_week)

In [25]:
# test it
format_iso_week(a_datetime)

'2018-W42'

In [26]:
rodent_df['inspection_datetime']   = rodent_df.apply(lambda row: datetime.datetime.strptime(row['INSPECTION_DATE'], format_descriptor), axis=1)
rodent_df['approval_datetime']     = rodent_df.apply(lambda row: datetime.datetime.strptime(row['APPROVED_DATE'],   format_descriptor), axis=1)
rodent_df['wait_time_to_approval'] = rodent_df.apply(lambda row: (row['approval_datetime'] - row['inspection_datetime']).total_seconds(),axis=1)
rodent_df['inspection_month']      = rodent_df.apply(lambda row: row['inspection_datetime'].strftime("%B"), axis=1)
rodent_df['isoweek']               = rodent_df.apply(lambda row: format_iso_week(row['inspection_datetime']), axis=1)
rodent_df['inspection_weekday']    = rodent_df.apply(lambda row: row['inspection_datetime'].strftime("%A"), axis=1)
rodent_df.head()

Unnamed: 0,INSPECTION_TYPE,JOB_TICKET_OR_WORK_ORDER_ID,JOB_ID,JOB_PROGRESS,BBL,BORO_CODE,BLOCK,LOT,HOUSE_NUMBER,STREET_NAME,...,INSPECTION_DATE,RESULT,APPROVED_DATE,LOCATION,inspection_datetime,approval_datetime,wait_time_to_approval,inspection_month,isoweek,inspection_weekday
0,BAIT,1,PO12965,3,1011470035,1,1147,35,104,WEST 76 STREET,...,10/14/2009 12:00:27 PM,Bait applied,10/14/2009 03:01:46 PM,"(40.7802039792471, -73.9774144709456)",2009-10-14 12:00:27,2009-10-14 15:01:46,10879.0,October,2009-W42,Wednesday
1,BAIT,2,PO12966,3,1011470034,1,1147,34,102,WEST 76 STREET,...,10/14/2009 12:51:21 PM,Bait applied,10/14/2009 03:02:30 PM,"(40.7801875030438, -73.977374757787)",2009-10-14 12:51:21,2009-10-14 15:02:30,7869.0,October,2009-W42,Wednesday
2,BAIT,30,PO16966,3,2043370027,2,4337,27,620,THWAITES PLACE,...,11/09/2009 12:59:55 PM,Bait applied,11/10/2009 02:54:52 PM,"(40.8588765781972, -73.8703636422023)",2009-11-09 12:59:55,2009-11-10 14:54:52,93297.0,November,2009-W46,Monday
3,BAIT,31,PO13665,3,2037670077,2,3767,77,1227,WHITEPLAINS ROAD,...,11/09/2009 11:10:16 AM,Bait applied,11/10/2009 02:56:42 PM,"(40.8313209626148, -73.861994089899)",2009-11-09 11:10:16,2009-11-10 14:56:42,99986.0,November,2009-W46,Monday
4,BAIT,38,PO11291,3,1011690057,1,1169,57,2199,BROADWAY,...,11/10/2009 08:40:42 AM,Bait applied,11/17/2009 11:39:11 AM,"(40.7830590725833, -73.9805333640688)",2009-11-10 08:40:42,2009-11-17 11:39:11,615509.0,November,2009-W46,Tuesday
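
the row-by-row `apply` calls above work, but pandas can also do the same conversions on whole columns at once. below is a hedged sketch of that alternative, not a replacement for the solution above. it assumes a reasonably recent pandas (the `.dt.isocalendar()` accessor needs version 1.1 or later) and uses a tiny hand-made data frame, `sample_df`, with two rows copied from the output above instead of the real file:

```python
import pandas as pd

# two rows copied from the sample output above, standing in for the full file
sample_df = pd.DataFrame({
    'INSPECTION_DATE': ['10/14/2009 12:00:27 PM', '11/10/2009 08:40:42 AM'],
    'APPROVED_DATE':   ['10/14/2009 03:01:46 PM', '11/17/2009 11:39:11 AM'],
})

format_descriptor = '%m/%d/%Y %I:%M:%S %p'
inspected = pd.to_datetime(sample_df['INSPECTION_DATE'], format=format_descriptor)
approved = pd.to_datetime(sample_df['APPROVED_DATE'], format=format_descriptor)

# the .dt accessor applies datetime operations to the whole column at once
sample_df['wait_time_to_approval'] = (approved - inspected).dt.total_seconds()
sample_df['inspection_month'] = inspected.dt.month_name()
sample_df['inspection_weekday'] = inspected.dt.day_name()
sample_df['iso_week_number'] = inspected.dt.isocalendar().week  # pandas 1.1+

print(sample_df)
```

passing an explicit `format` keeps the parsing unambiguous, and on large files the column-wise version is usually much faster than calling `strptime()` once per row.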


# bonus:
a standard `timestamp` (often called unix time or epoch time) is simply the number of seconds, possibly fractional, elapsed since the unix epoch, 1970-01-01 00:00:00.000000 UTC.

In [27]:
import time
# get current time stamp
current_timestamp = time.time()
print(type(current_timestamp))
print(current_timestamp)

<class 'float'>
1539595947.1139998


we can always convert such a number back to a datetime object using the `fromtimestamp()` function, which returns the local time unless a timezone is given

In [28]:
print('converted to regular timestamp:', datetime.datetime.fromtimestamp(current_timestamp))

converted to regular timestamp: 2018-10-15 10:32:27.114000


`timestamp`s are useful because they make it easy to do direct, unambiguous manipulation:

In [29]:
another_timestamp = current_timestamp + 3*24*60*60 # add three days
print('still just a floating point number:', another_timestamp)
print('converted to regular timestamp:', datetime.datetime.fromtimestamp(another_timestamp))

still just a floating point number: 1539855147.1139998
converted to regular timestamp: 2018-10-18 10:32:27.114000
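
one caveat: without a timezone argument, `fromtimestamp()` converts to the local time of whatever machine runs the code, so the printed result differs from computer to computer. a minimal sketch of the timezone-aware variant follows. the value of `example_timestamp` is a fixed number picked for reproducibility (an assumption, not the exact value printed above):

```python
import datetime

# a fixed example timestamp, chosen so the result is reproducible
example_timestamp = 1539595947.0

# passing tz gives a timezone-aware datetime that reads the same on any machine
utc_moment = datetime.datetime.fromtimestamp(example_timestamp, tz=datetime.timezone.utc)
print('as UTC:', utc_moment)

# .timestamp() converts an aware datetime back to seconds since the epoch
round_trip = utc_moment.timestamp()
print('round trip:', round_trip)
```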
