# Problem #1
Date Calculator
You have joined a science project where a series of experiments are run for which you
need to calculate the number of full days elapsed in between two events.
The first and the last day are considered partial days and never counted. Following this
logic, the distance between two related events on 03/08/2018 and 04/08/2018 is 0,
since there are no fully elapsed days contained in between those, and 01/01/2000 to
03/01/2000 should return 1.
The solution needs to cater for all valid dates between 01/01/1901 and 31/12/2999.
Test cases
1) 02/06/1983 - 22/06/1983 = 19 days
2) 04/07/1984 - 25/12/1984 = 173 days
3) 03/01/1989 - 03/08/1983 = 1979 days
(Please note these dates are formatted DD/MM/YYYY)
Instructions
 - Write a command-line based program that accepts date input from the console.
 - You should not use any existing date libraries for your implementation.
 - You may however use date libraries to test your solution (we encourage it!)
 - Consider other potential input sources & how your app might fit into a bigger system
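
Since date libraries are allowed for testing, the expected values in the test cases can be cross-checked with the standard library. The following is a minimal sketch, assuming Python's `datetime` module is used only as a reference and never inside the solution itself. `parse_dmy` and `full_days_between` are hypothetical helper names, not part of the task.

```python
from datetime import date

def parse_dmy(text):
    """Turn 'DD/MM/YYYY' into a datetime.date (reference use only)."""
    day, month, year = (int(part) for part in text.split('/'))
    return date(year, month, day)

def full_days_between(first, second):
    """Full days strictly between two dates, accepting either order."""
    gap = abs((parse_dmy(second) - parse_dmy(first)).days)
    # the first and last day are partial, so one day is dropped; equal dates give 0
    return max(gap - 1, 0)

for first, second in [('02/06/1983', '22/06/1983'),
                      ('04/07/1984', '25/12/1984'),
                      ('03/01/1989', '03/08/1983')]:
    print(first, '-', second, '=', full_days_between(first, second), 'days')
```

Taking the absolute difference lets the third test case, whose dates are given in reverse order, be handled without special treatment.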

In [13]:
import re

Testing function to check that the supplied values are within the correct range

In [14]:
def check_bounds(testval, valrange):
    '''
    Test whether the value falls within an acceptable range
    Args:
        testval (int): the value to be tested
        valrange (tuple): inclusive range the value can take, in the form (min, max), i.e. min <= x <= max
    Returns:
        boolean: indicates whether the testval is within the range specified
    Examples:
        >>> check_bounds(15, (10, 20))
        True
    '''
    range_min, range_max = valrange
    return range_min <= testval <= range_max

Function used to get the last day of a given month (the initial approach used a dictionary)

In [15]:
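def get_max_month_day(month, year):
    '''
    Given a month and year combo return the last day of that month, for validation
    '''
    if month == 2:
        # Gregorian rule: a century year is a leap year only when divisible by 400,
        # so 2000 is a leap year but 2100 is not
        if (year % 4 == 0 and year % 100 != 0) or (year % 400 == 0):
            return(29)
        else:
            return(28)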
    elif (month in [4, 6, 9, 11]) :
        return(30)
    else: 
        return(31)
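
The supported range, 1901 to 2999, contains several century years, and under the Gregorian calendar a century year is a leap year only when it is divisible by 400. Below is a small sketch of that rule, checked against the standard library's `calendar.isleap` (used here for testing only). `is_leap_year` is a hypothetical helper name.

```python
import calendar

def is_leap_year(year):
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0

# years in the supported range where a plain year % 4 test disagrees with the rule
exceptions = [y for y in range(1901, 3000) if y % 4 == 0 and not is_leap_year(y)]
print(exceptions)

# cross-check the rule against the standard library over the whole range
assert all(is_leap_year(y) == calendar.isleap(y) for y in range(1901, 3000))
```

Seven years in the range are affected: 2100, 2200, 2300, 2500, 2600, 2700 and 2900.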

    

Function to parse a date string, validate it, and return its components as a list of integers

In [16]:
def process_dates(date_string, pattern=r'(\d{2})/(\d{2})/(\d{4})', US=False):
    r'''
    Args:
        date_string (string): string representation of date e.g. '10/11/2020'
        pattern (string): regex pattern with three groups, defaults to r'(\d{2})/(\d{2})/(\d{4})'
        US (boolean): flag indicating whether the date is in US order (MM/DD/YYYY)
    Returns:
        list : list of integers [day, month, year]
    Examples:
        >>> process_dates('10/11/2020')
        [10, 11, 2020]
        >>> process_dates('10/11/2020', US=True)
        [11, 10, 2020]
    '''
    # the whole string must match, so '103/01/19891' is rejected rather than partly matched
    date_matched = re.fullmatch(pattern, date_string.strip())
    if date_matched is None:
        raise ValueError('could not parse date: ' + date_string)
    first, second, year = (int(part) for part in date_matched.groups())
    # match objects are read-only, so the swap for US dates is done on the parsed integers
    day, month = (second, first) if US else (first, second)
    # validate year and month before the day, whose upper bound depends on both
    bounds = [(year, (1901, 2999)), (month, (1, 12)), (day, (1, get_max_month_day(month, year)))]
    for value, valrange in bounds:
        if not check_bounds(value, valrange):
            raise ValueError('date component out of range: ' + str(value) + ' in ' + date_string)
    return [day, month, year]


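Subtracts two dates given as [day, month, year] lists.

The approach sums the days in every month of every year from the start year to the end year, then subtracts the days up to the start date in the first year and the days after the end date in the last year.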

In [17]:
def date_difference(date1, date2, enddate=False):
    '''
    Gets the number of full days between two dates, given in either order
    Args:
        date1 (list): date broken up into integer dmy components [d,m,y]
        date2 (list): date broken up into integer dmy components [d,m,y]
        enddate (Boolean): if True the end date is counted as well, which adds 1 day to the result
    Returns:
        Integer difference in days (-1 when both dates are equal and enddate is False)
    '''
    # order the dates so the calculation always runs from the earlier to the later one
    start, end = sorted([date1, date2], key=lambda d: (int(d[2]), int(d[1]), int(d[0])))
    startday, startmonth, startyear = (int(v) for v in start)
    endday, endmonth, endyear = (int(v) for v in end)

    # total days in every year from the start year to the end year inclusive
    days_elapsed = 0
    for y in range(startyear, endyear + 1):
        for m in range(1, 13):
            days_elapsed += get_max_month_day(m, y)

    # days from the start of the start year up to and including the start date
    time_since_start = startday
    for m in range(1, startmonth):
        time_since_start += get_max_month_day(m, startyear)

    # days remaining in the end year after the end date
    end_date_extra = -endday
    for m in range(endmonth, 13):
        end_date_extra += get_max_month_day(m, endyear)

    # remove the days that fall outside the interval
    days_elapsed = days_elapsed - (end_date_extra + time_since_start)
    if (enddate):
        return(days_elapsed)
    else:
        return(days_elapsed-1 )
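
The year-by-year loop does work proportional to the number of years spanned. An alternative, shown here only as a sketch and not the approach used in this notebook, converts each date to a day number using a closed-form count of leap years and then subtracts. `day_number` and `full_days` are hypothetical names, and the `datetime` import is there only to cross-check the result.

```python
from datetime import date  # used only to cross-check the sketch

# days before the first of each month in a non-leap year (index 0 = January)
DAYS_BEFORE_MONTH = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]

def is_leap_year(year):
    return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0

def day_number(day, month, year):
    """Day count of a Gregorian date, with 01/01/0001 as day 1."""
    prior = year - 1
    # 365 days per earlier year plus one extra day per earlier leap year
    days = 365 * prior + prior // 4 - prior // 100 + prior // 400
    days += DAYS_BEFORE_MONTH[month - 1]
    if month > 2 and is_leap_year(year):
        days += 1  # 29 February has already passed in this year
    return days + day

def full_days(date1, date2):
    """Full days strictly between two [d, m, y] dates, in either order."""
    gap = abs(day_number(*date2) - day_number(*date1))
    return max(gap - 1, 0)

print(full_days([3, 8, 1983], [3, 1, 1989]))  # prints 1979

# the numbering agrees with datetime's proleptic Gregorian ordinal
assert day_number(31, 12, 2999) == date(2999, 12, 31).toordinal()
```

The trade-off is a little more arithmetic to reason about in exchange for constant work per query, which matters mainly if many date pairs are processed in bulk.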



In [18]:
# 04/07/1984 - 25/12/1984 = 173 days
# 02/06/1983 - 22/06/1983 = 19 days
# 03/01/1989 - 03/08/1983 = 1979 days
date1 = process_dates('03/01/1989', r'(\d{2})/(\d{2})/(\d{4})')
date2 = process_dates('03/08/1983', r'(\d{2})/(\d{2})/(\d{4})')

# date2 is the earlier date, so it is passed first
date_difference(date2, date1)

1979

In [19]:
date2 = process_dates('03/01/1989', r'(\d{2})/(\d{2})/(\d{4})')
date1 = process_dates('04/01/1989', r'(\d{2})/(\d{2})/(\d{4})')

date_difference(date2, date1)

0

Handler that creates the printout

In [20]:
def date_sub_handler(date1, date2):
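    '''
    Formats the number of full days between two dates for printing
    Args:
        date1 (list): end date as integer components [d, m, y]
        date2 (list): start date as integer components [d, m, y]
    Returns:
        string such as '19 days'
    '''
    date_diff = date_difference(date2, date1)
    # identical dates come back as -1, which is reported as 0 full days
    if (date_diff == -1):
        date_diff = 0
    message = str(date_diff) +' days'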
    return(message)
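
The instructions ask for console input and for some thought about other input sources. One way to keep the two concerns separate, sketched here under assumptions, is to have the program consume any iterable of text lines (`sys.stdin`, an open file, a list in a test) and delegate the calculation to a callable. The line format `DD/MM/YYYY - DD/MM/YYYY`, the function `answer_lines` and the `stub_handler` stand-in are all hypothetical. In this notebook the callable would wrap `process_dates` and `date_sub_handler`.

```python
import re

# one query per line, e.g. '02/06/1983 - 22/06/1983' (assumed input format)
LINE_PATTERN = re.compile(r'(\d{2}/\d{2}/\d{4})\s*-\s*(\d{2}/\d{2}/\d{4})')

def answer_lines(lines, handler):
    """Yield one output line per non-blank input line.

    lines: any iterable of strings, such as sys.stdin or an open file
    handler: callable taking two date strings and returning e.g. '19 days'
    """
    for line in lines:
        text = line.strip()
        if not text:
            continue
        matched = LINE_PATTERN.fullmatch(text)
        if matched is None:
            yield 'could not parse line: ' + text
            continue
        first, second = matched.groups()
        try:
            yield first + ' - ' + second + ' = ' + handler(first, second)
        except Exception as err:  # e.g. a date component out of range
            yield 'error: ' + str(err)

def stub_handler(first, second):
    """Stand-in for the real calculation, for demonstration only."""
    return '? days'

for answer in answer_lines(['02/06/1983 - 22/06/1983\n', 'not a date\n'], stub_handler):
    print(answer)
```

Reading from the console then amounts to passing `sys.stdin` as `lines`, and the same function can serve a file or a queue of requests in a larger system without changes to the date logic.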

In [22]:
%%timeit
date2 = process_dates('04/01/1989', r'(\d{2})/(\d{2})/(\d{4})')
date1 = process_dates('04/01/1989', r'(\d{2})/(\d{2})/(\d{4})')
date_sub_handler(date1, date2)

17.8 µs ± 437 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [1]:
import pandas as pd

In [7]:
%%timeit
pd.to_datetime('04/01/1989') - pd.to_datetime('04/01/1989')

544 µs ± 29.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
