## Part 1: The Doomsday Algorithm

The Doomsday algorithm, devised by mathematician J. H. Conway, computes the day of the week any given date fell on. The algorithm is designed to be simple enough to memorize and use for mental calculation.

__Example.__ With the algorithm, we can compute that July 4, 1776 (the day the United States declared independence from Great Britain) was a Thursday.

The algorithm is based on the fact that for any year, several dates always fall on the same day of the week, called the <em style="color:#F00">doomsday</em> for the year. These dates include 4/4, 6/6, 8/8, 10/10, and 12/12.

__Example.__ The doomsday for 2016 is Monday, so in 2016 the dates above all fell on Mondays. The doomsday for 2017 is Tuesday, so in 2017 the dates above will all fall on Tuesdays.

The doomsday algorithm has three major steps:

1. Compute the anchor day for the target century.
2. Compute the doomsday for the target year based on the anchor day.
3. Determine the day of week for the target date by counting the number of days to the nearest doomsday.

Each step is explained in detail below.

### The Anchor Day

The doomsday for the first year in a century is called the <em style="color:#F00">anchor day</em> for that century. The anchor day is needed to compute the doomsday for any other year in that century. The anchor day for a century $c$ can be computed with the formula:
$$
a = \bigl( 5 (c \bmod 4) + 2 \bigr) \bmod 7
$$
The result $a$ corresponds to a day of the week, starting with $0$ for Sunday and ending with $6$ for Saturday.

__Note.__ The modulo operation $(x \bmod y)$ finds the remainder after dividing $x$ by $y$. For instance, $12 \bmod 3 = 0$ since the remainder after dividing $12$ by $3$ is $0$. Similarly, $11 \bmod 7 = 4$, since the remainder after dividing $11$ by $7$ is $4$.

__Example.__ Suppose the target year is 1954, so the century is $c = 19$. Plugging this into the formula gives
$$a = \bigl( 5 (19 \bmod 4) + 2 \bigr) \bmod 7 = \bigl( 5(3) + 2 \bigr) \bmod 7 = 3.$$
In other words, the anchor day for 1900-1999 is Wednesday, which is also the doomsday for 1900.

__Exercise 1.1.__ Write a function that accepts a year as input and computes the anchor day for that year's century. The modulo operator `%` and functions in the `math` module may be useful. Document your function with a docstring and test your function for a few different years.  Do this in a new cell below this one.

In [92]:
# this function takes in the year 
# converts the year into a string 
# take the century of the year 
# converts the strin back into an integer
# and returns the anchor day 
def anchor_day(year):
    c = str(year)
    c = c[:2]
    c = int(c)
    return (5*(c%4)+2)%7

testing if anchor_day function works 

In [93]:
anchor_day(1954)

3

### The Doomsday

Once the anchor day is known, let $y$ be the last two digits of the target year. Then the doomsday for the target year can be computed with the formula:
$$d = \left(y + \left\lfloor\frac{y}{4}\right\rfloor + a\right) \bmod 7$$
The result $d$ corresponds to a day of the week.

__Note.__ The floor operation $\lfloor x \rfloor$ rounds $x$ down to the nearest integer. For instance, $\lfloor 3.1 \rfloor = 3$ and $\lfloor 3.8 \rfloor = 3$.

__Example.__ Again suppose the target year is 1954. Then the anchor day is $a = 3$, and $y = 54$, so the formula gives
$$
d = \left(54 + \left\lfloor\frac{54}{4}\right\rfloor + 3\right) \bmod 7 = (54 + 13 + 3) \bmod 7 = 0.
$$
Thus the doomsday for 1954 is Sunday.

__Exercise 1.2.__ Write a function that accepts a year as input and computes the doomsday for that year. Your function may need to call the function you wrote in exercise 1.1. Make sure to document and test your function.

In [94]:
# this function takes year as an input 
# gets the last two digits 
# calls the function anchor_day and returns the doomsday
import math
def doomsday(year):
    y = year % 100
    a = anchor_day(year)
    return int((y + math.floor(y/4) + a) % 7)
    

In [181]:
doomsday(1978)

2

### The Day of Week

The final step in the Doomsday algorithm is to count the number of days between the target date and a nearby doomsday, modulo 7. This gives the day of the week.

Every month has at least one doomsday:
* (regular years) 1/10, 2/28
* (leap years) 1/11, 2/29
* 3/21, 4/4, 5/9, 6/6, 7/11, 8/8, 9/5, 10/10, 11/7, 12/12

__Example.__ Suppose we want to find the day of the week for 7/21/1954. The doomsday for 1954 is Sunday, and a nearby doomsday is 7/11. There are 10 days in July between 7/11 and 7/21. Since $10 \bmod 7 = 3$, the date 7/21/1954 falls 3 days after a Sunday, on a Wednesday.

__Exercise 1.3.__ Write a function to determine the day of the week for a given day, month, and year. Be careful of leap years! Your function should return a string such as "Thursday" rather than a number. As usual, document and test your code.

In [96]:
# this function returns whether the year is a leap year or not
# 1 means leap year
# 0 means not a leap year
def leap_year(year):
    if (year % 400 == 0) or (year % 4 == 0 and year % 100 != 0):
        return 1
    else:
        return 0
     

In [182]:
leap_year(1978)

0

In [319]:
# this function takes in the input of month, date, and year
# and returns the day of the week the date is 

def day_of_week(month, date, year):
    
    # list out the doomsday for both leap year and non-leap year 
    leap_days = [11, 29, 21, 4, 9, 6, 11, 8, 5, 10, 7, 12]
    non_leap_days = [10, 28, 21, 4, 9, 6, 11, 8, 5, 10, 7, 12]
    
    # list out the days of the week as a string 
    word_of_days = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
    
    # check if the year is a leap year or not
    leapyear = leap_year(year)
    
    # get the doomsday for the year
    dooms_day = doomsday(year)
    #print dooms_day
    
    # if it is a leap year, use the leap year list
    if leapyear == 1:
        dday = leap_days[month - 1]
        # if the date is the same as the doomsday of that month, return the day for the doomsday of that month 
        if date == dday:
            day_word = word_of_days[dooms_day]
            return day_word
        # if the date is greater than the doomsday of that month, date - doomsday of that month
        if date > dday:
            day = date - dday
            day = day % 7
            day = day + dooms_day
            # if the day is greater than 6, get the remainder to get the day of the week
            if day > 6:
                day = day % 7
                day_word = word_of_days[day]
                return day_word
            else:
                day_word = word_of_days[day]
                return day_word
        # if the date is less than the doomsday of that month, doomsday of that month - date
        if date < dday:
            day = dday - date
            day = day % 7 
            day = dooms_day - day
            day_word = word_of_days[day]
            return day_word
    
    # if it is not a leap year, use the non-leap year list
    if leapyear == 0:
        dday = non_leap_days[month - 1]
        if date == dday:
            day_word = word_of_days[dooms_day]
            return day_word
        if date > dday:
            day = date - dday
            day = day % 7
            day = day + dooms_day
            if day > 6:
                day = day % 7
                day_word = word_of_days[day]
                return day_word
            else:
                day_word = word_of_days[day]
                return day_word
        if date < dday:
            day = dday - date
            day = day % 7 
            day = dooms_day - day
            day_word = word_of_days[day]
            return day_word

# test today's date         
day_of_week(1,21,2017)    

'Saturday'

__Exercise 1.4.__ How many times did Friday the 13th occur in the years 1900-1999? Does this number seem to be similar to other centuries?

In [326]:
# this function takes in two years
# and returns how many Friday the 13th appeared between those two years 

def num_13_friday(year_start, year_end):
    # set the initial count to zero and date to 13
    i = 0 
    date = 13
    # go through the beginning year to the end year
    for year in xrange(year_start, year_end + 1,1):
        # go through each month
        for month in range(1,13):
            # check if the day of the week is equal to Friday
            if day_of_week(month, date, year) == 'Friday':
                # if it is, add it to the initial count
                i = i + 1
                
    return i 

num_13_friday(1900,1999)

172

In [347]:
print num_13_friday(1800, 1899)
print num_13_friday(1700, 1799)
print num_13_friday(1600, 1699)

172
172
172


There were 172 Friday the 13th between the years 1900 and 1999. This seems to be similar in other centuries.

__Exercise 1.5.__ How many times did Friday the 13th occur between the year 2000 and today?

In [328]:
# this function takes in today's date 
# and returns how many Friday the 13th occurred
# from 2000 to today's date 

def num_of_friday_2(today_month, today_date, today_year):
    i = 0
    date = 13
    # get the number of Friday the 13 before the year of input
    for year in range(2000, today_year):
        for month in xrange(1,13,1):
            if day_of_week(month, date, year) == 'Friday':
                i = i + 1
    # get the number of Friday the 13 for the year of input
    
    # if the date is less than 13, then check the previous month that was inputted 
    if today_date < 13:
        for month in xrange(1, today_month, 1):
            if day_of_week(month, 13, today_year) == 'Friday':
                i = i + 1 
    # if the date is greater than 13, then check the month that was inputed 
    if today_date >= 13:
        for month in xrange(1, today_month + 1, 1):
            if day_of_week(month, 13, today_year) == 'Friday':
                i = i + 1
    return i

num_of_friday_2(1,21,2017)

30

Friday the 13th occured 30 times between the year 2000 and today. 

## Part 2: 1978 Birthdays

__Exercise 2.1.__ The file `birthdays.txt` contains the number of births in the United States for each day in 1978. Inspect the file to determine the format. Note that columns are separated by the tab character, which can be entered in Python as `\t`. Write a function that uses iterators and list comprehensions with the string methods `split()` and `strip()` to  convert each line of data to the list format

```Python
[month, day, year, count]
```
The elements of this list should be integers, not strings. The function `read_birthdays` provided below will help you load the file.

In [102]:
def read_birthdays(file_path):
    """Read the contents of the birthdays file into a string.
    
    Arguments:
        file_path (string): The path to the birthdays file.
        
    Returns:
        string: The contents of the birthdays file.
    """
    with open(file_path) as file:
        return file.read()

In [330]:
# read in the birthday.txt
bday = read_birthdays('birthdays.txt')
# get rid of all beginning and the end of the text
bday = bday.strip()
# split by the \n\n 
bday = bday.split('\n\n')[2]
# split it again by the \n
bday = bday.split('\n')


In [331]:
# set up empty dictionarys for the month, day, year, count, and the final format of the birthdays
month = {}
day = {}
year = {}
count = {}
final_format = {}

# go through the list of all the birthdays
for i in range(0, len(bday)):
    # grab the month, day, and year + count from the list splitting on /
    month[i], day[i], year[i] = bday[i].split('/')
    # get the actual year and count from the year + count splitting on \t  
    year[i], count[i] = year[i].split('\t')
    # convert the format to integers
    final_format[i] = [int(month[i]) , int(day[i]) , int(year[i]) , int(count[i])]
final_format

{0: [1, 1, 78, 7701],
 1: [1, 2, 78, 7527],
 2: [1, 3, 78, 8825],
 3: [1, 4, 78, 8859],
 4: [1, 5, 78, 9043],
 5: [1, 6, 78, 9208],
 6: [1, 7, 78, 8084],
 7: [1, 8, 78, 7611],
 8: [1, 9, 78, 9172],
 9: [1, 10, 78, 9089],
 10: [1, 11, 78, 9210],
 11: [1, 12, 78, 9259],
 12: [1, 13, 78, 9138],
 13: [1, 14, 78, 8299],
 14: [1, 15, 78, 7771],
 15: [1, 16, 78, 9458],
 16: [1, 17, 78, 9339],
 17: [1, 18, 78, 9120],
 18: [1, 19, 78, 9226],
 19: [1, 20, 78, 9305],
 20: [1, 21, 78, 7954],
 21: [1, 22, 78, 7560],
 22: [1, 23, 78, 9252],
 23: [1, 24, 78, 9416],
 24: [1, 25, 78, 9090],
 25: [1, 26, 78, 9387],
 26: [1, 27, 78, 8983],
 27: [1, 28, 78, 7946],
 28: [1, 29, 78, 7527],
 29: [1, 30, 78, 9184],
 30: [1, 31, 78, 9152],
 31: [2, 1, 78, 9159],
 32: [2, 2, 78, 9218],
 33: [2, 3, 78, 9167],
 34: [2, 4, 78, 8065],
 35: [2, 5, 78, 7804],
 36: [2, 6, 78, 9225],
 37: [2, 7, 78, 9328],
 38: [2, 8, 78, 9139],
 39: [2, 9, 78, 9247],
 40: [2, 10, 78, 9527],
 41: [2, 11, 78, 8144],
 42: [2, 12, 78, 795

In [333]:
# check the total number of counts of births to double check my work later
total_count = 0
for i in range(0, len(final_format)):
    element = final_format[i]
    counts = element[3]
    total_count = total_count + counts
total_count    

3333239

__Exercise 2.2.__ Which month had the most births in 1978? Which day of the week had the most births? Which day of the week had the fewest? What conclusions can you draw? You may find the `Counter` class in the `collections` module useful.

In [340]:
tmp_num_births = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
months = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10 , 11, 12]
count = 0

# loop through the list of all the birthdays
for i in range(0, len(final_format)):
    # store the ith element of list
    ith_element = final_format[i]
    # store the month and count of the ith element
    the_month = ith_element[0]
    the_count = ith_element[3]
    #if the_month != months[the_month -1]:
        #count = 0 
    # if the month is the same as in the list of months
    if the_month == months[the_month - 1]:
        # for the same month, add the counts up
        count = count + the_count
        # store the counts in a list
        tmp_num_births[the_month - 1] = count

# need to subtract off the previous element of tmp_num_births to get the right num_births for each month       
num_births = [tmp_num_births[0], 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]        
for i in range(1, len(tmp_num_births)):
    num_births[i] = tmp_num_births[i] - tmp_num_births[i-1]
print num_births

# check if the total number in num_births equals the toal number of counts
#print sum(num_births)

[270695, 249875, 276584, 254577, 270812, 270756, 294701, 302795, 293891, 288955, 274671, 284927]


In [341]:
# get the month with the most birthdays 
max_birth = max(num_births)
for i in range(0,12):
    if num_births[i] == max_birth:
        max_birth_month = i + 1
max_birth_month

8

August has the most number of births in 1978. 

In [342]:
# find which day has the most and least number of births

# have a list with all the days of the week 
list_day = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday' ,'Friday', 'Saturday']
# set the count to be equal to zero
count2 = 0
# set a list of just zeros
list_of_days = [0, 0, 0, 0, 0, 0, 0]

# go through the list of birthdays
for i in range(0, len(final_format)):
    # grab the month, date, and count from each element
    ith_element = final_format[i]
    the_month = ith_element[0]
    the_date = ith_element[1]
    the_count = ith_element[3]
    # call the previous function to get the day of the week of 1978
    the_day_of_the_week = day_of_week(the_month, the_date, 1978)
    # go through each day of the week 
    for n in range(0,7):
        # if the day of the week matches the day of the week in the list, add the counts together
        if the_day_of_the_week == list_day[n]:
            count2 = list_of_zeros[n]
            count2 = count2 + the_count
            list_of_days[n] = count2
print list_of_days

# find the day with the most birthdays
max_day_birthdays = max(list_of_days)    
#print max_day_birthdays

# get the index of the max day of births
for i in range(0,7):
    if list_of_days[i] == max_day_birthdays:
        max_day = i
print max_day

min_day_birthdays = min(list_of_days)
#print min_day_birthdays

# get the index of the least day of births 
for i in range(0,7):
    if list_of_days[i] == min_day_birthdays:
        min_day = i
print min_day



[429428, 495155, 513760, 503804, 503326, 510942, 440559]
2
0


Tuesday had the most births and Sunday had the least amount of births. 

Because August had the highest number of births, I can conclude that the holiday season in December has a large effect on having children. 

__Exercise 2.3.__ What would be an effective way to present the information in exercise 2.2? You don't need to write any code for this exercise, just discuss what you would do.

An effective way to present the information above is to print out sentences that answers the question. 