## Can You Figure Out When Denise’s Birthday Is?

From http://thescienceexplorer.com/technology/can-you-figure-out-when-denise-s-birthday

In [1]:
import pandas as pd

In [2]:
birthdays = ['17 Feb 2001', '16 Mar 2002', '13 Jan 2003', '19 Jan 2004', '13 Mar 2001', '15 Apr 2002', '16 Feb 2003', '18 Feb 2004', '13 Apr 2001', '14 May 2002', '14 Mar 2003', '19 May 2004', '15 May 2001', '12 Jun 2002', '11 Apr 2003', '14 Jul 2004', '17 Jun 2001', '16 Aug 2002', '16 Jul 2003', '18 Aug 2004']

In [3]:
# Slice each 'DD Mon YYYY' string into its parts; day is stored as an integer
# so the crosstab further down can aggregate it numerically
data = {'month': [b[3:6] for b in birthdays],
        'day': [int(b[0:2]) for b in birthdays],
        'year': [b[7:] for b in birthdays]}

In [4]:
months_sorted = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

In [5]:
df = pd.DataFrame(data)

In [6]:
df['month'] = pd.Categorical(df['month'], months_sorted)

In [7]:
def reset_dataframe(dataframe):
    # Renumber the rows 0..n-1 in place so row labels and positions agree again
    dataframe.reset_index(drop=True, inplace=True)

Let’s get started. We know that Albert knows the month of Denise’s birthday, Bernard knows the day, and Cheryl knows the year. Pay close attention to what each person says.

1st Step - Albert says, “I don’t know her birthday, but I know Bernard doesn’t know.” Of course Albert can’t know, because every month appears more than once, but how can he know that Bernard doesn’t know? You need to count how many times each day number appears.

When you count them up, you will notice that the days 11 and 12 occur only once. If Denise’s birthday fell on either of them, Bernard would know it immediately, so we can remove June 12 and April 11. And since Albert, who knows only the month, is certain that Bernard doesn’t know, the month cannot be June or April either, so we can also get rid of all dates in June and April.
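
The count described above can be cross-checked without pandas. The following is a minimal sketch using `collections.Counter`; the names `candidates`, `unique_days` and `excluded_months` are illustrative and do not appear in the notebook cells.

```python
from collections import Counter

birthdays = [
    '17 Feb 2001', '16 Mar 2002', '13 Jan 2003', '19 Jan 2004',
    '13 Mar 2001', '15 Apr 2002', '16 Feb 2003', '18 Feb 2004',
    '13 Apr 2001', '14 May 2002', '14 Mar 2003', '19 May 2004',
    '15 May 2001', '12 Jun 2002', '11 Apr 2003', '14 Jul 2004',
    '17 Jun 2001', '16 Aug 2002', '16 Jul 2003', '18 Aug 2004',
]

# Split each string into a (day, month, year) tuple of strings
candidates = [tuple(b.split()) for b in birthdays]

# Count how often each day number appears across all candidates
day_counts = Counter(day for day, _, _ in candidates)
unique_days = sorted(day for day, n in day_counts.items() if n == 1)

# Any month that contains a unique day is ruled out by Albert's statement
excluded_months = sorted({month for day, month, _ in candidates
                          if day in unique_days})

print(unique_days)
print(excluded_months)
```

The two printed lists are the days Bernard could have identified on his own and the months Albert therefore cannot be holding.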

In [8]:
table = pd.crosstab(index=df['month'], columns=df['year'], values=df['day'], aggfunc='sum')
table

| month \ year | 2001 | 2002 | 2003 | 2004 |
|--------------|------|------|------|------|
| Jan | NaN | NaN | 13.0 | 19.0 |
| Feb | 17.0 | NaN | 16.0 | 18.0 |
| Mar | 13.0 | 16.0 | 14.0 | NaN |
| Apr | 13.0 | 15.0 | 11.0 | NaN |
| May | 15.0 | 14.0 | NaN | 19.0 |
| Jun | 17.0 | 12.0 | NaN | NaN |
| Jul | NaN | NaN | 16.0 | 14.0 |
| Aug | NaN | 16.0 | NaN | 18.0 |

In [9]:
# Select days that only occur once
mins = df.day.value_counts()[(df.day.value_counts() == 1)].index.values

In [10]:
# Index(es) to drop from previous step
mins_index = df.query("day in @mins").index

In [11]:
# Select month(s) that belong to days that occur only once
# (select the column by name: its position depends on the pandas version)
month_drop = df.loc[mins_index, 'month'].values

In [12]:
# Drop days that occur only once from data
df = df.drop(df.index[mins_index])

In [13]:
# Drop month(s) that belong to days that occur only once
df = df[~df['month'].isin(month_drop)]

In [14]:
# Reset dataframe index
reset_dataframe(df)

2nd Step - Next, Bernard says, “I don’t know her birthday, but I know Cheryl doesn’t know.” Both parts of his sentence give us information. The only way Bernard could know Denise’s birthday is if he has a day that occurs only once among the remaining dates, so we can remove all days that now appear just once: 17 and 15.

In [15]:
# Select days that only occur once
mins = df.day.value_counts()[(df.day.value_counts() == 1)].index.values

In [16]:
# Index(es) to drop from previous step
mins_index = df.query("day in @mins").index

In [17]:
# Drop days that occur only once from data
df = df.drop(df.index[mins_index])

In [18]:
# Reset dataframe index
reset_dataframe(df)

3rd Step - Bernard also says he knows Cheryl doesn’t know. Cheryl could only know the birthday if her year had a single date left, and 2001 is now such a year. The one date left under 2001 is March 13, so if Bernard had been told 13 he could not be sure that Cheryl doesn’t know. Denise’s birthday must therefore not be on the 13th, and we can get rid of any date with 13.
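
As a cross-check on this step, here is a small sketch in plain Python. It assumes the 13 candidates that survive the first two steps, written out by hand as `(day, month, year)` tuples; `remaining`, `lone` and `after_step_3` are illustrative names, not part of the notebook.

```python
from collections import defaultdict

# Candidates left after the first two steps, as (day, month, year);
# this list is typed in by hand for illustration
remaining = [
    (16, 'Mar', 2002), (13, 'Jan', 2003), (19, 'Jan', 2004), (13, 'Mar', 2001),
    (16, 'Feb', 2003), (18, 'Feb', 2004), (14, 'May', 2002), (14, 'Mar', 2003),
    (19, 'May', 2004), (14, 'Jul', 2004), (16, 'Aug', 2002), (16, 'Jul', 2003),
    (18, 'Aug', 2004),
]

# Group the candidates by year
by_year = defaultdict(list)
for day, month, year in remaining:
    by_year[year].append((day, month))

# A year with a single candidate would let Cheryl name the birthday outright
lone = {year: dates[0] for year, dates in by_year.items() if len(dates) == 1}

# Bernard is sure Cheryl does not know, so his day cannot be one of these
ruled_out_days = {day for day, _ in lone.values()}
after_step_3 = [c for c in remaining if c[0] not in ruled_out_days]

print(lone)
print(len(after_step_3))
```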

In [19]:
# Select year(s) with only one date
year_mins = df.year.value_counts()[(df.year.value_counts() <= 1)].index.values

In [20]:
# Find the day(s) that fall under the year(s) from the previous step
repeated_days = df.day[df.query("year in @year_mins").index]

In [21]:
# Find the index of every remaining date that shares one of those days
repeated_days_index = df.query("day in @repeated_days").index

In [22]:
# Drop every date that falls on one of those days
df = df.drop(df.index[repeated_days_index])

In [23]:
# Reset dataframe index
reset_dataframe(df)

4th Step - Now, Cheryl says, “I don’t know her birthday, but I know Albert doesn’t know.” If Cheryl knows that Albert doesn’t know, her year cannot contain a month that occurs only once among the remaining dates, because Albert would know the birthday if he held that month. January now occurs only once (January 19, 2004), so we can get rid of the year 2004.
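
The same elimination can be sketched in plain Python. This assumes the 11 candidates left after the third step, typed in by hand; `remaining`, `lone_months` and `after_step_4` are illustrative names.

```python
from collections import Counter

# Candidates left after the third step, as (day, month, year);
# this list is typed in by hand for illustration
remaining = [
    (16, 'Mar', 2002), (19, 'Jan', 2004), (16, 'Feb', 2003), (18, 'Feb', 2004),
    (14, 'May', 2002), (14, 'Mar', 2003), (19, 'May', 2004), (14, 'Jul', 2004),
    (16, 'Aug', 2002), (16, 'Jul', 2003), (18, 'Aug', 2004),
]

# Months that appear exactly once among the remaining candidates
month_counts = Counter(month for _, month, _ in remaining)
lone_months = {m for m, n in month_counts.items() if n == 1}

# Cheryl is sure Albert does not know, so her year cannot contain such a month
ruled_out_years = {year for _, month, year in remaining if month in lone_months}
after_step_4 = [c for c in remaining if c[2] not in ruled_out_years]

print(lone_months, ruled_out_years)
print(after_step_4)
```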

In [24]:
# Select month that occurs only once in the entire spread
months_mins = df.month.value_counts()[(df.month.value_counts() == 1)].index.values

In [25]:
# Select year(s) linked to the month(s) from the previous step
# (isin avoids comparing the column with an array of a different length)
month_drop = df.loc[df['month'].isin(months_mins), 'year']

In [26]:
# Index(es) to drop from previous step
month_drop_index = df.loc[df['year'] == month_drop.values[0]].index

In [27]:
# Remove all entries related to year '2004'
df = df.drop(df.index[month_drop_index])

In [28]:
# Reset dataframe index
reset_dataframe(df)

5th Step - Albert exclaims, “Now I know her birthday,” but how can he know? If Albert knows, there can only be one occurrence of that month, so we can get rid of March, which occurs twice.

In [29]:
# Select month(s) that occur more than once in the remaining spread
months_mins = df.month.value_counts()[(df.month.value_counts() > 1)].index.values

In [30]:
# Index(es) of the rows that belong to those months
month_drop_index = df.loc[df['month'].isin(months_mins)].index.values

In [31]:
# Remove all entries related to 'March'
df = df.drop(df.index[month_drop_index])

In [32]:
# Reset dataframe index
reset_dataframe(df)

6th Step - Bernard shouts, “I know too.” For this to be true, since Bernard knows the day, it must be a day that appears only once. Luckily for us, the number 16 shows up three times, so we can remove every date with 16, which leaves a single date on the 14th.

In [33]:
# Select days that occur more than once
mins = df.day.value_counts()[(df.day.value_counts() > 1)].index.values

In [34]:
# Index(es) of the rows that fall on any of those days
mins_index = df.loc[df['day'].isin(mins)].index.values

In [35]:
# Drop duplicate days
df = df.drop(df.index[mins_index])

In [36]:
# Reset dataframe index
reset_dataframe(df)

In [37]:
print("Denise's birthday is on {} {}, {}".format(df.month[0], df.day[0], df.year[0]))

Denise's birthday is on May 14, 2002
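
As an independent check on the result, the six elimination steps can be replayed end to end in plain Python. This is a sketch that follows the steps as the walkthrough states them, not a general solver for puzzles of this kind; the helpers `values_with_count` and `drop` are hypothetical names introduced here.

```python
from collections import Counter

birthdays = [
    '17 Feb 2001', '16 Mar 2002', '13 Jan 2003', '19 Jan 2004',
    '13 Mar 2001', '15 Apr 2002', '16 Feb 2003', '18 Feb 2004',
    '13 Apr 2001', '14 May 2002', '14 Mar 2003', '19 May 2004',
    '15 May 2001', '12 Jun 2002', '11 Apr 2003', '14 Jul 2004',
    '17 Jun 2001', '16 Aug 2002', '16 Jul 2003', '18 Aug 2004',
]

DAY, MONTH, YEAR = 0, 1, 2  # positions inside each (day, month, year) tuple
cands = [tuple(b.split()) for b in birthdays]


def values_with_count(cands, field, predicate):
    """Return the values of `field` whose frequency satisfies `predicate`."""
    counts = Counter(c[field] for c in cands)
    return {value for value, n in counts.items() if predicate(n)}


def drop(cands, field, values):
    """Return the candidates whose `field` is not in `values`."""
    return [c for c in cands if c[field] not in values]


# Step 1: months containing a unique day are out (this removes those days too)
unique_days = values_with_count(cands, DAY, lambda n: n == 1)
cands = drop(cands, MONTH, {c[MONTH] for c in cands if c[DAY] in unique_days})

# Step 2: Bernard does not know, so days that are now unique are out
cands = drop(cands, DAY, values_with_count(cands, DAY, lambda n: n == 1))

# Step 3: days that fall in a year with a single date are out
lone_years = values_with_count(cands, YEAR, lambda n: n == 1)
cands = drop(cands, DAY, {c[DAY] for c in cands if c[YEAR] in lone_years})

# Step 4: years containing a month that is now unique are out
lone_months = values_with_count(cands, MONTH, lambda n: n == 1)
cands = drop(cands, YEAR, {c[YEAR] for c in cands if c[MONTH] in lone_months})

# Step 5: Albert now knows, so months that still repeat are out
cands = drop(cands, MONTH, values_with_count(cands, MONTH, lambda n: n > 1))

# Step 6: Bernard now knows, so days that still repeat are out
cands = drop(cands, DAY, values_with_count(cands, DAY, lambda n: n > 1))

print(cands)
```

Working on plain tuples sidesteps the index bookkeeping that the DataFrame version needs after every drop, at the cost of losing the tabular view.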
