### Prepping Data Challenge: Birthday Cakes (Week 2)

The Prep School loves any excuse to buy a cake, and what is better than celebrating one of our students' birthdays? This sounds easy until you realise we have 1,000 students, and what do we do about those whose birthdays fall on a weekend? We can't have them miss out.

#### Requirements:

 - Input the data set
   - Removing any unnecessary fields (parental fields) will make this challenge easier to see what is happening at each step
 - Format the pupil's name in First Name Last Name format (e.g. Carl Allchin)
 - Create the date for the pupil's birthday in calendar year 2022 (not academic year) 
 - Work out what day of the week the pupil's birthday falls on 
   - Remember if the birthday falls on a Saturday or Sunday, we need to change the weekday to Friday
 - Work out what month the pupil's birthday falls within
 - Count how many birthdays there are on each weekday in each month 
 - Output the data

In [1]:
import pandas as pd
import numpy as np

In [2]:
# Input the data set.
df = pd.read_csv('WK2-Input.csv', parse_dates=['Date of Birth'])
df.head()

Unnamed: 0,id,pupil first name,pupil last name,gender,Date of Birth,Parental Contact Name_1,Parental Contact Name_2,Preferred Contact Employer,Parental Contact
0,1,Ronna,Nellies,Female,2013-12-21,Purcell,Ketti,Demizz,1
1,2,Rusty,Andriulis,Male,2012-07-21,Vassili,Rivi,Brainbox,1
2,3,Roberta,Oakeshott,Female,2011-12-04,Lind,Haskell,Centidel,2
3,4,Lola,Rubinfajn,Male,2012-06-29,Elie,Tresa,Edgeblab,2
4,5,Kamila,Benedtti,Female,2012-07-10,Adela,Clevey,Trudoo,1


In [3]:
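# Keep only the pupil fields; .copy() avoids a SettingWithCopyWarning when adding columns later.
df = df[['id','pupil first name','pupil last name','gender','Date of Birth']].copy()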

In [4]:
# Format the pupil's name in First Name Last Name format (e.g. Carl Allchin)
df['Pupil Name'] = df['pupil first name'] + ' ' + df['pupil last name']

In [5]:
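# Create the date for the pupil's birthday in calendar year 2022 (not academic year).
# Note: this builds a string; a 29 February birthday would give a date that does not exist in 2022.
df["This Year's Birthday"] = '2022-' + df['Date of Birth'].dt.strftime('%m-%d')
# Alternative that yields a datetime instead (it also fails for 29 February):
# df["This Year's Birthday"] = df['Date of Birth'].apply(lambda x: x.replace(year=2022))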
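One edge case worth checking: a pupil born on 29 February has no matching date in 2022, so the string built above would not parse in the next step. The following is a minimal sketch of one way to handle it, assuming such pupils are celebrated on 28 February in non-leap years (the challenge does not say). The helper name `birthday_in_year` is hypothetical.

```python
import calendar

import pandas as pd


def birthday_in_year(dob: pd.Series, year: int = 2022) -> pd.Series:
    """Move each date of birth into `year`, returning a datetime Series.

    Assumption: 29 February birthdays fall back to 28 February
    when `year` is not a leap year.
    """
    is_leap_day = (dob.dt.month == 2) & (dob.dt.day == 29)
    if not calendar.isleap(year):
        # Shift leap-day birthdays back one day so the month-day pair exists in `year`.
        dob = dob.mask(is_leap_day, dob - pd.Timedelta(days=1))
    return pd.to_datetime(str(year) + "-" + dob.dt.strftime("%m-%d"), format="%Y-%m-%d")


sample = pd.Series(pd.to_datetime(["2012-02-29", "2013-12-21"]))
print(birthday_in_year(sample))
```

If the input contains no leap-day birthdays, this gives the same dates as the string concatenation, only as datetimes rather than text.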

In [6]:
# Work out what day of the week the pupil's birthday falls on
df['weekday'] = pd.to_datetime(df["This Year's Birthday"]).dt.day_name()

In [7]:
# If the birthday falls on a Saturday or Sunday, the cake is needed on Friday instead
wkends = ['Saturday', 'Sunday']

df['Cake Needed On'] = np.where(df['weekday'].isin(wkends), 'Friday',
                                df['weekday'])
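To sanity-check the weekend rule, it can help to run it on a few hand-picked dates. The sketch below uses `dt.dayofweek` (0 for Monday through 6 for Sunday) instead of a list of day names. This is an equivalent alternative, not the approach used in the cell above, and the sample dates are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# 2022-12-21 is a Wednesday, 2022-12-04 a Sunday, 2022-07-09 a Saturday
dates = pd.Series(pd.to_datetime(["2022-12-21", "2022-12-04", "2022-07-09"]))
weekday = dates.dt.day_name()

# dayofweek >= 5 means Saturday or Sunday, so the cake moves to Friday
cake_day = pd.Series(np.where(dates.dt.dayofweek >= 5, "Friday", weekday))
print(cake_day.tolist())  # ['Wednesday', 'Friday', 'Friday']
```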

In [8]:
#Work out what month the pupil's birthday falls within
df['Month'] = df['Date of Birth'].dt.month_name()

In [9]:
#Count how many birthdays there are on each weekday in each month 
df['BDs per Weekday and Month'] = df.groupby(['Month', 'Cake Needed On'])['id'].transform('size')

<div class="alert alert-block alert-info">
    
<strong>Further Reading on .transform</strong> <br>
https://www.analyticsvidhya.com/blog/2020/03/understanding-transform-function-python/ 
    
</div>
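The difference between `.transform('size')` and a plain `.size()` aggregation is easiest to see on a toy frame. The sketch below uses made-up rows and a hypothetical column name `birthdays`. `transform` keeps one row per pupil and repeats the group count, while `size()` collapses to one row per month and weekday pair.

```python
import pandas as pd

toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "Month": ["July", "July", "July", "June"],
    "Cake Needed On": ["Friday", "Friday", "Thursday", "Wednesday"],
})
keys = ["Month", "Cake Needed On"]

# Same shape as the input: each pupil row carries its group's count
toy["birthdays"] = toy.groupby(keys)["id"].transform("size")

# One row per (Month, Cake Needed On) combination
summary = toy.groupby(keys).size().reset_index(name="birthdays")

print(toy)
print(summary)
```

Which form suits the output depends on whether the result should list every pupil or only the counts.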

In [10]:
df = df[['Pupil Name','Date of Birth',"This Year's Birthday",'Month','Cake Needed On','BDs per Weekday and Month']]

In [11]:
df.head(8)

Unnamed: 0,Pupil Name,Date of Birth,This Year's Birthday,Month,Cake Needed On,BDs per Weekday and Month
0,Ronna Nellies,2013-12-21,2022-12-21,December,Wednesday,9
1,Rusty Andriulis,2012-07-21,2022-07-21,July,Thursday,12
2,Roberta Oakeshott,2011-12-04,2022-12-04,December,Friday,45
3,Lola Rubinfajn,2012-06-29,2022-06-29,June,Wednesday,15
4,Kamila Benedtti,2012-07-10,2022-07-10,July,Friday,49
5,Avery Colebourn,2012-08-30,2022-08-30,August,Tuesday,15
6,Valentino Klimko,2014-12-23,2022-12-23,December,Friday,45
7,Cal Shearwood,2015-01-18,2022-01-18,January,Tuesday,12


In [12]:
# Output the data (date_format only applies to datetime columns, here 'Date of Birth')
df.to_csv('wk2-output.csv', index=False, date_format='%d/%m/%Y')
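One thing to be aware of when writing the file: `date_format` is applied to datetime columns only. Because "This Year's Birthday" was built as a string in In [5], it is written unchanged. The sketch below demonstrates this on a one-row toy frame written to an in-memory buffer.

```python
import io

import pandas as pd

toy = pd.DataFrame({
    "Date of Birth": pd.to_datetime(["2013-12-21"]),  # real datetime column
    "This Year's Birthday": ["2022-12-21"],           # plain string column
})

buffer = io.StringIO()
toy.to_csv(buffer, index=False, date_format="%d/%m/%Y")
csv_text = buffer.getvalue()
print(csv_text)
```

Only the first column comes out as `21/12/2013`. If both columns should share one format, converting the string column with `pd.to_datetime` before writing is one option. Whether that is desired depends on the expected output of the challenge.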