### Prepping Data Challenge: Parental Contact Details (week 1)

This week's challenge is to build contact details for each pupil's parent. We have the pupil's name, although not in the correct format, and we use the pupil's last name as the parent's last name. We also have the employer of the preferred parental contact, so we can form an email address to reach them.

#### Requirements:

 - Input the csv file  
 - Form the pupil's name correctly for the records in the format Last Name, First Name 
 - Form the parental contact's name in the same format as the pupil's 
 - Create the email address to contact the parent using the format Parent First Name.Parent Last Name@Employer.com
 - Create the academic year the pupils are in 
   - Each academic year starts on 1st September.
   - Year 1 is anyone born after 1st Sept 2014 
   - Year 2 is anyone born between 1st Sept 2013 and 31st Aug 2014 etc
 - Remove any unnecessary columns of data 
 - Output the data 
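
The academic-year rule in the list above can be sketched in plain Python before any pandas is involved. This is a minimal illustration, not part of the original notebook: the function name `academic_year` is hypothetical, and treating a pupil born exactly on 1st September 2014 as Year 1 is an assumption, since the requirement only says "after".

```python
from datetime import date


def academic_year(dob: date) -> int:
    """Return the academic year for a date of birth (hypothetical helper)."""
    # Births from 1 September onward belong to the school year ending
    # in the following calendar year.
    end_year = dob.year + (1 if dob.month >= 9 else 0)
    # The school year ending in 2015 (born 1 Sept 2014 to 31 Aug 2015) is Year 1;
    # each earlier school year adds one. Anyone born later is also treated as Year 1.
    return max(1, 2015 - end_year + 1)


print(academic_year(date(2013, 12, 21)))  # 2
print(academic_year(date(2012, 7, 21)))   # 4
```

Working in "school year end" terms keeps the 1st September cut-off in a single place, which makes the boundary cases easy to check by hand.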

In [1]:
import pandas as pd
import numpy as np

In [2]:
# Input the csv file.
df = pd.read_csv('WK1-Input.csv', parse_dates=['Date of Birth'])
df.head()

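|   | id | pupil first name | pupil last name | gender | Date of Birth | Parental Contact Name_1 | Parental Contact Name_2 | Preferred Contact Employer | Parental Contact |
|---|----|------------------|-----------------|--------|---------------|-------------------------|-------------------------|----------------------------|------------------|
| 0 | 1 | Ronna | Nellies | Female | 2013-12-21 | Purcell | Ketti | Demizz | 1 |
| 1 | 2 | Rusty | Andriulis | Male | 2012-07-21 | Vassili | Rivi | Brainbox | 1 |
| 2 | 3 | Roberta | Oakeshott | Female | 2011-12-04 | Lind | Haskell | Centidel | 2 |
| 3 | 4 | Lola | Rubinfajn | Male | 2012-06-29 | Elie | Tresa | Edgeblab | 2 |
| 4 | 5 | Kamila | Benedtti | Female | 2012-07-10 | Adela | Clevey | Trudoo | 1 |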


In [3]:
# Form the pupil's name correctly for the records in the format Last Name, First Name
df["Pupil's Name"] = df['pupil last name'] + ', ' + df['pupil first name']

In [4]:
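# Form the parental contact's name in the same format as the pupil's
# 'Parental Contact' says which name column holds the preferred contact (1 or 2);
# the parent's last name is taken from the pupil's last name
df['Parental Contact Full Name'] = np.where(df['Parental Contact'] == 1,
                                           df['pupil last name'] + ', ' + df['Parental Contact Name_1'],
                                           df['pupil last name'] + ', ' + df['Parental Contact Name_2'])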

In [5]:
# Create the email address to contact the parent using the format Parent First Name.Parent Last Name@Employer.com
df['Parental Contact Email Address'] = np.where(df['Parental Contact'] == 1,
                                                df['Parental Contact Name_1'] + '.' + df['pupil last name'] + '@' + df['Preferred Contact Employer'] + '.com',
                                                df['Parental Contact Name_2'] + '.' + df['pupil last name'] + '@' + df['Preferred Contact Employer'] + '.com')
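
As a row-level sanity check of the two cells above, the same selection logic can be written as a small plain-Python function. This is only a sketch: `contact_details` is a hypothetical helper, not part of the notebook, and it assumes (as the cells do) that any value other than 1 in `Parental Contact` means the second name.

```python
def contact_details(last_name, name_1, name_2, preferred, employer):
    """Return (full name, email) for the preferred parental contact."""
    # Preferred contact 1 -> Name_1, anything else -> Name_2 (mirrors the np.where calls)
    first_name = name_1 if preferred == 1 else name_2
    full_name = f"{last_name}, {first_name}"
    email = f"{first_name}.{last_name}@{employer}.com"
    return full_name, email


# Values taken from the first and third rows of the input shown earlier
print(contact_details("Nellies", "Purcell", "Ketti", 1, "Demizz"))
# ('Nellies, Purcell', 'Purcell.Nellies@Demizz.com')
print(contact_details("Oakeshott", "Lind", "Haskell", 2, "Centidel"))
# ('Oakeshott, Haskell', 'Haskell.Oakeshott@Centidel.com')
```

Picking the first name once and reusing it for both fields avoids repeating the `Parental Contact == 1` test in two places.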

In [6]:
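# Create the academic year the pupils are in
# Bucket each birth date into a 12-month period ending in August, labelled by its end year
df = df.assign(Academic_Year=pd.PeriodIndex(df['Date of Birth'], freq='A-Aug'))
df['Academic_Year'] = df['Academic_Year'].map(lambda x: int(x.strftime('%Y')))
# Periods ending in 2015 or later are Year 1; each earlier period is one year further on
df['Academic_Year'] = np.where(df['Academic_Year'] >= 2015, 1, (2015 - df['Academic_Year']) + 1)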
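
The cell above relies on the `A-Aug` period alias, which recent pandas releases may flag with a deprecation warning. As a hedged alternative, the same mapping can be computed with plain date arithmetic. The sketch below uses a small stand-alone Series (the first three dates come from the `df.head()` output, the last two are made-up boundary cases) and the names `dob`, `period_end_year` and `academic_year` are illustrative only.

```python
import pandas as pd

# Sample dates of birth; the last two are hypothetical boundary cases
dob = pd.Series(pd.to_datetime(
    ["2013-12-21", "2012-07-21", "2011-12-04", "2014-09-01", "2015-10-01"]
))

# Year in which the pupil's 12-month "September to August" period ends:
# births from September onward roll into the next calendar year
period_end_year = dob.dt.year + (dob.dt.month >= 9).astype(int)

# Periods ending in 2015 or later map to Year 1, earlier ones count back from 2015
academic_year = (2015 - period_end_year + 1).clip(lower=1)

print(academic_year.tolist())  # [2, 4, 4, 1, 1]
```

This gives the same values as the period-based cell for the rows shown, without depending on a frequency alias.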

In [7]:
df.columns

Index(['id', 'pupil first name', 'pupil last name', 'gender', 'Date of Birth',
       'Parental Contact Name_1', 'Parental Contact Name_2',
       'Preferred Contact Employer', 'Parental Contact', 'Pupil's Name',
       'Parental Contact Full Name', 'Parental Contact Email Address',
       'Academic_Year'],
      dtype='object')

In [8]:
# Remove any unnecessary columns of data
df = df[['Academic_Year', "Pupil's Name", 'Parental Contact Full Name', 'Parental Contact Email Address']]

In [9]:
df.head(10)

|   | Academic_Year | Pupil's Name | Parental Contact Full Name | Parental Contact Email Address |
|---|---------------|--------------|----------------------------|--------------------------------|
| 0 | 2 | Nellies, Ronna | Nellies, Purcell | Purcell.Nellies@Demizz.com |
| 1 | 4 | Andriulis, Rusty | Andriulis, Vassili | Vassili.Andriulis@Brainbox.com |
| 2 | 4 | Oakeshott, Roberta | Oakeshott, Haskell | Haskell.Oakeshott@Centidel.com |
| 3 | 4 | Rubinfajn, Lola | Rubinfajn, Tresa | Tresa.Rubinfajn@Edgeblab.com |
| 4 | 4 | Benedtti, Kamila | Benedtti, Adela | Adela.Benedtti@Trudoo.com |
| 5 | 4 | Colebourn, Avery | Colebourn, Dalenna | Dalenna.Colebourn@Linktype.com |
| 6 | 1 | Klimko, Valentino | Klimko, Onofredo | Onofredo.Klimko@Thoughtblab.com |
| 7 | 1 | Shearwood, Cal | Shearwood, Berne | Berne.Shearwood@Browseblab.com |
| 8 | 3 | Truswell, King | Truswell, Evvy | Evvy.Truswell@Photospace.com |
| 9 | 1 | Stichall, Towney | Stichall, Joyann | Joyann.Stichall@Kwimbee.com |


In [10]:
# Output the data
df.to_csv('WK1-output.csv', index=False)