# Clean and Analyze Employee Exit Surveys

dataset
- exit surveys of employees from Queensland, Australia
    - Department of Education, Training and Employment (DETE)
    - Technical and Further Education (TAFE)
    - encoded to UTF-8

project goal
- Are employees who only worked for the institutes for a short period of time resigning due to some kind of dissatisfaction?
- What about the employees who have been there longer?
- Are younger employees resigning due to some kind of dissatisfaction?
- What about older employees?

- combine results for both surveys to answer the questions
- both surveys use the same template, but one of them customized some of the answers
- no data dictionary available
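
Combining the two surveys could look roughly like the sketch below. The toy frames, the shared column names `id` and `separationtype`, and the `institute` label column are assumptions for illustration only; the real datasets need their columns aligned before this step.

```python
import pandas as pd

# Toy stand-ins for the two cleaned surveys (values are made up).
dete_toy = pd.DataFrame({"id": [1, 2], "separationtype": ["Resignation", "Resignation"]})
tafe_toy = pd.DataFrame({"id": [10, 11], "separationtype": ["Resignation", "Resignation"]})

# Tag each row with its source so the origin survives the combination.
dete_toy["institute"] = "DETE"
tafe_toy["institute"] = "TAFE"

# Stack the rows; ignore_index=True rebuilds a clean 0..n-1 index.
combined = pd.concat([dete_toy, tafe_toy], ignore_index=True)
print(combined)
```

Stacking rows with `concat()` fits here because both surveys describe the same kind of record; `merge()` would be the choice if the two tables shared a key and had to be joined side by side.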

skills:
- apply(), map()
- fillna(), dropna(), drop()
- melt()
- concat(), merge()

In [2]:
import numpy as np
import pandas as pd

In [5]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## 1. The DETE and TAFE Survey Datasets

`dete_survey.csv`
`ID` participant ID
`SeparationType` reason why employment ended
`Cease Date` year or month employment ended
`DETE Start Date` year employment started

`tafe_survey.csv`
`Record ID` participant ID
`Reason for ceasing employment` reason why employment ended
`LengthofServiceOverall. Overall Length of Service at Institute (in years)` employment in years
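
TAFE already reports length of service, while DETE only has a start year and a cease date. One possible way to derive a comparable figure for DETE is sketched below on toy rows. The column `years_of_service` is a hypothetical name, and the sketch assumes every cease date contains a four-digit year.

```python
import pandas as pd

# Toy rows mimicking the two date formats described above.
toy = pd.DataFrame({
    "Cease Date": ["08/2012", "05/2012", "2013"],
    "DETE Start Date": [1984, 2005, 2011],
})

# Pull the four-digit year out of strings like "08/2012" or "2013".
cease_year = toy["Cease Date"].str.extract(r"(\d{4})", expand=False).astype(float)

# Hypothetical helper column: years between start and cease.
toy["years_of_service"] = cease_year - toy["DETE Start Date"]
print(toy)
```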

In [6]:
# read in and preview datasets
dete = pd.read_csv('/dete_survey.csv')
tafe = pd.read_csv('/tafe_survey.csv')

### DETE

In [None]:
dete.head()

Unnamed: 0,ID,SeparationType,Cease Date,DETE Start Date,Role Start Date,Position,Classification,Region,Business Unit,Employment Status,...,Kept informed,Wellness programs,Health & Safety,Gender,Age,Aboriginal,Torres Strait,South Sea,Disability,NESB
0,1,Ill Health Retirement,08/2012,1984,2004,Public Servant,A01-A04,Central Office,Corporate Strategy and Peformance,Permanent Full-time,...,N,N,N,Male,56-60,,,,,Yes
1,2,Voluntary Early Retirement (VER),08/2012,Not Stated,Not Stated,Public Servant,AO5-AO7,Central Office,Corporate Strategy and Peformance,Permanent Full-time,...,N,N,N,Male,56-60,,,,,
2,3,Voluntary Early Retirement (VER),05/2012,2011,2011,Schools Officer,,Central Office,Education Queensland,Permanent Full-time,...,N,N,N,Male,61 or older,,,,,
3,4,Resignation-Other reasons,05/2012,2005,2006,Teacher,Primary,Central Queensland,,Permanent Full-time,...,A,N,A,Female,36-40,,,,,
4,5,Age Retirement,05/2012,1970,1989,Head of Curriculum/Head of Special Education,,South East,,Permanent Full-time,...,N,A,M,Female,61 or older,,,,,


In [None]:
# dete.info()

In [None]:
# dete.columns

In [None]:
# dete.isnull()

In [None]:
dete['SeparationType'].value_counts()

Age Retirement                          285
Resignation-Other reasons               150
Resignation-Other employer               91
Resignation-Move overseas/interstate     70
Voluntary Early Retirement (VER)         67
Ill Health Retirement                    61
Other                                    49
Contract Expired                         34
Termination                              15
Name: SeparationType, dtype: int64
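
Since the project questions are about resignations, and DETE splits them into three `Resignation-...` categories, a filter along these lines might be used later. This is only a sketch on a toy frame, not the notebook's actual cleaning step.

```python
import pandas as pd

toy = pd.DataFrame({"SeparationType": [
    "Age Retirement",
    "Resignation-Other reasons",
    "Resignation-Other employer",
    "Resignation-Move overseas/interstate",
    "Termination",
]})

# Keep rows whose separation type begins with "Resignation";
# na=False treats missing values as non-matches.
is_resignation = toy["SeparationType"].str.startswith("Resignation", na=False)
resignations = toy[is_resignation].copy()
print(resignations["SeparationType"].value_counts())
```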

In [None]:
dete['Position'].value_counts()

Teacher                                                    324
Teacher Aide                                               137
Public Servant                                             126
Cleaner                                                     97
Head of Curriculum/Head of Special Education                38
Schools Officer                                             24
School Administrative Staff                                 16
Guidance Officer                                            12
Technical Officer                                           11
Professional Officer                                         7
Other                                                        7
School Principal                                             5
School Based Professional Staff (Therapist, nurse, etc)      5
Deputy Principal                                             4
Business Service Manager                                     4
Name: Position, dtype: int64

In [None]:
# dete['Classification'].value_counts()

**`dete`**
- RangeIndex: 822 entries, 0 to 821
- Data columns (total 56 columns)
- Dtype: ID=int, others=object, bool
- Low non-null counts: Business Unit, Aboriginal, Torres Strait, South Sea, Disability, NESB
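
The notes above come from reading `info()` output. A more direct way to rank columns by missing values is sketched here on a toy frame (the values are made up; only the column names are taken from `dete`).

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Business Unit": ["Education Queensland", np.nan, np.nan, np.nan],
    "Aboriginal": [np.nan, np.nan, np.nan, np.nan],
    "Gender": ["Male", "Male", "Female", "Female"],
})

# Count missing values per column, largest first.
null_counts = toy.isnull().sum().sort_values(ascending=False)
print(null_counts)

# Share of missing values per column; the 0.5 cutoff is an arbitrary example.
null_share = toy.isnull().mean()
mostly_null = null_share[null_share > 0.5]
print(mostly_null)
```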

### TAFE

In [None]:
tafe.head()

Unnamed: 0,Record ID,Institute,WorkArea,CESSATION YEAR,Reason for ceasing employment,Contributing Factors. Career Move - Public Sector,Contributing Factors. Career Move - Private Sector,Contributing Factors. Career Move - Self-employment,Contributing Factors. Ill Health,Contributing Factors. Maternity/Family,...,Workplace. Topic:Does your workplace promote a work culture free from all forms of unlawful discrimination?,Workplace. Topic:Does your workplace promote and practice the principles of employment equity?,Workplace. Topic:Does your workplace value the diversity of its employees?,Workplace. Topic:Would you recommend the Institute as an employer to others?,Gender. What is your Gender?,CurrentAge. Current Age,Employment Type. Employment Type,Classification. Classification,LengthofServiceOverall. Overall Length of Service at Institute (in years),LengthofServiceCurrent. Length of Service at current workplace (in years)
0,6.34133e+17,Southern Queensland Institute of TAFE,Non-Delivery (corporate),2010.0,Contract Expired,,,,,,...,Yes,Yes,Yes,Yes,Female,26 30,Temporary Full-time,Administration (AO),1-2,1-2
1,6.341337e+17,Mount Isa Institute of TAFE,Non-Delivery (corporate),2010.0,Retirement,-,-,-,-,-,...,Yes,Yes,Yes,Yes,,,,,,
2,6.341388e+17,Mount Isa Institute of TAFE,Delivery (teaching),2010.0,Retirement,-,-,-,-,-,...,Yes,Yes,Yes,Yes,,,,,,
3,6.341399e+17,Mount Isa Institute of TAFE,Non-Delivery (corporate),2010.0,Resignation,-,-,-,-,-,...,Yes,Yes,Yes,Yes,,,,,,
4,6.341466e+17,Southern Queensland Institute of TAFE,Delivery (teaching),2010.0,Resignation,-,Career Move - Private Sector,-,-,-,...,Yes,Yes,Yes,Yes,Male,41 45,Permanent Full-time,Teacher (including LVT),3-4,3-4


In [None]:
# tafe.info()

In [None]:
# tafe.columns

In [None]:
# tafe.isnull()

In [None]:
tafe['Reason for ceasing employment'].value_counts()

Resignation                 340
Contract Expired            127
Retrenchment/ Redundancy    104
Retirement                   82
Transfer                     25
Termination                  23
Name: Reason for ceasing employment, dtype: int64

In [None]:
tafe['Employment Type. Employment Type'].value_counts()

Permanent Full-time    237
Temporary Full-time    177
Contract/casual         71
Permanent Part-time     59
Temporary Part-time     52
Name: Employment Type. Employment Type, dtype: int64

In [None]:
# tafe['Classification. Classification'].value_counts()

**`tafe`**
- Record ID in scientific notation
- Column names are long, descriptive, repetitive
- RangeIndex: 702 entries, 0 to 701
- Data columns (total 72 columns)
- Dtype: Record ID=float (hence the scientific notation), CESSATION YEAR=float, others=object
- Non-Null: many columns have only about 400-500 non-null values out of roughly 700 rows
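
The long TAFE headers can be shortened with `rename()`. In the sketch below the dictionary keys are real TAFE column names, but the short names on the right are hypothetical choices, and the toy values are for illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    "Record ID": [6.34133e17, 6.341337e17],
    "CESSATION YEAR": [2010.0, 2010.0],
    "Reason for ceasing employment": ["Contract Expired", "Retirement"],
    "Gender. What is your Gender?": ["Female", None],
})

# Hypothetical short names; the keys are the original TAFE headers.
short_names = {
    "Record ID": "id",
    "CESSATION YEAR": "cease_date",
    "Reason for ceasing employment": "separationtype",
    "Gender. What is your Gender?": "gender",
}

# rename() returns a new frame; unmapped columns would be left untouched.
toy = toy.rename(columns=short_names)
print(toy.columns.tolist())
```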

## Identify Missing Values and Drop Unnecessary Columns

In [7]:
dete = pd.read_csv('/dete_survey.csv', na_values="Not Stated")
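
To see what `na_values` changes, here is a small self-contained sketch that reads an in-memory CSV instead of the real file. The three rows are made up to mirror the `Not Stated` marker seen in `dete.head()`.

```python
import io
import pandas as pd

csv_text = """ID,DETE Start Date,Role Start Date
1,1984,2004
2,Not Stated,Not Stated
3,2011,2011
"""

# Without na_values the marker is kept as an ordinary string.
raw = pd.read_csv(io.StringIO(csv_text))
print(raw["DETE Start Date"].tolist())

# With na_values the marker becomes NaN, so the column parses as numeric.
clean = pd.read_csv(io.StringIO(csv_text), na_values="Not Stated")
print(clean["DETE Start Date"].dtype)
print(clean["DETE Start Date"].isnull().sum())
```

This is also why the start years show up as `1984.0` rather than `1984` after re-reading: a numeric column containing NaN is stored as float.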

In [17]:
# drop the 21 columns at positions 28-48 (slice end 49 is exclusive); axis=1 targets columns
dete_35cols = dete.drop(dete.columns[28:49],axis=1)
dete_35cols.head()

Unnamed: 0,ID,SeparationType,Cease Date,DETE Start Date,Role Start Date,Position,Classification,Region,Business Unit,Employment Status,...,Work life balance,Workload,None of the above,Gender,Age,Aboriginal,Torres Strait,South Sea,Disability,NESB
0,1,Ill Health Retirement,08/2012,1984.0,2004.0,Public Servant,A01-A04,Central Office,Corporate Strategy and Peformance,Permanent Full-time,...,False,False,True,Male,56-60,,,,,Yes
1,2,Voluntary Early Retirement (VER),08/2012,,,Public Servant,AO5-AO7,Central Office,Corporate Strategy and Peformance,Permanent Full-time,...,False,False,False,Male,56-60,,,,,
2,3,Voluntary Early Retirement (VER),05/2012,2011.0,2011.0,Schools Officer,,Central Office,Education Queensland,Permanent Full-time,...,False,False,True,Male,61 or older,,,,,
3,4,Resignation-Other reasons,05/2012,2005.0,2006.0,Teacher,Primary,Central Queensland,,Permanent Full-time,...,False,False,False,Female,36-40,,,,,
4,5,Age Retirement,05/2012,1970.0,1989.0,Head of Curriculum/Head of Special Education,,South East,,Permanent Full-time,...,True,False,False,Female,61 or older,,,,,
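
Dropping by position relies on slice semantics: `columns[28:49]` covers positions 28 through 48, which is 21 columns, and 56 - 21 gives the 35 in the name `dete_35cols`. The same behavior on a toy frame, as a minimal sketch:

```python
import pandas as pd

# One-row toy frame with six columns named a..f.
toy = pd.DataFrame([[0, 1, 2, 3, 4, 5]], columns=list("abcdef"))

# columns[2:4] selects positions 2 and 3; the end of the slice is exclusive.
trimmed = toy.drop(toy.columns[2:4], axis=1)
print(trimmed.columns.tolist())
```

Positional drops are brittle if the column order ever changes, so checking `dete.columns[28:49]` before dropping is a sensible precaution.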
