## Irvington Volunteer Project 

It's that time of year again! 

In your small town of Irvington, there is an annual festival that brings in visitors from all over the state! It's a huge deal and depends completely on the help of the local townspeople. 

The festival is 100% volunteer-run and operated, and it's your job to make sure all the volunteers have everything they need to get started. It is also your job to make sure volunteers are well organized, well prepared, and ready to rock when the day of the festival arrives! 

In the months before the festival, you collected a lot of information on each of the 1,000 volunteers. This information will help you better organize the volunteer effort. Follow the prompts below to complete your task of leading the volunteer effort!

***

#### For this task, use the Irvington Volunteers dataset. This dataset includes the following columns:

    * First = First name of volunteer
    * Last = Last name of volunteer
    * Gender = Gender of volunteer
    * Age = Age of volunteer         
    * Shift = Shift of volunteer worker; 1 = first shift; 2 = second shift; 3 = third shift; 4 = fourth shift
    * Month = Month of year volunteer signed up for task
    * Volunteer Task = Assigned task for volunteer 
    * Task Level = Level of involvement for volunteer task; 1 = beginner, 2 = moderate, 3 = expert
    * Supervisor Number = Supervisor assigned to volunteer
    * Fees Owed = Outstanding volunteer fees that are owed to the community
    * Materials Needed = Does the volunteer still need task-specific materials; Y = yes; N = no
    * Volunteer Hours = Number of hours volunteer has spent preparing for task
    * Hours Pledged = Number of hours volunteer pledged to spend on task
    * Task Training Completed = Did the volunteer complete task-specific training; Y = yes; N = no
    * Recruited Volunteers = Number of other volunteers recruited by this volunteer

1. Import the "Irvington Volunteers" dataset. Check to make sure the dataset looks like what you're expecting.

In [4]:
import pandas as pd
import numpy as np

In [5]:
df = pd.read_excel("Irvington_Volunteers.xlsx")
df.head()

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Supervisor Number,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers
0,Jackie,Jackson,F,59,4,12,Fire Safety,2,1,18,Y,19,28,Y,3
1,Mary,Patterson,F,31,4,10,First Aid Booth,1,1,7,N,27,28,N,6
2,Tanya,Adams,F,48,1,3,Concession Stand,3,1,22,N,16,28,Y,2
3,Tanya,Henderson,F,19,2,2,Information Desk,3,1,26,Y,9,28,Y,5
4,Walter,Franklin,M,25,1,2,Concession Stand,1,1,25,Y,11,28,Y,4


2. Confirm all your volunteers are accounted for and all of the information you need is within the dataset. What characteristics can you identify in the dataset?
***

* How many rows are in the dataset?
* How many columns are in the dataset?
* What types of data are within each column?

In [6]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 999 entries, 0 to 998
Data columns (total 15 columns):
 #   Column                   Non-Null Count  Dtype 
---  ------                   --------------  ----- 
 0   First                    999 non-null    object
 1   Last                     999 non-null    object
 2   Gender                   999 non-null    object
 3   Age                      999 non-null    int64 
 4   Shift                    999 non-null    int64 
 5   Month                    999 non-null    int64 
 6   Volunteer Task           999 non-null    object
 7   Task Level               999 non-null    int64 
 8   Supervisor Number        999 non-null    int64 
 9   Fees Owed                999 non-null    int64 
 10  Materials Needed         999 non-null    object
 11  Volunteer Hours          999 non-null    int64 
 12  Hours Pledged            999 non-null    int64 
 13  Task Training Completed  999 non-null    object
 14  Recruited Volunteers     999 non-null    int64 
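
The three questions above can also be answered one at a time rather than read off the `info()` summary. The sketch below is only an illustration: `sample` is a hypothetical three-row stand-in for the real spreadsheet, not data from the volunteer file.

```python
import pandas as pd

# Hypothetical three-row sample standing in for the real spreadsheet.
sample = pd.DataFrame({
    "First": ["Jackie", "Mary", "Tanya"],
    "Age": [59, 31, 48],
    "Materials Needed": ["Y", "N", "N"],
})

# .shape is a (rows, columns) tuple
n_rows, n_cols = sample.shape
print(f"{n_rows} rows, {n_cols} columns")

# .dtypes lists one data type per column
print(sample.dtypes)
```

On the full dataset, the same two attributes would be read from `df` instead of `sample`.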

3. Before taking this list seriously, we need to check to make sure all the volunteers are eligible to work their specific task. The only eligibility requirement is age-based -- no volunteers can be under the age of 13 and no volunteers can be over the age of 75. Are there any volunteers that fall outside of this range?

In [7]:
df.loc[(df["Age"] < 13) | (df["Age"] > 75)]

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Supervisor Number,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers
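
An empty result means nobody falls outside the allowed range. As a rough alternative, the same check can be written with `Series.between`, which includes both endpoints by default. The ages below are made up to show how the boundary values 13 and 75 are treated.

```python
import pandas as pd

# Hypothetical ages chosen to sit on and around the eligibility limits.
sample = pd.DataFrame({
    "First": ["Ann", "Bob", "Cy", "Dee"],
    "Age": [12, 13, 75, 80],
})

# True for ages 13 through 75, endpoints included
eligible = sample["Age"].between(13, 75)

# Invert the mask to list anyone who is too young or too old
ineligible = sample.loc[~eligible]
print(ineligible)
```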


4. Did you miss anyone? Did you make sure to record ALL the information for ALL the volunteers? Quickly check if you're missing any data in this dataset. 

In [8]:
df.isnull().sum()

First                      0
Last                       0
Gender                     0
Age                        0
Shift                      0
Month                      0
Volunteer Task             0
Task Level                 0
Supervisor Number          0
Fees Owed                  0
Materials Needed           0
Volunteer Hours            0
Hours Pledged              0
Task Training Completed    0
Recruited Volunteers       0
dtype: int64
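
Every count is zero here, so there is nothing to chase down. If a future dataset does have gaps, it may help to pull out the affected rows as well as the per-column counts. A minimal sketch on hypothetical data with one missing name:

```python
import pandas as pd

# Hypothetical sample with one missing first name.
sample = pd.DataFrame({
    "First": ["Ann", None, "Cy"],
    "Age": [30, 41, 52],
})

# Number of missing values in each column
missing_per_column = sample.isnull().sum()

# Rows that have at least one missing value in any column
rows_with_gaps = sample[sample.isnull().any(axis=1)]
print(missing_per_column)
print(rows_with_gaps)
```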

5. A lot of the information we have organizes the volunteers into specific groups. It's important that none of these groups are mismatched (too many people in one group vs another). Let's check how many volunteers fall into each of the following groups. (Hint: you need to use the value counts function for this question). 
***

* Last (how many volunteers are from the same family?)
* Gender
* Shift
* Month
* Volunteer Task
* Task Level
* Supervisor Number
* Materials Needed
* Task Training Completed

In [9]:
df["Last"].value_counts()

Johnson      49
Carter       39
Turner       39
Rose         38
Kennedy      38
Looner       38
Michaels     38
Henderson    37
Martins      36
Davidson     35
Harrison     34
Walters      34
Kane         34
Baker        34
Moore        34
Patterson    33
Withers      33
Adams        33
Sapp         32
Vaughn       32
Rogers       31
Isaacson     31
Sears        30
Jackson      29
Samuelson    29
Franklin     28
Gerardo      28
Mulgrew      26
Ettienne     25
Monroe       22
Name: Last, dtype: int64

In [10]:
df["Gender"].value_counts()

M    510
F    489
Name: Gender, dtype: int64

In [11]:
df["Shift"].value_counts()

3    258
2    251
4    246
1    244
Name: Shift, dtype: int64

In [12]:
df["Month"].value_counts()

3     96
4     95
2     89
11    89
12    87
7     86
8     86
9     83
5     74
6     73
10    72
1     69
Name: Month, dtype: int64

In [13]:
df["Volunteer Task"].value_counts()

Concession Stand           142
Information Desk           131
First Aid Booth            130
Clean Up                   128
Safety                     124
Fire Safety                120
Booth Set-Up               114
Gardening and Lawn Care    110
Name: Volunteer Task, dtype: int64

In [14]:
df["Task Level"].value_counts()

3    352
2    342
1    305
Name: Task Level, dtype: int64

In [15]:
df["Supervisor Number"].value_counts()

3    212
2    207
5    202
4    195
1    183
Name: Supervisor Number, dtype: int64

In [16]:
df["Materials Needed"].value_counts()

Y    637
N    362
Name: Materials Needed, dtype: int64

In [17]:
df["Task Training Completed"].value_counts()

Y    586
N    413
Name: Task Training Completed, dtype: int64
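
Since the same call is repeated for every column, a loop can do the work in one cell. The sketch below also passes `normalize=True`, which returns shares instead of raw counts and can make an unbalanced group easier to spot. The data and the `group_columns` list are hypothetical.

```python
import pandas as pd

# Hypothetical sample; the real dataset has many more grouping columns.
sample = pd.DataFrame({
    "Shift": [1, 1, 2, 3],
    "Materials Needed": ["Y", "Y", "Y", "N"],
})

group_columns = ["Shift", "Materials Needed"]

# One Series of proportions per grouping column
shares = {
    col: sample[col].value_counts(normalize=True)
    for col in group_columns
}

for col, share in shares.items():
    print(share.round(2), end="\n\n")
```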

6. Let's look at these same groups, but from another angle. It's important that the community members are mixing and mingling with members of all ages. What is the average age of volunteers within all the groups listed below? (Hint: you need to use the groupby function for this question). Once you determine the average age of volunteers in the groups below, determine the average number of volunteer hours for all the groups below. 
***

* Gender
* Shift
* Month
* Volunteer Task
* Task Level
* Supervisor Number
* Materials Needed
* Task Training Completed

In [18]:
#Average age of volunteers within each gender
df["Age"].groupby(df["Gender"]).mean()

Gender
F    42.775051
M    42.270588
Name: Age, dtype: float64

In [19]:
df["Age"].groupby(df["Shift"]).mean()

Shift
1    41.434426
2    43.334661
3    42.275194
4    43.012195
Name: Age, dtype: float64

In [20]:
df["Age"].groupby(df["Month"]).mean()

Month
1     46.420290
2     41.853933
3     41.958333
4     44.347368
5     40.013514
6     37.465753
7     40.546512
8     42.930233
9     43.409639
10    43.805556
11    41.483146
12    45.770115
Name: Age, dtype: float64

In [21]:
df["Age"].groupby(df["Volunteer Task"]).mean()

Volunteer Task
Booth Set-Up               40.947368
Clean Up                   43.648438
Concession Stand           43.450704
Fire Safety                43.166667
First Aid Booth            41.500000
Gardening and Lawn Care    41.309091
Information Desk           44.908397
Safety                     40.709677
Name: Age, dtype: float64

In [22]:
df["Age"].groupby(df["Supervisor Number"]).mean()

Supervisor Number
1    41.956284
2    41.521739
3    42.693396
4    42.169231
5    44.198020
Name: Age, dtype: float64

In [23]:
df["Age"].groupby(df["Materials Needed"]).mean()

Materials Needed
N    42.395028
Y    42.587127
Name: Age, dtype: float64

In [24]:
df["Age"].groupby(df["Task Training Completed"]).mean()

Task Training Completed
N    41.268765
Y    43.397611
Name: Age, dtype: float64

In [25]:
df["Volunteer Hours"].groupby(df["Gender"]).mean()

Gender
F    19.932515
M    19.682353
Name: Volunteer Hours, dtype: float64

In [26]:
df["Volunteer Hours"].groupby(df["Shift"]).mean()

Shift
1    19.520492
2    19.466135
3    21.031008
4    19.146341
Name: Volunteer Hours, dtype: float64

In [27]:
df["Volunteer Hours"].groupby(df["Month"]).mean()

Month
1     19.869565
2     19.764045
3     20.906250
4     20.757895
5     18.932432
6     19.767123
7     19.837209
8     18.744186
9     19.506024
10    21.541667
11    19.000000
12    19.000000
Name: Volunteer Hours, dtype: float64

In [28]:
df["Volunteer Hours"].groupby(df["Volunteer Task"]).mean()

Volunteer Task
Booth Set-Up               19.640351
Clean Up                   19.171875
Concession Stand           20.746479
Fire Safety                20.058333
First Aid Booth            19.584615
Gardening and Lawn Care    19.409091
Information Desk           19.977099
Safety                     19.685484
Name: Volunteer Hours, dtype: float64

In [29]:
df["Volunteer Hours"].groupby(df["Supervisor Number"]).mean()

Supervisor Number
1    19.295082
2    19.729469
3    20.537736
4    19.379487
5    19.985149
Name: Volunteer Hours, dtype: float64

In [30]:
df["Volunteer Hours"].groupby(df["Materials Needed"]).mean()

Materials Needed
N    20.273481
Y    19.538462
Name: Volunteer Hours, dtype: float64

In [31]:
df["Volunteer Hours"].groupby(df["Task Training Completed"]).mean()

Task Training Completed
N    19.900726
Y    19.737201
Name: Volunteer Hours, dtype: float64
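
The cells above follow one pattern, so both averages can be computed for every group in a single pass. The same approach covers any column in the list, including Task Level. This is only a sketch on made-up rows; `group_columns` and `summaries` are names introduced for the example.

```python
import pandas as pd

# Hypothetical sample with two of the grouping columns.
sample = pd.DataFrame({
    "Gender": ["F", "M", "F", "M"],
    "Task Level": [1, 1, 2, 3],
    "Age": [20, 30, 40, 50],
    "Volunteer Hours": [10, 20, 30, 40],
})

group_columns = ["Gender", "Task Level"]

# For each grouping column, average Age and Volunteer Hours side by side
summaries = {
    col: sample.groupby(col)[["Age", "Volunteer Hours"]].mean()
    for col in group_columns
}

for col, summary in summaries.items():
    print(summary, end="\n\n")
```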

7. There are specific groups of volunteers that need to be followed up with immediately! These are the volunteers that have some outstanding business to handle before the big festival. Locate the group of volunteers that fall within each of these conditions.
***
* Locate the volunteers who still need materials for their volunteer task
* Locate the volunteers who have not yet completed their volunteer training
* Locate the volunteers who are experts at their task (aka Task Level = 3)
* Locate the volunteers who are working fire safety and haven't yet completed their training - these folks need to get trained ASAP!

In [32]:
#Locate the volunteers who still need materials for their volunteer task
df.loc[df["Materials Needed"] == "Y"]

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Supervisor Number,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers
0,Jackie,Jackson,F,59,4,12,Fire Safety,2,1,18,Y,19,28,Y,3
3,Tanya,Henderson,F,19,2,2,Information Desk,3,1,26,Y,9,28,Y,5
4,Walter,Franklin,M,25,1,2,Concession Stand,1,1,25,Y,11,28,Y,4
6,Martin,Ettienne,M,36,2,8,Gardening and Lawn Care,2,1,8,Y,26,28,N,6
8,Denise,Henderson,F,38,1,6,First Aid Booth,2,1,12,Y,23,28,N,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
993,Walter,Martins,M,67,3,12,Fire Safety,1,5,9,Y,28,28,N,3
994,Walter,Mulgrew,M,63,3,4,Information Desk,2,5,10,Y,25,28,Y,5
996,Larry,Patterson,M,49,1,4,Clean Up,2,5,21,Y,16,28,N,3
997,Victor,Kane,M,39,3,7,Concession Stand,2,5,28,Y,8,28,Y,4


In [33]:
#Locate the volunteers who have not yet completed their volunteer training
df.loc[df["Task Training Completed"] == "N"]

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Supervisor Number,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers
1,Mary,Patterson,F,31,4,10,First Aid Booth,1,1,7,N,27,28,N,6
6,Martin,Ettienne,M,36,2,8,Gardening and Lawn Care,2,1,8,Y,26,28,N,6
7,Peter,Michaels,M,36,1,6,Concession Stand,3,1,7,N,27,28,N,6
8,Denise,Henderson,F,38,1,6,First Aid Booth,2,1,12,Y,23,28,N,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
993,Walter,Martins,M,67,3,12,Fire Safety,1,5,9,Y,28,28,N,3
996,Larry,Patterson,M,49,1,4,Clean Up,2,5,21,Y,16,28,N,3


In [34]:
#Locate the volunteers who are experts at their task (aka Task Level = 3)
df.loc[df["Task Level"] == 3]

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Supervisor Number,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers
2,Tanya,Adams,F,48,1,3,Concession Stand,3,1,22,N,16,28,Y,2
3,Tanya,Henderson,F,19,2,2,Information Desk,3,1,26,Y,9,28,Y,5
7,Peter,Michaels,M,36,1,6,Concession Stand,3,1,7,N,27,28,N,6
10,Sam,Michaels,M,57,4,3,First Aid Booth,3,1,5,Y,32,28,Y,3
11,Jada,Walters,F,20,1,5,Concession Stand,3,1,26,N,10,28,Y,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
980,August,Jackson,F,30,1,11,Clean Up,3,5,26,N,11,28,Y,3
981,Betty,Vaughn,F,65,2,3,Safety,3,5,28,N,10,28,N,2
982,Mary,Kennedy,F,36,3,11,Information Desk,3,5,15,Y,20,28,N,5
985,Martin,Martins,M,34,2,12,Information Desk,3,5,21,N,13,28,Y,6


In [35]:
#Locate the volunteers who are working fire safety and haven't yet completed their training 
#- these folks need to get trained ASAP!
df.loc[(df["Task Training Completed"] == "N") & (df["Volunteer Task"] == "Fire Safety")]

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Supervisor Number,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers
34,Tanya,Harrison,F,44,1,3,Fire Safety,1,1,24,N,11,28,N,5
62,Ronelle,Samuelson,F,55,1,1,Fire Safety,1,1,14,Y,21,28,N,5
74,Roger,Sapp,M,67,4,2,Fire Safety,2,1,9,Y,25,28,N,6
96,Adam,Patterson,M,23,2,2,Fire Safety,1,1,29,Y,7,28,N,4
118,Denise,Moore,F,66,2,10,Fire Safety,2,1,29,Y,9,28,N,2
164,Samantha,Franklin,F,61,3,6,Fire Safety,2,1,24,Y,12,28,N,4
173,Nicole,Kennedy,F,40,4,5,Fire Safety,1,1,6,Y,31,28,N,3
225,Karen,Looner,F,36,3,2,Fire Safety,2,2,9,N,27,28,N,4
227,Betty,Baker,F,17,4,5,Fire Safety,2,2,30,Y,7,28,N,3
230,Adam,Vaughn,M,39,4,9,Fire Safety,3,2,28,Y,8,28,N,4


8. To keep track of volunteers who meet specific conditions, make some additional columns to hold the new information. 
***

* Inexperienced volunteers may get confused or lost. Create a new column called "needsMentor". This column should be assigned a value of "1" if a volunteer has not completed training and is at skill level 1. 
* How many volunteers haven't completed the volunteer hours they promised they would? Create a new column that subtracts the number of volunteer hours already worked ("Volunteer Hours") from the number of hours pledged ("Hours Pledged"). Call this column "hoursNeeded". 
* There are some overly committed volunteers - they have already worked all their pledged volunteer hours and are still going! These folks need to get a specific bonus when the festival is over - we need to track them somehow. Create a new column called "overtimeBonus" - if "hoursNeeded" is less than 0, this column should be "1". 

In [36]:
#Inexperienced volunteers may get confused or lost. Create a new column called "needsMentor".
#This column should be assigned a value of "1" if a volunteer has not completed training and is at skill level 1.
df.loc[(df["Task Training Completed"] == "N") & (df["Task Level"] == 1),"needsMentor"] = 1
df

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Supervisor Number,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers,needsMentor
0,Jackie,Jackson,F,59,4,12,Fire Safety,2,1,18,Y,19,28,Y,3,
1,Mary,Patterson,F,31,4,10,First Aid Booth,1,1,7,N,27,28,N,6,1.0
2,Tanya,Adams,F,48,1,3,Concession Stand,3,1,22,N,16,28,Y,2,
3,Tanya,Henderson,F,19,2,2,Information Desk,3,1,26,Y,9,28,Y,5,
4,Walter,Franklin,M,25,1,2,Concession Stand,1,1,25,Y,11,28,Y,4,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
994,Walter,Mulgrew,M,63,3,4,Information Desk,2,5,10,Y,25,28,Y,5,
995,Veronica,Looner,F,63,3,5,Clean Up,2,5,28,N,8,28,Y,4,
996,Larry,Patterson,M,49,1,4,Clean Up,2,5,21,Y,16,28,N,3,
997,Victor,Kane,M,39,3,7,Concession Stand,2,5,28,Y,8,28,Y,4,


In [37]:
#How many volunteers haven't completed the volunteer hours they promised they would? 
#Create a new column that subtracts the number of volunteer hours already worked ("Volunteer Hours")
#from the number of hours pledged ("Hours Pledged").
#Call this column "hoursNeeded".

df["hoursNeeded"] = df["Hours Pledged"] - df["Volunteer Hours"]
df

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Supervisor Number,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers,needsMentor,hoursNeeded
0,Jackie,Jackson,F,59,4,12,Fire Safety,2,1,18,Y,19,28,Y,3,,9
1,Mary,Patterson,F,31,4,10,First Aid Booth,1,1,7,N,27,28,N,6,1.0,1
2,Tanya,Adams,F,48,1,3,Concession Stand,3,1,22,N,16,28,Y,2,,12
3,Tanya,Henderson,F,19,2,2,Information Desk,3,1,26,Y,9,28,Y,5,,19
4,Walter,Franklin,M,25,1,2,Concession Stand,1,1,25,Y,11,28,Y,4,,17
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
994,Walter,Mulgrew,M,63,3,4,Information Desk,2,5,10,Y,25,28,Y,5,,3
995,Veronica,Looner,F,63,3,5,Clean Up,2,5,28,N,8,28,Y,4,,20
996,Larry,Patterson,M,49,1,4,Clean Up,2,5,21,Y,16,28,N,3,,12
997,Victor,Kane,M,39,3,7,Concession Stand,2,5,28,Y,8,28,Y,4,,20


In [38]:
#There are some overly committed volunteers - they have already worked all their pledged volunteer hours and are still going!
#These folks need to get a specific bonus when the festival is over - we need to track them somehow.
#Create a new column called "overtimeBonus" - if "hoursNeeded" is less than 0, this column should be "1".

df.loc[df["hoursNeeded"] < 0, "overtimeBonus"] = 1
df

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Supervisor Number,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers,needsMentor,hoursNeeded,overtimeBonus
0,Jackie,Jackson,F,59,4,12,Fire Safety,2,1,18,Y,19,28,Y,3,,9,
1,Mary,Patterson,F,31,4,10,First Aid Booth,1,1,7,N,27,28,N,6,1.0,1,
2,Tanya,Adams,F,48,1,3,Concession Stand,3,1,22,N,16,28,Y,2,,12,
3,Tanya,Henderson,F,19,2,2,Information Desk,3,1,26,Y,9,28,Y,5,,19,
4,Walter,Franklin,M,25,1,2,Concession Stand,1,1,25,Y,11,28,Y,4,,17,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
994,Walter,Mulgrew,M,63,3,4,Information Desk,2,5,10,Y,25,28,Y,5,,3,
995,Veronica,Looner,F,63,3,5,Clean Up,2,5,28,N,8,28,Y,4,,20,
996,Larry,Patterson,M,49,1,4,Clean Up,2,5,21,Y,16,28,N,3,,12,
997,Victor,Kane,M,39,3,7,Concession Stand,2,5,28,Y,8,28,Y,4,,20,


In [39]:
df.loc[df["overtimeBonus"] == 1]

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Supervisor Number,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers,needsMentor,hoursNeeded,overtimeBonus
10,Sam,Michaels,M,57,4,3,First Aid Booth,3,1,5,Y,32,28,Y,3,,-4,1.0
16,John,Looner,M,21,3,4,Clean Up,1,1,8,N,30,28,Y,2,,-2,1.0
18,Karen,Looner,F,32,2,9,Booth Set-Up,3,1,9,Y,29,28,Y,2,,-1,1.0
20,Victor,Looner,M,22,3,4,Clean Up,2,1,2,Y,32,28,Y,6,,-4,1.0
21,Frank,Patterson,M,43,3,2,Concession Stand,3,1,5,Y,32,28,Y,3,,-4,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
960,Onika,Michaels,F,62,1,7,Safety,3,5,8,Y,29,28,Y,3,,-1,1.0
963,Aaron,Patterson,M,21,2,8,Safety,1,5,4,N,32,28,N,4,1.0,-4,1.0
964,August,Martins,F,62,2,7,Information Desk,2,5,6,Y,29,28,N,5,,-1,1.0
989,Onika,Sears,F,38,1,1,Concession Stand,1,5,3,N,34,28,N,3,1.0,-6,1.0
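
Assigning through `df.loc[condition, column] = 1` leaves every other row as NaN, which is why the flags display as `1.0`. If a plain 0/1 integer column is preferred, one option is `np.where` or a boolean cast. This is a sketch on hypothetical rows, not a replacement for the cells above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample of three volunteers.
sample = pd.DataFrame({
    "Task Level": [1, 1, 3],
    "Task Training Completed": ["N", "Y", "N"],
    "Volunteer Hours": [27, 32, 10],
    "Hours Pledged": [28, 28, 28],
})

# 1 when untrained AND at level 1, otherwise 0 (no NaN values)
needs_mentor = (sample["Task Training Completed"] == "N") & (sample["Task Level"] == 1)
sample["needsMentor"] = np.where(needs_mentor, 1, 0)

# Positive = hours still owed; negative = worked more than pledged
sample["hoursNeeded"] = sample["Hours Pledged"] - sample["Volunteer Hours"]

# Casting the boolean mask to int also gives a 0/1 column
sample["overtimeBonus"] = (sample["hoursNeeded"] < 0).astype(int)
print(sample)
```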


9. The Mayor wants to make a speech about the different types of community members that are volunteering for the festival. Specifically, the Mayor would like to present some information on the different age groups. Create a new column called "Age Group" and bin the "Age" column to create bins based on the age of the volunteer. Follow the guidance below for the variations in age. (Hint: you can use any code you like to complete this task - np.select is also an option!)
***

* Teen (0 - 17)
* Young Adult (17.1 - 35)
* Adult (35.1 - 65)
* Senior (65+)

In [40]:
df["Age"].max()

68

In [41]:
#Bin edges are right-inclusive: (0, 17], (17, 35], (35, 65], (65, 100]

bins = [0, 17, 35, 65, 100]
bin_labels = ["Teen","Young Adult","Adult","Senior"]

df["Age Group"] = pd.cut(df["Age"], bins, labels = bin_labels)
df

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Supervisor Number,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers,needsMentor,hoursNeeded,overtimeBonus,Age Group
0,Jackie,Jackson,F,59,4,12,Fire Safety,2,1,18,Y,19,28,Y,3,,9,,Adult
1,Mary,Patterson,F,31,4,10,First Aid Booth,1,1,7,N,27,28,N,6,1.0,1,,Young Adult
2,Tanya,Adams,F,48,1,3,Concession Stand,3,1,22,N,16,28,Y,2,,12,,Adult
3,Tanya,Henderson,F,19,2,2,Information Desk,3,1,26,Y,9,28,Y,5,,19,,Young Adult
4,Walter,Franklin,M,25,1,2,Concession Stand,1,1,25,Y,11,28,Y,4,,17,,Young Adult
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
994,Walter,Mulgrew,M,63,3,4,Information Desk,2,5,10,Y,25,28,Y,5,,3,,Adult
995,Veronica,Looner,F,63,3,5,Clean Up,2,5,28,N,8,28,Y,4,,20,,Adult
996,Larry,Patterson,M,49,1,4,Clean Up,2,5,21,Y,16,28,N,3,,12,,Adult
997,Victor,Kane,M,39,3,7,Concession Stand,2,5,28,Y,8,28,Y,4,,20,,Adult
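
The hint mentions `np.select` as another option. The sketch below assumes the same cut-offs as the bins above (17, 35 and 65, each included in the lower group) and compares the result against `pd.cut` on a few hypothetical boundary ages.

```python
import numpy as np
import pandas as pd

# Hypothetical ages sitting on and around each boundary.
sample = pd.DataFrame({"Age": [13, 17, 18, 35, 36, 65, 66]})

# Conditions are checked in order; the first True one wins
conditions = [
    sample["Age"] <= 17,
    sample["Age"] <= 35,
    sample["Age"] <= 65,
]
labels = ["Teen", "Young Adult", "Adult"]

# Anything that matches no condition falls through to the default
sample["Age Group"] = np.select(conditions, labels, default="Senior")

# The same grouping with pd.cut, for comparison
cut_groups = pd.cut(
    sample["Age"],
    bins=[0, 17, 35, 65, 100],
    labels=["Teen", "Young Adult", "Adult", "Senior"],
).astype(str)
print(sample)
```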


10. The Supervisor Number column is all messed up. Someone didn't record it correctly, and now it is meaningless! Drop it from this dataset - it's not providing any meaningful information. 

In [42]:
df.drop("Supervisor Number",axis = 1,inplace = True)

In [43]:
df

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers,needsMentor,hoursNeeded,overtimeBonus,Age Group
0,Jackie,Jackson,F,59,4,12,Fire Safety,2,18,Y,19,28,Y,3,,9,,Adult
1,Mary,Patterson,F,31,4,10,First Aid Booth,1,7,N,27,28,N,6,1.0,1,,Young Adult
2,Tanya,Adams,F,48,1,3,Concession Stand,3,22,N,16,28,Y,2,,12,,Adult
3,Tanya,Henderson,F,19,2,2,Information Desk,3,26,Y,9,28,Y,5,,19,,Young Adult
4,Walter,Franklin,M,25,1,2,Concession Stand,1,25,Y,11,28,Y,4,,17,,Young Adult
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
994,Walter,Mulgrew,M,63,3,4,Information Desk,2,10,Y,25,28,Y,5,,3,,Adult
995,Veronica,Looner,F,63,3,5,Clean Up,2,28,N,8,28,Y,4,,20,,Adult
996,Larry,Patterson,M,49,1,4,Clean Up,2,21,Y,16,28,N,3,,12,,Adult
997,Victor,Kane,M,39,3,7,Concession Stand,2,28,Y,8,28,Y,4,,20,,Adult
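
With `inplace = True`, re-running the cell raises a `KeyError` because the column is already gone. One alternative, sketched here on a hypothetical one-row frame, is to assign the result of `drop` to a new name so the original stays available.

```python
import pandas as pd

# Hypothetical one-row sample.
sample = pd.DataFrame({
    "First": ["Ann"],
    "Supervisor Number": [3],
})

# drop() returns a new DataFrame; `sample` itself is left unchanged
trimmed = sample.drop(columns=["Supervisor Number"])
print(trimmed.columns.tolist())
```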


11. There is a new policy about volunteer fees. It seems that some of the fees owed were actually for materials that the volunteer paid for out of pocket. Wipe out the debt for those volunteers who owe just a few dollars. If the volunteer owes less than 5 dollars, replace that value with 0!

In [44]:
df.loc[df["Fees Owed"] < 5, "Fees Owed"] = 0
df

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers,needsMentor,hoursNeeded,overtimeBonus,Age Group
0,Jackie,Jackson,F,59,4,12,Fire Safety,2,18,Y,19,28,Y,3,,9,,Adult
1,Mary,Patterson,F,31,4,10,First Aid Booth,1,7,N,27,28,N,6,1.0,1,,Young Adult
2,Tanya,Adams,F,48,1,3,Concession Stand,3,22,N,16,28,Y,2,,12,,Adult
3,Tanya,Henderson,F,19,2,2,Information Desk,3,26,Y,9,28,Y,5,,19,,Young Adult
4,Walter,Franklin,M,25,1,2,Concession Stand,1,25,Y,11,28,Y,4,,17,,Young Adult
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
994,Walter,Mulgrew,M,63,3,4,Information Desk,2,10,Y,25,28,Y,5,,3,,Adult
995,Veronica,Looner,F,63,3,5,Clean Up,2,28,N,8,28,Y,4,,20,,Adult
996,Larry,Patterson,M,49,1,4,Clean Up,2,21,Y,16,28,N,3,,12,,Adult
997,Victor,Kane,M,39,3,7,Concession Stand,2,28,Y,8,28,Y,4,,20,,Adult


In [45]:
df.loc[df["Fees Owed"] == 0]

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers,needsMentor,hoursNeeded,overtimeBonus,Age Group
20,Victor,Looner,M,22,3,4,Clean Up,2,0,Y,32,28,Y,6,,-4,1.0,Young Adult
22,Roger,Sapp,M,22,4,12,Clean Up,3,0,Y,31,28,Y,6,,-3,1.0,Young Adult
35,Tom,Vaughn,M,24,3,12,Information Desk,2,0,Y,30,28,N,6,,-2,1.0,Young Adult
39,Frank,Sapp,M,51,3,7,Information Desk,1,0,N,33,28,Y,3,,-5,1.0,Adult
46,Charles,Walters,M,60,2,8,Clean Up,1,0,Y,33,28,N,4,1.0,-5,1.0,Adult
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
949,Sam,Kane,M,38,2,10,Fire Safety,1,0,N,31,28,Y,6,,-3,1.0,Adult
951,Peter,Franklin,M,40,4,7,Gardening and Lawn Care,3,0,Y,35,28,Y,3,,-7,1.0,Adult
953,Mike,Henderson,M,61,2,4,Information Desk,3,0,N,32,28,Y,6,,-4,1.0,Adult
963,Aaron,Patterson,M,21,2,8,Safety,1,0,N,32,28,N,4,1.0,-4,1.0,Young Adult
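
Note that the rule is "less than 5", so a volunteer who owes exactly 5 dollars keeps the debt. The sketch below shows the same rule written with `Series.mask` on hypothetical fee values, including that boundary case.

```python
import pandas as pd

# Hypothetical fees, including the boundary value 5.
sample = pd.DataFrame({"Fees Owed": [2, 5, 18, 4]})

# mask() replaces values where the condition is True and keeps the rest
sample["Fees Owed"] = sample["Fees Owed"].mask(sample["Fees Owed"] < 5, 0)
print(sample)
```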


12. We need some code that we can keep around for next year -- when we have the same task but new data! Define a few functions that can complete tasks that we will need to repeat down the road. 

***

* Define a function to recode the column "Gender". Instead of "M" and "F" -- have the words spelled out "Male" and "Female". Apply this function to the "Gender" column. 
* Define a function to recode the column "Month". Convert the numeric "Month" to the name of the Month. Apply this function to the column "Month" and create a new column called "Name of Month". 

In [46]:
#Define a function to recode the column "Gender".
#Instead of "M" and "F" -- have the words spelled out "Male" and "Female".
#Apply this function to the "Gender" column.
def recodeGender(gender):
    #Leave any unexpected value unchanged
    return {"M": "Male", "F": "Female"}.get(gender, gender)

df["Gender"] = df["Gender"].apply(recodeGender)
df

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers,needsMentor,hoursNeeded,overtimeBonus,Age Group
0,Jackie,Jackson,Female,59,4,12,Fire Safety,2,18,Y,19,28,Y,3,,9,,Adult
1,Mary,Patterson,Female,31,4,10,First Aid Booth,1,7,N,27,28,N,6,1.0,1,,Young Adult
2,Tanya,Adams,Female,48,1,3,Concession Stand,3,22,N,16,28,Y,2,,12,,Adult
3,Tanya,Henderson,Female,19,2,2,Information Desk,3,26,Y,9,28,Y,5,,19,,Young Adult
4,Walter,Franklin,Male,25,1,2,Concession Stand,1,25,Y,11,28,Y,4,,17,,Young Adult
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
994,Walter,Mulgrew,Male,63,3,4,Information Desk,2,10,Y,25,28,Y,5,,3,,Adult
995,Veronica,Looner,Female,63,3,5,Clean Up,2,28,N,8,28,Y,4,,20,,Adult
996,Larry,Patterson,Male,49,1,4,Clean Up,2,21,Y,16,28,N,3,,12,,Adult
997,Victor,Kane,Male,39,3,7,Concession Stand,2,28,Y,8,28,Y,4,,20,,Adult


In [47]:
#Define a function to recode the column "Month".
#Convert the numeric "Month" to the name of the Month.
#Apply this function to the column "Month" and create a new column called "Name of Month".
def nameOfMonth(month):
    #Look up the month name; numbers outside 1-12 return None
    monthNames = {
        1: "January", 2: "February", 3: "March", 4: "April",
        5: "May", 6: "June", 7: "July", 8: "August",
        9: "September", 10: "October", 11: "November", 12: "December",
    }
    return monthNames.get(month)

df["Name of Month"] = df["Month"].apply(nameOfMonth)

In [48]:
df

Unnamed: 0,First,Last,Gender,Age,Shift,Month,Volunteer Task,Task Level,Fees Owed,Materials Needed,Volunteer Hours,Hours Pledged,Task Training Completed,Recruited Volunteers,needsMentor,hoursNeeded,overtimeBonus,Age Group,Name of Month
0,Jackie,Jackson,Female,59,4,12,Fire Safety,2,18,Y,19,28,Y,3,,9,,Adult,December
1,Mary,Patterson,Female,31,4,10,First Aid Booth,1,7,N,27,28,N,6,1.0,1,,Young Adult,October
2,Tanya,Adams,Female,48,1,3,Concession Stand,3,22,N,16,28,Y,2,,12,,Adult,March
3,Tanya,Henderson,Female,19,2,2,Information Desk,3,26,Y,9,28,Y,5,,19,,Young Adult,February
4,Walter,Franklin,Male,25,1,2,Concession Stand,1,25,Y,11,28,Y,4,,17,,Young Adult,February
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
994,Walter,Mulgrew,Male,63,3,4,Information Desk,2,10,Y,25,28,Y,5,,3,,Adult,April
995,Veronica,Looner,Female,63,3,5,Clean Up,2,28,N,8,28,Y,4,,20,,Adult,May
996,Larry,Patterson,Male,49,1,4,Clean Up,2,21,Y,16,28,N,3,,12,,Adult,April
997,Victor,Kane,Male,39,3,7,Concession Stand,2,28,Y,8,28,Y,4,,20,,Adult,July
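
For next year's data, the two recodes could be wrapped in small functions that take a whole column. The sketch below uses `Series.map` and the standard-library `calendar.month_name` list. The function names are hypothetical, and `calendar.month_name` follows the current locale, so English names are an assumption about the environment.

```python
import calendar
import pandas as pd

def recode_gender(column):
    """Spell out "M"/"F"; leave any other value unchanged."""
    return column.replace({"M": "Male", "F": "Female"})

def month_number_to_name(column):
    """Map 1-12 to month names; other values become NaN."""
    names = {number: calendar.month_name[number] for number in range(1, 13)}
    return column.map(names)

# Hypothetical two-row sample.
sample = pd.DataFrame({"Gender": ["M", "F"], "Month": [12, 2]})
sample["Gender"] = recode_gender(sample["Gender"])
sample["Name of Month"] = month_number_to_name(sample["Month"])
print(sample)
```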
