# Make School Summer Academy NPS Data Visualization
##### By Christian Galkowski



The purpose of this examination is to find interesting and actionable trends in data collected during the 2017 Make School Summer Academy, to help boost programming and the quality of education offered at Make School.

The tools used during this examination were Pandas, Matplotlib, Seaborn, NumPy, and Jupyter Notebook. The data was collected and provided by Make School.

# Background on NPS

NPS (Net Promoter Score) is calculated from the responses to student feedback forms and is meant to gauge overall interest in, and satisfaction with, the fields offered during the Summer Academy. In this scenario, responses of 9 or 10 are considered Promoter level (the student would be more likely to promote the school), 7-8 are considered Passive level (the student would neither promote nor speak poorly of the school), and 6 or below are considered Detractor level (the student would be more likely to speak poorly of the school).

Using those metrics, the goal was to use the data provided to calculate the promoter levels of each subset of the data, in this case each week of the Academy and each field offered. To calculate the total NPS, the formula below was used:

NPS = (Promoters - Detractors)/(Promoters + Passives + Detractors)

The formula calculates NPS by subtracting the number of Detractors from the number of Promoters and dividing the result by the total number of respondents. Further analysis took the total number of respondents into account to provide a more accurate view of each field, as the fields did not have the same number of respondents.
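As a minimal sketch of this calculation (plain Python, using a made-up list of ratings rather than the survey data, and a hypothetical function name), the categories and formula above could be expressed as:

```python
def net_promoter_score(ratings):
    """Return (promoters - detractors) / total respondents for 1-10 ratings."""
    promoters = sum(1 for r in ratings if r >= 9)    # 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)   # 6 or below
    # Passives (7-8) only contribute to the denominator
    return (promoters - detractors) / len(ratings)


# Hypothetical sample: 3 promoters, 2 passives, 1 detractor
sample = [10, 9, 9, 8, 7, 4]
print(round(net_promoter_score(sample), 3))  # (3 - 1) / 6 -> 0.333
```

Note that passive responses never appear in the numerator, so adding more of them pulls the score toward zero without changing its sign.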

# Background on Dataset

The data we were tasked with evaluating was collected via anonymous feedback during the 8-week-long Summer Academy. Our data is formatted so that every evaluation has an ID, Location, Track, Week, Rating, and Schedule Pacing. We are going to take that raw data and use it to determine the final evaluations for each column.


As part of our task to determine whether there is any merit to the data provided, we will be approaching the evaluation of this data with the following questions in mind:

- How many more promoters are there than detractors across our 2017 data?

- Which track boasts the best promoter-to-detractor ratio?

- Does the student experience get better the longer that they are enrolled at the Summer Academy?

- Does student satisfaction vary by location?

- What are things we could find here that could “help the business”?

- What sorts of information does this dataset contain?

- What kinds of questions might we be able to answer with this data?

- What kinds of questions can’t we answer with this data?

- What sorts of information might be actionable?

- How can you present your findings in a way that non-technical employees can understand and use to make decisions?

More specifically, we will focus on the following three: what information is actionable, what the total NPS is for each week and for the entire summer, and what questions we might be able to answer with this data. These three either require a complete understanding of the data or are directly linked to its overall evaluation.

-------------

# NPS Evaluation

In [4]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sb

In [35]:
df = pd.read_csv('Student-Feedback-Surveys-Superview.csv')
df.head()

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing
0,134,San Francisco,"Apps, Explorer",Week 1,3,Just right
1,36,Los Angeles,Apps,Week 1,4,A little too fast
2,117,San Francisco,Games,Week 1,4,Way too slow
3,253,,,Week 2,4,A little too fast
4,350,New York City,"Apps, Explorer",Week 1,4,Just right


In [54]:
df1 = df[df['Week'] == 'Week 1']
df1

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing,Rating (Integer),NPS Score
0,134,San Francisco,"Apps, Explorer",Week 1,3,Just right,3,Detractor
1,36,Los Angeles,Apps,Week 1,4,A little too fast,4,Detractor
2,117,San Francisco,Games,Week 1,4,Way too slow,4,Detractor
4,350,New York City,"Apps, Explorer",Week 1,4,Just right,4,Detractor
5,23,Redwood City,Apps,Week 1,5,Just right,5,Detractor
...,...,...,...,...,...,...,...,...
1128,1173,Tokyo,Apps,Week 1,7,A little too fast,7,Passive
1129,1174,Tokyo,Apps,Week 1,10,Just right,10,Promoter
1130,1175,Tokyo,Apps,Week 1,10,Just right,10,Promoter
1131,1176,Tokyo,Apps,Week 1,8,Just right,8,Passive


In [55]:
df2=df[df['Week'] == 'Week 2']
df2

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing,Rating (Integer),NPS Score
3,253,,,Week 2,4,A little too fast,4,Detractor
10,157,Redwood City,Apps,Week 2,5,Just right,5,Detractor
11,170,Oakland,Apps,Week 2,5,Just right,5,Detractor
12,255,San Francisco,Apps,Week 2,5,Way too fast,5,Detractor
27,193,Santa Clara,Apps,Week 2,6,Just right,6,Detractor
...,...,...,...,...,...,...,...,...
1195,1241,Tokyo,Apps,Week 2,5,Just right,5,Detractor
1196,1242,Tokyo,Apps,Week 2,10,Just right,10,Promoter
1324,1370,Tokyo,Apps,Week 2,10,A little too slow,10,Promoter
1330,1376,Tokyo,Apps,Week 2,7,A little too slow,7,Passive


In [56]:
df3=df[df['Week'] == 'Week 3']
df3

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing,Rating (Integer),NPS Score
15,307,,,Week 3,5,A little too slow,5,Detractor
16,319,Redwood City,Apps,Week 3,5,Just right,5,Detractor
18,441,San Francisco,"Apps, Explorer",Week 3,5,Just right,5,Detractor
19,482,Santa Clara,Apps,Week 3,5,Just right,5,Detractor
32,314,Redwood City,Apps,Week 3,6,A little too slow,6,Detractor
...,...,...,...,...,...,...,...,...
1412,1459,Tokyo,Apps,Week 3,9,Just right,9,Promoter
1413,1460,Tokyo,Apps,Week 3,9,A little too fast,9,Promoter
1416,1463,Tokyo,Apps,Week 3,10,Just right,10,Promoter
1417,1464,Tokyo,Apps,Week 3,8,Just right,8,Passive


In [57]:
df4=df[df['Week'] == 'Week 4']
df4

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing,Rating (Integer),NPS Score
526,556,Santa Clara,Apps,Week 4,8,Just right,8,Passive
527,557,Santa Clara,"Apps, Explorer",Week 4,8,Just right,8,Passive
528,558,Santa Clara,"Apps, Explorer",Week 4,8,Just right,8,Passive
529,559,Santa Clara,Apps,Week 4,9,Just right,9,Promoter
530,560,Santa Clara,Apps,Week 4,7,Just right,7,Passive
...,...,...,...,...,...,...,...,...
1147,1192,New York City,Apps,Week 4,10,Just right,10,Promoter
1157,1202,,,Week 4,7,Way too slow,7,Passive
1163,1209,New York City,Apps,Week 4,6,Just right,6,Detractor
1164,1210,New York City,Games,Week 4,9,Just right,9,Promoter


In [58]:
df5=df[df['Week'] == 'Week 5']
df5

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing,Rating (Integer),NPS Score
771,809,Redwood City,Apps,Week 5,7,Just right,7,Passive
772,810,San Francisco,VR,Week 5,10,Just right,10,Promoter
773,811,San Francisco,VR,Week 5,10,Just right,10,Promoter
774,812,Redwood City,Apps,Week 5,10,Just right,10,Promoter
775,813,Redwood City,Apps,Week 5,10,Just right,10,Promoter
...,...,...,...,...,...,...,...,...
1338,1385,New York City,"Apps, Explorer",Week 5,9,Just right,9,Promoter
1339,1386,New York City,Apps,Week 5,6,A little too slow,6,Detractor
1347,1394,New York City,Apps,Week 5,6,Just right,6,Detractor
1348,1395,New York City,Apps,Week 5,9,Just right,9,Promoter


In [59]:
df6=df[df['Week'] == 'Week 6']
df6

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing,Rating (Integer),NPS Score
995,1038,Redwood City,Apps,Week 6,7,Just right,7,Passive
996,1039,San Francisco,VR,Week 6,9,Just right,9,Promoter
997,1040,San Francisco,VR,Week 6,10,Just right,10,Promoter
1001,1044,San Francisco,VR,Week 6,10,Just right,10,Promoter
1002,1045,Redwood City,Apps,Week 6,7,Just right,7,Passive
...,...,...,...,...,...,...,...,...
1406,1453,New York City,"Apps, Explorer",Week 6,10,Just right,10,Promoter
1408,1455,New York City,Apps,Week 6,8,Just right,8,Passive
1409,1456,New York City,Apps,Week 6,10,Just right,10,Promoter
1410,1457,,,Week 6,10,Just right,10,Promoter


In [60]:
df7=df[df['Week'] == 'Week 7']
df7

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing,Rating (Integer),NPS Score
6,28,Los Angeles,Apps,Week 7,5,Just right,5,Detractor
1197,1243,San Francisco,VR,Week 7,7,A little too fast,7,Passive
1198,1244,San Francisco,VR,Week 7,10,Just right,10,Promoter
1199,1245,San Francisco,VR,Week 7,10,A little too fast,10,Promoter
1200,1246,San Francisco,VR,Week 7,8,Just right,8,Passive
...,...,...,...,...,...,...,...,...
1447,1494,New York City,"Games, Explorer",Week 7,10,Just right,10,Promoter
1448,1495,New York City,"Apps, Explorer",Week 7,10,Just right,10,Promoter
1449,1496,New York City,"Apps, Explorer",Week 7,8,Just right,8,Passive
1450,1497,New York City,Apps,Week 7,10,Just right,10,Promoter


In [61]:
df8=df[df['Week'] == 'Week 8']
df8

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing,Rating (Integer),NPS Score
1354,1401,San Francisco,VR,Week 8,10,Just right,10,Promoter
1395,1442,Los Angeles,Games,Week 8,7,Way too fast,7,Passive
1396,1443,Los Angeles,Apps,Week 8,9,Just right,9,Promoter
1397,1444,Los Angeles,Games,Week 8,9,Just right,9,Promoter
1398,1445,Los Angeles,Apps,Week 8,10,Just right,10,Promoter
1399,1446,Los Angeles,Games,Week 8,9,A little too slow,9,Promoter
1400,1447,Los Angeles,Apps,Week 8,10,Just right,10,Promoter
1415,1462,Los Angeles,Apps,Week 8,10,A little too fast,10,Promoter
1418,1465,Los Angeles,Games,Week 8,8,A little too slow,8,Passive


# Total NPS

To calculate the total NPS, we need the rating values as integers we can manipulate. We can then use them to create a new column called NPS Score that labels each rating as Promoter, Passive, or Detractor.

In the data, we find 3 entries that prevent us from performing the necessary calculations:

In [44]:
df[df['Rating (Num)'] == '#ERROR!']

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing
1310,1356,,,Week 2,#ERROR!,
1322,1368,,,Week 3,#ERROR!,
1411,1458,,,Week 3,#ERROR!,


Thus, using drop, we are able to remove them and begin our calculations:

In [45]:
df = df.drop([1310, 1322, 1411])

In [46]:
df

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing
0,134,San Francisco,"Apps, Explorer",Week 1,3,Just right
1,36,Los Angeles,Apps,Week 1,4,A little too fast
2,117,San Francisco,Games,Week 1,4,Way too slow
3,253,,,Week 2,4,A little too fast
4,350,New York City,"Apps, Explorer",Week 1,4,Just right
...,...,...,...,...,...,...
1448,1495,New York City,"Apps, Explorer",Week 7,10,Just right
1449,1496,New York City,"Apps, Explorer",Week 7,8,Just right
1450,1497,New York City,Apps,Week 7,10,Just right
1451,1498,New York City,"Apps, Explorer",Week 7,1,A little too slow


In [47]:
df[df['Rating (Num)'] == '#ERROR!']

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing
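
Dropping rows by hard-coded index works for this file, but it would need updating if the rows ever shifted. A more general sketch, assuming the only bad values are non-numeric strings such as `#ERROR!`, is to coerce the column with `pd.to_numeric` and discard whatever fails to parse. The toy frame below is illustrative and is not the survey data.

```python
import pandas as pd

# Toy stand-in for the survey data (values are illustrative only)
toy = pd.DataFrame({
    "Week": ["Week 1", "Week 2", "Week 3"],
    "Rating (Num)": ["9", "#ERROR!", "7"],
})

# Non-numeric strings become NaN instead of raising an error
numeric = pd.to_numeric(toy["Rating (Num)"], errors="coerce")

# Keep only rows whose rating parsed, then store it as an integer column
clean = toy[numeric.notna()].copy()
clean["Rating (Integer)"] = numeric[numeric.notna()].astype(int)

print(clean["Rating (Integer)"].tolist())  # [9, 7]
```

This combines the filtering and the integer conversion in one place, at the cost of silently discarding any value that is not a number.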


# Converting Ratings into Integers

In [48]:
df['Rating (Integer)'] = df['Rating (Num)'].apply(lambda x: int(x))

In [49]:
df

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing,Rating (Integer)
0,134,San Francisco,"Apps, Explorer",Week 1,3,Just right,3
1,36,Los Angeles,Apps,Week 1,4,A little too fast,4
2,117,San Francisco,Games,Week 1,4,Way too slow,4
3,253,,,Week 2,4,A little too fast,4
4,350,New York City,"Apps, Explorer",Week 1,4,Just right,4
...,...,...,...,...,...,...,...
1448,1495,New York City,"Apps, Explorer",Week 7,10,Just right,10
1449,1496,New York City,"Apps, Explorer",Week 7,8,Just right,8
1450,1497,New York City,Apps,Week 7,10,Just right,10
1451,1498,New York City,"Apps, Explorer",Week 7,1,A little too slow,1


Once the unnecessary values are removed, we create a function to sort through our data and assign promoter values:

In [50]:
def NPS_Score(rating):
    if rating >= 9:
        return 'Promoter'
    elif rating >= 7:
        return 'Passive'
    else:
        return 'Detractor'

In [51]:
df['NPS Score'] = df['Rating (Integer)'].apply(NPS_Score)

In [52]:
df

Unnamed: 0,ID,Location,Track,Week,Rating (Num),Schedule Pacing,Rating (Integer),NPS Score
0,134,San Francisco,"Apps, Explorer",Week 1,3,Just right,3,Detractor
1,36,Los Angeles,Apps,Week 1,4,A little too fast,4,Detractor
2,117,San Francisco,Games,Week 1,4,Way too slow,4,Detractor
3,253,,,Week 2,4,A little too fast,4,Detractor
4,350,New York City,"Apps, Explorer",Week 1,4,Just right,4,Detractor
...,...,...,...,...,...,...,...,...
1448,1495,New York City,"Apps, Explorer",Week 7,10,Just right,10,Promoter
1449,1496,New York City,"Apps, Explorer",Week 7,8,Just right,8,Passive
1450,1497,New York City,Apps,Week 7,10,Just right,10,Promoter
1451,1498,New York City,"Apps, Explorer",Week 7,1,A little too slow,1,Detractor
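
As an alternative sketch, the same labeling can be done without a Python-level function by binning the ratings with `pd.cut`. This assumes every rating falls between 1 and 10; unlike `NPS_Score`, values outside the bins would become NaN rather than Detractor. The ratings below are illustrative.

```python
import pandas as pd

ratings = pd.Series([3, 6, 7, 8, 9, 10])  # illustrative ratings

# Right-inclusive bins: (0, 6] Detractor, (6, 8] Passive, (8, 10] Promoter
labels = pd.cut(
    ratings,
    bins=[0, 6, 8, 10],
    labels=["Detractor", "Passive", "Promoter"],
)

print(labels.tolist())
# ['Detractor', 'Detractor', 'Passive', 'Passive', 'Promoter', 'Promoter']
```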


In [53]:
df['NPS Score'].value_counts()

Promoter     760
Passive      569
Detractor    121
Name: NPS Score, dtype: int64

As total NPS is calculated by (Promoters - Detractors)/total, we can determine the NPS of the 2017 Summer Academy review set (using the NPS helper defined under NPS by Weekly Evaluations below):

In [32]:
NPS(760,121,569)

0.4406896551724138

Given this evaluation, we find that our Net Promoter Score is approximately 44%, assuming all scores submitted are genuine, regardless of other missing data (some IDs with scores lack a location, track, or other answers).
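
To see how much of the data is affected by missing fields, one option is to count the empty values per column. The sketch below uses a toy frame with the notebook's column names; the values are made up, and on the real data `toy` would be replaced by `df`.

```python
import pandas as pd

# Toy stand-in; None marks a field the respondent left blank
toy = pd.DataFrame({
    "Location": ["Tokyo", None, "Oakland", None],
    "Track": ["Apps", None, None, "VR"],
    "Rating (Integer)": [10, 4, 8, 9],
})

# Number of missing values in each column of interest
missing = toy[["Location", "Track"]].isna().sum()
print(missing.to_dict())  # {'Location': 2, 'Track': 2}
```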

# Additional Insight

Now that our main task is complete, we can further examine the dataset. We will take a closer look at each week's and each track's NPS to further delineate what works and where we can improve, giving us actionable information for the future.

# NPS by Weekly Evaluations

In [33]:
def NPS(Promo, Detract, Passive):
    return (Promo - Detract)/(Promo+Detract+Passive)

In [62]:
df1['NPS Score'].value_counts()

Passive      131
Promoter     129
Detractor     28
Name: NPS Score, dtype: int64

In [67]:
NPS(129, 28, 131)

0.3506944444444444

In [64]:
df2['NPS Score'].value_counts()

Promoter     137
Passive      116
Detractor     23
Name: NPS Score, dtype: int64

In [68]:
NPS(137,23,116)

0.41304347826086957

In [69]:
df3['NPS Score'].value_counts()

Promoter     135
Passive       86
Detractor     20
Name: NPS Score, dtype: int64

In [70]:
NPS(135,20,86)

0.47717842323651455

In [71]:
df4['NPS Score'].value_counts()

Promoter     100
Passive       74
Detractor     19
Name: NPS Score, dtype: int64

In [72]:
NPS(100,19,74)

0.41968911917098445

In [73]:
df5['NPS Score'].value_counts()

Promoter     97
Passive      67
Detractor    15
Name: NPS Score, dtype: int64

In [74]:
NPS(97,15,67)

0.4581005586592179

In [75]:
df6['NPS Score'].value_counts()

Promoter     77
Passive      59
Detractor     8
Name: NPS Score, dtype: int64

In [76]:
NPS(77,8,59)

0.4791666666666667

In [77]:
df7['NPS Score'].value_counts()

Promoter     78
Passive      34
Detractor     8
Name: NPS Score, dtype: int64

In [78]:
NPS(78,8,34)

0.5833333333333334

In [79]:
df8['NPS Score'].value_counts()

Promoter    7
Passive     2
Name: NPS Score, dtype: int64

In [80]:
NPS(7,0,2)

0.7777777777777778
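
The eight weekly calculations can also be sketched in a single pass with `pd.crosstab`, which ties promoters and detractors to column names instead of positional arguments and counts a category that never occurs in a week as zero (as with Detractors in week 8). The toy frame is illustrative; on the real data it would be `df`.

```python
import pandas as pd

# Toy stand-in with the same column names as the notebook
toy = pd.DataFrame({
    "Week": ["Week 1"] * 4 + ["Week 2"] * 3,
    "NPS Score": ["Promoter", "Promoter", "Passive", "Detractor",
                  "Promoter", "Passive", "Passive"],
})

# Rows = weeks, columns = categories; absent combinations count as 0
counts = pd.crosstab(toy["Week"], toy["NPS Score"])
counts = counts.reindex(columns=["Promoter", "Passive", "Detractor"], fill_value=0)

# (Promoters - Detractors) / all respondents, computed per week
weekly_nps = (counts["Promoter"] - counts["Detractor"]) / counts.sum(axis=1)
print(weekly_nps.round(3).to_dict())  # {'Week 1': 0.25, 'Week 2': 0.333}
```

The same pattern applies to any grouping column, for example `Track` or `Location` in place of `Week`.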

# Data Observations

From our calculations we can derive a few things from the data:

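First-Using the promoter, detractor, and passive counts listed for each week (promoters minus detractors, divided by all respondents), weeks 1-3 showed a steady increase, from 35% at week 1 to 41% at week 2 and 48% at week 3

Second-Weeks 4-7 began with a dip to 42% at week 4, then rose steadily through 46% and 48% to 58% at week 7, with week 8 reaching 78% on only 9 responses

Third-Total participation in reviews dropped steadily over time, and drastically from week 7 to 8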


# Inferences

From this data, a few inferences with actionable significance can be made:

First-Generally, promoter status improved as time went on. Seemingly, as students began to learn more, they felt more confident in their abilities and thus rated the classes higher. Based on the weekly promoter and detractor counts, the one exception is the dip at week 4

Second-Generally, participation fell. Whether due to repetition of surveys, students leaving the program, or those who performed poorly defaulting to not participating in reviews, participation dropped over the course of the program. Week 8's 9-person review is likely due to one of a few speculative causes (late survey distribution, students disregarding the survey because it was the last week, or poor communication of the survey)

Third-Overall, the program was a success, with an overall NPS of 44% and strong finishes toward the end of the term

# NPS By Overall Track Evaluations

## Total Response by Track

In [81]:
df['Track'].value_counts()

Apps               871
Apps, Explorer     224
Games              208
VR                  60
Games, Explorer     43
Name: Track, dtype: int64

In [82]:
#Apps Track
df[df['Track'] == 'Apps']['NPS Score'].value_counts()

Promoter     438
Passive      355
Detractor     78
Name: NPS Score, dtype: int64

In [91]:
NPS(438, 78, 355)

0.4133180252583238

In [84]:
#Apps Explorer Track
df[df['Track'] == 'Apps, Explorer']['NPS Score'].value_counts()

Passive      109
Promoter      91
Detractor     24
Name: NPS Score, dtype: int64

In [92]:
NPS(91,24,109)

0.29910714285714285

In [93]:
#Games Track
df[df['Track'] == 'Games']['NPS Score'].value_counts()

Promoter     133
Passive       66
Detractor      9
Name: NPS Score, dtype: int64

In [94]:
NPS(133,9,66)

0.5961538461538461

In [88]:
#VR Track
df[df['Track'] == 'VR']['NPS Score'].value_counts()

Promoter     46
Passive      12
Detractor     2
Name: NPS Score, dtype: int64

In [95]:
NPS(46,2,12)

0.7333333333333333

In [90]:
#Games, Explorer Track
df[df['Track'] == 'Games, Explorer']['NPS Score'].value_counts()

Promoter    31
Passive     12
Name: NPS Score, dtype: int64

In [96]:
NPS(31,0,12)

0.7209302325581395
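
Because the tracks differ widely in response counts, it can help to attach a rough uncertainty to each score. One common approach, offered here as a sketch rather than as part of the original analysis, treats each response as +1 (Promoter), 0 (Passive), or -1 (Detractor) and computes the standard error of the mean. The function name is hypothetical, and its argument order mirrors the `NPS` helper.

```python
import math


def nps_with_standard_error(promoters, detractors, passives):
    """Return (NPS, standard error), scoring each response as +1, 0, or -1."""
    n = promoters + detractors + passives
    p = promoters / n
    d = detractors / n
    nps = p - d
    variance = (p + d) - nps ** 2  # E[X^2] - E[X]^2 for X in {+1, 0, -1}
    return nps, math.sqrt(variance / n)


# Counts taken from the track tables above
for name, counts in {"Apps": (438, 78, 355), "VR": (46, 2, 12)}.items():
    nps, se = nps_with_standard_error(*counts)
    print(f"{name}: NPS = {nps:.3f}, standard error = {se:.3f}")
```

Under these assumptions the standard error is about 0.02 for Apps and about 0.07 for VR, so the VR score is a noticeably less precise estimate.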

# Observations

First - All tracks had positive NPS over the entire term.

Second - Apps/Explorer had the lowest NPS over the entire term (30%).

Third - VR and Games/Explorer had the highest NPS (73% and 72%, respectively).

Fourth - Most Responses were for the Apps Track, with VR and Games/Explorer having second least and least respectively




# Inferences

First - Most students enjoyed their chosen tracks; however, Apps/Explorer students did so less. This could be because most tracks attracted participants who already knew they wanted to learn that material, whereas Apps/Explorer was a catchall for those who were unsure or fairly new.

Second - The low Apps/Explorer NPS could be due to fairly new programmers being placed in a fairly ambiguous category, combined with less guidance in that track due to its ambiguous, catchall nature.

Third - VR and Games/Explorer were the most liked but least attended tracks. Either students were primarily looking for Apps experience, VR and Games/Explorer had restrictions, or VR and Games/Explorer were not promoted in such a way as to appeal to most students. However, those who did choose those paths liked them more than the other options available.

Fourth - Games was the middle ground, with a high number of responses and a positive NPS higher than that of Apps; however, its smaller size relative to Apps leaves it in limbo. Perhaps too many students thought the more focused Games track would be to their liking, or perhaps the track was not designed to appeal to larger groups. Games could possibly indicate how smaller, focused tracks like VR would fare with a larger pool of students.

# Conclusion

In conclusion, we set out to determine three core things: NPS scores for the entire period as well as for each week, actionable information useful to Make School, and what we can likely infer from the overall Summer Academy.

Overall NPS scores were positive, with no extraordinary responses, at least for pools with large sample sizes. Deeper dives into weekly and track-specific scoring showed that NPS increased over time and that all tracks but Apps/Explorer had strongly positive scores.

Regarding inferences, the weekly promoter and detractor counts show a small dip in score at week 4 that may be related to programming or some event during that time. We also found a low NPS for Apps/Explorer and low turnout for VR and Games/Explorer, the two tracks with the highest ratings. This may reflect that Apps was seen as the main reason for attending the Academy, that it was advertised the most heavily, or that the other programs were not presented in a way that attracted participants. Those who did participate in them had, on average, a better experience than other students.



# Conclusion Cont.

Finally, actionable information. From our inferences and data manipulation, if Make School were to continue the Academy or is looking to use this information to inform future programming, then an Apps and Games focus seems to be the most consistent avenue to pursue. However, given our inferences on Games and the success of VR and Games/Explorer, it is reasonable to assume that, at larger scale, fringe elements of game development like VR would be as highly regarded or more so if Make School decided to teach classes or offer a track in those fields. Actionable improvements, based on our inferences, would be to closely examine the Apps/Explorer track to determine what exactly caused its comparatively low scores, and to see what occurred around week 4, where the weekly counts put NPS at roughly 42% after 48% in week 3.

Overall, improving the quality of Apps/Explorer and broadening learning opportunities in VR and game development would likely be the best actions to take based on the 2017 Summer Academy surveys and NPS manipulation.