# Project Summary
For your first major project of DS 1.1, you’ll investigate real-world data from feedback surveys completed during Make School’s very own Summer Academy program! Completing this project will strengthen your understanding of:
- The overall Data Science process (define, measure, analyze, improve, control)
- Aggregating datasets from multiple files, locations, and types
- The importance of scripting and automating data preprocessing
- Transforming data so that it has the same scale and data type
- Best practices for investigating data and asking interesting questions
- Data Visualization strategies
- Distilling findings down into small, understandable, non-technical (!) presentations


## Description of Problem
Clean and investigate Make School NPS data to find interesting and actionable trends that help inform decision-makers. Create a presentation in a Jupyter Notebook using data visualizations and other techniques that allow non-technical team members to understand your findings. 


## Background on NPS
Every summer, Make School welcomes hundreds of students into the Summer Academy to study software development and build cool stuff. The management wants to make sure that students continue to be satisfied with their experience as the program scales. The main way we measure this is through Net Promoter Score (NPS), which is a tool commonly used to measure customer loyalty and promotion. You’ve seen NPS before if you’ve been asked a question like:
“On a scale of 1 to 10, how likely are you to recommend [X] to a friend or colleague?”

NPS segments all responses between 1 and 10 into three categories based on their sentiment:
- Promoter (9 – 10)
- Passive (7 – 8)
- Detractor (1 – 6)

To calculate NPS, companies follow these steps:
1. Segment all responses into Promoter, Passive, and Detractor categories.
2. Calculate the percentage of responses in each category out of the total number of responses to the survey.
3. Subtract the Detractors percentage from the Promoters percentage. This is the NPS.

In other words, NPS can be calculated with this equation:
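$$\text{NPS} = 100 \times \frac{\text{Promoters} - \text{Detractors}}{\text{Promoters} + \text{Passives} + \text{Detractors}}$$

Here each term is the number of responses in that category, and the factor of 100 expresses the result in percentage points.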

NPS can range from –100 (if everyone is a detractor) to +100 (if everyone is a promoter).
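
As a quick check on the steps above, here is a minimal sketch of the calculation in plain Python. The function name `net_promoter_score` and the sample ratings are made up for illustration; only the category cutoffs come from the definitions above.

```python
def net_promoter_score(ratings):
    """Return the NPS (between -100 and 100) for an iterable of integer ratings."""
    ratings = list(ratings)
    if not ratings:
        raise ValueError("at least one rating is required")
    # Promoters score 9 or 10; detractors score 6 or below.
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Passives (7 or 8) only contribute to the total in the denominator.
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical ratings, not taken from the survey data.
sample = [10, 9, 9, 8, 7, 6, 3, 10, 8, 5]
print(net_promoter_score(sample))
```

With 4 promoters, 3 passives, and 3 detractors out of 10 responses, the sample works out to 40% minus 30%, an NPS of 10.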


## Questions to Consider Answering
In this scenario, your boss has just given you access to this data with the instruction to “See if you can find anything in here that can help the business.” That is a very broad brief. To complete the task well, consider answering the following questions:

- How many more promoters are there than detractors across our 2017 data?
- Which track boasts the best promoter-to-detractor ratio?
- Does the student experience get better the longer that they are enrolled at the Summer Academy?
- Does student satisfaction vary by location?
- What are things we could find here that could “help the business”?
- What sorts of information does this dataset contain?
- What kinds of questions might we be able to answer with this data?
- What kinds of questions can’t we answer with this data?
- What sorts of information might be actionable?
- How can you present your findings in a way that non-technical employees can understand and use to make decisions?


In [25]:
import pandas as pd


df = pd.read_csv('Student Feedback Surveys-Superview.csv')
df

| Unnamed: 0 | ID | Location | Track | Week | Rating (Num) | Schedule Pacing |
|---|---|---|---|---|---|---|
| 0 | 134 | San Francisco | Apps, Explorer | Week 1 | 3 | Just right |
| 1 | 36 | Los Angeles | Apps | Week 1 | 4 | A little too fast |
| 2 | 117 | San Francisco | Games | Week 1 | 4 | Way too slow |
| 3 | 253 | | | Week 2 | 4 | A little too fast |
| 4 | 350 | New York City | Apps, Explorer | Week 1 | 4 | Just right |
| ... | ... | ... | ... | ... | ... | ... |
| 1448 | 1495 | New York City | Apps, Explorer | Week 7 | 10 | Just right |
| 1449 | 1496 | New York City | Apps, Explorer | Week 7 | 8 | Just right |
| 1450 | 1497 | New York City | Apps | Week 7 | 10 | Just right |
| 1451 | 1498 | New York City | Apps, Explorer | Week 7 | 1 | A little too slow |
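
Row 3 in the preview has no Location or Track, so it is worth counting missing values before grouping by either column. The sketch below uses a small hand-built frame (the rows are illustrative stand-ins, not the real file) to show one way to do that with `isna()`.

```python
import pandas as pd

# Toy rows that mimic the shape of the preview; the real data comes from the CSV.
sample = pd.DataFrame({
    "Location": ["San Francisco", "Los Angeles", None, "New York City"],
    "Track": ["Apps, Explorer", "Apps", None, "Apps"],
    "Week": ["Week 1", "Week 1", "Week 2", "Week 7"],
})

# Number of missing entries in each column.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# Rows with at least one missing field, for manual inspection.
rows_with_gaps = sample[sample.isna().any(axis=1)]
print(rows_with_gaps)
```

The same two calls should work unchanged on `df`. Whether to drop or keep such rows depends on the question you are asking: a row without a Location can still count toward the overall NPS.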


In [32]:
df['Rating (Num)'].value_counts()

8          392
9          384
10         376
7          177
6           59
5           35
4           13
3            8
#ERROR!      3
1            2
0            2
2            2
Name: Rating (Num), dtype: int64
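
The three `#ERROR!` entries mean pandas reads this column as text rather than numbers. One possible way to clean it, sketched here on a small made-up Series rather than the real column, is `pd.to_numeric` with `errors="coerce"`, which turns anything unparseable into `NaN`.

```python
import pandas as pd

# Hypothetical values standing in for df['Rating (Num)'].
raw = pd.Series(["10", "8", "#ERROR!", "9", "0"], name="Rating (Num)")

# Unparseable strings such as '#ERROR!' become NaN instead of raising.
ratings = pd.to_numeric(raw, errors="coerce")
n_invalid = int(ratings.isna().sum())

# Drop the invalid entries and store the rest as integers.
clean = ratings.dropna().astype(int)
print(n_invalid)
print(clean.tolist())
```

Dropping the invalid rows is an assumption about how you want to treat them. Whichever choice you make, note it in your presentation, since it changes the denominator of the NPS.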

In [51]:
# The column holds strings (because of the '#ERROR!' entries), so compare against '10', not 10.
score_10 = len(df[df['Rating (Num)'] == '10'])
score_10

376

In [52]:
score_9 = len(df[df['Rating (Num)'] == '9'])
score_9

384

In [53]:
promoter = score_10 + score_9
promoter

760

In [54]:
score_8 = len(df[df['Rating (Num)'] == '8'])
score_8

392

In [55]:
score_7 = len(df[df['Rating (Num)'] == '7'])
score_7

177

In [56]:
passive = score_8 + score_7
passive

569
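
Counting one score at a time works, but it gets repetitive. As an alternative sketch, `pd.cut` can label every rating in one step, and a `groupby` can then give an NPS per location. The frame below is made-up data, the helper name `nps` is hypothetical, and the lowest bin edge assumes ratings of 0 should count as detractors.

```python
import pandas as pd

# Hypothetical, already-numeric ratings; not the real survey data.
sample = pd.DataFrame({
    "Location": ["San Francisco"] * 4 + ["New York City"] * 4,
    "Rating (Num)": [10, 9, 7, 4, 10, 8, 8, 9],
})

# Bins are right-inclusive: (-1, 6] Detractor, (6, 8] Passive, (8, 10] Promoter.
sample["Segment"] = pd.cut(
    sample["Rating (Num)"],
    bins=[-1, 6, 8, 10],
    labels=["Detractor", "Passive", "Promoter"],
)
segment_counts = sample["Segment"].value_counts()


def nps(ratings):
    """Share of promoters minus share of detractors, in percentage points."""
    return 100 * ((ratings >= 9).mean() - (ratings <= 6).mean())


overall_nps = nps(sample["Rating (Num)"])
nps_by_location = sample.groupby("Location")["Rating (Num)"].apply(nps)
print(segment_counts)
print(nps_by_location)
```

Swapping `"Location"` for `"Track"` or `"Week"` in the `groupby` gives the corresponding breakdowns, provided the rating column has been converted to numbers first.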