# Net promoter system
Many organisations across the world use the Net Promoter System (NPS).  At its simplest, NPS measures customer advocacy: how likely a customer is to recommend a brand.

Individual customers give a score from 0 to 10 based on how likely they are to recommend the brand to their friends and family.  Customers who score 9 or 10 are classified as promoters, those who score 7 or 8 as passives, and those who score 0 to 6 as detractors.  To calculate NPS, the percentage of detractors is subtracted from the percentage of promoters, giving a value between -100 and 100.

![](https://trustmary.com/wp-content/uploads/2021/11/Trustmary-NPS-2.png)
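The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than part of the analysis that follows; the function name `net_promoter_score` and the sample scores are made up for the example.

```python
def net_promoter_score(scores):
    """Return the NPS (-100 to 100) for an iterable of 0-10 survey scores."""
    scores = list(scores)
    if not scores:
        raise ValueError("at least one score is required")
    promoters = sum(1 for s in scores if s >= 9)   # scores of 9 or 10
    detractors = sum(1 for s in scores if s <= 6)  # scores of 0 to 6
    # Passives (7 and 8) count towards the total but not the numerator.
    return 100 * (promoters - detractors) / len(scores)


# Hypothetical sample: 5 promoters, 2 passives, 3 detractors.
sample_scores = [10, 9, 9, 8, 7, 6, 3, 10, 10, 2]
print(net_promoter_score(sample_scores))  # 20.0
```

Here 50% of the respondents are promoters and 30% are detractors, so the NPS is 50 - 30 = 20.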

This dataset is based on a real distribution of scores from customers in a variety of countries.  First, let's read in the data and inspect the columns.



In [4]:
import plotly.graph_objs as go
import plotly.express as px
import numpy as np
import pandas as pd
from plotly.offline import plot, iplot, init_notebook_mode
init_notebook_mode(connected=True)

In [5]:
colourlist = ['#871e71','#b42479','#da3877','#eb676e','#f79473','#fbd395']

In [6]:
df1 = pd.read_csv('./NPStimeseries.csv')
print(df1.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 7 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   ID             5000 non-null   int64 
 1   Market         5000 non-null   object
 2   Survey date    5000 non-null   object
 3   Customer Name  5000 non-null   object
 4   Month          5000 non-null   int64 
 5   Quarter        5000 non-null   int64 
 6   NPS            5000 non-null   int64 
dtypes: int64(4), object(3)
memory usage: 273.6+ KB
None
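The `info()` output shows `Survey date` stored as `object`, meaning plain strings, alongside integer `Month` and `Quarter` columns. As a rough sketch of how such columns could be derived, the snippet below parses a few illustrative dates. It assumes the dates are written day/month/year; the `sample` frame is hypothetical and stands in for the real file.

```python
import pandas as pd

# Illustrative dates in the assumed day/month/year format.
sample = pd.DataFrame(
    {'Survey date': ['01/09/2021', '07/11/2021', '25/12/2021', '01/03/2021']}
)

# An explicit format avoids any day/month ambiguity when parsing.
parsed = pd.to_datetime(sample['Survey date'], format='%d/%m/%Y')
sample['Month'] = parsed.dt.month
sample['Quarter'] = parsed.dt.quarter
print(sample)
```

Keeping a parsed datetime column around would also allow grouping by other periods, such as week or year, without further pre-calculated columns.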


There are 5000 rows of data.  Each customer has an ID, their name, the score they gave, and the survey date.  The month and quarter are pre-calculated to make aggregating by date easier.  To simplify the later steps, we first classify each customer as a promoter, passive or detractor.

In [7]:
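# Map each 0-10 score to its NPS category
conditions = [
    (df1['NPS'] <= 6),
    ((df1['NPS'] > 6) & (df1['NPS'] < 9)),
    (df1['NPS'] >= 9)
]

values = ['Detractor', 'Passive', 'Promoter']

# An explicit string default avoids mixing string choices with np.select's
# numeric default of 0, which some NumPy versions reject with a TypeError
df1['NPS category'] = np.select(conditions, values, default='Unknown')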
print(df1.head())

     ID Market Survey date    Customer Name  Month  Quarter  NPS NPS category
0  1000     US  01/09/2021  Krista Richards      9        3   10     Promoter
1  1001    MEX  07/11/2021      Monica King     11        4    9     Promoter
2  1002     UK  25/12/2021  Ricky Armstrong     12        4    0    Detractor
3  1003     UK  01/10/2021     Andrea Foley     10        4   10     Promoter
4  1004     UK  01/03/2021     Jerry Garcia      3        1    8      Passive
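The same classification can also be expressed with `pd.cut`, which bins a numeric column into labelled ranges. This is an alternative sketch rather than the approach used in this notebook; the small `scores` frame is hypothetical and only covers the boundary values.

```python
import pandas as pd

# Hypothetical scores covering the edges of each category.
scores = pd.DataFrame({'NPS': [0, 6, 7, 8, 9, 10]})

# Bins are right-inclusive by default: (-1, 6], (6, 8], (8, 10].
scores['NPS category'] = pd.cut(
    scores['NPS'],
    bins=[-1, 6, 8, 10],
    labels=['Detractor', 'Passive', 'Promoter'],
)
print(scores)
```

One difference worth noting is that `pd.cut` returns a categorical column, whereas `np.select` with string choices gives plain strings.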



# NPS by market
First, we want to understand whether NPS differs by market.

In [8]:
# Count responses per market and NPS category, then pivot to one column per category
dfmarket = df1.groupby(['Market','NPS category'])['ID'].count()
dfmarket = dfmarket.unstack()
dfmarket['Total'] = dfmarket.sum(axis = 1)
dfmarket = dfmarket.reset_index()
# NPS = percentage of promoters minus percentage of detractors
dfmarket['NPS score'] = (dfmarket['Promoter']/dfmarket['Total'] - dfmarket['Detractor']/dfmarket['Total'])*100
dfmarket = dfmarket.drop(columns = ['Detractor','Passive','Promoter','Total'])
fig = px.bar(dfmarket, x = 'Market', y = 'NPS score', title = 'NPS score by market', template = 'plotly_white',
             color = 'Market', color_discrete_map = {'MEX': '#871e71', 'UK': '#da3877', 'US': '#fbd395'})
fig.show()



From the graph, we can see that the NPS for Mexico is higher than for the UK, but not by much.
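A more compact way to reach the same per-market figure is to weight each response as +1 (promoter), 0 (passive) or -1 (detractor) and take the mean, since that mean equals the promoter share minus the detractor share. The snippet below is a sketch on a hypothetical `responses` frame, not the notebook's data; the `weight` column name is an assumption made for the example.

```python
import pandas as pd

# Hypothetical responses for two markets.
responses = pd.DataFrame({
    'Market': ['UK', 'UK', 'UK', 'UK', 'US', 'US', 'US', 'US'],
    'NPS category': ['Promoter', 'Promoter', 'Passive', 'Detractor',
                     'Promoter', 'Detractor', 'Detractor', 'Passive'],
})

# Mean of the weights = share of promoters - share of detractors.
weights = {'Promoter': 1, 'Passive': 0, 'Detractor': -1}
responses['weight'] = responses['NPS category'].map(weights)
nps_by_market = responses.groupby('Market')['weight'].mean() * 100
print(nps_by_market)
```

In this sample the UK has two promoters and one detractor out of four responses, giving an NPS of 25, while the US has one promoter and two detractors, giving -25.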


# NPS over time
Here we look at whether NPS is getting better or worse over time, using the monthly NPS for each market.

In [14]:
# Keep only the Mexican responses, then count them per month and NPS category
dfmex = df1[df1['Market'] == 'MEX']
dfmex = dfmex.groupby(['Month','NPS category'])['ID'].count()
dfmex = dfmex.unstack()
dfmex['Total'] = dfmex.sum(axis = 1)
# Monthly NPS = percentage of promoters minus percentage of detractors
dfmex['NPS score'] = (dfmex['Promoter']/dfmex['Total'] - dfmex['Detractor']/dfmex['Total'])*100
dfmex = dfmex.drop(columns = ['Detractor','Passive','Promoter','Total'])
fig2 = px.line(dfmex, template = 'plotly_white', title = 'NPS for Mexico over time')
fig2.update_layout(showlegend = False)
fig2.update_traces(line = {'color':'#871e71'})
fig2.update_yaxes(title = 'NPS')
fig2.show()

We can see that the NPS has gone up over time for Mexico.

In [12]:
dfuk = df1[df1['Market'] == 'UK']
dfuk = dfuk.groupby(['Month','NPS category'])['ID'].count()
dfuk = dfuk.unstack()
dfuk['Total'] = dfuk.sum(axis = 1)
dfuk['NPS score'] = (dfuk['Promoter']/dfuk['Total'] - dfuk['Detractor']/dfuk['Total'])*100
dfuk = dfuk.drop(columns = ['Detractor','Passive','Promoter','Total'])
fig2 = px.line(dfuk, template = 'plotly_white', title = 'NPS for UK over time')
fig2.update_layout(showlegend = False)
fig2.update_traces(line = {'color':'#871e71'})
fig2.update_yaxes(title = 'NPS')
fig2.show()

In [13]:
dfus = df1[df1['Market'] == 'US']
dfus = dfus.groupby(['Month','NPS category'])['ID'].count()
dfus = dfus.unstack()
dfus['Total'] = dfus.sum(axis = 1)
dfus['NPS score'] = (dfus['Promoter']/dfus['Total'] - dfus['Detractor']/dfus['Total'])*100
dfus = dfus.drop(columns = ['Detractor','Passive','Promoter','Total'])
fig2 = px.line(dfus, template = 'plotly_white', title = 'NPS for US over time')
fig2.update_layout(showlegend = False)
fig2.update_traces(line = {'color':'#871e71'})
fig2.update_yaxes(title = 'NPS')
fig2.show()
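The three cells above repeat the same steps for each market. One possible way to avoid the repetition is a small helper that computes NPS for any set of grouping columns. The sketch below assumes a frame with the same `ID` and `NPS category` columns as `df1`; the helper name `nps_by_group` and the `sample` data are hypothetical.

```python
import pandas as pd


def nps_by_group(df, group_cols):
    """Return one NPS score per combination of the given grouping columns."""
    counts = (
        df.groupby(group_cols + ['NPS category'])['ID']
        .count()
        .unstack(fill_value=0)  # a missing category counts as zero, not NaN
        .reindex(columns=['Detractor', 'Passive', 'Promoter'], fill_value=0)
    )
    total = counts.sum(axis=1)
    nps = 100 * (counts['Promoter'] - counts['Detractor']) / total
    return nps.rename('NPS score').reset_index()


# Hypothetical data: two markets, two months, two responses each.
sample = pd.DataFrame({
    'ID': range(1, 9),
    'Market': ['MEX', 'MEX', 'MEX', 'MEX', 'UK', 'UK', 'UK', 'UK'],
    'Month': [1, 1, 2, 2, 1, 1, 2, 2],
    'NPS category': ['Promoter', 'Detractor', 'Promoter', 'Promoter',
                     'Passive', 'Detractor', 'Promoter', 'Passive'],
})
monthly = nps_by_group(sample, ['Market', 'Month'])
print(monthly)
```

Because the result is in long form, a call along the lines of `px.line(monthly, x='Month', y='NPS score', color='Market')` could draw every market on a single chart. Filling missing categories with zero also keeps a month with no detractors from producing a missing NPS value.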

In [15]:
import seaborn as sns

In [17]:
df1.head()

     ID Market Survey date    Customer Name  Month  Quarter  NPS NPS category
0  1000     US  01/09/2021  Krista Richards      9        3   10     Promoter
1  1001    MEX  07/11/2021      Monica King     11        4    9     Promoter
2  1002     UK  25/12/2021  Ricky Armstrong     12        4    0    Detractor
3  1003     UK  01/10/2021     Andrea Foley     10        4   10     Promoter
4  1004     UK  01/03/2021     Jerry Garcia      3        1    8      Passive
