The Titan Insurance Company has just installed a new incentive payment scheme for its life policy sales force. It wants to have an early view of the success or failure of the new scheme. Indications are that the sales force is selling more policies, but sales always vary in an unpredictable pattern from month to month and it is not clear that the scheme has made a significant difference.

Life insurance companies typically measure the monthly output of a salesperson as the total sum assured for the policies sold by that person during the month. For example, suppose salesperson X has, in the month, sold seven policies for which the sums assured are £1000, £2500, £3000, £5000, £5000, £10000, £35000. X's output for the month is the total of these sums assured, £61,500. Under Titan's new scheme, the sales force receives low regular salaries but is paid large bonuses related to output (i.e. to the total sum assured of the policies each person sells). The scheme is expensive for the company, but it is looking for sales increases that more than compensate. The agreement with the sales force is that if the scheme does not at least break even for the company, it will be abandoned after six months.

The scheme has now been in operation for four months. It has settled down after fluctuations in the first two months due to the changeover.

To test the effectiveness of the scheme, Titan has taken a random sample of 30 salespeople, measured their output in the penultimate month prior to the changeover, and then measured it in the fourth month after the changeover (months not too close to the changeover were chosen deliberately). The outputs of the salespeople are shown in Table 1.

Questions:

    1. Find the mean of the old scheme and new scheme columns. (5 points)
    2. Use a test at the five percent significance level to determine the p-value and check whether the new scheme has significantly raised outputs. (10 points)
    3. What conclusion does the test (p-value) lead to? (2.5 points)
    4. Suppose it has been calculated that, in order for Titan to break even, the average output must increase by £5000 under the new scheme compared to the old scheme. If this figure is the alternative hypothesis, what is:
        a) The probability of a type 1 error? (2.5 points)
        b) The p-value of the hypothesis test if we test for a difference of £5000? (10 points)
        c) The power of the test? (5 points)

In [1]:
import numpy as np
import pandas as pd

In [2]:
titan = pd.read_csv("Titan_Insurance_Sales_Performance.csv")

In [3]:
titan.head()

   Sales_Person  Old_Scheme_Sales  New_Scheme_Sales
0             1                57                62
1             2               103               122
2             3                59                54
3             4                75                82
4             5                84                84


In [4]:
titan.describe().transpose()

                  count       mean        std   min    25%   50%    75%    max
Sales_Person       30.0  15.5       8.803408    1.0   8.25  15.5  22.75   30.0
Old_Scheme_Sales   30.0  68.033333  20.45598   28.0  54.0   67.0  81.5   110.0
New_Scheme_Sales   30.0  72.033333  24.062395  32.0  55.0   74.0  85.75  122.0


1. Find the mean of the old scheme and new scheme columns. (5 points)

In [5]:
print("The mean value of the Old Scheme is",titan.Old_Scheme_Sales.mean())

The mean value of the Old Scheme is 68.03333333333333


In [6]:
print("The mean value of the New Scheme is ",titan.New_Scheme_Sales.mean())

The mean value of the New Scheme is  72.03333333333333


2. Use a test at the five percent significance level to determine the p-value and check whether the new scheme has significantly raised outputs. (10 points)

Answer:

In [7]:
from scipy.stats import ttest_rel
import statsmodels.stats.power as p

# Paired t-test: the same 30 salespeople are measured under both schemes
t_statistic, p_value = ttest_rel(titan.Old_Scheme_Sales, titan.New_Scheme_Sales)
# ttest_rel performs a two-tailed test by default, so the p-value is halved
# to obtain the one-tailed p-value (alternative: new scheme mean > old scheme mean).
# Halving is valid because the sample mean difference lies in that direction.
print("The p-value of the test is", p_value/2)

nobs = titan.Old_Scheme_Sales.count() # number of observations is 30
alpha = 0.05
# Effect size: difference in means divided by the pooled standard deviation
ef_n = titan.New_Scheme_Sales.mean() - titan.Old_Scheme_Sales.mean()
ef_d = (((nobs-1)*titan.Old_Scheme_Sales.var())+((nobs-1)*titan.New_Scheme_Sales.var()))/(nobs+nobs-2)
effect_size = ef_n / np.sqrt(ef_d)
print("The power of the above ttest is", p.ttest_power(effect_size, nobs, alpha, alternative='larger'))

The p-value of the test is 0.06528776980668831
The power of the above ttest is 0.24615579359381035
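
The halving step can also be expressed directly. The sketch below assumes SciPy 1.6 or later, where `ttest_rel` accepts an `alternative` argument. It uses only the five rows shown by `titan.head()` as illustrative data, not the full sample of 30, so the p-values it prints are not the ones reported above.

```python
import numpy as np
from scipy.stats import ttest_rel

# Illustrative subset: the five rows shown by titan.head(), in thousands of pounds
old = np.array([57, 103, 59, 75, 84])
new = np.array([62, 122, 54, 82, 84])

# Two-tailed paired test (the default)
t_two, p_two = ttest_rel(new, old)

# One-tailed paired test of "new scheme mean > old scheme mean"
t_one, p_one = ttest_rel(new, old, alternative="greater")

# When the t statistic points in the direction of the alternative,
# the one-tailed p-value is half the two-tailed one
halving_matches = bool(t_two > 0 and np.isclose(p_one, p_two / 2))
print("two-tailed:", p_two, "one-tailed:", p_one, "halving matches:", halving_matches)
```

Passing `new` first makes a positive t statistic correspond to an increase under the new scheme, which keeps the direction of the alternative easy to read.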


3. What conclusion does the test (p-value) lead to? (2.5 points)

Answer:

The null hypothesis is that the mean output under the new scheme is the same as under the old scheme (a mean difference of zero); the alternative is that the new scheme has raised the mean output.
The one-tailed p-value is 0.0652877698, which is greater than the level of significance (0.05, or 5%).
Hence we conclude from the t-test that there is not enough evidence to reject the null hypothesis,
i.e. the data do not show that the new scheme has significantly raised the outputs.



Also note that the power of this test is only 0.246, whereas the conventional target is at least 0.8.
Power is the probability of rejecting the null hypothesis when the alternative is true, so for the effect size estimated from the sample this test would detect a real improvement only about 25% of the time.
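
To make explicit that the paired test concerns the mean of the differences rather than their variance, the statistic can be computed by hand as t = mean(d) / (sd(d) / sqrt(n)). This is a minimal sketch using only the five `titan.head()` rows as stand-in data; the helper name `paired_t` is hypothetical.

```python
import numpy as np
from scipy import stats

def paired_t(new, old):
    """Return (t statistic, one-tailed p-value) for H1: mean(new - old) > 0."""
    d = np.asarray(new, dtype=float) - np.asarray(old, dtype=float)
    n = d.size
    # Mean difference divided by its standard error
    t_stat = d.mean() / (d.std(ddof=1) / np.sqrt(n))
    # Upper-tail probability of Student's t with n - 1 degrees of freedom
    p_one_tailed = stats.t.sf(t_stat, df=n - 1)
    return t_stat, p_one_tailed

# Illustrative subset only (thousands of pounds), not the full sample of 30
old = [57, 103, 59, 75, 84]
new = [62, 122, 54, 82, 84]

t_stat, p_one_tailed = paired_t(new, old)
alpha = 0.05
reject_null = bool(p_one_tailed < alpha)
print(t_stat, p_one_tailed, "reject H0" if reject_null else "do not reject H0")
```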

4. Suppose it has been calculated that, in order for Titan to break even, the average output must increase by £5000 under the new scheme compared to the old scheme. If this figure is the alternative hypothesis, what is:

a) The probability of a type 1 error? (2.5 points)

Answer: A Type 1 error is rejecting the null hypothesis when it is actually true. Since we have set the level of significance to 0.05, the probability of committing a Type 1 error is also 0.05.
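
The link between the significance level and the Type 1 error rate can be checked by simulation. The sketch below is illustrative only: it draws normally distributed paired differences with a true mean of zero, and the spread `sd`, the seed and the number of simulated samples are assumptions rather than values taken from the Titan data.

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
alpha = 0.05
n = 30                           # sample size, as in the Titan sample
n_sims = 2000                    # number of simulated samples (arbitrary choice)
sd = 14.0                        # assumed spread of the paired differences (hypothetical)

# Simulate paired differences when the null is true (true mean difference = 0)
diffs = rng.normal(loc=0.0, scale=sd, size=(n_sims, n))

# One-tailed one-sample t-test on each simulated sample (one test per row)
_, p_values = ttest_1samp(diffs, 0.0, axis=1, alternative="greater")

# Share of samples in which the true null hypothesis is wrongly rejected
type1_rate = float(np.mean(p_values < alpha))
print("Estimated Type 1 error rate:", type1_rate)
```

The estimated rate should land close to `alpha`, with some simulation noise.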

b) What is the p-value of the hypothesis test if we test for a difference of £5000? (10 points)

In [None]:
# The break-even figure is used as the hypothesised mean difference.
# Null hypothesis: New_Scheme_Sales.mean() - Old_Scheme_Sales.mean() = £5000
# Alternative hypothesis (one-tailed): the mean difference is below £5000
# Since the comparison is between the sample of paired differences and a constant, a one-sample t-test suits this scenario
# The data source table records values in thousands of pounds (without the trailing zeroes),
# so the constant is entered as 5 instead of 5000
from scipy.stats import ttest_1samp
t_statistic, p_value = ttest_1samp(titan.New_Scheme_Sales-titan.Old_Scheme_Sales,5)
# ttest_1samp is two-tailed by default. Halving the p-value gives the one-tailed p-value
# for the tail in which the sample mean difference (4) lies, i.e. below 5
print("The p-value of the test for a difference of 5k is",p_value/2)
print("Conclusion:")
if p_value/2 > 0.05:
    print("Since the p-value of the test for a difference of 5k is greater than the level of significance we cannot reject the null hypothesis")
else:
    print("Since the p-value of the test for a difference of 5k is less than the level of significance we reject the null hypothesis")

c) Power of the test (5 points)

In [None]:
import statsmodels.stats.power as p

# Observed mean difference minus the hypothesised mean difference (5, in thousands)
ef_n = (np.mean(titan.New_Scheme_Sales-titan.Old_Scheme_Sales)) - 5

# Standard deviation of the paired differences
ef_d = (titan.New_Scheme_Sales - titan.Old_Scheme_Sales).std()

# The effect size is negative here because the observed mean difference (4) is below 5
effect_size = ef_n / ef_d

print("The power of the above ttest is",p.ttest_power(effect_size,30,0.05,alternative = 'larger'))

Conclusion:

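The power of the test, i.e. the probability of rejecting the null hypothesis when the alternative is true, is 0.0214.

The conventional target for power is 0.8, so at about 2% this test is very unlikely to detect the effect it was set up for. The value falls below the 5% significance level because the effect size is negative (the observed mean difference of 4 is below the hypothesised 5) while the alternative is specified as 'larger'.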
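
The behaviour of the power figure can be illustrated with SciPy's noncentral t distribution: for a one-sided test, power is the probability that the statistic exceeds the critical value when the true effect is nonzero. The sketch below is an illustration under stated assumptions, not the notebook's own calculation. `one_sided_power` is a hypothetical helper, and `sd_diff = 14.0` is an assumed standard deviation rather than a value taken from the data.

```python
import numpy as np
from scipy import stats

def one_sided_power(effect_size, nobs, alpha=0.05):
    """Power of a one-sample (or paired) t-test with alternative 'larger'."""
    df = nobs - 1
    t_crit = stats.t.ppf(1 - alpha, df)          # rejection threshold under H0
    noncentrality = effect_size * np.sqrt(nobs)  # shift of the t statistic under H1
    return float(stats.nct.sf(t_crit, df, noncentrality))

nobs, alpha = 30, 0.05
sd_diff = 14.0   # hypothetical standard deviation of the paired differences

power_positive = one_sided_power(5 / sd_diff, nobs, alpha)   # true shift of +5 (thousand)
power_zero = one_sided_power(0.0, nobs, alpha)               # no shift: power equals alpha
power_negative = one_sided_power(-1 / sd_diff, nobs, alpha)  # shift against the alternative
print(power_positive, power_zero, power_negative)
```

A negative effect size with alternative 'larger' gives a power below `alpha`, which is why a figure near 2% appears when the observed mean difference of 4 is measured against a hypothesised 5.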