# Numeracy Intervention Analysis File

This file is the companion to the manuscript titled "Numeracy intervention for remedying the impact of school mobility", submitted to the University of Sunderland as an assessment for module EDE314 "Experiences of Teaching".

In [35]:
import pandas as pd
import numpy as np
import statsmodels.api as sm

# Mapping from concepts to the (0-based) column positions of their survey questions
labels2q = {'Success': (1, 2), 'Interest': (3, 4, 5), 'Utility': (6, 13, 7), 'Mastery': (8, 9), 'Performance': (10, 11)}
# Column position of the question that has a reversed scale (1-10)
reverse = 4

df = pd.read_csv('Survey Learning.csv')
questions = list(df.columns.values)
# Reverse the score of the question with the reversed scale
df[questions[reverse]] = 10 - df[questions[reverse]]

# Calculate the total score for each concept
groups = pd.DataFrame()
for name, q in labels2q.items():
    print('Concept {} with {} questions'.format(name, len(q)))
    for qs in q:
        print(questions[qs].rjust(110, '*'))
    # Select columns by position, since the column labels are the question texts
    totalScore = df.iloc[:, list(q)].sum(axis=1)
    groups[name] = totalScore


Concept Performance with 2 questions
*******************************************It is important for me to do well compared to others in this class.
********************************************************I just want to avoid getting a low grade in this class
Concept Mastery with 2 questions
********The most important thing for me in this course is to understand the content as thoroughly as possible.
****************************************Mastering the material in Introduction to robotics is important to me.
Concept Success with 2 questions
****************************************************************************I expect to do well in this class.
****************Considering the difficulty of this course and my skills, I think I will do well in this class.
Concept Interest with 3 questions
******************************************************************I think the field of robotics is interesting
***********************************************To be honest, I just don’t find science

In [36]:
# Show the concepts and scores
groups

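| | Performance | Mastery | Success | Interest | Utility |
|---|---|---|---|---|---|
| 0 | 7 | 10 | 12 | 20 | 15 |
| 1 | 11 | 11 | 9 | 18 | 17 |
| 2 | 14 | 13 | 12 | 22 | 19 |
| 3 | 12 | 13 | 12 | 23 | 21 |
| 4 | 2 | 12 | 11 | 22 | 18 |
| 5 | 12 | 12 | 11 | 20 | 19 |
| 6 | 11 | 14 | 11 | 20 | 17 |
| 7 | 14 | 14 | 12 | 23 | 21 |
| 8 | 14 | 14 | 11 | 22 | 18 |
| 9 | 7 | 14 | 14 | 23 | 21 |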


## Calculate Cronbach's Alpha

Cronbach's alpha ($\alpha$) is also included to assess the internal consistency of these scores, under the implicit assumption that the average correlation among a set of self-reported items is an accurate indicator of whether those items belong to the same construct. A value of $\alpha$ close to one indicates that the survey items in a group correspond to the same concept. Conversely, a small $\alpha$ suggests that the group has too few questions or that its items are poorly interrelated (Tavakol & Dennick, 2011). $\alpha$ is calculated as follows:
$$\alpha = \frac{K}{K-1} \left(1-\frac{\sum_{i=1}^K \sigma^2_{Y_i}}{ \sigma^2_X}\right)$$
where $K$ is the number of items in the concept, $\sigma^2_{Y_i}$ is the variance of item $i$, and $\sigma^2_X$ is the variance of the total score.
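
As a quick check of the formula, the sketch below computes $\alpha$ for two small made-up item sets: one where both items always agree and one where they are uncorrelated. The function name `cronbach_alpha` and the data are illustrative assumptions, not part of the survey analysis.

```python
import pandas as pd


def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a DataFrame with one column per item."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the summed score
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)


# Hypothetical responses from four respondents (not survey data).
identical = pd.DataFrame({'q1': [1, 2, 3, 4], 'q2': [1, 2, 3, 4]})
uncorrelated = pd.DataFrame({'q1': [1, 2, 3, 4], 'q2': [2, 4, 1, 3]})

alpha_identical = cronbach_alpha(identical)        # items agree perfectly
alpha_uncorrelated = cronbach_alpha(uncorrelated)  # items share no covariance
print(alpha_identical, alpha_uncorrelated)
```

Perfectly agreeing items give an $\alpha$ of one, while items with zero covariance give an $\alpha$ of zero, which matches the interpretation given above.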

In [76]:
for name, q in labels2q.items():
    # Select columns by position, since the column labels are the question texts
    items = df.iloc[:, list(q)]
    sigmaY = items.var(axis=0)          # variance of each item
    sigmaX = items.sum(axis=1).var()    # variance of the total score
    K = len(q)
    alpha = K/(K-1)*(1-sigmaY.sum()/sigmaX)
    print("Concept: {:15} with Cronbach's alpha of {:.3}".format(name, alpha))


Concept: Performance     with Cronbach's alpha of 0.492
Concept: Mastery         with Cronbach's alpha of 0.798
Concept: Success         with Cronbach's alpha of 0.564
Concept: Interest        with Cronbach's alpha of 0.795
Concept: Utility         with Cronbach's alpha of 0.777


## Multiple regression modeling

The multiple regression model describes the response as a weighted sum of the predictors plus an intercept. For example, the model in which the two variables Success and Mastery predict the Interest score is:

$\text{Interest}=\beta_0+\beta_1\cdot \text{Success}+\beta_2\cdot \text{Mastery}$
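
To make the weighted-sum idea concrete, here is a minimal sketch that builds noise-free synthetic data from known weights and recovers them by ordinary least squares with NumPy. All values, including `true_beta`, are made up for illustration and are unrelated to the survey scores.

```python
import numpy as np

# Synthetic predictor values (hypothetical, not the survey scores).
success = np.array([9.0, 11.0, 12.0, 14.0, 10.0])
mastery = np.array([10.0, 13.0, 12.0, 14.0, 11.0])

# Assumed weights: intercept, Success weight, Mastery weight.
true_beta = np.array([2.0, 0.5, 1.5])

# Design matrix with a leading column of ones for the intercept.
design = np.column_stack([np.ones_like(success), success, mastery])
interest = design @ true_beta  # response as a weighted sum of the predictors

# Least-squares estimate of the weights from the data alone.
beta_hat, *_ = np.linalg.lstsq(design, interest, rcond=None)
print(beta_hat)
```

Because the synthetic response contains no noise, the estimated weights coincide with the assumed ones up to floating-point error; with real survey data the estimates carry uncertainty, which is what the standard errors in a regression summary quantify.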

In [77]:
X = groups[['Success', 'Mastery']]
Y = groups[['Interest']]
# Fit an OLS model with an intercept on Success and Mastery
X = sm.add_constant(X)
est = sm.OLS(Y, X).fit()

est.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | Interest | R-squared: | 0.691 |
| Model: | OLS | Adj. R-squared: | 0.639 |
| Method: | Least Squares | F-statistic: | 13.41 |
| Date: | Wed, 21 Dec 2016 | Prob (F-statistic): | 0.000872 |
| Time: | 10:36:57 | Log-Likelihood: | -19.785 |
| No. Observations: | 15 | AIC: | 45.57 |
| Df Residuals: | 12 | BIC: | 47.69 |
| Df Model: | 2 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [95.0% Conf. Int.] |
|---|---|---|---|---|---|
| const | 6.2352 | 2.957 | 2.109 | 0.057 | -0.206 12.677 |
| Success | 0.5622 | 0.217 | 2.594 | 0.023 | 0.090 1.034 |
| Mastery | 0.6600 | 0.228 | 2.899 | 0.013 | 0.164 1.156 |

| | | | |
|---|---|---|---|
| Omnibus: | 0.016 | Durbin-Watson: | 1.51 |
| Prob(Omnibus): | 0.992 | Jarque-Bera (JB): | 0.233 |
| Skew: | 0.008 | Prob(JB): | 0.89 |
| Kurtosis: | 2.39 | Cond. No. | 201.0 |
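
The fitted coefficients can be read as a prediction rule. The sketch below plugs the reported coefficients into the model for a student with Success = 12 and Mastery = 13 (the values in the row with index 3 of the scores table, whose observed Interest is 23), and recomputes the adjusted R-squared from the reported R-squared. The helper name `predict_interest` is an illustrative assumption.

```python
# Coefficients as reported in the regression summary.
coef = {'const': 6.2352, 'Success': 0.5622, 'Mastery': 0.6600}


def predict_interest(success, mastery):
    """Predicted Interest score from the fitted linear model."""
    return coef['const'] + coef['Success'] * success + coef['Mastery'] * mastery


predicted = predict_interest(12, 13)
residual = 23 - predicted  # observed Interest minus predicted Interest

# Adjusted R-squared from the reported R-squared, n observations, p predictors.
r2, n, p = 0.691, 15, 2
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(round(predicted, 2), round(residual, 2), round(adj_r2, 2))
```

The prediction is about 21.56, roughly 1.44 points below the observed score, and the recomputed adjusted R-squared of about 0.64 is in line with the reported 0.639, given that the R-squared used here is itself rounded.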
