<h3> Course evaluations analysis </h3>

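This notebook takes summary statistics (mean, standard deviation, and number of responses) for each course evaluation question from two semesters and checks whether the change between semesters is statistically significant. It uses [Welch's t-test](https://en.wikipedia.org/wiki/Welch%27s_t-test), which does not assume that the two semesters have equal variances, and reports a one-tailed p-value in the direction of the observed change.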
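
For reference, the two quantities computed by `tStatistic` can be written as follows. The symbols are chosen here for illustration: $\bar{x}_i$, $s_i$, and $n_i$ stand for the mean, standard deviation, and number of responses for semester $i$, corresponding to `mean1`, `sd1`, `n1` and `mean2`, `sd2`, `n2` in the code.

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}
$$

The second expression is the Welch-Satterthwaite approximation for the degrees of freedom; the code truncates $\nu$ to a whole number before looking up the p-value.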

In [64]:
from scipy.stats import t

def tStatistic(mean1, sd1, n1, mean2, sd2, n2):
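    # Take the mean, standard deviation, and number of
    # samples from two data sets and return the Welch
    # t-score and the degrees of freedom given by the
    # Welch-Satterthwaite equation.

    tScore = (mean1 - mean2) / (sd1**2/n1 + sd2**2/n2)**.5

    # Standard error of each sample mean
    s1bar = sd1/n1**.5
    s2bar = sd2/n2**.5

    df = (s1bar**2 + s2bar**2)**2 / (s1bar**4/(n1-1) + s2bar**4/(n2-1))
    # Truncate to a whole number; rounding down is the
    # conservative choice, though t.cdf also accepts
    # fractional degrees of freedom.
    df = int(df)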

    return [tScore, df]

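def pValue(mean1, sd1, n1, mean2, sd2, n2):
    # One-tailed p-value: the tail area beyond the observed
    # t-score, on the side of the observed difference.
    # Doubling it gives the two-sided p-value.
    [tScore, df] = tStatistic(mean1, sd1, n1, mean2, sd2, n2)
    p = t.cdf(tScore, df)
    return min(p, 1 - p)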
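
As a cross-check on the hand-written formulas, SciPy can run Welch's test directly from summary statistics with `scipy.stats.ttest_ind_from_stats` and `equal_var=False`. The sketch below uses the Question 1 sample values from this notebook; the variable names are illustrative only. SciPy reports a two-sided p-value by default and keeps the fractional degrees of freedom, so halving its p-value should land close to, but not necessarily exactly on, the one-tailed value from `pValue`.

```python
from scipy.stats import ttest_ind_from_stats

# Question 1 sample data: spring (mean 4, SD 1.2, n 35)
# against fall (mean 5, SD 1.6, n 73).
result = ttest_ind_from_stats(mean1=4, std1=1.2, nobs1=35,
                              mean2=5, std2=1.6, nobs2=73,
                              equal_var=False)  # False selects Welch's test

t_score = result.statistic
p_two_sided = result.pvalue

# The t distribution is symmetric, so the one-tailed p-value in the
# direction of the observed difference is half the two-sided value.
p_one_tailed = p_two_sided / 2

print(f't = {t_score:.3f}')
print(f'two-sided p = {p_two_sided:.4f}')
print(f'one-tailed p = {p_one_tailed:.4f}')
```

If the two approaches disagree by more than a small rounding difference, the input order or the standard deviations are the first things to check.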
    

In [68]:
def test(level, springMeans, springSDs, springNs,
        fallMeans, fallSDs, fallNs):
    
    print(f'Testing for significance at level {level}\n')

    # Loop over the questions.
    for q in range(len(springMeans)):

        # Load the data
        mean1 = springMeans[q]
        mean2 = fallMeans[q]
        sd1 = springSDs[q]
        sd2 = fallSDs[q]
        n1 = springNs[q]
        n2 = fallNs[q]

        # No change: continue to the next iteration
        if mean1 == mean2:
            print(f'Question {q+1}: no change')
            continue

        # Otherwise, compute the p value
        p = pValue(mean1, sd1, n1, mean2, sd2, n2)

        # Decide if the score went up or down from fall to spring
        if mean1 > mean2: 
            verb = 'increase'
        else:
            verb = 'decrease'

        # Decide if significant or not
        sig = ''
        if p < level: 
            sig = ' --  significant'
    
        # Print the output with the p-value rounded to
        # two decimal places. Report p-values of 0.005
        # or less as 'p < 0.01' instead of a rounded value
        if p > 0.005:
            print(f'Question {q+1}: {verb}, p = {p:.2f} {sig}')
        else:
            print(f'Question {q+1}: {verb}, p < 0.01 {sig}')

In [69]:
# Data input goes here. The full report has 14
# questions, so each list should have 14 entries,
# but a report is generated for any number of
# questions as long as all six lists match in length.

springMeans = [4, 5, 6]
springSDs = [1.2, 1.3, 1.5]
springNs = [35, 45, 20]

fallMeans = [5, 4.8, 5.5]
fallSDs = [1.6, 1.2, 1.0]
fallNs = [73, 100, 40]

test(0.10, springMeans, springSDs, springNs,
             fallMeans, fallSDs, fallNs)

Testing for significance at level 0.1

Question 1: decrease, p < 0.01  --  significant
Question 2: increase, p = 0.19 
Question 3: increase, p = 0.09  --  significant
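
Because `test` loops over `range(len(springMeans))`, a fall list that is too short raises an `IndexError` partway through the report, and extra entries in any other list are silently ignored. A small guard run before `test` can catch this earlier. The sketch below is one possible approach; `check_lengths` is a hypothetical helper, not part of the original notebook.

```python
def check_lengths(**columns):
    """Raise ValueError unless every named list has the same length."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f'Input lists differ in length: {lengths}')
    # All lists agree, so return the shared number of questions
    return next(iter(lengths.values()), 0)

# Sample data copied from the input cell above
num_questions = check_lengths(
    springMeans=[4, 5, 6], springSDs=[1.2, 1.3, 1.5], springNs=[35, 45, 20],
    fallMeans=[5, 4.8, 5.5], fallSDs=[1.6, 1.2, 1.0], fallNs=[73, 100, 40])

print(f'{num_questions} questions ready to test')
```

Passing the lists as keyword arguments means the error message names the list that is out of step, which is helpful when entering all 14 questions by hand.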
