In [38]:
import pandas as pd
from scipy.stats import shapiro, mannwhitneyu

# Load the Excel file
file_path = './Programming Skills and LLM Impact (Responses).xlsx'
xls = pd.ExcelFile(file_path)

# Load the pretest and posttest data
pretest_data = pd.read_excel(xls, 'Pretest')
posttest_data = pd.read_excel(xls, 'Posttest')

In [39]:
# Remove the timestamp column
pretest_data.drop(columns=['Timestamp'], inplace=True)
posttest_data.drop(columns=['Timestamp'], inplace=True)
columns = pretest_data.columns.to_list()

# Ensure the number of rows match
assert len(pretest_data) == len(posttest_data), "Mismatch in number of subjects between pretest and posttest data"


## Display Descriptive Statistics
Only run the following two cells if you are interested in the detailed statistics.

In [53]:
# Extract descriptive statistics from Pretest and Posttest
pretest_means = [round(x,2) for x in pretest_data.mean().to_list()]
pretest_stds = [round(x,2) for x in pretest_data.std().to_list()]

posttest_means = [round(x,2) for x in posttest_data.mean().to_list()]
posttest_stds = [round(x,2) for x in posttest_data.std().to_list()]

# Build Pandas containing the descriptive statistics
desc_df = pd.DataFrame()
desc_df['Question'] = columns
desc_df['Pretest Mean'] =  pretest_means
desc_df['Pretest Std'] =  pretest_stds
desc_df['Posttest Mean'] =  posttest_means
desc_df['Posttest Std'] =  posttest_stds

# Display the descriptive statistics
desc_df.head(20)

Unnamed: 0,Question,Pretest Mean,Pretest Std,Posttest Mean,Posttest Std
0,Rate your current comfort level with Python pr...,2.61,0.85,3.17,1.2
1,I understand the basic syntax and constructs o...,3.28,1.32,3.72,0.83
2,I am comfortable with object-oriented programm...,2.72,1.18,3.33,1.03
3,I have a solid understanding of error handling...,2.39,1.14,3.06,0.94
4,I am confident in my ability to manipulate and...,2.61,1.29,3.5,1.04
5,I can effectively visualize data using Python ...,2.78,1.4,3.44,1.15
6,I understand and can implement data structures...,2.17,1.25,2.83,1.42
7,I am skilled in writing efficient and optimize...,2.11,1.37,3.0,1.41
8,I feel confident in applying Python programmin...,2.67,1.24,3.44,1.04
9,I can effectively use Python for web scraping ...,1.67,0.91,3.72,1.07


## Test for Normality
While a pretest-posttest analysis does not necessarily assume normality, we need to determine whether to use a parametric or not parametric test.

In [43]:
# Function to perform Shapiro-Wilk test for normality on each column
def normality_test(dataframe):
    normality_results = {}
    for column in dataframe.columns[1:]:  # Skipping the timestamp column
        stat, p_value = shapiro(dataframe[column])
        normality_results[column] = (stat, p_value)
    return normality_results

def get_normality_test_results(df):
    tests = normality_test(df)
    for test in tests.keys():
      question = test[:40]
      p_value = round(pretest_normality[test][1],4)
      significance = 'Not Normally Distributed' if p_value < 0.05 else 'Normally Distributed'
      print(f'Q:{question} - p_value: {p_value} - {significance}')

In [44]:
get_normality_test_results(pretest_data)
print()
get_normality_test_results(posttest_data)


Q:I understand the basic syntax and constr - p_value: 0.0255 - Not Normally Distributed
Q:I am comfortable with object-oriented pr - p_value: 0.1533 - Normally Distributed
Q:I have a solid understanding of error ha - p_value: 0.0021 - Not Normally Distributed
Q:I am confident in my ability to manipula - p_value: 0.0688 - Normally Distributed
Q:I can effectively visualize data using P - p_value: 0.0345 - Not Normally Distributed
Q:I understand and can implement data stru - p_value: 0.0033 - Not Normally Distributed
Q:I am skilled in writing efficient and op - p_value: 0.0011 - Not Normally Distributed
Q:I feel confident in applying Python prog - p_value: 0.0292 - Not Normally Distributed
Q:I can effectively use Python for web scr - p_value: 0.0 - Not Normally Distributed

Q:I understand the basic syntax and constr - p_value: 0.0255 - Not Normally Distributed
Q:I am comfortable with object-oriented pr - p_value: 0.1533 - Normally Distributed
Q:I have a solid understanding of error ha - p

Most of the pretest data **does not follow a normal distribution**, which justifies the use of non-parametric tests like the Mann-Whitney U test for the pretest-posttest analysis.

## Executing Pretest-Posttest Analysis
Due to anonymization of the surveys we could not pair the responses received during the Pretest and Posttest.

Because **$n_{Pretest} = 20$** and **$n_{Posttest} = 18$** we removed, from the Pretest dataset, the subjects with the highest and minimum average score to balance the sample sizes. This method maintains the integrity of the data by focusing on the central tendency of the responses.

**NOTE:** This step was executed manually prior the analysis of the data. See the original dataset in which we highlighted the two records from the 'Pretest' worksheet selected for removal.

In [58]:
# Perform Mann-Whitney U test for each question
def mannwhitney_test(pretest_df, posttest_df):
    mannwhitney_results = {}
    for column in pretest_df.columns:
        if column in posttest_df.columns:
            stat, p_value = mannwhitneyu(pretest_df[column], posttest_df[column], alternative='two-sided')
            mannwhitney_results[column] = (stat, p_value)
    return mannwhitney_results

# Conduct Mann-Whitney U test for pretest and posttest data
mannwhitney_results = mannwhitney_test(pretest_data, posttest_data)

# Display Mann-Whitney U test results
print("\nMann-Whitney U Test Results:")
for test in mannwhitney_results.keys():
      question = test[:60]
      stat_value = round(mannwhitney_results[test][0],4)
      p_value = round(mannwhitney_results[test][1],4)
      significance = 'Posttest significantly different.' if p_value < 0.05 else 'No Difference'
      print(f'Q:{question} - Stat: {stat_value} - p_value: {p_value} - {significance}')



Mann-Whitney U Test Results:
Q:Rate your current comfort level with Python programming on a - Stat: 113.0 - p_value: 0.111 - No Difference
Q:I understand the basic syntax and constructs of Python, incl - Stat: 127.0 - p_value: 0.259 - No Difference
Q:I am comfortable with object-oriented programming concepts i - Stat: 111.5 - p_value: 0.1001 - No Difference
Q:I have a solid understanding of error handling and debugging - Stat: 98.0 - p_value: 0.0345 - Posttest significantly different.
Q:I am confident in my ability to manipulate and analyze data  - Stat: 96.0 - p_value: 0.0335 - Posttest significantly different.
Q:I can effectively visualize data using Python libraries, wit - Stat: 117.0 - p_value: 0.147 - No Difference
Q:I understand and can implement data structures like linked l - Stat: 116.5 - p_value: 0.1389 - No Difference
Q:I am skilled in writing efficient and optimized Python code, - Stat: 102.0 - p_value: 0.0521 - No Difference
Q:I feel confident in applying Python programmi