In [6]:
import pandas as pd
import scipy.stats as stats
import statsmodels.api as sm
from statsmodels.formula.api import ols

In [3]:
IPS_data = pd.read_excel("IPS_réponses.xlsx")
PSI_data = pd.read_excel("PSI_réponses.xlsx")
IPS_data.head(1)


Unnamed: 0,Horodateur,What year of study are you in currently ?,What is your section?,"Before this activity, how familiar are you with topics like electoral districts, redistricting or gerrymandering?",How confident are you in your understanding of electoral redistricting and gerrymandering before this activity?,Which of the following best describes gerrymandering?,"Suppose a country has two parties, A and B, and each gets 50% of the total votes. Which statement is most plausible?",Which of the following is not a strategy used in gerrymandering ?,"Imagine a region where Party X has 60% of the votes and Party Y has 40%. After drawing districts, Party Y wins 60% of the seats. Which explanation is most likely?",Which of the following descriptions best matches a map that you might suspect is gerrymandered?,What is the main goal of the partisan player in the game?,"In the simulation, how does the fair player's approach differ from the partisan player's approach?",A party’s opponents are spread across many districts so that they are always a minority and almost never win. Which strategy best describes this?,Which statement best describes gerrymandering?,What is a typical effect of gerrymandering on political representation?,What would be an example of a district map that could be considered gerrymandered?,Explain how you would create a district map that represents voters fairly. What strategies would you use? (In 1-2 sentences),"After completing this activity, how confident are you in your understanding of gerrymandering and redistricting ?",Colonne 19
0,2025-12-10 17:04:51.832,BA5,Computer Science,1,1,I have never heard this term / I don’t know.,I don't know,I don't know,I don’t know.,I don’t know.,To minimize the impact of party affiliation in...,The fair player tries to ensure their party wi...,Packing,Letting voters choose which district they belo...,It can lead to disproportionate representation...,A map where districts are compact and geograph...,Packing,1,


In [4]:
PSI_data.head(1)  

Unnamed: 0,Horodateur,What year of study are you in currently ?,What is your section?,"Before this activity, how familiar are you with topics like electoral districts, redistricting or gerrymandering?",How confident are you in your understanding of electoral redistricting and gerrymandering before this activity?,Which of the following best describes gerrymandering?,"Suppose a country has two parties, A and B, and each gets 50% of the total votes. Which statement is most plausible?",Which of the following is not a strategy used in gerrymandering ?,"Imagine a region where Party X has 60% of the votes and Party Y has 40%. After drawing districts, Party Y wins 60% of the seats. Which explanation is most likely?",Which of the following descriptions best matches a map that you might suspect is gerrymandered?,What is the main goal of the partisan player in the game?,"In the simulation, how does the fair player's approach differ from the partisan player's approach?",A party’s opponents are spread across many districts so that they are always a minority and almost never win. Which strategy best describes this?,Which statement best describes gerrymandering?,What is a typical effect of gerrymandering on political representation?,What would be an example of a district map that could be considered gerrymandered?,Explain how you would create a district map that represents voters fairly. What strategies would you use? (In 1-2 sentences),"After completing this activity, how confident are you in your understanding of gerrymandering and redistricting ?"
0,2025-12-12 10:27:55.156,MA3,Computer Science,1,1,I have never heard this term / I don’t know.,One party could still win more seats if the di...,I don't know,District boundaries were drawn in a way that f...,Districts are compact and each party’s seats a...,To maximize the number of districts won by the...,The fair player attempts to divide the map wit...,Cracking,Deliberately designing district boundaries to ...,It can lead to disproportionate representation...,A map with odd-shaped districts that maximize ...,Create the map without looking at voters distr...,3


In [11]:
PSI_data['Group'] = 'PSI'
IPS_data['Group'] = 'IPS'

combined_df = pd.concat([PSI_data, IPS_data], ignore_index=True)

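# Answer key, keyed by column position: columns 5 to 9 hold the pre-activity
# questions and columns 10 to 15 hold the post-activity questions.
correct_answers = {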
    # Pre-Activity
    combined_df.columns[5]: "Drawing district lines to favor a specific political party or group.",
    combined_df.columns[6]: "One party could still win more seats if the district boundaries are drawn in a certain way.",
    combined_df.columns[7]: "Random district creation",
    combined_df.columns[8]: "District boundaries were drawn in a way that favors Party Y.",
    combined_df.columns[9]: "Districts have very irregular shapes, and one party wins many more seats than its share of votes.",
    # Post-Activity
    combined_df.columns[10]: "To maximize the number of districts won by their party",
    combined_df.columns[11]: "The fair player attempts to divide the map without any bias, while the partisan player tries to skew it in their favor.",
    combined_df.columns[12]: "Cracking",
    combined_df.columns[13]: "Deliberately designing district boundaries to benefit a particular party or group.",
    combined_df.columns[14]: "It can lead to disproportionate representation for one party.",
    combined_df.columns[15]: "A map with odd-shaped districts that maximize the influence of one party."
}

def calculate_score(row):
    """Count exact matches with the answer key (pre- and post-activity questions pooled, max 11)."""
    score = 0
    for col, correct in correct_answers.items():
        if row[col] == correct:
            score += 1
    return score

combined_df['Total_Score'] = combined_df.apply(calculate_score, axis=1)


# One-way ANOVA comparing the two groups (PSI vs IPS)
group_psi_scores = combined_df[combined_df['Group'] == 'PSI']['Total_Score']
group_ips_scores = combined_df[combined_df['Group'] == 'IPS']['Total_Score']

f_stat, p_val = stats.f_oneway(group_psi_scores, group_ips_scores)

print("One-Way ANOVA (PSI vs IPS)")
print(f"F-statistic: {f_stat:.4f}")
print(f"P-value: {p_val:.4f}")
if p_val < 0.05:
    print("Result: Significant difference between PSI and IPS.")
else:
    print("Result: No significant difference found.")
print("Here p >= 0.05, so the difference between the groups is not statistically significant.\n")

# Two-way ANOVA (on familiarity and group)
col_familiarity = combined_df.columns[3] # Index 3 is Familiarity
combined_df = combined_df.rename(columns={col_familiarity: 'Familiarity'})

# Create the model: Score depends on Group + Familiarity (main effects only, no interaction term)
model = ols('Total_Score ~ C(Group) + C(Familiarity)', data=combined_df).fit()
anova_table = sm.stats.anova_lm(model, typ=2)

print("Two-Way ANOVA Table")
print(anova_table)
print("C(Group): no significant difference between the two groups once familiarity is accounted for.\n")
print("C(Familiarity): significant (p < 0.05), so self-reported prior familiarity is associated with the score, as expected.\n")
print("Group: F(1, 15) = 1.32, p = 0.27; Familiarity: F(3, 15) = 3.71, p = 0.035\n")

One-Way ANOVA (PSI vs IPS)
F-statistic: 1.6698
P-value: 0.2126
Result: No significant difference found.
Here p >= 0.05, so the difference between the groups is not statistically significant.

Two-Way ANOVA Table
                   sum_sq    df         F    PR(>F)
C(Group)         5.323867   1.0  1.322476  0.268153
C(Familiarity)  44.806695   3.0  3.710071  0.035382
Residual        60.385224  15.0       NaN       NaN
C(Group): no significant difference between the two groups once familiarity is accounted for.

C(Familiarity): significant (p < 0.05), so self-reported prior familiarity is associated with the score, as expected.

Group: F(1, 15) = 1.32, p = 0.27; Familiarity: F(3, 15) = 3.71, p = 0.035
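The p-values above say whether each factor is significant, but not how large its effect is. As a rough complement, the sketch below computes partial eta squared, SS_effect / (SS_effect + SS_residual), from the sums of squares printed in the two-way ANOVA table. The helper name `partial_eta_squared` is introduced here for illustration only and is not part of the original analysis.

```python
# Sums of squares copied from the two-way ANOVA table printed above.
ss_effects = {"C(Group)": 5.323867, "C(Familiarity)": 44.806695}
ss_residual = 60.385224


def partial_eta_squared(ss_effect, ss_error):
    """Variance attributed to an effect, relative to that effect plus error."""
    return ss_effect / (ss_effect + ss_error)


# Illustrative helper applied to each term of the model.
effect_sizes = {
    term: partial_eta_squared(ss, ss_residual) for term, ss in ss_effects.items()
}

for term, value in effect_sizes.items():
    print(f"{term}: partial eta^2 = {value:.3f}")
```

With these inputs the values come out to roughly 0.08 for the group and 0.43 for familiarity. With only 20 respondents, such estimates are imprecise and should be read as indicative at most.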


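With only two groups, the one-way ANOVA used above is equivalent to an independent-samples t-test with pooled variance: the F-statistic equals the squared t-statistic and the p-values coincide. The sketch below checks this on made-up scores. The lists `scores_a` and `scores_b` are hypothetical values, not the survey data.

```python
import scipy.stats as stats

# Hypothetical scores for illustration only (NOT the PSI/IPS survey data).
scores_a = [4, 6, 7, 5, 8, 6, 7, 9, 5, 6]
scores_b = [3, 5, 6, 4, 7, 5, 4, 6, 5, 5]

# One-way ANOVA on two groups.
f_stat, p_anova = stats.f_oneway(scores_a, scores_b)

# Student's t-test with pooled variance (equal_var=True).
t_stat, p_ttest = stats.ttest_ind(scores_a, scores_b, equal_var=True)

print(f"F = {f_stat:.4f}, t^2 = {t_stat ** 2:.4f}")
print(f"p (ANOVA) = {p_anova:.4f}, p (t-test) = {p_ttest:.4f}")
```

If the two groups may have unequal variances, passing `equal_var=False` to `stats.ttest_ind` gives Welch's t-test, which no longer matches the ANOVA result exactly.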