# Skeleton code for comparing group means

This notebook contains basic code for running statistical tests that compare the means of two groups (paired or independent) and of more than two groups.

The data relate to attitudes among members of the *contrade* (neighbourhoods) of Siena, Italy; see Section 4.2 in https://doi.org/10.31235/osf.io/hxyvc for more details.

In [None]:
# key modules
import pandas as pd
import numpy as np
# any one of the following gives access to the tests; this notebook uses the first (alias sp)
import scipy.stats as sp
# from scipy import stats
# from scipy.stats import ttest_ind

# load dataframe
df = pd.read_csv("https://raw.githubusercontent.com/adamrkenny/contrada-primer/refs/heads/main/data/attitudes/data-redacted.csv")
df.head()

## Comparing means of two groups (paired)

Let's look at the attitudes towards the in-group and the out-group among members of one *contrada* (called *Torre* or "Tower"). First, filter the relevant data, and take a look at the descriptive statistics.

In [None]:
# filter for one contrada
df_tor = df[df.contrada == "tor"]

# descriptive statistics
df_tor[["att_tor", "att_oca"]].describe()
# NB you can change att_oca to any other group

Each participant rated both the in-group and the out-group, so the two sets of scores come from the same people. Therefore we require a paired test (i.e. `ttest_rel` from `scipy`).

In [None]:
# compare ingroup (tor) to rival outgroup (oca)
t_test = sp.ttest_rel(df_tor["att_tor"], df_tor["att_oca"])  # optional: alternative="two-sided" (default), "less" or "greater"
t_test
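A paired test works on the within-participant differences: it is equivalent to a one-sample t-test of those differences against zero. The sketch below illustrates this with synthetic data; the variable names and simulated values are assumptions for illustration, not the contrada data.

```python
import numpy as np
import scipy.stats as sp

# synthetic ratings (NOT the contrada data): 30 participants, two ratings each
rng = np.random.default_rng(1)
ingroup = rng.normal(loc=6.0, scale=1.0, size=30)
outgroup = ingroup - rng.normal(loc=1.0, scale=1.0, size=30)

# paired t-test on the two sets of ratings
paired = sp.ttest_rel(ingroup, outgroup)

# one-sample t-test on the per-participant differences
one_sample = sp.ttest_1samp(ingroup - outgroup, popmean=0.0)

print(paired.statistic, one_sample.statistic)
print(paired.pvalue, one_sample.pvalue)
```

Both calls should print the same statistic and p-value, which is why the order of the rows matters in a paired test: each value must line up with the same participant in the other column.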

In [None]:
# look at the helpfile for more detail
help(sp.ttest_rel)
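One detail listed in the help file is `nan_policy`, which controls how missing values are handled; with the default (`"propagate"`) a single `NaN` makes the result `NaN`. A minimal sketch of one way to deal with this is to drop incomplete pairs explicitly before testing. The toy dataframe and its column names (`att_in`, `att_out`) are hypothetical, and whether the contrada data contain missing values is not assumed here.

```python
import numpy as np
import pandas as pd
import scipy.stats as sp

# hypothetical ratings for five participants; two have a missing response
toy = pd.DataFrame({
    "att_in": [6.0, 7.0, np.nan, 5.0, 6.0],
    "att_out": [3.0, 4.0, 2.0, np.nan, 5.0],
})

# with the default nan_policy="propagate", the missing values give a NaN result
with_nan = sp.ttest_rel(toy["att_in"], toy["att_out"])

# keep only participants who answered both items
complete = toy.dropna(subset=["att_in", "att_out"])
n_pairs = len(complete)

# paired t-test on the complete pairs only
paired = sp.ttest_rel(complete["att_in"], complete["att_out"])

print(with_nan.statistic)
print(n_pairs, paired.statistic, paired.pvalue)
```

Comparing the `count` row of `.describe()` across columns is a quick way to spot whether this step is needed.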

## Comparing means of two groups (independent)

Let's look at the attitudes towards the in-group among members of two *contrade* (*Torre* or "Tower" and *Civetta* or "Owl"). First, filter the relevant data, and take a look at the descriptive statistics.

In [None]:
# filter for the other contrada
df_civ = df[df.contrada == "civ"]

# descriptive statistics
df_civ[["att_civ"]].describe()
# NB compare with df_tor[["att_tor"]]

Each participant belongs to either *Torre* or *Civetta*, so the two samples contain different people. Therefore we require an independent-samples test (i.e. `ttest_ind` from `scipy`).

In [None]:
# compare two ingroups
t_test = sp.ttest_ind(df_tor["att_tor"], df_civ["att_civ"])
t_test
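By default `ttest_ind` assumes the two groups have equal variances. If that assumption looks doubtful (for example, the standard deviations in the descriptive statistics differ a lot), passing `equal_var=False` runs Welch's t-test instead. The sketch below uses synthetic data with unequal variances and group sizes; all names and values are assumptions for illustration only.

```python
import numpy as np
import scipy.stats as sp

# synthetic ratings (NOT the contrada data) with unequal spread and group size
rng = np.random.default_rng(2)
group_a = rng.normal(loc=6.0, scale=0.5, size=40)
group_b = rng.normal(loc=5.5, scale=2.0, size=15)

# Student's t-test (default, assumes equal variances)
student = sp.ttest_ind(group_a, group_b)

# Welch's t-test (does not assume equal variances)
welch = sp.ttest_ind(group_a, group_b, equal_var=False)

# Welch statistic by hand: mean difference over the unpooled standard error
se = np.sqrt(group_a.var(ddof=1) / len(group_a) + group_b.var(ddof=1) / len(group_b))
t_manual = (group_a.mean() - group_b.mean()) / se

print(student.statistic, student.pvalue)
print(welch.statistic, welch.pvalue, t_manual)
```

When the variances and sample sizes are similar, the two versions give nearly identical results, so the choice mainly matters for unbalanced groups.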

## Comparing means of multiple groups

Let's look at the attitudes towards the in-group among members of three *contrade* (the two above and *Leocorno* or "Unicorn"). First, filter the data for the third *contrada*, and take a look at the descriptive statistics for all three.

In [None]:
# filter for the third contrada
df_leo = df[df.contrada == "leo"]

# descriptive statistics
pd.concat(
    [df_tor[["att_tor"]].describe(), 
     df_civ[["att_civ"]].describe(), 
     df_leo[["att_leo"]].describe()
    ],
    axis=1
)

We have participants from three (independent) groups. Therefore we require a one-way ANOVA (i.e. `f_oneway` from `scipy`).

In [None]:
# compare three ingroups
anova = sp.f_oneway(df_tor["att_tor"], df_civ["att_civ"], df_leo["att_leo"])
anova
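To see what `f_oneway` computes, the F statistic can be rebuilt by hand as the ratio of between-group to within-group variance. The sketch below does this on synthetic data; the group means, sizes, and variable names are assumptions for illustration, not the contrada data.

```python
import numpy as np
import scipy.stats as sp

# synthetic ratings (NOT the contrada data) for three independent groups
rng = np.random.default_rng(3)
groups = [rng.normal(loc=m, scale=1.0, size=n)
          for m, n in [(6.0, 20), (6.5, 25), (5.5, 30)]]

grand_mean = np.concatenate(groups).mean()
k = len(groups)                          # number of groups
n_total = sum(len(g) for g in groups)    # total number of observations

# sums of squares between and within groups
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F is the ratio of the two mean squares
f_manual = (ss_between / (k - 1)) / (ss_within / (n_total - k))
p_manual = sp.f.sf(f_manual, k - 1, n_total - k)

anova = sp.f_oneway(*groups)
print(f_manual, anova.statistic)
print(p_manual, anova.pvalue)
```

A significant F only indicates that at least one group mean differs; it does not say which one, so pairwise follow-up comparisons are usually needed.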