# Comparing groups

In [1]:
import numpy as np
import pandas as pd

from scipy import stats

Health professionals warn that transmission of infectious diseases may occur during the traditional handshake greeting.
Two alternative methods of greeting (popularized in sports) are the high five and the fist bump.
Researchers compared the hygiene of these alternative greetings in a designed study and reported the results in the American Journal of Infection Control (Aug. 2014).
A sterile-gloved hand was dipped into a culture of bacteria, then made contact for three seconds with another sterile-gloved hand via either a handshake, high five, or fist bump.
The researchers then counted the number of bacteria present on the second, recipient, gloved hand.
This experiment was replicated five times for each contact method.
Simulated data (recorded as a percentage relative to the mean of the handshake), based on information provided by the journal article, are provided in the table.

In [2]:
X = pd.DataFrame({
    'Handshake': [131, 74, 129, 96, 92],
    'High five': [44, 70, 69, 43, 53],
    'Fist bump': [15, 14, 21, 29, 21]
})

X

   Handshake  High five  Fist bump
0        131         44         15
1         74         70         14
2        129         69         21
3         96         43         29
4         92         53         21


In [3]:
n, _ = X.shape
n

5

(a)
The researchers reported that more bacteria were transferred during a handshake compared with a high five.
Use a $95 \%$ confidence interval to support this statement statistically.

In [4]:
Y = X['Handshake'] - X['High five']
Y

0    87
1     4
2    60
3    53
4    39
dtype: int64

In [5]:
Y.mean(), Y.std()

(48.6, 30.435177016077958)

Let $\alpha = 0.05$, so that $95 \% = 100 (1 - \alpha) \%$.

In [6]:
alpha = 0.05

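Our $100 (1 - \alpha) \%$ CI for the mean difference, $\bar{Y} \pm z_{1 - \alpha / 2} \, S / \sqrt{n}$ with $z_{1 - \alpha / 2}$ the standard normal quantile and $S$ the sample standard deviation, reads ...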

In [7]:
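T = Y.mean()              # sample mean of the differences
V = Y.std() / np.sqrt(n)  # standard error (pandas' std uses ddof=1)

# lower and upper bound from the standard normal quantile z_{1 - alpha/2}
bounds = [T + sign * stats.norm.ppf(1 - alpha / 2) * V for sign in [-1, 1]]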
bounds

[21.922881318969367, 75.27711868103063]

We could also write the bounds as

In [8]:
print(f'{T} +/- {stats.norm.ppf(1 - alpha / 2) * V}.')

48.6 +/- 26.677118681030635.
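
With only $n = 5$ replicates, the standard normal quantile tends to give an interval that is too narrow; the usual small-sample alternative is the Student $t$ quantile with $n - 1$ degrees of freedom. The following is a minimal sketch under the assumption that the differences are approximately normally distributed. The helper name `t_interval` is introduced here for illustration and is not part of the original analysis.

```python
import numpy as np
from scipy import stats

# Data from the table above
handshake = np.array([131, 74, 129, 96, 92], dtype=float)
high_five = np.array([44, 70, 69, 43, 53], dtype=float)


def t_interval(diff, alpha=0.05):
    """Two-sided 100(1 - alpha)% t interval for the mean of `diff` (hypothetical helper)."""
    diff = np.asarray(diff, dtype=float)
    n = diff.size
    center = diff.mean()
    se = diff.std(ddof=1) / np.sqrt(n)  # standard error of the mean
    half_width = stats.t.ppf(1 - alpha / 2, df=n - 1) * se
    return center - half_width, center + half_width


lower_a, upper_a = t_interval(handshake - high_five)
print(f'[{lower_a:.2f}, {upper_a:.2f}]')
```

The half-width grows from about 26.7 to about 37.8, yet the interval still lies entirely above zero, so the conclusion of (a) is unchanged.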


(b)
The researchers also reported that the fist bump gave a lower transmission of bacteria than the high five.
Use a $95 \%$ confidence interval to support this statement statistically.

In [9]:
Y = X['Fist bump'] - X['High five']
Y

0   -29
1   -56
2   -48
3   -14
4   -32
dtype: int64

In [10]:
Y.mean(), Y.std()

(-35.8, 16.52876280911551)

Our $100 (1 - \alpha) \%$ CI for the mean difference, again based on the standard normal quantile, reads ...

In [11]:
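T = Y.mean()              # sample mean of the differences
V = Y.std() / np.sqrt(n)  # standard error (pandas' std uses ddof=1)

# lower and upper bound from the standard normal quantile z_{1 - alpha/2}
bounds = [T + sign * stats.norm.ppf(1 - alpha / 2) * V for sign in [-1, 1]]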
bounds

[-50.287833170033636, -21.31216682996636]

We could also write the bounds as

In [12]:
print(f'{T} +/- {stats.norm.ppf(1 - alpha / 2) * V}.')

-35.8 +/- 14.487833170033639.
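
The calculation above pairs replicate $i$ of one method with replicate $i$ of the other. The study description does not state such a pairing, so, as an alternative under the assumption that the two samples are independent, one could use Welch's two-sample interval instead. A sketch, with the hypothetical helper `welch_interval` (not part of the original analysis):

```python
import numpy as np
from scipy import stats

# Data from the table above
fist_bump = np.array([15, 14, 21, 29, 21], dtype=float)
high_five = np.array([44, 70, 69, 43, 53], dtype=float)


def welch_interval(x, y, alpha=0.05):
    """Two-sided 100(1 - alpha)% Welch interval for mean(x) - mean(y)."""
    vx = x.var(ddof=1) / x.size  # squared standard error of mean(x)
    vy = y.var(ddof=1) / y.size  # squared standard error of mean(y)
    se = np.sqrt(vx + vy)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (x.size - 1) + vy ** 2 / (y.size - 1))
    half_width = stats.t.ppf(1 - alpha / 2, df) * se
    center = x.mean() - y.mean()
    return center - half_width, center + half_width


lower_b, upper_b = welch_interval(fist_bump, high_five)
print(f'[{lower_b:.2f}, {upper_b:.2f}]')
```

Under this assumption the interval is still entirely below zero, which supports the same conclusion as the paired calculation.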


(c)
Based on the results of parts (a) and (b), which greeting method would you recommend as being the most hygienic?

One greeting is more hygienic than another if and only if it transfers fewer bacteria.
The interval in (b) lies entirely below zero, so the fist bump is more hygienic than the high five; the interval in (a) lies entirely above zero, so the high five is in turn more hygienic than the handshake.
Surprise, surprise:
the fist bump is indeed the most hygienic greeting.
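
The recommendation rests on chaining two comparisons. As a direct check, one can also compare handshake and fist bump with a single interval. The sketch below follows the replicate-wise differencing used in (a) and (b), but swaps in the Student $t$ quantile via `scipy.stats.t.interval`; this assumes approximately normal differences, which the exercise does not state.

```python
import numpy as np
from scipy import stats

# Data from the table above
handshake = np.array([131, 74, 129, 96, 92], dtype=float)
fist_bump = np.array([15, 14, 21, 29, 21], dtype=float)

diff = handshake - fist_bump
n = diff.size

# 95% t interval for the mean difference; the confidence level and the
# degrees of freedom are passed positionally, stats.sem uses ddof=1
lower_c, upper_c = stats.t.interval(0.95, n - 1, loc=diff.mean(), scale=stats.sem(diff))
print(f'[{lower_c:.1f}, {upper_c:.1f}]')
```

A lower bound above zero means the handshake transfers more bacteria than the fist bump, in line with the recommendation.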