# Case Study 7. Crest Toothpaste

Shuchman and Riesz conducted a marketing study aimed at characterising
the purchasers and non-purchasers of Crest toothpaste. Purchasers were
defined as those households that converted to Crest following its
endorsement by the American Dental Association in August 1960, and
remained "loyal" to Crest until at least April 1963. Non-purchasers
were defined as households that did not convert during the same period.
Using demographic data from 499 purchasers and 499 non-purchasers, they
demonstrated that household size (number of persons) and mean household
income were significantly larger for purchasers than for non-purchasers.
A similar study recorded the age of the householder primarily
responsible for toothpaste purchases, using a random sample of size 20
from each group. The variables measured were:

\begin{align*}
\textbf{purchasers} &\quad\quad \textrm{the age of the person in the household responsible for purchases of Crest} \\
\textbf{nonpurchasers} &\quad\quad \textrm{the age of the person in the household responsible for purchases of other brands of toothpaste}
\end{align*}

In [None]:
# Install s20x only if it is not already available
if (!requireNamespace("s20x", quietly = TRUE)) install.packages("s20x")
library(s20x)
library(repr)
options(repr.plot.width=8, repr.plot.height=6)

In [None]:
data(toothpaste.df)
head(toothpaste.df)

In [None]:
# Combine the two columns into one vector of ages
ages = with(toothpaste.df, c(purchasers, nonpurchasers))
# Generate a vector of whether a person purchased Crest
buy = rep(c("Yes", "No"), c(20, 20))
# Rewrite, treating buy as a factor.
buy = factor(buy)

In [None]:
onewayPlot(ages ~ buy)

In [None]:
summaryStats(ages ~ buy)

In [None]:
# ages and buy are free-standing vectors rather than columns of a data frame,
# so we can refer to them directly without a data argument
normcheck(lm(ages ~ buy))

In [None]:
eovcheck(ages ~ buy)

In [None]:
t.test(ages ~ buy, var.equal = TRUE)

## Methods and Assumption Checks

We have a numerical measurement made on two distinct groups, so we
should do a two-sample $t$-test.

We assume the customers are independent of one another. The equality of
variance and Normality assumptions look to be satisfied (although there
is slight evidence of left skewness), so we can use the standard
two-sample $t$-test.
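
For reference, the standard two-sample $t$-test pools the two sample variances. The notation below is conventional and is not taken from the original analysis: $\bar{y}_1, \bar{y}_2$ are the group sample means, $s_1^2, s_2^2$ the group sample variances, and $n_1 = n_2 = 20$ the group sizes.

\begin{align*}
s_p^2 &= \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \\
t &= \frac{\bar{y}_1 - \bar{y}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
\end{align*}

Under the assumptions above, $t$ is compared with a $t$ distribution on $n_1 + n_2 - 2 = 38$ degrees of freedom.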

The model fitted is
${\tt ages}_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where
$\alpha_i$ is the effect of whether the person buys Crest toothpaste
or not, either Yes or No, and
$\varepsilon_{ij} {\overset{\text{iid}}{\sim}} N(0, \sigma^2)$.
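
The link between this model and the two-sample $t$-test can be illustrated with a minimal sketch. It uses simulated ages, not the `toothpaste.df` values, and the names `ages_sim` and `buy_sim` are hypothetical. Under those assumptions, `lm()` and `t.test(..., var.equal = TRUE)` should give the same $t$-statistic up to sign, and the same P-value.

```r
# Illustrative data only: simulated ages, NOT the toothpaste.df values.
set.seed(1)
ages_sim <- c(rnorm(20, mean = 40, sd = 10), rnorm(20, mean = 48, sd = 10))
buy_sim  <- factor(rep(c("Yes", "No"), c(20, 20)))

# Fit the one-factor linear model; "No" is the baseline level (alphabetical order).
fit    <- lm(ages_sim ~ buy_sim)
coefs  <- summary(fit)$coefficients
lm_t   <- coefs["buy_simYes", "t value"]
lm_p   <- coefs["buy_simYes", "Pr(>|t|)"]

# Standard (pooled-variance) two-sample t-test on the same data.
tt <- t.test(ages_sim ~ buy_sim, var.equal = TRUE)

# t.test reports mean(No) - mean(Yes), while lm reports mean(Yes) - mean(No),
# so the two t-statistics agree up to sign and the P-values agree exactly.
same_t <- isTRUE(all.equal(unname(tt$statistic), -lm_t))
same_p <- isTRUE(all.equal(tt$p.value, lm_p))
print(c(same_t = same_t, same_p = same_p))
```

The sign flip arises only from the order of the factor levels, so it does not affect the conclusion.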

## Executive Summary

These data were collected to assess whether purchasers of Crest
toothpaste differ from purchasers of other brands in the age of the
person responsible for household toothpaste purchases.

We have observed that the age of the person in the household primarily
responsible for toothpaste purchases is, on average, younger for Crest
purchasers than for non-purchasers.

We estimate that the mean age for purchasers of Crest toothpaste is up
to 15.1 years younger than the mean age for non-purchasers of Crest
toothpaste.
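
An estimate of this kind is read from the confidence interval reported by `t.test`. The sketch below shows how the interval might be extracted and which direction the difference runs. It uses simulated ages rather than the study data, and `ages_sim` and `buy_sim` are hypothetical names.

```r
# Illustrative data only: simulated ages, NOT the toothpaste.df values.
set.seed(2)
ages_sim <- c(rnorm(20, mean = 40, sd = 10), rnorm(20, mean = 48, sd = 10))
buy_sim  <- factor(rep(c("Yes", "No"), c(20, 20)))

tt <- t.test(ages_sim ~ buy_sim, var.equal = TRUE)

# 95% confidence interval for mean(No) - mean(Yes):
# the first factor level ("No") minus the second ("Yes").
ci <- tt$conf.int

# The pooled interval is centred on the observed difference in group means.
group_means       <- tapply(ages_sim, buy_sim, mean)
diff_no_minus_yes <- unname(group_means["No"] - group_means["Yes"])
centred           <- isTRUE(all.equal(mean(ci), diff_no_minus_yes))

print(ci)
print(diff_no_minus_yes)
```

A positive interval here would mean that non-purchasers are older on average, which is the same as saying that purchasers are younger.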