# CHAPTER 16. Metric-Predicted Variable on One or Two Groups

# Contents

* 16.1. Estimating the Mean and Standard Deviation of a Normal Distribution
* 16.2. Outliers and Robust Estimation: The t Distribution
* 16.3. Two Groups
* 16.4. Other Noise Distributions and Transforming Data
* 16.5. EXERCISES

In this chapter, we consider a situation in which we have a metric-predicted variable that is observed for items from one or two groups.

For example, we could measure the blood pressure (i.e., a metric variable) for people randomly sampled from first-year university students (i.e., a single group).

In this case, we might be interested in how much the group’s typical blood pressure differs from the recommended value for people of that age as published by a federal agency.

As another example, we could measure the IQ (i.e., a metric variable) of people randomly sampled from everyone self-described as vegetarian (i.e., a single group). In this case, we could be interested in how much this group’s IQ differs from the general population’s average IQ of 100.

In the context of the generalized linear model (GLM) introduced in the previous chapter, this chapter’s situation involves the most trivial cases of the linear core of the GLM, as indicated in the left cells of Table 15.1 (p. 434), with a link function that is the identity along with a normal distribution for describing noise in the data, as indicated in the first row of Table 15.2 (p. 443). We will explore options for the prior distribution on parameters of the normal distribution, and methods for Bayesian estimation of the parameters. We will also consider alternative noise distributions for describing data that have outliers.

<img src="figures/tbl15.1.png" width=600 />

<img src="figures/tbl15.2.png" width=600 />

# 16.1. Estimating the Mean and Standard Deviation of a Normal Distribution

* 16.1.1 Solution by mathematical analysis
* 16.1.2 Approximation by MCMC in JAGS

<img src="figures/eq16.1.png" width=600 />

To get an intuition for the normal distribution as a likelihood function, consider three data values y1 = 85, y2 = 100, and y3 = 115, which are plotted as large dots in Figure 16.1.

Figure 16.1 shows p(D|μ, σ) for different values of μ and σ. As you can see, there are values of μ and σ that make the data most probable, but other nearby values also accommodate the data reasonably well.
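The following is a minimal sketch of how p(D|μ, σ) can be computed for the three data values, assuming the data are independent draws from a normal distribution. The function names `normal_pdf` and `likelihood`, and the candidate (μ, σ) pairs, are illustrative choices and are not taken from Figure 16.1.

```python
import numpy as np

def normal_pdf(y, mu, sigma):
    """Normal probability density evaluated at y."""
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def likelihood(data, mu, sigma):
    """p(D | mu, sigma): product of the densities of independent data values."""
    y = np.asarray(data, dtype=float)
    return float(np.prod(normal_pdf(y, mu, sigma)))

data = [85, 100, 115]

# Hypothetical candidate parameter values, chosen only for illustration.
candidates = [(100.0, 12.25), (100.0, 8.0), (100.0, 20.0), (90.0, 12.25)]
for mu, sigma in candidates:
    print(f"mu={mu:6.1f}  sigma={sigma:5.2f}  p(D|mu,sigma)={likelihood(data, mu, sigma):.3e}")
```

Among these candidates, the likelihood is largest near μ = 100 and σ ≈ 12.25 (the sample mean and the maximum-likelihood standard deviation), while the neighbouring values yield somewhat smaller but still non-negligible likelihoods.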

<img src="figures/fig16.1.png" width=600 />

The question is, given the data, how should we allocate credibility to combinations of μ and σ?

<img src="figures/eq16.2.png" width=600 />

The prior, p(μ, σ), specifies the credibility of each combination of μ, σ values in the two-dimensional joint parameter space, before the data are taken into account.

Bayes’ rule says that the posterior credibility of each combination of μ, σ values is the prior credibility times the likelihood, normalized by the marginal likelihood.

Our goal now is to evaluate Equation 16.2 for reasonable choices of the prior distribution, p(μ, σ).
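As a rough illustration of Bayes' rule on the joint (μ, σ) space, the sketch below approximates the posterior on a grid. It assumes a flat prior over the grid, which is chosen here only for simplicity and is not a prior recommended in the text. The grid ranges and the variable names are likewise illustrative.

```python
import numpy as np

data = np.array([85.0, 100.0, 115.0])

# Grid over the joint parameter space (ranges are arbitrary illustrative choices).
mu_grid = np.linspace(60.0, 140.0, 161)    # step 0.5
sigma_grid = np.linspace(1.0, 50.0, 197)   # step 0.25
MU, SIGMA = np.meshgrid(mu_grid, sigma_grid, indexing="ij")

# Log-likelihood of the data at every grid point (independent normal data).
log_lik = np.zeros_like(MU)
for y in data:
    log_lik += -np.log(SIGMA) - 0.5 * np.log(2 * np.pi) - 0.5 * ((y - MU) / SIGMA) ** 2

# Assumption: flat prior over the grid, normalized to sum to 1.
prior = np.ones_like(MU)
prior /= prior.sum()

# Bayes' rule: posterior = prior * likelihood / marginal likelihood.
unnormalized = prior * np.exp(log_lik)
evidence = unnormalized.sum()   # discrete (grid) version of the marginal likelihood
posterior = unnormalized / evidence

# Most credible grid point under this prior.
i, j = np.unravel_index(np.argmax(posterior), posterior.shape)
print("posterior mode: mu =", mu_grid[i], ", sigma =", sigma_grid[j])
```

With a flat prior the posterior mode coincides with the maximum-likelihood values on the grid. A different prior would reweight the grid before normalization.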

## 16.1.1 Solution by mathematical analysis

We take a short algebraic tour before moving on to MCMC implementations.

It is convenient first to consider the case in which σ, the standard deviation of the likelihood function, is fixed at a specific value. In other words, the prior distribution on σ is a spike over that specific value. We’ll denote that fixed value as σ = Sy.
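With σ fixed at Sy, and assuming a normal prior on μ (the standard conjugate setup), the posterior on μ is also normal: its precision is the sum of the prior precision and the data precision, and its mean is the precision-weighted average of the prior mean and the sample mean. The sketch below illustrates that standard result. The function name `posterior_mu`, its parameter names, and the numerical values are hypothetical choices made for illustration.

```python
import numpy as np

def posterior_mu(data, sigma_y, prior_mean, prior_sd):
    """Posterior mean and SD of mu for normal data with known SD sigma_y
    and a normal prior on mu (conjugate normal-normal update)."""
    y = np.asarray(data, dtype=float)
    n = y.size
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / sigma_y ** 2
    post_precision = prior_precision + data_precision
    # Precision-weighted average of the prior mean and the sample mean.
    post_mean = (prior_precision * prior_mean + data_precision * y.mean()) / post_precision
    post_sd = post_precision ** -0.5
    return post_mean, post_sd

# Hypothetical values: Sy = 15 and a normal prior on mu with mean 120 and SD 15.
post_mean, post_sd = posterior_mu([85, 100, 115], sigma_y=15.0, prior_mean=120.0, prior_sd=15.0)
print(post_mean, post_sd)
```

In this example the prior carries the weight of one observation (its SD equals Sy), so the posterior mean lies three quarters of the way from the prior mean of 120 toward the sample mean of 100, and the posterior SD is narrower than the prior SD.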

<img src="figures/eq16.3.png" width=600 />

<img src="figures/eq16.4.png" width=600 />

<img src="figures/eq16.5.png" width=600 />

<img src="figures/eq16.6.png" width=600 />

<img src="figures/cap16.1.png" width=600 />

<img src="figures/cap16.2.png" />

## 16.1.2 Approximation by MCMC in JAGS

<img src="figures/fig16.2.png" width=600 />

<img src="figures/fig16.3.png" width=600 />

# 16.2. Outliers and Robust Estimation: The t Distribution

* 16.2.1 Using the t distribution in JAGS
* 16.2.2 Using the t distribution in Stan

<img src="figures/fig16.4.png" width=600 />

<img src="figures/fig16.5.png" width=600 />

<img src="figures/fig16.6.png" width=600 />

## 16.2.1 Using the t distribution in JAGS

<img src="figures/fig16.7.png" width=600 />

<img src="figures/fig16.8.png" width=600 />

## 16.2.2 Using the t distribution in Stan

<img src="figures/fig16.9.png" width=600 />

<img src="figures/fig16.10-1.png" width=600 />

<img src="figures/fig16.10-2.png" width=600 />

# 16.3. Two Groups

* 16.3.1 Analysis by NHST

<img src="figures/fig16.11.png" width=600 />

## 16.3.1 Analysis by NHST

<img src="figures/fig16.12.png" width=600 />

# 16.4. Other Noise Distributions and Transforming Data

# 16.5. EXERCISES

### Exercise 16.1.

#### [Purpose: Practice using different data files in the high-level script, with an interesting real example about alcohol preference of sexually frustrated males.]

<img src="figures/fig16.13.png" width=600 />

# References