# Bayesian Treatment and Response - Part II

In the first notebook, we introduced the basic ideas. Here we make the problem a little more complex by adding a dichotomous treatment. This introduces nuances that highlight some of the points of interest of a Bayesian approach. So, let's discuss the new model.

## Treatment and Control Groups

We now discuss a case in which there are a treatment group and a control group. More generally, we can think of this as an endogenous switching regression, in which agents select into different groups based in part on things we observe about them. Let treatment be a dichotomous variable $z \in \{0,1\}$: we observe $z=1$ if an underlying latent variable $z^*$ is greater than zero, and $z=0$ otherwise.

As a practical matter, we think of the selection-into-treatment equation as a Probit model, where:
$$
z^* = \eta W + u_c + e_z
$$
And 
$$ \begin{array}{ccc}
z = 1 & \textrm{if} & \eta W + u_c + e_z > 0 \\
z = 0 & \textrm{otherwise} &
\end{array}
$$
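
As a short worked step, suppose $e_z$ is standard normal and independent of $u_c$ (the usual probit normalization, taken here as an assumption). Writing $\Phi$ for the standard normal CDF, the selection rule then implies:
$$
\Pr(z=1 \mid W, u_c) = \Pr(e_z > -\eta W - u_c) = \Phi(\eta W + u_c)
$$
Integrating out the latent factor, $u_c + e_z$ is normal with mean zero and variance $2$, so:
$$
\Pr(z=1 \mid W) = \Phi\left(\frac{\eta W}{\sqrt{2}}\right)
$$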

Moreover, if $z=1$, the following equation explains the outcome $y$:
$$
y_1 = X\beta_1 + \pi_{1r}u_r + \pi_{1c}u_c + e_1
$$
and if $z=0$,
$$
y_0 = X\beta_0 + \pi_{0r}u_r + \pi_{0c}u_c + e_0
$$
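
To make the data-generating process concrete, here is a minimal simulation sketch in Python (standard library only). It is an illustration under stated assumptions, not the implementation used in these notebooks: `X` and `W` are taken to be single standard normal covariates, $e_z$ has unit variance, and the function name `simulate_switching` and all parameter values are hypothetical.

```python
import random


def simulate_switching(n, beta1, beta0, eta, pi1r, pi1c, pi0r, pi0c,
                       sigma1=1.0, sigma0=1.0, seed=0):
    """Simulate n individuals from the two-regime model with scalar X and W."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)    # outcome covariate (assumed scalar)
        w = rng.gauss(0.0, 1.0)    # selection covariate (assumed scalar)
        u_r = rng.gauss(0.0, 1.0)  # latent factor in the outcome equations only
        u_c = rng.gauss(0.0, 1.0)  # latent factor common to outcomes and treatment
        # Selection equation: z = 1 exactly when the latent index is positive.
        z_star = eta * w + u_c + rng.gauss(0.0, 1.0)
        z = 1 if z_star > 0 else 0
        # Both potential outcomes exist, but only one is observed.
        y1 = x * beta1 + pi1r * u_r + pi1c * u_c + rng.gauss(0.0, sigma1)
        y0 = x * beta0 + pi0r * u_r + pi0c * u_c + rng.gauss(0.0, sigma0)
        data.append({"x": x, "w": w, "z": z,
                     "y": y1 if z == 1 else y0,  # the observed outcome
                     "y1": y1, "y0": y0})        # kept only for checking
    return data


sample = simulate_switching(n=2000, beta1=1.0, beta0=0.5, eta=0.0,
                            pi1r=0.5, pi1c=0.8, pi0r=0.5, pi0c=-0.3)
share_treated = sum(row["z"] for row in sample) / len(sample)
print(f"share treated: {share_treated:.3f}")
```

With `eta=0.0` the latent index is symmetric around zero, so roughly half of the simulated individuals should end up treated.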

The terms $u_r$ and $u_c$ are once again specified as standard normal random variables. As before, they are unobserved latent factors that we treat as parameters, like all the other parameters in the model. Note that I have assumed there are two such factors. One, $u_c$ (the c is a mnemonic for "common"), appears in both the outcome and treatment equations and so induces correlation across outcomes and treatment. The other, $u_r$ (the r is a mnemonic for "result"), appears only in the outcome equations and induces correlation across outcomes.

Point of interest: in observational data, we never actually observe both outcomes for a given individual. That is, we never see what happens to a patient who both takes and does not take a medication. But a Bayesian method allows us to handle this as well: the idea is to treat the unobserved outcome as yet another latent variable.

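To see how model parameters capture correlation in outcomes, note that the (unconditional) variance matrix of the outcomes and the latent treatment index, ordered as $(y_1, y_0, z^*)$ and with the latent factors integrated out, is given below. Here $\sigma_1^2$, $\sigma_0^2$, and $\sigma_z^2$ denote the variances of $e_1$, $e_0$, and $e_z$; the matrix is symmetric, so only the upper triangle is shown.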

$$
\Sigma = \left[
\begin{array}{ccc}
\sigma_1^2 + \pi_{1r}^2+\pi_{1c}^2 & \pi_{1r}\pi_{0r}+\pi_{1c}\pi_{0c} & \pi_{1c} \\
                                   & \sigma_0^2 + \pi_{0r}^2+\pi_{0c}^2 & \pi_{0c} \\
                                   &      &  \sigma_z^2 + 1 
\end{array}
\right]
$$
Usually, we also have to assume that $\sigma_z=1$, because the scale parameter in a probit model is not identified.
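
Each entry of $\Sigma$ follows from the independence of $u_r$, $u_c$, and the error terms; for example, the covariance between $y_1$ and $z^*$ is $\pi_{1c}$ because $u_c$ is the only term the two share. The sketch below checks the matrix by simulation in Python (standard library only). It is an illustration, not part of the original implementation: the function names and parameter values are hypothetical, and the covariates are dropped because they do not affect the variance matrix.

```python
import random


def analytic_sigma(pi1r, pi1c, pi0r, pi0c, s1, s0, sz=1.0):
    """Variance matrix of (y1, y0, z*) implied by the factor structure."""
    c10 = pi1r * pi0r + pi1c * pi0c
    return [
        [s1**2 + pi1r**2 + pi1c**2, c10, pi1c],
        [c10, s0**2 + pi0r**2 + pi0c**2, pi0c],
        [pi1c, pi0c, sz**2 + 1.0],
    ]


def simulated_sigma(pi1r, pi1c, pi0r, pi0c, s1, s0, sz=1.0, n=100_000, seed=1):
    """Sample variance matrix of the composite errors from n simulated draws."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        u_r = rng.gauss(0.0, 1.0)
        u_c = rng.gauss(0.0, 1.0)
        v1 = pi1r * u_r + pi1c * u_c + rng.gauss(0.0, s1)
        v0 = pi0r * u_r + pi0c * u_c + rng.gauss(0.0, s0)
        vz = u_c + rng.gauss(0.0, sz)
        draws.append((v1, v0, vz))
    means = [sum(d[k] for d in draws) / n for k in range(3)]
    return [[sum((d[i] - means[i]) * (d[j] - means[j]) for d in draws) / (n - 1)
             for j in range(3)] for i in range(3)]


# Hypothetical loadings and error standard deviations.
params = dict(pi1r=0.5, pi1c=0.8, pi0r=0.5, pi0c=-0.3, s1=1.0, s0=1.0)
exact = analytic_sigma(**params)
approx = simulated_sigma(**params)
for row_exact, row_approx in zip(exact, approx):
    print([round(v, 3) for v in row_exact], [round(v, 3) for v in row_approx])
```

The two printed matrices should agree up to simulation noise, and the sign of $\pi_{0c}$ shows up directly as the sign of the covariance between $y_0$ and $z^*$.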






So, given all of this, let's write the likelihood of everything in the model as follows:

$$
\begin{aligned}
L = {} & \frac{e^{-\frac{(y_1-X\beta_1-\pi_{1r}u_r-\pi_{1c}u_c)^2}{2\sigma_1^2}}}{\sqrt{2\pi \sigma_1^2}} \times \frac{e^{-\frac{(y_0-X\beta_0-\pi_{0r}u_r-\pi_{0c}u_c)^2}{2\sigma_0^2}}}{\sqrt{2\pi \sigma_0^2}} \times \\
& \frac{e^{-\frac{(z^*-\eta W-u_c)^2}{2}}}{\sqrt{2\pi}} \times \textrm{Prior}
\end{aligned}
$$
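
In practice it is easier to work with the logarithm of this expression. The Python sketch below (standard library only) evaluates the log of the three normal density terms for a single individual, treating both outcomes, the latent index $z^*$, and the latent factors as known. It is an illustration under assumptions: the prior term is left out, `x` and `w` are scalars, and the function name and the dictionary keys are hypothetical.

```python
import math


def log_normal_pdf(resid, sd):
    """Log density of a mean-zero normal with standard deviation sd at resid."""
    return -0.5 * math.log(2.0 * math.pi * sd**2) - resid**2 / (2.0 * sd**2)


def complete_data_loglik(y1, y0, z_star, x, w, u_r, u_c, p):
    """Log of the three likelihood terms for one individual (prior excluded)."""
    r1 = y1 - x * p["beta1"] - p["pi1r"] * u_r - p["pi1c"] * u_c
    r0 = y0 - x * p["beta0"] - p["pi0r"] * u_r - p["pi0c"] * u_c
    rz = z_star - p["eta"] * w - u_c  # selection residual, variance fixed at 1
    return (log_normal_pdf(r1, p["sigma1"])
            + log_normal_pdf(r0, p["sigma0"])
            + log_normal_pdf(rz, 1.0))


# Hypothetical parameter values.
p = {"beta1": 1.0, "beta0": 0.5, "eta": 0.7,
     "pi1r": 0.5, "pi1c": 0.8, "pi0r": 0.5, "pi0c": -0.3,
     "sigma1": 1.0, "sigma0": 1.0}
print(complete_data_loglik(y1=1.2, y0=0.4, z_star=0.9, x=1.0, w=0.5,
                           u_r=0.1, u_c=-0.2, p=p))
```

Summing this quantity over individuals and adding the log prior gives the log of the expression above, up to whatever the prior contributes.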

As written, the likelihood above treats everything as known. Of course, many things in the problem are not, such as the unobserved outcome and the actual value of the latent variable $z^*$. In a Bayesian analysis, however, we simply draw these variables along with everything else. Let's consider another `Stata` implementation of this model, in which we draw unobserved outcomes along with everything else.
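
To make "drawing these variables" concrete before turning to the Stata code, here is a minimal sketch in Python (standard library only) of the two augmentation draws for one individual, conditional on the parameters and latent factors. It is not the Stata implementation itself. It assumes $e_z$ is standard normal and that $e_1$, $e_0$, and $e_z$ are independent, as the product form of the likelihood suggests; the function names are hypothetical. The latent index is drawn from a normal truncated to the side of zero implied by the observed $z$, using the inverse-CDF method.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for cdf and inv_cdf


def draw_z_star(z, mean, rng):
    """Draw z* ~ N(mean, 1), truncated to z* > 0 if z == 1 and z* <= 0 if z == 0."""
    cut = _STD_NORMAL.cdf(-mean)  # probability that e_z falls below -mean
    lo, hi = (cut, 1.0) if z == 1 else (0.0, cut)
    u = lo + (hi - lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf away from 0 and 1
    return mean + _STD_NORMAL.inv_cdf(u)


def draw_missing_outcome(x, u_r, u_c, beta, pi_r, pi_c, sigma, rng):
    """Draw the unobserved outcome from its normal distribution given the factors."""
    return rng.gauss(x * beta + pi_r * u_r + pi_c * u_c, sigma)


rng = random.Random(42)
# A treated individual (z = 1): z* must be positive, and y0 is the missing outcome.
z_star = draw_z_star(z=1, mean=0.3, rng=rng)
y0_draw = draw_missing_outcome(x=1.0, u_r=0.2, u_c=0.1,
                               beta=0.5, pi_r=0.5, pi_c=-0.3, sigma=1.0, rng=rng)
print(z_star, y0_draw)
```

In a full sampler these draws would alternate with draws of the regression coefficients, loadings, variances, and latent factors, each conditional on the current values of everything else.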