# Instrumental Variables

The relation between the demand for and the price of commodities is a simple yet widespread problem in economics. Health economics is concerned with the study of how health-affecting behavior of individuals is influenced by the health-care system and regulation policy. Probably the most prominent example in public policy debates is smoking as it is related to many illnesses and negative externalities.

It is plausible that cigarette consumption can be reduced by taxing cigarettes more heavily. The question is by how much taxes must be increased to reach a certain reduction in cigarette consumption. Economists use elasticities to answer this kind of question. Since the price elasticity of demand for cigarettes is unknown, it must be estimated. An OLS regression of log quantity on log price cannot be used to estimate the effect of interest: price and quantity are determined simultaneously by demand and supply, so the price is correlated with the error term. Instead, IV regression can be used.
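A small simulation can make the simultaneity problem concrete. The sketch below does not use the cigarette data. It assumes a made-up linear demand and supply system with a true demand slope of -1, and the names `z`, `u`, `v`, `p` and `q` are hypothetical.

```R
set.seed(42)
n <- 5000

z <- rnorm(n)  # supply-side shifter (e.g. a tax); plays the role of the instrument
u <- rnorm(n)  # demand shock
v <- rnorm(n)  # supply shock

# assumed structural equations (illustrative only):
#   demand:          q = -1 * p + u
#   inverse supply:  p = 0.5 * q + z + v
# solving the two equations for the equilibrium price and quantity gives
p <- (0.5 * u + z + v) / 1.5
q <- -p + u

# OLS slope: biased because the equilibrium price depends on the demand shock u
ols <- unname(coef(lm(q ~ p))["p"])

# IV slope: ratio of the sample covariances with the instrument
iv <- cov(z, q) / cov(z, p)

c(ols = ols, iv = iv)
```

Under these assumed parameters the OLS slope converges to $-2/3$ rather than $-1$, because $\mathrm{cov}(p, u) = 1/3$ and $\mathrm{var}(p) = 1$. The IV ratio converges to the true demand slope.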

We use the data set `CigarettesSW`, which comes with the package `AER`. It is a panel data set that contains observations on cigarette consumption and several economic indicators for the 48 continental U.S. states in the years 1985 and 1995. Following the book, we consider data for the cross section of states in 1995 only.

We're interested in estimating $\beta_1$ in: 

$\log(Q_i^{cigarettes}) = \beta_0 + \beta_1 \log(P_i^{cigarettes}) + u_i$

where $Q_i^{cigarettes}$  is the number of cigarette packs per capita sold and $P_i^{cigarettes}$ is the after-tax average real price per pack of cigarettes in state $i$.

The instrumental variable we use for the endogenous regressor $\log(P_i^{cigarettes})$ is `SalesTax`, the portion of taxes on cigarettes arising from the general sales tax, measured in dollars per pack (it is stored as `salestax` in the code below). `SalesTax` is plausibly a relevant instrument because it is included in the after-tax average price per pack. It is also plausibly exogenous, since the sales tax should not influence the quantity sold directly but only indirectly through the price.
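Stated formally, write $Y_i = \log(Q_i^{cigarettes})$, $X_i = \log(P_i^{cigarettes})$ and $Z_i$ for the instrument. This shorthand is introduced here for illustration only. The two conditions for a valid instrument are:

$\text{relevance: } \mathrm{corr}(Z_i, X_i) \neq 0, \qquad \text{exogeneity: } \mathrm{corr}(Z_i, u_i) = 0$

With a single endogenous regressor and a single instrument, the IV (two-stage least squares) estimator of $\beta_1$ reduces to a ratio of sample covariances:

$\hat{\beta}_1^{IV} = \frac{s_{ZY}}{s_{ZX}} = \frac{\sum_i (Z_i - \bar{Z})(Y_i - \bar{Y})}{\sum_i (Z_i - \bar{Z})(X_i - \bar{X})}$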

We perform some transformations in order to obtain deflated cross section data for the year 1995.

This is the code to do it in R:

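```R
# load the AER package and the data set
library(AER)
data("CigarettesSW")

# compute real prices per pack (deflated by the CPI)
CigarettesSW$rprice <- with(CigarettesSW, price / cpi)

# compute the real sales tax per pack:
# taxes including sales tax (taxs) minus excise taxes (tax), deflated
CigarettesSW$salestax <- with(CigarettesSW, (taxs - tax) / cpi)

# generate a subset for the year 1995
c1995 <- subset(CigarettesSW, year == "1995")
```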

### (a) Correlation

Calculate the correlation between sales tax and price.  
What does this suggest about sales tax being an adequate instrument for price?
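
One possible way to approach this is sketched below. It assumes the `AER` package is installed, and it repeats the transformations so that the snippet runs on its own. The first-stage regression is an optional extra, not something the exercise asks for. A correlation clearly different from zero speaks only to instrument relevance. It says nothing about exogeneity.

```R
library(AER)
data("CigarettesSW")

# rebuild the deflated variables and the 1995 cross section
CigarettesSW$rprice <- with(CigarettesSW, price / cpi)
CigarettesSW$salestax <- with(CigarettesSW, (taxs - tax) / cpi)
c1995 <- subset(CigarettesSW, year == "1995")

# sample correlation between the instrument and the endogenous regressor
rho <- cor(c1995$salestax, c1995$rprice)
rho

# first-stage regression: a complementary look at instrument relevance
first_stage <- lm(log(rprice) ~ salestax, data = c1995)
summary(first_stage)$fstatistic
```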

### (b) IV regression

Run an Instrumental Variable Regression to estimate the effect of cigarette price on cigarette consumption using sales tax as an instrument for cigarette price.
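
A minimal sketch of such a regression using `ivreg()` from `AER` follows. It assumes the package is installed and repeats the data preparation so that it can be run in isolation. The object name `cig_ivreg` and the choice of robust `HC1` standard errors are illustrative assumptions, not requirements of the exercise.

```R
library(AER)
data("CigarettesSW")

# rebuild the deflated variables and the 1995 cross section
CigarettesSW$rprice <- with(CigarettesSW, price / cpi)
CigarettesSW$salestax <- with(CigarettesSW, (taxs - tax) / cpi)
c1995 <- subset(CigarettesSW, year == "1995")

# IV regression: regressors go before `|`, instruments after it
cig_ivreg <- ivreg(log(packs) ~ log(rprice) | salestax, data = c1995)

# coefficient table with heteroskedasticity-robust standard errors
coeftest(cig_ivreg, vcov. = vcovHC(cig_ivreg, type = "HC1"))
```

Because both quantity and price enter in logs, the coefficient on `log(rprice)` can be read as the estimated price elasticity of demand.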