# POLSCI 3

## Week 7, Activity 1: $p$-values in experiments

In this notebook, we will get more practice with the `estimatr` package. Let's start by loading the package and reading in the data from the lobbying experiment.

In [None]:
#RUN THIS CELL
library(testthat)
library(estimatr)

data <- read.csv('ps3_lobbying.csv')
head(data)

Here is a quick reminder of what each column means:

- `caseid`: Number that identifies each legislator/district
- `supportgroup`: This is the *outcome*. It is a measure of whether the legislator agreed to list their name publicly as a "sponsor" of the bill.
- `treat`: This is the *treatment*. It has several possible values:
    - `"control"`: the office received no contact from the lobbyist
    - `"officelobby"`: the legislator was asked to meet to discuss the bill in their office
    - `"sociallobby"`: the legislator was asked to meet to discuss the bill at a social location (a restaurant or bar)
- `ally`: The authors thought that social lobbying might be especially effective among legislators who had supported the group's priorities in the past. To measure this, they asked the lobbyist: "In your opinion, how well does the phrase ‘ally of the interest group’ describe the legislator?" This is therefore the lobbyists' rating of whether the legislator is an ally of the interest group (values 0, 1/3, 2/3, and 1).
- `female`: Legislator gender (1 = legislator is female, 0 = not)

Again, here is what `treat` looks like:

In [None]:
table(data$treat)

-------------

The researchers who did this experiment wanted to argue that social lobbying is more effective than office-based lobbying. Let's see what the experiment says about this.

**Question 1.** Use the `difference_in_means` function to estimate the effect of social lobbying **relative to office-based lobbying** (that is, with office-based lobbying as the baseline). Make the legislators assigned to social lobbying the treatment group and the legislators assigned to office lobbying the baseline group.
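
As a reminder of what a difference in means measures, here is a minimal sketch in base R on made-up data. The data frame `toy` and its columns `outcome` and `arm` are hypothetical and are not part of the lobbying dataset, and this is not the `estimatr` call the question asks for. It only shows the arithmetic behind the estimate: the treatment group's mean minus the baseline group's mean.

```r
# Made-up toy data; the names `toy`, `outcome`, and `arm` are hypothetical.
toy <- data.frame(
  outcome = c(0, 1, 0, 0, 1, 1, 0, 1),
  arm     = c("baseline", "baseline", "baseline", "baseline",
              "treated", "treated", "treated", "treated")
)

# Mean outcome within each arm
arm.means <- tapply(toy$outcome, toy$arm, mean)

# Difference in means: treated group minus baseline group
toy.estimate <- arm.means[["treated"]] - arm.means[["baseline"]]
toy.estimate  # 0.75 - 0.25 = 0.5
```

Swapping which group counts as the baseline flips the sign of the estimate, so it matters which group you put in which role.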


In [None]:
social.vs.office <- NULL # YOUR CODE HERE
social.vs.office

---

**Question 2.** What does the $p$-value you calculated in Question 1 mean?

- `'a'`: The probability that social lobbying is more effective than office lobbying
- `'b'`: The probability that social lobbying is not more effective than office lobbying
- `'c'`: The probability that we would see an estimate as large as or larger than the one we saw in the experiment if social lobbying and office lobbying were equally effective
- `'d'`: The probability that we would see an estimate as large as or larger than the one we saw in the experiment if social lobbying were more effective than office lobbying

Replace the `...` below with your answer. For example, to answer `'a'`, below you would write `q2.answer <- 'a'`.


In [None]:
# Put your answer where the ... is below. Leave the quotes there.
# For example, to answer a, your answer would look like this: 'a'
q2.answer <- '...'

-------

**Question 3.** Is the $p$-value you calculated in Question 1 "statistically significant" (when using the conventional definition of statistical significance we covered in class)? Enter `TRUE` or `FALSE` below.
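
The mechanics of a significance check can be sketched on made-up data as follows. Everything here is an assumption for illustration: the data frame `toy` is hypothetical, base R's `t.test` stands in for an `estimatr` result, and the threshold `alpha` is set to the common 0.05 convention, which may or may not match the definition from class. Use the definition from class and your own result from Question 1 for your answer.

```r
# Made-up toy data; the names `toy`, `outcome`, and `arm` are hypothetical.
toy <- data.frame(
  outcome = c(0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1),
  arm     = rep(c("baseline", "treated"), each = 6)
)

# Assumed threshold (common convention); check it against the definition from class.
alpha <- 0.05

# Base R two-sample t-test, used here only as a stand-in for an estimatr result
toy.test <- t.test(outcome ~ arm, data = toy)
toy.p <- toy.test$p.value

# TRUE if the p-value falls below the threshold, FALSE otherwise
toy.is.sig <- toy.p < alpha
toy.is.sig
```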


In [None]:
p.val.is.stat.sig <- NULL # YOUR CODE HERE

------

**Question 4.** Dr. Luvs To Lobby is Stanford's Director of Government Affairs. He was rejected from Berkeley many years ago, and has hated Berkeley ever since. And now, Berkeley was even ranked as the #1 college, above Stanford! He's outraged. 😡 He wants to lobby the California state legislature to cut Berkeley's budget.

Dr. Lobby is trying to decide how to go about lobbying California state legislators. He sees the results of the study on social lobbying you computed in Question 1 and writes this email to his colleagues:

> I just read an analysis of data from a new study that shows that social lobbying is no more effective than office lobbying. I know this because, when the researchers compared social lobbying to office lobbying, they found no statistically significant difference between the two. When we go to lobby the legislature this week, this study shows that we should lobby them in their offices, and *avoid social lobbying* if possible.

Is his interpretation correct? Why or why not?

- `'a'`: Dr. Lobby is correct: the insignificant $p$-value shows the researchers' hypothesis that social lobbying works is wrong
- `'b'`: Dr. Lobby is correct: the insignificant $p$-value shows that office lobbying is as effective as social lobbying
- `'c'`: Dr. Lobby is incorrect: the insignificant $p$-value shows that we cannot disprove a skeptic of social lobbying, but our best guess is still that social lobbying is more effective than office lobbying
- `'d'`: Dr. Lobby is incorrect: because results from experiments are always ambiguous, we can't learn anything from the experiment about whether social lobbying is more effective

Replace the `...` below with your answer. For example, to answer `'a'`, below you would write `q4.answer <- 'a'`.


In [None]:
# Put your answer where the ... is below. Leave the quotes there.
# For example, to answer a, your answer would look like this: 'a'
q4.answer <- '...'

------

**YOU DO NOT NEED TO ANSWER QUESTION 5 IN THIS INDIVIDUAL ASSIGNMENT, BUT YOU WILL ANSWER IT IN TODAY'S GROUP ASSIGNMENT. TODAY'S GROUP ASSIGNMENT WILL LAST 10 MINUTES LONGER THAN NORMAL TO GIVE YOU TIME TO ANSWER QUESTION 5.**

**Question 5.** After graduation, you get a job at a law firm that specializes in constitutional law. The lawyer you are working with is getting ready to argue a case before the Supreme Court about a new law that restricts lobbying. But the lawyer has never taken a statistics class, so they need your help: the lawyer wants you to summarize the results of several recent studies for them, including the Grose et al. study we analyzed this week.

To help you, here's a reminder about all the findings we've calculated from this dataset across this and the previous notebooks:

In [None]:
# This is the estimated effect of social lobbying relative to control (no lobbying)
difference_in_means(supportgroup ~ treat, data, condition1 = 'control', condition2 = 'sociallobby')

In [None]:
# This is the estimated effect of office lobbying relative to control (no lobbying)
difference_in_means(supportgroup ~ treat, data, condition1 = 'control', condition2 = 'officelobby')

In [None]:
# This is the estimated effect of social lobbying relative to office lobbying,
# which is what you estimated in Question 1 above.
social.vs.office

<!-- BEGIN QUESTION -->

Based on the results you've seen, summarize the findings of the study for the lawyer: what does the study say about the effectiveness of social lobbying, office-based lobbying, and any differences in the effectiveness of each? Remember that the lawyer hasn't taken statistics, so you need to try to explain your answer in a way an average person might understand.

**Please limit your answer to 3-5 sentences.**


_Type your answer here, replacing this text._

<!-- END QUESTION -->

----

## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit.

In [None]:
ottr::export("Week7_Activity1.ipynb", pdf = TRUE, force_save = TRUE)