# Practice! 

We've just gone through a basic description of how *random assignment* might assign people to be in the treatment or the control groups. 

We want you to take this point home: 

> Even when there is no treatment effect at all, random assignment means that what we estimate in any particular experiment might be larger or smaller than the truth.

This doesn't mean that the estimator doesn't *work*, per se, but rather that when we're in a world with finite data, the value that our **estimator** produces in any particular instance might be slightly high or low. 
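
To make this concrete, here is a minimal sketch of the idea, not a definitive implementation. It assumes a hypothetical outcome vector `y` that is identical under treatment and control (so the true effect is exactly zero), an arbitrary seed, and 1,000 repetitions; the helper name `one_experiment` is ours, introduced only for illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical outcomes that do not depend on treatment: the true ATE is 0.
y <- c(1:20, 51:70)

one_experiment <- function() {
    # shuffle twenty 0s and twenty 1s so that exactly half are treated
    treat <- sample(rep(c(0, 1), each = length(y) / 2))
    # difference in means between the treated and control groups
    mean(y[treat == 1]) - mean(y[treat == 0])
}

# Re-run the same "experiment" many times with fresh random assignments.
estimates <- replicate(1000, one_experiment())

range(estimates)  # individual estimates land above and below zero
mean(estimates)   # the average across experiments
```

Compare the spread reported by `range(estimates)` with the average reported by `mean(estimates)` to see how much any single experiment can differ from the truth.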

## Define Functions 

Below, we copy versions of the functions that you just worked through. 

In [1]:
randomize <- function(size) { 
    ## this is a dead-simple randomizer 
    ## ---   
    ## inputs: the size of the population (must be even)
    ## ---
    ## output: a vector of 1s and 0s 
    
    list_of_assignments <- c(rep(0, size/2), rep(1, size/2))
    randomized_assignments <- sample(list_of_assignments)
    
    return(randomized_assignments)
} 

est_ate <- function(outcome, treat) { 
    ## this is a dead-simple estimator
    ## --- 
    ## inputs: a vector of outcomes and a vector of
    ## treatment assignments 
    ## --- 
    ## output: A difference in means estimate of the ATE
    
    mean(outcome[treat==1]) - mean(outcome[treat==0])
}

**QUESTION OF UNDERSTANDING**: 

1. Look at the way that we have defined the randomizer, called `randomize` above. Think (or read) back to section 2.5 that we previously read in _FEDAI_. Would you say that this randomizer is an example of:
  1. Complete Random Assignment; or, 
  2. Simple Random Assignment?

## Define Data 

We also include the definition of the data from this section. 

In [2]:
d <- data.frame(
    group = c(rep("Man", 20), rep("Woman", 20)), 
    po_control = c(1:20, 51:70),
    po_treat   = c(1:20, 51:70)
    )
head(d)

group,po_control,po_treat
Man,1,1
Man,2,2
Man,3,3
Man,4,4
Man,5,5
Man,6,6


While the randomizer that we've used to this point might *work*, perhaps it is a little bit more limited than what we would like. 

In the next cell, look into how you might use random draws from a binomial distribution to create the control (0) and treatment (1) indicators. 

In [3]:
# ?rbinom
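
As a starting point, here is a small sketch of what a call to `rbinom` looks like. The length (10), the probability (0.3), the seed, and the variable name `coin_flips` are placeholders chosen for illustration, not the values the exercise asks for.

```r
set.seed(42)  # arbitrary seed, only so the draws are reproducible

# Ten independent draws, each equal to 1 with probability 0.3 and 0 otherwise.
# With size = 1, every draw is a single Bernoulli trial.
coin_flips <- rbinom(n = 10, size = 1, prob = 0.3)
coin_flips

# The number of 1s is itself random, so it can change from one run to the next.
sum(coin_flips)
```

Here `n` controls how many draws come back, `size` is the number of trials behind each draw, and `prob` is the chance of success on each trial.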

Use the `rbinom` function to do the following: 

- Create a vector of length 40 in which each person has a 50% chance of being in treatment. Assign it to a variable on the data frame `d` called `treat_50`. 
- Create a vector of length 40 in which each person has a 65% chance of being in treatment. Assign it to a variable on the data frame `d` called `treat_65`. 

In [4]:
d$treat_50 <- ""
d$treat_65 <- "" 

Suppose that you're living in Bill Murray's *Groundhog Day* movie and you can run the same experiment, on the same set of people, a possibly limitless number of times. 

In one of those *possible* experiments you use the `treat_50` indicator as your treatment assignment, so each person has a 50% chance of being in treatment. In another of those experiments you use the `treat_65` indicator, so each person has a 65% chance of being in treatment. 

- Are both of these assignments **random**? 
- In any one *particular* experiment that we run, should the two assignments produce the same estimate of the treatment effect? Why or why not? 
- Use the data that you've just produced in the data frame `d` to look at this. 

In [5]:
ate_50 <- "" # est_ate()
ate_65 <- "" # est_ate()

- Across a large number of these experiments, should the two assignments produce estimates of the treatment effect that are *in expectation* the same? Why or why not? 
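
If you want to check your reasoning empirically, one way to set up the *Groundhog Day* loop is sketched below. This is only a sketch under stated assumptions: `d_sketch` is a stand-in copy of the potential outcomes from `d`, and the helper `simulate_ates`, the seed, and the 1,000 repetitions are our own illustrative choices.

```r
set.seed(2)  # arbitrary seed, only so the sketch is reproducible

# Stand-in copy of the potential outcomes defined earlier in `d`.
d_sketch <- data.frame(
    po_control = c(1:20, 51:70),
    po_treat   = c(1:20, 51:70)
)

simulate_ates <- function(prob, n_experiments = 1000) {
    ## hypothetical helper: re-runs the experiment many times
    ## ---
    ## inputs: the probability of treatment and the number of experiments
    ## ---
    ## output: a vector with one ATE estimate per experiment
    replicate(n_experiments, {
        # each person is assigned independently with probability `prob`
        treat <- rbinom(n = nrow(d_sketch), size = 1, prob = prob)
        # we only observe the potential outcome that matches the assignment
        observed <- ifelse(treat == 1, d_sketch$po_treat, d_sketch$po_control)
        mean(observed[treat == 1]) - mean(observed[treat == 0])
    })
}

ates_50 <- simulate_ates(0.50)
ates_65 <- simulate_ates(0.65)

summary(ates_50)
summary(ates_65)
```

Compare the two summaries against the answers you wrote down above. Note that with independent draws it is possible, though extremely unlikely with 40 people, for everyone to land in the same group, in which case one of the means would be undefined.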