# Module 7: Random Sampling

In this module, you will learn how to randomly generate numbers in some range. This is useful for selecting a simple random sample in observational studies and for assigning individuals to treatments in experiments.

## Generating Random Numbers

R has many ways of generating random numbers. In this module, we focus on the "sample()" function. This function draws some number of values from a list.

The "sample()" function has two main inputs: "x" and "size". The "x" input is the list of values that we want to sample from. The "size" input is the number of values we want to sample. If we leave out "size", R draws as many values as there are in "x".

It is common to sample values between 1 and n, where n is some number. That means that we want "x" to equal a list containing all the whole numbers from 1 to n. We can generate this list in R using ":". The ":" function takes a starting value on its left and an ending value on its right, so "1:n" gives every whole number from 1 to n **(":" is really an operator rather than a function, but I think calling it that would be unnecessarily confusing)**.
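To see what ":" produces on its own, here is a small sketch. The names "first.ten" and "big.list" are made up for this illustration.

```r
# The ":" operator builds every whole number from the left value to the right value.
first.ten = 1:10
first.ten             # shows the values 1 through 10, in order

# The same idea with n = 100.
big.list = 1:100
length(big.list)      # counts how many values the list holds
```

Printing a list made with ":" before sampling from it is a quick way to check that it starts and ends where we intended.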

Let's make a list of all the values from 1 to 100, then randomly sample 20 of them.

In [None]:
our.list = 1:100
sample(x=our.list, size=20)
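
By default, "sample()" draws without replacement, so no value can be picked twice. The sketch below stores the sample and checks this. The name "our.sample" is an assumption made for this illustration, and the checking functions are just one possible way to look at the result.

```r
our.list = 1:100
our.sample = sample(x=our.list, size=20)   # store the sample so we can inspect it

length(our.sample)              # the number of values that were drawn
anyDuplicated(our.sample)       # 0 means no value was drawn twice
all(our.sample %in% our.list)   # TRUE means every value came from our.list
```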

Try re-running the last cell a few times. You will get a different answer each time. This is what we expect from random sampling, but sometimes we want to be able to repeat our random sampling. We are able to do this because all randomness in R is actually pseudo-random. The procedure that R uses to generate pseudo-random numbers is not random at all, just really complicated. R uses a 'random seed' to determine where to start this procedure. If we fix the random seed, then we will get the same 'random' numbers every time. We can fix the random seed in R using the "set.seed()" function.

The "set.seed()" function has only one input that we need: "seed". The "seed" input tells R what random seed to use, and it can be any whole number (typically people don't use more than eight digits). Remember that if we give a function only one input, we do not need to include the name of that input.

Let's again generate 20 numbers between 1 and 100, but this time we will set the random seed to 12345678.

In [None]:
set.seed(12345678)
sample(x=our.list, size=20)

You can run this cell as many times as you want; you will get the same answer every time. The "set.seed()" function allows us to make sure our work is reproducible, by making sure that we can get the same sample every time.
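
The same idea can be used to assign individuals to treatments in a reproducible way. The sketch below is only an illustration: the experiment, the 20 individuals, and the names "individuals", "shuffled", "group.a", and "group.b" are all hypothetical. It uses square brackets to pick values out of a list by position.

```r
# Hypothetical experiment: 20 individuals numbered 1 to 20, two treatments.
individuals = 1:20

set.seed(12345678)
shuffled = sample(x=individuals, size=20)   # all 20 individuals in a random order
group.a = shuffled[1:10]                    # the first 10 get treatment A
group.b = shuffled[11:20]                   # the remaining 10 get treatment B

# Setting the same seed and repeating the draw gives the same order again.
set.seed(12345678)
shuffled.again = sample(x=individuals, size=20)
identical(shuffled, shuffled.again)         # TRUE when the two orders match exactly
```

Because the seed is fixed, anyone who re-runs this code gets the same two groups, which makes the assignment easy to check later.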