# Lesson 4.6: Intro to Statistics & Distributions for random variables

### Lesson Duration: 3 hours

> Purpose: The purpose of this lesson is to introduce some concepts from statistics, such as _population_, _sample_, _random samples_, _random variables_, _bias_, and _variance_. We will then develop the idea of _continuous and discrete distributions from random variables_.

---

### Setup

- All previous set up

### Learning Objectives

After this lesson, students will be able to:

- Perform basic probability calculations and explain the logic behind them
- Incorporate statistics technical vocabulary, including: _population_, _sample_, _random samples_, and _random variables (continuous and discrete)_
- Interpret continuous distributions and discrete distributions

---

### Lesson 1 key concepts

> :clock10: 20 min

- Population
- Samples and random samples
- Random variables

      - Continuous random variables
      - Discrete random variables

<details>
  <summary> Discussion: Statistics  </summary>

- A population is an aggregate/collection of creatures, things, cases, and so on. A population commonly contains too many individuals to study conveniently, so an investigation is often restricted to one or more samples drawn from it.
- The relation between the sample and the population is such that it allows inferences to be made about a population from that sample.
- A random sample is one whose observations are picked from the population at random, without any bias.
- The population mean and population standard deviation differ from the sample mean and sample standard deviation:

      - The population mean and standard deviation are fixed (or assumed to be fixed) and are called population parameters.
      - The sample mean and sample standard deviation vary every time we calculate them, because each random sample contains different values. They are called sample statistics.

- A random variable, usually written `X`, is a variable whose possible values are numerical outcomes of a random phenomenon (usually the thing under observation, that is, the quantity we are trying to measure); for example, the height of people in the US or the marks scored in a test. A random variable can be either continuous or discrete.

      - *Discrete random variable*: The set of values that this random variable can take is discrete (usually, but not necessarily, counts)
      - *Continuous random variable*: The set of values that this random variable can take is continuous (any value within an interval)
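
The difference between population parameters and sample statistics can be shown with a short simulation. This is a minimal sketch using only the Python standard library; the population of heights (mean 170, standard deviation 10, 100,000 individuals) and the sample size of 50 are made-up values chosen for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is reproducible

# Hypothetical population: heights (cm) of 100,000 people.
population = [random.gauss(170, 10) for _ in range(100_000)]

# Population parameters: computed once from the whole population, so they are fixed.
population_mean = statistics.fmean(population)
population_std = statistics.pstdev(population)

# Sample statistics: recomputed for each random sample, so they vary.
sample_means = []
for _ in range(3):
    sample = random.sample(population, k=50)  # random sample without replacement
    sample_means.append(statistics.fmean(sample))
    print(f"sample mean = {sample_means[-1]:.2f}, "
          f"sample std = {statistics.stdev(sample):.2f}")

print(f"population mean = {population_mean:.2f}, "
      f"population std = {population_std:.2f}")
```

Each of the three samples gives a slightly different mean and standard deviation, while the population values stay the same no matter how many samples are drawn.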

</details>

---

:coffee: **BREAK**

---

#### :pencil2: Check for Understanding - Class activity/quick quiz

> :clock10: 10 min (+ 10 min Review)

<details>
  <summary> Click for Instructions: Activity 1 </summary>

- Link to [activity 1](https://github.com/ironhack-edu/data_4.06_activities/blob/master/4.06_activity_1.md).

</details>

<details>
  <summary>Click for Solution: Activity 1 solutions</summary>

- Link to [activity 1 solution](https://gist.github.com/ironhack-edu/72416224343383b430e388e7701ff223).

</details>
    
---

:coffee: **BREAK**

---

### Lesson 2 key concepts

> :clock10: 20 min

- Introduction to **probability**

      - Experiment
      - Sample space
      - Random variables
      - Probability distributions

- Discrete distributions

      - Bernoulli distribution
      - Binomial distribution

<details>
  <summary> Click for Description: Intro to probability </summary>

:exclamation: Note to instructor: You can use the examples of flipping a coin and rolling a die to explain these concepts.

- **Probability theory** is concerned with determining the likelihood that a certain event will occur during a given random experiment.
- **Experiment** is any situation that involves observation or measurement. Random experiments are those which can have different outcomes regardless of the initial conditions; from here on they are referred to simply as experiments.
- **Sample space** - The results obtained from an experiment are known as the outcomes. Sample space is the set of all possible outcomes for that experiment.
- **Events** - We can create a subset of the sample space called an event. We then enumerate all outcomes in the event. Each time the experiment is run, a given event A either occurs or does not occur. Intuitively, you should think of an event as a meaningful statement about the experiment.
- **Random Variable** is a real-valued function on the sample space. It is a measurement of interest in the context of the random experiment. A random variable `X` is random in the sense that its value depends on the outcome of the experiment, which can't be predicted with certainty before the experiment is run. As mentioned before, random variables can be either discrete or continuous.
- **Calculating probabilities** - When all outcomes are equally likely, the probability of an event is the ratio of the number of outcomes in the event to the number of outcomes in the entire sample space.
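
The die and coin examples can be written out directly as sets. The sketch below assumes that every outcome is equally likely (a fair die, a fair coin); the helper names `probability` and `number_of_heads` are illustrative and not part of any library.

```python
from fractions import Fraction
from itertools import product


def probability(event, sample_space):
    """P(event) when every outcome in the sample space is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


# Experiment 1: roll a fair six-sided die once.
die = {1, 2, 3, 4, 5, 6}                         # sample space
even = {o for o in die if o % 2 == 0}            # event "the roll is even"
p_even = probability(even, die)

# Experiment 2: flip a fair coin twice.
two_flips = set(product("HT", repeat=2))         # sample space with 4 outcomes


def number_of_heads(outcome):
    """Random variable X: maps each outcome to a number (count of heads)."""
    return outcome.count("H")


exactly_one_head = {o for o in two_flips if number_of_heads(o) == 1}
p_one_head = probability(exactly_one_head, two_flips)

print(p_even, p_one_head)  # 1/2 1/2
```

Using `Fraction` keeps the results exact, which makes it easier to compare them with calculations done by hand.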

</details>

<details>
  <summary> Click for Description: Discrete Distributions </summary>

- In the field of probability, a **distribution function** is a function that maps numerical values to probabilities.
- Discrete distributions arise from discrete random variables. Discrete random variables can take discrete values (usually but not necessarily counts).
- This means that a discrete probability distribution is characterized by having a finite or countably infinite number of outcomes in the sample space. The probabilities of all outcomes in the sample space must add up to 1.

- **Bernoulli distribution** describes the outcome of a single yes or no event. An example of this distribution is a coin toss. We do not have to use a fair coin (a coin where there is a 50% chance of heads and 50% chance of tails). We can describe the probability of heads as `p` and say that the probability of tails is `1-p`.

- **Binomial distribution** - When `n` independent Bernoulli experiments are conducted, the number of successes follows a binomial distribution. Each of the `n` experiments is either a success or a failure, with the same probability of success `p`.
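
The link between the two distributions can be sketched in a few lines. The example below assumes `n = 10` trials and `p = 0.3`, both arbitrary values, and uses the standard binomial probability formula `C(n, k) * p^k * (1 - p)^(n - k)`; the function names are illustrative.

```python
import random
from math import comb


def bernoulli_trial(p, rng):
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent Bernoulli(p) trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


rng = random.Random(0)  # seeded generator so the run is reproducible
n, p = 10, 0.3          # hypothetical values for illustration

# One binomial observation = number of successes in n Bernoulli trials.
successes = sum(bernoulli_trial(p, rng) for _ in range(n))

# Probability of every possible number of successes, from 0 to n.
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(f"successes in one run of {n} trials: {successes}")
print(f"P(X = 3) = {pmf[3]:.4f}")
print(f"sum of all probabilities = {sum(pmf):.4f}")
```

The last line checks the property stated above: the probabilities of all outcomes add up to 1.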

</details>

---

#### :pencil2: Check for Understanding - Class activity/quick quiz

> :clock10: 10 min (+ 10 min Review)

<details>
  <summary> Click for Instructions: Activity 2 </summary>

- Link to [activity 2](https://github.com/ironhack-edu/data_4.06_activities/blob/master/4.06_activity_2.md).

</details>

<details>
  <summary>Click for Solution: Activity 2 solutions</summary>

- Link to [activity 2 solution](https://gist.github.com/ironhack-edu/08250afabe8d960107b1d36eef8180a3).

</details>

---

:coffee: **BREAK**

---

### Lesson 3 key concepts

> :clock10: 20 min

- Discrete distributions

      - Geometric distribution

- Continuous Distributions I

      - Normal distribution
      - Properties of normal distribution
      - Standard normal distribution

<details>
  <summary> Description: Geometric distribution  </summary>

- Discuss the geometric distribution, its parameters, domain (set of values of `x` on which it is defined) and properties.
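
One way to introduce the geometric distribution is to simulate it. The sketch below assumes the "number of trials up to and including the first success" convention, so the domain is `k = 1, 2, 3, ...`; some references count failures before the first success instead. The value `p = 0.2` and the function names are illustrative.

```python
import random


def geometric_pmf(k, p):
    """P(first success occurs on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p


def trials_until_first_success(p, rng):
    """Run Bernoulli(p) trials and count how many are needed for one success."""
    trials = 1
    while rng.random() >= p:  # this trial failed, so try again
        trials += 1
    return trials


rng = random.Random(1)  # seeded generator so the run is reproducible
p = 0.2                 # hypothetical probability of success
draws = [trials_until_first_success(p, rng) for _ in range(20_000)]
simulated_mean = sum(draws) / len(draws)

print(f"P(X = 3) = {geometric_pmf(3, p):.3f}")
print(f"simulated mean = {simulated_mean:.2f}, theoretical mean 1/p = {1 / p:.2f}")
```

Under this convention the single parameter is `p`, and the mean is `1/p`, which the simulated average should approach.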

</details>

<details>
  <summary> Description: Normal distribution  </summary>

- Discuss the normal distribution, its parameters, domain (set of values of `x` on which it is defined) and properties.
- Discuss the standard normal distribution.
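
The relation between a normal distribution and the standard normal distribution can be demonstrated with `statistics.NormalDist` from the Python standard library (Python 3.8+). This is a sketch under assumed parameters: a mean of 170 and a standard deviation of 10 are made-up values, and `z_score` is an illustrative helper.

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 170, standard deviation 10.
heights = NormalDist(mu=170, sigma=10)

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def z_score(x, dist):
    """Number of standard deviations that x lies above the mean of dist."""
    return (x - dist.mean) / dist.stdev


x = 185
z = z_score(x, heights)

# Standardizing does not change probabilities: P(X <= x) equals P(Z <= z).
p_x = heights.cdf(x)
p_z = standard_normal.cdf(z)

# Share of values within one standard deviation of the mean (about 68%).
within_one_sigma = heights.cdf(180) - heights.cdf(160)

print(f"z = {z}")
print(f"P(X <= {x}) = {p_x:.4f}, P(Z <= {z}) = {p_z:.4f}")
print(f"P(160 <= X <= 180) = {within_one_sigma:.4f}")
```

The domain is the whole real line, and the two parameters (mean and standard deviation) fully determine the curve.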

</details>

---

:coffee: **BREAK**

---

### :pencil2: Check for Understanding - Class activity/quick quiz

> :clock10: 30 min

<details>
  <summary> Click for Instructions: Activity 3 </summary>

- Link to [activity 3](https://github.com/ironhack-edu/data_4.06_activities/blob/master/4.06_activity_3.md).

</details>

<details>
  <summary>Click for Solution: Activity 3 solutions</summary>

- Link to [activity 3 solution](https://gist.github.com/ironhack-edu/bd9215d3a7290c2d6fa4cd47178f3af8).

</details>

---

### Lesson 4 key concepts

> :clock10: 20 min

- Continuous Distributions II

      - Exponential distribution
      - Uniform distribution

<details>
  <summary> Description: Exponential distribution  </summary>

- Explain the exponential distribution, its parameters, domain (set of values of `x` on which it is defined) and properties.
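
A small sketch may help when presenting the exponential distribution. It assumes the rate parameterization (density `rate * exp(-rate * x)` for `x >= 0`, mean `1 / rate`); the rate of 0.5 events per minute is a made-up value, and the function names are illustrative.

```python
import math
import random


def exponential_pdf(x, rate):
    """Density of Exponential(rate) at x; zero for negative x."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def exponential_cdf(x, rate):
    """P(X <= x) for Exponential(rate)."""
    return 1 - math.exp(-rate * x) if x >= 0 else 0.0


rate = 0.5              # hypothetical: 0.5 events per minute on average
rng = random.Random(2)  # seeded generator so the run is reproducible

# Simulated waiting times between events.
waits = [rng.expovariate(rate) for _ in range(20_000)]
simulated_mean = sum(waits) / len(waits)

print(f"P(wait <= 2 minutes) = {exponential_cdf(2, rate):.4f}")
print(f"simulated mean = {simulated_mean:.2f}, theoretical mean 1/rate = {1 / rate:.2f}")
```

The domain is `x >= 0`, which is why both helper functions return zero for negative inputs.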

</details>

<details>
  <summary> Description: Uniform distribution  </summary>

- Explain the uniform distribution, its parameters, domain (set of values of `x` on which it is defined) and properties.
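
The continuous uniform distribution is simple enough to code from scratch. The sketch below assumes an interval from `a = 2` to `b = 10`, chosen only for illustration; the function names are not from any library.

```python
import random


def uniform_pdf(x, a, b):
    """Density of Uniform(a, b): constant on [a, b], zero elsewhere."""
    return 1 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a, b):
    """P(X <= x) for Uniform(a, b)."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


a, b = 2.0, 10.0        # hypothetical interval endpoints
rng = random.Random(3)  # seeded generator so the run is reproducible

draws = [rng.uniform(a, b) for _ in range(20_000)]
simulated_mean = sum(draws) / len(draws)

print(f"density inside the interval = {uniform_pdf(5, a, b)}")
print(f"P(X <= 4) = {uniform_cdf(4, a, b)}")
print(f"simulated mean = {simulated_mean:.2f}, theoretical mean (a + b) / 2 = {(a + b) / 2:.2f}")
```

The parameters are the endpoints `a` and `b`, the domain is the interval between them, and every sub-interval of the same length has the same probability.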

</details>

#### :pencil2: Check for Understanding - Class activity/quick quiz

> :clock10: 10 min (+ 10 min Review)

<details>
  <summary> Click for Instructions: Activity 4 </summary>

- Link to [activity 4](https://github.com/ironhack-edu/data_4.06_activities/blob/master/4.06_activity_4.md).

</details>

<details>
  <summary>Click for Solution: Activity 4 solutions</summary>

- Link to [activity 4 solution](https://gist.github.com/ironhack-edu/076c5a55572c2302f31410dd77bdf7a6).

</details>

---

### :pencil2: Practice on key concepts - Lab

> :clock10: 30 min

<details>
  <summary> Click for Instructions: Lab </summary>

- Link to the lab: [https://github.com/ironhack-labs/lab-random-variable-distributions](https://github.com/ironhack-labs/lab-random-variable-distributions)

</details>

<details>
  <summary>Click for Solution: Lab solutions</summary>

- Link to the [lab solution](https://gist.github.com/ironhack-edu/c2256544ac23383cedbbf878991ff11e).

</details>

---

### Additional Resources
