C3 week1 (#35)
* new changes
akkefa committed Nov 1, 2022
1 parent cb97816 commit eec78f0
Showing 5 changed files with 257 additions and 73 deletions.
32 changes: 7 additions & 25 deletions docs/mathematical_notation.md
## Probability and Statistics

```{list-table}
:widths: 15 35 40 10
:header-rows: 1
:align: "center"
* - Symbol
  - Meaning
  - Formula
  - Article
* - $\mu$
  - Mean | Expected Value | Weighted Average | First Moment
  - $\sum_{x} x P(X=x) = \int_{-\infty}^{\infty} x f(x) \, dx$
  - [🔗](expected-value)
* - $V(X)$ or $\sigma^2$
  - Variance of X
  - $E[(X - E[X])^2] = E[(X - \mu)^2] = E[X^2] - E[X]^2$
  - [🔗](variance-link)
* - $\sigma$
  - Standard deviation
  - $\sqrt{V(X)}$
  -
* - $\bar{x}$
  - The sample mean is an average value
  - $\frac{1}{n} \sum_{i=1}^{n} x_i$
  -
```
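
As a quick sanity check of the formulas above, here is a minimal Python sketch; the pmf values are made up purely for illustration:

```python
import torch

# A made-up pmf for a discrete random variable X
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
p = torch.tensor([0.1, 0.2, 0.3, 0.4])

mean = torch.sum(x * p)                       # E[X] = sum_x x P(X=x)
variance = torch.sum(x ** 2 * p) - mean ** 2  # V(X) = E[X^2] - E[X]^2
std = torch.sqrt(variance)                    # sigma = sqrt(V(X))

print(mean, variance, std)  # approximately 3.0, 1.0, 1.0
```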


## Linear Algebra

| Symbol | Meaning |
|------------------------|--------------------------------------|
| $x$ | A single number, lowercase, italic |
| $\boldsymbol{x}$ | A vector, bold, lowercase, italic |
| $\boldsymbol{X}$ | A matrix, bold, uppercase, italic |
| $\textbf{X}$ | A tensor, bold, uppercase |
| $X^T$ | Transpose of matrix $X$ |
| $X^{-1}$ | Inverse of $X$ |
| $I$ | Identity matrix |
| $X * Y$ | Element-wise product of $X$ and $Y$ |
| $X \otimes Y$ | Kronecker product of $X$ and $Y$ |
| $x \cdot y$ | Dot product of $x$ and $y$ |
| $\operatorname{tr}(X)$ | Trace of $X$ |
| $\det(X)$ | Determinant of $X$ |
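
A minimal PyTorch sketch of the operations listed above; the matrix and vector values are arbitrary:

```python
import torch

X = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
Y = torch.tensor([[0.0, 1.0], [1.0, 0.0]])
x = torch.tensor([1.0, 2.0])
y = torch.tensor([3.0, 4.0])

print(X.T)               # transpose of X
print(torch.inverse(X))  # inverse of X
print(torch.eye(2))      # 2 x 2 identity matrix
print(X * Y)             # element-wise product of X and Y
print(torch.kron(X, Y))  # Kronecker product of X and Y
print(torch.dot(x, y))   # dot product of x and y
print(torch.trace(X))    # trace of X
print(torch.det(X))      # determinant of X
```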
22 changes: 19 additions & 3 deletions docs/probability/continuous_distributions.md
The normal distribution with parameter values $\mu$ = 0 and $\sigma^2$ = 1 is called the standard normal
distribution.

A random variable with the standard normal distribution is denoted by $Z \sim N(0, 1)$

If $X \sim N\left(\mu, \sigma^2\right)$, then the standardized variable $Z=\frac{X-\mu}{\sigma}$ has the standard normal density


$f_{Z}(x)=\frac{1}{\sqrt{2 \pi}} e^{-x^{2} / 2} \text { for }-\infty<x<\infty$

### Cumulative distribution function

We use special notation to denote the CDF of the standard normal curve:

$F(z)=\Phi(z)=P(Z \leq z)=\int_{-\infty}^{z} \frac{1}{\sqrt{2 \pi}} e^{-x^{2} / 2} d x$

```{image} https://cdn.mathpix.com/snip/images/0pNOOMfnNhB8v3JJyGL6KB4SuVh3NhdSqz2oQJsiQTA.original.fullsize.png
:align: center
:alt: Cumulative distribution function for Normal distribution
:width: 80%
```
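
As a minimal numerical sketch (assuming `scipy`, which the code cells elsewhere in these notes already use), $\Phi(z)$ can be evaluated directly or cross-checked by integrating the density:

```{code-cell}
from scipy.stats import norm
from scipy.integrate import quad
import numpy as np

print(norm.cdf(0))     # Phi(0)    = 0.5
print(norm.cdf(1.96))  # Phi(1.96) ~ 0.975

# Cross-check by numerically integrating the standard normal density up to z = 1.96
phi = lambda t: np.exp(-t ** 2 / 2) / np.sqrt(2 * np.pi)
print(quad(phi, -np.inf, 1.96)[0])  # ~ 0.975
```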

### Properties

1. The standard normal density function is symmetric about the y axis.
$\text{If } X \sim N\left(\mu, \sigma^{2}\right), \text{ then } Z = \frac{X-\mu}{\sigma} \sim N(0,1)$

$\frac{X-\mu}{\sigma}$ is $X$ shifted by $\mu$ (centered at zero) and scaled by $\frac{1}{\sigma}$, which gives a variance of 1.


$\mathrm{Z} \sim \mathrm{N}(0,1) \Rightarrow \sigma \mathrm{Z}+\mu \sim \mathrm{N}\left(\mu, \sigma^2\right)$
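
A quick simulation sketch of this fact; the choice of $\mu = 3$ and $\sigma^2 = 2$ is arbitrary and only for illustration:

```{code-cell}
import torch

mu, sigma = 3.0, 2.0 ** 0.5

z = torch.randn(100_000)  # Z ~ N(0, 1)
x = sigma * z + mu        # should be approximately N(mu, sigma^2)

print(x.mean())  # close to 3
print(x.var())   # close to 2
```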

### Proving this proposition

For any continuous random variable: suppose we have a random variable $Y$ with density function $f_{Y}(y)$

$$
\begin{aligned} \mathrm{P}(\mathrm{X} \leq 2) &=\mathrm{P}\left(\frac{\mathrm{X}-\mu}{\sigma} \leq \frac{2-3}{\sqrt{2}}\right) \\
&=\mathrm{P}(\mathrm{Z} \leq -0.71) \\
& \approx 0.24 \end{aligned}
$$

R code: `pnorm(-0.71)`, or equivalently `pnorm(2, mean = 3, sd = sqrt(2))`
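
The same calculation in Python, a minimal sketch using `scipy.stats.norm` (with $X \sim N(3, 2)$ as above):

```{code-cell}
from scipy.stats import norm
import numpy as np

# P(X <= 2) for X ~ N(mu = 3, sigma^2 = 2)
print(norm.cdf(2, loc=3, scale=np.sqrt(2)))  # ~ 0.24

# Equivalent after standardizing
print(norm.cdf((2 - 3) / np.sqrt(2)))        # ~ 0.24
```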

#### Interval between variables

To find the probability of an interval between two values, subtract the CDF evaluated at one endpoint from the CDF evaluated at the other.

224 changes: 193 additions & 31 deletions docs/probability/hypothesis_testing.md

# Hypothesis Testing

## What is Hypothesis Testing?
Statistical inference is the process of learning about characteristics of a population based
on what is observed in a relatively small sample from that population. A sample will never give us the
entire picture, though, and we are bound to make incorrect decisions from time to time. We
will learn how to derive and interpret appropriate tests to manage this error, how to
evaluate when one test is better than another, and how to construct and perform principled
hypothesis tests for a wide range of problems and applications.

## What is a Hypothesis?
- Hypothesis testing is an act in statistics whereby an analyst tests an assumption regarding a population parameter.
- Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is most
often used by scientists to test specific predictions, called hypotheses, that arise from theories.

:::{note}
Due to random samples and randomness in the problem, we can make different errors in our hypothesis testing. These errors are
called Type I and Type II errors.
:::

### Steps

There are 5 main steps in hypothesis testing:

1. State your research hypothesis as a null hypothesis ($H_0$) and an alternate hypothesis ($H_a$ or $H_1$).
2. Collect data in a way designed to test the hypothesis.
3. Perform an appropriate statistical test.
4. Decide whether to reject or fail to reject your null hypothesis.
5. Present the findings in your results and discussion section.
#### State your null and alternate hypothesis

After developing your initial research hypothesis, it is important to restate it as a null ($H_0$) and alternate ($H_1$)
hypothesis so that you can test it mathematically.

- The alternate hypothesis is usually your initial hypothesis that predicts a relationship between variables.
- The null hypothesis is a prediction of no relationship between the variables you are interested in.

Suppose you want to test whether there is a relationship between gender and height. Based on your knowledge of human physiology,
you formulate a hypothesis that men are, on average, taller than women. To test this hypothesis, you restate it as:

$H_0$: Men are, on average, not taller than women.\
$H_a$: Men are, on average, taller than women.

## Type of hypothesis testing

Let $X_1, X_2, \ldots, X_n$ be a [random sample](random-sample) from the normal distribution with mean $\mu$ and
variance $\sigma^2$.

```{code-cell}
import torch
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm

sns.set_theme(style="darkgrid")

# Draw a random sample of size 1000 and plot its distribution with the sample mean marked
sample = torch.normal(mean=8, std=16, size=(1, 1000))

sns.displot(sample[0], kde=True, stat='density')
plt.axvline(torch.mean(sample[0]), color='red', label='mean')
plt.show()
```
Example of a random sample after it is observed: suppose the observed values average to something below 3. Can we say that $\mu<3$?

This seems awfully dependent on the random sample we happened to get!
Let's try to work with the most generic random sample of size 8:

$$
X_1, X_2, X_3, X_4, X_5, X_6, X_7, X_8
$$

Let $\mathrm{X}_1, \mathrm{X}_2, \ldots, \mathrm{X}_{\mathrm{n}}$ be a random sample of size $\mathrm{n}$ from the $\mathrm{N}\left(\mu, \sigma^2\right)$ distribution.

$$
\mathrm{X}_1, \mathrm{X}_2, \ldots, \mathrm{X}_{\mathrm{n}} \stackrel{\text { iid }}{\sim} \mathrm{N}\left(\mu, \sigma^2\right)
$$

The sample mean is

$$
\bar{X}=\frac{1}{n} \sum_{i=1}^{n} X_{i}
$$
- We're going to tend to think that $\mu>3$ when $\bar{X}$ is "significantly" larger than 3.
- We're never going to observe $\bar{X}=3$, but we may be able to be convinced that $\mu=3$ if $\bar{X}$ is not too far away.


**How do we formalize this? We use hypothesis testing.**

Hypotheses:

$\mathrm{H}_0: \mu \leq 3 \quad$ Null hypothesis \
$\mathrm{H}_1: \mu>3 \quad$ Alternate hypothesis

### Null hypothesis

The null hypothesis is assumed to be true.

### Alternate hypothesis

The alternate hypothesis is what we are out to show.

**Conclusion is either**:\
Reject $\mathrm{H}_0 \quad$ OR $\quad$ Fail to Reject $\mathrm{H}_0$

#### Simple hypothesis

A simple hypothesis is one that completely specifies the distribution: you know the exact distribution.

#### Composite hypothesis

A composite hypothesis does not completely specify the distribution.\
For example, you know the distribution is normal, but you don't know the mean and variance.

#### Critical values

```{image} https://cdn.mathpix.com/snip/images/VhPT2BPUY6gNGGTSOLvZuK6iXJSLNFeOwMU3aI8Droc.original.fullsize.png
:align: center
:alt: Critical values in Hypothesis Testing
:width: 80%
```

```{image} https://cdn.mathpix.com/snip/images/M8w97dpXZ9nyvOgbPEuDObaVI9gS7Qmrt9gW7GHZeYs.original.fullsize.png
:align: center
:alt: Critical values example
:width: 80%
```

## Errors in Hypothesis Testing

Let $X_1, X_2, \ldots, X_n$ be a random sample from the normal distribution with mean $\mu$ and variance $\sigma^2=2$

$$
H _0: \mu \leq 3 \quad H _1: \mu>3
$$

**Idea**: Look at $\bar{X}$ and reject $H_0$ in favor of $H _1$ if $\overline{ X }$ is "large".\
i.e. Look at $\bar{X}$ and reject $H_0$ in favor of $H _1$ if $\overline{ X }> c$ for some value $c$.


```{image} https://cdn.mathpix.com/snip/images/JeCsNYRlM6qG5RBLyuckje_opt6MoxGFvrmOe5QyfT0.original.fullsize.png
:align: center
:alt: Errors in Hypothesis Testing
:width: 80%
```

```{image} https://cdn.mathpix.com/snip/images/CQje4JfzfdpSnlrWFvGHbbIsWFMq67TI7pIRUiyzTF4.original.fullsize.png
:align: center
:alt: Errors in Hypothesis Testing
:width: 80%
```

You are a potato chip manufacturer and you want to ensure that the mean amount in 15 ounce bags is at least 15 ounces.
$\mathrm{H}_0: \mu \leq 15 \quad \mathrm{H}_1: \mu>15$

### Type I Error
The true mean is $\leq 15$ but you concluded it was $>15$. You are going to save some money because you won't be adding
chips, but you are risking a lawsuit!

### Type II Error
The true mean is $> 15$ but you concluded it was $\leq 15$ . You are going to be spending money increasing the amount
of chips when you didn’t have to.
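
A small simulation sketch of the two error types for this example. The standard deviation (0.5 oz), sample size (30 bags), alternative mean (15.1 oz), and cutoff below are illustrative assumptions, not values from the text:

```{code-cell}
import torch

n, sigma, n_trials = 30, 0.5, 10_000
c = 15 + 1.645 * sigma / n ** 0.5  # cutoff chosen so P(Xbar > c | mu = 15) is about 5%

# Type I error: mu is really 15 (the H0 boundary) but Xbar > c, so we conclude mu > 15
xbar_h0 = torch.normal(mean=15.0, std=sigma, size=(n_trials, n)).mean(dim=1)
print((xbar_h0 > c).float().mean())   # roughly 0.05

# Type II error: mu is really 15.1 (H1 true) but Xbar <= c, so we fail to conclude mu > 15
xbar_h1 = torch.normal(mean=15.1, std=sigma, size=(n_trials, n)).mean(dim=1)
print((xbar_h1 <= c).float().mean())  # empirical Type II error rate at mu = 15.1
```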

## Developing a Test
Let $X_1, X_2, \ldots, X_n$ be a random sample from the normal distribution with mean $\mu$ and known variance $\sigma^2$.

Consider testing the simple versus simple hypotheses

$$
\begin{aligned}
& H _0: \mu=5 \\
& H _1: \mu=3
\end{aligned}
$$

### Level of significance

Let

$$
\begin{aligned}
\alpha &= P(\text{Type I Error}) \\
&= P(\text{Reject } H_0 \text{ when it's true}) \\
&= P(\text{Reject } H_0 \text{ when } \mu=5)
\end{aligned}
$$

$\alpha$ is called the level of significance of the test. It is also sometimes referred to as the size of the test.

### Step One
Choose an estimator for μ.

$$
\widehat{\mu}=\bar{X}
$$

### Step Two

Choose a test statistic, or give the “form” of the test.

- We are looking for evidence that $H _1$ is true.
- The $N \left(3, \sigma^2\right)$ distribution takes on values from $-\infty$ to $\infty$.
- $\overline{ X } \sim N \left(\mu, \sigma^2 / n \right) \Rightarrow \overline{ X }$ also takes on values from $-\infty$ to $\infty$.
- It is entirely possible that $\bar{X}$ is very large even if the mean of its distribution is 3.
- However, if $\bar{X}$ is very large, it will start to seem more likely that $\mu$ is larger than 3.
- Eventually, a population mean of 5 will seem more likely than a population mean of 3.

Reject $H _0$, in favor of $H _1$, if $\overline{ X }< c$ for some c to be determined.

### Step Three

Find c.

- If $c$ is too small, we are making it difficult to reject $H _0$. We are more likely to fail to reject when it should be rejected.
- If $c$ is too large, we are making it too easy to reject $H _0$. We are more likely to reject when it should not be rejected.

This is where $\alpha$ comes in.

$$
\begin{aligned}
\alpha &= P(\text{Type I Error}) \\
&= P(\text{Reject } H_0 \text{ when true}) \\
&= P(\overline{X} < c \text{ when } \mu=5)
\end{aligned}
$$

### Step Four

Give a conclusion!

$$
\begin{aligned}
0.05 &= P(\text{Type I Error}) \\
&= P(\text{Reject } H_0 \text{ when true}) \\
&= P(\overline{X} < c \text{ when } \mu=5) \\
&= P\left(\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}} < \frac{c-5}{2 / \sqrt{10}} \text{ when } \mu=5\right)
\end{aligned}
$$
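
A sketch of solving for $c$ numerically, assuming (as the fraction above suggests) $\sigma = 2$, $n = 10$, and $\alpha = 0.05$:

```{code-cell}
from scipy.stats import norm
import numpy as np

alpha, mu0, sigma, n = 0.05, 5, 2, 10

# Reject H0 when Xbar < c; choose c so that P(Xbar < c | mu = 5) = alpha
c = mu0 + norm.ppf(alpha) * sigma / np.sqrt(n)
print(c)  # about 3.96

# Check: the rejection probability under mu = 5 is back to alpha
print(norm.cdf((c - mu0) / (sigma / np.sqrt(n))))  # 0.05
```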


```{image} https://cdn.mathpix.com/snip/images/A2zQa5iD99VnS5sLbiZ947KpZWH7i7xSbnJ6IZ88j2w.original.fullsize.png
:align: center
:alt: Errors in Hypothesis Testing
:width: 80%
```


```{image} https://cdn.mathpix.com/snip/images/Q5ADdylsMg5__QGyDBeVgUtKCf5dpp5b24ur5L0phO4.original.fullsize.png
:align: center
:alt: Errors in Hypothesis Testing
:width: 80%
```

```{image} https://cdn.mathpix.com/snip/images/T3f91rQbmLPwPT_cU3z8y51z-xQ8jdb9PtGskQ2pa3c.original.fullsize.png
:align: center
:alt: Errors in Hypothesis Testing
:width: 80%
```
14 changes: 14 additions & 0 deletions docs/probability/random_variable.md

# Random Variables
The first step to understanding random variables is to do a fun experiment. Go outside in front of your house with a pen
and paper. Take note of every person you pass, recording their hair color and height in centimeters. Spend about 10 minutes doing
this.

Congratulations! You have conducted your first experiment! Now you will be able to answer some questions such as:

- How many people walked past you?
- Did many people who walked past you have blue hair?
- How tall were the people who walked past you on average?

Say you passed 10 people in this experiment, 3 of whom had blue hair, and their average height was 165.32 cm.
In each of these questions there is a number, a measurable quantity, attached.

## Definition

$\text{Let } x=g^{-1}(y) \text{. Then } dx=\frac{d}{dy} g^{-1}(y) \, dy$

$E[g(X)]=\int_{-\infty}^{\infty} g(x) f_{X}(x) \, dx$

(variance-link)=
## Variance

- Measures how far we expect our random variable to be from the mean.
$\text{Indicator function: } \mathbf{1}_A(x) = \begin{cases} 1, & \text{if } x \in A \\ 0, & \text{if } x \notin A \end{cases}$

Notation: $\mathbb{1} _{A}(x)$

(random-sample)=
## Random Sample

A collection of random variables is independent and identically distributed (iid) if each random variable has the same
probability distribution as the others and all are mutually independent.
