From eec78f0c221ce8523ae2499a9b07e7f7152349ec Mon Sep 17 00:00:00 2001 From: Ikram Ali Date: Tue, 1 Nov 2022 22:12:50 +0500 Subject: [PATCH] C3 week1 (#35) * new changes --- docs/mathematical_notation.md | 32 +-- docs/probability/continuous_distributions.md | 22 +- docs/probability/hypothesis_testing.md | 224 ++++++++++++++++--- docs/probability/random_variable.md | 14 ++ docs/probability/what_is_probability.md | 38 ++-- 5 files changed, 257 insertions(+), 73 deletions(-) diff --git a/docs/mathematical_notation.md b/docs/mathematical_notation.md index 62d5598..d464dec 100644 --- a/docs/mathematical_notation.md +++ b/docs/mathematical_notation.md @@ -6,19 +6,19 @@ ## Probability and Statistics ```{list-table} -:widths: 20 20 60 +:widths: 20 70 10 :header-rows: 1 :align: "center" * - Symbol - Formula - - Meaning + - Article * - $\mu$ - - $\sum_{x} k P(X=x)$ - - Mean | Expected Value | Waited Average | First Moment Generating Function + - $\sum_{x} x P(X=x)$ or $\int_{-\infty}^{\infty} x f(x) d x$ + - [🔗](expected-value) * - $V(X)$ or $\sigma^2$ - - $E[(X - \mu)^2]$ - - Variance of X + - $E[(X - E[X])^2] = E[(X - \mu)^2] = E[X^2] - E[X]^2$ + - [🔗](variance-link) * - $\sigma$ - $\sqrt{V(X)}$ - Standard deviation @@ -29,22 +29,4 @@ - The sample - The sample mean is an average value -``` - - -## Linear Algebra - -| Symbol | Meaning | -|---------------|------------------------------------| -| x | A single number, lowercase, italic | -| $x$ | A vector, bold, lowercase, italic | -| $X$ | A matrix, bold, uppercase, italic | -| $\textbf{X}$ | A tensor, bold, uppercase | -| $X^T$ | Transpose of matrix X | -| $X^{-1}$ | Inverse of X | -| $I$ | Identity matrix | -| $X*Y$ | Element-wise product of X and Y | -| $X \otimes Y$ | Kronecker product of X and Y | -| $x \cdot y$ | Dot product of x and y | -| $tr(X)$ | Trace of X | -| $det(X)$ | Determinant of X | +``` \ No newline at end of file diff --git a/docs/probability/continuous_distributions.md 
b/docs/probability/continuous_distributions.md index 03158ab..a2f02a2 100644 --- a/docs/probability/continuous_distributions.md +++ b/docs/probability/continuous_distributions.md @@ -412,7 +412,7 @@ The QQ Plot allows us to see deviation of a normal distribution much better than The normal distribution with parameter values $\mu$ = 0 and $\sigma^2$ = 1 is called the standard normal distribution. -A rv with the standard normal distribution is customarily denoted by $Z \sim N(0, 1)$ +A rv with the standard normal distribution is denoted by $Z \sim N(0, 1)$ If $X \sim N\left(\mu, \sigma^2\right)$ then @@ -430,12 +430,17 @@ $$ $f_{Z}(x)=\frac{1}{\sqrt{2 \pi}} e^{-x^{2} / 2} \text { for }-\infty3$ when $\bar{X}$ is "significantly" larger than 3. - We're never going to observe $\bar{X}=3$, but we may be able to be convinced that $\mu=3$ if $\bar{X}$ is not too far away. + +**How do we formalize this? We use hypothesis testing.** + Hypotheses: -$\mathrm{H}_0: \mu \leq 3$ -$\mathrm{H}_1: \mu>3 \quad$ alternate +$\mathrm{H}_0: \mu \leq 3$ <- Null hypothesis \ +$\mathrm{H}_1: \mu>3 \quad$ Alternate hypothesis + +### Null hypothesis +The null hypothesis is assumed to be true. -hypothesis -- The null hypothesis is assumed to be true. -- The alternate hypothesis is what we are out to show. +### Alternate hypothesis +The alternate hypothesis is what we are out to show. -Conclusion is either: +**Conclusion is either**:\ Reject $\mathrm{H}_0 \quad$ OR $\quad$ Fail to Reject $\mathrm{H}_0$ -#### Errors in Hypothesis Testing -##### Type I Error \ No newline at end of file +#### Simple hypothesis +A simple hypothesis is one that completely specifies the distribution. That is, you know the exact distribution. + +#### Composite hypothesis +You don't know the exact distribution.\ +For example, you know the distribution is normal but you don't know the mean and variance. 
+ +#### Critical values + +```{image} https://cdn.mathpix.com/snip/images/VhPT2BPUY6gNGGTSOLvZuK6iXJSLNFeOwMU3aI8Droc.original.fullsize.png +:align: center +:alt: Critical values in Hypothesis Testing +:width: 80% +``` + +```{image} https://cdn.mathpix.com/snip/images/M8w97dpXZ9nyvOgbPEuDObaVI9gS7Qmrt9gW7GHZeYs.original.fullsize.png +:align: center +:alt: Critical values example +:width: 80% +``` + +## Errors in Hypothesis Testing + +Let $X_1, X_2, \ldots, X_n$ be a random sample from the normal distribution with mean $\mu$ and variance $\sigma^2=2$ + +$$ +H _0: \mu \leq 3 \quad H _1: \mu>3 +$$ + +**Idea**: Look at $\bar{X}$ and reject $H_0$ in favor of $H _1$ if $\overline{ X }$ is "large".\ +i.e. Look at $\bar{X}$ and reject $H_0$ in favor of $H _1$ if $\overline{ X }> c$ for some value $c$. + + +```{image} https://cdn.mathpix.com/snip/images/JeCsNYRlM6qG5RBLyuckje_opt6MoxGFvrmOe5QyfT0.original.fullsize.png +:align: center +:alt: Errors in Hypothesis Testing +:width: 80% +``` + +```{image} https://cdn.mathpix.com/snip/images/CQje4JfzfdpSnlrWFvGHbbIsWFMq67TI7pIRUiyzTF4.original.fullsize.png +:align: center +:alt: Errors in Hypothesis Testing +:width: 80% +``` + +You are a potato chip manufacturer and you want to ensure that the mean amount in 15 ounce bags is at least 15 ounces. +$\mathrm{H}_0: \mu \leq 15 \quad \mathrm{H}_1: \mu>15$ + +### Type I Error +The true mean is $\leq 15$ but you concluded it was $>15$. You are going to save some money because you won't be adding +chips but you are risking a lawsuit! + +### Type II Error +The true mean is $> 15$ but you concluded it was $\leq 15$. You are going to be spending money increasing the amount +of chips when you didn’t have to. + +## Developing a Test +Let $X_1, X_2, \ldots, X_n$ be a random sample from the normal distribution with mean $\mu$ and known variance $\sigma^2$. 
+ +Consider testing the simple versus simple hypotheses + +$$ +\begin{aligned} +& H _0: \mu=5 \\ +& H _1: \mu=3 +\end{aligned} +$$ + +### Level of significance + +Let $\alpha= P$ (Type I Error) \ +$= P \left(\right.$ Reject $H _0$ when it's true $)$ \ +$= P \left(\right.$ Reject $H _0$ when $\left.\mu=5\right)$ + +$\alpha$ is called the level of significance of the test. It is also sometimes referred to as the size of the test. + +### Step One +Choose an estimator for μ. + +$$ +\widehat{\mu}=\bar{X} +$$ + +### Step Two + +Choose a test statistic, or give the “form” of the test. + +- We are looking for evidence that $H _1$ is true. +- The $N \left(5, \sigma^2\right)$ distribution takes on values from $-\infty$ to $\infty$. +- $\overline{ X } \sim N \left(\mu, \sigma^2 / n \right) \Rightarrow \overline{ X }$ also takes on values from $-\infty$ to $\infty$. +- It is entirely possible that $\bar{X}$ is very small even if the mean of its distribution is 5. +- However, if $\bar{X}$ is very small, it will start to seem more likely that $\mu$ is smaller than 5. +- Eventually, a population mean of 3 will seem more likely than a population mean of 5. + +Reject $H _0$, in favor of $H _1$, if $\overline{ X }< c$ for some $c$ to be determined. + +### Step Three + +Find $c$. + +- If $c$ is too small, we are making it difficult to reject $H _0$. We are more likely to fail to reject when it should be rejected. +- If $c$ is too large, we are making it too easy to reject $H _0$. We are more likely to reject when it should not be rejected. + +This is where $\alpha$ comes in. + +$$ +\begin{aligned} \alpha&= P(\text{Type I Error}) \\ +&=P( \text{Reject } H_0 \text{ when true}) \\ +&=P (\overline{ X }< c \text{ when } \mu=5) \end{aligned} +$$ + +### Step Four + +Give a conclusion! 
+ +$0.05= P ($ Type I Error) \ +$= P \left(\right.$ Reject $H _0$ when true $)$ \ +$= P (\overline{ X }< c$ when $\mu=5)$ + + +$ = P \left(\frac{\overline{ X }-\mu_0}{\sigma / \sqrt{ n }}<\frac{ c -5}{2 / \sqrt{10}}\right.$ when $\left.\mu=5\right)$ + + +```{image} https://cdn.mathpix.com/snip/images/A2zQa5iD99VnS5sLbiZ947KpZWH7i7xSbnJ6IZ88j2w.original.fullsize.png +:align: center +:alt: Errors in Hypothesis Testing +:width: 80% +``` + + +```{image} https://cdn.mathpix.com/snip/images/Q5ADdylsMg5__QGyDBeVgUtKCf5dpp5b24ur5L0phO4.original.fullsize.png +:align: center +:alt: Errors in Hypothesis Testing +:width: 80% +``` + +```{image} https://cdn.mathpix.com/snip/images/T3f91rQbmLPwPT_cU3z8y51z-xQ8jdb9PtGskQ2pa3c.original.fullsize.png +:align: center +:alt: Errors in Hypothesis Testing +:width: 80% +``` diff --git a/docs/probability/random_variable.md b/docs/probability/random_variable.md index 3d7bfd7..6abac7e 100644 --- a/docs/probability/random_variable.md +++ b/docs/probability/random_variable.md @@ -8,6 +8,18 @@ kernelspec: ``` # Random Variables +The first step to understanding random variables is a fun experiment. Go outside in front of your house with a pen +and paper. Take note of every person you pass and their hair color & height in centimeters. Spend about 10 minutes doing +this. + +Congratulations! You have conducted your first experiment! Now you will be able to answer some questions such as: + +- How many people walked past you? +- Did many people who walked past you have blue hair? +- How tall were the people who walked past you on average? + +Say you pass 10 people in this experiment, 3 of whom have blue hair, and their average height is 165.32 cm. +In each of these answers there is a number: a measurable quantity is attached to the experiment. ## Definition @@ -316,6 +328,7 @@ $\text { Let } x=g^{-1}(y) \text {. 
Then } d x=\frac{d}{d y} g^{-1}(y) d y$ $E[g(X)]=\int_{-\infty}^{\infty} g(x) f_{X}(x)) d x$ +(variance-link)= ## Variance - Measures how far we expect our random variable to be from the mean. @@ -476,6 +489,7 @@ $\text{Indicator function}_{A}(X) = \mathbf{1}_A(x) =\begin{cases} 1, & \text { Notation= $\mathbb{1} _{A}(x)$ +(random-sample)= ## Random Sample A collection of random variables is independent and identically distributed if each random variable has the same diff --git a/docs/probability/what_is_probability.md b/docs/probability/what_is_probability.md index 52ef27c..8a251c1 100644 --- a/docs/probability/what_is_probability.md +++ b/docs/probability/what_is_probability.md @@ -14,7 +14,7 @@ kernelspec: - Probability is the branch of mathematics that deals with the occurrence of a random event. - Probability is the measure of the likelihood of an event to happen. -probability is the study of randomness and uncertainty. Probability theory is widely used in the area of studies such +Probability is the study of randomness and uncertainty. Probability theory is widely used in the area of studies such as statistics, finance, gambling, artificial intelligence, machine learning, computer science, game theory, and philosophy. @@ -22,21 +22,30 @@ philosophy. Some of the applications of probability are predicting results of the following events: -1. that a customer will buy milk if they are also buying bread. -2. Of getting at least 2 heads in 5 coin flips. -3. Getting 3 and 5 on throwing a die. -4. Choosing a card from the deck. -5. Pulling a green candy from a bag of red candies. -6. Winning a lottery 1 in many millions. -7. \# of vehicles crossing a bridge in one day -8. \# of customers arriving at a bank in a week - -### Major Applications of Probability - +::::{grid} + +:::{grid-item-card} +Minor +^^^^^^ +- that a customer will buy milk if they are also buying bread. +- Of getting at least 2 heads in 5 coin flips. +- Getting 3 and 5 on throwing a die. 
+- Pulling a green candy from a bag of red candies. +- Winning a lottery 1 in many millions. +- \# of customers arriving at a bank in a week +::: + +:::{grid-item-card} +Major +^^^^^^ - It is used for risk assessment and modelling in various industries - Weather forecasting or prediction of weather changes - Probability of a team winning in a sport based on players and strength of team - In the share market, chances of getting the hike of share prices +::: + +:::: + ## Probability Terms The first thing we do when we start thinking about the probability list a number of things that could possibly happen.\ @@ -49,11 +58,12 @@ Suppose that we toss a die. Six numbers, from 1 to 6, can appear face up, but we will appear. The sample space is S = {1,2,3,4,5,6}.\ Tossing a coin, Sample Space = {H,T}. -```{image} https://cdn.mathpix.com/snip/images/NIgpfDh_vIIrB4C6KWs6SlZbyH4xSHzIPYolo_FcY-U.original.fullsize.png +```{image} https://cdn.mathpix.com/snip/images/0nY3sA4gdcKRJqiIpDf5RvXHDTessQ-cDgjFFMMhuuE.original.fullsize.png :align: center :alt: Sample space -:width: 60% +:width: 80% ``` +

Image from byjus.com
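
As a minimal sketch, the sample spaces above can be written as Python sets. The variable names `die`, `coin`, and `coin_and_die` are illustrative assumptions, and the combined coin-and-die experiment is an added example that does not appear in the notes.

```python
from itertools import product

# Sample space for one roll of a six-sided die: S = {1, 2, 3, 4, 5, 6}
die = set(range(1, 7))

# Sample space for one coin toss: S = {H, T}
coin = {"H", "T"}

# Illustrative extra: tossing a coin and rolling a die together.
# Every (coin, die) pair is one possible outcome.
coin_and_die = set(product(coin, die))

print(sorted(die))
print(sorted(coin))
print(len(coin_and_die))
```

Each element of a set is one possible outcome, so the size of the set is the number of outcomes in the sample space.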

### Experiment or Trial
An experiment is any action or process that generates observations or outcomes. \