# Statistics Advance 1

## 1. What is a random variable in probability theory?  
A random variable (RV) is a numerical quantity whose value is determined by the outcome of a random experiment. Formally, it is a function \( X: \Omega \rightarrow \mathbb{R} \) that assigns a real number to every sample-space outcome \( \omega \).

---

## 2. What are the types of random variables?  
There are two main types of random variables:  
- **Discrete Random Variables**: Take on a countable number of distinct values (e.g., number of heads in coin flips).  
- **Continuous Random Variables**: Take on any value within a continuous range (e.g., height, weight).  

---

## 3. What is the difference between discrete and continuous distributions?  
- **Discrete Distributions**: Describe the probabilities of outcomes for discrete random variables (e.g., Binomial, Poisson). Probabilities are assigned to specific values.  
- **Continuous Distributions**: Describe probabilities for continuous random variables (e.g., Normal, Uniform). Probabilities are assigned to intervals and represented by areas under a curve (PDF).  

---

## 4. What are probability distribution functions (PDF)?  
- For **discrete** RVs: The Probability Mass Function (PMF) gives the probability that a discrete RV equals a specific value.  
- For **continuous** RVs: The Probability Density Function (PDF) \( f(x) \) gives the probability density at each point. Probabilities come from areas under the curve, \( P(a \leq X \leq b) = \int_a^b f(x)\,dx \), so any single exact value has probability zero.  

---

## 5. How do cumulative distribution functions (CDF) differ from probability distribution functions (PDF)?  
- **CDF**: \( F(x) = P(X \leq x) \), gives the cumulative probability up to \( x \). Applies to both discrete and continuous RVs.  
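- **PDF/PMF**: The PMF gives the probability of each specific value; the PDF gives a density whose integral over an interval is a probability. For continuous RVs the CDF is the integral of the PDF, \( F(x) = \int_{-\infty}^{x} f(t)\,dt \); for discrete RVs it is the running sum of the PMF.  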
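
The PMF-to-CDF relationship can be sketched for a small discrete case. The fair six-sided die below is an assumed example, and the names `faces`, `pmf`, and `cdf` are illustrative only.

```python
from fractions import Fraction
from itertools import accumulate

# Assumed example: PMF of a fair six-sided die, each face with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
pmf = {k: Fraction(1, 6) for k in faces}

# The CDF at each face is the running sum of the PMF: F(k) = P(X <= k).
cdf = dict(zip(faces, accumulate(pmf[k] for k in faces)))

print(cdf[3])  # P(X <= 3) -> 1/2
print(cdf[6])  # total probability -> 1
```

Exact fractions are used here only to avoid floating-point rounding in the running sum.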

---

## 6. What is a discrete uniform distribution?  
A distribution where a discrete RV has a finite number of equally likely outcomes (e.g., rolling a fair die: \( P(X=k) = \frac{1}{6} \) for \( k = 1, 2, \dots, 6 \)).  

---

## 7. What are the key properties of a Bernoulli distribution?  
- Models a single trial with two outcomes: success (1) with probability \( p \), or failure (0) with probability \( 1-p \).  
- Mean \( = p \), Variance \( = p(1-p) \).  
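
As a quick check of the mean and variance formulas, the moments can be computed directly from the two-point PMF. This is a minimal sketch; the helper name `bernoulli_moments` and the value \( p = 0.3 \) are assumptions chosen for illustration.

```python
def bernoulli_moments(p):
    """Return (mean, variance) of a Bernoulli(p) RV computed from its PMF."""
    pmf = {0: 1 - p, 1: p}
    mean = sum(x * prob for x, prob in pmf.items())
    variance = sum((x - mean) ** 2 * prob for x, prob in pmf.items())
    return mean, variance

# Illustrative value p = 0.3: expect mean p and variance p * (1 - p).
mean, variance = bernoulli_moments(0.3)
print(mean, variance)
```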

---

## 8. What is the binomial distribution, and how is it used in probability?  
- Models the number of successes \( k \) in \( n \) independent Bernoulli trials that share the same success probability \( p \).  
- PMF: \( P(X=k) = \binom{n}{k} p^k (1-p)^{n-k} \).  
- Used in scenarios like counting defective items in a batch or coin flips.  
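
The PMF above translates almost directly into code. The sketch below uses only the standard library; the function name `binomial_pmf` and the coin-flip numbers are assumptions for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Illustrative case: probability of exactly 3 heads in 10 fair coin flips.
prob = binomial_pmf(3, 10, 0.5)
print(prob)  # 120 / 1024 = 0.1171875

# The PMF sums to 1 over k = 0, ..., n.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
```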

---

## 9. What is the Poisson distribution and where is it applied?  
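- Models the number of events occurring in a fixed interval of time or space, when events happen independently at a constant average rate.  
- PMF: \( P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!} \) for \( k = 0, 1, 2, \dots \), where \( \lambda \) is the average number of events per interval (also the mean and the variance).  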
- Applications: Traffic accidents, call center arrivals, radioactive decay.  
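
A minimal sketch of the Poisson PMF follows. The function name `poisson_pmf` and the rate of 4 events per interval are hypothetical values picked for illustration, not taken from any data.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), where lam is the mean count per interval."""
    return lam**k * exp(-lam) / factorial(k)

# Hypothetical rate: an average of 4 arrivals per interval.
lam = 4.0
p_none = poisson_pmf(0, lam)   # probability of no arrivals, equal to e^(-4)
p_two = poisson_pmf(2, lam)    # probability of exactly two arrivals

# Summing over a long range of k recovers (almost) all of the probability.
total = sum(poisson_pmf(k, lam) for k in range(60))
```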

---

## 10. What is a continuous uniform distribution?  
- A distribution where a continuous RV has constant density over an interval \([a, b]\), so sub-intervals of equal length are equally likely.  
- PDF: \( f(x) = \frac{1}{b-a} \) for \( x \in [a, b] \), and \( f(x) = 0 \) otherwise.  
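
Because the density is flat, the probability of a sub-interval is its length divided by \( b - a \). The helper below is a sketch under that assumption; `uniform_prob` and the interval endpoints are illustrative names and values.

```python
def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X ~ Uniform(a, b): overlap length divided by (b - a)."""
    if b <= a:
        raise ValueError("require a < b")
    lo, hi = max(a, c), min(b, d)
    return max(0.0, hi - lo) / (b - a)

# Illustrative case: X uniform on [0, 10]; the interval [2, 5] has length 3.
p = uniform_prob(2, 5, 0, 10)
print(p)
```

Clipping to \([a, b]\) handles query intervals that extend past the support.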

---

## 11. What are the characteristics of a normal distribution?  
- Symmetric, bell-shaped curve centered at mean \( \mu \).  
- PDF: \( f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \).  
- Empirical rule: about 68%, 95%, and 99.7% of the probability lies within \( 1 \), \( 2 \), and \( 3 \) standard deviations of \( \mu \), respectively.  
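
The 68-95-99.7 figures can be checked numerically with the standard library's `statistics.NormalDist`. This is a small sketch; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability mass within k standard deviations of the mean, for k = 1, 2, 3.
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, prob in coverage.items():
    print(k, round(prob, 4))  # roughly 0.6827, 0.9545, 0.9973
```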

---

## 12. What is the standard normal distribution, and why is it important?  
- A normal distribution with \( \mu = 0 \) and \( \sigma = 1 \).  
- Importance: Simplifies calculations; any normal RV can be standardized to it using \( Z = \frac{X-\mu}{\sigma} \).  
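
To illustrate why standardizing is convenient, the sketch below computes \( P(X \leq x) \) two ways: through the standard normal after converting to \( Z \), and directly from the original distribution. The parameters \( \mu = 100 \), \( \sigma = 15 \), and \( x = 130 \) are assumed values for illustration.

```python
from statistics import NormalDist

# Assumed illustrative parameters for a normal RV X.
mu, sigma = 100.0, 15.0
x = 130.0

# Standardize, then look up the probability on the standard normal.
z = (x - mu) / sigma
p_via_z = NormalDist().cdf(z)

# Same probability computed directly on the original scale.
p_direct = NormalDist(mu, sigma).cdf(x)

print(z, p_via_z, p_direct)
```

Both routes agree up to floating-point rounding, which is the point of standardization: one reference distribution serves every normal RV.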

---

## 13. What is the Central Limit Theorem (CLT), and why is it critical in statistics?  
- CLT states that the sampling distribution of the (standardized) mean of independent, identically distributed (i.i.d.) RVs with finite mean and variance approaches a normal distribution as sample size \( n \to \infty \), regardless of the shape of the population distribution.  
- Critical because it justifies using normal methods for inference (e.g., confidence intervals) even when population distributions are unknown.  

---

## 14. How does the Central Limit Theorem relate to the normal distribution?  
CLT explains why many real-world sample means are approximately normal even when the underlying data are not (e.g., averages drawn from a skewed distribution become approximately normal for large \( n \), with mean \( \mu \) and standard deviation \( \sigma / \sqrt{n} \)).  
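
A small simulation can make this concrete. The sketch below draws repeated samples from a skewed Exponential(1) population (mean 1, standard deviation 1) and summarizes the sample means. The sample size, repetition count, and seed are arbitrary assumptions, and a simulation only illustrates the theorem rather than proving it.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

n = 50          # size of each sample (assumed)
reps = 5000     # number of samples drawn (assumed)

# Exponential(rate=1) is strongly right-skewed, with mean 1 and SD 1.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

grand_mean = mean(sample_means)   # should be near the population mean, 1
spread = stdev(sample_means)      # should be near 1 / sqrt(n)

# Share of sample means within one SD of their center; near 0.68 if roughly normal.
within_one_sd = sum(abs(m - grand_mean) <= spread for m in sample_means) / reps

print(grand_mean, spread, within_one_sd)
```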

---

## 15. What is the application of Z statistics in hypothesis testing?  
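- \( Z \)-tests compare a sample mean \( \bar{x} \) to a hypothesized population mean \( \mu_0 \) when \( \sigma \) is known, using the test statistic \( Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \).  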
- \( Z \)-scores standardize data, enabling comparison across different scales and calculation of p-values.  
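
A one-sample, two-sided \( Z \)-test can be sketched as follows. The function name `one_sample_z_test` and all of the input numbers are hypothetical, and the population standard deviation is assumed known.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Return (z, two-sided p-value) for H0: population mean equals mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical inputs: sample mean 52 from n = 64, testing mu0 = 50 with sigma = 8.
z, p_value = one_sample_z_test(sample_mean=52.0, mu0=50.0, sigma=8.0, n=64)
print(z, round(p_value, 4))  # z = 2.0, p-value about 0.0455
```

With these made-up numbers the p-value falls just below 0.05, so the null hypothesis would be rejected at the 5% level.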

---

## 16. How do you calculate a Z-score, and what does it represent?  
- Formula: \( Z = \frac{X-\mu}{\sigma} \).  
- Represents the number of standard deviations \( X \) is from the mean \( \mu \).  

---

## 17. What are point estimates and interval estimates in statistics?  
- **Point Estimate**: Single value approximating a parameter (e.g., sample mean \( \bar{x} \) for \( \mu \)).  
- **Interval Estimate**: Range (e.g., confidence interval) likely to contain the parameter.  

---

## 18. What is the significance of confidence intervals in statistical analysis?  
They provide a range of plausible values for a parameter (e.g., population mean) with a specified confidence level (e.g., 95%), quantifying uncertainty in estimates.  

---

## 19. What is the relationship between a Z-score and a confidence interval?  
When the population is normal (or \( n \) is large) and \( \sigma \) is known, a \( 100(1-\alpha)\% \) CI for the mean is:  
\( \bar{x} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \), where \( Z_{\alpha/2} \) is the critical Z-score for the desired confidence level.  
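
The interval formula can be sketched in code using the standard library's inverse normal CDF to obtain the critical value. The function name `z_confidence_interval` and the inputs are hypothetical, and \( \sigma \) is assumed known.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) bounds of a z-based CI for the population mean."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%
    margin = z_crit * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical inputs: sample mean 52, known sigma 8, n = 64.
lower, upper = z_confidence_interval(52.0, 8.0, 64)
print(round(lower, 2), round(upper, 2))  # about 50.04 and 53.96
```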

---

## 20. How are Z-scores used to compare different distributions?  
By converting values to Z-scores (standardizing), comparisons can be made across distributions with different units/scales (e.g., SAT vs. ACT scores).  
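
A short sketch of such a comparison follows. The means, standard deviations, and scores below are made-up numbers for illustration and are not real SAT or ACT statistics.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (or below) mu."""
    return (x - mu) / sigma

# Made-up parameters and scores for two tests on different scales.
z_sat = z_score(1250, mu=1050, sigma=200)
z_act = z_score(28, mu=21, sigma=5)

print(z_sat, z_act)  # 1.0 and 1.4

# The larger z-score marks the stronger result relative to its own distribution.
better = "ACT" if z_act > z_sat else "SAT"
```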

---

## 21. What are the assumptions for applying the Central Limit Theorem?  
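1. Independent, identically distributed (i.i.d.) samples.  
2. The population has a finite mean and a finite variance.  
3. Sample size \( n \) sufficiently large (\( n \geq 30 \) is a common rule of thumb; heavily skewed populations may need more).  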

---

## 22. What is the concept of expected value in a probability distribution?  
The long-run average value of the RV, calculated as:  
- Discrete: \( E(X) = \sum_i x_i \, P(X = x_i) \).  
- Continuous: \( E(X) = \int_{-\infty}^{\infty} x f(x) dx \).  
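
Both formulas can be sketched numerically. The discrete case uses an assumed fair die, and the continuous case approximates the integral with a midpoint rule for an assumed Uniform(0, 4) density. The function names and the step count are illustrative choices.

```python
from fractions import Fraction

def expected_value(pmf):
    """E(X) for a discrete RV given as a {value: probability} mapping."""
    return sum(x * p for x, p in pmf.items())

def expected_value_continuous(pdf, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of x * f(x) over [lo, hi]."""
    width = (hi - lo) / steps
    midpoints = (lo + (i + 0.5) * width for i in range(steps))
    return sum(x * pdf(x) * width for x in midpoints)

# Discrete: fair six-sided die (assumed example).
die = {k: Fraction(1, 6) for k in range(1, 7)}
e_die = expected_value(die)
print(e_die)  # 7/2

# Continuous: Uniform(0, 4) has density 1/4 on [0, 4]; its mean is 2.
e_uniform = expected_value_continuous(lambda x: 0.25, 0.0, 4.0)
```

The integration limits are restricted to the support \([0, 4]\) because the density is zero elsewhere.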

---

## 23. How does a probability distribution relate to the expected outcome of a random variable?  
The expected value \( E(X) \) is the center of mass of the distribution, summarizing the average outcome over many repetitions of the experiment.  

--- 