**Statistics Advance Part 1**

1. **What is a random variable in probability theory?**
   A random variable is a function that assigns a real number to each outcome in the sample space of a random experiment, for example the number of heads in three coin tosses.

2. **What are the types of random variables?**

* Discrete Random Variable
* Continuous Random Variable

3. **Difference between discrete and continuous distributions:**

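* Discrete: Takes a finite or countably infinite set of values (e.g., number of heads in a series of coin tosses); described by a probability mass function (PMF).
* Continuous: Takes any value within an interval, which is an uncountable set (e.g., height, weight); described by a probability density function (PDF).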

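4. **What are probability density functions (PDF)?**
   A PDF describes the relative likelihood of a continuous random variable near each value. The probability that the variable falls within a range is the area under the PDF over that range; the probability of any single exact value is zero.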

5. **CDF vs. PDF:**

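* PDF: Probability density at a point; for a continuous variable this is not itself a probability.
* CDF: Probability that the variable is at most a given value, P(X ≤ x); it is the integral of the PDF from -∞ to x.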
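
The statement that the CDF is the integral of the PDF can be checked numerically. The sketch below uses the standard normal distribution and an arbitrary point `x0` (both chosen here only for illustration) and compares a numerical integral of the PDF with SciPy's CDF.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

x0 = 1.0  # arbitrary evaluation point, chosen for illustration

# Numerically integrate the standard normal PDF from -infinity to x0
area, _abs_err = quad(norm.pdf, -np.inf, x0)

print("Integral of PDF up to x0:", area)
print("CDF at x0:               ", norm.cdf(x0))
```

The two printed values should agree to within the integrator's tolerance.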

6. **What is a discrete uniform distribution?**
   Each outcome in a finite set has equal probability. Example: rolling a fair die.

7. **Key properties of Bernoulli distribution:**

* Two outcomes: 0 (failure), 1 (success)
* Mean = p, Variance = p(1-p)
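
As a quick check of the mean and variance above, the following sketch asks `scipy.stats.bernoulli` for its moments and compares them with p and p(1-p). The value `p = 0.3` is an arbitrary choice for illustration.

```python
from scipy.stats import bernoulli

p = 0.3  # success probability, chosen for illustration

# moments="mv" requests the mean and the variance
mean, var = bernoulli.stats(p, moments="mv")

print("Mean:    ", float(mean), "| p        =", p)
print("Variance:", float(var), "| p(1 - p) =", p * (1 - p))
```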

8. **What is binomial distribution?**
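   The binomial distribution gives the probability of exactly k successes in n independent Bernoulli trials, each with success probability p: P(X = k) = C(n, k) p^k (1-p)^(n-k). Used in success/failure experiments.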
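
The binomial probability of exactly k successes is C(n, k) p^k (1-p)^(n-k). A minimal sketch, with a hypothetical helper `binomial_pmf` and illustrative values for `n`, `p`, and `k`, computes this by hand and compares it with `scipy.stats.binom.pmf`.

```python
from math import comb
from scipy.stats import binom

def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p**k * (1 - p)**(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p, k = 10, 0.5, 3  # illustrative values: 3 heads in 10 fair coin tosses

manual = binomial_pmf(k, n, p)
library = binom.pmf(k, n, p)

print("Manual formula:", manual)
print("scipy binom.pmf:", library)
```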

9. **Poisson distribution and application:**
   Models the number of events in a fixed interval of time or space when events occur independently at a constant average rate. Used for call-center arrivals, traffic counts, etc.
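
For a call-center style example, suppose calls arrive at an average of 5 per minute (a hypothetical rate, not taken from any data). The sketch below uses `scipy.stats.poisson` to answer three typical questions about the number of calls in one minute.

```python
from scipy.stats import poisson

rate = 5  # hypothetical average number of calls per minute

p_exactly_3 = poisson.pmf(3, mu=rate)    # P(X = 3)
p_at_most_3 = poisson.cdf(3, mu=rate)    # P(X <= 3)
p_more_than_8 = poisson.sf(8, mu=rate)   # P(X > 8), i.e. 1 - cdf(8)

print("P(exactly 3 calls):  ", p_exactly_3)
print("P(at most 3 calls):  ", p_at_most_3)
print("P(more than 8 calls):", p_more_than_8)
```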

10. **Continuous uniform distribution:**
    The probability density is constant over an interval [a, b], so sub-intervals of equal length are equally likely.
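
A short sketch of what "equally likely" means for a continuous range: on an interval [a, b] the density is the constant 1/(b-a), and the probability of a sub-interval is its length divided by (b-a). The endpoints `a = 2` and `b = 10` are illustrative; note that `scipy.stats.uniform` is parameterized by `loc=a` and `scale=b-a`.

```python
from scipy.stats import uniform

a, b = 2.0, 10.0                      # illustrative interval endpoints
dist = uniform(loc=a, scale=b - a)    # uniform on [a, b]

density_at_5 = dist.pdf(5.0)               # constant density 1 / (b - a)
p_3_to_5 = dist.cdf(5.0) - dist.cdf(3.0)   # (5 - 3) / (b - a)

print("Density at x = 5:", density_at_5)
print("P(3 <= X <= 5):  ", p_3_to_5)
```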

11. **Characteristics of normal distribution:**

* Bell-shaped curve
* Symmetric about the mean
* Mean = Median = Mode

12. **Standard normal distribution:**
    Normal distribution with mean 0 and standard deviation 1.
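
Any normal variable can be mapped onto the standard normal by subtracting its mean and dividing by its standard deviation. The sketch below, using hypothetical values `mu = 70`, `sigma = 5`, and `x = 72`, shows that a probability computed on the original scale matches the one computed on the standard scale.

```python
from scipy.stats import norm

mu, sigma = 70.0, 5.0   # hypothetical mean and standard deviation
x = 72.0                # hypothetical observation

z = (x - mu) / sigma                          # standardized value
p_direct = norm.cdf(x, loc=mu, scale=sigma)   # P(X <= x) on the original scale
p_standard = norm.cdf(z)                      # P(Z <= z) on the standard normal

print("z:", z)
print("P(X <= x):", p_direct)
print("P(Z <= z):", p_standard)
```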

13. **Central Limit Theorem (CLT):**
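    The sampling distribution of the mean of independent, identically distributed observations with finite variance approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution.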
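
Beyond the shape, the CLT also says where the sample means concentrate: around the population mean, with standard deviation σ/√n. A minimal simulation sketch, assuming an exponential population with `scale = 2` (so mean and standard deviation are both 2) and a fixed seed for repeatability:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

n = 30              # observations per sample
n_samples = 10_000  # number of simulated samples
scale = 2.0         # exponential population: mean = sd = scale

# Each row is one sample of size n; take the mean of every row
sample_means = rng.exponential(scale=scale, size=(n_samples, n)).mean(axis=1)

print("Mean of sample means:", sample_means.mean(), "| theory:", scale)
print("SD of sample means:  ", sample_means.std(ddof=1), "| theory:", scale / np.sqrt(n))
```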

14. **Relation of CLT to normal distribution:**
    CLT justifies the use of the normal distribution for large sample sizes.

15. **Application of Z statistics in hypothesis testing:**
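    A Z statistic measures how far a sample mean is from the hypothesized population mean under the null hypothesis, in units of the standard error: Z = (x̄ - µ0) / (σ / √n). It is used when the population standard deviation is known or the sample is large.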
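
A sketch of a one-sample Z-test, assuming the population standard deviation is known. All numbers (`sample_mean`, `mu_0`, `sigma`, `n`) are hypothetical and chosen so the arithmetic is easy to follow.

```python
import math
from scipy.stats import norm

# Hypothetical inputs for illustration
sample_mean = 52.0   # observed sample mean
mu_0 = 50.0          # population mean under the null hypothesis
sigma = 10.0         # population standard deviation, assumed known
n = 100              # sample size

standard_error = sigma / math.sqrt(n)
z_stat = (sample_mean - mu_0) / standard_error

# Two-sided p-value: probability of a Z at least this extreme in either tail
p_two_sided = 2 * norm.sf(abs(z_stat))

print("Z statistic:", z_stat)
print("Two-sided p-value:", p_two_sided)
```

With a significance level of 0.05, a p-value below 0.05 would lead to rejecting the null hypothesis.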

16. **How to calculate a Z-score:**
    Z = (X - µ) / σ
    Represents how many standard deviations a point is from the mean.

17. **Point estimates vs. interval estimates:**

* Point estimate: Single value (e.g., sample mean)
* Interval estimate: Range with confidence level

18. **Significance of confidence intervals:**
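    A confidence interval gives a range of plausible values for the population parameter. A 95% confidence level means that, over repeated sampling, about 95% of intervals constructed this way would contain the true parameter.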

19. **Z-score and confidence interval relationship:**
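    The critical Z-value for the chosen confidence level (e.g., 1.96 for 95%) multiplied by the standard error gives the margin of error, so the interval is x̄ ± z* · σ/√n.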
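
The critical Z-value need not be hardcoded; it can be derived from the confidence level with `norm.ppf`. The helper `z_confidence_interval` below is a hypothetical name, and the inputs are illustrative; the sketch assumes a known population standard deviation.

```python
import math
from scipy.stats import norm

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (low, high, z_critical) for a Z-based interval for the mean."""
    alpha = 1 - confidence
    z_critical = norm.ppf(1 - alpha / 2)         # e.g. about 1.96 for 95%
    margin = z_critical * sigma / math.sqrt(n)   # margin of error
    return sample_mean - margin, sample_mean + margin, z_critical

# Illustrative inputs: mean 50, sigma 10, n = 100
low, high, z_crit = z_confidence_interval(50.0, 10.0, 100)

print("Critical Z:", z_crit)
print("95% CI:", (low, high))
```

Raising `confidence` increases `z_critical` and widens the interval; increasing `n` narrows it.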

20. **Using Z-scores to compare distributions:**
    Z-scores express values from different distributions in common units (standard deviations from the mean), so they can be compared directly.
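
As an illustration, consider two test scores reported on different scales. The scores, means, and standard deviations below are hypothetical; the point is only that the Z-scores, unlike the raw scores, are directly comparable.

```python
def z_score(x, mean, std):
    """Number of standard deviations x lies from the mean."""
    return (x - mean) / std

# Hypothetical results on two tests with different scales
z_test_a = z_score(85, mean=75, std=10)     # test A: scored out of 100
z_test_b = z_score(650, mean=500, std=100)  # test B: scored out of 800

print("Z on test A:", z_test_a)
print("Z on test B:", z_test_b)
print("Stronger relative result:", "B" if z_test_b > z_test_a else "A")
```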

21. **Assumptions for applying CLT:**

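* Independent, identically distributed observations
* Sufficiently large sample size (n ≥ 30 is a common rule of thumb; heavily skewed populations may need more)
* Finite variance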

22. **Expected value in a probability distribution:**
    The probability-weighted average of all possible values, which equals the long-run average over many repetitions of the experiment.

23. **Relation of distribution to expected value:**
    Expected value is computed from the values and probabilities in the distribution: E[X] = Σ x · P(X = x) for a discrete variable, and E[X] = ∫ x f(x) dx for a continuous one.
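
For a discrete distribution, the expected value is the sum of each value times its probability. A minimal sketch for a fair six-sided die, which also computes the variance in the same probability-weighted way:

```python
import numpy as np

values = np.arange(1, 7)            # faces of a fair die
probabilities = np.full(6, 1 / 6)   # each face has probability 1/6

# E[X] = sum of x * P(X = x)
expected_value = np.sum(values * probabilities)

# Var(X) = sum of (x - E[X])**2 * P(X = x)
variance = np.sum((values - expected_value) ** 2 * probabilities)

print("Expected value:", expected_value)
print("Variance:", variance)
```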

---

**Python Programs**

```python
# 1. Generate a random variable
import numpy as np
print("Random variable:", np.random.rand())

# 2. Discrete uniform distribution and PMF
import matplotlib.pyplot as plt
values = np.arange(1, 7)
probabilities = np.full(6, 1/6)
plt.bar(values, probabilities)
plt.title("Discrete Uniform Distribution (Die)")
plt.xlabel("Value")
plt.ylabel("Probability")
plt.show()

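# 3. Bernoulli PMF (a Bernoulli variable is discrete, so it has a PMF rather than a PDF)
from scipy.stats import bernoulli
def bernoulli_pmf(p):
    x = [0, 1]  # the two possible outcomes: failure, success
    return bernoulli.pmf(x, p)
print("Bernoulli PMF (p=0.6):", bernoulli_pmf(0.6))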

# 4. Binomial Distribution
n, p = 10, 0.5
binom_samples = np.random.binomial(n, p, 1000)
plt.hist(binom_samples, bins=np.arange(12)-0.5, density=True)
plt.title("Binomial Distribution Histogram")
plt.show()

# 5. Poisson Distribution
from scipy.stats import poisson
x = np.arange(0, 20)
pmf = poisson.pmf(x, mu=5)
plt.bar(x, pmf)
plt.title("Poisson Distribution (mu=5)")
plt.show()

# 6. Discrete uniform CDF
cdf = np.cumsum(probabilities)
plt.step(values, cdf, where='post')  # 'post' makes the CDF jump at each value and stay flat until the next
plt.title("Discrete Uniform CDF")
plt.show()

# 7. Continuous uniform distribution
samples = np.random.uniform(0, 1, 1000)
plt.hist(samples, bins=20)
plt.title("Continuous Uniform Distribution")
plt.show()

# 8. Simulate normal distribution
normal_data = np.random.normal(loc=0, scale=1, size=1000)
plt.hist(normal_data, bins=20)
plt.title("Normal Distribution")
plt.show()

# 9. Z-scores
from scipy.stats import zscore
data = np.random.normal(50, 10, 100)
z_scores = zscore(data)
plt.hist(z_scores)
plt.title("Z-scores")
plt.show()

# 10. CLT implementation
samples = [np.mean(np.random.exponential(scale=2, size=30)) for _ in range(1000)]
plt.hist(samples, bins=30)
plt.title("CLT from Exponential Distribution")
plt.show()
```

---

**Practical Section**

```python
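# Imports (repeated so this section also runs on its own)
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import poisson, zscore

# 1. CLT from normal samples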
means = [np.mean(np.random.normal(loc=0, scale=1, size=30)) for _ in range(1000)]
plt.hist(means, bins=30)
plt.title("CLT Verification")
plt.show()

# 2. Standard Normal Distribution
x = np.linspace(-4, 4, 100)
y = (1/np.sqrt(2*np.pi)) * np.exp(-x**2/2)
plt.plot(x, y)
plt.title("Standard Normal Distribution")
plt.show()

# 3. Binomial probabilities: P(X = k) for k = 0..10 with n=10, p=0.5
from scipy.stats import binom
k = np.arange(0, 11)
print(binom.pmf(k, n=10, p=0.5))

# 4. Z-score comparison
value = 72
mean = 70
std = 5
z = (value - mean) / std
print("Z-score:", z)

# 5. Hypothesis testing (one-sided Z-test of a single value 72 against mean 70, sd 5)
from scipy.stats import norm
z = (72 - 70) / 5
p_value = 1 - norm.cdf(z)  # upper-tail probability P(Z > z)
print("P-value:", p_value)

# 6. Confidence interval (95%, normal approximation)
sample = np.random.normal(50, 10, 100)
mean = np.mean(sample)
se = np.std(sample, ddof=1) / np.sqrt(len(sample))  # ddof=1 gives the sample standard deviation
z_critical = 1.96  # critical value for 95% confidence
ci = (mean - z_critical * se, mean + z_critical * se)
print("Confidence Interval:", ci)

# 7. CI for normal data (95%)
normal_data = np.random.normal(60, 15, 100)
mean = np.mean(normal_data)
se = np.std(normal_data, ddof=1) / np.sqrt(len(normal_data))  # sample sd over sqrt(n)
ci = (mean - 1.96*se, mean + 1.96*se)
print("CI for Mean:", ci)

# 8. PDF of normal distribution
from scipy.stats import norm
x = np.linspace(-5, 5, 100)
pdf = norm.pdf(x, loc=0, scale=1)
plt.plot(x, pdf)
plt.title("Normal Distribution PDF")
plt.show()

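# 9. Poisson CDF (evaluated at non-negative integer counts, not the continuous grid from step 8)
k = np.arange(0, 20)
cdf = poisson.cdf(k, mu=5)
plt.step(k, cdf, where='post')
plt.title("Poisson CDF (mu=5)")
plt.show()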

# 10. Continuous uniform expected value
samples = np.random.uniform(2, 10, 1000)
expected_value = np.mean(samples)  # sample estimate; the theoretical value is (2 + 10) / 2 = 6
print("Expected Value:", expected_value)

# 11. Standard deviation comparison
data1 = np.random.normal(60, 10, 100)
data2 = np.random.normal(65, 15, 100)
print("Std of Data 1:", np.std(data1, ddof=1))
print("Std of Data 2:", np.std(data2, ddof=1))
plt.hist(data1, alpha=0.5, label='Data 1')
plt.hist(data2, alpha=0.5, label='Data 2')
plt.legend()
plt.title("Standard Deviation Comparison")
plt.show()

# 12. Range and IQR
from scipy.stats import iqr
data = np.random.normal(100, 20, 1000)
print("Range:", np.max(data) - np.min(data))
print("IQR:", iqr(data))

# 13. Z-score normalization
normalized = zscore(data)
plt.hist(normalized)
plt.title("Z-score Normalization")
plt.show()

# 14. Skewness and kurtosis
from scipy.stats import skew, kurtosis
print("Skewness:", skew(data))
print("Excess kurtosis:", kurtosis(data))  # Fisher definition by default: a normal distribution gives about 0
```

