1.What is a random variable in probability theory?

A random variable is a function that assigns numerical values to outcomes in a sample space.

2.What are the types of random variables?

There are two types: Discrete (countable values) and Continuous (any value within a range).

3.What is the difference between discrete and continuous distributions?
Discrete distributions describe countable outcomes, while continuous distributions represent values in a continuous range.

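4.What are probability density functions (PDFs)?

A PDF describes the relative likelihood of the values of a continuous random variable: the probability that the variable falls in an interval is the area under the PDF over that interval. For a discrete random variable, the analogous function is the probability mass function (PMF), which gives the probability of each individual value.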

5.How do cumulative distribution functions (CDFs) differ from PDFs?

The CDF gives the probability that a random variable is less than or equal to a certain value, while the PDF gives the probability density at that value. For a continuous variable, the CDF is the integral of the PDF up to that value.
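
As a rough illustration of this relationship, the sketch below checks numerically that the area under a PDF between two points matches the difference of the CDF at those points. It uses the standard library's `statistics.NormalDist` with a standard normal distribution; the helper name `integrate_pdf` and the interval [-1, 1] are assumptions made for this example.

```python
from statistics import NormalDist


def integrate_pdf(dist, a, b, steps=10_000):
    """Approximate the area under dist.pdf between a and b (trapezoidal rule)."""
    width = (b - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    for i in range(1, steps):
        total += dist.pdf(a + i * width)
    return total * width


std_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability of landing in [-1, 1], computed two ways
area_from_pdf = integrate_pdf(std_normal, -1.0, 1.0)
prob_from_cdf = std_normal.cdf(1.0) - std_normal.cdf(-1.0)

print("Area under PDF on [-1, 1]:", area_from_pdf)
print("CDF(1) - CDF(-1):         ", prob_from_cdf)
```

Both numbers should agree closely, at roughly 0.68 for one standard deviation either side of the mean.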

6.What is a discrete uniform distribution?

A discrete uniform distribution assigns equal probability to all possible discrete outcomes.

7.What are the key properties of a Bernoulli distribution?

A Bernoulli distribution has two possible outcomes (success or failure), with probability $p$ for success and $1-p$ for failure.
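
Two further standard properties are that a Bernoulli variable coded as 1 for success and 0 for failure has mean p and variance p(1 - p). The simulation below is a minimal sketch of how one might check this; the function name `simulate_bernoulli`, the value p = 0.3, and the fixed seed are assumptions chosen for illustration.

```python
import random


def simulate_bernoulli(p, trials, seed=0):
    """Draw `trials` Bernoulli(p) outcomes (1 = success, 0 = failure)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [1 if rng.random() < p else 0 for _ in range(trials)]


p = 0.3  # illustrative success probability
draws = simulate_bernoulli(p, trials=100_000)

sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / len(draws)

print("Sample mean:    ", sample_mean, "| theory:", p)
print("Sample variance:", sample_var, "| theory:", p * (1 - p))
```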

8.What is the binomial distribution, and how is it used in probability?
The binomial distribution models the number of successes in $n$ independent Bernoulli trials, each with success probability $p$.
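
The probability of exactly k successes is C(n, k) p^k (1 - p)^(n - k). A short sketch of that formula using only the standard library follows; the function name `binomial_pmf` and the parameters n = 10, p = 0.5 are illustrative choices.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.5  # ten fair coin flips
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print("P(X = 5):", pmf[5])
print("Sum of all probabilities:", sum(pmf))
```

The probabilities over k = 0, ..., n should sum to 1, which is a quick sanity check on any PMF implementation.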

9.What is the Poisson distribution and where is it applied?

The Poisson distribution models the number of events occurring in a fixed interval of time or space when events happen independently at a constant average rate. It is used in queuing theory and reliability analysis.
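
For an average rate λ, the probability of observing k events is e^(-λ) λ^k / k!. The sketch below evaluates this formula directly and checks that the mean of the distribution comes out as λ; the function name `poisson_pmf`, the rate λ = 5, and the truncation at k = 59 are assumptions made for the example.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


lam = 5  # illustrative average rate
# Probabilities beyond k = 59 are negligible for lam = 5, so truncate there
pmf = [poisson_pmf(k, lam) for k in range(60)]

total = sum(pmf)
mean = sum(k * prob for k, prob in enumerate(pmf))

print("P(X = 0):", pmf[0])
print("Sum of probabilities:", total)
print("Mean:", mean)
```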


10.What is a continuous uniform distribution?
A distribution whose density is constant over a given interval, so that sub-intervals of equal length are equally likely.

11.What are the characteristics of a normal distribution?
Symmetric, bell-shaped, and fully defined by its mean $\mu$ and standard deviation $\sigma$.

12.What is the standard normal distribution, and why is it important?
A normal distribution with $\mu = 0$ and $\sigma = 1$, used for standardizing data.
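
One way to see why standardizing is useful: a probability under any normal distribution can be read off the standard normal after converting to a Z-score. The sketch below uses `statistics.NormalDist`; the mean of 50, standard deviation of 10, and the point x = 65 are hypothetical values picked for illustration.

```python
from statistics import NormalDist

mu, sigma = 50.0, 10.0  # hypothetical mean and standard deviation
x = 65.0                # hypothetical observation

z = (x - mu) / sigma  # standardize x

# P(X <= x) under N(mu, sigma) equals P(Z <= z) under N(0, 1)
prob_original = NormalDist(mu, sigma).cdf(x)
prob_standard = NormalDist(0.0, 1.0).cdf(z)

print("Z-score:", z)
print("P(X <= 65) under N(50, 10):", prob_original)
print("P(Z <= 1.5) under N(0, 1): ", prob_standard)
```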

13.What is the Central Limit Theorem (CLT), and why is it critical in statistics?
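The CLT states that the distribution of the sample mean of independent, identically distributed observations approaches a normal distribution as the sample size increases, regardless of the shape of the population's distribution, provided the population has finite variance.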
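
The theorem also makes a quantitative prediction: sample means are centred on the population mean, with standard deviation equal to the population standard deviation divided by the square root of the sample size. The following is a minimal standard-library sketch of checking that prediction on a skewed (exponential) population; the seed, sample size, and number of samples are arbitrary choices for the example.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the run is repeatable

pop_mean = 2.0     # an exponential distribution with mean 2 also has std 2
pop_std = 2.0
sample_size = 30
num_samples = 5000

# Draw many samples from the skewed population and record each sample's mean
sample_means = []
for _ in range(num_samples):
    sample = [rng.expovariate(1 / pop_mean) for _ in range(sample_size)]
    sample_means.append(sum(sample) / sample_size)

observed_mean = mean(sample_means)
observed_sd = stdev(sample_means)
predicted_sd = pop_std / sqrt(sample_size)

print("Mean of sample means:", observed_mean, "| population mean:", pop_mean)
print("Std of sample means: ", observed_sd, "| predicted:", predicted_sd)
```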

14.How does the Central Limit Theorem relate to the normal distribution?
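It explains why sample means, and sums of many independent contributions, are approximately normally distributed for large samples even when the underlying data are not. This is why normal-based methods apply so widely.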

15.What is the application of Z statistics in hypothesis testing?
Z-tests compare a sample mean to a hypothesized population mean when the population variance is known.


16.How do you calculate a Z-score, and what does it represent?
$Z = \frac{x - \mu}{\sigma}$ measures how many standard deviations $x$ is from the mean $\mu$.

17.What are point estimates and interval estimates in statistics?
A point estimate provides a single value as the estimate of a parameter, while an interval estimate gives a range of plausible values with an associated confidence level.

18.What is the significance of confidence intervals in statistical analysis?
Confidence intervals quantify uncertainty in estimating population parameters.


19.What is the relationship between a Z-score and a confidence interval?
Z-scores help determine critical values for confidence intervals.
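
As a sketch of how this works when the population standard deviation is known, the critical value for a two-sided interval is the Z-score that leaves half of the excluded probability in each tail, and the interval is the sample mean plus or minus that value times the standard error. The sample mean, standard deviation, and sample size below are hypothetical numbers.

```python
from math import sqrt
from statistics import NormalDist

confidence = 0.95
# Critical Z-score: leaves (1 - confidence) / 2 in each tail of N(0, 1)
z_crit = NormalDist().inv_cdf((1 + confidence) / 2)

# Hypothetical summary statistics
sample_mean, pop_std, n = 52.0, 10.0, 30

margin = z_crit * pop_std / sqrt(n)  # critical value times standard error
interval = (sample_mean - margin, sample_mean + margin)

print("Critical Z-score:", z_crit)
print("Margin of error:", margin)
print("95% confidence interval:", interval)
```

For a 95% level the critical value is about 1.96; other confidence levels only change the argument passed to `inv_cdf`.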

20.How are Z-scores used to compare different distributions?
They standardize different distributions, allowing comparison on a common scale.
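
A small worked example, using made-up exam statistics, shows the idea: raw scores from two tests with different means and spreads are not directly comparable, but their Z-scores are.

```python
def z_score(x, mean, std):
    """Number of standard deviations x lies above (or below) the mean."""
    return (x - mean) / std


# Hypothetical results: (score, class mean, class standard deviation)
math_z = z_score(80, mean=70, std=10)
physics_z = z_score(65, mean=50, std=5)

print("Math Z-score:   ", math_z)
print("Physics Z-score:", physics_z)

# The lower raw score is the stronger result relative to its own class
better = "physics" if physics_z > math_z else "math"
print("Stronger relative performance:", better)
```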

21.What are the assumptions for applying the Central Limit Theorem?
Independent and identically distributed observations, a sufficiently large sample size, and a population with finite variance.

22.What is the concept of expected value in a probability distribution?
The expected value is the probability-weighted average of all possible values of a random variable.
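
For a discrete variable this means summing each value multiplied by its probability. The sketch below does this for a fair six-sided die, using exact fractions to avoid rounding; the die is simply a convenient example.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# E[X] = sum over all values of (value * probability)
expected_value = sum(face * prob for face, prob in pmf.items())

print("Expected value:", expected_value)  # prints 7/2
```

Note that 7/2 is not itself a possible outcome of a roll; the expected value is a long-run average, not a most likely value.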

23.How does a probability distribution relate to the expected outcome of a random variable?
The probability distribution defines how likely different outcomes are, influencing the expected value.

In [None]:
#Practical

In [1]:
#1. Generate a random variable and display its value

import random

random_var = random.randint(1, 100)
print("Random Variable:", random_var)

Random Variable: 56


In [None]:
#2. Generate a discrete uniform distribution and plot PMF
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats

values = np.arange(1, 7)  # Simulating a fair die (1-6)
pmf = stats.randint.pmf(values, 1, 7)

plt.bar(values, pmf, color='skyblue', edgecolor='black')
plt.xlabel('Value')
plt.ylabel('Probability')
plt.title('PMF of a Discrete Uniform Distribution')
plt.show()

In [3]:
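#3. Calculate the PMF of a Bernoulli distribution
# (a Bernoulli variable is discrete, so it has a PMF rather than a PDF)

def bernoulli_pmf(p, x):
    # P(X = x) = p^x * (1 - p)^(1 - x) for x in {0, 1}; zero otherwise
    return p**x * (1 - p)**(1 - x) if x in [0, 1] else 0

print("P(X=1) for p=0.6:", bernoulli_pmf(0.6, 1))
print("P(X=0) for p=0.6:", bernoulli_pmf(0.6, 0))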

P(X=1) for p=0.6: 0.6
P(X=0) for p=0.6: 0.4


In [None]:
#4. Simulate a binomial distribution and plot histogram

n, p = 10, 0.5
binomial_data = np.random.binomial(n, p, 1000)

# Center each bar on its integer count by placing bin edges at -0.5, 0.5, ..., n + 0.5
plt.hist(binomial_data, bins=np.arange(n + 2) - 0.5, density=True, alpha=0.7, color='blue', edgecolor='black')
plt.xlabel('Successes')
plt.ylabel('Probability')
plt.title('Binomial Distribution (n=10, p=0.5)')
plt.show()

In [None]:
#5. Create a Poisson distribution and visualize it

lam = 5  # Average rate
poisson_data = np.random.poisson(lam, 1000)

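# Bin edges at -0.5, 0.5, ..., max + 0.5 give every integer count its own bar
# (edges that stop at max would merge the two largest counts into one bin)
plt.hist(poisson_data, bins=np.arange(poisson_data.max() + 2) - 0.5, density=True, alpha=0.7, color='red', edgecolor='black')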
plt.xlabel('Occurrences')
plt.ylabel('Probability')
plt.title('Poisson Distribution (λ=5)')
plt.show()

In [None]:
#6. Calculate and plot the CDF of a discrete uniform distribution

cdf = stats.randint.cdf(values, 1, 7)

# where='post' makes the CDF jump at each value and stay flat until the next one
plt.step(values, cdf, where='post', color='green', linewidth=2)
plt.xlabel('Value')
plt.ylabel('Cumulative Probability')
plt.title('CDF of a Discrete Uniform Distribution')
plt.show()

In [None]:
#7. Generate and visualize a continuous uniform distribution

uniform_data = np.random.uniform(0, 1, 1000)

plt.hist(uniform_data, bins=30, density=True, alpha=0.7, color='purple', edgecolor='black')
plt.xlabel('Value')
plt.ylabel('Density')
plt.title('Continuous Uniform Distribution')
plt.show()

In [None]:
#8. Simulate a normal distribution and plot histogram

mu, sigma = 0, 1
normal_data = np.random.normal(mu, sigma, 1000)

plt.hist(normal_data, bins=30, density=True, alpha=0.7, color='orange', edgecolor='black')
plt.xlabel('Value')
plt.ylabel('Density')
plt.title('Histogram of Normal Distribution')
plt.show()

In [None]:
#9. Calculate Z-scores from a dataset and plot them

from scipy.stats import zscore

data = np.random.normal(50, 15, 1000)
z_scores = zscore(data)

plt.hist(z_scores, bins=30, density=True, alpha=0.7, color='cyan', edgecolor='black')
plt.xlabel('Z-Score')
plt.ylabel('Density')
plt.title('Z-Scores of a Dataset')
plt.show()

In [None]:
#10. Implement the CLT using Python

sample_size = 30
num_samples = 1000
non_normal_data = np.random.exponential(scale=2, size=10000)

sample_means = [np.mean(np.random.choice(non_normal_data, sample_size)) for _ in range(num_samples)]

plt.hist(sample_means, bins=30, density=True, alpha=0.7, color='brown', edgecolor='black')
plt.xlabel('Sample Mean')
plt.ylabel('Density')
plt.title('Central Limit Theorem Demonstration')
plt.show()


In [None]:
#15. Simulate multiple samples from a normal distribution and verify the Central Limit Theorem

import numpy as np
import matplotlib.pyplot as plt

sample_size = 30
num_samples = 1000
population = np.random.normal(50, 15, 10000)

sample_means = [np.mean(np.random.choice(population, sample_size)) for _ in range(num_samples)]

plt.hist(sample_means, bins=30, density=True, alpha=0.7, color='blue', edgecolor='black')
plt.xlabel('Sample Mean')
plt.ylabel('Density')
plt.title('Verification of Central Limit Theorem')
plt.show()


In [None]:
#16. Calculate and plot the standard normal distribution (mean = 0, std = 1)

import scipy.stats as stats

x = np.linspace(-4, 4, 100)
pdf = stats.norm.pdf(x, 0, 1)

plt.plot(x, pdf, color='red', label='Standard Normal Distribution')
plt.xlabel('Z-Score')
plt.ylabel('Density')
plt.title('Standard Normal Distribution (Mean=0, Std=1)')
plt.legend()
plt.show()


In [None]:
#17. Generate binomial random variables and calculate their probabilities

n, p = 10, 0.5
binomial_data = np.random.binomial(n, p, 1000)

print("Probability of getting exactly 5 successes:", np.sum(binomial_data == 5) / len(binomial_data))

In [None]:
#18. Calculate the Z-score for a data point and compare it to a standard normal distribution

def calculate_z_score(x, mean, std):
    return (x - mean) / std

data = np.random.normal(50, 10, 100)
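z_score = calculate_z_score(55, np.mean(data), np.std(data, ddof=1))  # ddof=1: sample standard deviation

print("Z-score:", z_score)
# Compare with the standard normal: fraction of N(0, 1) values below this Z-score
print("Standard normal percentile:", stats.norm.cdf(z_score))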

In [None]:
#19. Implement hypothesis testing using Z-statistics

from scipy.stats import norm

sample_mean = 52
pop_mean = 50
pop_std = 10
sample_size = 30

z_stat = (sample_mean - pop_mean) / (pop_std / np.sqrt(sample_size))
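# Right-tailed test (H1: mean > pop_mean); norm.sf(z) equals 1 - norm.cdf(z)
# For a two-tailed test, use 2 * norm.sf(abs(z_stat)) instead
p_value = norm.sf(z_stat)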

print("Z-Statistic:", z_stat)
print("P-Value:", p_value)

In [None]:
#20. Create a confidence interval for a dataset

import scipy.stats as st

data = np.random.normal(100, 15, 50)
confidence = 0.95

mean, std_err = np.mean(data), st.sem(data)
margin_error = std_err * st.t.ppf((1 + confidence) / 2, len(data) - 1)

conf_interval = (mean - margin_error, mean + margin_error)

print("95% Confidence Interval:", conf_interval)

In [None]:
#21. Generate normal data, then calculate and interpret confidence interval

data = np.random.normal(50, 10, 100)
mean, std_err = np.mean(data), st.sem(data)
conf_interval = st.norm.interval(0.95, loc=mean, scale=std_err)

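print("95% Confidence Interval for Mean:", conf_interval)
# Interpretation: if this procedure were repeated on many fresh samples, about 95% of the
# resulting intervals would contain the true mean (50 for this simulated data)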

In [None]:
#22. Calculate and visualize the PDF of a normal distribution

x = np.linspace(30, 70, 100)
pdf = stats.norm.pdf(x, 50, 10)

plt.plot(x, pdf, color='green', label='PDF')
plt.xlabel('Value')
plt.ylabel('Density')
plt.title('Probability Density Function of Normal Distribution')
plt.legend()
plt.show()

In [None]:
#23. Calculate and interpret the CDF of a Poisson distribution

lambda_poisson = 4
x = np.arange(0, 15)
cdf = stats.poisson.cdf(x, lambda_poisson)

# where='post' makes the CDF jump at each integer and stay flat until the next one
plt.step(x, cdf, where='post', color='purple')
plt.xlabel('x')
plt.ylabel('CDF')
plt.title('Cumulative Distribution Function of Poisson Distribution')
plt.show()

In [None]:
#24. Simulate a continuous uniform distribution and calculate expected value

data = np.random.uniform(10, 20, 1000)
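expected_value = np.mean(data)  # the sample mean estimates E[X]

print("Estimated Expected Value (sample mean):", expected_value)
print("Theoretical Expected Value (a + b) / 2:", (10 + 20) / 2)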

In [None]:
#25. Compare the standard deviations of two datasets and visualize the difference

data1 = np.random.normal(50, 5, 1000)
data2 = np.random.normal(50, 15, 1000)

print("Standard Deviation of Data1:", np.std(data1))
print("Standard Deviation of Data2:", np.std(data2))

plt.hist(data1, bins=30, alpha=0.5, label='Std Dev = 5', color='blue')
plt.hist(data2, bins=30, alpha=0.5, label='Std Dev = 15', color='red')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Same Mean, Different Standard Deviations')
plt.legend()
plt.show()