### I. What is a random variable in probability theory?

A random variable is a function that maps each outcome of a random experiment to a number; its value is the numerical result of a random phenomenon.
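
The mapping idea can be made concrete with a tiny experiment. The sketch below assumes an experiment of two coin flips and a hypothetical random variable `number_of_heads`; neither comes from the original text, they are only for illustration.

```python
from itertools import product

# Sample space for two coin flips: each outcome is a tuple such as ('H', 'T').
sample_space = list(product("HT", repeat=2))

def number_of_heads(outcome):
    """Hypothetical random variable X: maps one outcome to a number."""
    return outcome.count("H")

# Apply the mapping to every outcome in the sample space.
mapping = {outcome: number_of_heads(outcome) for outcome in sample_space}
for outcome, value in mapping.items():
    print(outcome, "->", value)
```

The outcomes themselves are not numbers; the random variable is what turns them into numbers.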
### II. What are the types of random variables?

Random variables are of two types: Discrete (a finite or countably infinite set of values, such as counts) and Continuous (any value within an interval, so uncountably many possible values).
### III. What is the difference between discrete and continuous distributions?

Discrete distributions deal with countable outcomes, while continuous distributions deal with outcomes over a range or interval.
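### IV. What is a probability density function (PDF)?

A PDF describes the relative likelihood of a continuous random variable near a given value. Probabilities come from integrating the PDF over an interval, and the probability of any single exact value is zero. For discrete variables, the analogous function is the probability mass function (PMF), which gives the probability of each specific value.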
### V. How do cumulative distribution functions (CDF) differ from PDF?

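The CDF gives the probability that a variable is less than or equal to a certain value, F(x) = P(X ≤ x). The PDF gives the density at a value (for discrete variables, the PMF gives the probability of that value). The CDF is the integral of the PDF, or the running sum of the PMF.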
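
For a discrete variable, the CDF is the running sum of the individual probabilities. A minimal sketch, assuming a fair six-sided die as the example (an assumption, not from the original) and using exact fractions to avoid rounding:

```python
from fractions import Fraction
from itertools import accumulate

faces = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6      # P(X = x) for each face of a fair die
cdf = list(accumulate(pmf))     # P(X <= x): running sum of the PMF

for face, p, c in zip(faces, pmf, cdf):
    print(f"P(X={face}) = {p}    P(X<={face}) = {c}")
```

The last CDF value is always 1, because some face must come up.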
### VI. What is a discrete uniform distribution?

A discrete uniform distribution assigns equal probability to all outcomes in a finite set.
### VII. What are the key properties of a Bernoulli distribution?

A Bernoulli distribution has only two outcomes: 1 (success) with probability p and 0 (failure) with probability 1-p. Its mean is p and its variance is p(1-p).
### VIII. What is the binomial distribution, and how is it used in probability?

The binomial distribution models the number of successes in a fixed number of independent Bernoulli trials that share the same success probability p.
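
The binomial probabilities follow from the standard formula P(X = k) = C(n, k) p^k (1-p)^(n-k). The sketch below writes that formula out with the standard library; the helper name `binomial_pmf` and the values n = 10, p = 0.5 are illustrative assumptions.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(probs):
    print(f"P(X={k}) = {prob:.4f}")
print("Total probability:", sum(probs))
```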
### IX. What is the Poisson distribution and where is it applied?

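The Poisson distribution models the number of events occurring in a fixed interval of time or space when events occur independently and at a constant average rate. Typical applications include call arrivals per hour, defects per unit of material, and website hits per minute.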
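
The Poisson probabilities follow from the standard formula P(X = k) = λ^k e^(-λ) / k!, where λ is the average number of events per interval. A minimal sketch, assuming λ = 3 for illustration; the helper name `poisson_pmf` is hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, rate):
    """P(X = k) when events occur at an average of `rate` per interval."""
    return rate**k * exp(-rate) / factorial(k)

rate = 3  # assumed average number of events per interval
for k in range(6):
    print(f"P(X={k}) = {poisson_pmf(k, rate):.4f}")
```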
### X. What is a continuous uniform distribution?

It is a distribution on a range [a, b] with constant density 1/(b - a), so all sub-intervals of equal length are equally likely.
### XI. What are the characteristics of a normal distribution?

A normal distribution is symmetric, bell-shaped, and described by mean (μ) and standard deviation (σ).
### XII. What is the standard normal distribution, and why is it important?

It is a normal distribution with mean 0 and standard deviation 1. Any normal variable can be converted to it by standardizing, which is why it is used to compute probabilities and perform Z-tests.
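
Standardizing is what makes the standard normal distribution reusable: a probability under any normal distribution equals the probability of the corresponding Z value under the standard normal. A small check using `statistics.NormalDist` from the standard library; the mean 70, standard deviation 5, and value 75 are illustrative assumptions.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)
scores = NormalDist(mu=70, sigma=5)   # assumed example distribution

x = 75
z = (x - scores.mean) / scores.stdev  # standardize the value

p_direct = scores.cdf(x)              # P(X <= 75) under N(70, 5)
p_via_z = standard_normal.cdf(z)      # P(Z <= z) under N(0, 1)

print(f"z = {z}")
print(f"P(X <= {x}) = {p_direct:.4f}, P(Z <= {z}) = {p_via_z:.4f}")
```

Both probabilities agree, so a single table (or function) for the standard normal serves every normal distribution.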
### XIII. What is the Central Limit Theorem (CLT), and why is it critical in statistics?

The CLT states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution, provided the population has finite variance.
### XIV. How does the Central Limit Theorem relate to the normal distribution?

It justifies the use of the normal distribution to approximate sampling distributions.
### XV. What is the application of Z statistics in hypothesis testing?

Z-statistics are used to test hypotheses about population means when population variance is known.
### XVI. How do you calculate a Z-score, and what does it represent?

Z = (X - μ) / σ. It represents how many standard deviations a value is from the mean.
### XVII. What are point estimates and interval estimates in statistics?

Point estimate gives a single value estimate; interval estimate gives a range with a confidence level.
### XVIII. What is the significance of confidence intervals in statistical analysis?

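A confidence interval gives a range of plausible values for a population parameter. The confidence level (for example, 95%) describes the procedure: across repeated samples, about that fraction of the intervals constructed this way would contain the true parameter.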
### XIX. What is the relationship between a Z-score and a confidence interval?

Z-scores are used to construct confidence intervals, especially when the population standard deviation is known.
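
With a known population standard deviation σ, the interval is the sample mean plus or minus z* · σ/√n, where z* is the critical Z value for the chosen confidence level. A minimal sketch with assumed numbers (sample mean 52, σ = 10, n = 30, 95% confidence):

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 52.0   # assumed sample mean
sigma = 10.0         # assumed known population standard deviation
n = 30               # assumed sample size
confidence = 0.95

# Critical value: the Z-score that leaves (1 - confidence) / 2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_star * sigma / sqrt(n)
ci = (sample_mean - margin, sample_mean + margin)

print(f"z* = {z_star:.3f}")
print(f"{confidence:.0%} CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

A higher confidence level gives a larger z* and therefore a wider interval.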
### XX. How are Z-scores used to compare different distributions?

Z-scores standardize values, allowing comparison across different distributions.
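
As an illustration, consider two exam scores from tests with different scales. The means, standard deviations, and scores below are made-up values, and `z_score` is a hypothetical helper.

```python
def z_score(x, mean, std):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std

# Two assumed exams with different means and spreads.
z_a = z_score(85, mean=75, std=10)
z_b = z_score(70, mean=60, std=5)

print(f"Exam A: raw 85, z = {z_a}")
print(f"Exam B: raw 70, z = {z_b}")
print("Stronger relative performance:", "A" if z_a > z_b else "B")
```

The raw score on exam A is higher, but the exam B score sits further above its own mean once both are expressed in standard deviations.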
### XXI. What are the assumptions for applying the Central Limit Theorem?

The observations should be independent and identically distributed with finite mean and variance, and the sample size should be sufficiently large (n ≥ 30 is a common rule of thumb).
### XXII. What is the concept of expected value in a probability distribution?

It is the mean or average outcome expected if an experiment is repeated many times.
### XXIII. How does a probability distribution relate to the expected outcome of a random variable?

Expected value is computed using the probability distribution: E(X) = Σx*P(x) for discrete variables.
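
The formula E(X) = Σx*P(x) can be evaluated directly from a PMF. A minimal sketch, assuming a fair six-sided die as the distribution and using exact fractions:

```python
from fractions import Fraction

# PMF of a fair six-sided die (assumed example): each face has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# E(X) = sum of x * P(x) over all possible values.
expected = sum(x * p for x, p in pmf.items())

print("E(X) =", expected, "=", float(expected))
```

The expected value need not be a value the variable can actually take: no die face shows 3.5.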

In [3]:


import random

# Generate a random integer between 1 and 100
random_value = random.randint(1, 100)
print("Random variable value:", random_value)



Random variable value: 98


In [None]:


import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import randint

# Discrete uniform distribution from 1 to 10
x = np.arange(1, 11)
pmf_vals = randint.pmf(x, 1, 11)

plt.stem(x, pmf_vals)  # use_line_collection was removed in newer Matplotlib releases
plt.title('PMF of Discrete Uniform Distribution')
plt.xlabel('x')
plt.ylabel('PMF')
plt.grid(True)
plt.show()



In [None]:

from scipy.stats import bernoulli

def bernoulli_pmf(p, x):
    # Bernoulli is discrete, so it has a PMF rather than a PDF
    return bernoulli.pmf(x, p)

# Example: p = 0.6
for x in [0, 1]:
    print(f"P(X={x}) = {bernoulli_pmf(0.6, x)}")



In [None]:


from scipy.stats import binom

# Simulate 1000 samples
binom_samples = binom.rvs(n=10, p=0.5, size=1000)

plt.hist(binom_samples, bins=range(12), align='left', rwidth=0.8, color='skyblue', edgecolor='black')
plt.title('Binomial Distribution Histogram (n=10, p=0.5)')
plt.xlabel('Number of Successes')
plt.ylabel('Frequency')
plt.grid(True)
plt.show()



In [None]:


from scipy.stats import poisson

mu = 3  # Mean of the distribution
x = np.arange(0, 10)
pmf_vals = poisson.pmf(x, mu)

plt.stem(x, pmf_vals)  # use_line_collection was removed in newer Matplotlib releases
plt.title('PMF of Poisson Distribution (mu=3)')
plt.xlabel('x')
plt.ylabel('PMF')
plt.grid(True)
plt.show()



In [None]:


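# Redefine x so this cell does not reuse the Poisson grid from the previous cell
x = np.arange(1, 11)
cdf_vals = randint.cdf(x, 1, 11)

# where='post' holds each value until the next jump, matching a discrete CDF
plt.step(x, cdf_vals, where='post')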
plt.title('CDF of Discrete Uniform Distribution')
plt.xlabel('x')
plt.ylabel('CDF')
plt.grid(True)
plt.show()



In [None]:


from scipy.stats import uniform

samples = uniform.rvs(size=1000)

plt.hist(samples, bins=20, density=True, edgecolor='black')
plt.title('Histogram of Continuous Uniform Distribution')
plt.xlabel('Value')
plt.ylabel('Density')
plt.grid(True)
plt.show()



In [None]:


normal_data = np.random.normal(loc=0, scale=1, size=1000)

plt.hist(normal_data, bins=30, edgecolor='black')
plt.title('Normal Distribution Histogram')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.grid(True)
plt.show()



In [None]:


from scipy.stats import zscore

data = np.random.normal(10, 2, 100)
z_scores = zscore(data)

plt.plot(z_scores, marker='o')
plt.title('Z-scores of Dataset')
plt.xlabel('Index')
plt.ylabel('Z-score')
plt.grid(True)
plt.show()



In [None]:

sample_means = []
for _ in range(1000):
    sample = np.random.exponential(scale=2.0, size=30)
    sample_means.append(np.mean(sample))

plt.hist(sample_means, bins=30, edgecolor='black')
plt.title('CLT using Exponential Distribution')
plt.xlabel('Sample Mean')
plt.ylabel('Frequency')
plt.grid(True)
plt.show()



In [None]:


x = np.linspace(-4, 4, 1000)
pdf = (1/np.sqrt(2*np.pi)) * np.exp(-0.5 * x**2)

plt.plot(x, pdf)
plt.title('Standard Normal Distribution PDF')
plt.xlabel('x')
plt.ylabel('PDF')
plt.grid(True)
plt.show()



In [None]:


n, p = 10, 0.5
x = np.arange(0, n+1)
probs = binom.pmf(x, n, p)

for i, prob in zip(x, probs):
    print(f"P(X={i}) = {prob:.4f}")



In [None]:


from scipy.stats import norm

data_point = 75
mean = 70
std_dev = 5
z = (data_point - mean) / std_dev
print(f"Z-score: {z}")

# Compare to standard normal
print(f"Probability to the left of {data_point}: {norm.cdf(z):.4f}")



In [None]:


# Sample data
sample_mean = 52
population_mean = 50
std_dev = 10
n = 30

z_stat = (sample_mean - population_mean) / (std_dev / np.sqrt(n))
print(f"Z-statistic: {z_stat}")
print(f"P-value: {2 * (1 - norm.cdf(abs(z_stat)))}")



In [None]:


import scipy.stats as stats

data = np.random.normal(100, 15, 50)
mean = np.mean(data)
std_err = stats.sem(data)
conf_int = stats.t.interval(0.95, len(data)-1, loc=mean, scale=std_err)
print("Confidence Interval (95%):", conf_int)



In [None]:


data = np.random.normal(loc=50, scale=10, size=100)
mean = np.mean(data)
stderr = stats.sem(data)
conf = stats.norm.interval(0.95, loc=mean, scale=stderr)
print("95% Confidence Interval:", conf)



In [None]:


x = np.linspace(-5, 5, 1000)
pdf_vals = norm.pdf(x)

plt.plot(x, pdf_vals)
plt.title('Normal Distribution PDF')
plt.xlabel('x')
plt.ylabel('PDF')
plt.grid(True)
plt.show()



In [None]:


x = np.arange(0, 10)
cdf_vals = poisson.cdf(x, mu=3)

plt.step(x, cdf_vals, where='post')  # a discrete CDF jumps at each integer and stays flat after it
plt.title('Poisson Distribution CDF')
plt.xlabel('x')
plt.ylabel('CDF')
plt.grid(True)
plt.show()



In [None]:


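# Uniform on [0, 10]: the theoretical expected value is (0 + 10) / 2 = 5
samples = uniform.rvs(loc=0, scale=10, size=1000)
expected_value = np.mean(samples)  # sample mean, an estimate of the expected value
print("Estimated Expected Value (sample mean):", expected_value)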



In [None]:


data1 = np.random.normal(50, 5, 100)
data2 = np.random.normal(50, 10, 100)

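# ddof=1 gives the sample standard deviation (np.std defaults to the population formula)
std1 = np.std(data1, ddof=1)
std2 = np.std(data2, ddof=1)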

plt.hist(data1, alpha=0.5, label='std=5')
plt.hist(data2, alpha=0.5, label='std=10')
plt.legend()
plt.title('Comparison of Standard Deviations')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.grid(True)
plt.show()

print("Standard Deviations:", std1, std2)



In [None]:


data = np.random.normal(100, 15, 100)
range_val = np.ptp(data)
iqr_val = stats.iqr(data)
print("Range:", range_val)
print("Interquartile Range (IQR):", iqr_val)



In [None]:


data = np.random.normal(60, 10, 100)
normalized_data = zscore(data)

plt.hist(normalized_data, bins=30, edgecolor='black')
plt.title('Z-score Normalized Data')
plt.xlabel('Z-score')
plt.ylabel('Frequency')
plt.grid(True)
plt.show()



In [None]:


from scipy.stats import skew, kurtosis

data = np.random.normal(0, 1, 1000)
print("Skewness:", skew(data))
# scipy's kurtosis() returns excess kurtosis by default (0 for a normal distribution)
print("Excess kurtosis:", kurtosis(data))

