Q 1: What is a random variable in probability theory?
- In probability theory, a random variable is a variable whose value is a numerical outcome of a random phenomenon. It's a function that assigns a numerical value to each possible outcome of a random experiment. Think of it as a way to quantify the results of a random event.
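
The "function that assigns a numerical value to each outcome" idea can be made concrete with a small sketch. The two-coin-flip experiment and the name `num_heads` are illustrative assumptions, not part of the original answer.

```python
from collections import Counter
from itertools import product

# Sample space of the experiment: every ordered result of two coin flips
sample_space = list(product("HT", repeat=2))

def num_heads(outcome):
    """The random variable X: maps an outcome such as ('H', 'T') to a number."""
    return outcome.count("H")

# All 4 outcomes are equally likely, so P(X = x) = (outcomes mapping to x) / 4
counts = Counter(num_heads(outcome) for outcome in sample_space)
distribution = {x: count / len(sample_space) for x, count in counts.items()}

for x in sorted(distribution):
    print(f"P(X = {x}) = {distribution[x]}")
```

Here the outcomes themselves are not numbers; the random variable is what turns them into the values 0, 1, and 2.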

Q 2: What are the types of random variables?
- There are two main types of random variables: discrete and continuous. Discrete random variables can only take on a finite number of values or a countably infinite number of values, often integers. Continuous random variables, on the other hand, can take on any value within a given range or interval.

Q 3: What is the difference between discrete and continuous distributions?
- The main difference between discrete and continuous distributions lies in the nature of the variable they describe. Discrete distributions deal with countable, separate values, while continuous distributions deal with variables that can take on any value within a given range.

Q 4: What are probability distribution functions (PDF)?
- For a continuous random variable, PDF normally stands for probability density function (it is sometimes loosely called a probability distribution function). It describes the relative likelihood of the variable's values. The density at a point is not itself a probability: any single point has probability zero, and the density can exceed 1. The probability that the variable falls within an interval is found by integrating the PDF over that interval. The discrete counterpart is the probability mass function (PMF), which gives P(X = x) directly.

Q 5: How do cumulative distribution functions (CDF) differ from probability distribution functions (PDF)?
- The Probability Density Function (PDF) and Cumulative Distribution Function (CDF) are both ways to describe the distribution of a random variable, but they represent different aspects. The PDF gives the probability density at a specific value, while the CDF gives the probability that the random variable is less than or equal to a specific value.
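
A rough numerical check of this relationship, assuming a standard normal variable and a simple midpoint-rule integrator (the helper `integrate_pdf` is a hypothetical name introduced only for this sketch): the area under the PDF between two points should match the difference of the CDF at those points.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def integrate_pdf(dist, a, b, steps=10_000):
    """Approximate the integral of dist.pdf over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width

a, b = -1.0, 1.0
area_under_pdf = integrate_pdf(std_normal, a, b)
cdf_difference = std_normal.cdf(b) - std_normal.cdf(a)

print(f"Integral of PDF over [{a}, {b}]: {area_under_pdf:.4f}")
print(f"CDF({b}) - CDF({a}):            {cdf_difference:.4f}")
```

Both numbers come out at about 0.6827, the familiar "68% within one standard deviation" figure.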

Q 6: What is a discrete uniform distribution?
- A discrete uniform distribution is a probability distribution where a finite number of outcomes are all equally likely to occur. It's a simple distribution where each possible value within a defined range has the same probability of being selected.

Q 7: What are the key properties of a Bernoulli distribution?
- A Bernoulli distribution is a discrete probability distribution representing a single trial with two possible outcomes: success (1) or failure (0). It has a single parameter p, the probability of success, so P(X = 1) = p and P(X = 0) = 1 - p. Its mean is p and its variance is p(1 - p). It is the special case of the binomial distribution with n = 1.

Q 8: What is the binomial distribution, and how is it used in probability?
- The binomial distribution is a discrete probability distribution that models the probability of achieving a specific number of successes in a fixed number of independent trials, where each trial has only two possible outcomes (success or failure). It's frequently used to analyze scenarios like coin flips, quality control, and survey responses, where the outcome of each trial is binary.
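
As a minimal sketch, the probability of exactly k successes in n independent trials with success probability p is C(n, k) * p^k * (1 - p)^(n - k). The function name `binomial_pmf` and the values n = 10, p = 0.5 are chosen here for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"P(X = {k}) = {prob:.4f}")

# The probabilities over all possible counts 0..n should add up to 1
print("Total probability:", sum(pmf))
```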

Q 9: What is the Poisson distribution and where is it applied?
- The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event.
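
A small sketch of the Poisson probability mass function, P(X = k) = lam^k * e^(-lam) / k!, where `lam` is the mean number of events per interval. The rate of 3 events per interval is an assumed example value.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean rate lam per interval."""
    return lam**k * exp(-lam) / factorial(k)

lam = 3  # assumed example: on average 3 events per interval
pmf = [poisson_pmf(k, lam) for k in range(50)]

for k in range(7):
    print(f"P(X = {k}) = {pmf[k]:.4f}")

# The mean computed from the PMF should recover the rate lam
mean_from_pmf = sum(k * prob for k, prob in enumerate(pmf))
print("Mean from PMF:", round(mean_from_pmf, 6))
```

Typical applications include counts of arrivals, calls, or defects in a fixed window, where only the average rate is known.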

Q 10: What is a continuous uniform distribution?
- A continuous uniform distribution is a probability distribution in which a continuous random variable is equally likely to fall anywhere within a specified range [a, b]. The probability density function (PDF) is constant at 1 / (b - a) within that range and zero outside of it. Any single value has probability zero, so "equally likely" means that sub-intervals of equal length inside [a, b] have equal probability.

Q 11: What are the characteristics of a normal distribution?
- A normal distribution, also known as a Gaussian distribution or bell curve, is characterized by several key features. It's symmetric around its mean, meaning the left and right sides of the curve are mirror images. The mean, median, and mode are all equal, and located at the center of the distribution. The curve is bell-shaped, with the highest point at the mean and tapering off equally in both directions. Furthermore, the total area under the curve equals 1, and the distribution is fully defined by its mean and standard deviation.

Q 12: What is the standard normal distribution, and why is it important?
- The standard normal distribution is a specific type of normal distribution where the mean is 0 and the standard deviation is 1. It's a crucial concept in statistics because it allows us to compare data from different normal distributions and calculate probabilities related to those distributions.

Q 13: What is the Central Limit Theorem (CLT), and why is it critical in statistics?
- The Central Limit Theorem (CLT) states that the distribution of sample means approximates a normal distribution as the sample size grows, regardless of the shape of the original population's distribution, provided the observations are independent and the population has finite variance. This is critical in statistics because it allows us to make inferences about population parameters (like the mean) using sample data, even when the population distribution is unknown or non-normal.
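
A quick simulation sketch of this statement, assuming an exponential population with mean 2 (clearly non-normal) and samples of size 30. The seed and the sample counts are arbitrary choices for reproducibility.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

population_mean = 2.0      # exponential with rate 0.5 has mean 2 and std 2
sample_size = 30
num_samples = 2000

sample_means = []
for _ in range(num_samples):
    sample = [random.expovariate(1 / population_mean) for _ in range(sample_size)]
    sample_means.append(mean(sample))

# CLT prediction: sample means center on the population mean, with
# standard deviation (standard error) equal to sigma / sqrt(n)
predicted_se = population_mean / sqrt(sample_size)
print("Mean of sample means:", round(mean(sample_means), 3))
print("Std of sample means: ", round(stdev(sample_means), 3))
print("Predicted std error: ", round(predicted_se, 3))
```

Plotting a histogram of `sample_means` would show the bell shape, even though individual exponential draws are strongly right-skewed.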

Q 14: How does the Central Limit Theorem relate to the normal distribution?
- The central limit theorem (CLT) explains the relationship between the normal distribution and sample means. It states that the distribution of sample means will approach a normal distribution, regardless of the shape of the original population distribution, as the sample size gets larger. In essence, the CLT provides a powerful tool for understanding and working with data by connecting the concept of sample means to the well-understood properties of the normal distribution.

Q 15: What is the application of Z-statistics in hypothesis testing?
- Z-statistics, calculated using a Z-test, are applied in hypothesis testing to determine if there's a significant difference between a sample's mean and a population's mean, or between the means of two independent samples. This is particularly useful when the population variance is known, or when dealing with large sample sizes (typically n ≥ 30).
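
A minimal sketch of a one-sample Z-test with a two-sided p-value, assuming the population standard deviation is known. The function name `one_sample_z_test` and the input numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, population_mean, sigma, n):
    """Return the Z-statistic and two-sided p-value for a one-sample Z-test."""
    z = (sample_mean - population_mean) / (sigma / sqrt(n))
    # Two-sided: probability of a value at least this extreme in either tail
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z_stat, p_value = one_sample_z_test(sample_mean=105, population_mean=100, sigma=15, n=36)
print(f"Z = {z_stat:.2f}, two-sided p-value = {p_value:.4f}")
```

With these inputs Z is 2.0 and the p-value is about 0.0455, so the null hypothesis would be rejected at the 5% level but not at the 1% level.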

Q 16: How do you calculate a Z-score, and what does it represent?
- A z-score, also known as a standard score, indicates how many standard deviations a data point is away from the mean of a distribution. It's calculated by subtracting the mean from the individual data point and then dividing by the standard deviation.
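
In code the calculation is a one-liner. The sketch below also converts the z-score to a percentile under the assumption that the data are normally distributed; the values 85, 80, and 10 are example inputs.

```python
from statistics import NormalDist

def z_score(x, mean, std):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std

z = z_score(x=85, mean=80, std=10)
# Assuming normality, the CDF of the standard normal gives the share of values below x
percentile = NormalDist().cdf(z)

print("Z-score:", z)
print(f"Share of values below x: {percentile:.4f}")
```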

Q 17: What are point estimates and interval estimates in statistics?
- In statistics, point estimates and interval estimates are two fundamental ways to estimate unknown population parameters using sample data. Point estimates provide a single value as the best guess, while interval estimates provide a range of values within which the parameter is likely to fall.

Q 18: What is the significance of confidence intervals in statistical analysis?
- Confidence intervals in statistical analysis are crucial because they provide a range of plausible values for an unknown population parameter, offering a more informative measure of uncertainty than a single point estimate. They help researchers understand the reliability and precision of their estimates, guiding them in making data-driven decisions.

Q 19: What is the relationship between a Z-score and a confidence interval?
- A Z-score and a confidence interval are related through their connection to the standard normal distribution. The Z-score indicates how many standard deviations a data point is from the mean, while a confidence interval provides a range of values likely to contain the true population parameter. The critical Z-value for the chosen confidence level (about 1.96 for 95%) is multiplied by the standard error to give the margin of error, which is half the width of the confidence interval.
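
A short sketch of a Z-based confidence interval for a mean, assuming the population standard deviation `sigma` is known. The function name and the sample figures are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for a mean when the population std (sigma) is known."""
    # Critical Z-value: leaves (1 - confidence) / 2 in each tail
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z_crit * sigma / sqrt(n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error

low, high = z_confidence_interval(sample_mean=105, sigma=15, n=36)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Raising the confidence level increases the critical Z-value and widens the interval; increasing n shrinks the standard error and narrows it.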

Q 20: How are Z-scores used to compare different distributions?
- Z-scores are used to compare data from different distributions by standardizing the values. This means converting raw scores into a common scale based on standard deviations from the mean, allowing for direct comparison regardless of the original scale. A Z-score indicates how many standard deviations a data point is away from the mean of its distribution.

Q 21: What are the assumptions for applying the Central Limit Theorem?
- The Central Limit Theorem (CLT) requires a few key assumptions for the sampling distribution of sample means to approximate a normal distribution: random sampling, independent observations, a population with finite variance, and a sufficiently large sample size. When sampling without replacement, a common rule of thumb is that the sample should be less than 10% of the population so that observations stay approximately independent.

Q 22: What is the concept of expected value in a probability distribution?
- The expected value in a probability distribution represents the long-run average outcome of a random variable. It's a weighted average of all possible values, where each value is weighted by its corresponding probability. Essentially, it tells you what you can "expect" to see, on average, if you repeat an experiment many times.
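
The weighted-average definition and the long-run-average interpretation can be compared in a small sketch. A fair six-sided die is assumed as the example, and the seed and number of rolls are arbitrary.

```python
import random
from statistics import mean

values = [1, 2, 3, 4, 5, 6]
probabilities = [1 / 6] * 6  # fair die: every face equally likely

# Definition: sum of each value weighted by its probability
expected_value = sum(x * p for x, p in zip(values, probabilities))

# Long-run average: simulate many rolls and average them
random.seed(1)
rolls = [random.choice(values) for _ in range(100_000)]
simulated_average = mean(rolls)

print("Expected value (weighted average):", round(expected_value, 4))
print("Average of simulated rolls:       ", round(simulated_average, 4))
```

The expected value of 3.5 is not a face of the die, which shows that it describes the average over many repetitions rather than any single outcome.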

Q 23: How does a probability distribution relate to the expected outcome of a random variable?
- A probability distribution defines the likelihood of each possible outcome of a random variable, and the expected value is a way to summarize this distribution into a single, representative number. Specifically, the expected value represents the average outcome you would expect if you repeated the random experiment many times.







In [None]:
# Q 1: Write a Python program to generate a random variable and display its value?
import random
import numpy as np

print("Discrete Random Integer:", random.randint(1, 100))
print("Continuous Random Float (Uniform 0-1):", np.random.uniform(0, 1))
print("Normal Random Value (mean=0, std=1):", np.random.normal(0, 1))

In [None]:
# Q 2: Generate a discrete uniform distribution using Python and plot the probability mass function (PMF)?
import matplotlib.pyplot as plt
import numpy as np

values = np.arange(1, 7)  # Dice outcomes
prob = np.ones_like(values) / len(values)  # Equal probabilities

# use_line_collection was removed in Matplotlib 3.8; the default behavior is the same
plt.stem(values, prob)
plt.title('PMF of Discrete Uniform Distribution (Dice)')
plt.xlabel('Value')
plt.ylabel('Probability')
plt.grid(True)
plt.show()

In [None]:
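# Q 3: Write a Python function to calculate the probability distribution function (PDF) of a Bernoulli distribution?
# Note: the Bernoulli distribution is discrete, so this is its probability mass function (PMF).
from scipy.stats import bernoulli
import matplotlib.pyplot as plt

def bernoulli_pmf(x, p):
    """Return P(X = x) for a Bernoulli(p) variable; x can be 0, 1, or a list of both."""
    return bernoulli.pmf(x, p)

p = 0.6  # success probability
x = [0, 1]
pmf = bernoulli_pmf(x, p)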

plt.bar(x, pmf)
plt.title("Bernoulli Distribution PMF (p=0.6)")
plt.xlabel("x")
plt.ylabel("P(X=x)")
plt.show()

In [None]:
# Q 4: Write a Python script to simulate a binomial distribution with n=10 and p=0.5, then plot its histogram ?
from numpy.random import binomial
import matplotlib.pyplot as plt

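data = binomial(n=10, p=0.5, size=1000)
# Center one bin on each integer outcome 0..10 so bars line up with the counts
bins = [k - 0.5 for k in range(12)]
plt.hist(data, bins=bins, density=True, edgecolor='black')
plt.title("Binomial Distribution (n=10, p=0.5)")
plt.xlabel("Number of Successes")
plt.ylabel("Relative Frequency")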
plt.show()

In [None]:
# Q 5: Create a Poisson distribution and visualize it using Python?
from scipy.stats import poisson
import numpy as np
import matplotlib.pyplot as plt

mu = 3
x = np.arange(0, 10)
pmf = poisson.pmf(x, mu)

# use_line_collection was removed in Matplotlib 3.8; the default behavior is the same
plt.stem(x, pmf)
plt.title("Poisson Distribution (mu=3)")
plt.xlabel("x")
plt.ylabel("PMF")
plt.show()

In [None]:
# Q 6: Write a Python program to calculate and plot the cumulative distribution function (CDF) of a discrete uniform distribution ?
from scipy.stats import randint

x = np.arange(1, 8)
# scipy's randint excludes the upper bound, so (1, 7) covers the die outcomes 1..6
cdf = randint.cdf(x, 1, 7)

plt.step(x, cdf, where='post')
plt.title("CDF of Discrete Uniform Distribution")
plt.xlabel("x")
plt.ylabel("CDF")
plt.grid(True)
plt.show()

In [None]:
# Q 7: Generate a continuous uniform distribution using NumPy and visualize it?
import numpy as np
import matplotlib.pyplot as plt

data = np.random.uniform(low=0, high=10, size=1000)
plt.hist(data, bins=20, edgecolor='black')
plt.title("Continuous Uniform Distribution")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()

In [None]:
# Q 8: Simulate data from a normal distribution and plot its histogram ?

data = np.random.normal(loc=0, scale=1, size=1000)
plt.hist(data, bins=30, edgecolor='black')
plt.title("Normal Distribution Histogram")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()

In [None]:
# Q 9: Write a Python function to calculate Z-scores from a dataset and plot them ?

from scipy.stats import zscore
import numpy as np
import matplotlib.pyplot as plt

data = np.random.normal(100, 15, 100)
z_scores = zscore(data)

plt.plot(z_scores, marker='o', linestyle='none')
plt.title("Z-Scores of Data")
plt.xlabel("Index")
plt.ylabel("Z-score")
plt.grid(True)
plt.show()


In [None]:
# Q 10: Implement the Central Limit Theorem (CLT) using Python for a non-normal distribution ?
data = np.random.exponential(scale=2, size=10000)
sample_means = [np.mean(np.random.choice(data, 30)) for _ in range(1000)]

plt.hist(sample_means, bins=30)
plt.title("Central Limit Theorem Demonstration")
plt.xlabel("Sample Mean")
plt.ylabel("Frequency")
plt.show()

In [None]:
# Q 11: Simulate multiple samples from a normal distribution and verify the Central Limit Theorem ?

sample_means = []
for _ in range(1000):
    sample = np.random.normal(loc=10, scale=5, size=50)
    sample_means.append(np.mean(sample))

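# CLT check: the mean of the sample means should be near 10 and their
# standard deviation near 5 / sqrt(50), about 0.707
print("Mean of sample means:", np.mean(sample_means))
print("Std of sample means:", np.std(sample_means, ddof=1))

plt.hist(sample_means, bins=30)
plt.title("Multiple Samples CLT Verification")
plt.xlabel("Sample Mean")
plt.ylabel("Frequency")
plt.show()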

In [None]:
# Q 12: Write a Python function to calculate and plot the standard normal distribution (mean = 0, std = 1) ?
from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-4, 4, 100)
pdf = norm.pdf(x)

plt.plot(x, pdf)
plt.title("Standard Normal Distribution")
plt.xlabel("x")
plt.ylabel("PDF")
plt.grid(True)
plt.show()


In [None]:
# Q 13: Generate random variables and calculate their corresponding probabilities using the binomial distribution ?

from scipy.stats import binom

n, p = 10, 0.5
x = np.arange(0, 11)
pmf = binom.pmf(x, n, p)

for i in range(len(x)):
    print(f"P(X={x[i]}) = {pmf[i]:.4f}")


In [None]:
# Q 14: Write a Python program to calculate the Z-score for a given data point and compare it to a standard normal distribution ?

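from scipy.stats import norm

x = 85
mean = 80
std = 10

z = (x - mean) / std
print("Z-score:", z)
# Compare with the standard normal: share of values expected to fall below x
print("P(Z <= z):", norm.cdf(z))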

In [None]:
# Q 15: Implement hypothesis testing using Z-statistics for a sample dataset ?

from scipy.stats import norm

sample_mean = 105
population_mean = 100
std_dev = 15
n = 36

z = (sample_mean - population_mean) / (std_dev / np.sqrt(n))
# Right-tailed p-value; for a two-tailed test use 2 * (1 - norm.cdf(abs(z)))
p_value = 1 - norm.cdf(z)

print("Z-statistic:", z)
print("P-value:", p_value)

In [None]:
# Q 16: Create a confidence interval for a dataset using Python and interpret the result ?

import scipy.stats as stats

data = np.random.normal(50, 10, 100)
mean = np.mean(data)
sem = stats.sem(data)
ci = stats.t.interval(0.95, len(data)-1, loc=mean, scale=sem)

print("95% Confidence Interval:", ci)
# Interpretation: intervals built this way contain the true mean in about 95% of repeated samples

In [None]:
# Q 17: Generate data from a normal distribution, then calculate and interpret the confidence interval for its mean ?

data = np.random.normal(100, 20, 50)
mean = np.mean(data)
sem = stats.sem(data)
ci = stats.t.interval(0.95, len(data)-1, loc=mean, scale=sem)

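print("CI for mean of normal data:", ci)
# Interpretation: the data were generated with mean 100, so most runs should give an interval containing 100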

In [None]:
# Q 18: Write a Python script to calculate and visualize the probability density function (PDF) of a normal distribution ?

x = np.linspace(-4, 4, 100)
pdf = norm.pdf(x, loc=0, scale=1)

plt.plot(x, pdf)
plt.title("Normal Distribution PDF")
plt.grid(True)
plt.show()

In [None]:
# Q 19: Use Python to calculate and interpret the cumulative distribution function (CDF) of a Poisson distribution ?

x = np.arange(0, 11)
cdf = poisson.cdf(x, mu=3)

plt.step(x, cdf, where='post')
plt.title("CDF of Poisson Distribution")
plt.show()

In [None]:
# Q 20: Simulate a random variable using a continuous uniform distribution and calculate its expected value ?

data = np.random.uniform(0, 10, 10000)
# The sample mean estimates E[X]; the theoretical value for Uniform(0, 10) is (0 + 10) / 2 = 5
expected_value = np.mean(data)
print("Expected Value (Uniform):", expected_value)

In [None]:
# Q 21: Write a Python program to compare the standard deviations of two datasets and visualize the difference?

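data1 = np.random.normal(100, 10, 100)
data2 = np.random.normal(100, 20, 100)

# Report the sample standard deviations alongside the plot
print("Sample SD of data1:", np.std(data1, ddof=1))
print("Sample SD of data2:", np.std(data2, ddof=1))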

plt.hist(data1, bins=20, alpha=0.5, label="SD=10")
plt.hist(data2, bins=20, alpha=0.5, label="SD=20")
plt.legend()
plt.title("Comparison of Standard Deviations")
plt.show()

In [None]:
# Q 22: Calculate the range and interquartile range (IQR) of a dataset generated from a normal distribution ?

import numpy as np

data = np.random.normal(0, 1, 1000)
data_range = np.max(data) - np.min(data)
iqr = np.percentile(data, 75) - np.percentile(data, 25)

print("Range:", data_range)
print("IQR:", iqr)

In [None]:
# Q 23: Implement Z-score normalization on a dataset and visualize its transformation?

data = np.random.normal(100, 20, 100)
z_data = zscore(data)

plt.hist(z_data, bins=20)
plt.title("Z-score Normalized Data")
plt.show()

In [None]:
# Q 24: Write a Python function to calculate the skewness and kurtosis of a dataset generated from a normal distribution?

from scipy.stats import skew, kurtosis

data = np.random.normal(0, 1, 1000)

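print("Skewness:", skew(data))
# scipy's kurtosis() returns excess kurtosis by default (Fisher definition), which is about 0 for normal data
print("Excess kurtosis:", kurtosis(data))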