**Question1: Define the z-statistic and explain its relationship to the standard normal distribution. How is the z-statistic used in hypothesis testing?**

The z-statistic (or z-score) is a statistical measurement that describes how many standard deviations a data point is from the mean of a distribution. It is primarily used when data follows a normal distribution, and it standardizes data by converting values into a common scale with a mean of 0 and a standard deviation of 1, which corresponds to the standard normal distribution.

The z-statistic is calculated using the following formula:
z= (X-μ)/σ

X = the data point (when X is a sample mean of n observations, σ is replaced by the standard error σ/√n)

μ = the population mean (or the expected value)

σ = the population standard deviation
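As a minimal sketch of the formula above, the following Python function standardizes a single value. The function name `z_score` and the example numbers (a score of 85 against a mean of 70 and a standard deviation of 10) are hypothetical, chosen only for illustration.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical example: a score of 85 where the mean is 70 and the SD is 10.
z = z_score(85, 70, 10)
print(z)  # 1.5
```

A positive result means the value lies above the mean, and a negative result means it lies below.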


Relationship to the Standard Normal Distribution:

The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. The z-statistic transforms any normal distribution to the standard normal distribution, which allows for easy comparison of data from different normal distributions.

In the standard normal distribution:

68.27% of the data falls within 1 standard deviation (z-scores between -1 and 1).

95.45% falls within 2 standard deviations (z-scores between -2 and 2).

99.73% falls within 3 standard deviations (z-scores between -3 and 3).

By converting raw data into z-scores, you can easily reference standard normal distribution tables (z-tables) to find probabilities and make statistical inferences.
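The percentages above can be checked without a printed z-table. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `prob_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def prob_within(k):
    """P(-k <= Z <= k) for a standard normal variable Z."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within(k):.4f}")
```

The cumulative distribution function plays the role of the z-table here: `cdf(z)` gives the area under the curve to the left of z.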

Z-Statistic in Hypothesis Testing:

The z-statistic is commonly used in hypothesis testing, particularly when the population standard deviation (σ) is known and the sample size is large (as a rule of thumb, n ≥ 30). The goal is to determine whether a sample mean differs significantly from a hypothesized population mean.

Formulate the Hypotheses:

1. Null Hypothesis (H₀): There is no effect or difference. For example, H₀: μ = μ₀, where μ₀ is the hypothesized population mean.

2. Alternative Hypothesis (H₁): There is a significant effect or difference. For example, H₁:μ≠μ₀

Calculate the Z-Statistic:

For a sample mean X: z = (X - μ₀) / (σ/√n)

Where:
X= sample mean
μ₀= hypothesized population mean
σ = population standard deviation
n = sample size

Determine the Critical Value:

Choose a significance level (α), typically 0.05 or 0.01, and determine the critical z-value from the z-table corresponding to this significance level.
For a two-tailed test at α = 0.05, the critical z-values are approximately ±1.96 (for a 95% confidence level).

Make a Decision:

Compare the calculated z-statistic to the critical z-value.
If the absolute value of the z-statistic exceeds the critical z-value, reject the null hypothesis.
Otherwise, fail to reject the null hypothesis.
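The steps above can be sketched in Python as a two-tailed, one-sample z-test, with the standard error σ/√n in the denominator. The function name `z_test` and the numbers used (sample mean 52, hypothesized mean 50, σ = 10, n = 100) are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z-test.

    Returns the z-statistic, the critical value, and whether H0 is rejected.
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Critical value that leaves alpha/2 in each tail of the standard normal.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit

# Hypothetical data: sample mean 52, hypothesized mean 50, sigma 10, n = 100.
z, z_crit, reject = z_test(52, 50, 10, 100)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```

Here z = 2.00 exceeds the critical value of about 1.96, so the null hypothesis would be rejected at α = 0.05.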

**Question2: What is a p-value, and how is it used in hypothesis testing? What does it mean if the p-value is very small (e.g., 0.01)?**

The p-value is a fundamental concept in hypothesis testing. It represents the probability of observing a test statistic (or something more extreme) assuming that the null hypothesis (H₀) is true. In other words, it tells us how likely it is to get the observed data (or more extreme data) if there is no actual effect or difference.

Formally, for a right-tailed test it is calculated as:

p = P(test statistic ≥ observed value | H₀ is true)

For a two-tailed test, extreme values in both directions count: p = P(|test statistic| ≥ |observed value| | H₀ is true).

A large p-value suggests that the observed data is consistent with the null hypothesis.
A small p-value suggests that the observed data is unlikely under the null hypothesis, indicating that the null hypothesis might not be true.

P-Value in Hypothesis Testing:

Formulating the Hypotheses:

Null Hypothesis (H₀): A statement of no effect or no difference (e.g.,  H₀:μ=μ₀ )

Alternative Hypothesis (H₁): A statement that contradicts the null hypothesis (e.g., H₁:μ≠μ₀)

Conducting the Test:

Collect data and calculate a test statistic (e.g., z-statistic, t-statistic) that quantifies how far the sample data is from what is expected under the null hypothesis.

Calculating the P-Value:

Use the test statistic and a probability distribution (e.g., the standard normal distribution for a z-test or the t-distribution for a t-test) to determine the p-value. The p-value is the area under the curve beyond the observed test statistic (i.e., the probability of getting that result or something more extreme).

Decision Rule:

Compare the p-value to a predetermined significance level (α), typically 0.05 or 0.01.
If p ≤ α, reject the null hypothesis (statistically significant result).
If p > α, fail to reject the null hypothesis (result is not statistically significant).
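As an illustration of the decision rule, the sketch below converts a z-statistic into a two-tailed p-value using the standard normal distribution. The function name `two_tailed_p_value` and the observed value z = 2.0 are hypothetical.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Probability of a standard normal value at least as extreme as z, in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
p = two_tailed_p_value(2.0)  # hypothetical observed z-statistic
print(f"p = {p:.4f}")
print("reject H0" if p <= alpha else "fail to reject H0")
```

For z = 2.0 the p-value is about 0.0455, which is below α = 0.05, so the result would be called statistically significant at that level but not at α = 0.01.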


Small P-Value (e.g., 0.01):

If the p-value is very small (such as 0.01), it means the observed data is highly unlikely under the null hypothesis. In other words, there's only a 1% chance that you would observe data at least this extreme if the null hypothesis were true.
If the p-value is at or below the chosen significance level α, the null hypothesis is rejected and the result is taken as evidence in favor of the alternative hypothesis. The p-value is not the probability that the null hypothesis is true.

**Question3: Compare and contrast the binomial and Bernoulli distributions.**

The binomial and Bernoulli distributions are both discrete probability distributions built on independent success/failure trials: the Bernoulli distribution describes a single trial, and the binomial distribution describes the number of successes across several. They have some key differences:

Bernoulli Distribution:

Describes the outcome of a single binary trial (success or failure).
Has only two possible values: 0 (failure) and 1 (success).
The probability of success is denoted by p and the probability of failure is denoted by q = 1 - p.
The mean of a Bernoulli distribution is p and the variance is p(1-p).

Binomial Distribution:

Describes the number of successes in a fixed number of independent Bernoulli trials.
The probability of exactly k successes in n trials is given by the binomial probability mass function: P(X = k) = nCk * p^k * (1-p)^(n-k), where nCk is the number of combinations of n things taken k at a time.
The mean of a binomial distribution is np and the variance is np(1-p).
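A short sketch of the binomial probability mass function, using `math.comb` (Python 3.8+) for nCk. The function name `binomial_pmf` and the parameters n = 10, p = 0.5 are hypothetical; the mean and variance are computed directly from the PMF so they can be compared with np and np(1-p).

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5  # hypothetical parameters
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed from the PMF itself.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(binomial_pmf(3, n, p))  # 120 / 1024 = 0.1171875
print(mean, n * p)            # both 5.0
print(variance, n * p * (1 - p))  # both 2.5
```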

Comparison:

A Bernoulli distribution is a special case of a binomial distribution with n = 1.
The Bernoulli distribution has a single parameter, p. The binomial distribution has two parameters, n and p. In both cases q = 1 - p is determined by p.
The binomial distribution is used to model the number of successes in multiple trials, while the Bernoulli distribution is used to model the outcome of a single trial.

**Question 4: Under what conditions is the binomial distribution used, and how does it relate to the Bernoulli distribution?**

The binomial distribution is used to model the number of successes in a fixed number of independent Bernoulli trials. Each trial has only two possible outcomes (success or failure), and the probability of success (p) remains constant across all trials.

Conditions for using the binomial distribution:

The trials must be independent. This means that the outcome of one trial does not affect the outcome of the next trial.
The probability of success (p) must remain constant across all trials.
The number of trials (n) must be fixed.

Relationship to the Bernoulli distribution:

The Bernoulli distribution is a special case of the binomial distribution with n = 1. This means that it describes the outcome of a single binary trial. Conversely, a binomial random variable can be thought of as the sum of n independent Bernoulli random variables that share the same success probability p.
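The relationship can be checked numerically. The sketch below builds the distribution of a sum of n independent Bernoulli(p) variables by repeated convolution and compares it with the binomial formula. All function names and the values n = 5, p = 0.3 are hypothetical.

```python
from math import comb

def bernoulli_pmf(p):
    """PMF of a single trial as a list: index 0 is failure, index 1 is success."""
    return [1 - p, p]

def convolve(a, b):
    """PMF of the sum of two independent variables with PMFs a and b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def sum_of_bernoullis_pmf(n, p):
    """PMF of the number of successes in n independent Bernoulli(p) trials."""
    pmf = [1.0]  # with zero trials, the count is 0 with probability 1
    for _ in range(n):
        pmf = convolve(pmf, bernoulli_pmf(p))
    return pmf

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.3  # hypothetical parameters
built = sum_of_bernoullis_pmf(n, p)
formula = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(all(abs(a - b) < 1e-12 for a, b in zip(built, formula)))  # True
```

With n = 1 the loop runs once and returns the Bernoulli PMF itself, which is the special case described above.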

**Question5: What are the key properties of the Poisson distribution, and when is it appropriate to use this distribution?**

The Poisson distribution is a discrete probability distribution that describes the number of events occurring in a fixed interval of time or space. It is often used to model rare events, such as the number of car accidents on a highway, the number of customers arriving at a store, or the number of radioactive decays in a sample.

Key properties of the Poisson distribution:

Mean and variance are equal: The mean and variance of a Poisson distribution are both equal to the parameter λ.

No upper limit: The Poisson distribution can take on any non-negative integer value.

Asymptotic normality: For large values of λ, the Poisson distribution can be approximated by a normal distribution with mean λ and variance λ.
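The equality of mean and variance can be checked from the Poisson probability mass function, P(X = k) = e^(-λ) * λ^k / k!. The sketch below uses a hypothetical rate λ = 4 and truncates the infinite sum at k = 59, an assumption that is harmless here because the remaining tail probability is negligible for this λ.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam = 4.0  # hypothetical rate: 4 events per interval on average
ks = range(60)  # truncation of the infinite support; the tail is negligible for lam = 4
pmf = [poisson_pmf(k, lam) for k in ks]

total = sum(pmf)
mean = sum(k * pk for k, pk in zip(ks, pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in zip(ks, pmf))

print(f"total = {total:.6f}, mean = {mean:.6f}, variance = {variance:.6f}")
```

Both the mean and the variance come out equal to λ up to floating-point error.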

When to use the Poisson distribution:

The events being counted are rare (i.e., the probability of an event occurring in a small interval is small).
The events occur independently of each other.
The rate of occurrence of events is constant over time or space.



**Question6: Define the terms "probability distribution" and "probability density function" (PDF). How does a PDF differ from a probability mass function (PMF)?**

Probability Distribution:

A probability distribution is a mathematical function that describes the likelihood of different possible outcomes of a random variable. It provides a complete picture of the probability of each possible value occurring.

Probability Mass Function (PMF):

A PMF is used for discrete random variables. It gives the probability that a discrete random variable is exactly equal to a specific value. In other words, it assigns a probability to each possible outcome.

Probability Density Function (PDF):

A PDF is used for continuous random variables. It does not give the exact probability of a specific value occurring, but rather the probability density at that point. The probability of a continuous random variable falling within a certain range is calculated by integrating the PDF over that range.

Key Differences:

Discrete vs. Continuous: PMFs are used for discrete variables, while PDFs are used for continuous variables.

Exact Probability: PMFs give the exact probability of a specific value, while PDFs give the probability density at a point.

Integration: To find the probability of a continuous random variable falling within a range, you need to integrate the PDF over that range.
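The difference can be made concrete with a small sketch. It uses a fair six-sided die as a hypothetical discrete example and a normal distribution with σ = 0.1 as a hypothetical continuous one. The `integrate` helper is a simple trapezoidal rule written for this illustration, not a library function.

```python
from statistics import NormalDist

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# PMF: each value is itself a probability, and the values sum to 1.
die_pmf = {face: 1 / 6 for face in range(1, 7)}
print(sum(die_pmf.values()))

# PDF: the value at a point is a density and can exceed 1.
narrow = NormalDist(mu=0, sigma=0.1)
density_at_mean = narrow.pdf(0)
print(f"density at the mean: {density_at_mean:.4f}")

# Probabilities for a continuous variable come from integrating the PDF over a range.
prob_interval = integrate(narrow.pdf, -0.1, 0.1)
print(f"P(-0.1 <= X <= 0.1) = {prob_interval:.4f}")
```

The density at the mean is roughly 3.99, which would be impossible for a probability, while the integral over one standard deviation on either side is about 0.6827.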

**Question7: Explain the Central Limit Theorem (CLT) with an example**

The Central Limit Theorem (CLT) is a fundamental theorem in statistics that states:

"The distribution of the sample mean of a large number of independent, identically distributed random variables with finite variance approaches a normal distribution, regardless of the underlying distribution of the individual random variables."

In simpler terms, no matter what the shape of the original population distribution (even if it's not normal), the distribution of sample means taken from that population will become increasingly normal as the sample size increases.

Example:

Suppose you have a population of heights that is not normally distributed (e.g., it is skewed to the right). If you take many random samples of a large size from this population and calculate the mean height for each sample, the distribution of these sample means will be approximately normal, even if the original population distribution is not.
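The example can be simulated. The sketch below swaps the height data for a hypothetical right-skewed population, an exponential distribution with mean 1 and standard deviation 1, and draws 2000 samples of size 50. The sample sizes, the seed, and the function name `sample_means` are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(sample_size, num_samples):
    """Means of repeated samples from a right-skewed Exponential(1) population."""
    return [
        mean([random.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

n = 50
means = sample_means(sample_size=n, num_samples=2000)

center = mean(means)
spread = stdev(means)
# Share of sample means within one standard deviation of their center.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"mean of sample means: {center:.3f} (population mean is 1)")
print(f"SD of sample means:   {spread:.3f} (theory: 1/sqrt(n) = {1 / sqrt(n):.3f})")
print(f"share within 1 SD:    {within_one_sd:.3f} (normal: about 0.683)")
```

If the CLT applies, the sample means should center near 1, have a spread near 1/√50, and place roughly 68% of their values within one standard deviation, even though the individual observations are strongly skewed.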

**Question8: Compare z-scores and t-scores. When should you use a z-score, and when should a t-score be applied instead?**

z-scores:

Assumptions: Known population standard deviation.

Usage: Used when the population standard deviation is known, or when the sample size is large (typically n ≥ 30).

Calculation: z = (x - μ) / σ, where x is the data point, μ is the population mean, and σ is the population standard deviation.

t-scores:

Assumptions: Unknown population standard deviation, estimated using the sample standard deviation.

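Usage: Used when the population standard deviation is unknown and must be estimated from the sample. The distinction matters most when the sample size is small (typically n < 30), because for large n the t-distribution is very close to the standard normal distribution.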

Calculation: t = (x̄ - μ) / (s / √n), where x̄ is the sample mean, μ is the hypothesized population mean, s is the sample standard deviation, and n is the sample size. The result is compared against a t-distribution with n - 1 degrees of freedom.
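A minimal sketch of the t-statistic for a one-sample test, taking x̄ as the sample mean and μ₀ as the hypothesized population mean. The function name `t_statistic`, the eight data values, and μ₀ = 4 are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """Return the one-sample t-statistic and its degrees of freedom."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    t = (x_bar - mu0) / (s / sqrt(n))
    return t, n - 1

# Hypothetical small sample with unknown population standard deviation.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t, df = t_statistic(sample, mu0=4)
print(f"t = {t:.4f}, degrees of freedom = {df}")
```

Python's standard library has no t-distribution, so turning this statistic into a p-value or critical value needs a t-table or a statistics package such as SciPy (`scipy.stats.t`).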