Q-1

The main difference between a t-test and a z-test lies in the information available about the population standard deviation.

**T-Test:**
1. **When to use:** A t-test is appropriate when the sample size is small, and the population standard deviation is unknown. It is commonly used for testing hypotheses about the mean of a single sample or the difference between means of two independent samples.
2. **Formula for the t-statistic:** \[ t = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}} \]
   - \(\bar{x}\) is the sample mean.
   - \(\mu\) is the population mean under the null hypothesis.
   - \(s\) is the sample standard deviation.
   - \(n\) is the sample size.

**Z-Test:**
1. **When to use:** A z-test is appropriate when the population standard deviation is known, or the sample size is large enough for the Central Limit Theorem to apply. It is commonly used for testing hypotheses about the mean or proportion of a single sample or the difference between means of two independent samples.
2. **Formula for the z-statistic:** \[ z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}} \]
   - \(\bar{x}\) is the sample mean.
   - \(\mu\) is the population mean under the null hypothesis.
   - \(\sigma\) is the population standard deviation.
   - \(n\) is the sample size.

**Example Scenarios:**
- **T-Test Scenario:**
  - *Situation:* You want to test whether a new teaching method has a significant effect on student scores. You collect scores from a sample of 20 students and compare them to the population mean.
  - *Test to Use:* One-sample t-test.

- **Z-Test Scenario:**
  - *Situation:* You are comparing the average heights of two populations where the standard deviations are known, or the sample sizes are large (e.g., more than 30).
  - *Test to Use:* Two-sample z-test.

In summary, use a t-test when the population standard deviation is unknown and has to be estimated from the sample, which matters most when the sample is small. Use a z-test when the population standard deviation is known, or the sample size is large, allowing the Central Limit Theorem to be applied.
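As a rough illustration of the two formulas, the sketch below computes both statistics. The function names and the numbers (20 students, sample mean 78, null mean 75, standard deviation 8) are hypothetical and chosen only for the example.

```python
import math

def t_statistic(sample_mean, mu0, sample_sd, n):
    """One-sample t-statistic: uses the sample standard deviation s."""
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))

def z_statistic(sample_mean, mu0, population_sd, n):
    """One-sample z-statistic: uses the known population standard deviation."""
    return (sample_mean - mu0) / (population_sd / math.sqrt(n))

# Hypothetical data: 20 students, sample mean 78, null mean 75, s = 8
t = t_statistic(78, 75, 8, 20)
# Same numbers, but treating sigma = 8 as a known population value
z = z_statistic(78, 75, 8, 20)
print(round(t, 3), round(z, 3))
```

The arithmetic is identical in both cases. What changes is the reference distribution: the t-statistic is compared against a t-distribution with \(n - 1\) degrees of freedom, the z-statistic against the standard normal.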

Q-2

**One-Tailed Test:**
- **Definition:** In a one-tailed test (also called a one-sided test), the critical region is located in only one tail of the distribution (either the right or the left).
- **Hypotheses:** The null hypothesis (\(H_0\)) typically includes an equal sign (e.g., \(\mu = 10\)), and the alternative hypothesis (\(H_a\)) specifies a direction of the effect (e.g., \(\mu > 10\) for a right-tailed test or \(\mu < 10\) for a left-tailed test).
- **Decision Rule:** You reject the null hypothesis if the test statistic falls into the critical region corresponding to the specified tail.

**Two-Tailed Test:**
- **Definition:** In a two-tailed test, the critical region is split between both tails of the distribution.
- **Hypotheses:** The null hypothesis (\(H_0\)) typically includes an equal sign (e.g., \(\mu = 10\)), and the alternative hypothesis (\(H_a\)) specifies a difference in either direction (e.g., \(\mu \neq 10\)).
- **Decision Rule:** You reject the null hypothesis if the test statistic falls into either of the critical regions corresponding to both tails.

**Key Differences:**
1. **Directionality of Effect:**
   - One-tailed tests focus on the direction of the effect (greater than or less than).
   - Two-tailed tests focus on the existence of an effect in either direction.

2. **Critical Regions:**
   - One-tailed tests have a critical region in only one tail of the distribution.
   - Two-tailed tests have critical regions in both tails of the distribution.

3. **Decision Rule:**
   - In one-tailed tests, you reject the null hypothesis if the test statistic falls into the critical region in the specified tail.
   - In two-tailed tests, you reject the null hypothesis if the test statistic falls into either of the critical regions.
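A minimal sketch of how the critical regions differ, assuming a z-test at \(\alpha = 0.05\) (the significance level and the helper names are assumptions made for illustration):

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# One-tailed (right): all of alpha sits in the upper tail
one_tailed_critical = std_normal.inv_cdf(1 - alpha)
# Two-tailed: alpha is split evenly, alpha / 2 in each tail
two_tailed_critical = std_normal.inv_cdf(1 - alpha / 2)

def reject_right_tailed(z):
    """Reject H0 only for large positive z."""
    return z > one_tailed_critical

def reject_two_tailed(z):
    """Reject H0 for large |z| in either direction."""
    return abs(z) > two_tailed_critical

print(round(one_tailed_critical, 3), round(two_tailed_critical, 3))
```

A test statistic such as \(z = 1.8\) would be rejected by the right-tailed rule but not by the two-tailed rule, because the two-tailed test places only half of \(\alpha\) in each tail.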


Q-3

**Type I Error:**
- **Definition:** Type I error occurs when you reject a true null hypothesis. In other words, it is a false positive or the conclusion that there is a significant effect when, in reality, there is no effect.
- **Probability of Type I Error:** Denoted by \(\alpha\), it is the significance level chosen for the hypothesis test (e.g., 0.05).
- **Example Scenario:**
  - *Scenario:* A pharmaceutical company is testing a new drug to see if it reduces cholesterol levels. The null hypothesis (\(H_0\)) is that the drug has no effect on cholesterol, and the alternative hypothesis (\(H_a\)) is that the drug reduces cholesterol. If the company incorrectly concludes that the drug is effective (rejects \(H_0\)) when, in fact, it has no effect, it commits a Type I error.

**Type II Error:**
- **Definition:** Type II error occurs when you fail to reject a false null hypothesis. It is a false negative or the conclusion that there is no significant effect when, in reality, there is an effect.
- **Probability of Type II Error:** Denoted by \(\beta\), it is influenced by factors like sample size, effect size, and the chosen significance level.
- **Example Scenario:**
  - *Scenario:* A medical test is conducted to detect a disease. The null hypothesis (\(H_0\)) is that the person is healthy, and the alternative hypothesis (\(H_a\)) is that the person has the disease. If the test fails to detect the disease (fails to reject \(H_0\)) when the person actually has the disease, it results in a Type II error.

**Summary:**
- **Type I Error:** Incorrectly concluding there is an effect when there isn't (False Positive).
  - Example: Concluding a new drug is effective when it is not.
- **Type II Error:** Incorrectly concluding there is no effect when there is (False Negative).
  - Example: Failing to detect a disease when it is present.

Balancing Type I and Type II errors is crucial in hypothesis testing. For a fixed sample size, decreasing the probability of one type of error typically increases the probability of the other; increasing the sample size is the usual way to reduce both. The right balance depends on the context and the consequences of each error.
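The trade-off can be sketched numerically. The example below assumes a right-tailed one-sample z-test with a known \(\sigma\); the function name and the values (true effect 2, \(\sigma = 5\), \(n = 25\) or \(100\)) are hypothetical.

```python
import math
from statistics import NormalDist

def type_ii_error(alpha, effect, sigma, n):
    """Beta for a right-tailed one-sample z-test when the true mean
    exceeds the null mean by `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    shift = effect / (sigma / math.sqrt(n))  # true mean in standard-error units
    return std_normal.cdf(z_crit - shift)

# Hypothetical setup: true effect 2, sigma 5
beta_at_05 = type_ii_error(0.05, 2, 5, 25)
beta_at_01 = type_ii_error(0.01, 2, 5, 25)     # stricter alpha
beta_large_n = type_ii_error(0.05, 2, 5, 100)  # larger sample
print(round(beta_at_05, 3), round(beta_at_01, 3), round(beta_large_n, 3))
```

Under these assumptions, tightening \(\alpha\) from 0.05 to 0.01 raises \(\beta\), while quadrupling the sample size lowers it sharply.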

Q-4

Bayes' Theorem is a fundamental concept in probability theory that describes the probability of an event based on prior knowledge of conditions that might be related to the event. It is named after the Reverend Thomas Bayes.

The formula for Bayes' Theorem is as follows:

\[ P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)} \]

Where:
- \( P(A|B) \) is the probability of event A given that event B has occurred.
- \( P(B|A) \) is the probability of event B given that event A has occurred.
- \( P(A) \) and \( P(B) \) are the probabilities of events A and B, respectively.

**Example: Medical Test Scenario**

Suppose you have a medical test to detect a disease, and the test is not perfect. Let:
- \( A \) be the event "having the disease."
- \( B \) be the event "testing positive."

Assume the following probabilities:
- \( P(A) \), the prior probability of having the disease, is 0.01 (1% of the population has the disease).
- \( P(B|A) \), the probability of testing positive given that you have the disease, is 0.9 (90% sensitivity).
- \( P(\neg A) \), the prior probability of not having the disease, is 0.99 (1 - \( P(A) \)).
- \( P(B|\neg A) \), the probability of testing positive given that you do not have the disease, is 0.05 (5% false positive rate).

Now, we can use Bayes' Theorem to calculate the probability of actually having the disease given a positive test result:

\[ P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)} \]

\[ P(A|B) = \frac{0.9 \cdot 0.01}{P(B)} \]

To find \( P(B) \), we can use the law of total probability:

\[ P(B) = P(A) \cdot P(B|A) + P(\neg A) \cdot P(B|\neg A) \]

\[ P(B) = 0.01 \cdot 0.9 + 0.99 \cdot 0.05 \]

Now, we can substitute this back into the Bayes' Theorem formula:

\[ P(A|B) = \frac{0.9 \cdot 0.01}{0.01 \cdot 0.9 + 0.99 \cdot 0.05} \]

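Evaluating the expression gives \( P(A|B) = 0.009 / 0.0585 \approx 0.154 \). Even after a positive result, the probability of actually having the disease is only about 15.4%, because the disease is rare and false positives outnumber true positives. Bayes' Theorem allows us to update our beliefs based on new evidence (the positive test result) and helps avoid common misconceptions related to conditional probabilities.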
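The same calculation can be checked in a few lines. The variable names are illustrative; the probabilities are the ones assumed in the example above.

```python
p_disease = 0.01            # P(A): prior probability of disease
sensitivity = 0.90          # P(B|A): positive test given disease
false_positive_rate = 0.05  # P(B|not A): positive test given no disease

# Law of total probability: P(B)
p_positive = sensitivity * p_disease + false_positive_rate * (1 - p_disease)

# Bayes' Theorem: P(A|B)
posterior = sensitivity * p_disease / p_positive
print(round(p_positive, 4), round(posterior, 4))
```

The posterior comes out at roughly 0.154, far below the 90% sensitivity, which is the usual surprise with rare conditions.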

Q-5

A confidence interval is a statistical range that estimates the true value of a population parameter with a certain level of confidence. It provides a range of values within which we can reasonably expect the true parameter to fall. The level of confidence is expressed as a percentage, and a common choice is 95%.

The formula for a confidence interval for the population mean (\(\mu\)) is given by:

\[ \text{Confidence Interval} = \bar{x} \pm \left( \text{critical value} \times \frac{s}{\sqrt{n}} \right) \]

Where:
- \(\bar{x}\) is the sample mean.
- \(s\) is the sample standard deviation.
- \(n\) is the sample size.

The critical value is determined based on the desired level of confidence and the distribution (e.g., using the t-distribution for small sample sizes or the normal distribution for large sample sizes).

**Example:**
Suppose you want to estimate the average height of students in a school with 95% confidence. You take a random sample of 30 students and find the sample mean height (\(\bar{x}\)) to be 160 cm with a sample standard deviation (\(s\)) of 5 cm.

1. **Determine the critical value:**
   - For a 95% confidence interval with \(n = 30\), you use the t-distribution.
   - Using a t-table or statistical software, you find the critical t-value for a two-tailed test with \(df = 29\) (degrees of freedom for a sample size of 30 minus 1).

2. **Calculate the margin of error:**
   \[ \text{Margin of Error} = \text{critical value} \times \frac{s}{\sqrt{n}} \]

3. **Calculate the confidence interval:**
   \[ \text{Confidence Interval} = \bar{x} \pm \text{Margin of Error} \]

Let's assume the critical t-value is approximately 2.045 for a 95% confidence level with \(df = 29\). Then:

\[ \text{Margin of Error} = 2.045 \times \frac{5}{\sqrt{30}} \]

Calculate the lower and upper bounds of the confidence interval:

\[ \text{Lower Bound} = 160 - \text{Margin of Error} \]

\[ \text{Upper Bound} = 160 + \text{Margin of Error} \]

The margin of error is approximately 1.87 cm, which gives a confidence interval of approximately \([158.13, 161.87]\). Interpretation: With 95% confidence, you estimate that the true average height of students in the school is between 158.13 cm and 161.87 cm based on the sample data.
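A short sketch of the arithmetic, using the critical value 2.045 quoted in the text rather than computing it (the standard library has no t-distribution):

```python
import math

x_bar, s, n = 160, 5, 30
t_critical = 2.045  # taken from the text: 95% two-sided, df = 29

margin = t_critical * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(round(margin, 2), round(lower, 2), round(upper, 2))
```

With these inputs the margin of error is about 1.87 cm, so the bounds land near 158.13 cm and 161.87 cm.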

Q-6

Let's consider a sample problem where we want to calculate the probability of drawing a red marble from a bag given prior knowledge and new evidence.

**Sample Problem:**

Suppose you have a bag of marbles that contains red (R) and green (G) marbles. You know that 30% of the marbles in the bag are red, and 70% are green. Now, you draw a marble from the bag, and you want to calculate the probability that it's red given your prior knowledge and the new evidence of drawing a red marble.

**Solution using Bayes' Theorem:**

Let the events be defined as follows:
- \( R \): Drawing a red marble.
- \( G \): Drawing a green marble.

The prior probabilities are given as:
- \( P(R) \): Prior probability of drawing a red marble = 0.30.
- \( P(G) \): Prior probability of drawing a green marble = 0.70.

Now, suppose you draw a marble, and you know the conditional probabilities:
- \( P(R|G) \): Probability of drawing a red marble given that the previous marble was green = 0.10.
- \( P(G|R) \): Probability of drawing a green marble given that the previous marble was red = 0.05.

We want to calculate the probability of a red marble given the new evidence \( D \), where \( D \) denotes the event that the drawn marble is observed to be red.

The formula for Bayes' Theorem is:

\[ P(R|D) = \frac{P(D|R) \cdot P(R)}{P(D)} \]

We need to calculate the probabilities:
- \( P(R|D) \): Probability of drawing a red marble given the new evidence.
- \( P(D|R) \): Probability of the new evidence given that the marble is red (which is 1, since a red marble is always observed as red).
- \( P(R) \): Prior probability of drawing a red marble = 0.30.
- \( P(D) \): Probability of the new evidence, which can be calculated using the law of total probability.

\[ P(D) = P(D|R) \cdot P(R) + P(D|G) \cdot P(G) \]

\[ P(D) = 1 \cdot 0.30 + 0 \cdot 0.70 \]

Now, substitute these values into the Bayes' Theorem formula:

\[ P(R|D) = \frac{1 \cdot 0.30}{1 \cdot 0.30 + 0 \cdot 0.70} \]

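Evaluating this expression gives \( P(R|D) = 0.30 / 0.30 = 1 \). The result is certain because the evidence \( D \) (observing a red marble) is the event of interest itself; the conditional probabilities \( P(R|G) \) and \( P(G|R) \) listed earlier are not needed for this calculation.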

Q-7

To calculate the 95% confidence interval for a sample of data with a known mean (\(\bar{x} = 50\)) and standard deviation (\(s = 5\)), you can use the formula for the confidence interval of the mean:

\[ \text{Confidence Interval} = \bar{x} \pm \left( \text{critical value} \times \frac{s}{\sqrt{n}} \right) \]

Where:
- \(\bar{x}\) is the sample mean,
- \(s\) is the sample standard deviation,
- \(n\) is the sample size.

For a 95% confidence interval and assuming a normal distribution, the critical value is approximately 1.96.

\[ \text{Confidence Interval} = 50 \pm \left(1.96 \times \frac{5}{\sqrt{n}}\right) \]

Without information about the sample size (\(n\)), an exact confidence interval cannot be given. The general formula above yields the interval once the appropriate sample size is substituted.

Interpretation: If we were to take multiple samples from the same population and calculate a 95% confidence interval for each sample, we would expect that about 95% of those intervals would contain the true population mean. In this case, we are 95% confident that the true mean lies within the calculated interval.
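Since the sample size is not given, the interval can be sketched as a function of \(n\). The function name and the sample sizes tried below are hypothetical; the mean and standard deviation are those stated in the question.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(x_bar, s, n, confidence=0.95):
    """Normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

for n in (25, 100, 400):  # hypothetical sample sizes
    low, high = mean_confidence_interval(50, 5, n)
    print(n, round(low, 2), round(high, 2))
```

For \(n = 100\), for instance, the interval is roughly 49.02 to 50.98. For small samples, a t critical value would be more appropriate than the normal one used here.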



Q-8

The margin of error in a confidence interval is the half-width of the interval: the amount added to and subtracted from the point estimate (such as the sample mean) to obtain the upper and lower bounds. It quantifies the precision or uncertainty associated with our estimate. The formula for the margin of error in a confidence interval for the mean is:

\[ \text{Margin of Error} = \text{Critical Value} \times \left( \frac{\text{Standard Deviation}}{\sqrt{\text{Sample Size}}} \right) \]

Key components:
- **Critical Value:** Corresponds to the chosen confidence level and is obtained from the standard normal distribution or t-distribution.
- **Standard Deviation:** Represents the variability of the data in the population.
- **Sample Size:** Influences the precision of the estimate.

**Effect of Sample Size on Margin of Error:**
- As the sample size increases, the standard error of the mean (\(\frac{\text{Standard Deviation}}{\sqrt{\text{Sample Size}}}\)) decreases.
- A larger sample size reduces the uncertainty associated with estimating the population parameter, resulting in a smaller margin of error.
- A smaller margin of error indicates a more precise estimate.

**Example Scenario:**
Consider estimating the average height of adult males in a population. If you collect height data from a small sample of 10 males, the margin of error would be relatively large, indicating a wider range within which the true average height might lie. On the other hand, if you collect height data from a larger sample of 1000 males, the margin of error would be smaller, providing a more precise estimate of the average height.

In summary, a larger sample size tends to reduce the margin of error, making the estimate more reliable and precise.
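The effect of sample size can be sketched with the two sample sizes from the scenario. The standard deviation of 7 cm is a hypothetical value, and the same critical value (1.96) is used for both samples to isolate the effect of \(n\).

```python
import math

def margin_of_error(critical_value, sd, n):
    """Critical value times the standard error of the mean."""
    return critical_value * sd / math.sqrt(n)

sd = 7.0  # hypothetical standard deviation of heights, in cm
z = 1.96  # approximate 95% critical value (simplification: a t-value
          # would be larger for n = 10, widening the gap further)

small_sample = margin_of_error(z, sd, 10)
large_sample = margin_of_error(z, sd, 1000)
print(round(small_sample, 2), round(large_sample, 2))
```

Because the margin of error scales with \(1/\sqrt{n}\), a sample 100 times larger gives a margin 10 times smaller, not 100 times.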

Q-9

The z-score, also known as the standard score, measures how many standard deviations a particular data point is from the mean of a population. The formula for calculating the z-score is:

\[ Z = \frac{{X - \mu}}{{\sigma}} \]

Where:
- \( Z \) is the z-score,
- \( X \) is the value of the data point,
- \( \mu \) is the population mean,
- \( \sigma \) is the population standard deviation.

Using the given values:
\[ Z = \frac{{75 - 70}}{{5}} = 1 \]

Interpretation:
A z-score of 1 means that the data point with a value of 75 is 1 standard deviation above the mean (which is 70) in the population. In other words, it indicates that the data point is relatively higher than the average in the context of the population's distribution.
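As a small sketch, the z-score can be computed directly. The percentile step additionally assumes the population is normally distributed, which the question does not state.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

z = z_score(75, 70, 5)

# Assumption: normal population. Share of values falling below x.
percentile = NormalDist().cdf(z)
print(z, round(percentile, 4))
```

Under the normality assumption, a z-score of 1 places the data point above roughly 84% of the population.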

Q-10

To conduct a hypothesis test to determine if the weight loss drug is significantly effective, we can use a one-sample t-test. The null hypothesis (\(H_0\)) typically assumes no effect, and the alternative hypothesis (\(H_a\)) assumes an effect.

The hypotheses can be stated as follows:
- \(H_0\): The mean weight loss (\(\mu\)) is equal to zero (no effect of the drug).
- \(H_a\): The mean weight loss (\(\mu\)) is not equal to zero (there is an effect of the drug).

The formula for the t-test statistic is given by:

\[ t = \frac{{\bar{x} - \mu}}{{\frac{s}{\sqrt{n}}}} \]

Where:
- \(\bar{x}\) is the sample mean,
- \(\mu\) is the population mean under the null hypothesis,
- \(s\) is the sample standard deviation,
- \(n\) is the sample size.

Given the information from the study:
- Sample mean (\(\bar{x}\)) = 6 pounds
- Sample standard deviation (\(s\)) = 2.5 pounds
- Sample size (\(n\)) = 50

Assuming \(H_0: \mu = 0\) (no weight loss), we can calculate the t-statistic and compare it to the critical t-value to make a decision.

Let's calculate the t-statistic:

\[ t = \frac{6 - 0}{\frac{2.5}{\sqrt{50}}} \approx 16.97 \]

Compare the calculated t-statistic to the critical t-value for a two-tailed test at the 0.05 significance level (95% confidence) with 49 degrees of freedom (50 participants - 1), which is approximately 2.01. If the calculated t-statistic is greater than the critical t-value or less than its negative, you would reject the null hypothesis. Here the statistic, about 16.97, far exceeds 2.01, so the null hypothesis is rejected and the drug is concluded to have a significant effect.
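A minimal sketch of the test, using the study values above. The critical value of about 2.010 is the standard table value for a two-tailed test at \(\alpha = 0.05\) with 49 degrees of freedom, hard-coded here as an assumption rather than computed.

```python
import math

x_bar, mu0, s, n = 6, 0, 2.5, 50

t_stat = (x_bar - mu0) / (s / math.sqrt(n))

t_critical = 2.010  # assumed table value: two-tailed, alpha = 0.05, df = 49
reject_null = abs(t_stat) > t_critical
print(round(t_stat, 2), reject_null)
```

The statistic is close to 17, well beyond the critical value, so the null hypothesis of no weight loss is rejected.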

Q-11

To calculate the 95% confidence interval for the true proportion of people who are satisfied with their job, you can use the formula for the confidence interval for a population proportion (\(p\)):

\[ \text{Confidence Interval} = \hat{p} \pm Z \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]

Where:
- \(\hat{p}\) is the sample proportion,
- \(Z\) is the Z-score corresponding to the desired confidence level,
- \(n\) is the sample size.

Given the information from the survey:
- Sample proportion (\(\hat{p}\)) = 0.65 (65% reported being satisfied),
- Sample size (\(n\)) = 500.

We need to find the Z-score for a 95% confidence level. For a 95% confidence level, the critical Z-value is approximately 1.96.

\[ \text{Confidence Interval} = 0.65 \pm 1.96 \sqrt{\frac{0.65(1 - 0.65)}{500}} \]

Now, calculate the confidence interval:

\[ \text{Confidence Interval} = 0.65 \pm 1.96 \sqrt{\frac{0.65 \times 0.35}{500}} \]

This will give you the lower and upper bounds of the 95% confidence interval for the true proportion of people who are satisfied with their job.

The standard error is \( \sqrt{0.2275 / 500} \approx 0.0213 \), so the margin of error is \( 1.96 \times 0.0213 \approx 0.042 \) and \( \text{Confidence Interval} \approx (0.608, 0.692) \). Therefore, you can say that you are 95% confident that the true proportion of people satisfied with their job falls between about 60.8% and 69.2%.
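The arithmetic can be sketched as follows, using the survey values and the critical value 1.96 from the text:

```python
import math

p_hat, n, z = 0.65, 500, 1.96

standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z * standard_error
lower, upper = p_hat - margin, p_hat + margin
print(round(lower, 3), round(upper, 3))
```

The bounds come out near 0.608 and 0.692, a margin of error of about 4.2 percentage points.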

Q-12

To conduct a hypothesis test to determine if there is a significant difference in student performance between the two teaching methods, you can use a two-sample t-test. The null hypothesis (\(H_0\)) typically assumes no difference, and the alternative hypothesis (\(H_a\)) assumes a difference.

The hypotheses can be stated as follows:
- \(H_0\): The mean difference (\(\mu_A - \mu_B\)) is equal to zero (no difference between the teaching methods).
- \(H_a\): The mean difference (\(\mu_A - \mu_B\)) is not equal to zero (there is a difference between the teaching methods).

The formula for the t-test statistic for two independent samples is given by:

\[ t = \frac{(\bar{x}_A - \bar{x}_B)}{\sqrt{\left(\frac{s_A^2}{n_A}\right) + \left(\frac{s_B^2}{n_B}\right)}} \]

Where:
- \(\bar{x}_A\) and \(\bar{x}_B\) are the sample means for samples A and B, respectively.
- \(s_A\) and \(s_B\) are the sample standard deviations for samples A and B, respectively.
- \(n_A\) and \(n_B\) are the sample sizes for samples A and B, respectively.

Given the information from the study:
- Sample A mean (\(\bar{x}_A\)) = 85
- Sample A standard deviation (\(s_A\)) = 6
- Sample A size (\(n_A\)): not provided.
- Sample B mean (\(\bar{x}_B\)) = 82
- Sample B standard deviation (\(s_B\)) = 5
- Sample B size (\(n_B\)): not provided; the two sample sizes are assumed to be equal (\(n_A = n_B\)).

Assuming \(H_0: \mu_A - \mu_B = 0\), you can calculate the t-statistic and compare it to the critical t-value for a two-tailed test at a 0.01 significance level with degrees of freedom equal to \(n_A + n_B - 2\) (the usual choice when the two groups are assumed to have equal variances). Because the sample sizes are not given, the statistic can only be evaluated once they are supplied.

Once you have the calculated t-statistic, you can decide whether to reject the null hypothesis. If the absolute value of the calculated t-statistic exceeds the critical t-value, you would reject the null hypothesis and conclude that there is a significant difference in student performance between the two teaching methods.
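A sketch of the statistic, assuming a hypothetical sample size of 30 per group since the study's sample sizes are not given. The function name is illustrative.

```python
import math

def two_sample_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t-statistic for two independent samples (unpooled standard error)."""
    standard_error = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_a - mean_b) / standard_error

n = 30  # hypothetical: the actual sample sizes are not provided
t_stat = two_sample_t(85, 6, n, 82, 5, n)
degrees_of_freedom = n + n - 2
print(round(t_stat, 3), degrees_of_freedom)
```

With 30 students per group the statistic is about 2.10 on 58 degrees of freedom. The conclusion depends heavily on the real sample sizes, so this number is only illustrative.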

Q-13

To calculate the 90% confidence interval for the true population mean, you can use the formula for the confidence interval for a population mean (\(\mu\)):

\[ \text{Confidence Interval} = \bar{x} \pm Z \left(\frac{s}{\sqrt{n}}\right) \]

Where:
- \(\bar{x}\) is the sample mean,
- \(Z\) is the Z-score corresponding to the desired confidence level,
- \(s\) is the sample standard deviation,
- \(n\) is the sample size.

Given the information from the sample:
- Sample mean (\(\bar{x}\)) = 65,
- Sample standard deviation (\(s\)): not provided, so the interval is left in terms of \(s\),
- Sample size (\(n\)) = 50.

For a 90% confidence interval, the critical Z-value is approximately 1.645.

\[ \text{Confidence Interval} = 65 \pm 1.645 \left(\frac{s}{\sqrt{50}}\right) \]

Once the sample standard deviation is known, substitute it into the formula; it serves as the estimate of the population standard deviation. Until then, the interval can only be expressed in terms of \(s\).

This will give you the lower and upper bounds of the 90% confidence interval for the true population mean.
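As a sketch, the interval can be written as a function of the missing standard deviation. The value \(s = 8\) used below is purely hypothetical, and the function name is illustrative.

```python
import math
from statistics import NormalDist

def confidence_interval_90(x_bar, s, n):
    """90% normal-approximation interval for a mean."""
    z = NormalDist().inv_cdf(0.95)  # 5% in each tail, about 1.645
    margin = z * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

low, high = confidence_interval_90(65, 8, 50)  # s = 8 is hypothetical
print(round(low, 2), round(high, 2))
```

With that assumed standard deviation the interval runs from roughly 63.14 to 66.86; a different \(s\) scales the width proportionally.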

Q-14

To conduct a hypothesis test to determine if caffeine has a significant effect on reaction time, you can use a one-sample t-test. The null hypothesis (\(H_0\)) typically assumes no effect, and the alternative hypothesis (\(H_a\)) assumes an effect.

The hypotheses can be stated as follows:
- \(H_0\): The mean reaction time (\(\mu\)) is equal to some reference value (no effect of caffeine).
- \(H_a\): The mean reaction time (\(\mu\)) is different from the reference value (caffeine has an effect).

The formula for the t-test statistic for a one-sample t-test is given by:

\[ t = \frac{(\bar{x} - \mu_0)}{\frac{s}{\sqrt{n}}} \]

Where:
- \(\bar{x}\) is the sample mean,
- \(\mu_0\) is the hypothesized population mean under the null hypothesis,
- \(s\) is the sample standard deviation,
- \(n\) is the sample size.

Given the information from the study:
- Sample mean (\(\bar{x}\)) = 0.25 seconds
- Sample standard deviation (\(s\)) = 0.05 seconds
- Sample size (\(n\)) = 30
- Confidence level = 90% (significance level \(\alpha = 0.10\) for a two-tailed test)

Assuming \(H_0: \mu = \mu_0\), where \(\mu_0\) is the hypothesized population mean reaction time (for example, the reaction time without caffeine), you can calculate the t-statistic.

For a two-tailed test with a 90% confidence level and 29 degrees of freedom (30 participants - 1), you can find the critical t-value. The critical t-value would be approximately \(t_{\text{critical}} = \pm 1.699\).

Now, calculate the t-statistic:

\[ t = \frac{(0.25 - \mu_0)}{\frac{0.05}{\sqrt{30}}} \]

Compare the calculated t-statistic with the critical t-value. If the calculated t-statistic falls outside the range \((-1.699, 1.699)\), you would reject the null hypothesis and conclude that caffeine has a significant effect on reaction time.
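A sketch of the decision, assuming a hypothetical reference reaction time of \(\mu_0 = 0.27\) seconds (the study does not state one). The critical value 1.699 is the one quoted in the text.

```python
import math

x_bar, s, n = 0.25, 0.05, 30
mu0 = 0.27          # hypothetical reference reaction time, in seconds
t_critical = 1.699  # from the text: two-tailed, alpha = 0.10, df = 29

t_stat = (x_bar - mu0) / (s / math.sqrt(n))
reject_null = abs(t_stat) > t_critical
print(round(t_stat, 3), reject_null)
```

Under this assumed \(\mu_0\), the statistic is about -2.19, which falls outside \((-1.699, 1.699)\), so the null hypothesis would be rejected. A different reference value could change the outcome.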